differences.ATTgt.fit

differences.ATTgt.fit(formula: str, weights_name: str = None, control_group: str = 'never_treated', base_delta: str | list | dict = 'base', est_method: str | Callable = 'dr', as_repeated_cross_section: bool = None, boot_iterations: int = 0, random_state: int = None, alpha: float = 0.05, cluster_var: list | str = None, split_sample_by: Callable | str | dict = None, n_jobs: int = 1, backend: str = 'loky', progress_bar: bool = True) → DataFrame

Computes the cohort-time-(stratum) average treatment effects, i.e. the effect for each cohort, in each time period (and, where applicable, for each stratum).

Parameters:

formula : str

Wilkinson formula for the outcome variable and covariates

If there are no covariates, the formula must contain only the name of the outcome variable.

# example with covariates
formula = 'y ~ a + b + a:b'

# example without covariates
formula = 'y'

Formulas are implemented using formulaic; refer to its documentation for additional details.

weights_name: str = None

The name of the column containing the sampling weights. If None, all observations have the same weight.

control_group: str = 'never_treated'

"never_treated"
"not_yet_treated"

base_delta: str | list | dict = 'base'

Use the base-period values of the covariates and/or their delta values, i.e. the change between the value of a covariate at a given time and its value in the base period.

Available options are:

"base"
the value of each covariate is set to its base period value
"delta"
the value of each time-varying covariate is set to the delta. Time-constant covariates included through formula are dropped, and a warning is issued.
["base", "delta"] or "base_delta"
the value of each covariate is set to its base period value, and the value of each time-varying covariate is set to the delta.
{'base': ['a', 'b', ..]}
the value of the specified covariates is set to its base period value, and the value of each time-varying covariate is set to the delta. A warning is issued if formula includes time-constant covariates that are not included in base_delta.
{'delta': ['c', 'd', ..]}
the value of each covariate is set to its base period value, and the value of the specified time-varying covariates is set to the delta. Covariates listed under 'delta' that are not time-varying are removed from the list.
{'base': ['a', 'b', ..], 'delta': ['c', 'd', ..]}
the value of the specified covariates is set to its base period value, and the value of the specified time-varying covariates is set to the delta. A warning is issued if formula includes time-constant covariates that are not included in 'delta'. Covariates listed under 'delta' that are not time-varying are removed from the list.
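
To make the two transformations concrete, the sketch below computes them by hand for a single unit. It only illustrates the definitions of "base" and "delta" given above and is not the library's implementation; the helper name `base_and_delta`, the covariate values, and the period labels are made up for the example.

```python
def base_and_delta(values_by_period, base_period, period):
    """Return (base, delta) for one covariate of one unit.

    base  : the covariate value in the base period
    delta : the value at `period` minus the value in the base period
    """
    base = values_by_period[base_period]
    delta = values_by_period[period] - base
    return base, delta


# hypothetical time-varying covariate observed in three periods
income = {2019: 30.0, 2020: 32.5, 2021: 31.0}
base, delta = base_and_delta(income, base_period=2019, period=2021)
print(base, delta)  # 30.0 1.0

# a time-constant covariate always has a delta of zero, so it carries
# no information when used as a delta
urban = {2019: 1, 2020: 1, 2021: 1}
print(base_and_delta(urban, base_period=2019, period=2021))  # (1, 0)
```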

est_method: str | Callable = 'dr'

"dr-mle" or "dr"
for locally efficient doubly robust DiD estimator, with logistic propensity score model for the probability of being treated
"dr-ipt"
for locally efficient doubly robust DiD estimator, with propensity score estimated using the inverse probability tilting
"reg"
for outcome regression DiD estimator
"std_ipw-mle" or "std_ipw"
for standardized inverse probability weighted DiD estimator, with logistic propensity score model for the probability of being treated

as_repeated_cross_section: bool = None

boot_iterations: int = 0

random_state: int = None

alpha: float = 0.05

The significance level.
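
As a rough illustration of how a significance level maps to an interval, the sketch below builds a pointwise two-sided interval with coverage 1 - alpha under a normal approximation. This reflects the usual convention and is an assumption, not a description of how ATTgt computes its intervals (which may rely on the bootstrap); `att` and `std_error` are made-up numbers.

```python
from statistics import NormalDist


def normal_confidence_interval(estimate, std_error, alpha=0.05):
    """Two-sided (1 - alpha) interval under a normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return estimate - z * std_error, estimate + z * std_error


att, std_error = 0.8, 0.25  # hypothetical estimate and standard error
lower, upper = normal_confidence_interval(att, std_error, alpha=0.05)
print(round(lower, 3), round(upper, 3))  # 0.31 1.29
```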

cluster_var: list | str = None

split_sample_by: Callable | str | dict = None

The name of the column along which to split the data, or a function which takes the data and returns a sample mask for a binary split, for example:

lambda x: x['column name'] >= x['column name'].median()

The estimation of the ATT will be run separately for each specified sample; used for heterogeneity analysis.
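
The mask function can also be written as a named function, which is easier to test before passing it in. The sketch below assumes the data arrives as a pandas DataFrame; the column name `income` and the values are hypothetical.

```python
import pandas as pd

# hypothetical data; 'income' is an illustrative column name
data = pd.DataFrame({"unit": [1, 2, 3, 4],
                     "income": [10.0, 20.0, 30.0, 40.0]})


def above_median_income(x):
    """Boolean mask: True for rows at or above the median income."""
    return x["income"] >= x["income"].median()


mask = above_median_income(data)
print(mask.tolist())  # [False, False, True, True]
```

A function like this could then be passed as split_sample_by=above_median_income.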

n_jobs: int = 1

The maximum number of concurrently running jobs. If -1 all CPUs are used.

If n_jobs ≠ 1, concurrent jobs will be run for two separate tasks:

computing the cohort-time ATT; each cohort-time is assigned to a job
computing the bootstrap; the influence function is split into n_jobs parts and the bootstrap is computed concurrently for each part

Parallelization is implemented using joblib; refer to its documentation for additional details on n_jobs.
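
The second task, splitting the influence function into parts and processing them concurrently, can be sketched with the standard library alone. This is a simplified illustration: it uses concurrent.futures rather than joblib, a plain column sum stands in for the bootstrap computation, and it splits by column because how the library partitions the influence function is not specified here. `split_into_parts` and `process_part` are hypothetical helpers, not functions of the library.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(items, n_parts):
    """Split a list into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        stop = start + size + (1 if i < extra else 0)
        parts.append(items[start:stop])
        start = stop
    return parts


def process_part(part):
    """Placeholder for the per-part bootstrap work: sum each column."""
    return [sum(column) for column in part]


# hypothetical influence function: one list of per-unit values per cohort-time
influence_columns = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.5], [1.0, 1.0, 1.0],
                     [0.2, 0.2, 0.2], [-1.0, 0.0, 1.0]]

n_jobs = 2
parts = split_into_parts(influence_columns, n_jobs)
with ThreadPoolExecutor(max_workers=n_jobs) as pool:
    # pool.map preserves the order of the parts, so results line up
    # with the original columns
    results = [value for chunk in pool.map(process_part, parts)
               for value in chunk]

print([len(p) for p in parts])  # [3, 2]
```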

backend: str = 'loky'

Parallelization backend implementation.

Parallelization is implemented using joblib; refer to its documentation for additional details on backend.

progress_bar: bool = True

If True, a progress bar will display the progress over the cohort-time iterations and/or over the concurrent bootstrap splits (not the bootstrap iterations).

Return type:

A DataFrame with the group-time ATTs
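
Putting the parameters together, a call might be assembled as below. The sketch only builds the keyword arguments and checks them against the options listed above, so it runs without the library installed. The commented-out call assumes `att_gt` is an already constructed ATTgt instance; the covariate names `a` and `b` are placeholders, `check_fit_kwargs` is a hypothetical helper, and the range check on alpha is an assumption rather than documented behavior.

```python
# string options documented above
CONTROL_GROUPS = {"never_treated", "not_yet_treated"}
EST_METHODS = {"dr", "dr-mle", "dr-ipt", "reg", "std_ipw", "std_ipw-mle"}


def check_fit_kwargs(kwargs):
    """Hypothetical helper: fail early on undocumented string options."""
    control_group = kwargs.get("control_group", "never_treated")
    if control_group not in CONTROL_GROUPS:
        raise ValueError(f"unknown control_group: {control_group!r}")
    method = kwargs.get("est_method", "dr")
    # est_method may also be a callable, so only strings are checked
    if isinstance(method, str) and method not in EST_METHODS:
        raise ValueError(f"unknown est_method: {method!r}")
    # assumption: a significance level lies strictly between 0 and 1
    if not 0 < kwargs.get("alpha", 0.05) < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return kwargs


fit_kwargs = check_fit_kwargs({
    "formula": "y ~ a + b",
    "control_group": "not_yet_treated",
    "est_method": "dr",
    "boot_iterations": 1000,
    "random_state": 0,
    "alpha": 0.05,
    "n_jobs": -1,
})

# results = att_gt.fit(**fit_kwargs)  # would return a DataFrame
```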