-
differences.ATTgt.fit(formula: str, weights_name: str = None, control_group: str = 'never_treated', base_delta: str | list | dict = 'base', est_method: str | Callable = 'dr', as_repeated_cross_section: bool = None, boot_iterations: int = 0, random_state: int = None, alpha: float = 0.05, cluster_var: list | str = None, split_sample_by: Callable | str | dict = None, n_jobs: int = 1, backend: str = 'loky', progress_bar: bool = True) -> DataFrame
Computes the cohort-time-(stratum) average treatment effects: the effect for each cohort, in each time period (and, when the sample is split, for each stratum).
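To make the idea of a cohort-time effect concrete, here is a minimal sketch of the no-covariate case (formula = 'y'), where an outcome-regression comparison with only an intercept reduces to a difference in mean outcome changes between a treated cohort and a control group. It is an illustration under stated assumptions, not the library's implementation: the function name simple_att, the data layout (one dict of outcomes by period per unit) and the numbers are all hypothetical.

```python
from statistics import mean


def simple_att(treated, controls, base_period, time):
    """Difference in mean outcome changes between treated and control units.

    Each unit is a dict mapping a time period to its outcome.
    """
    treated_change = mean(u[time] - u[base_period] for u in treated)
    control_change = mean(u[time] - u[base_period] for u in controls)
    return treated_change - control_change


# Hypothetical outcomes: the cohort is first treated in period 2,
# so period 1 serves as the base period.
treated_units = [{1: 10.0, 2: 15.0}, {1: 12.0, 2: 16.0}]
control_units = [{1: 9.0, 2: 11.0}, {1: 11.0, 2: 12.0}]

att = simple_att(treated_units, control_units, base_period=1, time=2)
```

Repeating this comparison for every cohort and every time period gives one estimate per cohort-time pair, which is the shape of the result described above.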
- Parameters:¶
- formula : str¶
Wilkinson formula for the outcome variable and covariates.
If there are no covariates, the formula must contain only the name of the outcome variable.
# example with covariates
formula = 'y ~ a + b + a:b'
# example without covariates
formula = 'y'
Formulas are implemented using formulaic; refer to its documentation for additional details.
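As a small illustration of the two formula shapes, the helper below assembles the string from an outcome name and a list of covariate terms. build_formula is a hypothetical convenience function, not part of differences or formulaic; it only concatenates strings and does no validation of the terms.

```python
def build_formula(outcome, covariates=()):
    """Return a Wilkinson-style formula string (hypothetical helper)."""
    terms = [str(term).strip() for term in covariates]
    if not terms:
        # No covariates: the formula is just the outcome name.
        return outcome
    return f"{outcome} ~ {' + '.join(terms)}"


formula_with_covariates = build_formula("y", ["a", "b", "a:b"])
formula_without_covariates = build_formula("y")
```

Either string could then be passed as the formula argument.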
- weights_name: str =
None¶ The name of the column containing the sampling weights. If None, all observations have the same weight.
- control_group: str =
'never_treated'¶ The group of units used as controls. Available options are "never_treated" and "not_yet_treated".
- base_delta: str | list | dict =
'base'¶ Whether to use the base period values of the covariates and/or their delta values, i.e. the change between the value of a covariate at a given time and its value at the base period.
Available options are:
"base": the value of each covariate is set to its base period value.
"delta": the value of each time-varying covariate is set to the delta. Time-constant covariates included through x_formula are dropped, and a warning is issued.
["base", "delta"] or "base_delta": the value of each covariate is set to its base period value, and the value of each time-varying covariate is set to the delta.
{'base': ['a', 'b', ..]}: the value of each specified covariate is set to its base period value, and the value of each time-varying covariate is set to the delta. A warning is issued if x_formula included time-constant covariates that are not included in base_delta.
{'delta': ['c', 'd', ..]}: the value of each covariate is set to its base period value, and the value of each specified time-varying covariate is set to the delta. If the covariates included in 'delta' are not time-varying, they will be removed from the list.
{'base': ['a', 'b', ..], 'delta': ['c', 'd', ..]}: the value of each specified covariate is set to its base period value, and the value of each specified time-varying covariate is set to the delta. A warning is issued if x_formula included time-constant covariates that are not included in 'delta'. If the covariates included in 'delta' are not time-varying, they will be removed from the list.
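The difference between a base value and a delta can be sketched for a single covariate of a single unit. This is only an illustration of the arithmetic described above, not the library's code; base_and_delta, the income series and the chosen periods are hypothetical.

```python
def base_and_delta(values_by_time, base_period, time):
    """Return (base value, delta) for one covariate of one unit.

    values_by_time maps each time period to the covariate's value.
    """
    base_value = values_by_time[base_period]
    # Delta: change between the value at `time` and the base period value.
    delta = values_by_time[time] - base_value
    return base_value, delta


# One unit's time-varying covariate, observed in three periods.
income = {2010: 30.0, 2011: 32.5, 2012: 36.0}
base_value, delta = base_and_delta(income, base_period=2010, time=2012)

# A time-constant covariate has a delta of zero in every period.
region_code = {2010: 7, 2011: 7, 2012: 7}
_, constant_delta = base_and_delta(region_code, base_period=2010, time=2012)
```

The zero delta of a time-constant covariate carries no information, which is consistent with such covariates being dropped under the "delta" option.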
- est_method: str | Callable =
'dr'¶ Available options are:
"dr-mle" or "dr": locally efficient doubly robust DiD estimator, with a logistic propensity score model for the probability of being treated.
"dr-ipt": locally efficient doubly robust DiD estimator, with the propensity score estimated using inverse probability tilting.
"reg": outcome regression DiD estimator.
"std_ipw-mle" or "std_ipw": standardized inverse probability weighted DiD estimator, with a logistic propensity score model for the probability of being treated.
- as_repeated_cross_section: bool =
None¶
- boot_iterations: int =
0¶
- random_state: int =
None¶
- alpha: float =
0.05¶ The significance level.
- cluster_var: list | str =
None¶
- split_sample_by: Callable | str | dict =
None¶ The name of the column along which to split the data, or a function that takes the data and returns a sample mask for a binary split, for example:
lambda x: x['column name'] >= x['column name'].median()
The estimation of the ATT will be run separately for each specified sample; used for heterogeneity analysis.
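The binary split described above can be sketched in plain Python, without pandas, to show what a sample mask does. This is a hedged analog rather than what the library runs: median_split_mask, the row layout and the "size" column are hypothetical.

```python
from statistics import median


def median_split_mask(rows, column):
    """Return a list of booleans: True where the value is at or above the median."""
    cutoff = median(row[column] for row in rows)
    return [row[column] >= cutoff for row in rows]


rows = [
    {"id": 1, "size": 3},
    {"id": 2, "size": 8},
    {"id": 3, "size": 5},
    {"id": 4, "size": 1},
]
mask = median_split_mask(rows, "size")

# The two samples on which the estimation would be run separately.
upper = [row for row, keep in zip(rows, mask) if keep]
lower = [row for row, keep in zip(rows, mask) if not keep]
```

With a DataFrame, the same mask would come from a one-argument function comparing a column with its median.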
- n_jobs: int =
1¶ The maximum number of concurrently running jobs. If -1 all CPUs are used.
If n_jobs ≠ 1, concurrent jobs will be run for two separate tasks:
computing the cohort-time ATT: each cohort-time is assigned to a job
computing the bootstrap: the influence function is split into n_jobs parts and the bootstrap is computed concurrently for each part
Parallelization is implemented using joblib; refer to its documentation for additional details on n_jobs.
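The split-and-run pattern behind n_jobs can be sketched as follows. The library uses joblib; this sketch swaps in concurrent.futures from the standard library so that it is self-contained, and every function name in it is hypothetical. It only shows how work might be divided into n_jobs parts and how -1 might map to the CPU count.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def resolve_n_jobs(n_jobs):
    """Map -1 to the number of available CPUs; otherwise return n_jobs."""
    return (os.cpu_count() or 1) if n_jobs == -1 else n_jobs


def split_into_parts(items, n_parts):
    """Split items into at most n_parts contiguous, near-equal parts."""
    n_parts = max(1, min(n_parts, len(items)))
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` parts receive one additional item.
        stop = start + size + (1 if i < extra else 0)
        parts.append(items[start:stop])
        start = stop
    return parts


def run_in_parallel(task, items, n_jobs=1):
    """Apply task to each part of items, using up to n_jobs workers."""
    workers = resolve_n_jobs(n_jobs)
    parts = split_into_parts(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the parts.
        return list(pool.map(task, parts))


values = list(range(10))
partial_sums = run_in_parallel(sum, values, n_jobs=3)
```

Here sum stands in for the per-part computation; combining the partial results is left to the caller.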
- backend: str =
'loky'¶ Parallelization backend implementation.
Parallelization is implemented using joblib; refer to its documentation for additional details on backend.
- progress_bar: bool =
True¶ If True, a progress bar will display the progress over the cohort-time iterations and/or the iterations over the concurrent bootstrap splits (not the bootstrap iterations).
- Return type:¶
A DataFrame with the group-time ATTs.