-
differences.ATTgt.fit(formula: str, weights_name: str = None, control_group: str = 'never_treated', base_delta: str | list | dict = 'base', est_method: str | Callable = 'dr', as_repeated_cross_section: bool = None, boot_iterations: int = 0, random_state: int = None, alpha: float = 0.05, cluster_var: list | str = None, split_sample_by: Callable | str | dict = None, n_jobs: int = 1, backend: str = 'loky', progress_bar: bool = True) → DataFrame
Computes the cohort-time-(stratum) average treatment effects: the effects for each cohort, in each time period (and for each stratum).
- Parameters:
- formula : str
Wilkinson formula for the outcome variable and covariates.
If there are no covariates, the formula must contain only the name of the outcome variable.
# example with covariates
formula = 'y ~ a + b + a:b'
# example without covariates
formula = 'y'
Formulas are implemented using formulaic; refer to its documentation for additional details.
- weights_name: str = None
The name of the column containing the sampling weights. If None, all observations have the same weight.
- control_group: str = 'never_treated'
Available options are:
"never_treated"
"not_yet_treated"
- base_delta: str | list | dict = 'base'
Use base period values for covariates and/or delta values, i.e. the change between the value of a covariate at a given time and its value at the base period.
Available options are:
"base"
the value of each covariate is set to its base period value
"delta"
the value of each time-varying covariate is set to the delta. Time-constant covariates included through x_formula are dropped, and a warning issued.
["base", "delta"] or "base_delta"
the value of each covariate is set to its base period value, and the value of each time-varying covariate is set to the delta.
{'base': ['a', 'b', ..]}
the value of the specified covariates is set to its base period value, and the value of each time-varying covariate is set to the delta. A warning is issued if x_formula included time-constant covariates that are not included in base_delta.
{'delta': ['c', 'd', ..]}
the value of each covariate is set to its base period value, and the value of the specified time-varying covariates is set to the delta. If the covariates included in ‘delta’ are not time-varying they will be removed from the list.
{'base': ['a', 'b', ..], 'delta': ['c', 'd', ..]}
the value of the specified covariates is set to its base period value, and the value of the specified time-varying covariates is set to the delta. A warning is issued if x_formula included time-constant covariates that are not included in ‘delta’. If the covariates included in ‘delta’ are not time-varying they will be removed from the list.
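The distinction between "base" and "delta" values can be illustrated with a minimal sketch. It only shows the arithmetic described above for a single unit and a single covariate; the toy data and the helper name `base_and_delta` are hypothetical and are not part of the library.

```python
# Hypothetical toy data: values of one covariate for one unit, keyed by time period.
values_by_time = {2000: 10.0, 2001: 12.5, 2002: 11.0}


def base_and_delta(values_by_time, base_period, time):
    """Return (base, delta) for one covariate of one unit.

    base  : the covariate's value at the base period
    delta : the covariate's value at `time` minus its value at the base period
    """
    base = values_by_time[base_period]
    delta = values_by_time[time] - base
    return base, delta


base, delta = base_and_delta(values_by_time, base_period=2000, time=2002)
print(base, delta)  # 10.0 1.0
```

For a time-constant covariate the delta computed this way is always zero, which is consistent with such covariates being dropped when only "delta" is requested.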
- est_method: str | Callable = 'dr'
"dr-mle" or "dr"
for locally efficient doubly robust DiD estimator, with logistic propensity score model for the probability of being treated
"dr-ipt"
for locally efficient doubly robust DiD estimator, with propensity score estimated using inverse probability tilting
"reg"
for outcome regression DiD estimator
"std_ipw-mle" or "std_ipw"
for standardized inverse probability weighted DiD estimator, with logistic propensity score model for the probability of being treated
- as_repeated_cross_section: bool = None
- boot_iterations: int = 0
- random_state: int = None
- alpha: float = 0.05
The significance level.
- cluster_var: list | str = None
- split_sample_by: Callable | str | dict = None
The name of the column along which to split the data, or a function that takes the data and returns a sample mask for a binary split, for example:
lambda x: x['column name'] >= x['column name'].median()
The ATT is estimated separately for each specified sample; this is used for heterogeneity analysis.
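As a rough illustration of the callable form, the sketch below builds a boolean mask for a binary split at the median. The DataFrame and the column name `income` are hypothetical; only the shape of the function (data in, boolean mask out) follows the description above.

```python
import pandas as pd

# Hypothetical data; 'income' is an illustrative column name.
data = pd.DataFrame({"income": [10, 20, 30, 40, 50]})


def above_median(x: pd.DataFrame) -> pd.Series:
    """Boolean mask: True for rows at or above the median of 'income'."""
    return x["income"] >= x["income"].median()


mask = above_median(data)
print(mask.tolist())  # [False, False, True, True, True]
```

A function of this shape could presumably be passed as `split_sample_by=above_median`; rows where the mask is True and rows where it is False would then form the two samples.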
- n_jobs: int = 1
The maximum number of concurrently running jobs. If -1, all CPUs are used.
If ≠ 1, concurrent jobs are run for two separate tasks:
computing the cohort-time ATT; each cohort-time is assigned to a job
computing the bootstrap; the influence function is split into n_jobs parts and the bootstrap is computed concurrently for each part
Parallelization is implemented using joblib; refer to its documentation for additional details on n_jobs.
- backend: str = 'loky'
Parallelization backend implementation.
Parallelization is implemented using joblib; refer to its documentation for additional details on backend.
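To give a sense of what n_jobs and backend mean at the joblib level, here is a minimal sketch that assigns one job per cohort-time cell. The function `estimate_cell` and the list of cells are hypothetical placeholders, and how fit forwards these arguments internally is an assumption; the sketch uses the "threading" backend to stay lightweight, whereas the default documented above is the process-based "loky".

```python
from joblib import Parallel, delayed


# Hypothetical stand-in for the per-cohort-time computation.
def estimate_cell(cohort, time):
    # Returns the cell identifiers and the time elapsed since treatment start.
    return (cohort, time, float(time - cohort))


# Hypothetical (cohort, time) cells, one job each.
cells = [(2002, 2001), (2002, 2003), (2004, 2003), (2004, 2005)]

# n_jobs caps the number of concurrent workers; backend selects how they run.
results = Parallel(n_jobs=2, backend="threading")(
    delayed(estimate_cell)(cohort, time) for cohort, time in cells
)
print(results)
```

joblib returns the results in the order in which the tasks were submitted, so each result can be matched back to its cohort-time cell.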
- progress_bar: bool = True
If True, a progress bar will display the progress over the cohort-time iterations and/or over the concurrent bootstrap splits (not the bootstrap iterations).
- Return type:
A DataFrame with the group-time ATTs