Merged
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -22,6 +22,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
`"survey_tsl"` `vcov_type`, and a Survey Design block in `summary()`. The non-survey path is
byte-for-byte unchanged. Validated against `survey::svyglm` on the stacked long difference
(numeric golden parity is the D2 follow-up).
+- **`TROP` non-absorbing (on/off) treatment support** (Athey, Imbens, Qu & Viviano 2025,
+§2.1 / Eq. 12 / Algorithm 2). New `non_absorbing` parameter (default `False`). The paper
+supports general assignment patterns ("units moving into and out of treatment"), not only
+absorbing/staggered adoption; `TROP(non_absorbing=True)` (`method='local'` only) now
+accepts treatment that switches on and off, imputing each treated cell's counterfactual via
+the paper's `(1-W)` masking. The default `non_absorbing=False` is unchanged and still
+rejects non-monotonic D with a `ValueError` (now also pointing to the opt-in), guarding
+against the common mistake of encoding absorbing treatment as an event-style spike. This
+*removes a prior implementation over-restriction* (the estimator was stricter than the
+paper) rather than adding a deviation. `method='global'` keeps its block-assignment
+requirement and rejects `non_absorbing=True`. A one-time `UserWarning` is emitted noting
+that validity relies on the no-dynamic-effects assumption and that the triple-robustness
+guarantee (Theorem 5.1) is proven only under block assignment. The Rust local LOOCV and
+point-estimate paths were already mask-driven and unchanged (Rust/Python ATT parity is
+regression-tested); the non-absorbing **bootstrap** is routed to the Python path, because
+the Rust resampler lacks the no-weighted-control-support guard and can return a degenerate
+~0 SE on an empty control stratum. Treated cells with no weighted control support (e.g. an
+always-treated unit under `lambda_unit>0`) are materialized as NaN and excluded from the
+ATT (the library non-estimable->NaN convention), with a `UserWarning`.
- **`LPDiD` non-absorbing R-parity validation** (Phase C2). Pins both non-absorbing modes
against an independent `fixest::feols` reconstruction of the paper's Eq. 12 (`first_entry`)
and Eq. 13 (`effect_stabilization`) clean-sample restrictions: variance-weighted point and
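The default-versus-opt-in contract described in the `TROP` changelog entry can be illustrated with a minimal, library-independent sketch. The names `is_absorbing` and `check_treatment` are hypothetical and are not the library's internal validator; the sketch only assumes that "non-monotonic D" means at least one unit has a 1 → 0 transition.

```python
from typing import Sequence


def is_absorbing(d_rows: Sequence[Sequence[int]]) -> bool:
    """True if no unit ever goes from treated (1) back to untreated (0)."""
    return all(
        not (prev == 1 and curr == 0)
        for row in d_rows
        for prev, curr in zip(row, row[1:])
    )


def check_treatment(d_rows: Sequence[Sequence[int]], non_absorbing: bool = False) -> None:
    """Hypothetical guard mirroring the documented default / opt-in behaviour."""
    if not non_absorbing and not is_absorbing(d_rows):
        raise ValueError(
            "treatment switches off for at least one unit; "
            "pass non_absorbing=True if on/off treatment is intended"
        )


# Rows are units, columns are periods.
staggered = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 0]]
event_spike = [[0, 0, 1, 0], [0, 0, 0, 0]]  # absorbing treatment mis-coded as a spike

check_treatment(staggered)                        # monotone D passes in default mode
check_treatment(event_spike, non_absorbing=True)  # opt-in accepts the 1 -> 0 switch
```

In default mode the `event_spike` matrix would raise, which is the guard against mis-encoded absorbing treatment that the entry describes.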
8 changes: 4 additions & 4 deletions METHODOLOGY_REVIEW.md
@@ -880,7 +880,7 @@ These three are feature deferrals (paper-supported extensions that the library h
| Status | **Complete** (paper `method="local"`, version-pinned to arXiv v2 — see Version Pinning below) |
| Last Review | 2026-05-24 |

-**Version Pinning:** This methodology promotion is anchored on **arXiv:2508.21536v2** (the version covered by the paper review on file at `docs/methodology/papers/athey-2025-review.md`). The current arXiv version is **v3** (submitted 2026-02-09). A formal v2→v3 source delta-check against the v3 PDF has **NOT** been performed for any of the sections this PR promotes (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1). **Action item:** before the next paper-author reference implementation or substantive v3 release, refresh the paper review against the most recent arXiv version and re-validate the verified-component checklist; until then the promotion stays v2-anchored.
+**Version Pinning:** This methodology promotion is anchored on **arXiv:2508.21536v2** (the version covered by the paper review on file at `docs/methodology/papers/athey-2025-review.md`). The current arXiv version is **v3** (submitted 2026-02-09). The **v3 PDF was consulted for the treatment-assignment-pattern sections** during the non-absorbing support work (§2.1, §2.2 Eq. 2, §6.1 Eq. 12 / Algorithm 2, Assumption 1(i), Theorem 5.1), confirming the general-assignment scope behind `TROP(non_absorbing=True)`. A formal v2→v3 source delta-check across the remaining promoted sections (Eqs. 2-3, Algorithms 1-3, Section 2.2, Section 5.2-5.3, Section 6.1-6.2, Theorem 5.1, Corollary 1, Appendix Theorem 8.1) has **NOT** been performed in full. **Action item:** before the next paper-author reference implementation or substantive v3 release, refresh the paper review against the most recent arXiv version and re-validate the verified-component checklist; until then the promotion stays v2-anchored.

**Scope:** This methodology promotion covers the paper-aligned `method="local"` path (paper Algorithm 2: per-(i, t) estimation with observation-specific weights). The library also exposes `method="global"`, documented in `REGISTRY.md` as a "computationally efficient adaptation using the (1-W) masking principle from Eq. 2" — a library-side adaptation, NOT the paper's full Algorithm 2 estimator. Defensive coverage of the global method lives in `tests/test_trop.py::TestTROPGlobalMethod` (704 lines, ~30 tests for the global-method-specific surface) and is not duplicated in the methodology walk-through. Methodology promotion of `method="global"` as a primary surface would require either (a) a paper-side derivation of the global adaptation's equivalence to Algorithm 2 under specific conditions, or (b) a separate library-extension methodology review; both are deferred.

@@ -892,15 +892,15 @@ These three are feature deferrals (paper-supported extensions that the library h
- [x] Corollary 1 (paper p. 23) — **single-draw sanity checks consistent with the three unbiasedness conditions, not a repeated-MC mean-bias study**: each of the three balance conditions (a) unit balance, (b) time balance, (c) ``B = 0`` is exercised on a targeted DGP that makes one condition trivially hold while keeping the others sub-optimal. The assertion in each case is a single-realisation ``|att - τ| < 3 * se`` band using the estimator's own bootstrap SE — this is a smoke check, NOT a repeated-draw Monte Carlo bias study of the paper's conditional-unbiasedness statement under fixed weights. A stronger MC bias study at fixed λ values is deferred (would multiply test runtime by ~30x for marginal additional evidence given the existing 3-σ band already catches order-of-magnitude bias regressions).
- [x] Theorem 5.1 (paper p. 23) — **simulation sanity check, not a direct theorem lock**: the paper's bias bound ``|E[τ_hat - τ | L]| <= ||Δ_u|| · ||Δ_t|| · ||B||_*`` is stated for FIXED, non-data-dependent weights. The library's TROP fit uses data-dependent LOOCV-tuned λ values, so the direct conditional bias bound is not tested here. Instead, the methodology test verifies the bound's empirical realisation: TROP RMSE strictly below DID RMSE under a confounded factor DGP with ``true τ = 0`` (calibration measurement: TROP/DID RMSE ratio ≈ 0.34 at ``factor_strength = 1.0``). The direct fixed-weight bound test is deferred — would require exposing oracle Γ / Λ / B from a paper-aligned DGP and computing each component of the bound from instrumented internals.
- [x] Section 2.2 special-case reductions: **DiD benchmark sanity check** (not a direct algebraic-equivalence proof) — on a no-interactive-FE multi-period panel (additive unit + time effects only, no factor structure), TROP with ``λ_nn = ∞`` + uniform weights produces an ATT within 0.5 of `DifferenceInDifferences` fitted as `outcome ~ treat * post_flag` (basic 2×2 design with `[const, D, T, D×T]`, extended to repeated observations within each treat×post cell). This is **empirical numerical agreement on a friendly DGP**, NOT a proof of the paper Section 2.2 algebraic reduction (which would require either a true 2-period block-assignment panel where the basic-DiD comparator is the algebraic target, or a comparison against `TwoWayFixedEffects` — both deferred). **Matrix Completion code path exercised, not equivalence-checked** — TROP with uniform weights + finite ``λ_nn`` engages the nuclear-norm prox solver (effective_rank > 0) and recovers ATT better than the DiD-style baseline on a factor-confounded DGP; this verifies the code path activates but does NOT prove equivalence with an independent MC reference implementation (which would require either an external MC port or a hand-written reference solver). SC / SDID reductions deferred — see "Outstanding Concerns".
-- [x] Eq. 13 + Algorithm 2 per-(i, t) estimation: ``treatment_effects`` dict contains one finite ``τ_hat_it`` per treated cell; the aggregate ATT equals the unweighted mean of per-cell effects (Eq. 1). **Tests cover block adoption with a constant treatment effect**; **absorbing-state staggered adoption** and **heterogeneous per-cell effects** (paper Remark 6.1) are SUPPORTED by the code path but not directly verified in this methodology surface. **Section 6.1 non-absorbing / on-off / switching assignment patterns are explicitly OUT OF SCOPE** — the implementation rejects non-absorbing D-matrices via `trop_local.py` absorbing-state validation, and the methodology test enforces the rejection contract via `TestTROPDeviations::test_event_style_d_rejected_with_value_error` (event-style D being one specific non-absorbing pattern; the same absorbing-state validator catches all 1→0 transitions). Cross-coverage of the staggered-cohort fit path is `tests/test_methodology_trop.py::TestTROPAlgorithm1LOOCV::test_control_set_includes_pretreat_of_eventually_treated`.
+- [x] Eq. 13 + Algorithm 2 per-(i, t) estimation: ``treatment_effects`` dict contains one ``τ_hat_it`` entry per treated cell. An entry is finite for estimable cells and NaN for a missing outcome or for a cell whose unit/time fixed effect ``alpha_i + beta_t`` is unidentified by the two-way-FE control fit, i.e. the target unit and target period are not in the same connected component of the observed-control graph. Unidentified cases include an always-treated unit for any ``lambda_unit``, a fully-treated period, disconnected control support under ``non_absorbing``, and an unbalanced absorbing panel with entirely-missing unit/period controls. The guard is applied to all local fits, not only ``non_absorbing``, and the bootstrap is forced onto the guarded Python path when trimming occurs. The aggregate ATT equals the unweighted mean of the finite per-cell effects (Eq. 1). Trimming non-estimable cells to NaN matches the library-wide non-estimable→NaN convention and is documented in REGISTRY ## TROP "non-absorbing non-estimable-cell trimming" Note; locked by `TestTROPDeviations::test_non_absorbing_always_treated_unit_not_raw_outcome` and `test_non_absorbing_fully_treated_period_not_estimable`. **Tests cover block adoption with a constant treatment effect**; **absorbing-state staggered adoption** and **heterogeneous per-cell effects** (paper Remark 6.1) are SUPPORTED by the code path but not directly verified for those specific patterns. **Section 6.1 non-absorbing / on-off / switching assignment patterns are SUPPORTED via the opt-in `TROP(non_absorbing=True)` (`method='local'` only)**, matching the paper's general-assignment scope (§2.1; Eq. 12 / Algorithm 2). This *narrows* a prior implementation over-restriction (the shipped estimator was stricter than the paper) rather than adding a deviation. The default `non_absorbing=False` still rejects non-monotonic D as a defensive guard; recovery on a no-dynamic-effects toggling DGP + the caveat warning are locked by `TestTROPDeviations::test_non_absorbing_general_assignment_supported`, and the default-mode rejection contract by `TestTROPDeviations::test_event_style_d_rejected_with_value_error`. Inference caveat: Theorem 5.1's triple-robustness guarantee is proven under Assumption 1(i) block assignment only (see REGISTRY ## TROP Notes). Cross-coverage of the staggered-cohort fit path is `tests/test_methodology_trop.py::TestTROPAlgorithm1LOOCV::test_control_set_includes_pretreat_of_eventually_treated`.
- [x] Algorithm 3 stratified pairs bootstrap: under an unbalanced (3 treated, 17 control) panel, the stratified sampler reliably produces ≥ 67% successful bootstrap draws and a positive finite SE.
- [x] Section 3 / Eq. 6 semi-synthetic factor DGP: five recovery tests verify limiting-case uniform weights, unit-weight bias reduction, time-weight bias reduction, factor-model bias reduction with effective_rank > 0, and null-DGP recovery centred near zero.
- [x] safe_inference contract: confidence interval uses the t-distribution with df = max(1, n_treated_obs - 1), consistent with p_value (matches REGISTRY `## TROP` "Inference CI distribution" note, post safe_inference migration).
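
The connected-component estimability condition and the "unweighted mean of finite per-cell effects" aggregation from the checklist can be sketched without the library. This is an illustration under stated assumptions, not the code in `trop_local.py`: `estimable_cells` and `att_from_cells` are hypothetical names, the graph is taken to be bipartite (units on one side, periods on the other) with one edge per observed control cell, and missing outcomes on treated cells are not modelled.

```python
import math
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def estimable_cells(D: List[List[int]],
                    observed: Optional[List[List[bool]]] = None) -> Dict[Cell, bool]:
    """Flag each treated cell (i, t) as estimable when unit i and period t lie in
    the same connected component of the observed-control graph."""
    n_units, n_periods = len(D), len(D[0])
    parent = list(range(n_units + n_periods))  # union-find over units + periods

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Every observed control cell links its unit node to its period node.
    for i in range(n_units):
        for t in range(n_periods):
            seen = observed is None or observed[i][t]
            if D[i][t] == 0 and seen:
                parent[find(i)] = find(n_units + t)

    return {
        (i, t): find(i) == find(n_units + t)
        for i in range(n_units)
        for t in range(n_periods)
        if D[i][t] == 1
    }


def att_from_cells(effects: Dict[Cell, float]) -> float:
    """Unweighted mean of the finite per-cell effects; NaN if none are finite."""
    finite = [v for v in effects.values() if not math.isnan(v)]
    return sum(finite) / len(finite) if finite else math.nan


# Unit 0 is always treated, so it has no control cell and is isolated in the graph.
D = [[1, 1, 1],
     [0, 1, 0],
     [0, 0, 0]]
flags = estimable_cells(D)
```

Here only cell `(1, 1)` is flagged estimable: period 1 reaches unit 1 through unit 2's control cells, while unit 0 has no control cell at all.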

**Test Coverage:**

-- 36 methodology tests (10 classes) in `tests/test_methodology_trop.py`.
-- Defensive guards (107 tests in `tests/test_trop.py`): D-matrix absorbing-state validation, silent-warning audit, FISTA convergence warnings, bootstrap-failure-rate proportional warning, bootstrap NaN-SE propagation, module-split smoke tests.
+- 39 methodology tests (10 classes) in `tests/test_methodology_trop.py` (includes non-absorbing opt-in recovery + caveat-warning + default-mode no-warning + unbalanced×non-absorbing).
+- Defensive guards (117 tests in `tests/test_trop.py`): D-matrix absorbing-state validation, non-absorbing opt-in acceptance / local-only guard / params round-trip / Rust-Python parity, silent-warning audit, FISTA convergence warnings, bootstrap-failure-rate proportional warning, bootstrap NaN-SE propagation, module-split smoke tests.

**Deviations from paper:**

2 changes: 1 addition & 1 deletion README.md
@@ -102,7 +102,7 @@ Full guide: `diff_diff.get_llm_guide("practitioner")`.
- [TwoWayFixedEffects](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - panel data DiD with unit and time fixed effects via within-transformation or dummies
- [MultiPeriodDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - event study design with period-specific treatment effects for dynamic analysis
- [CallawaySantAnna](https://diff-diff.readthedocs.io/en/stable/api/staggered.html) - Callaway & Sant'Anna (2021) group-time ATT estimator for staggered adoption
-- [ChaisemartinDHaultfoeuille](https://diff-diff.readthedocs.io/en/stable/api/chaisemartin_dhaultfoeuille.html) - de Chaisemartin & D'Haultfœuille (2020/2022) for **reversible (non-absorbing) treatments** with multi-horizon event study, normalized effects, cost-benefit delta, sup-t bands, and dynamic placebos. The only library option for treatments that switch on AND off. Alias `DCDH`.
+- [ChaisemartinDHaultfoeuille](https://diff-diff.readthedocs.io/en/stable/api/chaisemartin_dhaultfoeuille.html) - de Chaisemartin & D'Haultfœuille (2020/2022) for **reversible (non-absorbing) treatments** with multi-horizon event study, normalized effects, cost-benefit delta, sup-t bands, and dynamic placebos. The most general option for treatments that switch on AND off (see also `LPDiD`/`TROP` `non_absorbing`). Alias `DCDH`.
- [SunAbraham](https://diff-diff.readthedocs.io/en/stable/api/staggered.html) - Sun & Abraham (2021) interaction-weighted estimator for heterogeneity-robust event studies
- [ImputationDiD](https://diff-diff.readthedocs.io/en/stable/api/imputation.html) - Borusyak, Jaravel & Spiess (2024) imputation estimator, most efficient under homogeneous effects
- [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html) - Gardner (2022) two-stage estimator with GMM sandwich variance
19 changes: 13 additions & 6 deletions diff_diff/chaisemartin_dhaultfoeuille.py
@@ -1,9 +1,14 @@
"""
de Chaisemartin-D'Haultfoeuille (dCDH) estimator for reversible-treatment DiD.

-The dCDH estimator is the only modern DiD estimator in the diff-diff library
-that handles **non-absorbing (reversible) treatments** — treatment can switch
-on AND off over time. All other staggered estimators in the library
+The dCDH estimator is the most general DiD estimator in the diff-diff library
+for **non-absorbing (reversible) treatments** — treatment can switch on AND off
+over time, switcher vs non-switcher comparisons are its primitive object, and it
+allows dynamic (carryover) effects with explicit joiner/leaver (``DID_+`` /
+``DID_-``) decomposition. ``LPDiD`` (``non_absorbing="first_entry"`` /
+``"effect_stabilization"``) and ``TROP`` (``non_absorbing=True``, under a
+no-dynamic-effects assumption) also accept non-absorbing treatment under stronger
+assumptions. The remaining staggered estimators in the library
(``CallawaySantAnna``, ``SunAbraham``, ``ImputationDiD``, ``TwoStageDiD``,
``EfficientDiD``, ``WooldridgeDiD``) assume treatment is absorbing.

@@ -354,9 +359,11 @@ class ChaisemartinDHaultfoeuille(ChaisemartinDHaultfoeuilleBootstrapMixin):
"""
de Chaisemartin-D'Haultfoeuille (dCDH) estimator.

-The only modern DiD estimator in the library that handles **reversible
-(non-absorbing) treatments** - treatment may switch on AND off over
-time. Computes the contemporaneous-switch DiD ``DID_M`` from the
+The most general library estimator for **reversible (non-absorbing)
+treatments** - treatment may switch on AND off over time, with explicit
+joiner/leaver (``DID_+`` / ``DID_-``) decomposition (``LPDiD`` and ``TROP``
+also support non-absorbing treatment under stronger assumptions; see their
+``non_absorbing`` parameters). Computes the contemporaneous-switch DiD ``DID_M`` from the
AER 2020 paper (equivalently ``DID_1`` at horizon ``l = 1`` of the
dynamic companion paper, NBER WP 29873) plus the full multi-horizon
event study ``DID_l`` for ``l = 1..L_max`` via the ``L_max`` parameter
8 changes: 5 additions & 3 deletions diff_diff/chaisemartin_dhaultfoeuille_results.py
@@ -4,9 +4,11 @@
This module contains ``ChaisemartinDHaultfoeuilleResults`` and
``DCDHBootstrapResults`` dataclasses produced by the
``ChaisemartinDHaultfoeuille`` (alias ``DCDH``) estimator. The dCDH
-estimator is the only modern DiD estimator in the library that handles
-non-absorbing (reversible) treatments. Phase 1 ships the contemporaneous-
-switch case ``DID_M`` (= ``DID_1`` of the dynamic companion paper).
+estimator is the most general library estimator for non-absorbing
+(reversible) treatments (``LPDiD`` and ``TROP`` also support non-absorbing
+treatment under stronger assumptions; see their ``non_absorbing`` parameters).
+Phase 1 ships the contemporaneous-switch case ``DID_M`` (= ``DID_1`` of the
+dynamic companion paper).

References
----------
21 changes: 15 additions & 6 deletions diff_diff/guides/llms-autonomous.txt
@@ -531,12 +531,21 @@ When `has_never_treated == False`:

When `treatment_type == "binary_non_absorbing"`:

-- `ChaisemartinDHaultfoeuille` is the only estimator in the library
-that treats this natively. Switcher / non-switcher comparisons are
-its primitive object.
-- Other estimators assume absorbing treatment and will produce
-estimates whose interpretation is unclear. Do not use them without
-a well-argued reason.
+- `ChaisemartinDHaultfoeuille` is the most general / default choice and
+treats this natively. Switcher / non-switcher comparisons are its
+primitive object; it allows dynamic (carryover) effects and reports
+joiner/leaver (`DID_+` / `DID_-`) views. Prefer it when effects may
+persist after treatment turns off.
+- `LPDiD(non_absorbing="first_entry")` or `"effect_stabilization"`
+(entry-effect estimands) and `TROP(non_absorbing=True, method="local")`
+(valid under a no-dynamic-effects / no-carryover assumption) also handle
+non-absorbing treatment, under stronger assumptions. Use TROP's option
+only when effects are contemporaneous (no carryover).
+- The remaining estimators (`CallawaySantAnna`, `SunAbraham`,
+`ImputationDiD`, `TwoStageDiD`, `EfficientDiD`, `WooldridgeDiD`) assume
+absorbing treatment and will produce estimates whose interpretation is
+unclear on non-absorbing data. Do not use them without a well-argued
+reason.
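
The selection rule in the bullets above can be written as a small decision helper. This is a sketch only: `non_absorbing_candidates` is a hypothetical function, not part of the library, and it returns the constructor calls as plain strings copied from the guide text rather than instantiating anything.

```python
from typing import List


def non_absorbing_candidates(carryover_possible: bool) -> List[str]:
    """Hypothetical helper: candidate estimators for on/off treatment, most general first."""
    # dCDH is the default: it treats switching natively and allows carryover.
    candidates = ["ChaisemartinDHaultfoeuille"]
    # LPDiD targets entry-effect estimands under stronger assumptions.
    candidates.append('LPDiD(non_absorbing="first_entry")')
    candidates.append('LPDiD(non_absorbing="effect_stabilization")')
    # TROP's option is listed only when effects are contemporaneous (no carryover).
    if not carryover_possible:
        candidates.append('TROP(non_absorbing=True, method="local")')
    return candidates


with_carryover = non_absorbing_candidates(carryover_possible=True)
without_carryover = non_absorbing_candidates(carryover_possible=False)
```

The absorbing-only estimators are deliberately never returned, matching the guidance not to use them on non-absorbing data without a well-argued reason.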

### §4.6 Triple-difference design (DDD)
