Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **`SpilloverDiD(survey_design=SurveyDesign.subpopulation(...))` full-design retention via zero-pad scores (Wave E.3).** Closes the Wave E.1/E.2/follow-up documented limitation at `REGISTRY.md:3249`: `SurveyDesign.subpopulation()`-derived designs AND warn-and-drop fits now preserve the full-domain resolved survey design — `n_psu` / `n_strata` / `df_survey` / Binder TSL per-stratum centering reflect the FULL domain rather than the post-`finite_mask` fit sample. **Documented synthesis (library-convention adoption, NOT new methodology):** Wave E.3 adopts the canonical "zero-pad scores to full panel + retain full-design resolved survey" pattern from R `survey::svyrecvar(subset())` (Lumley 2010 §2.5) already established in `diff_diff/imputation.py:2175-2183` (PreTrendsImputation lead regression — Omega_0 scores zero-padded back to full panel length) and `diff_diff/prep.py:1401-1432` (DCDH cell variance — IF zero-padded outside the cell). Wave E.3 propagates the same convention to SpilloverDiD's Wave E.1 Binder TSL × Wave D Gardner GMM × Wave E.2/follow-up stratified-Conley + serial Bartlett meat. **Mechanical realization (one new `_compute_gmm_corrected_meat` kwarg):** the gamma_hat / Psi build stays on SURVEY-FINITE-MASK inputs (`X_1_sparse_fit`, `X_10_sparse_fit`, `eps_10_fit` built on `survey_finite_mask = finite_mask & survey_weights > 0`; `X_2_kept_gamma`, `eps_2_fit_gamma`, `survey_weights_fit_gamma` projected from the fit-sample frame down to survey_finite_mask) so the drop-first stage-1 FE column space is bit-identical to the pre-E.3 path. `_compute_gmm_corrected_meat` gains a new optional kwarg `score_pad_mask: Optional[np.ndarray] = None`: when supplied, the helper zero-pads the fit-sample `Psi` to full panel length AFTER construction but BEFORE kernel dispatch via `Psi_padded[score_pad_mask] = Psi`. Kernel-dispatch arrays (`cluster_ids`, `conley_coords`, `conley_time`, `conley_unit`, `resolved_survey`) are passed at FULL length so the meat helpers (Binder TSL / stratified-Conley / serial Bartlett) see the full-domain PSU / strata / centroid / time geometry. The `_validate_conley_kwargs` call inside the helper reads `n_for_conley = len(score_pad_mask)` when the kwarg is set so the Conley shape checks see the full-length geometry. **`gamma_hat` invariance:** the gamma_hat solve operates on fit-sample inputs throughout — bit-identical to the pre-E.3 path (critical for the case where `_build_butts_fe_design_csr`'s `pd.factorize` re-compaction would drop a different unit's column under a full-length FE build than under a fit-length one). **Bread invariance:** `A_22 = X_2_kept' W X_2_kept` at `spillover.py:3187-3214` still uses fit-length `X_2_kept` because `A_22_full = X_2_full' W_full X_2_full` equals `A_22_kept` when zero-weight rows contribute zero. **A2 invariant:** warn-and-drop and `SurveyDesign.subpopulation()` drops are treated identically — both apply the zero-pad mechanism. The "both mechanisms compose cleanly" case (subpop-excluded row that is ALSO warn-and-dropped) produces `Psi = 0` from either cause; the PSU still counts toward `n_psu_full`. Hand-computation methodology anchor at `_scratch/wave_e3_smoke.py` codifies the A2 invariant on 4 PSU × 4 period × 3 obs synthetic. **Subpopulation parity vs upstream-subset:** `df_survey` matches the full domain regardless of how many rows the subpopulation mask excludes (mirrors R `svyglm(design=subset(d, mask))` vs `svyglm(design=svydesign(data=data[mask], ...))`). SE may differ by design — subpopulation retains zero-padded PSU geometry; upstream-subset drops PSUs entirely. **Pre-E.3 baseline parity:** when `finite_mask.all() == True` AND all weights `> 0`, the Wave E.3 zero-pad is a no-op — ATT + SE + n_psu + df_survey match pre-E.3 baseline values via FIXED GOLDEN values at `test_c` (`rtol=1e-12, atol=1e-12`). **Cross-surface n_psu consistency:** top-level `res.n_psu` reads from `len(resolved_survey_fit.weights)` on the implicit-PSU branch (was `int(finite_mask.sum())` pre-codex-R1-P2-fix); this keeps `res.n_psu == res.survey_metadata.n_psu` on weights-only / strata-only survey designs under warn-and-drop. Regression at `test_c2`. **Restrictions inherited:** replicate-weight variance + subpopulation continues to raise `NotImplementedError` at the Wave E.1 gate. TwoStageDiD's analogous `finite_mask + design-subset` pattern at `two_stage.py:567-601` is NOT yet adopted to Wave E.3 — separate parity follow-up tracked in `TODO.md` (an expected-divergence test was attempted but TwoStageDiD's always-treated handling at `two_stage.py:294-336` differs from SpilloverDiD's per-unit Omega_0 check, so the divergence didn't materialize on the standard fixture; the parity follow-up should add its own targeted regression). **Implementation:** `spillover.py:2845-2896` design-subset block deleted; `survey_weights_fit = survey_weights[finite_mask]` retained for the stage-2 OLS solve which still operates on the fit sample; `cluster_ids_full[finite_mask]` subset dropped on the survey path. `_compute_gmm_corrected_meat` call at `spillover.py:3163` now receives FIT-LENGTH gamma_hat-construction inputs (unchanged) plus FULL-LENGTH kernel-dispatch arrays (`cluster_ids_for_meat`, `conley_*_for_meat`, `resolved_survey_fit`) plus the new `score_pad_mask=survey_finite_mask` kwarg; no-survey path passes `score_pad_mask=None` and uses fit-length variables throughout (bit-identical to pre-E.3). `_compute_gmm_corrected_meat` at `two_stage.py:62-80` adds one new optional kwarg `score_pad_mask: Optional[np.ndarray] = None` and one post-Psi-construction zero-pad block; the `_validate_conley_kwargs` call uses `n_for_conley = len(score_pad_mask)` when the kwarg is set. Within-unit-constancy validator at `spillover.py:2913` updated to operate on full-length unit array. Second `compute_survey_metadata` recompute at `spillover.py:2954-2959` uses full-length `raw_w`. No `_compute_stratified_meat_from_psu_scores` / `_compute_stratified_conley_meat` / `_compute_stratified_serial_bartlett_meat` signature changes. **Tests:** new `TestSpilloverDiDWaveE3SubpopulationFullDesign` and `TestSpilloverDiDWaveE3SubpopulationFullDesignEventStudy` classes in `tests/test_spillover.py` (19 tests: pre-E.3 baseline parity via pinned goldens, n_psu cross-surface consistency on implicit-PSU branch, A2 invariant (zero-pad mechanics via mock-spy), subpopulation × explicit-PSU parity, conley + lag>0 + subpopulation × explicit-PSU / cluster-injection / weights-only branches, cluster-as-PSU + subpopulation parity, unit with BOTH zero weight AND no Omega_0 support, gamma_hat-build sample excludes zero-weight rows, n_obs / n_treated / n_control / n_far_away_obs reflect count_mask, warn-drop SE drift golden, ATT bit-equality under PSU-last-sort exclusion, exact event-study n_obs propagation, event-study on both is_staggered branches with analytical + conley+lag variants). Pre-existing Wave E.1 `test_p2_finite_mask_forces_drop_under_survey` assertion flipped from `n_psu=8` (subset) to `n_psu=10` (full domain) to reflect the new contract.
- **ChaisemartinDHaultfoeuille (DCDH) methodology-review-tracker promotion.** Tracker row flipped **In Progress** → **Complete** with full Verified Components / Test Coverage / Corrections Made / Deviations / Outstanding Concerns structure mirroring the HAD precedent (PR #473) and ContinuousDiD precedent (PR #476). REGISTRY `## ChaisemartinDHaultfoeuille` gains a formal `### Deviations from the paper / from R / library extensions` block consolidating 7 documented deviations into a single AI-review-recognized labeled surface (per CLAUDE.md "Documenting Deviations (AI Review Compatibility)"): (D1) equal-cell weighting (deviation from BOTH AER 2020 Equation 3 AND R `DIDmultiplegtDYN`); (D2) period-based vs cohort-based stable controls; (D3) balanced-baseline panel + interior-gap drops + terminal-missingness retention + cell-period-allocator targeted `ValueError`; (D4) SE normalization `N_l` vs R `G` (~4% smaller analytical SE); (D5) singleton-cohort degeneracy → NaN with `UserWarning`; (D6) `<50%` switcher warning at far horizons (library extension citing Favara-Imbs application, footnote 14 of NBER WP 29873); (D7) Phase 3 `DID^X` covariate first-stage equal-cell weights. R cross-language coverage holds at documented tolerance bands in `tests/test_chaisemartin_dhaultfoeuille_parity.py` (`POINT_RTOL = 1e-4` on pure-direction point estimates, `MIXED_POINT_RTOL = 0.025` on mixed-direction, `PURE_DIRECTION_SE_RTOL = 0.05` on pure-direction SE, `SE_RTOL = 0.10` on multi-horizon SE, `se_rtol=0.15` on the long-panel `L_max=5` joiners-only scenario where cell-count-weighting compounds). No source code changes, no new tests, no new docstrings — consolidation only against the existing 12 methodology tests (`tests/test_methodology_chaisemartin_dhaultfoeuille.py`), 26 R-parity tests (`tests/test_chaisemartin_dhaultfoeuille_parity.py`), 352 unit tests (`tests/test_chaisemartin_dhaultfoeuille.py`), survey suites (`tests/test_survey_dcdh.py`, `tests/test_survey_dcdh_replicate_psu.py`, three cell-period coverage suites), and two primary-source DCDH paper reviews on disk (2020 AER + 2022/2023 NBER WP 29873 via PR #478; the `dechaisemartin-2026-review.md` on disk is HAD's primary source, not DCDH's, and is referenced as adjacent context only). The REGISTRY Deviations block uses semantic section-name anchors (rather than fragile line numbers) for back-references to other parts of the DCDH section — an intentional divergence from the PR #476 ContinuousDiD precedent reflecting PR-A wording-drift CI feedback that flagged line-number cross-references as drift-prone in long sections. `METHODOLOGY_REVIEW.md` DCDH row promoted **In Progress** → **Complete**; L27 In Progress example paragraph re-pointed to WooldridgeDiD; L1289 priority-order queue item #6 (DCDH) removed and items #7-#11 renumbered to #6-#10.

### Changed
- **Internal refactor: dedup serial Bartlett kernel construction and PSD guard between Conley no-survey and TwoStage survey paths.** Extracts `_serial_bartlett_kernel_matrix(t_codes, L)` and `_validate_meat_psd(M, *, error_msg, warning_template, stacklevel=3)` to `diff_diff/conley.py`. Replaces three inline kernel constructions (`conley.py` panel-block branch, `two_stage.py` survey singleton-adjust branch, `two_stage.py` survey multi-PSU branch) and two inline finite-plus-eigvalsh guards (`conley.py::_compute_conley_meat`, `two_stage.py` survey panel-block orchestrator) with helper calls. No behavior change — methodology anchor at `tests/test_spillover.py::TestSpilloverDiDWaveE2FollowupConleySurveyLagCutoff` (21 tests including hand-computed serial Bartlett HAC at L=1) and existing PSD-warning monkey-patch tests at `tests/test_conley_vcov.py::TestConleyDirectHelper::{test_uniform_kernel_negative_eigenvalue_warns, test_indefinite_meat_warning_fires_for_bartlett}` still pass unchanged (substring `"bartlett"` / `"uniform"` / `"negative eigenvalue"` in warning messages preserved byte-for-byte). New `TestSerialBartlettKernelMatrix`-grouped tests in `TestConleyKernels` (5 tests: hand-computed L=2 / L=1 / L=0 degenerate / single-element / int-vs-float bit-equality contract) and new `TestValidateMeatPsd` class (4 tests: non-finite raises with caller's `error_msg`, negative-eigenvalue warns with `{eigval:.2e}` substituted, PSD matrix silent, threshold boundary at -5e-13 silent). Closes `TODO.md` Bartlett-dedup row.

## [3.4.1] - 2026-05-21

### Added
Expand Down
1 change: 0 additions & 1 deletion TODO.md
Original file line number Diff line number Diff line change
Expand Up @@ -155,7 +155,6 @@ Deferred items from PR reviews that were not addressed before merge.
| `SyntheticDiD(vcov_type="conley")` support. Currently raises `TypeError` at `__init__` because SyntheticDiD uses `variance_method ∈ {bootstrap, jackknife, placebo}` rather than the analytical sandwich that Conley plugs into. Wiring would require either reimplementing an analytical sandwich path for SyntheticDiD or designing a spatial-block bootstrap (new methodology, Politis-Romano 1994 territory). | `synthetic_did.py::SyntheticDiD` | follow-up (spillover-conley) | Low |
| `SpilloverDiD(survey_design=...)` replicate-weight variance (BRR / Fay / JK1 / JKn / SDR). Wave E.1 ships Taylor-linearization only. Per Gerber (2026) Appendix A, the IF-reweighting shortcut does NOT apply to TwoStageDiD-class estimators because `gamma_hat` is weight-sensitive; correct support requires per-replicate full re-fit of stage 1 and stage 2 (200+ LoC of test surface beyond E.1). | `spillover.py::SpilloverDiD.fit`, `survey.py::compute_replicate_refit_variance` | follow-up | Low |
| `compute_survey_metadata(resolved_survey, raw_w_for_meta)` helper extraction. Wave E.1/E.3 contain two near-duplicate `raw_w_for_meta` constructions (upstream + post-cluster-injection) that differ only in which point of the resolution pipeline they fire at. Factor out a shared helper that takes `(survey_design, data, [finite_mask])` and returns `(resolved_survey, raw_w_for_meta)` to reduce drift risk between the two paths. Cosmetic; behaviour unchanged. | `spillover.py::SpilloverDiD.fit` | follow-up | Low |
| Serial Bartlett kernel logic duplicated between `diff_diff/two_stage.py::_compute_stratified_serial_bartlett_meat` (survey path) and `diff_diff/conley.py::_compute_conley_meat` panel-block branch (no-survey path). Both compute `K[t,s] = (1 - |t-s|/(L+1)) * 1{|t-s| <= L, t != s}` over dense panel-period codes. Factor out a shared `_serial_bartlett_kernel_matrix(t_codes, L)` helper and a shared post-meat finite + PSD-warning guard so the survey and no-survey paths can't drift on diagnostics or kernel weights. Cosmetic; refactor doesn't change behavior. | `two_stage.py::_compute_stratified_serial_bartlett_meat`, `conley.py::_compute_conley_meat` | follow-up | Low |
| `SpilloverDiD(vcov_type="conley", conley_lag_cutoff > 0, survey_design=...)` no-effective-PSU serial Bartlett HAC. Wave E.2 follow-up ships the panel-block composition when an effective PSU exists (explicit `survey_design.psu` OR injected via `cluster=<col>` per `_inject_cluster_as_psu`). Weights-only / strata-only survey designs WITHOUT a cluster fallback raise `NotImplementedError` at `SpilloverDiD.fit` post-resolution because under the pseudo-PSU = obs-index fallback each pseudo-PSU appears in exactly one period — the per-PSU serial cross-period loop would silently contribute zero. Fix would either derive a unit-level serial fallback for no-PSU designs (mixes IF allocators with the pseudo-PSU spatial term — needs methodology work) or route the serial loop through `conley_unit` with explicit documentation of the IF-allocator asymmetry. Regression goldens vs the effective-PSU shipped path. | `spillover.py::SpilloverDiD.fit`, `two_stage.py::_compute_stratified_serial_bartlett_meat` | follow-up (Wave E.2 follow-up tail) | Low |
| `SpilloverDiD(ring_method="count")` extension. Currently only the nearest-treated-ring specification is exposed. Count-of-treated-in-ring (paper Section 3.2 end) is methodologically supported by Butts but re-introduces functional-form dependence; expose with an explicit kwarg gate and documentation warning. | `spillover.py::SpilloverDiD.fit` | follow-up | Low |
| `SpilloverDiD` data-driven `d_bar` selection (Butts 2021b / Butts 2023 JUE Insight cross-validation). | `spillover.py::SpilloverDiD` | follow-up | Low |
Expand Down
89 changes: 70 additions & 19 deletions diff_diff/conley.py
Original file line number Diff line number Diff line change
Expand Up @@ -365,6 +365,61 @@ def _uniform_kernel(u: np.ndarray) -> np.ndarray:
return (np.abs(u) <= 1.0).astype(np.float64)


def _serial_bartlett_kernel_matrix(t_codes: np.ndarray, L: int) -> np.ndarray:
"""Within-unit Newey-West (1987) Bartlett HAC kernel matrix for serial
correlation in panel data, indexed by panel-wide dense time codes.

Returns the K matrix with ``K[i, j] = 1 - |t_i - t_j| / (L + 1)`` for
``0 < |t_i - t_j| <= L``, else 0. The lag-0 diagonal is excluded so
callers can add this to a spatial within-period meat without
double-counting the diagonal.

Uses the 1-D radial pairwise form (matches conleyreg::time_dist), NOT
Conley 1999 Eq 3.14's 2-D separable product window — see the methodology
lock at :func:`_compute_conley_meat` for context.
"""
t = t_codes.astype(np.float64, copy=False)
lag_mat = np.abs(t[:, None] - t[None, :])
return ((lag_mat <= L) & (lag_mat != 0)).astype(np.float64) * (1.0 - lag_mat / (L + 1.0))


def _validate_meat_psd(
M: np.ndarray,
*,
error_msg: str,
warning_template: str,
stacklevel: int = 3,
) -> None:
"""Finite + PSD guard for sandwich meat matrices. Raises ``ValueError``
on non-finite entries; warns ``UserWarning`` when ``min(eigvalsh(M)) <
-1e-12``.

Parameters
----------
error_msg
Message passed to ``ValueError`` on non-finite entries.
warning_template
Format string for the negative-eigenvalue warning. May contain an
``{eigval}`` placeholder; the caller embeds ``{eigval:.2e}`` directly
in the template so the helper formats the minimum eigenvalue with
scientific notation.
stacklevel
Frame count from inside ``_validate_meat_psd``: ``stacklevel=N``
attributes the warning to the Nth caller above the helper itself.
Default 3 covers a single intermediate frame (the helper's direct
caller's caller); pass an explicit value matching call-site depth.
"""
if not np.all(np.isfinite(M)):
raise ValueError(error_msg)
eigvals = np.linalg.eigvalsh(M)
if eigvals.size and eigvals.min() < -1e-12:
warnings.warn(
warning_template.format(eigval=eigvals.min()),
UserWarning,
stacklevel=stacklevel,
)


def _compute_spatial_bartlett_meat_sparse(
S: np.ndarray,
coords: np.ndarray,
Expand Down Expand Up @@ -957,37 +1012,33 @@ def _spatial_meat_for_mask(mask: Optional[np.ndarray] = None) -> np.ndarray:
mask_u = unit_arr == u_val
scores_u = scores[mask_u]
# Use dense panel-period codes (NOT raw labels) for lag math.
t_u = time_codes[mask_u].astype(np.float64)
lag_mat = np.abs(t_u[:, None] - t_u[None, :])
K_u = ((lag_mat <= L) & (lag_mat != 0)).astype(np.float64) * (
1.0 - lag_mat / (L + 1.0)
)
K_u = _serial_bartlett_kernel_matrix(time_codes[mask_u], L)
meat += scores_u.T @ K_u @ scores_u
if not np.all(np.isfinite(meat)):
raise ValueError(
"Conley meat contains non-finite values; check residuals and "
"score matrix for NaN/Inf."
)

# PSD guard. Neither the uniform kernel (Conley 1999 fn 11) nor the
# radial 1-D Bartlett specialization is formally PSD-guaranteed —
# Conley's explicit PSD Bartlett formula (Eq 3.14) is the 2-D separable
# product window, not the 1-D radial pairwise form that R `conleyreg`,
# Stata `acreg`, and this implementation use. Check both kernels.
eigvals = np.linalg.eigvalsh(meat)
if eigvals.size and eigvals.min() < -1e-12:
warnings.warn(
# ``{eigval:.2e}`` is a literal placeholder for ``_validate_meat_psd``;
# only ``{kernel!r}`` is interpolated by the f-string here.
_validate_meat_psd(
meat,
error_msg=(
"Conley meat contains non-finite values; check residuals and "
"score matrix for NaN/Inf."
),
warning_template=(
f"Conley meat with conley_kernel={kernel!r} has a materially "
f"negative eigenvalue ({eigvals.min():.2e}); the variance "
"negative eigenvalue ({eigval:.2e}); the variance "
"estimator is not guaranteed PSD on this design. Both "
"supported kernels (radial bartlett and uniform) are "
"practitioner specializations of Conley 1999 and are not "
"formally PSD-guaranteed; consider varying conley_cutoff_km "
"or reviewing the design for collinearity / degenerate "
"residual structure.",
UserWarning,
stacklevel=3,
)
"residual structure."
),
stacklevel=4,
)

return meat

Expand Down
Loading
Loading