Add Stage 2 geography assignment boundary #1088

Draft
anth-volk wants to merge 1 commit into agent/stage-2/pr-2d-target-catalog-selection from agent/stage-2/pr-2e-geography-assignment

@anth-volk
Collaborator

Fixes #1086

Summary

  • Add GeographyAssignmentSpec and GeographyAssignmentResult as the Stage 2 geography assignment boundary, including deterministic input identity for seed, household AGI, district AGI targets, and fixed state overrides.
  • Emit geography_assignment_summary.json with clone-major block, county, state, and congressional district availability, lengths, unique counts, checksums, ordering, status, and spec material.
  • Reference the geography summary artifact from the Stage 2 output bundle, runtime manifest outputs, and calibration package contract while preserving existing package pickle geography keys.
  • Update pipeline docs and focused tests for summary JSON, contract validation, deterministic assignment, row-count/order checks, target/package coverage, and tiny Stage 4 geography loading.
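As a rough illustration of the "deterministic input identity" and per-field summary ideas above, here is a minimal sketch. It is not the code in this PR: `input_identity`, `summarize_field`, and the dictionary keys are hypothetical names chosen for the example, and the actual `GeographyAssignmentSpec` and `geography_assignment_summary.json` layouts may differ.

```python
import hashlib
import json


def _sha256_json(obj):
    """Hash an object via canonical JSON (sorted keys, compact separators)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def input_identity(seed, household_agi, district_agi_targets, state_overrides):
    """Hypothetical helper: one digest covering every assignment input.

    Identical inputs give the same digest regardless of dict insertion
    order, so two runs can be compared by identity alone.
    """
    return _sha256_json(
        {
            "seed": int(seed),
            "household_agi": [float(x) for x in household_agi],
            "district_agi_targets": {
                str(k): float(v) for k, v in district_agi_targets.items()
            },
            "state_overrides": {str(k): str(v) for k, v in state_overrides.items()},
        }
    )


def summarize_field(values):
    """Hypothetical helper: availability, length, unique count, checksum.

    The checksum hashes the values in order, so a reordering of the same
    values is detected as a change.
    """
    if values is None:
        return {"available": False}
    items = [str(v) for v in values]
    return {
        "available": True,
        "length": len(items),
        "unique_count": len(set(items)),
        "checksum": _sha256_json(items),
    }


if __name__ == "__main__":
    # Illustrative values only; not taken from the pipeline.
    summary = {
        "input_identity": input_identity(
            seed=42,
            household_agi=[52000.0, 18000.0],
            district_agi_targets={"0601": 1.5e9},
            state_overrides={"7": "06"},
        ),
        "state": summarize_field(["06", "06", "36"]),
        "county": summarize_field(None),
    }
    print(json.dumps(summary, indent=2, sort_keys=True))
```

Hashing canonical JSON rather than a pickle or `repr` keeps the digest independent of dict ordering and easy to inspect, at the cost of converting every value to a JSON-friendly type first.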

Validation

  • uv run --no-sync python -m py_compile policyengine_us_data/calibration_package/geography.py policyengine_us_data/calibration_package/payload.py policyengine_us_data/calibration_package/specs.py policyengine_us_data/calibration_package/__init__.py policyengine_us_data/calibration/unified_calibration.py policyengine_us_data/stage_contracts/calibration_package.py policyengine_us_data/stage_contracts/calibration_package_schema.py
  • uv run --no-sync ruff check policyengine_us_data/calibration/unified_calibration.py policyengine_us_data/calibration_package/__init__.py policyengine_us_data/calibration_package/geography.py policyengine_us_data/calibration_package/payload.py policyengine_us_data/calibration_package/specs.py policyengine_us_data/stage_contracts/calibration_package.py policyengine_us_data/stage_contracts/calibration_package_schema.py tests/unit/calibration_package/test_geography.py tests/unit/calibration_package/test_specs.py tests/unit/test_calibration_package_stage_contract.py tests/unit/test_pipeline_docs_extractor.py
  • uv run --no-sync ruff format --check policyengine_us_data/calibration/unified_calibration.py policyengine_us_data/calibration_package/__init__.py policyengine_us_data/calibration_package/geography.py policyengine_us_data/calibration_package/payload.py policyengine_us_data/calibration_package/specs.py policyengine_us_data/stage_contracts/calibration_package.py policyengine_us_data/stage_contracts/calibration_package_schema.py tests/unit/calibration_package/test_geography.py tests/unit/calibration_package/test_specs.py tests/unit/test_calibration_package_stage_contract.py tests/unit/test_pipeline_docs_extractor.py
  • uv run --no-sync python -m pytest tests/unit/calibration_package/test_geography.py tests/unit/calibration_package/test_payload.py tests/unit/calibration_package/test_specs.py tests/unit/test_calibration_package_stage_contract.py tests/unit/test_pipeline_docs_extractor.py (56 passed)
  • uv run --no-sync python -m pytest tests/unit/build_outputs/test_geography_loader.py (8 passed)
  • uv run --no-sync python -m pytest tests/unit/calibration/test_target_config.py tests/unit/calibration_package/test_targets.py tests/unit/test_pipeline_doc_guards.py tests/unit/test_remote_calibration_runner.py tests/unit/test_pipeline.py (43 passed, 1 skipped)
  • uv run --no-sync python -m pytest tests/unit/calibration/test_unified_calibration.py::test_calibration_package_contract_parameters_track_effective_matrix_mode tests/unit/calibration/test_unified_calibration.py::test_calibration_package_contract_parameters_ignore_unused_chunk_options tests/unit/calibration/test_unified_calibration.py::TestForbesStateOverrides tests/unit/calibration/test_unified_calibration.py::TestGeographyAssignmentCountyFips tests/unit/calibration/test_unified_calibration.py::TestRunCalibrationAgiTargets::test_uses_requested_db_for_district_agi_targets (7 passed)
  • uv run --no-sync --with pyyaml python scripts/extract_pipeline_docs.py --json /private/tmp/us-data-pr-2e-pipeline-docs/pipeline_map.json --api-json /private/tmp/us-data-pr-2e-pipeline-docs/pipeline_api.json --markdown /private/tmp/us-data-pr-2e-pipeline-docs/pipeline-map.md (148 decorated objects)
  • uv run --no-sync --with pyyaml python scripts/run_quality_guards.py
  • git diff --check
  • make lint

Notes

  • The selector emitted the full tests/unit/calibration/test_unified_calibration.py. I ran only the node IDs from that file that relate directly to geography or the package. A collect-only probe for the full file took about 66 seconds and the seven selected tests took about 2:19, so I did not run the whole file, to avoid its known heavy local collection and runtime cost.
