Add Stage 1 checkpoint reuse boundary #1076

Draft
anth-volk wants to merge 3 commits into main from agent/stage-1/pr-4-rerun-reuse-checkpoints

@anth-volk (Collaborator)

Fixes #1074

Summary

  • Add checkpoint store and rerun planner abstractions for Stage 1 dataset build substeps.
  • Route checkpoint helper behavior through the store while preserving existing checkpoint path layout and cleanup semantics.
  • Record checkpoint/reuse decisions in substep results, status events, and output contract metadata.
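
For readers unfamiliar with the pattern, the split between a checkpoint store and a rerun planner can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `LocalCheckpointStore`, `ReuseDecision`, `plan_rerun`, the `.checkpoint` file suffix, and the reason strings are all hypothetical names chosen for the example, and the real change preserves the repository's existing checkpoint path layout rather than the one shown here.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable


@dataclass(frozen=True)
class LocalCheckpointStore:
    """Hypothetical store: one marker file per substep under `root`."""

    root: Path

    def path_for(self, substep: str) -> Path:
        # Assumed layout for illustration only.
        return self.root / f"{substep}.checkpoint"

    def exists(self, substep: str) -> bool:
        return self.path_for(substep).is_file()

    def save(self, substep: str, payload: bytes = b"") -> Path:
        path = self.path_for(substep)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(payload)
        return path

    def clear(self, substep: str) -> None:
        # Removing a checkpoint that is already gone is not an error.
        self.path_for(substep).unlink(missing_ok=True)


@dataclass(frozen=True)
class ReuseDecision:
    """Hypothetical record of what the planner decided for one substep."""

    substep: str
    action: str  # "reuse" or "run"
    reason: str


def plan_rerun(
    substeps: Iterable[str],
    store: LocalCheckpointStore,
    force: frozenset[str] = frozenset(),
) -> list[ReuseDecision]:
    """Decide, per substep, whether to reuse a checkpoint or run again."""
    decisions = []
    for name in substeps:
        if name in force:
            decisions.append(ReuseDecision(name, "run", "forced"))
        elif store.exists(name):
            decisions.append(ReuseDecision(name, "reuse", "checkpoint_found"))
        else:
            decisions.append(ReuseDecision(name, "run", "checkpoint_missing"))
    return decisions


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = LocalCheckpointStore(Path(tmp))
        store.save("download")
        for decision in plan_rerun(["download", "clean"], store):
            print(decision)
```

Keeping the planner free of filesystem calls (it only asks the store whether a checkpoint exists) is what lets each decision be recorded as plain data, which is the shape the third bullet describes for substep results and status events.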

Validation

  • uv run --no-sync pytest tests/unit/test_build_dataset_checkpoints.py tests/unit/test_build_dataset_rerun.py tests/unit/test_dataset_build_stage_contract.py tests/unit/test_modal_data_build.py tests/unit/test_pipeline_doc_guards.py tests/unit/test_pipeline_docs_extractor.py
  • uv run --no-sync --with pyyaml python scripts/run_quality_guards.py
  • uv run --no-sync --with pyyaml python scripts/extract_pipeline_docs.py --json /private/tmp/stage1-pr4-docs/pipeline_map.json --api-json /private/tmp/stage1-pr4-docs/pipeline_api.json --markdown /private/tmp/stage1-pr4-docs/pipeline-map.md
  • make lint

anth-volk force-pushed the agent/stage-1/pr-3-command-substep-status branch 5 times, most recently from 454cf30 to 056c46f, on May 21, 2026 19:33
Base automatically changed from agent/stage-1/pr-3-command-substep-status to main on May 21, 2026 21:24