Paper-focused reproducibility repository for Attention Is Not All You Need for Diffraction.
This repo reproduces the paper-facing table rows and figure layer for the mixed-curriculum results:
- benchmark summary rows from bundled JSON artifacts
- supplemental positional-ablation benchmark rows from bundled JSON artifacts
- topology-distance figure
- topology-flow figure set
The repo does not bundle model checkpoints or benchmark HDF5 files. Those come from:
- Zenodo checkpoints: see reproducibility/checkpoint_manifest.csv
- external benchmark/trainready datasets: see reproducibility/dataset_manifest.csv
Because the upstream RRUFF-derived benchmark files are not redistributed here, this repo publishes the benchmark-construction algorithms instead.
The repo does bundle compact reviewer-facing artifacts:
- two example diffraction CSVs derived from the paper benchmark
- their paired JSON metadata
- SG/EG lookup CSVs
- a compact prior JSON/CSV
- a compact precomputed RRUFF-325 summary JSON
Those are sufficient for the shipped notebook walkthrough without Box or the full RRUFF benchmark.
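As a quick orientation before opening the notebook, a minimal sketch that reports which of these compact reviewer artifacts are present in a checkout (the paths are taken from the wrapper-script outputs shown later in this README; adjust if the layout differs):

```python
from pathlib import Path

# Paths produced by the reviewer-support scripts documented below.
EXPECTED_ASSETS = [
    "assets/reviewer_examples",                              # example diffraction CSVs + paired JSON metadata
    "results/reviewer/ext_group_priors.json",                # compact prior
    "results/reviewer/rruff325_precomputed_inference.json",  # precomputed RRUFF-325 summary
]

def missing_assets(root="."):
    """Return the subset of EXPECTED_ASSETS not present under root."""
    root = Path(root)
    return [a for a in EXPECTED_ASSETS if not (root / a).exists()]

if __name__ == "__main__":
    missing = missing_assets()
    if missing:
        print("missing reviewer assets:", *missing, sep="\n  ")
    else:
        print("all reviewer assets present")
```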
Reviewer-facing notebook support is documented in:
Supported notebook usage paths:
- local machine with the train/eval environment and a released checkpoint
- TACC TAP on Stampede3 with the same repo checkout and checkpoint placement
Google Colab is plausible for the lightweight checkpoint-only reviewer demo, but it is not the primary validated path.
Zenodo archival package:
- DOI: 10.5281/zenodo.19558452
- Record: zenodo.org/records/19558452
Current archive split:
- GitHub repo:
- code
- notebooks
- benchmark-construction scripts
- paper-facing wrappers and docs
- Zenodo:
- checkpoints
- compact result JSONs
- configs
- launchers
- short archival manifests/notes
- reviewer compact assets:
- packaged separately in reviewer_compact_assets.tar.gz
For paper tables and figures:
conda env create -f environment.yml
conda activate paper-ai-diffraction
pip install -e .

For checkpoint evaluation or training reruns:
conda env create -f environment-train-eval.yml
conda activate paper-ai-diffraction-train-eval
pip install -e .

TACC-specific notes are in:
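A small guard, assuming conda's standard CONDA_DEFAULT_ENV variable, that a wrapper script could use to confirm one of the two environments above is active before running anything heavy (a sketch, not part of the shipped scripts):

```python
import os

# The two conda envs defined by environment.yml and environment-train-eval.yml.
PAPER_ENVS = {"paper-ai-diffraction", "paper-ai-diffraction-train-eval"}

def is_paper_env(name):
    """True if name is one of the two repo-defined conda environments."""
    return name in PAPER_ENVS

if __name__ == "__main__":
    env = os.environ.get("CONDA_DEFAULT_ENV", "")
    print("ok" if is_paper_env(env) else f"warning: unexpected env {env!r}")
```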
Checkpoints are downloaded from Zenodo and should be placed under:
external/checkpoints/ # core paper checkpoints and supplemental ViT checkpoints (Tables S3–S7, Figs S3/S5), placed flat
The exact filenames and expected local paths are listed in:
This manifest includes every checkpoint explicitly named in the manuscript main text or supplement. Exploratory checkpoints are still archived, but are labeled accordingly in the notes column.
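Checkpoint placement can be verified against the manifest with a short sketch like the following. It assumes the manifest has a filename column; check reproducibility/checkpoint_manifest.csv for the real header before relying on it:

```python
import csv
from pathlib import Path

def unplaced_checkpoints(manifest_csv, root="external/checkpoints"):
    """Return manifest filenames that are not yet present under root.

    Assumes a 'filename' column in the manifest CSV; adjust to the
    actual header of reproducibility/checkpoint_manifest.csv.
    """
    root = Path(root)
    with open(manifest_csv, newline="") as fh:
        return [row["filename"] for row in csv.DictReader(fh)
                if not (root / row["filename"]).exists()]
```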
External benchmark and trainready datasets are not redistributed in this repo. Their required environment variables and example source paths are listed in:
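The only dataset-related variable named in this README is CANONICAL_CSV (used by the topology-flow figure script below); a generic preflight check for whichever variables the dataset manifest lists might look like:

```python
import os

def missing_env(required):
    """Return names of required environment variables that are unset or empty."""
    return [v for v in required if not os.environ.get(v)]

# CANONICAL_CSV is the one variable shown in this README; extend the list
# from reproducibility/dataset_manifest.csv.
REQUIRED = ["CANONICAL_CSV"]
```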
Table rows from bundled paper JSONs:
python scripts/make_main_tables.py

Topology-distance figure from bundled failure JSONs:
./scripts/make_topology_distance_figure.sh

Topology-flow figure set from bundled failure JSON plus external canonical CSV:
export CANONICAL_CSV=/path/to/canonical_extinction_to_space_group.csv
./scripts/make_topology_flow_figure.sh

Calibration sweep figure from the bundled Stage-2c sweep JSON:
./scripts/make_calibration_figure.sh

Curriculum holdout figure from bundled paper values:
python scripts/make_curriculum_real_holdout.py

RRUFF-473 decoder-tradeoff figure from bundled paper values:
python scripts/make_stage_decoder_tradeoffs_rruff473.py

Physics-PE supplementary ruler figure from the bundled checkpoint-curve JSON:
python scripts/make_physics_pe_q2_ruler.py

Reconstruct the frozen RRUFF-473 benchmark from an upstream manifest plus raw XY files:
python scripts/reconstruct_rruff_473.py --manifest-json /path/to/rruff_cukalpha_manifest.json --xy-dir /path/to/xy_raw --reference-manifest-json /path/to/option1_metadata_manifest.json --output-json results/rruff473_reconstruction_summary.json

Rebuild RRUFF-325 deterministically from frozen RRUFF-473:
python scripts/build_rruff_325_from_473.py --input-h5 /path/to/RRUFF_option1_473_with_buckets_maxnorm.hdf5 --output-h5 /path/to/RRUFF_usable_plus_recoverable_325_with_labels_maxnorm.hdf5

Reviewer-support artifact generation:
python scripts/export_prior_asset.py --prior-h5 /path/to/trainready.hdf5 --output-csv results/reviewer/ext_group_priors.csv --output-json results/reviewer/ext_group_priors.json
python scripts/export_rruff_examples.py --benchmark-h5 /path/to/RRUFF_usable_plus_recoverable_325_with_labels_maxnorm.hdf5 --failure-json results/mixed2500k_compare_325_failure_modes_655279.json --output-dir assets/reviewer_examples
python scripts/precompute_benchmark_inference.py --checkpoint external/checkpoints/xrd_model_82ept35h_best.pth --config configs/final_mixed_2500k_dualsource.json --benchmark-h5 /path/to/RRUFF_usable_plus_recoverable_325_with_labels_maxnorm.hdf5 --prior-h5 /path/to/trainready.hdf5 --output-json results/reviewer/rruff325_precomputed_inference.json

If results/reviewer/rruff325_precomputed_inference.json is present, the reviewer notebook can browse the full paper-backed 325-example summary directly instead of recomputing it inside Jupyter.
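The present-or-fall-back behavior described above can be sketched as a small loader; load_precomputed is a hypothetical helper name, and only the JSON path is taken from this README:

```python
import json
from pathlib import Path

PRECOMPUTED = Path("results/reviewer/rruff325_precomputed_inference.json")

def load_precomputed(path=PRECOMPUTED):
    """Return the precomputed RRUFF-325 summary dict if the JSON exists.

    Returns None when the file is absent, signalling the notebook to fall
    back to live checkpoint inference instead.
    """
    path = Path(path)
    if not path.exists():
        return None
    with open(path) as fh:
        return json.load(fh)
```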
- results/ contains only compact paper-backed JSON artifacts.
- results/figures/ is generated output and is not tracked.
- scripts/ contains the canonical paper-facing wrappers.
- scripts/tacc_archive/ contains historical campaign launchers preserved for provenance only.