feat(models): add UMA (fairchem-core) interatomic-potential wrapper#117

Merged
dallasfoster merged 8 commits into
NVIDIA:mainfrom
dallasfoster:dallasf/uma-wrapper
Jun 25, 2026

@dallasfoster

Collaborator

ALCHEMI Toolkit Pull Request

Description

Add UMAWrapper, a BaseModelMixin-compatible wrapper around fairchem-core's
UMA (Universal Models for Atoms) MLIPPredictUnit, so UMA foundation models can
drive nvalchemi dynamics and inference. UMA is multi-task: one checkpoint
(uma-s-1p1 / uma-s-1p2 / uma-m-1p1) ships heads for OMol, OMat, OC20, ODAC,
and OMC; the wrapper pins a single task at construction (one-wrapper-one-model,
matching MACEWrapper). The conversion is tensor-native (no ASE round trip),
energy is the differentiable primitive, and forces/stress come from autograd.
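
For reference, the relations behind "energy is the differentiable primitive" can be written as below. This is the textbook convention (forces as the negative position gradient, stress as the strain derivative per unit volume, with positions and cell scaled by the symmetric strain $\varepsilon$); the wrapper's exact sign and unit conventions are not stated in this description, so treat it as a sketch rather than a specification.

$$
\mathbf{F}_i = -\frac{\partial E}{\partial \mathbf{r}_i},
\qquad
\sigma_{\alpha\beta} = \frac{1}{V}\left.\frac{\partial E}{\partial \varepsilon_{\alpha\beta}}\right|_{\varepsilon = 0}
$$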

Type of Change

  • New feature (non-breaking change that adds functionality)

Changes Made

  • nvalchemi/models/uma.py: UMAWrapper (task-aware model_config, adapt_input/
    adapt_output, compute_embeddings, from_checkpoint). from_checkpoint exposes
    fairchem's native inference_settings (incl. "turbo"); forward routes the
    one-time lazy-init/MoLE-merge through CPU input to dodge a fairchem device-
    placement bug under turbo on GPU-resident first batches.
  • nvalchemi/_optional.py + nvalchemi/models/__init__.py: register/export UMA.
  • pyproject.toml: uma extra (fairchem-core>=2.0.0); declare uma conflicting
    with mace (e3nn pin) and cu12/cu13 (fairchem torch<2.9 vs toolkit-ops
    torch>=2.11); pin setuptools<81 for fairchem's torchtnt. uv.lock regenerated.
  • examples/advanced/09_uma_nve.py: NVE/NVT/NPT MD example driven by built-in
    LoggingHook + EnergyDriftMonitorHook; NPT exercises the stress path; turbo
    selectable via inference_settings.
  • test/models/test_uma.py: consolidated suite — structural (mock), forward
    equivalence vs FAIRChemCalculator, charged-input response, NVE drift (@slow),
    turbo/compile device path (@slow, CUDA).
  • docs: userguide/models.md (supported-models table + UMA usage / HF-token /
    torch-environment notes), modules/models.rst, models/index.md, examples README.
  • .github/workflows/ci.yml: install a dedicated .venv-uma and run the UMA tests
    from it (gated on UMA-file changes or full runs); optional HF_TOKEN secret.
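
As a rough illustration of the packaging change, a uv-managed pyproject.toml can declare mutually exclusive extras through `tool.uv.conflicts`. The fragment below is a sketch only: the `uma` extra and its `fairchem-core>=2.0.0` requirement come from the list above, while the pairwise grouping and the assumption that `mace`, `cu12`, and `cu13` are all extras (rather than dependency groups) are guesses, not the PR's actual file.

```toml
[project.optional-dependencies]
uma = ["fairchem-core>=2.0.0"]

[tool.uv]
# Each inner list names extras that uv must never resolve together.
# Pairwise sets (assumed) keep mace and cu12/cu13 compatible with each other.
conflicts = [
    [{ extra = "uma" }, { extra = "mace" }],
    [{ extra = "uma" }, { extra = "cu12" }],
    [{ extra = "uma" }, { extra = "cu13" }],
]
```

With conflicts declared this way, `uv sync --extra uma` resolves against its own torch pin instead of failing on the toolkit's.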

Testing

  • Unit tests pass locally (UMA suite: 36 passed with --slow against
    uma-s-1p1 on CUDA; 33 passed / 3 slow-skipped without --slow)
  • Linting passes (make lint)
  • New tests added for new functionality

Equivalence matches FAIRChemCalculator to 1e-4 (OMol energy/forces, OMat
energy/forces/stress); charged OMol runs match the calculator at charge -1;
small (uma-s-1p1/1p2) and medium (uma-m-1p1) checkpoints load and run.
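
A tolerance check of this kind can be sketched in plain Python. The helper names below are hypothetical, the 1e-4 figure is assumed to be an absolute tolerance, and the numbers in the usage lines are placeholders rather than real UMA outputs; the actual suite in test/models/test_uma.py may compare tensors differently.

```python
def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two scalars
    or two equally shaped nested lists of numbers."""
    if isinstance(a, (int, float)):
        return abs(a - b)
    if len(a) != len(b):
        raise ValueError("shape mismatch")
    return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)


def assert_equivalent(wrapper_out, reference_out, atol=1e-4):
    """Raise AssertionError if any reference key differs by more than atol."""
    for key, ref in reference_out.items():
        diff = max_abs_diff(wrapper_out[key], ref)
        if diff > atol:
            raise AssertionError(f"{key}: max |diff| {diff:.2e} exceeds {atol:.0e}")


# Placeholder values standing in for wrapper and calculator results.
wrapper_out = {"energy": -12.34567, "forces": [[0.10001, 0.0, -0.2]]}
reference_out = {"energy": -12.34565, "forces": [[0.1, 0.0, -0.2]]}
assert_equivalent(wrapper_out, reference_out)
```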

Additional Notes

UMA's deps conflict with the mace/cuXX stack, so it must be installed in its
own environment (uv sync --extra uma); it brings its own CUDA-enabled torch
(2.8, cu12.8) and does not use the cuXX GPU stack. Checkpoints are gated on
HuggingFace (facebook/UMA) — CI structural tests run without a token; the
checkpoint-based tests skip unless an HF_TOKEN secret is provided.
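
The token gating described above could look roughly like the following. This sketch uses the standard-library `unittest` as a stand-in (the PR's own markers such as @slow suggest pytest), and `has_hf_token` and `requires_checkpoint` are hypothetical names, not taken from the test suite.

```python
import os
import unittest


def has_hf_token(env=None):
    """True when a non-empty HF_TOKEN is present in the given environment."""
    env = os.environ if env is None else env
    return bool(env.get("HF_TOKEN", "").strip())


# Decorator that skips a test (or a whole TestCase) when no token is set.
requires_checkpoint = unittest.skipUnless(
    has_hf_token(), "HF_TOKEN not set; gated checkpoint unavailable"
)


class StructuralTests(unittest.TestCase):
    """Mock-based tests: always run, no download needed."""

    def test_runs_without_token(self):
        self.assertTrue(True)


@requires_checkpoint
class CheckpointTests(unittest.TestCase):
    """Tests that would download a gated checkpoint: skipped without a token."""

    def test_placeholder(self):
        self.assertTrue(has_hf_token())
```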

Tip

This repository uses Greptile, an AI code review service, to help conduct
pull request reviews. We encourage contributors to read and consider suggestions
made by Greptile, but note that human maintainers will provide the necessary
reviews for merging: Greptile's comments are not a qualitative judgement
of your code, nor are they an indication that the PR will be accepted/rejected.
We encourage the use of emoji reactions to Greptile comments, depending on
their usefulness and accuracy.

Signed-off-by: Dallas Foster <dallasf@nvidia.com>
@dallasfoster dallasfoster requested a review from laserkelvin June 16, 2026 20:42
@copy-pr-bot

copy-pr-bot Bot commented Jun 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@dallasfoster

Collaborator Author

/ok to test 8344178

@greptile-apps

greptile-apps Bot commented Jun 16, 2026

Contributor

Greptile Summary

This PR introduces UMAWrapper, a BaseModelMixin-compatible wrapper around fairchem-core's MLIPPredictUnit, enabling UMA foundation model checkpoints to drive nvalchemi dynamics and inference. The integration is tensor-native (no ASE round-trip), task is pinned at construction, and forces/stress are derived via autograd.

  • nvalchemi/models/uma.py: New UMAWrapper with adapt_input/adapt_output, task-aware ModelConfig, compute_embeddings, from_checkpoint, and a one-shot _cpu_route_first_forward flag to work around fairchem's turbo/MoLE device bug on the first call.
  • Packaging & CI: Adds a uma extra with fairchem-core declared as conflicting with mace and cuXX; CI runs UMA tests from an isolated .venv-uma environment gated on UMA-file changes, with optional HF_TOKEN for checkpoint-based tests.
  • Tests: Consolidated structural (mock) and checkpoint-based suite covering adapt_input/adapt_output, forward equivalence vs FAIRChemCalculator, charged input, NVE drift (@slow), and turbo/compile device path (@slow, CUDA).
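
The one-shot routing idea can be illustrated with a small mock that needs neither torch nor fairchem. Only the flag name `_cpu_route_first_forward` comes from the summary above; the class, the `to_device` helper, and the dict-based "batch" are hypothetical stand-ins, and the real wrapper may clear the flag at a different point or move outputs back to the caller's device.

```python
class FirstCallCpuRouter:
    """Sends only the first successful call through CPU input."""

    def __init__(self, predict, to_device):
        self._predict = predict
        self._to_device = to_device
        self._cpu_route_first_forward = True

    def __call__(self, batch, device):
        if self._cpu_route_first_forward:
            # The first call triggers lazy init, so feed it CPU input.
            out = self._predict(self._to_device(batch, "cpu"))
            # Clear the flag only after success so a failed call retries on CPU.
            self._cpu_route_first_forward = False
            return out
        return self._predict(self._to_device(batch, device))


def to_device(batch, device):
    """Mock device transfer: copy the batch and relabel its device."""
    return {**batch, "device": device}


seen = []


def predict(batch):
    """Mock predictor that records which device each batch arrived on."""
    seen.append(batch["device"])
    return {"energy": 0.0}


router = FirstCallCpuRouter(predict, to_device)
for _ in range(3):
    router({"positions": [], "device": "cuda"}, device="cuda")
print(seen)
```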

Important Files Changed

| Filename | Overview |
| --- | --- |
| nvalchemi/models/uma.py | New UMAWrapper for fairchem's MLIPPredictUnit; optional_inputs advertises "tags" but the adapter reads atom_categories; the two names don't match, so callers following the declared contract get silent zero-tags. |
| test/models/test_uma.py | Comprehensive structural and checkpoint test suite; tests atom_categories passthrough via test_passes_tags but doesn't test that optional_inputs accurately lists it. |
| .github/workflows/ci.yml | Adds isolated .venv-uma step gated on UMA file changes; coverage is correctly appended and HF_TOKEN secret is optional. |
| pyproject.toml | Adds uma extra with fairchem-core>=2.0.0, declares conflicts with mace and cuXX, and pins setuptools<81 for fairchem's torchtnt compatibility. |
| nvalchemi/models/__init__.py | Adds lazy-import and __all__ export for UMAWrapper following the existing pattern. |
| nvalchemi/_optional.py | Registers fairchem.core as an optional dependency under the UMA enum entry; straightforward addition. |
| examples/advanced/09_uma_nve.py | Well-documented NVE/NVT/NPT example; gracefully skips when checkpoint is unavailable and exercises the stress path via NPT. |
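
The name mismatch flagged for uma.py (a declared "tags" input versus an adapter that reads atom_categories) is the kind of bug a small contract check can surface. The sketch below is a toy model of that failure mode: `RecordingInputs`, `toy_adapt_input`, and `contract_mismatch` are hypothetical and do not reproduce the wrapper's real adapter.

```python
class RecordingInputs(dict):
    """Dict that remembers which keys were looked up via .get()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.keys_read = set()

    def get(self, key, default=None):
        self.keys_read.add(key)
        return super().get(key, default)


# Hypothetical declared contract: what callers are told they may pass.
OPTIONAL_INPUTS = {"tags"}


def toy_adapt_input(inputs, num_atoms):
    """Toy adapter: reads a different key and silently falls back to zeros."""
    return inputs.get("atom_categories", [0] * num_atoms)


def contract_mismatch(declared, inputs):
    """Compare declared optional inputs with the keys the adapter actually read."""
    return {
        "declared_but_never_read": sorted(declared - inputs.keys_read),
        "read_but_undeclared": sorted(inputs.keys_read - declared),
    }


# A caller following the declared contract passes "tags"...
inputs = RecordingInputs(tags=[1, 2, 1])
tags = toy_adapt_input(inputs, num_atoms=3)
report = contract_mismatch(OPTIONAL_INPUTS, inputs)
print(tags)    # ...and gets zeros back without any error
print(report)
```

A test asserting that both lists in the report are empty would fail here, which is the coverage gap noted for test/models/test_uma.py.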

Reviews (4): Last reviewed commit: "Merge branch 'main' into dallasf/uma-wra..."

Comment thread nvalchemi/models/uma.py
Comment thread nvalchemi/models/uma.py Outdated
Comment thread nvalchemi/models/uma.py Outdated
dallasfoster added a commit to dallasfoster/nvalchemi-toolkit that referenced this pull request Jun 16, 2026
…IA#45 WIP)

Adopt the canonical UMAWrapper from NVIDIA/nvalchemi-toolkit PR NVIDIA#117 (the
forked UMAWrapper line of work) as the base, while preserving this branch's
distributed `distribution_spec` (halo storage + Triton-kernel OpAdapters) so
the DD work (NVIDIA#45) continues on the up-to-date wrapper.

From NVIDIA#117: torch.compile via fairchem InferenceSettings (no compile_model
flag) + module docstring; inference_settings typed Any (preset name OR an
InferenceSettings instance); turbo/merge_mole CPU-lazy-init workaround in
predict; adapt_input uses data.num_nodes_per_graph + cleaner cell/pbc/charge/
spin handling; dropped the per-atom / energy debug monkey-patches.

Kept (not in NVIDIA#117): the `distribution_spec` property (SPEC_UMA_HALO + the 5
fairchem Wigner Triton OpAdapters with ScatterOutputs on the edge->node
kernel) and the "no distributed_setup needed" note.

Merge is mechanical (take NVIDIA#117's uma.py, splice our distribution_spec before
embedding_shapes); imports cleanly in .venv-uma (fairchem) and ast-parses in
.venv. RUNTIME-UNVALIDATED: the facebook/UMA checkpoint is HF-gated, so
single-process + eager-DD + compile validation is pending an HF token on the
box.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dallasfoster

Collaborator Author

/ok to test 00044fa

@laserkelvin laserkelvin left a comment

Collaborator

Generally looks good to me; the spin default value needs to change though

Comment thread nvalchemi/models/uma.py Outdated
Comment thread nvalchemi/models/uma.py Outdated
Comment thread nvalchemi/models/uma.py Outdated
Comment thread nvalchemi/models/uma.py Outdated
Comment thread nvalchemi/models/uma.py Outdated
Comment thread examples/advanced/09_uma_nve.py
Signed-off-by: Dallas Foster <dallasf@nvidia.com>
@dallasfoster

Collaborator Author

/ok to test 816b230

@dallasfoster dallasfoster added this pull request to the merge queue Jun 25, 2026
Merged via the queue into NVIDIA:main with commit 7c8bb06 Jun 25, 2026
5 checks passed