
🐛[BUG]: Default derivative function set will include positions if energies is specified #100

Description

@laserkelvin

Version

e0015d3

On which installation method(s) does this occur?

No response

Describe the issue

The refactor introduced in #99 makes the pipeline model trainable. However, because of the original design, training solely on energies still tracks gradients with respect to inputs such as positions, which is unnecessary.

PipelineModelWrapper currently prepares autograd inputs for every PipelineGroup(use_autograd=True) before checking whether derivative outputs are actually requested.

When pipe.model_config.active_outputs == {"energy"}, no derivative helper runs, so inputs like positions do not need requires_grad=True. Today, grad_keys are still collected from submodels’ model_config.autograd_inputs, and _prepare_autograd_leaves(...) still detaches those tensors and calls requires_grad_(True). As a result, we pay the autograd overhead of tracking positions even though no derivative output needs it.

Relevant line numbers based on e0015d3

nvalchemi/models/pipeline.py:688: computes requested_derivatives
nvalchemi/models/pipeline.py:817: _run_autograd_group(...) prepares leaves for autograd groups
nvalchemi/models/pipeline.py:857: _prepare_autograd_leaves(...) marks tensors with requires_grad_(True)
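
One possible shape for the fix is to gate the collection of grad keys on whether any derivative output is actually active. The snippet below is only a minimal sketch of that idea in plain Python, not the nvalchemi implementation: `StepConfig` and `select_grad_keys` are hypothetical names introduced for illustration, and the only fields assumed from the real `model_config` are `autograd_outputs`, `autograd_inputs`, and `active_outputs` as described above.

```python
from typing import Iterable, NamedTuple


class StepConfig(NamedTuple):
    """Hypothetical stand-in for the model_config fields used in this sketch."""

    autograd_outputs: frozenset
    autograd_inputs: frozenset


def select_grad_keys(active_outputs: Iterable[str], steps: Iterable[StepConfig]) -> set:
    """Return the input keys that would need requires_grad=True for this pass."""
    active = set(active_outputs)
    grad_keys = set()
    for step in steps:
        # Collect a step's autograd inputs only when at least one of its
        # derivative outputs has actually been requested.
        if active & step.autograd_outputs:
            grad_keys |= step.autograd_inputs
    return grad_keys


step = StepConfig(
    autograd_outputs=frozenset({"forces"}),
    autograd_inputs=frozenset({"positions"}),
)

# Energy-only training: nothing should be marked as an autograd leaf.
energy_only_keys = select_grad_keys({"energy"}, [step])

# Forces requested: positions must be tracked.
with_forces_keys = select_grad_keys({"energy", "forces"}, [step])
```

With a check like this placed before the leaf-preparation step, an empty result would let the wrapper skip detaching inputs and calling requires_grad_(True) entirely, while model parameters would still receive gradients through the ordinary backward pass.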

Minimum reproducible example

import torch
from torch import nn
from collections import OrderedDict

from nvalchemi.data import AtomicData, Batch
from nvalchemi.models.base import BaseModelMixin, ModelConfig
from nvalchemi.models.pipeline import PipelineGroup, PipelineModelWrapper


class EnergyOnlyModel(nn.Module, BaseModelMixin):
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(1.0))
        self.positions_requires_grad_seen = None
        self.model_config = ModelConfig(
            outputs=frozenset({"energy"}),
            autograd_outputs=frozenset({"forces"}),
            autograd_inputs=frozenset({"positions"}),
            needs_pbc=False,
            active_outputs={"energy"},
        )

    @property
    def embedding_shapes(self):
        return {}

    def compute_embeddings(self, data, **kwargs):
        raise NotImplementedError

    def forward(self, data, **kwargs):
        self.positions_requires_grad_seen = data.positions.requires_grad
        energy = self.scale * data.positions.pow(2).sum().reshape(1, 1)
        return OrderedDict(energy=energy)


data = AtomicData(
    positions=torch.randn(4, 3),
    atomic_numbers=torch.tensor([1, 1, 1, 1]),
    energy=torch.zeros(1, 1),
    forces=torch.zeros(4, 3),
)
batch = Batch.from_data_list([data])

model = EnergyOnlyModel()
pipe = PipelineModelWrapper(groups=[PipelineGroup(steps=[model], use_autograd=True)])
pipe.model_config.active_outputs = {"energy"}
pipe.train()

out = pipe(batch)
out["energy"].sum().backward()

assert model.scale.grad is not None, (
    "energy-only training should still backprop to model parameters"
)
assert model.positions_requires_grad_seen is False, (
    "energy-only autograd pipeline should not mark positions requires_grad=True "
    "when no derivative outputs are requested"
)
