Version
e0015d3
On which installation method(s) does this occur?
No response
Describe the issue
The refactor introduced in #99 makes the pipeline model trainable. However, because of the original design, training solely on energies still tracks gradients with respect to the inputs unnecessarily.
PipelineModelWrapper currently prepares autograd inputs for every PipelineGroup(use_autograd=True) before checking whether derivative outputs are actually requested.
When pipe.model_config.active_outputs == {"energy"}, no derivative helper runs, so inputs like positions do not need requires_grad=True. Today, grad_keys are still collected from submodels’ model_config.autograd_inputs, and _prepare_autograd_leaves(...) still detaches those tensors and calls requires_grad_(True). As a result, we pay the autograd overhead of tracking positions even though no derivative with respect to them is ever requested.
Relevant line numbers based on e0015d3
nvalchemi/models/pipeline.py:688: computes requested_derivatives
nvalchemi/models/pipeline.py:817: _run_autograd_group(...) prepares leaves for autograd groups
nvalchemi/models/pipeline.py:857: _prepare_autograd_leaves(...) marks tensors with requires_grad_(True)
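One possible shape for the fix is to decide which inputs need gradients only after working out which derivative outputs are requested. The sketch below is pure Python and does not use the nvalchemi API: `SubmodelConfig` and `select_grad_keys` are hypothetical names, and the fields only mirror the `ModelConfig` arguments used in the reproducer. The real pipeline may compute `requested_derivatives` once at the wrapper level rather than per submodel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmodelConfig:
    """Hypothetical stand-in for the ModelConfig fields referenced in this issue."""

    outputs: frozenset
    autograd_outputs: frozenset
    autograd_inputs: frozenset


def select_grad_keys(active_outputs, configs):
    """Return the input keys that need requires_grad=True for one autograd group.

    A submodel's autograd inputs are only collected when at least one of its
    derivative outputs is actually requested.
    """
    grad_keys = set()
    for cfg in configs:
        # Derivative outputs this submodel would have to produce for the request.
        requested_derivatives = set(active_outputs) & cfg.autograd_outputs
        if requested_derivatives:
            grad_keys |= cfg.autograd_inputs
    return grad_keys


cfg = SubmodelConfig(
    outputs=frozenset({"energy"}),
    autograd_outputs=frozenset({"forces"}),
    autograd_inputs=frozenset({"positions"}),
)

# Energy-only training: nothing needs to be marked as an autograd leaf.
energy_only = select_grad_keys({"energy"}, [cfg])

# Forces requested: positions must be tracked so the derivative can be taken.
with_forces = select_grad_keys({"energy", "forces"}, [cfg])
```

With an empty key set, the leaf-preparation step would have nothing to detach or mark, while parameter gradients would still flow through the energy as usual.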
Minimum reproducible example
import torch
from torch import nn
from collections import OrderedDict
from nvalchemi.data import AtomicData, Batch
from nvalchemi.models.base import BaseModelMixin, ModelConfig
from nvalchemi.models.pipeline import PipelineGroup, PipelineModelWrapper


class EnergyOnlyModel(nn.Module, BaseModelMixin):
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(1.0))
        self.positions_requires_grad_seen = None
        self.model_config = ModelConfig(
            outputs=frozenset({"energy"}),
            autograd_outputs=frozenset({"forces"}),
            autograd_inputs=frozenset({"positions"}),
            needs_pbc=False,
            active_outputs={"energy"},
        )

    @property
    def embedding_shapes(self):
        return {}

    def compute_embeddings(self, data, **kwargs):
        raise NotImplementedError

    def forward(self, data, **kwargs):
        self.positions_requires_grad_seen = data.positions.requires_grad
        energy = self.scale * data.positions.pow(2).sum().reshape(1, 1)
        return OrderedDict(energy=energy)


data = AtomicData(
    positions=torch.randn(4, 3),
    atomic_numbers=torch.tensor([1, 1, 1, 1]),
    energy=torch.zeros(1, 1),
    forces=torch.zeros(4, 3),
)
batch = Batch.from_data_list([data])
model = EnergyOnlyModel()
pipe = PipelineModelWrapper(groups=[PipelineGroup(steps=[model], use_autograd=True)])
pipe.model_config.active_outputs = {"energy"}
pipe.train()
out = pipe(batch)
out["energy"].sum().backward()

assert model.scale.grad is not None, (
    "energy-only training should still backprop to model parameters"
)
assert model.positions_requires_grad_seen is False, (
    "energy-only autograd pipeline should not mark positions requires_grad=True "
    "when no derivative outputs are requested"
)
Relevant log output
Environment details