refactor(data): replace fragile regex in __inc__ and _get_cat_dim with explicit key sets

## Context

During the attribute rename work in #56, we added `neighbor_list` to the regex pattern in `DataMixin.__inc__()` (`nvalchemi/data/data.py:155`):

```python
return self.num_nodes if bool(re.search("(index|face|neighbor_list)", key)) else 0
```

This method determines which fields receive cumulative node-index offsets when batching. The regex approach is implicit and fragile — it matches any field whose name *contains* the substring, so a field like `neighbor_list_shifts` would incorrectly match (flagged by @zubatyuk in [PR #56 review](https://github.com/NVIDIA/nvalchemi-toolkit/pull/56#discussion_r3054078572)).

A similar pattern exists in the Zarr backend's `_get_cat_dim()` (`nvalchemi/data/datapipes/backends/zarr.py:231`):

```python
if bool(re.search("(index|face)", key)):
    return -1
```

In contrast, `_get_field_level()` in the same file already does this correctly — it checks membership in `AtomicData._default_node_keys`, `_default_edge_keys`, and `_default_system_keys`.

## Proposed fix

Replace both regex sites with explicit set membership, as @zubatyuk suggested:

1. Add an `_index_keys` ClassVar on `AtomicData` (or `DataMixin`) containing the canonical set of fields that require node-index offsets (currently `{"neighbor_list"}`).
2. Update `__inc__` to use `key in self._index_keys` instead of regex.
3. Update `_get_cat_dim` to use `AtomicData._index_keys` (or an equivalent edge-key set) instead of regex.

This makes the behavior explicit, self-documenting, and safe against substring collisions.

## Affected files

- `nvalchemi/data/data.py` — `DataMixin.__inc__()`
- `nvalchemi/data/datapipes/backends/zarr.py` — `_get_cat_dim()`

## Origin

Identified during review of PR #56 (attribute rename). Deferred as out of scope for that PR.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(data): replace fragile regex in inc and _get_cat_dim with explicit key sets #58

Context

Proposed fix

Affected files

Origin

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

refactor(data): replace fragile regex in __inc__ and _get_cat_dim with explicit key sets #58

Description

Context

Proposed fix

Affected files

Origin

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

refactor(data): replace fragile regex in inc and _get_cat_dim with explicit key sets #58