
refactor: pair exclude_types as canonical NeighborGraph transform; dpa1 graph path supports exclude_types (decision #18) #5733

Open
wanghan-iapcm wants to merge 37 commits into deepmodeling:master from wanghan-iapcm:feat-graph-pair-exclude

wanghan-iapcm commented Jul 5, 2026

Collaborator

Pair exclude_types as a canonical NeighborGraph transform (decision #18)

Stacked on #5715 (NeighborGraph PR-D). Only the commits after 6fc45bd26 belong to this PR; will rebase onto master once #5715 merges.

Makes pair-type exclusion a single canonical transform applied once at the neighbor-list/graph build seam, and uses it to add descriptor-level exclude_types support to the dpa1 graph path (removing that eligibility gate), consistently across dpmodel, pt_expt, jax, and the C++ inference path.

What changed

  • Canonical graph transform apply_pair_exclusion(graph, atype, pair_excl, *, compact=False) in deepmd/dpmodel/utils/neighbor_graph/: ANDs PairExcludeMask.build_edge_exclude_mask into graph.edge_mask. compact=False is mask-only (shape-static, export/AOTI-safe); compact=True drops masked edges (eager-only; raises on angle-carrying graphs). Idempotent.
  • Atomic-model seam (base_atomic_model) refactored to call the transform for model-level pair_exclude_types; stays as idempotent backstop.
  • dpa1 graph path supports descriptor-level exclude_types: the NotImplementedError and the uses_graph_lower() exclude condition are removed; exclusion is applied inside DescrptBlockSeAtten.call_graph before the segment sums. Graph-vs-dense parity at non-binding sel holds to rtol=atol=1e-12 (attn_layer 0 and 2, both type_one_side settings).
  • Build-time exclusion (dispatcher): build_neighbor_graph and the pt_expt graph builders (dense/ase/vesin/nv) gain optional pair_excl/compact with default post-search application; _call_common_graph passes model-level excludes at build time; oracle set-equality tests per available builder.
  • Dense-nlist port: apply_pair_exclusion_nlist(nlist, atype_ext, pair_excl) extracted from the inline seam code; build_neighbor_list + Vesin/Nv/Default strategies gain pair_excl; return_mode='edges' + pair_excl fails fast.
  • C++ twin: buildPairExcludeTable / applyPairExclusion / applyPairExclusionNlist in source/api_cc/include/commonPT.h, mirroring the Python transforms (same arg order/variable names, cross-referenced docs); pair_exclude_types serialized into .pt2 metadata.json and rebuilt in DeepPotPTExpt::init. Exclusion is compiled into the exported graph (traced seam); the C++ call is an idempotent backstop. New gtest (8 tests) vs Python DeepEval reference at 1e-10.
  • Fix: apply_pair_exclusion uses logical_and + bool cast (array_api_strict rejected bool*bool), caught by the jax/strict consistency rows now traversing the graph path.
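
The mask-only versus compact behavior listed above can be illustrated with a minimal NumPy sketch. The `Graph` dataclass, the `(ntypes, ntypes)` boolean `keep_table`, the `(2, E)` edge_index layout, and the name `apply_pair_exclusion_sketch` are hypothetical stand-ins chosen for illustration, not the deepmd-kit API; the angle-carrying case that the real transform rejects is left out.

```python
from dataclasses import dataclass, replace

import numpy as np


@dataclass(frozen=True)
class Graph:
    """Hypothetical stand-in for a neighbor graph (not the deepmd-kit class)."""

    edge_index: np.ndarray  # (2, E) int; assumed row 0 = neighbor, row 1 = center
    edge_vec: np.ndarray  # (E, 3) float
    edge_mask: np.ndarray  # (E,) bool, True = valid edge


def apply_pair_exclusion_sketch(graph, atype, keep_table, *, compact=False):
    """AND a per-edge type-pair keep mask into graph.edge_mask."""
    src, dst = graph.edge_index
    keep = keep_table[atype[dst], atype[src]]  # (E,) bool
    # logical_and on bool arrays rather than bool * bool
    mask = np.logical_and(graph.edge_mask, keep)
    if not compact:
        # mask-only: every array keeps its shape
        return replace(graph, edge_mask=mask)
    # compact: drop masked edges, so the edge axis becomes data-dependent
    return Graph(graph.edge_index[:, mask], graph.edge_vec[mask], mask[mask])


# two types; exclude the (0, 1) pair in both orders
keep_table = np.ones((2, 2), dtype=bool)
keep_table[0, 1] = keep_table[1, 0] = False

atype = np.array([0, 1, 0])
g = Graph(
    edge_index=np.array([[1, 2, 0, 2], [0, 0, 1, 1]]),
    edge_vec=np.zeros((4, 3)),
    edge_mask=np.ones(4, dtype=bool),
)
once = apply_pair_exclusion_sketch(g, atype, keep_table)
twice = apply_pair_exclusion_sketch(once, atype, keep_table)  # idempotent
dropped = apply_pair_exclusion_sketch(g, atype, keep_table, compact=True)
```

Because the mask is only ever ANDed, applying the transform a second time cannot re-enable an edge, which is what lets the atomic-model seam act as a backstop.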

Known limitations

  • nv builder's pair_excl path has no local oracle test (CUDA-only); to be validated on a GPU box.
  • Input statistics remain on the dense path (graph-native stats is a separate follow-on).
  • smooth_type_embedding + exclude parity untestable at 1e-12 (pre-existing dense sel-padding divergence; see #5715, "feat(dpmodel): graph-native se_atten attention (NeighborGraph PR-D)").
  • Multi-rank (mpirun) exclusion shares the same seam but has no dedicated gtest.
  • build_edge_exclude_mask still returns int32 (bool cast at call sites; follow-up).

🤖 Generated with Claude Code

Spin routing

Spin models auto-inject exclude_types (virtual/placeholder types) into their backbone descriptor; before this PR that condition accidentally kept spin on the dense path. With exclude_types now graph-eligible, spin backbones flipped onto the carry-all graph route, which (a) diverges from the sel-capped reference on sel-binding spin systems and (b) trips a torch-inductor scatter codegen assertion during spin .pt2 export. Fixed explicitly: DescrptDPA1.disable_graph_lower() (not serialized; re-derived structurally) is set in SpinModel.__init__ — the single choke point covering get_spin_model, SpinModel.deserialize, and the pt_expt spin classes — plus a belt-and-braces neighbor_graph_method="legacy" at SpinModel.call_common. Regression tests pin the routing and its serialize→deserialize survival; the full spin export suite (23) and spin checkpoint-interop suite (12) are green.
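
The "not serialized; re-derived structurally" pattern can be sketched as follows. `Descriptor` and `SpinWrapper` are hypothetical toy classes, not the real DescrptDPA1 or SpinModel; only the method names `disable_graph_lower` and `uses_graph_lower` and the attribute name `_graph_lower_disabled` are taken from this PR.

```python
class Descriptor:
    """Hypothetical descriptor with a runtime-only graph-lower switch."""

    def __init__(self, exclude_types=()):
        self.exclude_types = [list(pair) for pair in exclude_types]
        self._graph_lower_disabled = False  # never written to the payload

    def disable_graph_lower(self):
        self._graph_lower_disabled = True

    def uses_graph_lower(self):
        return not self._graph_lower_disabled

    def serialize(self):
        # the flag is deliberately left out
        return {"exclude_types": self.exclude_types}

    @classmethod
    def deserialize(cls, data):
        return cls(exclude_types=data["exclude_types"])


class SpinWrapper:
    """Hypothetical spin model: every construction path runs __init__."""

    def __init__(self, backbone):
        self.backbone = backbone
        # re-derived structurally: wrapping a backbone implies dense routing
        self.backbone.disable_graph_lower()

    def serialize(self):
        return {"backbone": self.backbone.serialize()}

    @classmethod
    def deserialize(cls, data):
        return cls(Descriptor.deserialize(data["backbone"]))


model = SpinWrapper(Descriptor(exclude_types=[(0, 2)]))
restored = SpinWrapper.deserialize(model.serialize())
```

Setting the flag in the constructor rather than storing it keeps the serialized format unchanged while still surviving a serialize/deserialize round trip.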

Verification

Full pt_expt suite: 1196 passed / 39 skipped / 3 failed — the 3 failures (test_dpa4_freeze_to_pt2, test_dpa4_deep_eval_*) are byte-identical on the base commit (pre-existing torch-inductor dpa4 export issue on this box, unrelated). dpmodel exclusion suites 69 passed; consistency dpa1 99 passed/63 skipped (incl. jax + array_api_strict exclude rows); C++ Dpa1PairExcl gtest 8/8.

Summary by CodeRabbit

  • New Features

    • Added support for pair-type exclusions across neighbor-list and neighbor-graph generation.
    • Enabled graph-based attention workflows for more descriptor configurations, including attention layers.
    • Added export checks to better guide compatible graph tracing on supported torch versions.
  • Bug Fixes

    • Improved consistency between dense and graph-based results, including attention and exclusion handling.
    • Fixed spin-model routing to use the supported legacy path for inference.
    • Reduced numerical mismatches in graph exports and improved behavior for empty or masked neighbor cases.

Han Wang and others added 30 commits July 5, 2026 00:03
…ftmax

Built on the existing xp_maximum_at (no new array_api helper needed).
Part of NeighborGraph PR-D (graph-native attention).
Segment-based (global (E,E) boolean deliberately avoided): compact eager
form for carry-all graphs + shape-static nonzero-free form for the
center-major static layout (jit/export/make_fx traceable).
Part of NeighborGraph PR-D; PR-E angles reuse (unordered, no-self).
…r > 0)

DescrptBlockSeAtten.call_graph grows _graph_attention: the dense per-center
(nnei, nnei) attention square becomes the edge-pair axis (center_edge_pairs,
ordered + self-included), softmax over keys becomes segment_softmax grouped
by the query edge. Op-for-op mirror of GatedAttentionLayer.call (head_dim
QKV slicing, normalize q/k/v, temperature/scaling, smooth shift trick,
post-softmax sw and dotr weighting, residual + LayerNorm per layer).

- shape-static adapter path (static_nnei threaded from the dense call
  adapter): bit-exact vs the dense body, rtol 1e-12, full flag matrix
  (attn_layer 1/2 x dotr x smooth x normalize x temperature, binding and
  non-binding sel).
- carry-all (compact) graphs: exact for non-smooth; for smooth the dense
  branch keeps sel-padding slots in the softmax denominator (dense output is
  sel-DEPENDENT, up to ~1e-4) — the carry-all form drops those phantom terms
  by design (user decision 2026-07-03), pinned by a clean-divergence test.
- edge_env_mat(return_sw=True) exposes the per-edge switch (zeroed on
  padding) for the smooth branch.
- uses_graph_lower: attention configs are now graph-eligible (concat tebd,
  no exclude_types still required).
…ial parity

- test_make_fx_graph_attn: graph forward + autograd.grad at attn_layer=2
  traces under make_fx for BOTH smooth branches (the shape-static
  center_edge_pairs form is nonzero-free) — required since pt_expt compiled
  training routes eligible models through the graph lower.
- model-level graph-vs-legacy lower parity now parametrized over
  attn_layer {0, 2} (energy/force/virial/atom_virial, 1e-12 CPU).
- eligibility pins: attention+concat is graph-eligible; se_atten_v2
  (tebd_input_mode='strip') correctly stays dense (strip = later PR;
  the plan's 'se_atten_v2 inherits for free' did not hold).
- linear-model weight tests: pin smooth_type_embedding=False — the standard
  (graph-routed, carry-all) and linear (graph-ineligible, dense) submodels
  otherwise differ by the accepted smooth-attention denominator divergence
  (~1e-6), which is a route artifact, not a weight-combination bug.
- new binding-sel sanity: carry-all graph attention diverges from the
  sel-truncated dense path when sel binds (spec decision deepmodeling#17).
…rity)

neighbor_list=None now takes the carry-all graph default for eligible
attention models; explicit World-1 builders take the legacy dense route.
With smooth attention the two routes differ by design (PR-D), so the
route-equivalence tests pin smooth_type_embedding=False.
…SymInts

The compact (carry-all) pair enumeration used nonzero + tensor-repeat with
Python control flow on their data-dependent sizes, so the attention graph
lower failed torch.export with GuardOnDataDependentSymNode. Register those
sizes as unbacked SymInt sizes (new torch-free xp_hint_dynamic_size shim,
no-op for numpy/jax), take the empty-input fast paths only on concrete int
shapes, build iotas via cumsum(ones)-1 (the array_api_compat arange wrapper
branches on the length in Python), and skip the policy-compression nonzero
when no filter applies (include_self and ordered - the attention default).
Eager numpy/torch results are unchanged.
…> 0)

With the compact pair enumeration unbacked-SymInt-traceable, the carry-all
attention graph lower now exports to a graph-form .pt2 unchanged in ABI
(same 5-tensor NeighborGraph schema, dynamic edge axis) and with carry-all
semantics preserved (no sel truncation, unlike the dense-adapter nlist-form
export). Update the stale freeze-gate message (attention is eligible), add a
symbolic-trace merge gate at attn_layer in {0,2}, parametrize the DeepEval
graph .pt2 fixture over attn_layer (both artifacts: dynamic sizes, PBC and
non-PBC, 1e-10 vs the sel-capped dense reference at non-binding sel), and
add a single-atom zero-real-edge runtime test (the R==0 extreme of the
unbacked sizes).
The dense call is wrapped in @cast_precision, but the graph route's only
float input (edge_vec) lives inside the NeighborGraph dataclass where the
decorator cannot see it, so non-global-precision models (e.g. float32)
crashed with a double-vs-float matmul on the graph route while the dense
route worked. Cast edge_vec down to the descriptor precision on entry and
the outputs back to the caller's dtype on exit (differentiable, so the
model-level force autograd is unaffected). Add an fp32 graph-vs-dense route
parity test at attn_layer 0 and 2.
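
A rough sketch of the entry/exit cast described in this commit, using NumPy only to show the dtype bookkeeping. `call_graph_with_cast` and the toy matmul kernel are hypothetical; NumPy would silently promote mixed dtypes where torch raises, so the sketch cannot reproduce the original crash.

```python
import numpy as np


def call_graph_with_cast(edge_vec, weight):
    """Run a toy kernel in the weight's precision, return the caller's dtype."""
    caller_dtype = edge_vec.dtype
    ev = edge_vec.astype(weight.dtype)  # cast down on entry
    out = ev @ weight  # both operands now share one dtype
    return out.astype(caller_dtype)  # cast back on exit


weight = np.ones((3, 4), dtype=np.float32)  # float32 "descriptor" parameters
edge_vec = np.ones((5, 3), dtype=np.float64)  # float64 geometry from the caller
out = call_graph_with_cast(edge_vec, weight)
```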
…nt-wide NaN

A masked entry whose raw logit exceeds the unmasked per-segment max by more
than the exp overflow threshold (~709 fp64 / ~88 fp32) overflowed exp() to
inf, and the post-hoc inf * 0 mask multiply produced nan, which the
denominator sum then spread across the entire segment. Shift data_for_max
(masked entries already -inf, exp(-inf) == 0 exactly) instead of the raw
data; the mask multiply stays as a defensive no-op. Regression test with a
masked logit 1e5 above the unmasked max. Addresses CodeRabbit review.
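
The fix in this commit can be sketched with NumPy scatter reductions. `segment_softmax_sketch` is a hypothetical illustration, not the deepmd-kit `segment_softmax`; the handling of fully masked segments is an added assumption.

```python
import numpy as np


def segment_softmax_sketch(data, segment_ids, mask, num_segments):
    """Masked softmax within each segment; masked entries get weight 0."""
    # masked entries become -inf before the max AND before the shift
    data_for_max = np.where(mask, data, -np.inf)
    seg_max = np.full(num_segments, -np.inf)
    np.maximum.at(seg_max, segment_ids, data_for_max)
    # assumption: a fully masked segment has max == -inf; use 0 instead so
    # the shift never evaluates (-inf) - (-inf)
    seg_max = np.where(np.isfinite(seg_max), seg_max, 0.0)
    # shifting data_for_max keeps exp() <= 1 on unmasked entries and gives
    # exp(-inf) == 0 exactly on masked ones, so no inf * 0 can appear
    e = np.exp(data_for_max - seg_max[segment_ids])
    denom = np.zeros(num_segments)
    np.add.at(denom, segment_ids, e)
    denom = np.where(denom > 0.0, denom, 1.0)
    return e / denom[segment_ids]


# the masked logit sits 1e5 above the unmasked maximum of its segment
data = np.array([0.0, 1.0, 1e5, 2.0])
segment_ids = np.array([0, 0, 0, 1])
mask = np.array([True, True, False, True])
w = segment_softmax_sketch(data, segment_ids, mask, num_segments=2)
```

Shifting the raw `data` instead would evaluate `exp(1e5 - 1)` for the masked entry, which overflows to inf before the mask multiply can zero it.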
…sh eligibility wording

Address OutisLi review on deepmodeling#5715:
- Notes on DescrptDPA1.call_graph (+ pointers on uses_graph_lower and the
  freeze lower_kind docstring): for smooth_type_embedding=True the carry-all
  graph attention intentionally drops the dense layout's sel-padding terms
  from the softmax denominator - sel-independent semantics that differ from
  the legacy dense lower by up to ~1e-4; bit-tight dense parity holds for the
  non-smooth branch and for the static_nnei dense-adapter realization.
- Update the stale 'dpa1 attn_layer == 0' eligibility wording (freeze()
  docstring, training-path predicate/comment, dpmodel/pt_expt graph-lower
  docstrings) to the actual contract: mixed-types dpa1/se_atten with concat
  type embedding and no exclude_types, attention layers included; call_common
  docs no longer imply unconditional dense parity.
…he pt_expt graph default and legacy escape hatch
…layer and type_one_side

Add @pytest.mark.parametrize for attn_layer in [0, 2] and type_one_side in
[False, True] to test_exclude_types_graph_parity. Also adds the missing parity
assertion (graph vs dense at rtol=atol=1e-12, non-binding sel). Uses
smooth_type_embedding=False to avoid the known by-design softmax denominator
divergence in the dense smooth path.
Descriptor-level exclude_types is now graph-eligible (fully supported via
apply_pair_exclusion). Remove 'no exclude_types' from four docstrings/error
messages that list graph eligibility conditions. The gate condition was removed
in the NeighborGraph implementation; only tebd_input_mode='concat' restriction
remains.

- deepmd/pt_expt/entrypoints/main.py: freeze_model docstring (~502) + ValueError message (~589)
- deepmd/dpmodel/model/make_model.py: forward docstring (~317)
- deepmd/pt_expt/train/training.py: _model_uses_graph_lower docstring (~591)
build_neighbor_graph, build_neighbor_graph_ase, build_neighbor_graph_vesin,
build_neighbor_graph_nv all gain optional keyword-only pair_excl=None and
compact=False; default path = geometric search then apply_pair_exclusion.

_call_common_graph in pt_expt make_model wires atomic_model.pair_excl to
every builder call so model-level pair_exclude_types is applied at build time
(the atomic-model seam backstop stays as idempotent identity).

Oracle tests assert set-equality of the valid-edge set between
builder(pair_excl=X) and builder() + separate apply_pair_exclusion(X),
for dense (2 id + 3 oracle cases) and ase (2 cases); vesin gets 4 new tests
(2 identity, 2 oracle, parametrized over periodic).
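
A minimal sketch of a set-equality oracle of this kind, assuming a `(2, E)` edge_index and a boolean edge_mask. `valid_edge_set` is a hypothetical helper, and it ignores periodic image shifts, which a real oracle would likely need to account for.

```python
import numpy as np


def valid_edge_set(edge_index, edge_mask):
    """Order-independent set of (src, dst) pairs over unmasked edges."""
    src, dst = edge_index[:, edge_mask]
    return set(zip(src.tolist(), dst.tolist()))


# result A: the excluded edge is kept as a masked slot
a_index = np.array([[1, 2, 0], [0, 0, 1]])
a_mask = np.array([False, True, True])
# result B: same valid edges in a different order, excluded edge dropped
b_index = np.array([[0, 2], [1, 0]])
b_mask = np.array([True, True])

same = valid_edge_set(a_index, a_mask) == valid_edge_set(b_index, b_mask)
```

Comparing sets rather than arrays makes the check insensitive to edge order and to whether a builder masks or compacts excluded edges.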
…_neighbor_list/strategies (A4)

Extract the inline pair-exclusion from base_atomic_model.forward_common_atomic
into apply_pair_exclusion_nlist(nlist, atype_ext, pair_excl) in nlist.py.
The seam is refactored to call the named helper (idempotent backstop remains).

Add pair_excl=None to:
- build_neighbor_list (dpmodel, nlist.py)
- DefaultNeighborList.build
- VesinNeighborList.build (pt_expt)
- NvNeighborList.build (pt; CUDA-only, API parity)
- NeighborList base class signature

12 new unit tests covering: None/empty identity, excluded pairs -> -1,
-1 slot preservation, ghost-atom types, idempotence, torch namespace smoke,
build_neighbor_list oracle equivalence, DefaultNeighborList oracle,
VesinNeighborList oracle. NvNeighborList CUDA-only (not validated locally).
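
The dense-nlist behavior exercised by these tests (excluded pairs become -1, existing -1 slots are preserved, ghost-atom types are honored, the result is idempotent) can be sketched for a single frame. The function name, the `(ntypes, ntypes)` boolean `keep_table`, and the single-frame shapes are assumptions for illustration, not the signature of `apply_pair_exclusion_nlist`.

```python
import numpy as np


def apply_pair_exclusion_nlist_sketch(nlist, atype_ext, keep_table):
    """Replace excluded neighbors in a dense nlist with -1.

    nlist: (nloc, nnei) indices into the extended atoms, -1 = empty slot.
    atype_ext: (nall,) types of the extended atoms, local atoms first.
    keep_table: (ntypes, ntypes) bool, False = excluded pair.
    """
    nloc = nlist.shape[0]
    valid = nlist >= 0
    safe = np.where(valid, nlist, 0)  # index safely through empty slots
    type_i = atype_ext[:nloc][:, None]  # (nloc, 1) center types
    type_j = atype_ext[safe]  # (nloc, nnei) neighbor types
    keep = np.logical_and(valid, keep_table[type_i, type_j])
    return np.where(keep, nlist, -1)


keep_table = np.ones((2, 2), dtype=bool)
keep_table[0, 1] = keep_table[1, 0] = False

atype_ext = np.array([0, 1, 0, 1])  # 2 local atoms followed by 2 ghosts
nlist = np.array([[1, 2, -1], [0, 3, 2]])
out = apply_pair_exclusion_nlist_sketch(nlist, atype_ext, keep_table)
again = apply_pair_exclusion_nlist_sketch(out, atype_ext, keep_table)
```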
Han Wang added 6 commits July 5, 2026 02:11
…cross-refs

Serialize the model-level pair_exclude_types into metadata.json so the C++
DeepPotPTExpt loader can rebuild the flat (ntypes+1)^2 keep table and re-apply
the same pair-exclusion transform at the ingestion seam. Add See Also
cross-references between the Python transforms (apply_pair_exclusion,
apply_pair_exclusion_nlist) and their forthcoming C++ twins.
Add buildPairExcludeTable / applyPairExclusion (graph) / applyPairExclusionNlist
(dense) in commonPT.h, structurally mirroring the Python transforms
(apply_pair_exclusion, apply_pair_exclusion_nlist) with the same argument order
and variable names (type_ij, keep). DeepPotPTExpt::init rebuilds the flat
(ntypes+1)^2 keep table from the pair_exclude_types metadata; the seam applies
it before every model call (graph and dense, LAMMPS and standalone) as an
idempotent backstop to the exclusion already compiled into the .pt2.
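
For orientation, a flat (ntypes+1)^2 keep table and its `type_ij` lookup might look like the Python sketch below. The function names are hypothetical, and two points are assumptions rather than facts from this PR: that each listed pair is excluded in both orders, and that the extra index `ntypes` is a placeholder type which the sketch leaves unmasked.

```python
import numpy as np


def build_pair_exclude_table_sketch(ntypes, pair_exclude_types):
    """Flat keep table of length (ntypes + 1) ** 2; 1 = keep, 0 = excluded."""
    n = ntypes + 1  # assumption: index ntypes is a placeholder type
    keep = np.ones((n, n), dtype=np.int32)
    for ti, tj in pair_exclude_types:
        keep[ti, tj] = 0
        keep[tj, ti] = 0  # assumption: exclusion is symmetric
    return keep.reshape(-1)


def lookup_keep(table, ntypes, type_i, type_j):
    """Row-major flat index into the keep table."""
    type_ij = type_i * (ntypes + 1) + type_j
    return table[type_ij]


table = build_pair_exclude_table_sketch(2, [[0, 1]])
```

A flat table keeps the per-edge lookup to a single multiply-add and gather, which is convenient to mirror in C++.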
gen_dpa1_pairexcl.py exports graph-route and dense-route DPA1(attn_layer=0)
.pt2 models with model-level pair_exclude_types=[[0,1]] plus a no-exclusion
baseline; references come from Python DeepEval of each model.

test_deeppot_dpa1_pairexcl_ptexpt.cc validates both C++ ingestion routes
(applyPairExclusion / applyPairExclusionNlist) against the Python references at
1e-10 (fp64), cross-checks graph==dense, and proves the exclusion is active by
comparing against the empty-table baseline. Wired into test_cc_local.sh.
8/8 tests pass locally.
The SpinModel backbone (dpa1 attn_layer=0) was being routed to the
carry-all graph path by the default-flip (decision deepmodeling#17) introduced in
the graph-pair-exclude branch.  Virtual/placeholder types injected by
get_spin_model double the atom density, making the system sel-binding;
the carry-all graph keeps neighbors the capped dense nlist discards,
so SpinModel.call_common diverged from call_common_lower (the dense
lower used by pt_expt eager inference) by ~2e-6.

Fix: pass neighbor_graph_method='legacy' when SpinModel.call_common
invokes the backbone, forcing the dense-nlist path until spin-graph
support is explicitly implemented.

The graph .pt2 export path already has a fail-fast guard for spin
(serialization.py:963).

Adds a regression test pinning that:
- SpinModel.call_common total energy equals backbone(legacy) on the
  same spin-doubled inputs (exact bit-identity).
- The backbone in graph mode gives a DIFFERENT energy at this density,
  confirming the fixture exercises the diverging regime.
The pair-exclude branch removed exclude_types from
DescrptDPA1.uses_graph_lower(), which previously kept spin backbones
(they inject exclude_types) on the dense path. As a side effect the
descriptor-level dispatch routed spin .pt2/.pte export through the
graph kernel, whose scatter/atomic_add tripped a torch-inductor CPU
codegen assertion (23 errors in test_deep_eval_spin.py).

Add an explicit disable_graph_lower() knob on DescrptDPA1 and set it
structurally in SpinModel.__init__ (covers get_spin_model and both
dpmodel/pt_expt deserialize paths, so it survives serialize round
trips). The flag is not serialized; re-derived at construction.
The neighbor_graph_method=legacy kwarg in call_common is kept as
belt-and-braces.
coderabbitai Bot commented Jul 5, 2026

Contributor


📝 Walkthrough


This PR adds model-level and descriptor-level pair-type exclusion support across dense neighbor lists and neighbor-graph builders (dpmodel, pt, pt_expt), extends DPA1 with graph-native attention (including exclusion-aware masking), disables graph-lower for spin models via legacy routing, and wires equivalent pair-exclusion enforcement into the C++ pt_expt inference seam with new tests and a torch-version guard for graph tracing.

Changes

Python pair exclusion and graph-native DPA1 attention

| Layer | File(s) | Summary |
|---|---|---|
| Array-API and segment primitives | deepmd/dpmodel/array_api.py, deepmd/dpmodel/utils/neighbor_graph/segment.py, source/tests/common/dpmodel/test_segment_softmax.py | Adds xp_hint_dynamic_size for export size hints and segment_max/segment_softmax mask-aware reductions with tests. |
| Center edge-pair enumeration | deepmd/dpmodel/utils/neighbor_graph/pairs.py, source/tests/common/dpmodel/test_center_edge_pairs.py | Implements center_edge_pairs in compact and shape-static modes for attention edge enumeration. |
| NeighborGraph pair-exclusion primitive | deepmd/dpmodel/utils/neighbor_graph/graph.py, .../env.py, .../__init__.py, source/tests/common/dpmodel/test_apply_pair_exclusion.py | Adds apply_pair_exclusion, edge_env_mat(return_sw=...), module exports, and masking/compaction/idempotence tests. |
| Dense neighbor-list pair-exclusion | deepmd/dpmodel/utils/{nlist,neighbor_list,default_neighbor_list,__init__}.py, deepmd/pt/utils/nv_nlist.py, deepmd/pt_expt/utils/vesin_neighbor_list.py, source/tests/common/dpmodel/test_apply_pair_exclusion_nlist.py | Adds apply_pair_exclusion_nlist and pair_excl parameters across build_neighbor_list and neighbor-list builder classes. |
| Neighbor-graph builder wiring | deepmd/dpmodel/utils/neighbor_graph/{ase_builder,builder}.py, deepmd/pt_expt/utils/{nv_graph_builder,vesin_graph_builder}.py, source/tests/{common/dpmodel/test_neighbor_graph_builder.py, pt_expt/utils/test_vesin_graph_builder.py} | Threads pair_excl/compact through neighbor-graph builders with oracle-equivalence tests. |
| Atomic-model exclusion integration | deepmd/dpmodel/atomic_model/base_atomic_model.py, source/tests/common/dpmodel/test_graph_atomic_parity.py | Replaces inline exclusion masking with apply_pair_exclusion_nlist/apply_pair_exclusion calls. |
| DPA1 graph-native attention | deepmd/dpmodel/descriptor/dpa1.py, deepmd/dpmodel/model/make_model.py, doc/model/train-se-atten.md, source/tests/common/dpmodel/test_dpa1_*, source/tests/pt_expt/descriptor/test_dpa1.py | Adds _graph_lower_disabled eligibility, static_nnei, _graph_attention/_graph_attention_one_layer, and exclusion-aware sw masking. |
| SpinModel legacy routing | deepmd/dpmodel/model/spin_model.py, source/tests/common/dpmodel/test_spin_model_legacy_routing.py | Disables graph-lower on spin backbone descriptor and forces neighbor_graph_method="legacy". |
| pt_expt wiring and torch-version guard | deepmd/pt_expt/{entrypoints/main.py, model/make_model.py, train/training.py, utils/serialization.py}, source/tests/pt_expt/utils/test_graph_pt2_metadata.py | Forwards pair_excl into builders, adds check_graph_trace_torch_version guard, and exports pair_exclude_types metadata. |
| pt_expt graph-lower parity tests | source/tests/pt_expt/{model/test_dpa1_graph_lower.py, infer/test_graph_deepeval.py, model/test_linear_model.py, utils/test_neighbor_list.py} | Extends parity tests for attention/exclusion, single-atom edges, float32 route, and pins smooth_type_embedding=False. |

C++ pt_expt pair-exclusion ingestion seam

| Layer | File(s) | Summary |
|---|---|---|
| Pair-exclusion table and application | source/api_cc/include/{DeepPotPTExpt.h, commonPT.h}, source/api_cc/src/DeepPotPTExpt.cc | Adds pair_exclude_table_, buildPairExcludeTable/applyPairExclusion/applyPairExclusionNlist, applied in graph and dense compute paths. |
| C++ test suite and model generator | source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc, source/tests/infer/gen_dpa1_pairexcl.py, source/install/test_cc_local.sh | Adds a model generator producing graph/nlist/baseline .pt2 artifacts and a GoogleTest suite validating ingestion-route parity and exclusion activation. |

Estimated code review effort: 4 (Complex) | ~75 minutes

Sequence Diagram(s)

sequenceDiagram
  participant DescrptDPA1
  participant DescrptBlockSeAtten
  participant CenterEdgePairs
  participant SegmentSoftmax
  DescrptDPA1->>DescrptBlockSeAtten: call_graph(graph, atype, static_nnei)
  DescrptBlockSeAtten->>DescrptBlockSeAtten: apply_pair_exclusion(graph, atype, pair_excl)
  DescrptBlockSeAtten->>CenterEdgePairs: center_edge_pairs(dst, edge_mask, static_nnei)
  DescrptBlockSeAtten->>DescrptBlockSeAtten: _graph_attention(gg, rr, dst, sw_e)
  DescrptBlockSeAtten->>SegmentSoftmax: segment_softmax(scores, query_edge, mask)
  SegmentSoftmax-->>DescrptBlockSeAtten: normalized attention weights
  DescrptBlockSeAtten-->>DescrptDPA1: grrg, rot_mat
sequenceDiagram
  participant DeepPotPTExptInit
  participant DeepPotPTExptCompute
  participant PairExcludeTable
  participant ExportedModel
  DeepPotPTExptInit->>PairExcludeTable: buildPairExcludeTable(ntypes, pair_exclude_types)
  DeepPotPTExptCompute->>PairExcludeTable: applyPairExclusion(edge_index, edge_mask, atype)
  DeepPotPTExptCompute->>PairExcludeTable: applyPairExclusionNlist(nlist, atype_ext)
  PairExcludeTable-->>ExportedModel: filtered edge_mask / nlist
  ExportedModel-->>DeepPotPTExptCompute: energy, forces, virial

Possibly related PRs

  • deepmodeling/deepmd-kit#5491: Extends the same neighbor-list builders (VesinNeighborList, DefaultNeighborList/build_neighbor_list) that this PR adds pair_excl support to.
  • deepmodeling/deepmd-kit#5583: Both PRs heavily modify deepmd/dpmodel/descriptor/dpa1.py's uses_graph_lower/call_graph routing and signatures.
  • deepmodeling/deepmd-kit#5714: Extends the same pt_expt carry-all NeighborGraph builders (nv_graph_builder.py, vesin_graph_builder.py) with pair_excl/compact and apply_pair_exclusion.

Suggested labels: enhancement, Python, C++

Suggested reviewers: OutisLi, iProzd

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main change: canonical pair-exclusion handling and DPA1 graph-path support for exclude_types. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

atype = np.array([[0, 1, 0, 1, 0]], dtype=np.int64)
ds = DescrptDPA1(rcut=4.0, rcut_smth=0.5, sel=[200], ntypes=2, attn_layer=0)
ft = InvarFitting("energy", 2, ds.get_dim_out(), 1, mixed_types=True)
am = DPAtomicModel(ds, ft, type_map=["a", "b"])
try:
import ase # noqa: F401
except ImportError as e:
import unittest

coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
source/tests/pt_expt/descriptor/test_dpa1.py (1)

117-147: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Pass mapping to torch.export.export here

test_exportable still goes through the dense fallback because TestCaseSingleFrameWithNlist sets nloc=3 and nall=4, while this export call passes no mapping. That means the new exclude_types case only covers the legacy dense exclusion mask, not the graph-native apply_pair_exclusion path. Add mapping to the exported inputs so the parametrization exercises the intended route.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/pt_expt/descriptor/test_dpa1.py` around lines 117 - 147,
test_exportable is missing the graph mapping input, so it still exercises the
dense fallback instead of the graph-native exclusion path. Update the export
setup in test_exportable to pass mapping into torch.export.export alongside dd0
and the existing inputs, using the test fixture’s mapping source so the
exclude_types parametrization covers apply_pair_exclusion as intended.
🧹 Nitpick comments (7)
deepmd/dpmodel/model/make_model.py (1)

316-322: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Docstring update looks accurate; stale example nearby.

Matches the new DPA1 graph-native attention behavior (attention layers now included in graph eligibility). Note the unchanged _call_common_graph exception message a few dozen lines below ("e.g. dpa1 attn_layer=0") is now a narrower example than what this docstring describes — consider updating that message text for consistency in a follow-up.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/dpmodel/model/make_model.py` around lines 316 - 322, The exception
message in _call_common_graph is now too narrow compared with the updated
graph-native attention behavior described in the nearby docstring. Update the
message text in _call_common_graph so it reflects the broader DPA1
attention-layer graph eligibility instead of only referencing the old “e.g. dpa1
attn_layer=0” example, keeping the wording consistent with the behavior
documented in make_model.py.
source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py (1)

166-179: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Test comment overstates what is actually verified.

The comment says this block will "verify excluded pairs contribute sw == 0" and "check ... call_graph sw channel", but call_graph only returns (grrg, rot_mat) — it has no sw output — and the actual assertions only check for NaN/Inf, not the claimed masking behavior. The earlier out[4] vs ref[4] comparison (lines 162-164) already indirectly validates exclusion parity for sw via the dense reference, so this block is largely redundant and its comment is misleading about intent/coverage. Either remove the stale comment or replace it with an assertion that actually validates zeroed contributions from excluded pairs (e.g., inspect the block's edge_mask/sw_e via se_atten.call_graph directly).

♻️ Suggested comment fix (minimal)
-        if exclude_types:
-            # verify excluded pairs contribute sw == 0 in the dense reference
-            # (atype=[0,1,0,1] -> pairs (0,1) and (1,0) should be masked)
-            # sw shape: (nf, nloc, nnei, 1); just check the graph output is also 0
-            # for excluded-pair edges by checking call_graph sw channel
+        if exclude_types:
+            # additional sanity check on the raw call_graph output (no sw
+            # channel here; exclusion parity for sw is already verified via
+            # out[4] vs ref[4] above).
             graph = from_dense_quartet(ext_coord, nlist, mapping, compact=False)
             atype_local = self.atype.reshape(-1)
-            grrg_g, rot_mat_g = dd.call_graph(
+            grrg_g, _rot_mat_g = dd.call_graph(
                 graph, atype_local, type_embedding=dd.type_embedding.call()
             )
             # no nan/inf in output with exclusions applied
             assert not np.any(np.isnan(grrg_g))
             assert not np.any(np.isinf(grrg_g))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py` around lines
166 - 179, The comment in test_dpa1_call_graph_descriptor is misleading because
this block does not verify excluded-pair sw masking; call_graph only returns
grrg and rot_mat, and the current assertions only check NaN/Inf. Update the test
by either removing/rephrasing the stale comment to match the actual coverage, or
add a real assertion for zeroed excluded-pair contributions by checking the
relevant sw/edge-mask path through se_atten.call_graph or the returned graph
data. The earlier out[4] vs ref[4] comparison already covers sw parity, so keep
this block focused on what it truly validates.
source/tests/common/dpmodel/test_neighbor_graph_builder.py (1)

419-427: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Redundant import unittest.

unittest is already imported at the top of this file; the local re-import inside the except block is unnecessary.

🧹 Proposed cleanup
     @classmethod
     def setUpClass(cls) -> None:
         try:
             import ase  # noqa: F401
         except ImportError as e:
-            import unittest
-
             raise unittest.SkipTest("ase not installed") from e
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/common/dpmodel/test_neighbor_graph_builder.py` around lines 419
- 427, Remove the redundant local import inside test_neighbor_graph_builder’s
setUpClass method: the file already imports unittest, so keep the ImportError
handling but drop the inner import and use the existing unittest.SkipTest
reference when ase is missing.

Source: Linters/SAST tools

deepmd/dpmodel/utils/neighbor_graph/graph.py (1)

192-194: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Docstring overstates what compact=True replaces.

The parameter doc says edge_index, edge_vec, angle_index, angle_mask are all replaced when compact=True. In practice angle_index/angle_mask are never touched by the compact branch — the function only reaches the compaction step after confirming both are None (otherwise it raises NotImplementedError). Listing them as "replaced" could mislead a future implementer extending angle-compaction support into thinking this path already handles it.

📝 Suggested doc fix
     graph
-        The neighbor graph; only ``edge_mask`` (and, if ``compact=True``,
-        ``edge_index``, ``edge_vec``, ``angle_index``, ``angle_mask``) are
-        replaced.
+        The neighbor graph; only ``edge_mask`` (and, if ``compact=True``,
+        ``edge_index`` and ``edge_vec``) are replaced. ``angle_index`` /
+        ``angle_mask`` are never touched — compaction is rejected outright
+        when either is present (see the ``compact`` behavior below).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/dpmodel/utils/neighbor_graph/graph.py` around lines 192 - 194, The
docstring for the neighbor graph parameter overstates the effect of compact=True
by implying that angle_index and angle_mask are also replaced. Update the
documentation in graph.py to say compact mode only compacts edge_index and
edge_vec (along with edge_mask), and make it clear that angle_index and
angle_mask are not handled by this branch because the code path only proceeds
when they are None.
deepmd/dpmodel/utils/neighbor_graph/ase_builder.py (1)

154-163: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Pin device explicitly when converting atype for apply_pair_exclusion.

xp = array_api_compat.array_namespace(coord) followed by xp.asarray(atype) doesn't pin a device, unlike the analogous pair_excl wiring in nv_graph_builder.py and vesin_graph_builder.py, which both use torch.as_tensor(atype, device=<coord's device>). If atype isn't already a tensor on the same device as coord (e.g. a CPU/numpy atype paired with a CUDA coord), xp.asarray will silently produce a CPU tensor, which will then device-mismatch against graph.edge_index/edge_mask inside apply_pair_exclusion.

🔧 Suggested fix
     if pair_excl is not None:
         import array_api_compat

         xp = array_api_compat.array_namespace(coord)
-        atype_flat = xp.reshape(xp.asarray(atype), (-1,))
+        dev = array_api_compat.device(coord)
+        atype_flat = xp.reshape(xp.asarray(atype, device=dev), (-1,))
         graph = apply_pair_exclusion(graph, atype_flat, pair_excl, compact=compact)
     return graph
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/dpmodel/utils/neighbor_graph/ase_builder.py` around lines 154 - 163,
The atype conversion in the ASE neighbor graph path is not explicitly pinned to
coord’s device, so apply_pair_exclusion can receive tensors on the wrong device.
Update the ase_builder flow that builds graph and handles pair_excl to convert
atype the same way as the nv_graph_builder and vesin_graph_builder paths: derive
the device from coord and create atype on that device before flattening and
passing it into apply_pair_exclusion. This keeps the device consistent with
graph.edge_index and graph.edge_mask.
source/tests/common/dpmodel/test_graph_atomic_parity.py (1)

318-344: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Drop the unused model scaffolding. am is never referenced here, so DescrptDPA1, InvarFitting, and DPAtomicModel can be removed from this test.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/common/dpmodel/test_graph_atomic_parity.py` around lines 318 -
344, The test builds unused model scaffolding that is never referenced, so
remove the dead setup from test_apply_pair_exclusion_idempotent. Eliminate the
DescrptDPA1, InvarFitting, and DPAtomicModel construction (including the am
variable) and keep only the inputs actually needed for
extend_input_and_build_neighbor_list, from_dense_quartet, and
apply_pair_exclusion. Make sure the test still covers both the empty and
non-empty pair_exclude_types branches.

Sources: Coding guidelines, Linters/SAST tools

source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc (1)

101-159: 🎯 Functional Correctness | 🔵 Trivial | 🏗️ Heavy lift

Test coverage gap: LAMMPS InputNlist ingestion route not exercised for pair-exclusion.

check_against_ref/all TYPED_TESTs call the 6-arg dp.compute(ener, force, virial, coord, atype, box), which routes to DeepPotPTExpt's standalone (no-nlist, build_nlist-based) compute() overload. The LAMMPS-style InputNlist overload — the actual pair-style ingestion seam, which caches edge_index_tensor/firstneigh_tensor at ago==0 and recomputes geometry via compactEdgeTensors every step before calling applyPairExclusion/applyPairExclusionNlist — is never invoked here. A bug isolated to that branch's node/edge tensor construction (e.g. the multi_rank ? nall_real : nloc node-count selection feeding applyPairExclusion) wouldn't be caught by this suite.

Consider adding a case that drives the InputNlist overload (mirroring the pattern in test_deeppot_dpa1_graph_ptexpt.cc) with pair_exclude_types set, so both C++ ingestion entry points are validated against the Python reference.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc` around lines 101 -
159, Add coverage for the LAMMPS-style InputNlist ingestion path in this
pair-exclusion test, because the current check_against_ref and TYPED_TESTs only
exercise the 6-arg DeepPot::compute route. Introduce a test that calls the
InputNlist compute overload on DeepPotPTExpt, using pair_exclude_types and
matching the pattern used in test_deeppot_dpa1_graph_ptexpt.cc, so the edge/node
tensor caching and applyPairExclusion/applyPairExclusionNlist branch are
validated against the Python reference.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deepmd/pt_expt/entrypoints/main.py`:
- Around line 571-576: Update the stale inline comment in main by removing the
“no type exclusion” restriction so it matches the current graph-eligibility
behavior. Keep the note aligned with `_model_uses_graph_lower` in training.py
and the nearby ValueError message: describe graph lower as opt-in for
graph-eligible models (dpa1 with concat tebd, attention layers, and supported
exclude_types) and preserve the rest of the fail-fast/per-atom-virial
explanation.

In `@doc/model/train-se-atten.md`:
- Around line 160-164: Update the pt_expt training doc sentence describing
graph-eligible descriptors so it no longer says descriptor-level exclude_types
disqualifies the carry-all neighbor-graph path. Use the surrounding
se_atten/neighbor_graph_method explanation to state that mixed-type descriptors
with tebd_input_mode "concat" and no descriptor-level compression remain
graph-eligible, while exclude_types is not a blocking condition anymore. Keep
the dense-vs-graph parity note tied to smooth_type_embedding and attn_layer, but
make the eligibility rule consistent with the current behavior exercised by
test_exclude_types_graph_eligible_and_parity and dd.uses_graph_lower().

---

Outside diff comments:
In `@source/tests/pt_expt/descriptor/test_dpa1.py`:
- Around line 117-147: test_exportable is missing the graph mapping input, so it
still exercises the dense fallback instead of the graph-native exclusion path.
Update the export setup in test_exportable to pass mapping into
torch.export.export alongside dd0 and the existing inputs, using the test
fixture’s mapping source so the exclude_types parametrization covers
apply_pair_exclusion as intended.

---

Nitpick comments:
In `@deepmd/dpmodel/model/make_model.py`:
- Around line 316-322: The exception message in _call_common_graph is now too
narrow compared with the updated graph-native attention behavior described in
the nearby docstring. Update the message text in _call_common_graph so it
reflects the broader DPA1 attention-layer graph eligibility instead of only
referencing the old “e.g. dpa1 attn_layer=0” example, keeping the wording
consistent with the behavior documented in make_model.py.

In `@deepmd/dpmodel/utils/neighbor_graph/ase_builder.py`:
- Around line 154-163: The atype conversion in the ASE neighbor graph path is
not explicitly pinned to coord’s device, so apply_pair_exclusion can receive
tensors on the wrong device. Update the ase_builder flow that builds graph and
handles pair_excl to convert atype the same way as the nv_graph_builder and
vesin_graph_builder paths: derive the device from coord and create atype on that
device before flattening and passing it into apply_pair_exclusion. This keeps
the device consistent with graph.edge_index and graph.edge_mask.

In `@deepmd/dpmodel/utils/neighbor_graph/graph.py`:
- Around line 192-194: The docstring for the neighbor graph parameter overstates
the effect of compact=True by implying that angle_index and angle_mask are also
replaced. Update the documentation in graph.py to say compact mode only compacts
edge_index and edge_vec (along with edge_mask), and make it clear that
angle_index and angle_mask are not handled by this branch because the code path
only proceeds when they are None.

In `@source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc`:
- Around line 101-159: Add coverage for the LAMMPS-style InputNlist ingestion
path in this pair-exclusion test, because the current check_against_ref and
TYPED_TESTs only exercise the 6-arg DeepPot::compute route. Introduce a test
that calls the InputNlist compute overload on DeepPotPTExpt, using
pair_exclude_types and matching the pattern used in
test_deeppot_dpa1_graph_ptexpt.cc, so the edge/node tensor caching and
applyPairExclusion/applyPairExclusionNlist branch are validated against the
Python reference.

In `@source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py`:
- Around line 166-179: The comment in test_dpa1_call_graph_descriptor is
misleading because this block does not verify excluded-pair sw masking;
call_graph only returns grrg and rot_mat, and the current assertions only check
NaN/Inf. Update the test by either removing/rephrasing the stale comment to
match the actual coverage, or add a real assertion for zeroed excluded-pair
contributions by checking the relevant sw/edge-mask path through
se_atten.call_graph or the returned graph data. The earlier out[4] vs ref[4]
comparison already covers sw parity, so keep this block focused on what it truly
validates.

In `@source/tests/common/dpmodel/test_graph_atomic_parity.py`:
- Around line 318-344: The test builds unused model scaffolding that is never
referenced, so remove the dead setup from test_apply_pair_exclusion_idempotent.
Eliminate the DescrptDPA1, InvarFitting, and DPAtomicModel construction
(including the am variable) and keep only the inputs actually needed for
extend_input_and_build_neighbor_list, from_dense_quartet, and
apply_pair_exclusion. Make sure the test still covers both the empty and
non-empty pair_exclude_types branches.

In `@source/tests/common/dpmodel/test_neighbor_graph_builder.py`:
- Around line 419-427: Remove the redundant local import inside
test_neighbor_graph_builder’s setUpClass method: the file already imports
unittest, so keep the ImportError handling but drop the inner import and use the
existing unittest.SkipTest reference when ase is missing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f66e001c-837b-4b14-a6b8-84d867cc8a56

📥 Commits

Reviewing files that changed from the base of the PR and between ffe57a3 and 1b5c75d.

📒 Files selected for processing (48)
  • deepmd/dpmodel/array_api.py
  • deepmd/dpmodel/atomic_model/base_atomic_model.py
  • deepmd/dpmodel/descriptor/dpa1.py
  • deepmd/dpmodel/model/make_model.py
  • deepmd/dpmodel/model/spin_model.py
  • deepmd/dpmodel/utils/__init__.py
  • deepmd/dpmodel/utils/default_neighbor_list.py
  • deepmd/dpmodel/utils/neighbor_graph/__init__.py
  • deepmd/dpmodel/utils/neighbor_graph/ase_builder.py
  • deepmd/dpmodel/utils/neighbor_graph/builder.py
  • deepmd/dpmodel/utils/neighbor_graph/env.py
  • deepmd/dpmodel/utils/neighbor_graph/graph.py
  • deepmd/dpmodel/utils/neighbor_graph/pairs.py
  • deepmd/dpmodel/utils/neighbor_graph/segment.py
  • deepmd/dpmodel/utils/neighbor_list.py
  • deepmd/dpmodel/utils/nlist.py
  • deepmd/pt/utils/nv_nlist.py
  • deepmd/pt_expt/entrypoints/main.py
  • deepmd/pt_expt/model/make_model.py
  • deepmd/pt_expt/train/training.py
  • deepmd/pt_expt/utils/nv_graph_builder.py
  • deepmd/pt_expt/utils/serialization.py
  • deepmd/pt_expt/utils/vesin_graph_builder.py
  • deepmd/pt_expt/utils/vesin_neighbor_list.py
  • doc/model/train-se-atten.md
  • source/api_cc/include/DeepPotPTExpt.h
  • source/api_cc/include/commonPT.h
  • source/api_cc/src/DeepPotPTExpt.cc
  • source/api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc
  • source/install/test_cc_local.sh
  • source/tests/common/dpmodel/test_apply_pair_exclusion.py
  • source/tests/common/dpmodel/test_apply_pair_exclusion_nlist.py
  • source/tests/common/dpmodel/test_center_edge_pairs.py
  • source/tests/common/dpmodel/test_dpa1_call_graph_block.py
  • source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py
  • source/tests/common/dpmodel/test_dpa1_graph_attention_parity.py
  • source/tests/common/dpmodel/test_graph_atomic_parity.py
  • source/tests/common/dpmodel/test_neighbor_graph_builder.py
  • source/tests/common/dpmodel/test_segment_softmax.py
  • source/tests/common/dpmodel/test_spin_model_legacy_routing.py
  • source/tests/infer/gen_dpa1_pairexcl.py
  • source/tests/pt_expt/descriptor/test_dpa1.py
  • source/tests/pt_expt/infer/test_graph_deepeval.py
  • source/tests/pt_expt/model/test_dpa1_graph_lower.py
  • source/tests/pt_expt/model/test_linear_model.py
  • source/tests/pt_expt/utils/test_graph_pt2_metadata.py
  • source/tests/pt_expt/utils/test_neighbor_list.py
  • source/tests/pt_expt/utils/test_vesin_graph_builder.py

Comment on lines +571 to +576
# The graph lower is opt-in and only valid for graph-eligible models
# (dpa1 with concat tebd and no type exclusion; attention layers included
# -- the carry-all pair enumeration exports via unbacked SymInts). Fail
# fast with a clear message rather than emitting a broken .pt2. Enable the
# per-atom virial for the graph form -- it is near-free there (one extra
# scatter off the single shared backward).

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Stale comment: "no type exclusion" contradicts this PR's own change.

This inline comment still describes graph eligibility as "dpa1 with concat tebd and no type exclusion", but the PR's whole premise (per stack context: "dpa1 graph path supports exclude_types... removing the prior eligibility gate") is that exclude_types is now supported on the graph path. The companion predicate _model_uses_graph_lower in training.py (and its docstring) no longer mentions any exclusion restriction, and the ValueError message a few lines below (Line 587) also drops the exclusion caveat — only this comment is stale.

📝 Proposed fix
-    # The graph lower is opt-in and only valid for graph-eligible models
-    # (dpa1 with concat tebd and no type exclusion; attention layers included
-    # -- the carry-all pair enumeration exports via unbacked SymInts). Fail
-    # fast with a clear message rather than emitting a broken .pt2. Enable the
-    # per-atom virial for the graph form -- it is near-free there (one extra
-    # scatter off the single shared backward).
+    # The graph lower is opt-in and only valid for graph-eligible models
+    # (dpa1 with concat tebd, incl. attention layers and exclude_types --
+    # the carry-all pair enumeration exports via unbacked SymInts). Fail
+    # fast with a clear message rather than emitting a broken .pt2. Enable the
+    # per-atom virial for the graph form -- it is near-free there (one extra
+    # scatter off the single shared backward).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/pt_expt/entrypoints/main.py` around lines 571 - 576, Update the stale
inline comment in main by removing the “no type exclusion” restriction so it
matches the current graph-eligibility behavior. Keep the note aligned with
`_model_uses_graph_lower` in training.py and the nearby ValueError message:
describe graph lower as opt-in for graph-eligible models (dpa1 with concat tebd,
attention layers, and supported exclude_types) and preserve the rest of the
fail-fast/per-atom-virial explanation.

Comment on lines +160 to +164
In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list.
The graph path considers all neighbors within the cutoff, so its result does not depend on {ref}`sel <model[standard]/descriptor[se_atten]/sel>`.
When `smooth_type_embedding` is `true` and {ref}`attn_layer <model[standard]/descriptor[se_atten]/attn_layer>` is larger than 0 (the defaults), the dense path keeps `sel`-padding phantom terms in the attention softmax denominator while the graph path drops them, so checkpoints trained under the dense semantics shift by up to about 1e-4 in energy when evaluated on the graph path.
Passing `neighbor_graph_method="legacy"` to the model forward (or the corresponding evaluation option) restores the dense-path numbers exactly.
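
The denominator effect described in this doc passage can be illustrated with a small NumPy sketch. It is a toy model only: the function names, the example logits, and the finite `pad_logit` value given to padding slots are assumptions for illustration, not values taken from the deepmd code. The point is that any finite padding term left in the softmax denominator lowers every real neighbor's weight slightly, whereas a softmax over the real neighbors alone does not.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def dense_attention_weights(logits, n_pad, pad_logit):
    # Dense form: `sel`-style padding slots stay in the softmax denominator.
    # pad_logit is a hypothetical finite value assigned to each padding slot.
    padded = np.concatenate([logits, np.full(n_pad, pad_logit)])
    return softmax(padded)[: logits.size]


def graph_attention_weights(logits):
    # Graph form: only real neighbors enter the softmax.
    return softmax(logits)
```

In this toy model the weights of the real neighbors sum to exactly 1 on the graph form and to slightly less than 1 on the dense form, which is the kind of small, systematic shift the passage attributes to evaluating dense-trained checkpoints on the graph path.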

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Stale "no exclude_types" eligibility gate contradicts this PR's own change.

This new sentence states that descriptor-level exclude_types disqualifies a descriptor from the graph-eligible/carry-all path in pt_expt. But this cohort explicitly removes that gate: test_exclude_types_graph_eligible_and_parity in source/tests/common/dpmodel/test_dpa1_call_graph_descriptor.py asserts dd.uses_graph_lower() is True when exclude_types is set, with the comment "gate: with any exclude list the descriptor must now be graph-eligible." The PR objective also states the eligibility gate for exclude_types was removed. This doc line will mislead users about which models route through the carry-all graph path.

📝 Proposed fix
-In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no descriptor-level `exclude_types` or compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list.
+In the pt_expt backend, graph-eligible descriptors (mixed types, `tebd_input_mode` `"concat"`, no compression) are evaluated by default through the carry-all neighbor-graph path instead of the legacy dense neighbor list. Descriptor-level `exclude_types` no longer disqualifies a descriptor from this path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@doc/model/train-se-atten.md` around lines 160 - 164, Update the pt_expt
training doc sentence describing graph-eligible descriptors so it no longer says
descriptor-level exclude_types disqualifies the carry-all neighbor-graph path.
Use the surrounding se_atten/neighbor_graph_method explanation to state that
mixed-type descriptors with tebd_input_mode "concat" and no descriptor-level
compression remain graph-eligible, while exclude_types is not a blocking
condition anymore. Keep the dense-vs-graph parity note tied to
smooth_type_embedding and attn_layer, but make the eligibility rule consistent
with the current behavior exercised by
test_exclude_types_graph_eligible_and_parity and dd.uses_graph_lower().

@codecov

codecov Bot commented Jul 5, 2026

Codecov Report

❌ Patch coverage is 92.83820% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.20%. Comparing base (40d7a49) to head (1b5c75d).
⚠️ Report is 1 commit behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../api_cc/tests/test_deeppot_dpa1_pairexcl_ptexpt.cc | 79.31% | 12 Missing ⚠️ |
| deepmd/pt_expt/utils/nv_graph_builder.py | 0.00% | 5 Missing ⚠️ |
| deepmd/pt/utils/nv_nlist.py | 40.00% | 3 Missing ⚠️ |
| deepmd/pt_expt/model/make_model.py | 60.00% | 2 Missing ⚠️ |
| source/api_cc/src/DeepPotPTExpt.cc | 86.66% | 0 Missing and 2 partials ⚠️ |
| deepmd/dpmodel/descriptor/dpa1.py | 98.55% | 1 Missing ⚠️ |
| deepmd/dpmodel/utils/neighbor_graph/env.py | 80.00% | 1 Missing ⚠️ |
| source/api_cc/include/commonPT.h | 98.03% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5733      +/-   ##
==========================================
- Coverage   81.29%   81.20%   -0.09%     
==========================================
  Files         990      992       +2     
  Lines      111019   111372     +353     
  Branches     4235     4250      +15     
==========================================
+ Hits        90252    90444     +192     
- Misses      19243    19401     +158     
- Partials     1524     1527       +3     
