fix(loss): exclude mixed_type padding atoms from the training loss by wanghan-iapcm · Pull Request #5738 · deepmodeling/deepmd-kit

wanghan-iapcm · 2026-07-05T16:13:22Z

Problem

In the mixed_type data format, short frames are padded with type = -1 ghost atoms up to a fixed nloc, and the real atom count varies per frame within a batch. The training loss normalized by the padded scalar natoms and took unmasked or cross-frame-pooled means, so ghost atoms diluted the force/atomic denominators and mis-normalized the extensive energy/virial/property terms. As a result a padded [3-atom + 5-atom] batch did not produce the same loss/gradient as processing the 3-atom and 5-atom frames separately. Only mixed_type batches are affected; non-mixed training was already exact.

Fix

The per-atom mask (atype >= 0) now reaches the loss under the existing model_dict["mask"] convention: pt_expt already propagates it; the pt (torch.jit) backend recovers it from atype via a new TaskLoss._inject_atom_mask helper called from every pt loss forward (the exported forward drops the model's per-atom mask, so it is recovered training-side only — the exported artifact is untouched).

Every loss term is then normalized per frame so that a padded batch equals the grad-accumulation of the individual frames at their real sizes: per-atom terms (force, atom_ener, atom_pref, atomic dos/tensor, spin real-force, generalized force) use a per-frame masked mean; extensive terms (energy, virial, property) divide by the per-frame real atom count; global already-reduced terms (global dos/tensor) use a plain mean with the previous atom-count weighting dropped. Every change reduces exactly to the previous formula when the mask is all-ones, so non-mixed training is numerically identical (no-op to rounding). Covered across five shared loss types in both backends: deepmd/dpmodel/loss/{ener,ener_spin,dos,tensor,property}.py (which serve pt_expt) and deepmd/pt/loss/{ener,ener_spin,dos,tensor,property}.py.

Two additional fixes surfaced during the work: the extensive property normalization called xp.sum(mask, -1) with a positional axis, which raises TypeError under the array_api_compat torch namespace (every pt_expt extensive-property run) — now axis=-1; and ener_spin's MAE energy and real-force terms were pre-existingly inconsistent with ener.py (they summed over frames without per-atom normalization) — they are now aligned with ener.py, which changes their non-mixed MAE loss values (a deliberate bug fix, see Known Limitations).

Test

New source/tests/common/dpmodel/test_loss_padding.py and source/tests/pt/test_loss_padding.py assert, for every per-atom and extensive term of all five loss types in both backends, that a padded [3+5] batch loss equals the mean of the two frames processed separately, plus an all-ones-mask non-mixed no-op guard per term, and a torch-tensor path through the dpmodel property loss (which reproduces the positional-axis crash on the old form). An audit added invariant coverage for the generalized-force and spin magnetic-force terms and confirmed they are free of padding artifacts.

Known limitations

The tf backend loss is unchanged and retains the same mixed_type behavior (follow-up). The pt-only losses dens, population, denoise are not covered (follow-up). ener_spin's magnetic-force (force_mag) MAE term uses a sum over frames rather than a mean, so it does not satisfy the frame-average invariant — this is not a padding artifact (ghost atoms are correctly excluded via mask_mag), but a separate pre-existing MAE frame-normalization inconsistency, left for a follow-up decision. The enable_atom_ener_coeff path sums ghost atomic energies before the energy reduction (pre-existing; ghost atom_ener is ~0 by convention). Existing mixed_type trainings will not reproduce numerically — the new values are the correct ones. Ghost label forces are assumed ~0 by the dpdata convention; the mask makes the loss robust even if they are not.

Summary by CodeRabbit

New Features
- Improved support for padded, mixed-atom batches across loss calculations.
- Added more consistent handling of masked frames for energy, force, tensor, DOS, property, and spin-related losses.
Bug Fixes
- Fixed loss normalization so ghost atoms and padded entries no longer skew results.
- Made global loss metrics and reported RMSE/MAE values more consistent under masking.
- Preserved intensive-property behavior while improving extensive-property scaling.

…sts can be excluded pt losses build model_pred internally and drop the model mask; recover it from atype in the TaskLoss base and call it from every pt loss forward; pt_expt already carries the mask; enables per-frame loss normalization in later tasks; no-op for non-mixed.

…e padding Atomic terms (ados/acdf for DOSLoss; local tensor for TensorLoss) now use per-frame masked mean (idiom 1) instead of a cross-frame pooled masked mean, ensuring each frame is normalized by its own real-atom count. Global terms (dos/cdf for DOSLoss; global tensor for TensorLoss) now use plain mean (idiom 3), dropping the prior atom-count weighting that violated the grad-accumulation invariant. Both meet the invariant: a padded [3+5]-atom batch yields the same loss as processing each frame separately and averaging. Boolean fancy indexing removed from pt atomic branches. Changes are bit-identical for non-mixed batches (all-ones mask). Applies to deepmd/dpmodel/loss/{dos,tensor}.py (serves pt_expt) and deepmd/pt/loss/{dos,tensor}.py (mirror). TDD: invariant tests confirmed RED before and GREEN after (24/24 passed).

…padding Apply per-frame normalization to every loss term (energy, force, virial, atom_ener, atom_pref, gen_force) in both deepmd/dpmodel/loss/ener.py and deepmd/pt/loss/ener.py. When model_dict["mask"] is present (mixed_type batch), ghost-padded atoms are excluded from the loss signal via Idiom 1 (masked per-atom mean) for force/atom_ener/atom_pref and Idiom 2 (extensive inv**norm_exp weighting) for energy/virial; all sub-branches (mse/mae/huber, intensive/non-intensive) are covered. Non-mixed callers that never inject a mask hit the unchanged else-branch and are bit-identical to the pre-fix behavior. 40 new grad-accumulation-invariant unit tests (20 dpmodel + 20 pt) verify the fix and the no-op property.

…ay metric Add test_huber_grad_accum to TestDPModelEnergyLossAtomEnerGradAccum and TestPTEnergyLossAtomEnerGradAccum verifying the masked Huber atom_ener path satisfies the [3+5]==separate grad-accum invariant. Fix rmse_f in the masked force MSE branch to use the masked per-frame l2 (l2_force_masked / l2_f_masked) rather than the ghost-diluted unmasked l2_force_loss; hoist the masked MSE computation before the use_huber split so it is always in scope for the display metric in both huber and non-huber paths.

…ount In mixed_type batches, ghost atoms (atype=-1) pad short frames to a fixed nloc width. PropertyLoss previously divided pred/label by the scalar natoms (the padded count), causing frames with fewer real atoms to be incorrectly normalized. When model_dict["mask"] is present, use the per-frame real atom count (sum of mask along the atom axis, broadcast over task_dim) instead. The else branch retains the original /natoms behavior so mask-free callers are unchanged. Tests: RED→GREEN for mse/mae/smooth_mae grad-accum invariant in both dpmodel and pt; no-op and intensive guards added.

…type padding Apply the grad-accumulation-invariant per-frame normalization to EnergySpinLoss in both the dpmodel (deepmd/dpmodel/loss/ener_spin.py) and pt (deepmd/pt/loss/ener_spin.py) backends: - Energy (has_e, MSE): idiom 2 — per-frame 1/real_natoms^norm_exp normalization replaces the flat atom_norm**norm_exp * mean() formulation. - Force real (has_fr, MSE): idiom 1 — per-atom masked mean (ncomp=3) replaces the unmasked global mean over all nloc atoms. - Virial (has_v, MSE): idiom 2 — per-frame 1/real_natoms^norm_exp normalization (k=9) replaces atom_norm**norm_exp * mean(). The force_mag / mask_mag (has_fm) path is left completely untouched; mask_mag is the spin virtual-atom mask and is a distinct concept from the padding mask. Each masked branch is guarded by `if "mask" in model_dict/model_pred:` so mask-less callers (non-mixed batches without a padding mask) use the original code path unchanged — the fix is a bit-identical no-op for non-mixed batches. Tests added to source/tests/common/dpmodel/test_loss_padding.py and source/tests/pt/test_loss_padding.py: invariant tests for energy (norm_exp 1 and 2), force_real, and virial; no-op tests for each term; and a guard that confirms force_mag loss is numerically unchanged when a padding mask is present.

… padding Energy/force_real/virial MAE branches in EnergySpinLoss (dpmodel and pt) used padded-scalar natoms or unmasked global means, violating the grad-accumulation invariant for mixed_type batches. Apply the same per-frame masked pattern already used for MSE, mirroring the authoritative treatment in ener.py (Task 3). Also fix the mae=True display metric for energy to use per-frame inv weighting instead of padded atom_norm. force_mag/mask_mag and MSE branches are untouched.

The extensive property mask normalization called xp.sum(model_dict["mask"], -1) with axis positional; array_api_compat's torch namespace declares axis keyword-only, so this raised TypeError for every pt_expt extensive-property training. Use axis=-1, matching the dos/tensor/ener losses. Adds a torch-tensor test through PropertyLoss.call that reproduces the crash on the old form.

…pe padding Apply idiom 1 (per-atom masked mean, ncomp=1) to the has_ae block in both EnergySpinLoss backends so that ghost-padded atoms (mask=0) are excluded from the atom_ener loss normalization. Before this fix the loss was computed as mean(square(diff)) over a flattened [nf*nloc] vector, so ghost atoms diluted the denominator in mixed_type batches. The masked path uses xp.reshape(maskf, (_nf, _nloc, 1)) as a broadcastable column mask, then idiom-1 per-frame sum / dof → mean over frames. The unmasked (no-mask) fallback is unchanged. Also add grad-accum invariant tests for: ener_spin.has_ae (mse/mae, RED→GREEN), gen_force/has_gf (already GREEN — ghost forces are masked before projection so the mean(square) over [nf, ngen] is frame-decomposable), and force_mag MSE (GREEN — n_valid excludes ghost atoms via mask_mag=0; invariant holds when NM is equal across frames, which is the practical spin-model case). force_mag MAE uses xp.sum over frames instead of xp.mean (2x accumulation artifact, pre-existing, independent of mask_mag semantics) and is reported as NEEDS_CONTEXT.

Importing deepmd.pt installs a cuda:9999999 default-device trap, so the property torch test failed only when run after the pt test file (order-dependent). Pin device=cpu on the constructed tensors to make the test hermetic. Source behavior unchanged.

+            _nloc = maskf.shape[1]
+        else:
+            maskf = None
+            inv = None


+        else:
+            maskf = None
+            inv = None
+            _nf = None


+            maskf = None
+            inv = None
+            _nf = None
+            _nloc = None


+            _nloc = maskf.shape[1]
+        else:
+            maskf = None
+            inv = None


+        else:
+            maskf = None
+            inv = None
+            _nf = None


+        else:
+            maskf = None
+            inv = None
+            _nf = None


+            maskf = None
+            inv = None
+            _nf = None
+            _nloc = None


+            _nloc = maskf.shape[1]
+        else:
+            maskf = None
+            inv = None


+        else:
+            maskf = None
+            inv = None
+            _nf = None


+            maskf = None
+            inv = None
+            _nf = None
+            _nloc = None


coderabbitai · 2026-07-05T16:23:43Z

📝 Walkthrough

Walkthrough

Loss functions across dpmodel and pt backends (DOS, energy, energy-spin, tensor, property) are refactored to support per-frame atom masking for padded/mixed-type batches. A new pt helper injects atom masks from atype, local losses use per-frame masked reductions, global losses use unweighted means with mask-derived atom counts, and extensive invariant tests are added for both backends.

Changes

Mask-aware loss normalization

Layer / File(s)	Summary
Atom mask injection helper `deepmd/pt/loss/loss.py`	Adds `TaskLoss._inject_atom_mask`, reconstructing a per-atom mask from `atype` (ghost atoms → 0) when absent from predictions.
DOS loss masked normalization `deepmd/dpmodel/loss/dos.py`, `deepmd/pt/loss/dos.py`	Local DOS/CDF use per-frame masked squared-error reductions normalized by masked degrees of freedom; global DOS/CDF use unweighted mean squared error with mask-derived `atom_num` for RMSE scaling; pt `forward` wraps model output with `_inject_atom_mask`.
Tensor loss masked normalization `deepmd/dpmodel/loss/tensor.py`, `deepmd/pt/loss/tensor.py`	Local tensor loss reshapes into per-frame masked mean; global tensor loss becomes a plain mean of squared error with mask used only to derive `atom_num`.
Energy loss masked normalization `deepmd/dpmodel/loss/ener.py`, `deepmd/pt/loss/ener.py`	Energy, force, virial, atomic energy, atomic-pref, and generalized-force terms branch on a per-frame mask for MSE/MAE/Huber modes; adds `custom_huber_loss` helper.
EnergySpin loss masked normalization `deepmd/dpmodel/loss/ener_spin.py`, `deepmd/pt/loss/ener_spin.py`	Energy, real-force, atom-energy, and virial terms gain masked/unmasked branches with per-frame normalization by real atom counts.
Property loss masked normalization `deepmd/dpmodel/loss/property.py`, `deepmd/pt/loss/property.py`	Non-intensive property normalization divides by per-frame real atom counts derived from a mask when present, else falls back to `natoms`.
Grad-accumulation invariant tests `source/tests/common/dpmodel/test_loss_padding.py`, `source/tests/pt/test_loss_padding.py`	Adds shared padded-batch invariant harnesses and comprehensive test coverage across DOS, tensor, energy, energy-spin, and property losses for both backends, plus unit tests for `_inject_atom_mask`.

Estimated code review effort: 5 (Critical) | ~120 minutes

Possibly related PRs

deepmodeling/deepmd-kit#5156: Both PRs modify the spin-energy virial loss computation in deepmd/pt/loss/ener_spin.py.
deepmodeling/deepmd-kit#5250: The retrieved PR adds/aligns mask propagation in model outputs that this PR's losses now consume for masked normalization.
deepmodeling/deepmd-kit#5345: Both PRs touch the same loss implementations (dos.py, ener.py, ener_spin.py, tensor.py, property.py), with #5345 introducing the losses this PR later refactors.

Suggested reviewers: njzjz, iProzd

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main change: excluding mixed_type padding atoms from training loss calculations.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

source/tests/pt/test_loss_padding.py (2)
1976-2064: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Known force_mag MAE bug documented but left untested.

The comment acknowledges that force_mag MAE fails the grad-accum invariant by a 2x factor due to frame-sum normalization, but only the MSE variant is exercised as a test — there's no xfail/skip test capturing the MAE case, so this documented gap in deepmd/pt/loss/ener_spin.py isn't tracked by CI and could silently regress or be forgotten.

Consider adding an explicit @pytest.mark.xfail test asserting the known MAE force_mag discrepancy, so the fix (or its continued absence) is tracked rather than only documented in a comment.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/pt/test_loss_padding.py` around lines 1976 - 2064, The force_mag
MAE grad-accum discrepancy is only described in comments and not covered by CI.
Add an explicit pytest xfail (or skip) test near
TestPTEnerSpinLossForceMagMSEGradAccum that exercises the MAE path through
_spin_loss_fn/EnergySpinLossPT and asserts the known 2x invariant mismatch, so
the documented behavior is tracked and won’t be forgotten. Keep the existing MSE
test unchanged and use the same make_A, make_B, and make_padded helpers to
locate the relevant loss behavior.
109-122: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Duplicate mock model classes.

_MockModel and _EnerLossMockModel are functionally identical (same __init__/__call__). Consider consolidating into a single shared helper used across all test classes in this file.
♻️ Suggested consolidation
-class _MockModel:
-    """Callable that ignores inputs and returns a fixed model_pred dict.
-    ...
-    """
-
-    def __init__(self, pred: dict):
-        self._pred = pred
-
-    def __call__(self, **kwargs):
-        return dict(self._pred)
-
...
-class _EnerLossMockModel:
-    """Callable returning a fixed model_pred dict (mask pre-populated)."""
-
-    def __init__(self, pred: dict):
-        self._pred = pred
-
-    def __call__(self, **kwargs):
-        return dict(self._pred)
+# Reuse the single _MockModel class defined earlier in this file instead of
+# redefining an identical class for the energy-loss test section.
Also applies to: 628-636
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/pt/test_loss_padding.py` around lines 109 - 122, Consolidate the
duplicate test helpers by removing the redundant `_EnerLossMockModel` and
reusing `_MockModel` across the pt loss tests. Keep the shared behavior in one
helper with the same `__init__` and `__call__` contract, and update the affected
test classes in `test_loss_padding` to instantiate that single helper instead of
maintaining two identical mock model classes. Ensure any existing expectations
around the shallow copy behavior and pre-populated `pred` mask remain unchanged.
deepmd/dpmodel/loss/dos.py (1)
133-143: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoff

Optional: extract the repeated masked per-frame reduction.

This masked "per-frame sum / per-frame dof → mean" idiom is duplicated verbatim in the local CDF block (Lines 160-170) and mirrored in tensor.py and both pt backends. A small shared helper (e.g. masked_per_frame_mse(diff3d, maskf, ndof)) would reduce the maintenance surface. Behavior-preserving; safe to defer.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deepmd/dpmodel/loss/dos.py` around lines 133 - 143, The masked per-frame
reduction in the DOS loss is duplicated across the local DOS and local CDF paths
and mirrored in other backends, so extract it into a shared helper to reduce
maintenance. Create a small utility such as masked_per_frame_mse(diff3d, maskf,
ndof) and have the DOS block in dos.py call it instead of inlining the
"per-frame sum / per-frame dof → mean" logic. Keep behavior identical and reuse
the helper wherever the same masked reduction pattern appears, including the
local CDF block and matching implementations in tensor.py and the pt backends.
source/tests/common/dpmodel/test_loss_padding.py (1)
204-207: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Redundant acdf/cdf tests add no independent coverage.

test_acdf_grad_accum_invariant simply re-invokes test_ados_grad_accum_invariant (and test_cdf_grad_accum_invariant at Lines 282-285 re-invokes test_dos_grad_accum_invariant). The acdf/cdf paths are only exercised incidentally because _make_loss sets both *_ados/*_acdf (and *_dos/*_cdf) prefs to 1.0, so these named tests don't isolate the cumulative-distribution path. Consider either isolating the cdf term (e.g., pref only on cdf/acdf) or dropping the alias tests to avoid implying coverage that isn't independent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@source/tests/common/dpmodel/test_loss_padding.py` around lines 204 - 207, The
acdf/cdf grad-accum tests are just aliases of the ados/dos tests and don’t add
independent coverage. Update the relevant test methods in test_loss_padding
(test_acdf_grad_accum_invariant and test_cdf_grad_accum_invariant) so they
either isolate the cumulative-distribution path by configuring _make_loss prefs
for only acdf/cdf, or remove the redundant alias tests altogether. Use the
existing test_ados_grad_accum_invariant and test_dos_grad_accum_invariant
helpers as the reference point when deciding whether to separate coverage or
drop the wrappers.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@source/tests/common/dpmodel/test_loss_padding.py`:
- Around line 2011-2019: The `force_mag` MAE path in `ener_spin.py` is still
frame-scaled because it uses a sum over per-frame averages instead of a true
frame-wise mean. Update the MAE computation in the `force_mag` loss logic to
normalize across frames consistently with the other `ener_spin` MAE branches,
using the relevant symbols around the `force_mag` reduction code path. If this
change is intentionally deferred, add a reference to the tracking issue in the
surrounding test/comment; otherwise make the loss independent of `nf` so the 2x
effect disappears.

---

Nitpick comments:
In `@deepmd/dpmodel/loss/dos.py`:
- Around line 133-143: The masked per-frame reduction in the DOS loss is
duplicated across the local DOS and local CDF paths and mirrored in other
backends, so extract it into a shared helper to reduce maintenance. Create a
small utility such as masked_per_frame_mse(diff3d, maskf, ndof) and have the DOS
block in dos.py call it instead of inlining the "per-frame sum / per-frame dof →
mean" logic. Keep behavior identical and reuse the helper wherever the same
masked reduction pattern appears, including the local CDF block and matching
implementations in tensor.py and the pt backends.

In `@source/tests/common/dpmodel/test_loss_padding.py`:
- Around line 204-207: The acdf/cdf grad-accum tests are just aliases of the
ados/dos tests and don’t add independent coverage. Update the relevant test
methods in test_loss_padding (test_acdf_grad_accum_invariant and
test_cdf_grad_accum_invariant) so they either isolate the
cumulative-distribution path by configuring _make_loss prefs for only acdf/cdf,
or remove the redundant alias tests altogether. Use the existing
test_ados_grad_accum_invariant and test_dos_grad_accum_invariant helpers as the
reference point when deciding whether to separate coverage or drop the wrappers.

In `@source/tests/pt/test_loss_padding.py`:
- Around line 1976-2064: The force_mag MAE grad-accum discrepancy is only
described in comments and not covered by CI. Add an explicit pytest xfail (or
skip) test near TestPTEnerSpinLossForceMagMSEGradAccum that exercises the MAE
path through _spin_loss_fn/EnergySpinLossPT and asserts the known 2x invariant
mismatch, so the documented behavior is tracked and won’t be forgotten. Keep the
existing MSE test unchanged and use the same make_A, make_B, and make_padded
helpers to locate the relevant loss behavior.
- Around line 109-122: Consolidate the duplicate test helpers by removing the
redundant `_EnerLossMockModel` and reusing `_MockModel` across the pt loss
tests. Keep the shared behavior in one helper with the same `__init__` and
`__call__` contract, and update the affected test classes in `test_loss_padding`
to instantiate that single helper instead of maintaining two identical mock
model classes. Ensure any existing expectations around the shallow copy behavior
and pre-populated `pred` mask remain unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3120253d-2b09-40b1-9819-187f97feab86

📥 Commits

Reviewing files that changed from the base of the PR and between ffe57a3 and 4990db2.

📒 Files selected for processing (13)

deepmd/dpmodel/loss/dos.py
deepmd/dpmodel/loss/ener.py
deepmd/dpmodel/loss/ener_spin.py
deepmd/dpmodel/loss/property.py
deepmd/dpmodel/loss/tensor.py
deepmd/pt/loss/dos.py
deepmd/pt/loss/ener.py
deepmd/pt/loss/ener_spin.py
deepmd/pt/loss/loss.py
deepmd/pt/loss/property.py
deepmd/pt/loss/tensor.py
source/tests/common/dpmodel/test_loss_padding.py
source/tests/pt/test_loss_padding.py

coderabbitai · 2026-07-05T16:23:47Z

+# ---------------------------------------------------------------------------
+# force_mag MSE grad-accum invariant (NM equal across frames, ghost atoms non-magnetic)
+#
+# NOTE: force_mag MAE (dpmodel) uses xp.sum over frames instead of xp.mean,
+# so MAE force_mag FAILS the invariant with a 2x factor when frames=2.
+# This is a pre-existing frame-normalization artifact independent of ghost-atom
+# masking. Ghost atoms are correctly excluded (mask_mag=0 there). The MAE
+# artifact is reported as NEEDS_CONTEXT in the audit report.
+# ---------------------------------------------------------------------------


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Inspect force_mag MAE normalization in the dpmodel spin loss. fd -t f 'ener_spin.py' deepmd/dpmodel/loss --exec ast-grep outline {} rg -nP -C4 'force_mag|mask_mag' deepmd/dpmodel/loss/ener_spin.py

Repository: deepmodeling/deepmd-kit

Length of output: 2756

Normalize force_mag MAE across frames. deepmd/dpmodel/loss/ener_spin.py still computes the MAE term as xp.sum(per_frame_sum_fm / per_frame_count_fm), so the loss scales with the number of frames (the 2× behavior for nf=2 remains). If this is out of scope here, link the tracking issue; otherwise switch it to a frame-wise mean to match the other ener_spin MAE paths.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/common/dpmodel/test_loss_padding.py` around lines 2011 - 2019, The `force_mag` MAE path in `ener_spin.py` is still frame-scaled because it uses a sum over per-frame averages instead of a true frame-wise mean. Update the MAE computation in the `force_mag` loss logic to normalize across frames consistently with the other `ener_spin` MAE branches, using the relevant symbols around the `force_mag` reduction code path. If this change is intentionally deferred, add a reference to the tracking issue in the surrounding test/comment; otherwise make the loss independent of `nf` so the 2x effect disappears.

njzjz-bot · 2026-07-05T17:03:29Z

Thanks for the careful fix. The important invariant here should indeed be frame-wise equivalence: a padded mixed-type batch should give the same loss/gradient as processing the real frames separately and averaging. This PR consistently applies that criterion across the pt/pt_expt loss paths, and the all-ones-mask guards are a good way to protect non-mixed behavior.

A few notes before merging:

Please wait for the remaining Python/C++ CI shards to finish; several were still pending when I checked.
The documented follow-ups are real behavior gaps, especially the unchanged TF backend and the pt-only dens/population/denoise losses. I would prefer these to have explicit tracking issues or TODO references before merge, so the scope boundary is not only in the PR body.
The ener_spin MAE normalization change for non-mixed cases is reasonable as a bug fix, but it is user-visible for existing trainings. It should be called out in the release note/changelog if this PR is merged.
I agree with tracking the known force_mag MAE frame-normalization inconsistency separately; an xfail/skip test or linked issue would make that debt harder to lose.

No blocker from my side on the main masking/normalization approach once CI is green and the follow-up tracking is clear.

Authored by OpenClaw 2026.6.8 (844f405) (model: custom-chat-jinzhezeng-group/gpt-5.5)

njzjz

Approve but see above non-blocking comments.

codecov · 2026-07-05T17:19:35Z

Codecov Report

❌ Patch coverage is 88.91786% with 85 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.22%. Comparing base (c381452) to head (4990db2).
⚠️ Report is 3 commits behind head on master.

Files with missing lines	Patch %	Lines
deepmd/dpmodel/loss/ener.py	85.04%	32 Missing ⚠️
deepmd/pt/loss/ener.py	86.32%	29 Missing ⚠️
deepmd/pt/loss/ener_spin.py	90.90%	13 Missing ⚠️
deepmd/dpmodel/loss/ener_spin.py	90.59%	11 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5738      +/-   ##
==========================================
- Coverage   81.28%   81.22%   -0.06%     
==========================================
  Files         988      990       +2     
  Lines      110891   111535     +644     
  Branches     4234     4232       -2     
==========================================
+ Hits        90137    90595     +458     
- Misses      19229    19412     +183     
- Partials     1525     1528       +3

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Han Wang added 10 commits July 4, 2026 19:03

dosubot Bot added the bug label Jul 5, 2026

github-actions Bot added the Python label Jul 5, 2026

wanghan-iapcm requested a review from njzjz July 5, 2026 16:13

github-advanced-security AI found potential problems Jul 5, 2026

View reviewed changes

coderabbitai Bot reviewed Jul 5, 2026

View reviewed changes

njzjz approved these changes Jul 5, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(loss): exclude mixed_type padding atoms from the training loss#5738

fix(loss): exclude mixed_type padding atoms from the training loss#5738
wanghan-iapcm wants to merge 10 commits into
deepmodeling:masterfrom
wanghan-iapcm:fix-loss-mixed-type-padding

wanghan-iapcm commented Jul 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jul 5, 2026

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jul 5, 2026

Uh oh!

njzjz-bot commented Jul 5, 2026

Uh oh!

njzjz left a comment

Uh oh!

codecov Bot commented Jul 5, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

wanghan-iapcm commented Jul 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Test

Known limitations

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 5, 2026

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jul 5, 2026

Choose a reason for hiding this comment

Uh oh!

njzjz-bot commented Jul 5, 2026

Uh oh!

njzjz left a comment

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

wanghan-iapcm commented Jul 5, 2026 •

edited by coderabbitai Bot

Loading

codecov Bot commented Jul 5, 2026 •

edited

Loading