Skip to content

feat(skills): add dpgen-run skill for concurrent learning workflow#1885

Draft
SchrodingersCattt wants to merge 6 commits into
deepmodeling:masterfrom
SchrodingersCattt:feat/add-dpgen-run-skill
Draft

feat(skills): add dpgen-run skill for concurrent learning workflow#1885
SchrodingersCattt wants to merge 6 commits into
deepmodeling:masterfrom
SchrodingersCattt:feat/add-dpgen-run-skill

Conversation

@SchrodingersCattt

@SchrodingersCattt SchrodingersCattt commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive guide for the dpgen run concurrent learning workflow, covering the required dual JSON configuration format (param.json and machine.json), the exact run command, hard setup guardrails (field spelling/casing and required sections), descriptor guidance, runtime environment boundaries, and a pre-run validation checklist.
    • Documented expected output/reporting behavior and common pitfalls to avoid misconfiguration.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds a comprehensive skill document for the DP-GEN concurrent learning workflow, defining the two-JSON-file configuration contract, hard-rule guardrails for common misconfiguration pitfalls, agent workflow responsibilities, and validation/execution procedures for running dpgen run end-to-end.

Changes

DP-GEN Run Skill Definition

Layer / File(s) Summary
Document metadata and overview
skills/dpgen-run/SKILL.md (lines 1–11)
Skill metadata (name, description, compatibility, license, author/version) for the dpgen-run workflow documentation.
Core execution rules and agent responsibilities
skills/dpgen-run/SKILL.md (lines 12–75)
Critical verified configuration pitfalls (descriptor naming se_atten_v2, parameter case sensitivity, CP2K/DeePMD formatting, type_map ordering) and must-embed rules; enumerated agent responsibilities for workflow confirmation, input preparation, validation, execution, and output reporting.
Configuration construction and execution policy
skills/dpgen-run/SKILL.md (lines 76–317)
Stepwise working policy for input collection preserving user scientific choices; descriptor-selection guidance (TensorFlow/PyTorch backend constraints); explicit execution environment boundaries (local vs scheduler, outer launcher vs dispatched stage jobs); recommended project directory layout; minimum required inputs checklist; structured guidance for building param.json (logical blocks, trust-threshold interpretation, example references) and machine.json (stage-specific blocks with resources, queue, environment activation).
Validation, execution, and output contract
skills/dpgen-run/SKILL.md (lines 324–388)
Stepwise pre-run validation order (JSON syntax, dataset/type-map verification); execution command specification; post-setup output contract (paths, commands, checklists, unresolved fields); additional guardrails preventing incorrect configuration mixing, missing files, hardcoded cluster settings, and reiterated critical "must" rules (remote_root, batch_type capitalization, se_atten_v2 spelling, type_map ordering, model_devi_engine defaulting behavior).
Bundled templates and external documentation
skills/dpgen-run/SKILL.md (lines 389–409)
References to bundled asset templates, DP-GEN run parameter definitions, machine configuration definitions, and external DeePMD documentation.

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding a new documentation skill file for the dpgen-run concurrent learning workflow.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 49.80%. Comparing base (66232e5) to head (f1ee958).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1885   +/-   ##
=======================================
  Coverage   49.80%   49.80%           
=======================================
  Files          83       83           
  Lines       14986    14986           
=======================================
  Hits         7464     7464           
  Misses       7522     7522           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skills/dpgen-run/SKILL.md`:
- Around line 325-357: The numbered checklist in SKILL.md after step 4 uses "1."
for four subsequent items; update those list markers to continue the sequence as
steps 5, 6, 7, and 8 (i.e., change the four "1." occurrences following "verify
`type_map.raw` content matches `param.json` `type_map` ordering" to "5.", "6.",
"7.", and "8.") so the validation steps read as a proper ordered sequence.
- Around line 104-116: Update the descriptor/backend guidance text to remove the
incorrect absolute claim that "dpgen (v0.13.x) only supports the TensorFlow
backend" and instead state that the codebase includes both train_backend values
"tensorflow" and "pytorch", so backend support may include PyTorch; soften the
PyTorch-only wording by either citing the authoritative source or rephrasing to
say users requiring PyTorch-only models or DPA-2/DPA-3 should prefer dpgen2;
keep the descriptor recommendations for se_e2_a, se_atten, and se_atten_v2
(spell se_atten_v2 exactly) and add a short note that users should verify
backend/descriptor compatibility against the repo's train_backend options
("tensorflow", "pytorch") if unsure.
- Around line 369-384: Update the guardrail text in SKILL.md so it no longer
requires specifically "[]" for training.training_data.systems; instead state
that training.training_data.systems should be left unset or empty (dpgen fills
it) — reference dpgen/generator/run.py behavior and the default_training_param
name when editing the rule text to mention both unset or empty as acceptable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5b559805-4192-4c79-acde-08145257e989

📥 Commits

Reviewing files that changed from the base of the PR and between 66232e5 and 841acda.

📒 Files selected for processing (1)
  • skills/dpgen-run/SKILL.md

Comment thread skills/dpgen-run/SKILL.md Outdated
Comment thread skills/dpgen-run/SKILL.md
Comment thread skills/dpgen-run/SKILL.md

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
skills/dpgen-run/SKILL.md (1)

341-346: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the configured init_data_prefix here.

The validation example hardcodes init_data/, which conflicts with the earlier init_data_prefix input and will mislead users whose project layout uses a different prefix.

🛠 Suggested fix
- ls init_data/*/type_map.raw
+ ls "${init_data_prefix}"*/type_map.raw
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/dpgen-run/SKILL.md` around lines 341 - 346, In the validation example
in the SKILL.md file around lines 341-346, the bash command uses a hardcoded
directory path `init_data/` instead of the configurable `init_data_prefix` value
that was introduced earlier in the documentation. Replace the hardcoded
`init_data/` reference in the ls command with the appropriate placeholder or
variable reference that represents the configured `init_data_prefix`, so the
example remains consistent with the actual configuration users provide and
doesn't mislead them into using incorrect paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@skills/dpgen-run/SKILL.md`:
- Around line 341-346: In the validation example in the SKILL.md file around
lines 341-346, the bash command uses a hardcoded directory path `init_data/`
instead of the configurable `init_data_prefix` value that was introduced earlier
in the documentation. Replace the hardcoded `init_data/` reference in the ls
command with the appropriate placeholder or variable reference that represents
the configured `init_data_prefix`, so the example remains consistent with the
actual configuration users provide and doesn't mislead them into using incorrect
paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d8abfb50-acd6-4802-baac-90db57a1cd18

📥 Commits

Reviewing files that changed from the base of the PR and between 841acda and 39adc8f.

📒 Files selected for processing (1)
  • skills/dpgen-run/SKILL.md

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
skills/dpgen-run/SKILL.md (1)

71-72: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add explicit user confirmation requirement before auto-executing dpgen run.

The agent responsibilities include "validate the workflow before execution" and "provide the exact command for running," but lack an explicit requirement to ask the user for confirmation before actually executing the command. This creates an autonomous execution path without user verification—a concern flagged by the SkillSpector analyzer (EA2: Excessive Agency).

To mitigate this risk, either:

  1. Add a new responsibility step: "ask the user for explicit confirmation before executing the command," or
  2. Add a guardrail: "Never auto-execute dpgen run without explicit user confirmation after validation."

This ensures the agent validates thoroughly but defers the final execution decision to the user.

💡 Suggested additions

Add to agent responsibilities (after line 72):

 1. validate the workflow before execution
 1. provide the exact command for running
+1. ask the user for explicit confirmation before executing

Or add to guardrails section (after line 384):

 - Do not assume outer-shell activation is inherited by stage jobs; for scheduler execution, require explicit `source_list` per stage.
+- Never auto-execute `dpgen run param.json machine.json` without explicit user confirmation after validation passes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/dpgen-run/SKILL.md` around lines 71 - 72, The SKILL.md file lacks an
explicit requirement for user confirmation before executing the dpgen run
command, creating a potential autonomous execution risk. Add either a new
responsibility item after line 72 that states "ask the user for explicit
confirmation before executing the command," or add a guardrail in the guardrails
section (after line 384) that states "Never auto-execute dpgen run without
explicit user confirmation after validation." This ensures the agent validates
the workflow and provides the command but defers the final execution decision to
the user for explicit approval.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@skills/dpgen-run/SKILL.md`:
- Around line 71-72: The SKILL.md file lacks an explicit requirement for user
confirmation before executing the dpgen run command, creating a potential
autonomous execution risk. Add either a new responsibility item after line 72
that states "ask the user for explicit confirmation before executing the
command," or add a guardrail in the guardrails section (after line 384) that
states "Never auto-execute dpgen run without explicit user confirmation after
validation." This ensures the agent validates the workflow and provides the
command but defers the final execution decision to the user for explicit
approval.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5b543243-4fa4-4894-b52d-054cefc40be9

📥 Commits

Reviewing files that changed from the base of the PR and between 39adc8f and f1ee958.

📒 Files selected for processing (1)
  • skills/dpgen-run/SKILL.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant