
feat(cli): add job management workflow #430

Open
Jinghao-coding wants to merge 1 commit into raids-lab:main from Jinghao-coding:feat/cli-job-management

Summary

This PR adds a complete crater job CLI workflow on top of the latest upstream/main CLI baseline. It covers job read commands, lifecycle helpers, basic job creation, administrator job operations, documentation, a dedicated Skill, and snapshot tests.

Changes

  • Add authenticated API helpers and job API DTO/client methods for the job endpoints under /api/v1/vcjobs, /api/v1/admin/vcjobs, and /api/v1/admin/operations.
  • Add crater job read commands: ls, get, pods, events, yaml, template.
  • Add job access and lifecycle commands: token, secret, ssh, snapshot, alert, delete --admin.
  • Add job creation commands:
    • crater job create jupyter
    • crater job create webide
    • crater job create custom
    • crater job create tensorflow --file <json>
    • crater job create pytorch --file <json>
  • Add administrator commands: lock, unlock, keep, and cleanup workflows for waiting jobs, long-running jobs, and jobs with low GPU utilization.
  • Add local validation that runs before any API request is made. It rejects negative CPU, memory, GPU, and task replica values, negative lock/cleanup durations, and invalid list filters.
  • Add bilingual (English and Chinese) i18n strings, command contract documentation, and crater-cli-job Skill guidance.
  • Add English and Chinese snapshot tests for the job CLI, covering usage-error and API-error paths.
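The local validation described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the `JobResources` type, its field names, and the error wording are hypothetical, chosen only to illustrate rejecting negative values before any request is sent.

```go
package main

import (
	"errors"
	"fmt"
)

// JobResources is a hypothetical container for the numeric flags that the
// PR says are validated locally; the real CLI types may differ.
type JobResources struct {
	CPU      float64
	MemoryGi float64
	GPU      int
	Replicas int
}

// Validate collects one error per negative field and joins them, so the
// user sees every problem at once. It returns nil when all fields are valid.
func (r JobResources) Validate() error {
	var errs []error
	if r.CPU < 0 {
		errs = append(errs, fmt.Errorf("cpu must not be negative, got %v", r.CPU))
	}
	if r.MemoryGi < 0 {
		errs = append(errs, fmt.Errorf("memory must not be negative, got %v", r.MemoryGi))
	}
	if r.GPU < 0 {
		errs = append(errs, fmt.Errorf("gpu must not be negative, got %d", r.GPU))
	}
	if r.Replicas < 0 {
		errs = append(errs, fmt.Errorf("replicas must not be negative, got %d", r.Replicas))
	}
	// errors.Join returns nil for an empty slice.
	return errors.Join(errs...)
}

func main() {
	req := JobResources{CPU: 2, MemoryGi: -4, GPU: -1, Replicas: 1}
	if err := req.Validate(); err != nil {
		// Fail here, before any HTTP request would be built.
		fmt.Println("invalid job request:")
		fmt.Println(err)
		return
	}
	fmt.Println("request is valid; the API call would happen here")
}
```

Failing locally keeps usage errors separate from API errors, which matches the two error paths the snapshot tests are said to cover.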

Testing

  • UPDATE_SNAPSHOTS=1 go test ./test/snapshots/job
  • go test ./...
  • go build ./...
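For readers unfamiliar with the `UPDATE_SNAPSHOTS=1` convention, the sketch below shows one common way such a switch is implemented. It is an assumption about the general pattern, not the harness in this repository: `checkSnapshot`, `demo`, the file name, and the placeholder output strings are all hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// checkSnapshot compares got with the snapshot stored at path. When the
// UPDATE_SNAPSHOTS environment variable is "1", it rewrites the snapshot
// instead of comparing (assumed behavior, not taken from the PR).
func checkSnapshot(path, got string) error {
	if os.Getenv("UPDATE_SNAPSHOTS") == "1" {
		if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
			return err
		}
		return os.WriteFile(path, []byte(got), 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read snapshot: %w", err)
	}
	if string(want) != got {
		return fmt.Errorf("snapshot mismatch for %s: want %q, got %q", path, want, got)
	}
	return nil
}

// demo records a snapshot, compares identical output against it, and then
// compares different output, returning the three results in that order.
func demo() (recordErr, compareErr, mismatchErr error) {
	dir, err := os.MkdirTemp("", "snapshots")
	if err != nil {
		return err, nil, nil
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "job", "ls_usage.txt") // hypothetical file name

	old := os.Getenv("UPDATE_SNAPSHOTS")
	defer os.Setenv("UPDATE_SNAPSHOTS", old)

	os.Setenv("UPDATE_SNAPSHOTS", "1")
	recordErr = checkSnapshot(path, "placeholder output\n")

	os.Setenv("UPDATE_SNAPSHOTS", "")
	compareErr = checkSnapshot(path, "placeholder output\n")
	mismatchErr = checkSnapshot(path, "changed output\n")
	return recordErr, compareErr, mismatchErr
}

func main() {
	record, compare, mismatch := demo()
	fmt.Println("record:", record)
	fmt.Println("compare:", compare)
	fmt.Println("mismatch:", mismatch)
}
```

Under this pattern, the first command in the list regenerates the stored files, and the plain `go test ./...` run then checks the CLI output against them.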

Notes

TensorFlow and PyTorch job creation intentionally takes its input through --file, because the backend request contains a nested tasks[] specification. This keeps the CLI aligned with the platform DTO instead of flattening complex task specs into long, brittle flags.
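To illustrate why a file maps more naturally onto a nested tasks[] request than flags do, here is a minimal sketch of decoding such a file. The `TaskSpec` and `TrainingJobRequest` types and every JSON field name other than `tasks` are assumptions made for this example; the platform's actual DTO is not shown in this PR description.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// TaskSpec is a hypothetical shape for one entry of tasks[].
type TaskSpec struct {
	Name     string `json:"name"`
	Replicas int    `json:"replicas"`
	Image    string `json:"image"`
}

// TrainingJobRequest is a hypothetical request body with nested tasks.
type TrainingJobRequest struct {
	Name  string     `json:"name"`
	Tasks []TaskSpec `json:"tasks"`
}

// decodeJobSpec parses the contents of a --file argument. Unknown fields
// are rejected so that typos surface locally instead of being dropped.
func decodeJobSpec(data []byte) (TrainingJobRequest, error) {
	var req TrainingJobRequest
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&req); err != nil {
		return req, fmt.Errorf("decode job spec: %w", err)
	}
	if len(req.Tasks) == 0 {
		return req, errors.New("job spec must contain at least one task")
	}
	return req, nil
}

func main() {
	spec := []byte(`{
		"name": "example-job",
		"tasks": [
			{"name": "ps", "replicas": 1, "image": "example/image:tag"},
			{"name": "worker", "replicas": 2, "image": "example/image:tag"}
		]
	}`)
	req, err := decodeJobSpec(spec)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, t := range req.Tasks {
		fmt.Printf("task %s: %d replica(s)\n", t.Name, t.Replicas)
	}
}
```

Expressing the same two-task job with flags would need repeated or indexed flags per task, which is the brittleness the note refers to.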

Jinghao-coding force-pushed the feat/cli-job-management branch from 6005f1e to 930b75b on June 17, 2026 15:12