Build a gated validation and benchmark strategy for accelerator support.
Scope:
- Add environment-gated tests for CUDA/ROCm/XPU/MPS/DirectML/ONNX providers where available.
- Record device info, provider info, sample rate, duration drift, peak/RMS, golden metrics, and runtime speed.
- Keep GitHub-hosted CI CPU-only unless a self-hosted runner is deliberately configured.
- Avoid turning throughput trend metrics into hard blockers until baselines are stable.
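One possible shape for the gating and signal metrics in the scope above, as a minimal standard-library sketch. The `ACCEL_TEST_<BACKEND>` variable names, the `backend_enabled` helper, and the placeholder test are assumptions made for illustration, not existing project conventions. A real test would run the model on the device instead of using a fixed sample list.

```python
import math
import os
import unittest

# Hypothetical naming scheme: ACCEL_TEST_<BACKEND>=1 opts in to one backend.
BACKENDS = ("cuda", "rocm", "xpu", "mps", "directml", "onnx")


def backend_enabled(name, env=None):
    """Return True only if the opt-in variable for `name` is set to a truthy value."""
    env = os.environ if env is None else env
    value = env.get(f"ACCEL_TEST_{name.upper()}", "")
    return value.strip().lower() in {"1", "true", "yes"}


def peak_and_rms(samples):
    """Peak absolute amplitude and root-mean-square of a sequence of floats."""
    if not samples:
        return 0.0, 0.0
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return peak, rms


def duration_drift_seconds(num_samples, sample_rate, expected_seconds):
    """Actual duration minus expected duration, in seconds."""
    return num_samples / sample_rate - expected_seconds


class CudaSmokeTest(unittest.TestCase):
    # Skipped unless the (assumed) opt-in variable is set, so the default
    # suite stays CPU-only.
    @unittest.skipUnless(backend_enabled("cuda"), "set ACCEL_TEST_CUDA=1 to run")
    def test_output_is_sane(self):
        # Placeholder signal standing in for device output.
        samples = [0.0, 0.5, -0.5, 0.25]
        peak, rms = peak_and_rms(samples)
        self.assertLessEqual(peak, 1.0)
        self.assertGreater(rms, 0.0)
```

With no variables set, every hardware test is reported as skipped rather than failed, which keeps GitHub-hosted CI green without hiding that the tests exist.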
Acceptance criteria:
- The default test suite stays CPU-only and runs offline.
- Hardware tests are opt-in via environment variables.
- Benchmark output is machine-readable JSON.
- Docs explain how to run the matrix locally or on a self-hosted runner.
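A rough sketch of what a machine-readable benchmark record could look like, with throughput treated as advisory rather than as a hard failure. The field names, the `run_benchmark` and `check_throughput` helpers, the use of real-time factor as the speed metric, and the 20% tolerance are all assumptions for illustration.

```python
import json
import platform
import time


def run_benchmark(fn, audio_seconds, device="cpu", provider=None, sample_rate=None):
    """Time one call to `fn` and return a JSON-serializable record."""
    start = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - start
    return {
        "device": device,
        "provider": provider,
        "platform": platform.platform(),
        "python": platform.python_version(),
        "sample_rate": sample_rate,
        "audio_seconds": audio_seconds,
        "elapsed_seconds": elapsed,
        # Real-time factor: below 1.0 means faster than real time.
        "real_time_factor": elapsed / audio_seconds if audio_seconds else None,
    }


def check_throughput(record, baseline_rtf, tolerance=0.2):
    """Return a warning string on regression instead of raising."""
    rtf = record["real_time_factor"]
    if rtf is not None and rtf > baseline_rtf * (1 + tolerance):
        return f"throughput regression: rtf {rtf:.3f} vs baseline {baseline_rtf:.3f}"
    return None


if __name__ == "__main__":
    # Stand-in workload; a real run would call the model under test.
    record = run_benchmark(lambda: sum(range(100_000)), audio_seconds=1.0)
    print(json.dumps(record, sort_keys=True))
```

Returning a warning instead of raising lets a runner log the trend while baselines are still settling, and the same check can be promoted to an assertion later.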