feat(stt): add NVIDIA Canary STT engine support #402

Open
coleleavitt wants to merge 5 commits into mkiol:main from coleleavitt:feat/nvidia-canary-stt-engine

@coleleavitt

Supersedes closed PR #360. GitHub would not allow reopening that PR from this account after the branch was updated.

Summary:

  • Restore Qt5-compatible scope by reverting the Qt6 migration changes from this branch.
  • Add NVIDIA Canary STT engine support using local NeMo .nemo restore.
  • Add native Qt Hugging Face Hub download resolution for hf:// model URLs with metadata checksum validation.
  • Move the Canary model definition into models with pinned Hub revision and checksums.
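
The hf:// resolution and checksum steps above can be illustrated with a minimal Python sketch. It is not the Qt implementation from this branch. The URL layout `hf://<owner>/<repo>@<revision>/<path>` and the names `HubFile`, `parse_hf_url`, `resolve_url`, and `sha256_matches` are assumptions made for illustration; only the `https://huggingface.co/<repo>/resolve/<revision>/<path>` download form is the Hub's documented one.

```python
import hashlib
from dataclasses import dataclass
from urllib.parse import quote

HUB_BASE = "https://huggingface.co"  # public Hub endpoint


@dataclass(frozen=True)
class HubFile:
    """Hypothetical container for the parts of an hf:// model URL."""
    repo_id: str
    revision: str
    path: str


def parse_hf_url(url: str) -> HubFile:
    """Parse hf://<owner>/<repo>@<revision>/<path> (assumed layout)."""
    prefix = "hf://"
    if not url.startswith(prefix):
        raise ValueError(f"not an hf:// URL: {url!r}")
    parts = url[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected owner/repo@revision/path in {url!r}")
    owner, repo_and_rev, path = parts
    repo, sep, revision = repo_and_rev.partition("@")
    # Require an explicit revision so a download is always pinned.
    if not repo or not sep or not revision:
        raise ValueError(f"missing pinned revision in {url!r}")
    return HubFile(repo_id=f"{owner}/{repo}", revision=revision, path=path)


def resolve_url(f: HubFile) -> str:
    """Build the Hub download URL for a pinned file."""
    revision = quote(f.revision, safe="")
    return f"{HUB_BASE}/{f.repo_id}/resolve/{revision}/{quote(f.path)}"


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of downloaded bytes with a pinned checksum."""
    return hashlib.sha256(data).hexdigest() == expected_hex.strip().lower()
```

Requiring the `@<revision>` part in the sketch mirrors the pinned-revision approach in the summary: a checksum stored next to the model definition only stays valid if the file it refers to cannot change underneath it.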

Validation:

  • git diff --check
  • Canary model JSON placement/checksum checks with jq
  • Focused C++ syntax checks for src/canary_engine.cpp, src/models_manager.cpp, and src/checksum_tools.cpp
  • Full CMake configure was not run: it is blocked locally by the missing Qt5 LinguistTools package (Qt5LinguistToolsConfig.cmake).

Add support for NVIDIA's Canary speech-to-text models via the NeMo toolkit:

- Canary 1B v2: 4.89% WER, 630x RTF (5x faster than Whisper)
- Canary Qwen 2.5B: Higher accuracy variant for demanding use cases
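
For readers unfamiliar with the two figures quoted above, here is a small Python sketch of how such metrics are commonly computed. It assumes that "WER" is the usual word-level edit distance divided by the reference length, and that "630x RTF" is an inverse real-time factor (seconds of audio processed per second of compute). The function names and sample sentences are illustrative and are not taken from this PR or from any benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def inverse_real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# One substitution and one deletion against a six-word reference.
sample_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under these assumptions, a factor of 630 would mean roughly 630 seconds of audio transcribed per second of compute on the benchmark hardware.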

Both models use NeMo's EncDecMultiTaskModel architecture with automatic
model download via Hugging Face. Supports GPU acceleration (CUDA/ROCm),
translation (s2t_translation), and punctuation restoration.

New files:
- src/canary_engine.hpp: Engine class definition
- src/canary_engine.cpp: NeMo Python integration via py_executor

Modified:
- models_manager.h/cpp: Add stt_canary engine type and feature flags
- speech_service.cpp: Engine instantiation and type checking
- CMakeLists.txt: Add canary_engine source files
- config/models.json: Add both Canary model entries

Requires: pip install "nemo_toolkit[asr]" (quoted so that shells such as zsh do not expand the brackets)
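
Because the engine depends on an optional Python package, a caller may want to check for it before offering Canary models. The sketch below shows one way to do that with the standard library. The helper names and the message text are hypothetical and do not describe how `canary_engine.cpp` or `py_executor` performs the check; the only fact relied on is that the `nemo_toolkit` distribution installs the `nemo` import package.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def canary_dependency_hint() -> str:
    """Hypothetical status message for the NeMo dependency."""
    if module_available("nemo"):
        return "NeMo toolkit found"
    return 'NeMo toolkit missing: run pip install "nemo_toolkit[asr]"'
```

Using `find_spec` on the top-level name avoids importing NeMo itself, which keeps the check cheap compared with loading the full toolkit.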

@coleleavitt force-pushed the feat/nvidia-canary-stt-engine branch from 655e712 to df4d674 on June 21, 2026 23:48