[codex] Add CLI outputs, language detection, punctuation removal, and streaming demos #109
Open
MXuer wants to merge 8 commits into
This was referenced Jun 11, 2026
Pull request overview
This PR enhances Dolphin’s CLI/Python API usability by adding (1) direct CLI output emission to stdout/files with multiple formats, (2) a language-detection-only task exposed via both CLI and dolphin.detect_language(...), and (3) optional punctuation removal as an output post-processing step.
Changes:
- Add CLI `--output` + `--output_format {txt,json,srt}` and implement formatting/emission helpers.
- Add `--task detect_language` + `--lid_duration` and export `detect_language` at the package top level.
- Add `--remove_punctuation` / `remove_punctuation=True` to strip Unicode punctuation from returned text and word timestamps, with unit tests and updated README examples.
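
For reviewers who want a concrete picture of "strip Unicode punctuation from returned text and word timestamps", here is a minimal sketch. It is not the code in `dolphin/transcribe.py`; the helper names and the word-entry keys (`text`, `start`, `end`) are assumptions made for illustration. It treats any character whose Unicode general category starts with `P` as punctuation.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Drop every character whose Unicode general category starts with 'P'."""
    kept = [ch for ch in text if not unicodedata.category(ch).startswith("P")]
    # Collapse whitespace runs left behind by removed punctuation.
    return " ".join("".join(kept).split())


def strip_word_timestamps(words: list[dict]) -> list[dict]:
    """Apply the same rule to each word entry; drop entries that become empty.

    The entry shape ({"text", "start", "end"}) is hypothetical.
    """
    cleaned = []
    for word in words:
        text = strip_punctuation(word["text"])
        if text:
            cleaned.append({**word, "text": text})
    return cleaned
```

Under this rule `<` and `>` are left alone because they are math symbols (category `Sm`), while an ASCII apostrophe is removed (category `Po`), so "don't" becomes "dont". Whether the PR makes the same choices is not stated here.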
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/test_punctuation.py | Adds unit tests for punctuation removal (text, special tokens, word timestamps) and CLI parsing. |
| tests/test_language_detection.py | Adds unit tests for detect-language API behavior, package export, CLI output, and audio duration limiting. |
| tests/test_cli_output.py | Adds unit tests for txt/json/srt formatting and stdout/file emission (including nested output dirs). |
| reports/issue-93-language-detection-test-report.md | Documents validation steps/results for language detection feature. |
| reports/issue-92-disable-punctuation-test-report.md | Documents validation steps/results for punctuation removal feature. |
| reports/issue-80-cli-output-test-report.md | Documents validation steps/results for CLI output formats and file writing. |
| reports/integration-issues-80-92-93-test-report.md | Integration validation report across all three features and their CLI interactions. |
| README.md | Updates install URL/model links and adds CLI/Python usage examples for new flags/tasks. |
| dolphin/transcribe.py | Implements punctuation removal, language detection duration limiting, CLI output formatting/emission, and new CLI arguments. |
| dolphin/model_registry.py | Fixes small.cn model_id typo. |
| dolphin/__init__.py | Exports detect_language at the package top level. |
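
The `srt` output format covered by `tests/test_cli_output.py` can be sketched as follows. This is an illustration of standard SRT cue layout only, not the PR's helper; the function names and the segment keys (`start`, `end`, `text`) are assumptions.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments (hypothetical keys: start, end, text) as SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Cues are separated by one blank line.
    return "\n".join(cues)
```

Timestamps are converted to integer milliseconds before splitting into fields, which avoids a rounded value such as `59.9996` seconds producing a 1000 ms field.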
use_two_stage_filter: bool = False,
use_prompt_hotword: bool = False,
prompt_filter_threshold: float = -2.0,
remove_punctuation: bool = False,

use_two_stage_filter: bool = False,
use_prompt_hotword: bool = False,
prompt_filter_threshold: float = -4.0,
remove_punctuation: bool = False,

parser.add_argument("--use_prompt_hotword", type=str2bool, default=False, help="use prompt-based hotword (default: false)")
parser.add_argument("--prompt_filter_threshold", type=float, default=-2.0, help="filter threshold for prompt hotwords (default: -2.0)")
parser.add_argument("--remove_punctuation", type=str2bool, default=False, help="remove punctuation from transcription text output (default: false)")
parser.add_argument("--lid_duration", type=float, default=SPEECH_LENGTH, help="seconds of audio to use for language detection; set 0 to use full audio (default: 30)")
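
The `--lid_duration` help text ("set 0 to use full audio") implies a truncation step before language detection. A minimal sketch of that behavior, assuming a 16 kHz sample rate and a hypothetical helper name (neither is confirmed by this PR):

```python
from typing import Sequence

SAMPLE_RATE = 16_000  # assumed sample rate in Hz, not confirmed by the PR


def limit_for_lid(samples: Sequence[float], lid_duration: float,
                  sample_rate: int = SAMPLE_RATE) -> Sequence[float]:
    """Return the leading lid_duration seconds; 0 keeps the full audio."""
    if lid_duration < 0:
        # Rejecting negatives is a choice made for this sketch only.
        raise ValueError("lid_duration must be >= 0")
    if lid_duration == 0:
        return samples
    max_samples = int(lid_duration * sample_rate)
    # Slicing past the end is safe, so short clips are returned unchanged.
    return samples[:max_samples]
```

With the default of 30 seconds, a 40-second clip would be cut to its first 480,000 samples at the assumed rate, and a clip shorter than 30 seconds would pass through untouched.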
Collaborator
perfect 👍
Summary
This PR integrates fixes/features for four reported issues:
- `--output` and `--output_format {txt,json,srt}`.
- `--remove_punctuation` / `remove_punctuation=True` for punctuation-free transcription output.
- `dolphin.detect_language(...)` and CLI `--task detect_language`.
- `forward_encoder_chunk`, add CTC greedy partial output, CTC endpointing, optional final attention rescoring, and a microphone streaming terminal demo.

The streaming demo no longer includes any punctuation-model handling. It emits raw ASR partial/final text and relies on CTC endpoint rules for segmentation.
Validation
- `env TRANSFORMERS_NO_TF=1 USE_TF=0 python -m pytest`
- `python -m py_compile examples/streaming_demo.py examples/microphone_streaming_demo.py`
- `python examples/streaming_demo.py --help`
- `python examples/microphone_streaming_demo.py --help`
- `git diff --check`
- `base` on CPU:
  - `zh CN`
  - `zh CN`
  - `hi IN`
- `small.cn.streaming` on CPU:
  - `--chunk_size 16 --emit line --final_rescore attention`
  - `--endpoint_rule3_min_utterance_length_ms 3000`

Test Reports
- `reports/issue-80-cli-output-test-report.md`
- `reports/issue-92-disable-punctuation-test-report.md`
- `reports/issue-93-language-detection-test-report.md`
- `reports/issue-106-streaming-demo-test-report.md`
- `reports/integration-issues-80-92-93-test-report.md`

Notes
- `--remove_punctuation` is output post-processing and does not change model decoding or weights.
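
For context on the "CTC greedy partial output" item in the Summary, the textbook CTC greedy rule is: take the best token per frame, merge consecutive repeats, then drop blanks. The sketch below shows that rule only; it is not the streaming demo's code, and the function name and blank index `0` are assumptions.

```python
from typing import Sequence

BLANK_ID = 0  # assumed blank index; the real model may use a different one


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank_id: int = BLANK_ID) -> list[int]:
    """Pick the best token per frame, merge repeats, then drop blanks."""
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    tokens = []
    previous = None
    for token in best_path:
        # A token is emitted only when it differs from the previous frame
        # and is not blank; a blank between two equal tokens keeps both.
        if token != previous and token != blank_id:
            tokens.append(token)
        previous = token
    return tokens
```

Because the rule needs only the frames seen so far, it can be re-run on each new chunk to produce partial text, which is presumably why it suits the streaming path, with attention rescoring kept as an optional final pass.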