
[codex] Add CLI outputs, language detection, punctuation removal, and streaming demos #109

Open
MXuer wants to merge 8 commits into main from codex/integration-issues-80-92-93

@MXuer MXuer commented Jun 11, 2026

Collaborator

Summary

This PR integrates fixes/features for four reported issues:

The streaming demo no longer includes any punctuation-model handling. It emits raw ASR partial/final text and relies on CTC endpoint rules for segmentation.

Validation

  • env TRANSFORMERS_NO_TF=1 USE_TF=0 python -m pytest
    • 23 passed
  • python -m py_compile examples/streaming_demo.py examples/microphone_streaming_demo.py
    • passed
  • python examples/streaming_demo.py --help
    • passed
  • python examples/microphone_streaming_demo.py --help
    • passed
  • git diff --check
    • passed
  • Real audio validation with Dolphin base on CPU:
    • short zh-CN demo audio
    • long zh-CN audio
    • long hi-IN audio
  • Real CLI language detection:
    • short zh-CN: zh CN
    • long zh-CN: zh CN
    • long hi-IN: hi IN
  • Real SRT + punctuation removal validation:
    • short zh-CN: 1 cue, valid SRT, 0 subtitle-body punctuation chars
    • long zh-CN: 60 cues, valid SRT, 0 subtitle-body punctuation chars
    • long hi-IN: 174 cues, valid SRT, 0 subtitle-body punctuation chars
  • Real streaming smoke tests using small.cn.streaming on CPU:
    • short zh-CN demo audio with --chunk_size 16 --emit line --final_rescore attention
    • forced endpoint smoke test with --endpoint_rule3_min_utterance_length_ms 3000

Test Reports

  • reports/issue-80-cli-output-test-report.md
  • reports/issue-92-disable-punctuation-test-report.md
  • reports/issue-93-language-detection-test-report.md
  • reports/issue-106-streaming-demo-test-report.md
  • reports/integration-issues-80-92-93-test-report.md

Notes

  • --remove_punctuation is output post-processing and does not change model decoding or weights.
  • Language detection still loads a Dolphin ASR model; this does not add a separate lightweight LID-only model.
  • The streaming demos are experimental terminal demos, not a production streaming server.
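
As an illustration of the first note, here is a minimal sketch of punctuation removal as pure output post-processing. It assumes "punctuation" means any character whose Unicode general category starts with `P`; the function name `strip_punctuation` is hypothetical and is not taken from `dolphin/transcribe.py`, whose actual implementation may differ.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Hypothetical helper: drop characters in any Unicode 'P*' category."""
    kept = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    # Collapse whitespace runs that removed punctuation may leave behind.
    return " ".join(kept.split())


print(strip_punctuation("Hello, world!"))  # Hello world
print(strip_punctuation("你好，世界！"))      # 你好世界
```

Because this operates only on the decoded string, it has no effect on decoding or model weights. Symbols such as `$` or `+` belong to the `S*` categories and would be kept under this assumption.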

@MXuer MXuer requested a review from wgb14 June 11, 2026 03:22
@wgb14 wgb14 requested a review from Copilot June 11, 2026 03:23

Copilot AI left a comment


Pull request overview

This PR enhances Dolphin’s CLI/Python API usability by adding (1) direct CLI output emission to stdout/files with multiple formats, (2) a language-detection-only task exposed via both CLI and dolphin.detect_language(...), and (3) optional punctuation removal as an output post-processing step.

Changes:

  • Add CLI --output + --output_format {txt,json,srt} and implement formatting/emission helpers.
  • Add --task detect_language + --lid_duration and export detect_language at the package top level.
  • Add --remove_punctuation / remove_punctuation=True to strip Unicode punctuation from returned text and word timestamps, with unit tests and updated README examples.
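
To make the `srt` output format concrete, the following is a rough sketch of how timed segments could be rendered as SRT cues. The `(start, end, text)` tuple layout and the names `srt_timestamp` and `to_srt` are assumptions for illustration only; the formatting helpers actually added in this PR may be structured differently.

```python
def srt_timestamp(seconds: float) -> str:
    """Render seconds as the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Hypothetical helper: segments is an iterable of (start, end, text)."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each cue: running index, time range, then the subtitle body.
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


print(to_srt([(0.0, 1.25, "first cue"), (1.25, 3.0, "second cue")]))
```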

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/test_punctuation.py | Adds unit tests for punctuation removal (text, special tokens, word timestamps) and CLI parsing. |
| tests/test_language_detection.py | Adds unit tests for detect-language API behavior, package export, CLI output, and audio duration limiting. |
| tests/test_cli_output.py | Adds unit tests for txt/json/srt formatting and stdout/file emission (including nested output dirs). |
| reports/issue-93-language-detection-test-report.md | Documents validation steps/results for language detection feature. |
| reports/issue-92-disable-punctuation-test-report.md | Documents validation steps/results for punctuation removal feature. |
| reports/issue-80-cli-output-test-report.md | Documents validation steps/results for CLI output formats and file writing. |
| reports/integration-issues-80-92-93-test-report.md | Integration validation report across all three features and their CLI interactions. |
| README.md | Updates install URL/model links and adds CLI/Python usage examples for new flags/tasks. |
| dolphin/transcribe.py | Implements punctuation removal, language detection duration limiting, CLI output formatting/emission, and new CLI arguments. |
| dolphin/model_registry.py | Fixes small.cn model_id typo. |
| dolphin/__init__.py | Exports detect_language at the package top level. |


Comment thread dolphin/transcribe.py
use_two_stage_filter: bool = False,
use_prompt_hotword: bool = False,
prompt_filter_threshold: float = -2.0,
remove_punctuation: bool = False,
Comment thread dolphin/transcribe.py
use_two_stage_filter: bool = False,
use_prompt_hotword: bool = False,
prompt_filter_threshold: float = -4.0,
remove_punctuation: bool = False,
Comment thread dolphin/transcribe.py
parser.add_argument("--use_prompt_hotword", type=str2bool, default=False, help="use prompt-based hotword (default: false)")
parser.add_argument("--prompt_filter_threshold", type=float, default=-2.0, help="filter threshold for prompt hotwords (default: -2.0)")
parser.add_argument("--remove_punctuation", type=str2bool, default=False, help="remove punctuation from transcription text output (default: false)")
parser.add_argument("--lid_duration", type=float, default=SPEECH_LENGTH, help="seconds of audio to use for language detection; set 0 to use full audio (default: 30)")
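
The `--lid_duration` help text above describes limiting the audio used for language detection, with 0 meaning the full audio. A minimal sketch of that rule follows; the function name `limit_samples`, the flat sample sequence, and the treatment of negative values as "use full audio" are all assumptions, not the code in `dolphin/transcribe.py`.

```python
def limit_samples(samples, sample_rate: int, lid_duration: float):
    """Hypothetical helper: keep only the first lid_duration seconds."""
    # Assumption: 0 (and, in this sketch, any negative value) means
    # "use the full audio".
    if lid_duration <= 0:
        return samples
    max_samples = int(lid_duration * sample_rate)
    return samples[:max_samples]


audio = list(range(10))  # stand-in for 10 samples at 2 samples per second
print(len(limit_samples(audio, 2, 3)))  # 6
print(len(limit_samples(audio, 2, 0)))  # 10
```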

wgb14 commented Jun 11, 2026

Collaborator

perfect👍

@MXuer MXuer marked this pull request as ready for review June 11, 2026 07:36
@MXuer MXuer changed the title from "[codex] Add CLI outputs, language detection task, and punctuation removal" to "[codex] Add CLI outputs, language detection, punctuation removal, and streaming demos" Jun 11, 2026

3 participants