Add adjustable tail recording to capture the last word #426

Open
EliseiNicolae wants to merge 1 commit into altic-dev:main from EliseiNicolae:recording-tail-setting

Conversation

EliseiNicolae commented Jun 26, 2026

Contributor

Description

Dictation was clipping the last word. When you trigger stop, the final ~50–130 ms of audio is still in CoreAudio's input pipeline (and people tend to release the key a hair early), but capture freezes immediately, so the tail of the last word is lost.

This keeps the mic tap live for a short, configurable grace period after the stop trigger, so the trailing audio lands in the buffer before the engine tears down. The duration is exposed as a slider in Settings.
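
To make the timing argument concrete, here is a minimal, synchronous sketch of a tap gated by an enabled flag. It is not the project's capture code: `TapSketch`, `capture(chunks:disableAtMs:)`, and the chunk timings are hypothetical, chosen only to show why a chunk that leaves the input pipeline after the stop trigger is dropped unless the tap stays enabled a little longer.

```swift
// Hypothetical model of a mic tap gated by an `enabled` flag (not the real pipeline).
struct TapSketch {
    private(set) var samples: [Float] = []
    var enabled = true

    // Chunks delivered after the tap is disabled are dropped, never buffered.
    mutating func deliver(_ chunk: [Float]) {
        guard enabled else { return }
        samples.append(contentsOf: chunk)
    }
}

/// Feeds timestamped chunks through the tap and disables it at `disableAtMs`.
func capture(chunks: [(arrivalMs: Int, data: [Float])], disableAtMs: Int) -> [Float] {
    var tap = TapSketch()
    for chunk in chunks {
        if chunk.arrivalMs >= disableAtMs { tap.enabled = false }
        tap.deliver(chunk.data)
    }
    return tap.samples
}

// Assumed timeline: stop is triggered at t = 1000 ms, but the last chunk
// only arrives from the input pipeline at t = 1100 ms.
let chunks: [(arrivalMs: Int, data: [Float])] = [
    (900, [0.1, 0.2]),
    (1000, [0.3]),
    (1100, [0.4]),
]

let clipped = capture(chunks: chunks, disableAtMs: 1000)        // tap frozen at the trigger
let withTail = capture(chunks: chunks, disableAtMs: 1000 + 200) // 200 ms grace period
```

With the immediate freeze only the first chunk survives; with the grace period all three chunks land in the buffer.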

Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 🧹 Chore
  • 📝 Documentation update

Related Issues

  • N/A

What changed

  • ASRService.stop() now awaits a short grace period before audioCapturePipeline.setRecordingEnabled(false), so the tap keeps appending the trailing audio. The tap stops appending at its guard enabled check, which is why the delay has to come before that call: audio that arrives after it is discarded.
  • Guarded by a new isStopping flag to close the re-entrancy window the delay opens: while stop() is sleeping, isRunning is still true, so the un-guarded playground/Welcome stop path could otherwise slip in a second stop().
  • New persisted setting SettingsStore.recordingTailDuration (seconds, default 0.2, clamped 0–0.4). Read live at stop time, so a change applies on the very next dictation. 0 disables the tail entirely (the getter uses a nil-check so 0 persists instead of snapping back to the default).
  • Slider added under Settings → Global Hotkey → Options → "Extra recording after stop", mirroring the existing "Bottom Offset" slider pattern.
  • Wired through settings backup/restore as an optional field, so older backups still decode.
  • The Cancel path (stopWithoutTranscription()) is intentionally left undelayed.
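
The stop-path ordering and the isStopping guard described above can be sketched as follows. This is an illustration under assumptions, not the actual ASRService code: `StopSketch` is a hypothetical actor, the tail is passed in as a parameter instead of being read from SettingsStore, and `teardownCount` exists only so the effect of the guard can be observed.

```swift
// Hypothetical stand-in for the service; only the stop-path state is modelled.
actor StopSketch {
    private(set) var isRunning = false
    private(set) var recordingEnabled = false
    private(set) var teardownCount = 0
    private var isStopping = false

    func start() {
        isRunning = true
        recordingEnabled = true
    }

    func stop(tailSeconds: Double) async {
        // A second stop() arriving while the first one sleeps still sees
        // isRunning == true, so isStopping is what turns it away.
        guard isRunning, !isStopping else { return }
        isStopping = true
        defer { isStopping = false }

        if tailSeconds > 0 {
            // Grace period: recording stays enabled, so trailing audio is still captured.
            try? await Task.sleep(nanoseconds: UInt64(tailSeconds * 1_000_000_000))
        }

        // Only after the delay: stop capturing and tear down.
        recordingEnabled = false
        teardownCount += 1
        isRunning = false
    }
}

// Fires two overlapping stop() calls and reports how many teardowns ran.
func demoConcurrentStops() async -> Int {
    let service = StopSketch()
    await service.start()
    await withTaskGroup(of: Void.self) { group in
        group.addTask { await service.stop(tailSeconds: 0.05) }
        group.addTask { await service.stop(tailSeconds: 0.05) }
    }
    return await service.teardownCount
}
```

Whichever call wins the race, the other one returns early at the guard, so teardown runs once. A cancel path would skip the sleep entirely, which matches leaving stopWithoutTranscription() undelayed.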

Testing

  • Tested on Intel Mac
  • Tested on Apple Silicon Mac
  • Tested on macOS 26
  • Ran linter locally: swiftlint --strict --config .swiftlint.yml Sources
  • Ran formatter locally: swiftformat --config .swiftformat Sources

Built Release (arm64) successfully, signed, installed, and launched on Apple Silicon / macOS 26. End-to-end dictation tail behavior (and dragging the slider) should be confirmed manually, since it needs live mic input.

Notes

  • Latency tradeoff: stop→text grows by the configured tail (default 0.2 s). Reasonable for catching the last word; the user can tune it down to 0 or up to 0.4 s.
  • Parakeet fast-preview interaction: the extra tail audio more often trips the tail_has_audio guard, so finalize runs a full transcription instead of reusing the streaming preview — slightly slower finalize on that path, but that's exactly what recovers the last word.
  • No overlay flicker: the overlay already flips to "Transcribing" before stop() is called, and isProcessingActive keeps it visible while isRunning flips a beat later.
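
The 0 to 0.4 s range in the latency note depends on how the setting is read and clamped, and on the backup field staying optional. A rough sketch of both, assuming an in-memory dictionary in place of the real persisted store; `TailSettingsSketch`, `SettingsBackupSketch`, and `apply(_:to:)` are hypothetical names, not the project's types.

```swift
import Foundation

// Hypothetical settings store; a dictionary stands in for persisted storage.
final class TailSettingsSketch {
    static let defaultTail = 0.2
    static let allowed = 0.0...0.4

    private var storage: [String: Double] = [:]
    private let key = "recordingTailDuration"

    static func clamp(_ value: Double) -> Double {
        min(max(value, allowed.lowerBound), allowed.upperBound)
    }

    var recordingTailDuration: Double {
        get {
            // Nil-check: only a missing value falls back to the default,
            // so a stored 0 (tail disabled) is returned as 0.
            guard let stored = storage[key] else { return Self.defaultTail }
            return Self.clamp(stored)
        }
        set { storage[key] = Self.clamp(newValue) }
    }
}

// Hypothetical backup payload. The field is optional, so a backup written
// before the setting existed decodes with nil instead of failing.
struct SettingsBackupSketch: Codable {
    var recordingTailDuration: Double?
}

// Restores the tail only when the backup actually carries one.
func apply(_ backup: SettingsBackupSketch, to settings: TailSettingsSketch) {
    if let tail = backup.recordingTailDuration {
        settings.recordingTailDuration = tail
    }
}
```

Restoring an older backup that lacks the field leaves the current value untouched, while an out-of-range value in a newer backup is clamped on write.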

Screenshots / Video

The new control lives in Settings → Global Hotkey card → Options, directly under "Activation Mode" — label "Extra recording after stop", a 0.00–0.40 s slider showing the live value. Screenshot to be added.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d293442db6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread Sources/Fluid/Services/ASRService.swift
EliseiNicolae force-pushed the recording-tail-setting branch 4 times, most recently from 04b24d7 to 16b530d, on June 26, 2026 12:26

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 16b530d9e8

Comment thread Sources/Fluid/Services/ASRService.swift
EliseiNicolae force-pushed the recording-tail-setting branch from 16b530d to ab959d0 on June 26, 2026 12:47

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab959d0b95

Comment thread Sources/Fluid/Services/ASRService.swift
@EliseiNicolae

Contributor Author

Demo:
https://drive.google.com/file/d/1KBdOcTNM3FDFH1E2nX-p5nRSeWpI3IR1/view?usp=sharing

@altic-dev

Owner

Amazing job making it configurable. I don't wanna ship it by default, as people would mostly complain about how slow it is compared to other apps. So making it configurable helps deal with getting the last word if needed, I guess.

I wonder if we can look into optimizing CoreAudio to capture it without the 100 ms delay in the first place.

Also, nice demo. Shows it off immediately.
