Skip to content

Fix: send AI enhancement instructions in system role, not user message#439

Open
Kayaba-Attribution wants to merge 3 commits into
altic-dev:mainfrom
Kayaba-Attribution:fix/issue-388-system-prompt-placement
Open

Fix: send AI enhancement instructions in system role, not user message#439
Kayaba-Attribution wants to merge 3 commits into
altic-dev:mainfrom
Kayaba-Attribution:fix/issue-388-system-prompt-placement

Conversation

@Kayaba-Attribution

@Kayaba-Attribution Kayaba-Attribution commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Description

Fixes a bug where AI enhancement instructions were placed in the user message instead of the system prompt, causing strict role-separation models to answer the transcript rather than apply the instructions.

Root cause — two independent sites

1. ContentView.processTextWithAI (user-facing dictation path)

The dictation branch hardcoded systemPrompt = "" and folded instructions + transcript into the user turn via renderDictationUserMessage. A comment in the code marked this as intentional ("dictation enhancement folds the prompt + transcript into a single user turn"), but it breaks any model that treats role separation strictly (e.g. Cerebras gpt-oss-120b, local models with system-prompt enforcement).

2. DictationPostProcessingService.process (local API path — /v1/postprocess)

Same pattern, same root cause. This path is called from InferenceAPIController, not from the main UI dictation flow.

Fix

Both sites now consistently put the instruction text in the system role and the transcript alone in the user role:

// Before (dictation branch in ContentView)
systemPrompt = ""
userMessageContent = SettingsStore.renderDictationUserMessage(
    promptText: promptText,
    transcript: inputText
)

// After
let systemPrompt = promptText
let userMessageContent = inputText

The isDictationCall conditional in ContentView is removed — both branches now do the same thing. DictationPostProcessingService gets the same treatment.

The no-prompt case (prompt turned off) is unaffected: effectiveDictationSystemPrompt returns "" when prompts are off, so the existing if !systemPrompt.isEmpty guard in the messages builder correctly omits the system entry.

Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)

Related Issues

Testing

  • Tested on Apple Silicon Mac
  • Tested on macOS Sequoia 15
  • 4 unit tests added in DictationSystemPromptTests:
    • effectiveDictationSystemPrompt returns configured instruction text (not empty)
    • effectiveDictationSystemPrompt falls back to a non-empty built-in default
    • renderDictationUserMessage("") returns transcript only (no instructions bleeding in)
    • ${transcript} placeholder substitution still works
  • Tested on Intel Mac
  • Ran linter locally: swiftlint --strict --config .swiftlint.yml Sources — 0 violations
  • Ran formatter locally: swiftformat --config .swiftformat Sources/Fluid/Services/DictationPostProcessingService.swift Sources/Fluid/ContentView.swift — 0 additional changes

Notes

Full end-to-end verification requires a live provider that exhibits the role-conflation failure (e.g. Cerebras gpt-oss-120b or a local model strict about system vs. user separation). The unit tests cover the message-composition logic without a network call.

…ge (altic-dev#388)

When building the LLM request for dictation enhancement,
DictationPostProcessingService hardcoded `systemPrompt = ""` and
folded the instruction text into the user message via
renderDictationUserMessage. This caused providers that treat role
separation strictly (e.g. Cerebras gpt-oss-120b) to answer the
transcript as a question rather than apply the instructions as a
directive.

Fix: assign the resolved prompt text to `systemPrompt` directly and
set `userMessageContent` to the raw transcript only. The messages
builder already guards on `systemPrompt.isEmpty` before appending
the system entry, so the no-prompt case (prompt off) is unaffected.

Four unit tests added in DictationSystemPromptTests covering:
- effectiveDictationSystemPrompt returns the configured instruction text
- effectiveDictationSystemPrompt falls back to a non-empty built-in default
- renderDictationUserMessage with empty prompt returns transcript only
- renderDictationUserMessage ${transcript} placeholder substitution still works

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ded39d9a9b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread Sources/Fluid/Services/DictationPostProcessingService.swift Outdated
…ev#388)

DictationPostProcessingService (fixed in the previous commit) is only
called from the local API endpoint. The actual user-facing dictation
path goes through ContentView.processTextWithAI, which had its own
independent prompt assembly that folded instructions + transcript into
a single user turn via renderDictationUserMessage.

This aligned the ContentView path with the correct behaviour: prompt
goes in the system role, transcript (or user input) is the sole user
turn. Both isDictationCall and non-dictation branches now do the same
thing, so the conditional is removed.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e92e4e75d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread Sources/Fluid/ContentView.swift Outdated
Comment on lines +1814 to +1815
let systemPrompt = promptText
let userMessageContent = inputText

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve ${transcript} substitution in dictation prompts

For any saved dictation prompt that uses the documented ${transcript} placeholder, this now sends the raw prompt text (including the literal placeholder) as the system message and the transcript as a separate user message. I checked repo-wide callers: after this diff renderDictationUserMessage is only used by the new tests, so placeholder substitution is no longer performed in either this hotkey path or DictationPostProcessingService; the added placeholder test therefore exercises dead code rather than the request sent to providers.

Useful? React with 👍 / 👎.

Add SettingsStore.renderSystemPrompt(promptText:transcript:) which
substitutes the \${transcript} placeholder when present and returns the
prompt unchanged otherwise. Call it in ContentView.processTextWithAI and
DictationPostProcessingService before assigning the system prompt, so
saved prompt templates that reference the placeholder continue to work
under the new two-role message layout.

Update DictationSystemPromptTests to test renderSystemPrompt (the
production path) instead of the now-production-dead renderDictationUserMessage.

Fixes the P2 raised in review of altic-dev#439.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7a0d1c18e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

/// the transcript is always sent as a separate user turn.
static func renderSystemPrompt(promptText: String, transcript: String) -> String {
guard promptText.contains(self.transcriptPlaceholder) else { return promptText }
return promptText.replacingOccurrences(of: self.transcriptPlaceholder, with: transcript)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid duplicating placeholder transcripts

For dictation prompts that contain ${transcript}, this replacement puts the transcript into the system prompt, while both new call sites still send the same text as the user turn (ContentView sets userMessageContent = inputText, and DictationPostProcessingService sets it to trimmed). Those prompts used to include the transcript only once via renderDictationUserMessage; now long dictations can double token usage or exceed context, and templates that relied on the placeholder to control where the sole transcript appears will see duplicate input. Please either avoid expanding the placeholder once the transcript is a separate user message, or suppress the extra user turn for placeholder templates.

Useful? React with 👍 / 👎.

@altic-dev

Copy link
Copy Markdown
Owner

This is a tricky fix as it causes conflicts betweeen another PR about the state of system vs user prompt. Would to have a discussion before we merge this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants