fix(llm): strip thinking_budget from kwargs before ChatOpenAI fallback by AasheeshLikePanner · Pull Request #7939 · BasedHardware/omi

AasheeshLikePanner · 2026-06-14T08:30:08Z

Summary

Fixes issue #7898 - thinking_budget kwarg passed to Completions.parse() crashes memory discard in pusher

Changes

backend/utils/llm/clients.py:
- _get_or_create_gemini_llm: Strip thinking_budget from kwargs before passing to ChatOpenAI in the fallback path (no GEMINI_API_KEY/USE_VERTEX_AI). thinking_budget was leaking into ChatOpenAI.model_kwargs and getting merged into the API request payload. When .with_structured_output() routes through OpenAI SDK's Completions.parse(), the unknown kwarg caused an exception.

Impact

Memory discard/trend detection in pusher no longer silently fails when the active QoS profile routes trends to a Gemini model. The except Exception in trends.py was catching the crash and returning [], causing no trends to ever be detected.

Testing

87/87 QoS unit tests passed (3 pre-existing subprocess failures unrelated)
End-to-end verified: all 5 Gemini-routed features no longer leak thinking_budget into ChatOpenAI.model_kwargs
Verified both ChatGoogleGenerativeAI (native SDK with GEMINI_API_KEY) and ChatOpenAI (fallback) paths

greptile-apps · 2026-06-14T08:32:15Z

Greptile Summary

This PR strips thinking_budget from kwargs before constructing the ChatOpenAI fallback in _get_or_create_gemini_llm, preventing the kwarg from leaking into model_kwargs and crashing Completions.parse() when no Gemini API key is configured. A targeted regression test is added that verifies thinking_budget is absent from model_kwargs on the fallback path.

clients.py: One-line kwargs.pop('thinking_budget', None) in the else (no-key) branch, correctly placed before the ChatOpenAI constructor call.
test_omi_qos_tiers.py: New test_gemini_fallback_no_thinking_budget_leak asserts the kwarg is absent; skips when GEMINI_API_KEY is set so it only exercises the fallback path.

Confidence Score: 4/5

Safe to merge — the one-line fix correctly resolves the reported crash on the fallback path and is covered by a new regression test.

The change is minimal and well-targeted: one pop in the right branch, one test that directly exercises it. The only concern is that the fix is per-kwarg rather than structural, so the next Gemini-specific kwarg (e.g. thinking_level for Gemini 3, already noted in the docstring) will need its own matching pop. That is a future maintenance burden but not a current defect.

No files require urgent attention. clients.py is the only code file changed, and the one-line fix is easy to verify in isolation.

Important Files Changed

Filename	Overview
backend/utils/llm/clients.py	Adds `kwargs.pop('thinking_budget', None)` before the ChatOpenAI fallback constructor, fixing the kwarg-leak crash. The fix is correct for the current codebase but only strips this one Gemini-specific key; other future kwargs would require the same treatment.
backend/tests/unit/test_omi_qos_tiers.py	Adds a regression test for the thinking_budget leak on the ChatOpenAI fallback path. Test correctly skips when GEMINI_API_KEY is set and asserts the key is absent from model_kwargs.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["_get_or_create_gemini_llm(model, streaming, thinking_budget)"] --> B{USE_VERTEX_AI\n+ GOOGLE_CLOUD_PROJECT?}
    B -- Yes --> C["Add thinking_budget to kwargs\n(if model is gemini-2.5*)"]
    C --> D["ChatGoogleGenerativeAI\n(Vertex AI)"]
    B -- No --> E{GEMINI_API_KEY?}
    E -- Yes --> F["Add thinking_budget to kwargs\n(if model is gemini-2.5*)"]
    F --> G["ChatGoogleGenerativeAI\n(AI Studio)"]
    E -- No --> H["Add thinking_budget to kwargs\n(if model is gemini-2.5*)"]
    H --> I["kwargs.pop('thinking_budget', None)\n✅ FIX: strip before ChatOpenAI"]
    I --> J["ChatOpenAI\n(fallback, api_key='not-set')"]
    style I fill:#d4edda,stroke:#28a745
    style J fill:#fff3cd,stroke:#ffc107

_{Reviews (1): Last reviewed commit: "test: add regression test for thinking_b..." | Re-trigger Greptile}

…back (issue BasedHardware#7898)

greptile-apps Bot reviewed Jun 14, 2026

View reviewed changes

Comment thread backend/utils/llm/clients.py Outdated

AasheeshLikePanner force-pushed the fix/thinking-budget-leak-structured-output branch from efbea25 to c07d4b4 Compare June 14, 2026 08:32

AasheeshLikePanner added 2 commits June 14, 2026 14:06

fix: strip thinking_budget from kwargs before ChatOpenAI fallback

15fc4bd

test: add regression test for thinking_budget leak in ChatOpenAI fall…

64e8593

…back (issue BasedHardware#7898)

AasheeshLikePanner force-pushed the fix/thinking-budget-leak-structured-output branch from c07d4b4 to 64e8593 Compare June 14, 2026 08:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(llm): strip thinking_budget from kwargs before ChatOpenAI fallback#7939

fix(llm): strip thinking_budget from kwargs before ChatOpenAI fallback#7939
AasheeshLikePanner wants to merge 2 commits into
BasedHardware:mainfrom
AasheeshLikePanner:fix/thinking-budget-leak-structured-output

AasheeshLikePanner commented Jun 14, 2026

Uh oh!

greptile-apps Bot commented Jun 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

AasheeshLikePanner commented Jun 14, 2026

Summary

Changes

Impact

Testing

Uh oh!

greptile-apps Bot commented Jun 14, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant