Multimodal LLM backbone + GoogleGenAI backend + image utilities by allenanie · Pull Request #76 · AgentOpt/OpenTrace

allenanie · 2026-06-02T06:31:14Z

What & why

Introduces the multimodal conversation layer used by the v3 optimizers, Trace-Bench, and debug_polca. This is a minimal, reviewable extraction from the prototype branch features/multimodal_opt, whose ~11.7k-line diff was unreviewable because it bundled unrelated trainer-refactor work. Basing this PR on experimental (where the trainer refactor has already landed) drops that noise automatically.

This is PR 1 of 2 (stacked). PR 2 (feature/optoprime-v3) adds the optimizers that consume this layer and targets this branch.

Changes

opto/utils/backbone/ — the former 2809-line backbone.py is now a package: content.py, template.py, turns.py, chat.py, with __init__.py re-exporting the public API (Content, ContentBlockList, TextContent, ImageContent, PromptTemplate, UserTurn, AssistantTurn, Chat, DEFAULT_IMAGE_PLACEHOLDER, …). Unverified surface removed: ToolCall/ToolResult/ToolDefinition/UnparsedToolCall, PDFContent, FileContent (all internal-only; no consumer used them).
opto/utils/llm.py — adds GoogleGenAILLM, embed(), and an mm_beta multimodal path returning AssistantTurn. Removed GeminiRESTLLM + helpers. openai/google-genai are now imported lazily so the module loads without them.
opto/trace/nodes.py — adds is_image(), verify_data_is_image_url(), and the Node.is_image property (PIL/requests lazy).
opto/optimizers/utils.py — adds is_bedrock_model().
opto/utils/display/ — optional Jupyter HTML rendering (lazily loaded; backbone degrades gracefully without it).
setup.py: pin litellm==1.80.8, add google-genai and pillow.
Tests: tests/unit_tests/test_backbone.py and extended test_llm.py. Live LLM/multimodal tests are opt-in via RUN_LIVE_LLM_TESTS=1 so CI (a text-only stub) skips them.
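
The lazy-import behaviour described for opto/utils/llm.py can be illustrated with a small sketch. This is not the code in this PR: `lazy_import` and `HypotheticalBackend` are made-up names, and the sketch only assumes that the optional package is resolved on first use rather than at module load.

```python
import importlib


def lazy_import(module_name: str, install_hint: str):
    """Import an optional dependency at call time rather than at module load."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message; the original error stays chained.
        raise ImportError(
            f"{module_name!r} is required for this backend; "
            f"install it with: {install_hint}"
        ) from exc


class HypotheticalBackend:
    """Illustrative backend that needs an optional package only when used."""

    def __init__(self, module_name: str, install_hint: str):
        self.module_name = module_name
        self.install_hint = install_hint
        self._module = None  # resolved on first use, not at construction

    def client_module(self):
        if self._module is None:
            self._module = lazy_import(self.module_name, self.install_hint)
        return self._module
```

With this shape, constructing the backend never fails on a missing package. The ImportError surfaces only when the backend is actually exercised, which is what lets the enclosing module load in environments that lack the optional dependency.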

Backward-compatibility (bugs fixed vs prototype)

LLM.__new__ now defaults mm_beta=False — LLM(model=...) returns raw completion responses (resp.choices[0].message.content), so Trace-Bench and OptoPrime v1/v2 are unaffected. Only the v3 optimizers opt into mm_beta=True.
Removed the stale ConversationHistory import (class is now Chat).
LLMFactory.get_llm(profile) positional usage (e.g. in optoprimemulti.py) still works.
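
A rough sketch of the flag-based dispatch described above, under stated assumptions: `SketchLLM`, `SketchAssistantTurn`, and the two backend classes are hypothetical stand-ins, not the real `LLM` or `AssistantTurn`. The only behaviour taken from this PR is that the default path returns a raw completion-shaped response and the `mm_beta=True` path returns a turn object.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class SketchAssistantTurn:
    """Stand-in for the backbone's AssistantTurn (the field is hypothetical)."""
    text: str


class _RawBackend:
    def __call__(self, prompt: str):
        # Mimic the completion shape: resp.choices[0].message.content
        message = SimpleNamespace(content=f"echo: {prompt}")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


class _MultimodalBackend(_RawBackend):
    def __call__(self, prompt: str):
        raw = super().__call__(prompt)
        # Wrap the raw response in a turn object for multimodal consumers.
        return SketchAssistantTurn(text=raw.choices[0].message.content)


class SketchLLM:
    """Factory-style class: __new__ picks a backend from the mm_beta flag."""

    def __new__(cls, model: str = "stub", mm_beta: bool = False):
        # Defaulting to False keeps existing callers on the raw-response path.
        return _MultimodalBackend() if mm_beta else _RawBackend()
```

Callers that never pass `mm_beta` keep reading `resp.choices[0].message.content`, while opt-in callers receive the turn object instead.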

Verification

pytest tests/unit_tests/ passes (live tests skipped without RUN_LIVE_LLM_TESTS=1).
Import smoke: from opto.utils.backbone import Chat, Content, ... and from opto.utils.llm import LLM, LLMFactory, DummyLLM succeed.
Confirmed every backbone symbol imported by debug_polca is still exported, and no consumer references a removed symbol.

Introduce the multimodal conversation layer used by the v3 optimizers and Trace-Bench, refactored from the prototype on `features/multimodal_opt` to keep the change minimal and reviewable.

- opto/utils/backbone/: new package (content/template/turns/chat) providing Content, ContentBlockList, TextContent, ImageContent, PromptTemplate, UserTurn, AssistantTurn, and the Chat conversation manager. Public API is re-exported from the package __init__. Unverified surface (tool calling, PDFContent, FileContent) was dropped.
- opto/utils/llm.py: add GoogleGenAILLM backend, embed(), and an mm_beta multimodal path returning AssistantTurn. mm_beta defaults to False so existing callers keep getting raw completion responses (backward compatible). openai/google-genai are imported lazily. (GeminiREST backend dropped.)
- opto/utils/display/: optional Jupyter HTML rendering (loaded lazily; backbone degrades gracefully without it).
- opto/trace/nodes.py: add is_image()/verify_data_is_image_url() and the Node.is_image property (PIL/requests imported lazily).
- opto/optimizers/utils.py: add is_bedrock_model().
- setup.py: pin litellm==1.80.8, add google-genai and pillow.
- tests: add tests/unit_tests/test_backbone.py; extend test_llm.py.

The live-call tests in test_backbone.py and test_llm.py were gated on a loose HAS_CREDENTIALS check (any of OAI_CONFIG_LIST / TRACE_LITELLM_MODEL / OPENAI_API_KEY). CI sets those to point at a text-only ollama stub (openai/phi4-mini), so the tests ran and failed: they hardcode gpt-4o/gpt-4o-mini (absent on the stub) and send image URLs the stub can't accept. Gate these tests behind an explicit RUN_LIVE_LLM_TESTS=1 opt-in (which CI does not set) so they only run against a real, image-capable provider. Also drop a stale assertion that AssistantTurn exposes `tool_calls` (tool support was removed from the backbone).
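
The explicit opt-in gate can be sketched as follows. This is an illustration rather than the test code in this PR: `live_llm_tests_enabled` and the test class are hypothetical names, and the sketch uses the standard-library `unittest` so that it stays self-contained (in a pytest suite the same condition would typically go into `pytest.mark.skipif`).

```python
import os
import unittest


def live_llm_tests_enabled(env=None) -> bool:
    """True only when RUN_LIVE_LLM_TESTS=1 is set explicitly.

    Having credentials in the environment is deliberately not enough:
    they may point at a text-only stub.
    """
    env = os.environ if env is None else env
    return env.get("RUN_LIVE_LLM_TESTS") == "1"


class LiveMultimodalTests(unittest.TestCase):
    @unittest.skipUnless(
        live_llm_tests_enabled(),
        "set RUN_LIVE_LLM_TESTS=1 to run live LLM tests",
    )
    def test_live_image_call(self):
        # Placeholder body: a real test would call an image-capable provider.
        self.assertTrue(True)
```

The point of the design is that a variable such as OPENAI_API_KEY being present no longer enables the live tests, so a CI job configured with stub credentials skips them.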

* Add OptoPrimeV3 and OPROv3 multimodal optimizers

  Stacked on the multimodal backbone branch. These optimizers build prompts as multimodal Content (text + images) via the backbone Chat/UserTurn/AssistantTurn primitives and require an mm_beta LLM.

  - opto/optimizers/optoprime_v3.py: OptoPrimeV3 (subclasses OptoPrime), OptimizerPromptSymbolSet variants, ProblemInstance, and value_to_image_content.
  - opto/optimizers/opro_v3.py: OPROv3 (subclasses OptoPrimeV3) with a smaller prompt symbol set.
  - opto/optimizers/__init__.py: export OptoPrimeV3 and OPROv3.
  - tests/llm_optimizers_tests/test_optoprime_v3.py.

  Fixes a pre-existing bug in ProblemInstance: content fields passed as plain strings (feedback/context) crashed __repr__/to_content_blocks. Added a __post_init__ that normalizes fields via ContentBlockList.ensure, and made __repr__ include the Context section so it matches to_content_blocks.

* Make live OptoPrimeV3 tests opt-in (RUN_LIVE_LLM_TESTS)

  Mirror the backbone-branch test gating: real LLM optimizer-step tests now run only when RUN_LIVE_LLM_TESTS=1, so they don't fail against CI's text-only stub.
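
The ProblemInstance fix follows a common dataclass pattern, sketched below under assumptions. `SketchText`, `ensure_blocks`, and `SketchProblemInstance` are hypothetical stand-ins. The real fix uses ContentBlockList.ensure, whose exact behaviour is not shown in this PR text, and the section headings in the sketch are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SketchText:
    """Stand-in for a text content block."""
    text: str


def ensure_blocks(value: Union[None, str, SketchText, List[SketchText]]) -> List[SketchText]:
    """Coerce None, a plain string, or a single block into a list of blocks."""
    if value is None:
        return []
    if isinstance(value, str):
        return [SketchText(value)]
    if isinstance(value, SketchText):
        return [value]
    return list(value)


@dataclass
class SketchProblemInstance:
    feedback: Union[str, List[SketchText]] = ""
    context: Union[str, List[SketchText], None] = None

    def __post_init__(self):
        # Normalize once, so every later method can assume lists of blocks.
        self.feedback = ensure_blocks(self.feedback)
        self.context = ensure_blocks(self.context)

    def to_content_blocks(self) -> List[SketchText]:
        return [
            SketchText("# Context"), *self.context,
            SketchText("# Feedback"), *self.feedback,
        ]

    def __repr__(self) -> str:
        # Derive the repr from to_content_blocks so the two cannot drift apart.
        return "\n".join(block.text for block in self.to_content_blocks())
```

Normalizing in `__post_init__` means a caller passing `feedback="too slow"` and one passing a list of blocks reach the same internal state, and building `__repr__` from `to_content_blocks` keeps sections such as Context in sync between the two views.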

chinganc · 2026-06-10T20:05:05Z

@allenanie, can you resolve the conflict first?

allenanie mentioned this pull request Jun 2, 2026

Add OptoPrimeV3 and OPROv3 multimodal optimizers #77

Merged

allenanie force-pushed the feature/llm-backbone branch from 9f8153d to 73b0347 on June 2, 2026, 15:37

allenanie wants to merge 3 commits into experimental from feature/llm-backbone
