Slim multimodal LLM backbone to minimal stateless primitives by allenanie · Pull Request #79 · AgentOpt/OpenTrace

allenanie · 2026-06-12T18:20:08Z

Summary

Strips opto/utils/backbone down to a small, reviewable multimodal layer and removes the surrounding bloat that made review hard.

Keeps the multimodal content primitives (TextContent, ImageContent, ContentBlockList, Content, PromptTemplate) and trims redundant helpers/constructors.
Replaces the Chat conversation manager with a stateless to_messages(system_prompt, user_content, history=None) helper. Optimizers now own their own message history as a plain list[dict].
Deletes the opto/utils/display (Jupyter HTML) package and all _repr_html_ hooks, plus leftover tool-call/Files traces.
Trims the very long llm.py docstrings while keeping mm_beta AssistantTurn wrapping and Gemini message conversion intact.
Updates OptoPrimeV3 and OPROv3 to build requests via to_messages(), preserving the image-as-node input path and image-generation output path.

Net: backbone + display shrink from ~3,200 lines to a focused multimodal layer (13 files changed, +693 / -3441).

Note: helix.py and its surrounding test/example files are intentionally left untracked and are not part of this PR.

Test plan

pytest tests/unit_tests/test_backbone.py (rewritten for the slimmed API)
pytest tests/unit_tests/test_llm.py
pytest tests/llm_optimizers_tests/test_optoprime_v3.py
Backbone/optimizer imports verified; stateless image message path smoke-tested
Opt-in live LLM tests (RUN_LIVE_LLM_TESTS=1) against real providers

Made with Cursor

Introduce the multimodal conversation layer used by the v3 optimizers and Trace-Bench, refactored from the prototype on `features/multimodal_opt` to keep the change minimal and reviewable. - opto/utils/backbone/: new package (content/template/turns/chat) providing Content, ContentBlockList, TextContent, ImageContent, PromptTemplate, UserTurn, AssistantTurn, and the Chat conversation manager. Public API is re-exported from the package __init__. Unverified surface (tool calling, PDFContent, FileContent) was dropped. - opto/utils/llm.py: add GoogleGenAILLM backend, embed(), and an mm_beta multimodal path returning AssistantTurn. mm_beta defaults to False so existing callers keep getting raw completion responses (backward compatible). openai/google-genai are imported lazily. (GeminiREST backend dropped.) - opto/utils/display/: optional Jupyter HTML rendering (loaded lazily; backbone degrades gracefully without it). - opto/trace/nodes.py: add is_image()/verify_data_is_image_url() and the Node.is_image property (PIL/requests imported lazily). - opto/optimizers/utils.py: add is_bedrock_model(). - setup.py: pin litellm==1.80.8, add google-genai and pillow. - tests: add tests/unit_tests/test_backbone.py; extend test_llm.py.

The live-call tests in test_backbone.py and test_llm.py were gated on a loose HAS_CREDENTIALS check (any of OAI_CONFIG_LIST / TRACE_LITELLM_MODEL / OPENAI_API_KEY). CI sets those to point at a text-only ollama stub (openai/phi4-mini), so the tests ran and failed: they hardcode gpt-4o/gpt-4o-mini (absent on the stub) and send image URLs the stub can't accept. Gate these tests behind an explicit RUN_LIVE_LLM_TESTS=1 opt-in (which CI does not set) so they only run against a real, image-capable provider. Also drop a stale assertion that AssistantTurn exposes `tool_calls` (tool support was removed from the backbone).

* Add OptoPrimeV3 and OPROv3 multimodal optimizers Stacked on the multimodal backbone branch. These optimizers build prompts as multimodal Content (text + images) via the backbone Chat/UserTurn/AssistantTurn primitives and require an mm_beta LLM. - opto/optimizers/optoprime_v3.py: OptoPrimeV3 (subclasses OptoPrime), OptimizerPromptSymbolSet variants, ProblemInstance, and value_to_image_content. - opto/optimizers/opro_v3.py: OPROv3 (subclasses OptoPrimeV3) with a smaller prompt symbol set. - opto/optimizers/__init__.py: export OptoPrimeV3 and OPROv3. - tests/llm_optimizers_tests/test_optoprime_v3.py. Fixes a pre-existing bug in ProblemInstance: content fields passed as plain strings (feedback/context) crashed __repr__/to_content_blocks. Added a __post_init__ that normalizes fields via ContentBlockList.ensure, and made __repr__ include the Context section so it matches to_content_blocks. * Make live OptoPrimeV3 tests opt-in (RUN_LIVE_LLM_TESTS) Mirror the backbone-branch test gating: real LLM optimizer-step tests now run only when RUN_LIVE_LLM_TESTS=1, so they don't fail against CI's text-only stub.

Reduce opto/utils/backbone to a small, reviewable multimodal layer: - Keep text+image content primitives (TextContent, ImageContent, ContentBlockList, Content, PromptTemplate) and trim redundant helpers. - Replace the Chat conversation manager with a stateless to_messages() helper; optimizers now own their own message history as a plain list. - Remove the opto/utils/display (Jupyter HTML) package and all _repr_html_ hooks, plus leftover tool-call/Files traces. - Trim verbose llm.py docstrings; keep mm_beta AssistantTurn wrapping and Gemini message conversion intact. - Update OptoPrimeV3 and OPROv3 to build requests via to_messages(), preserving the image-as-node and image-generation output paths. - Rewrite test_backbone.py for the slimmed API and fix test_llm.py. Co-authored-by: Cursor <cursoragent@cursor.com>

allenanie and others added 4 commits June 2, 2026 01:54

allenanie changed the base branch from feature/llm-backbone to experimental June 12, 2026 18:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slim multimodal LLM backbone to minimal stateless primitives#79

Slim multimodal LLM backbone to minimal stateless primitives#79
allenanie wants to merge 4 commits into
experimentalfrom
feature/llm-backbone-minimal

allenanie commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

allenanie commented Jun 12, 2026

Summary

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant