Multimodal LLM backbone + GoogleGenAI backend + image utilities #76
Open
allenanie wants to merge 3 commits into
Introduce the multimodal conversation layer used by the v3 optimizers and Trace-Bench, refactored from the prototype on `features/multimodal_opt` to keep the change minimal and reviewable.

- opto/utils/backbone/: new package (content/template/turns/chat) providing Content, ContentBlockList, TextContent, ImageContent, PromptTemplate, UserTurn, AssistantTurn, and the Chat conversation manager. Public API is re-exported from the package __init__. Unverified surface (tool calling, PDFContent, FileContent) was dropped.
- opto/utils/llm.py: add GoogleGenAILLM backend, embed(), and an mm_beta multimodal path returning AssistantTurn. mm_beta defaults to False so existing callers keep getting raw completion responses (backward compatible). openai/google-genai are imported lazily. (GeminiREST backend dropped.)
- opto/utils/display/: optional Jupyter HTML rendering (loaded lazily; backbone degrades gracefully without it).
- opto/trace/nodes.py: add is_image()/verify_data_is_image_url() and the Node.is_image property (PIL/requests imported lazily).
- opto/optimizers/utils.py: add is_bedrock_model().
- setup.py: pin litellm==1.80.8, add google-genai and pillow.
- tests: add tests/unit_tests/test_backbone.py; extend test_llm.py.
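
The opt-in behavior of mm_beta described in this commit can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in (`LLMSketch`, `AssistantTurnSketch`, and the fake completion helper are not the real `opto.utils.llm` or backbone classes); it only shows the shape of the contract: the flag is off by default, so legacy callers still read `resp.choices[0].message.content`, and only callers that pass `mm_beta=True` get a turn object back.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class AssistantTurnSketch:
    """Hypothetical stand-in for the backbone's AssistantTurn."""
    text: str


def _fake_completion(prompt: str) -> SimpleNamespace:
    # Builds a completion-shaped object so resp.choices[0].message.content works.
    message = SimpleNamespace(content=f"echo: {prompt}")
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


class LLMSketch:
    """Hypothetical wrapper; not the real opto.utils.llm.LLM."""

    def __init__(self, model: str, mm_beta: bool = False):
        self.model = model
        self.mm_beta = mm_beta  # opt-in flag, off unless the caller asks for it

    def __call__(self, prompt: str):
        raw = _fake_completion(prompt)
        if not self.mm_beta:
            # Legacy path: hand back the raw completion response untouched.
            return raw
        # Multimodal path: wrap the reply in a turn object.
        return AssistantTurnSketch(text=raw.choices[0].message.content)


legacy = LLMSketch(model="stub")
print(legacy("hi").choices[0].message.content)

multimodal = LLMSketch(model="stub", mm_beta=True)
print(multimodal("hi").text)
```

Keeping the default at False means callers that were written against the raw response never see a new return type unless they ask for it.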
The live-call tests in test_backbone.py and test_llm.py were gated on a loose HAS_CREDENTIALS check (any of OAI_CONFIG_LIST / TRACE_LITELLM_MODEL / OPENAI_API_KEY). CI sets those to point at a text-only ollama stub (openai/phi4-mini), so the tests ran and failed: they hardcode gpt-4o/gpt-4o-mini (absent on the stub) and send image URLs the stub can't accept. Gate these tests behind an explicit RUN_LIVE_LLM_TESTS=1 opt-in (which CI does not set) so they only run against a real, image-capable provider. Also drop a stale assertion that AssistantTurn exposes `tool_calls` (tool support was removed from the backbone).
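
A minimal sketch of the explicit opt-in gate described above. It assumes the check is "the variable equals the string 1", which the commit message implies but does not spell out, and it uses the standard library's `unittest.skipUnless` to stay self-contained (the repository's own tests may use a pytest marker instead). The helper name `live_llm_tests_enabled` and the test class are hypothetical.

```python
import os
import unittest


def live_llm_tests_enabled(env=None) -> bool:
    """Return True only when RUN_LIVE_LLM_TESTS is set to "1".

    Credentials alone (e.g. OPENAI_API_KEY) are deliberately ignored,
    since CI may set them to point at a text-only stub.
    """
    env = os.environ if env is None else env
    return env.get("RUN_LIVE_LLM_TESTS") == "1"


# Decorator evaluated once at import time against the real environment.
requires_live_llm = unittest.skipUnless(
    live_llm_tests_enabled(),
    "set RUN_LIVE_LLM_TESTS=1 to run against a real, image-capable provider",
)


class LiveMultimodalTest(unittest.TestCase):
    @requires_live_llm
    def test_image_prompt(self):
        # Placeholder body: a real test would call the provider here.
        self.assertTrue(True)
```

With this gate, an environment that only exports credentials skips the live tests, while `RUN_LIVE_LLM_TESTS=1` runs them.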
9f8153d to 73b0347
* Add OptoPrimeV3 and OPROv3 multimodal optimizers

  Stacked on the multimodal backbone branch. These optimizers build prompts as multimodal Content (text + images) via the backbone Chat/UserTurn/AssistantTurn primitives and require an mm_beta LLM.

  - opto/optimizers/optoprime_v3.py: OptoPrimeV3 (subclasses OptoPrime), OptimizerPromptSymbolSet variants, ProblemInstance, and value_to_image_content.
  - opto/optimizers/opro_v3.py: OPROv3 (subclasses OptoPrimeV3) with a smaller prompt symbol set.
  - opto/optimizers/__init__.py: export OptoPrimeV3 and OPROv3.
  - tests/llm_optimizers_tests/test_optoprime_v3.py.

  Fixes a pre-existing bug in ProblemInstance: content fields passed as plain strings (feedback/context) crashed __repr__/to_content_blocks. Added a __post_init__ that normalizes fields via ContentBlockList.ensure, and made __repr__ include the Context section so it matches to_content_blocks.

* Make live OptoPrimeV3 tests opt-in (RUN_LIVE_LLM_TESTS)

  Mirror the backbone-branch test gating: real LLM optimizer-step tests now run only when RUN_LIVE_LLM_TESTS=1, so they don't fail against CI's text-only stub.
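
The ProblemInstance fix mentioned in the first commit above (normalizing plain-string fields in __post_init__ and keeping __repr__ in step with to_content_blocks) might look roughly like the sketch below. All classes are hypothetical stand-ins: `BlockList.ensure` only approximates what ContentBlockList.ensure is described as doing, the section headers and their order are invented for illustration, and the sketch handles text blocks only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextBlock:
    """Hypothetical stand-in for the backbone's TextContent."""
    text: str


class BlockList(list):
    """Hypothetical stand-in for ContentBlockList."""

    @classmethod
    def ensure(cls, value) -> "BlockList":
        # Accept an existing block list, None, a plain string, or an iterable.
        if isinstance(value, cls):
            return value
        if value is None:
            return cls()
        if isinstance(value, str):
            return cls([TextBlock(value)])
        return cls(value)

    def to_text(self) -> str:
        return "".join(block.text for block in self)


@dataclass(repr=False)
class ProblemInstanceSketch:
    feedback: Union[str, BlockList]
    context: Union[str, BlockList, None] = None

    def __post_init__(self):
        # Normalize once so later methods can assume block lists.
        self.feedback = BlockList.ensure(self.feedback)
        self.context = BlockList.ensure(self.context)

    def to_content_blocks(self) -> BlockList:
        blocks = BlockList()
        if self.context:  # an empty context section is omitted
            blocks.append(TextBlock("# Context\n"))
            blocks.extend(self.context)
            blocks.append(TextBlock("\n"))
        blocks.append(TextBlock("# Feedback\n"))
        blocks.extend(self.feedback)
        return blocks

    def __repr__(self) -> str:
        # Derive the repr from the same blocks so the two never diverge.
        return self.to_content_blocks().to_text()


instance = ProblemInstanceSketch(feedback="too slow", context="sorting task")
print(repr(instance))
```

Deriving `__repr__` from `to_content_blocks` is one way to guarantee that a section such as Context cannot appear in one and be missing from the other.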
Member
@allenanie, can you resolve the conflict first?
What & why
Introduces the multimodal conversation layer used by the v3 optimizers, Trace-Bench, and `debug_polca`. This is a minimal, reviewable extraction from the prototype branch `features/multimodal_opt` (which had an unreviewable ~11.7k-line diff bundling unrelated trainer-refactor work). Basing on `experimental` (where the trainer refactor already landed) drops that noise automatically.

This is PR 1 of 2 (stacked). PR 2 (`feature/optoprime-v3`) adds the optimizers that consume this layer and targets this branch.

Changes
- `opto/utils/backbone/`: the former 2809-line `backbone.py` is now a package (`content.py`, `template.py`, `turns.py`, `chat.py`), with `__init__.py` re-exporting the public API (`Content`, `ContentBlockList`, `TextContent`, `ImageContent`, `PromptTemplate`, `UserTurn`, `AssistantTurn`, `Chat`, `DEFAULT_IMAGE_PLACEHOLDER`, …). Unverified surface removed: `ToolCall`/`ToolResult`/`ToolDefinition`/`UnparsedToolCall`, `PDFContent`, `FileContent` (all internal-only; no consumer used them).
- `opto/utils/llm.py`: adds `GoogleGenAILLM`, `embed()`, and an `mm_beta` multimodal path returning `AssistantTurn`. Removed `GeminiRESTLLM` + helpers. `openai`/`google-genai` are now imported lazily so the module loads without them.
- `opto/trace/nodes.py`: adds `is_image()`, `verify_data_is_image_url()`, and the `Node.is_image` property (PIL/requests lazy).
- `opto/optimizers/utils.py`: adds `is_bedrock_model()`.
- `opto/utils/display/`: optional Jupyter HTML rendering (lazily loaded; backbone degrades gracefully without it).
- `setup.py`: pin `litellm==1.80.8`, add `google-genai` and `pillow`.
- `tests/unit_tests/test_backbone.py` and extended `test_llm.py`. Live LLM/multimodal tests are opt-in via `RUN_LIVE_LLM_TESTS=1` so CI (a text-only stub) skips them.

Backward-compatibility (bugs fixed vs prototype)
- `LLM.__new__` now defaults `mm_beta=False`: `LLM(model=...)` returns raw completion responses (`resp.choices[0].message.content`), so Trace-Bench and OptoPrime v1/v2 are unaffected. Only the v3 optimizers opt into `mm_beta=True`.
- `ConversationHistory` import (class is now `Chat`).
- `LLMFactory.get_llm(profile)` positional usage (e.g. in `optoprimemulti.py`) still works.

Verification
- `pytest tests/unit_tests/` passes (live tests skipped without `RUN_LIVE_LLM_TESTS=1`).
- `from opto.utils.backbone import Chat, Content, ...` and `from opto.utils.llm import LLM, LLMFactory, DummyLLM` succeed.
- `debug_polca` is still exported, and no consumer references a removed symbol.
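
The import checks above depend on optional SDKs being loaded lazily. As a rough illustration of that pattern (not the code in `opto/utils/llm.py`), the sketch below defers the import until the SDK is first used and raises a readable error if it is missing. `lazy_import` and `OptionalBackendSketch` are hypothetical names, and the standard-library `json` module stands in for a real SDK.

```python
import importlib


def lazy_import(module_name: str, install_hint: str):
    """Import a module on first use, with a clear error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature; {install_hint}"
        ) from exc


class OptionalBackendSketch:
    """Hypothetical backend whose SDK is loaded only when first accessed."""

    def __init__(self, sdk_module: str, install_hint: str):
        self.sdk_module = sdk_module
        self.install_hint = install_hint
        self._sdk = None  # nothing is imported at construction time

    @property
    def sdk(self):
        if self._sdk is None:
            self._sdk = lazy_import(self.sdk_module, self.install_hint)
        return self._sdk


# Constructing a backend never touches the SDK, so this cannot fail on import.
backend = OptionalBackendSketch("json", "part of the standard library")
print(backend.sdk.dumps([1]))
```

Because the import happens inside the property, a module that defines such a backend can be imported even when the SDK is not installed; the error only surfaces when the backend is actually used.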