autoevals: support Zod v4 via peer dependency #194
Draft
Ronald Koh (ronaldkohhh) wants to merge 1 commit into
Move `zod` from a hard dependency (pinned to `^3.25.76`) to a peer dependency with range `^3.25.34 || ^4.0`, mirroring the pattern already used by the main `braintrust` TypeScript SDK (`braintrust/sdk/js/package.json:244-246`). Users on either Zod major version can now consume autoevals without needing to allow duplicate Zod installs or apply local patches.

Adds `js/zod-utils.ts` with a small `zodToJsonSchema` shim that dispatches between Zod v3 (via `zod-to-json-schema`) and Zod v4 (via v4's native `z.toJSONSchema()`). The shim is a direct copy of the same pattern at `braintrust/sdk/js/src/zod/utils.ts`, so v3 and v4 schemas both produce JSON Schema output that's compatible with OpenAI tool parameters.

Updates `js/ragas.ts` to import the shim instead of pulling `zod-to-json-schema` directly, so the same code works for users on either Zod major. `js/templates.ts` is unchanged: it only uses basic `z.object`, `z.string`, etc. APIs that exist in both Zod v3 and v4, so the exported `modelGradedSpecSchema` and `ModelGradedSpec` type resolve to whichever Zod version the consumer has installed.

Picks up the work Caitlin started in #155 but takes a minimal approach off current main rather than carrying that branch's incidental drift. The internal integration test failures she flagged on 2026-01-13 in #155 still need to be re-validated against this change since they weren't reproducible from autoevals' public CI alone; flagging that explicitly in the PR description.

Reported by Juicebox (Pylon #17165). Linear: BT-5495.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
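As a rough illustration of the dispatch described in the commit message, the sketch below routes a schema to one of two converters. It is not the code in `js/zod-utils.ts`: the detection rule (Zod v4 schemas expose a `_zod` property, v3 schemas do not) and the names `isZodV4` and `makeZodToJsonSchema` are assumptions, and the converters are injected stubs so the example stays dependency-free.

```typescript
// Minimal structural types; real code would use the types exported by zod.
type JsonSchema = Record<string, unknown>;
type Converter = (schema: object) => JsonSchema;

// Assumption: a `_zod` property marks a Zod v4 schema, while v3 schemas
// only carry `_def`.
function isZodV4(schema: object): boolean {
  return "_zod" in schema;
}

// Hypothetical factory. In the PR the two converters would be
// `zod-to-json-schema` (v3) and v4's native `z.toJSONSchema()`.
function makeZodToJsonSchema(v3: Converter, v4: Converter): Converter {
  return (schema) => (isZodV4(schema) ? v4(schema) : v3(schema));
}

// Usage with stub converters that only record which path was taken.
const toJsonSchema = makeZodToJsonSchema(
  () => ({ via: "zod-to-json-schema" }),
  () => ({ via: "z.toJSONSchema" }),
);

const fromV3 = toJsonSchema({ _def: { typeName: "ZodObject" } });
const fromV4 = toJsonSchema({ _zod: { def: { type: "object" } } });
console.log(fromV3["via"], fromV4["via"]);
```

Injecting the converters keeps the version check separate from the conversion itself, which makes the dispatch easy to unit test without installing both Zod majors.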
Summary
- Moves `zod` from a hard dependency to a peer dependency with range `^3.25.34 || ^4.0`, mirroring the pattern already in use in the main `braintrust` TypeScript SDK. Users on either Zod major can now consume autoevals without allowing duplicate Zod installs or patching the build locally.
- Adds `js/zod-utils.ts`, a direct port of the `zodToJsonSchema` shim at `braintrust/sdk/js/src/zod/utils.ts`. Dispatches Zod v3 schemas through `zod-to-json-schema` and Zod v4 schemas through v4's native `z.toJSONSchema()`.
- Updates `js/ragas.ts` to import the shim.
- `js/templates.ts` works as-is since it uses only basic Zod APIs that exist in both majors.

Relationship to prior work
This supersedes #155 (Caitlin's draft from December 2025). That branch had the right idea but accumulated unrelated drift from a stale base (changes to `init-models.test.ts`, `llm.fixtures.ts` model strings, `thread-utils` exports, etc.). I started fresh from current `main` and applied only the minimal change needed for Zod v4 compat.

Closes #155 if/when this lands.
Reviewer note: internal integration tests
Caitlin flagged on 2026-01-13 in #155 that her version of this change was failing some internal integration tests that weren't covered by autoevals' public CI. Those tests still need to be re-validated against this PR. Public CI on this branch should pass cleanly since the change matches the proven SDK pattern, but the internal failures she saw could re-appear in whichever monorepo consumes autoevals.
If they do, the most likely sources are:
- The `modelGradedSpecSchema` export type now resolves to the consumer's installed Zod, which could break if the monorepo mixes Zod v3 and v4 in the same module graph.
- Subpath imports (`zod/v3`, `zod/v4`) requiring Zod 3.25+; older Zod 3.x versions don't ship those subpaths.

Happy to debug whichever specific tests fail; just need a pointer to the failing CI run.
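To triage the version-related causes above, a consumer could compare each resolved Zod version in its module graph against the peer range. The sketch below hard-codes the semantics of `^3.25.34 || ^4.0` rather than using a semver library; the function names are hypothetical, and prerelease versions are rejected outright, which is a simplification.

```typescript
// Parse a plain "major.minor.patch" version; anything else (including
// prerelease tags) is treated as not parseable.
function parseVersion(v: string): [number, number, number] | null {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) return null;
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Hypothetical check equivalent to the peer range `^3.25.34 || ^4.0`:
// any 4.x.y, or any 3.x.y at or above 3.25.34.
function satisfiesZodPeerRange(version: string): boolean {
  const parsed = parseVersion(version);
  if (!parsed) return false;
  const [major, minor, patch] = parsed;
  if (major === 4) return true;
  if (major === 3) return minor > 25 || (minor === 25 && patch >= 34);
  return false;
}

// Example: flag every resolved copy of zod that falls outside the range.
const resolved = ["3.25.76", "3.23.8", "4.1.0"];
const outOfRange = resolved.filter((v) => !satisfiesZodPeerRange(v));
console.log(outOfRange);
```

If more than one version appears in the resolved list at all, that alone points at the mixed v3/v4 module graph described in the first bullet.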
Test plan
- With `zod@^3.25.34`: confirm runtime + types work.
- With `zod@^4.0`: confirm runtime + types work.
- Each `ragas` scorer (`ContextRelevancy`, `Faithfulness`, etc.) still produces correct JSON Schema output for OpenAI tool params, on both Zod versions.
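For the last item, one possible smoke check is to run the same shape assertion over the JSON Schema produced under each Zod major. The sketch below assumes that "correct" means at least a top-level object schema with a `properties` map and an optional `required` array; the helper name `toolParameterProblems` is hypothetical and the check is deliberately shallow.

```typescript
type JsonSchema = Record<string, unknown>;

// Hypothetical helper: returns a list of problems, empty when the schema
// has the basic shape assumed here for tool parameters.
function toolParameterProblems(schema: JsonSchema): string[] {
  const problems: string[] = [];
  if (schema["type"] !== "object") {
    problems.push('top-level "type" should be "object"');
  }
  const props = schema["properties"];
  if (typeof props !== "object" || props === null || Array.isArray(props)) {
    problems.push('"properties" should be an object');
  }
  const required = schema["required"];
  if (required !== undefined && !Array.isArray(required)) {
    problems.push('"required" should be an array when present');
  }
  return problems;
}

// Example input standing in for a scorer's generated parameter schema.
const sample: JsonSchema = {
  type: "object",
  properties: { reasoning: { type: "string" } },
  required: ["reasoning"],
};
console.log(toolParameterProblems(sample));
```

Running the same helper against output generated with Zod v3 and with Zod v4 gives a quick signal before comparing the two schemas field by field.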