
chore(release): bump autoevals version to 0.3.0#197

Merged
Abhijeet Prasad (AbhiPrasad) merged 1 commit into main from abhi-chore-bump-autoevals-version
Jun 9, 2026

Conversation

@AbhiPrasad

Member

No description provided.

@github-actions

github-actions Bot commented Jun 9, 2026


Braintrust eval report

Autoevals (HEAD-1781018601)

| Score | Average | Improvements | Regressions |
| --- | --- | --- | --- |
| NumericDiff | 79.8% (+1pp) | 10 🟢 | 5 🔴 |
| Time_to_first_token | 11.11tok (+2.01tok) | 33 🟢 | 179 🔴 |
| Llm_calls | 1.55 (+0) | - | - |
| Tool_calls | 0 (+0) | - | - |
| Errors | 0 (+0) | - | - |
| Llm_errors | 0 (+0) | - | - |
| Tool_errors | 0 (+0) | - | - |
| Prompt_tokens | 522.25tok (-2.04tok) | 4 🟢 | 2 🔴 |
| Prompt_cached_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_5m_tokens | 0tok (+0tok) | - | - |
| Prompt_cache_creation_1h_tokens | 0tok (+0tok) | - | - |
| Completion_tokens | 464.8tok (+3.8tok) | 85 🟢 | 112 🔴 |
| Completion_reasoning_tokens | 356.65tok (+6.11tok) | 76 🟢 | 95 🔴 |
| Completion_accepted_prediction_tokens | 0tok (+0tok) | - | - |
| Completion_rejected_prediction_tokens | 0tok (+0tok) | - | - |
| Completion_audio_tokens | 0tok (+0tok) | - | - |
| Total_tokens | 987.05tok (+1.76tok) | 85 🟢 | 112 🔴 |
| Estimated_cost | 0$ (+0$) | 73 🟢 | 101 🔴 |
| Duration | 11.11s (+2s) | 34 🟢 | 181 🔴 |
| Llm_duration | 12.1s (+2.3s) | 29 🟢 | 184 🔴 |

@AbhiPrasad Abhijeet Prasad (AbhiPrasad) merged commit f372f07 into main Jun 9, 2026
15 checks passed
@github-actions

github-actions Bot commented Jun 9, 2026


Braintrust eval report

Autoevals (main-1781026105)


2 participants