A-CODEAI open-bench reproduce bundle (SEND_GATE HOLD · per-SKU metrics only)
Updated Jun 16, 2026 - Python
Pipeline to investigate structured reasoning and instruction adherence in multimodal LLMs