This paper introduces BloomBench, a new bilingual (English-Arabic), cognitively informed benchmark based on Bloom's Taxonomy that systematically evaluates the reasoning abilities of vision-language models (VLMs) across the taxonomy's hierarchical cognitive levels.
TIR-Bench: A repository that interprets the TIR-Bench multimodal image reasoning benchmark. It includes a dataset introduction, paper analysis, an evaluation pipeline, and VLM test results, intended for vision-language model benchmark research.