Show HN: 2500 vision benchmarks / evals for Vision Language Models
Category: ai-ml
Tags: ai-evaluation, vision-language-models, benchmark-dataset
Score: 7.3/10 (Innovation: 7, Technical: 6, Documentation: 8, Utility: 8)
An auto-updating catalog of 2,671 vision-language model (VLM) benchmarks, curated by scanning arXiv daily and classifying papers with Claude. It addresses the discovery problem for VLM evaluation by offering a structured dataset that can be queried programmatically and that keeps pace with fast-moving multimodal AI research.
Target audience: ai-researchers, ml-engineers, data-scientists
Repository: https://github.com/Overshoot-ai/vlm-benchmarks · Python · MIT · 1 star