Show HN: We benchmarked 18 LLMs on OCR (7K+ calls) – cheaper models win
Category: devtools
Tags: ocr, llm-benchmark, typescript
Score: 6.5/10 (Innovation: 5, Technical: 7, Documentation: 8, Utility: 6)
A reproducible benchmark of LLM-based OCR extraction on business documents that measures quality, latency, cost, and reliability across 18 models from multiple providers (7K+ calls). It ships with a dataset, a CLI, a post-processing pipeline, and a leaderboard frontend, so teams can compare models under production-like conditions.
Target audience: backend devs, data engineers
Repository: https://www.arbitrhq.ai/leaderboards/ · TypeScript · MIT · 1 star