Show HN: A 150M model that extracts verbatim evidence spans for RAG, no LLM call
Category: ai-ml
Tags: rag, hallucination-prevention, bert, evidence-extraction, ml
Score: 7.8/10 (Innovation: 7, Technical: 8, Documentation: 8, Utility: 8)
Verbatim RAG is a minimal retrieval-augmented generation framework designed to prevent hallucination: every response is composed from exact text spans copied verbatim from the source documents, each with a citation. A 150M-parameter ModernBERT classifier extracts the spans, so no LLM call is needed. Combining span extraction with sparse embeddings lets the whole pipeline run on CPU only, which suits lightweight, evidence-grounded QA. The project is backed by benchmark results and datasets.
Target audience: backend devs, data engineers, ML researchers, DevOps
Repository: https://huggingface.co/KRLabsOrg/verbatim-rag-modern-bert-v2 · Python · MIT · 190 stars
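To make the span-extraction idea in the description concrete, here is a minimal sketch of how per-token relevance scores could be turned into verbatim, cited spans. It is an illustration under stated assumptions, not the project's API: the names `Span`, `whitespace_offsets`, `extract_spans`, and `compose_answer` are hypothetical, the tokenizer is a plain whitespace split, and the scores are hand-written stand-ins for what a token classifier such as the ModernBERT model would output.

```python
import re
from dataclasses import dataclass


@dataclass
class Span:
    doc_id: str
    start: int  # character offset into the source text, inclusive
    end: int    # character offset into the source text, exclusive
    text: str   # exact substring of the source, never paraphrased


def whitespace_offsets(text):
    """Return (start, end) character offsets for each whitespace-separated token."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def extract_spans(doc_id, text, offsets, scores, threshold=0.5):
    """Merge runs of consecutive tokens scoring >= threshold into verbatim spans."""
    spans = []
    run_start = run_end = None
    for (start, end), score in zip(offsets, scores):
        if score >= threshold:
            if run_start is None:
                run_start = start
            run_end = end
        elif run_start is not None:
            spans.append(Span(doc_id, run_start, run_end, text[run_start:run_end]))
            run_start = None
    if run_start is not None:  # close a run that reaches the end of the text
        spans.append(Span(doc_id, run_start, run_end, text[run_start:run_end]))
    return spans


def compose_answer(spans):
    """Build a response purely from extracted spans, each tagged with a citation index."""
    return " ".join(f"{s.text} [{i}]" for i, s in enumerate(spans, 1))


# Hypothetical document and hand-written scores (assumption: one score per token).
text = "Aspirin reduces fever. It was first sold in 1899."
offsets = whitespace_offsets(text)
scores = [0.9, 0.8, 0.95, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1]

spans = extract_spans("doc-1", text, offsets, scores)
answer = compose_answer(spans)
print(answer)
```

Because each span stores character offsets into the original text, any claim in the answer can be checked by slicing the source at `start:end`, which is what makes the citation verifiable without a generative model in the loop.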