Show HN: I built a small audit layer for LLM-as-judge decisions
Category: library
Tags: llm-judge, audit, evaluation-framework
Score: 7.5/10 (Innovation: 7, Technical: 8, Documentation: 8, Utility: 7)
Claim Memory Graph (CMG) is an audit layer for LLM-as-judge evaluations. It requires the judge to back each verdict with explicit claims tied to evidence, then flags failures such as uncited verdicts or rubric gaps without calling a second model. The approach is lightweight and framework-agnostic, and its flags point a human reviewer at the verdicts that need inspection, which addresses a trust gap in automated evaluation.
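To make the idea concrete, here is a minimal sketch of what a rule-based audit of this kind could look like. It is not CMG's actual API: the names `Claim`, `Verdict`, and `audit`, the flag strings, and the data layout are all hypothetical, chosen only to illustrate checking a judge's verdict for missing citations and uncovered rubric criteria without a second model.

```python
from dataclasses import dataclass, field


# All names below are hypothetical illustrations, not taken from the CMG SDK.
@dataclass
class Claim:
    text: str
    criterion: str  # rubric criterion this claim addresses
    evidence_ids: list[str] = field(default_factory=list)


@dataclass
class Verdict:
    label: str  # e.g. "pass" or "fail"
    claims: list[Claim] = field(default_factory=list)


def audit(verdict: Verdict, rubric: list[str], evidence: dict[str, str]) -> list[str]:
    """Return human-review flags using deterministic checks only."""
    flags = []

    # A verdict with no supporting claims at all is uncited.
    if not verdict.claims:
        flags.append("uncited_verdict: no claims support the verdict")

    for claim in verdict.claims:
        # Each claim must point at one or more pieces of evidence.
        if not claim.evidence_ids:
            flags.append(f"uncited_claim: {claim.text!r}")
        # Every cited evidence ID must exist in the evidence store.
        for eid in claim.evidence_ids:
            if eid not in evidence:
                flags.append(f"unknown_evidence: {eid}")

    # Rubric criteria that no claim addresses are reported as gaps.
    covered = {claim.criterion for claim in verdict.claims}
    for criterion in rubric:
        if criterion not in covered:
            flags.append(f"rubric_gap: {criterion}")

    return flags
```

Under these assumptions, a verdict whose claims cover "accuracy" (with evidence) and "tone" (without evidence), audited against a rubric that also lists "completeness", would produce one uncited-claim flag and one rubric-gap flag. Because every check is a plain lookup, the same input always yields the same flags.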
Target audience: data scientists, ML researchers, AI engineers
Repository: https://github.com/MatteoLeonesi/claim-memory-graph-sdk · Python · License: NOASSERTION · 3 stars