Show HN: I built a small audit layer for LLM-as-judge decisions

Category: library

Tags: llm-judge, audit, evaluation-framework

Score: 7.5/10 (Innovation: 7, Technical: 8, Documentation: 8, Utility: 7)

Claim Memory Graph (CMG) is an audit layer for LLM-as-judge evaluations. It requires the judge to back each verdict with explicit claims tied to evidence, then flags failures such as uncited verdicts or rubric gaps without calling a second model. It targets a trust gap in AI evaluation with a lightweight, framework-agnostic approach, and its flags are ones a human reviewer can act on.
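To make the idea concrete, here is a minimal sketch of a rule-based audit over a judge's verdict. It is not the CMG SDK's API: the names `Claim`, `Verdict`, `audit`, and the flag strings are all hypothetical, chosen only to show how uncited verdicts and rubric gaps could be caught with plain checks and no second model.

```python
from dataclasses import dataclass, field


# Hypothetical data model, not taken from the CMG SDK.
@dataclass
class Claim:
    text: str
    criterion: str  # rubric criterion this claim addresses
    evidence_ids: list[str] = field(default_factory=list)


@dataclass
class Verdict:
    label: str
    claims: list[Claim] = field(default_factory=list)


def audit(verdict, known_evidence, rubric):
    """Return human-review flags using deterministic checks only."""
    flags = []

    # A verdict with no claims at all is uncited.
    if not verdict.claims:
        flags.append("uncited_verdict: no claims support the verdict")

    for i, claim in enumerate(verdict.claims):
        # Each claim must cite at least one piece of evidence.
        if not claim.evidence_ids:
            flags.append(f"unsupported_claim[{i}]: no evidence cited")
        # Every cited id must refer to evidence that actually exists.
        for eid in claim.evidence_ids:
            if eid not in known_evidence:
                flags.append(f"dangling_evidence[{i}]: unknown id {eid!r}")

    # Rubric criteria that no claim addresses are gaps.
    covered = {claim.criterion for claim in verdict.claims}
    for criterion in rubric:
        if criterion not in covered:
            flags.append(f"rubric_gap: {criterion!r} not addressed")

    return flags


# Example: one claim lacks evidence and one rubric criterion is never addressed.
verdict = Verdict(
    label="pass",
    claims=[
        Claim("Answer cites the correct API.", "accuracy", ["e1"]),
        Claim("Tone is appropriate.", "tone"),
    ],
)
flags = audit(
    verdict,
    known_evidence={"e1", "e2"},
    rubric=["accuracy", "tone", "completeness"],
)
for flag in flags:
    print(flag)
```

Because every check is a set lookup or an emptiness test, the audit is deterministic and cheap to run on each judged item. The flags only tell a reviewer where to look; they do not decide whether the verdict itself is correct.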

Target audience: data scientists, ML researchers, AI engineers

Repository: https://github.com/MatteoLeonesi/claim-memory-graph-sdk · Python · License: NOASSERTION · 3 stars
