Show HN: Przm, a multi-agent AI reliability leaderboard with signed receipts

Category: ai-ml

Tags: ai-benchmark, multi-agent, verifiable-evaluation

Score: 7.8/10 (Innovation: 8, Technical: 8, Documentation: 8, Utility: 7)

Przm is a vendor-neutral benchmark suite and leaderboard for multi-agent AI systems and AI memory recall. Results are published as Ed25519-signed receipts so that they can be verified independently. The evaluation framework is deterministic, signed, and reproducible, and it reveals reliability differences between multi-agent frameworks such as AutoGen and hand-rolled baselines. Notable features are its cryptographic receipt verification and its standardized adapter contract; the project could influence how multi-agent AI reliability is measured and compared.
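
To make the idea of a signed receipt concrete, here is a minimal sketch of signing and verifying a benchmark result with Ed25519 using Node's built-in `node:crypto` module. The `Receipt` fields, the sample values, the sorted-key serialization, and the helper names (`canonicalize`, `signReceipt`, `verifyReceipt`) are assumptions made for illustration; they are not taken from Przm's actual schema or API.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical receipt shape; Przm's real receipt format may differ.
interface Receipt {
  suite: string;
  adapter: string;
  score: number;
  seed: number;
}

interface SignedReceipt {
  receipt: Receipt;
  signature: string; // base64-encoded Ed25519 signature
}

// Serialize with sorted keys so the same receipt always yields the same bytes.
function canonicalize(receipt: Receipt): Buffer {
  const entries = Object.keys(receipt)
    .sort()
    .map((k) => [k, receipt[k as keyof Receipt]]);
  return Buffer.from(JSON.stringify(entries), "utf8");
}

function signReceipt(receipt: Receipt, privateKey: KeyObject): SignedReceipt {
  // Ed25519 has no separate digest step, so the algorithm argument is null.
  const signature = sign(null, canonicalize(receipt), privateKey).toString("base64");
  return { receipt, signature };
}

function verifyReceipt(signed: SignedReceipt, publicKey: KeyObject): boolean {
  const sig = Buffer.from(signed.signature, "base64");
  return verify(null, canonicalize(signed.receipt), publicKey, sig);
}

// Illustrative values only, not real benchmark results.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const signed = signReceipt(
  { suite: "memory-recall", adapter: "autogen", score: 0.82, seed: 42 },
  privateKey,
);
console.log(verifyReceipt(signed, publicKey)); // true

// Changing any field after signing invalidates the signature.
const tampered: SignedReceipt = {
  ...signed,
  receipt: { ...signed.receipt, score: 0.99 },
};
console.log(verifyReceipt(tampered, publicKey)); // false
```

Under these assumptions, anyone holding the publisher's public key can check that a leaderboard entry was not altered after it was signed. The canonical serialization step matters because a signature covers exact bytes, so key order must not vary between signer and verifier.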

Target audience: AI researchers and backend developers building multi-agent systems

Repository: https://przm.sh · TypeScript · Apache-2.0
