Show HN: Reliably Incorrect – explore LLM reliability with data visualizations
Category: ai-ml
Tags: llm-reliability, agent-tuning, probabilistic-programming
Score: 6.0/10 (Innovation: 7, Technical: 5, Documentation: 8, Utility: 4)
This project explores the reliability of LLM-based coding agents, specifically Claude Code, by treating their instruction directories (.claude/) as probabilistic programs. The framing is notable because it casts agent tuning as a computable probability problem, and it introduces concepts such as context drift and self-reflection loops with the aim of making non-deterministic systems more predictable.
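To make the "computable probability" framing concrete, here is a minimal sketch. It is not taken from the project: the rule names, the compliance numbers, and the independence assumption are all hypothetical, and the retry calculation is only a crude stand-in for a self-reflection loop. It shows how per-instruction compliance rates could combine into a per-run reliability figure.

```python
from math import prod


def all_followed_probability(compliance):
    """Probability that every instruction is followed in a single run.

    Assumes (hypothetically) that instructions are followed independently,
    so the joint probability is the product of the individual ones.
    """
    for name, p in compliance.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name}: probability {p} is outside [0, 1]")
    return prod(compliance.values())


def at_least_one_success(p_single, attempts):
    """Probability that at least one of `attempts` independent tries succeeds."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return 1.0 - (1.0 - p_single) ** attempts


# Hypothetical compliance rates for three made-up instructions.
rules = {
    "run_tests": 0.95,
    "follow_style_guide": 0.90,
    "no_force_push": 0.99,
}

p_run = all_followed_probability(rules)
p_with_retries = at_least_one_success(p_run, attempts=3)

print(f"single run, all rules followed: {p_run:.4f}")
print(f"at least one clean run in 3 tries: {p_with_retries:.4f}")
```

Under these assumptions, every added instruction multiplies in another factor below 1, so longer instruction sets lower the chance of a fully compliant run unless something like a retry or review step compensates. Real agents are unlikely to satisfy the independence assumption, so figures like these would be rough estimates at best.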
Target audience: ai-engineers, ml-researchers, backend-devs
Repository: https://adamsohn.com/reliably-incorrect/ · Python · 16 stars