Show HN: Zenith: sota harness for normal models to beat Fable on FrontierSWE

Category: ai-ml

Tags: agent-harness, multi-agent, long-running-tasks, ai-workflow, benchmarking

Score: 6.3/10 (Innovation: 6, Technical: 7, Documentation: 6, Utility: 6)

Zenith is a continuous-improvement harness for long-running AI agent tasks. It aims to prevent premature completion by orchestrating multi-agent workflows with adaptive planning and verification. The project reports strong results on the FrontierSWE benchmark, outperforming larger models by optimizing the harness rather than the model itself, an approach that is particularly relevant when advanced models are gated or inaccessible.
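The idea of blocking premature completion can be illustrated with a minimal sketch. This is not Zenith's code or API: `run_until_verified`, `toy_worker`, and `toy_verifier` are hypothetical names, and the loop only assumes that a separate verifier must accept a result before the task counts as done, with rejection notes fed back into the next attempt.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration only, not taken from Zenith.
Worker = Callable[[str, List[str]], str]      # (task, feedback so far) -> result
Verifier = Callable[[str], Tuple[bool, str]]  # result -> (accepted, note)


def run_until_verified(
    worker: Worker,
    verifier: Verifier,
    task: str,
    max_rounds: int = 5,
) -> Tuple[str, bool, int]:
    """Re-run the worker until the verifier accepts or the round budget is spent."""
    feedback: List[str] = []
    result = ""
    for round_no in range(1, max_rounds + 1):
        result = worker(task, feedback)
        accepted, note = verifier(result)
        if accepted:
            return result, True, round_no
        # The worker's own claim of completion is ignored; the rejection
        # note becomes input for the next round of planning.
        feedback.append(note)
    return result, False, max_rounds


# Toy stand-ins for real agents: the worker "improves" with each piece
# of feedback, and the verifier only accepts the third attempt.
def toy_worker(task: str, feedback: List[str]) -> str:
    return f"{task}: attempt {len(feedback) + 1}"


def toy_verifier(result: str) -> Tuple[bool, str]:
    if result.endswith("attempt 3"):
        return True, "ok"
    return False, f"rejected: {result}"


if __name__ == "__main__":
    print(run_until_verified(toy_worker, toy_verifier, "fix bug"))
```

In a real harness the worker and verifier would be separate model calls or test runs; the point of the sketch is only that completion is decided by the verifier, not by the agent that produced the result.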

Target audience: AI researchers, ML engineers, backend developers

Repository: https://ii.inc/blog/post/zenith · Python · 32 stars
