Show HN: We're inviting Anthropic to put the real Mythos 5 on our open benchmark
Category: security
Tags: security, benchmark, scanner-evaluation, vulnerability-detection, open-source
Score: 7.0/10 (Innovation: 6, Technical: 7, Documentation: 8, Utility: 7)
RealVuln Benchmark is an open, extensible benchmark for evaluating security scanners against ground-truth vulnerabilities in real-world Python code. It aims to fill the gap left by the lack of credible scanner benchmarks, providing a framework with automated scoring, false positive traps, and support for multiple scanners, including LLM-based ones. The design favors real-world code over synthetic test cases and is built with community contribution in mind.
Target audience: security engineers, devops, backend devs
Repository: https://realvuln.com · HTML · MIT · 14 stars
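To make the automated scoring and false positive traps mentioned in the description concrete, here is a minimal sketch of how scanner findings could be scored against ground truth. It is not taken from the RealVuln repository: the `GroundTruth` and `Finding` types, the `score` function, the `line_tolerance` parameter, and the matching rule (same file, same CWE, nearby line) are all assumptions made for illustration.

```python
from dataclasses import dataclass


# Hypothetical record types; the real benchmark's schema may differ.
@dataclass(frozen=True)
class GroundTruth:
    file: str
    line: int
    cwe: str
    is_trap: bool = False  # True: code that looks vulnerable but is safe


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    cwe: str


def _matches(entry, finding, line_tolerance):
    """Assumed matching rule: same file, same CWE, line within tolerance."""
    return (
        entry.file == finding.file
        and entry.cwe == finding.cwe
        and abs(entry.line - finding.line) <= line_tolerance
    )


def score(truth, findings, line_tolerance=2):
    """Score findings against ground truth and return summary metrics."""
    real = [t for t in truth if not t.is_trap]
    traps = [t for t in truth if t.is_trap]

    matched = set()  # indices of real vulnerabilities already credited
    false_positives = 0
    traps_triggered = 0

    for f in findings:
        hit = next(
            (i for i, t in enumerate(real)
             if i not in matched and _matches(t, f, line_tolerance)),
            None,
        )
        if hit is not None:
            matched.add(hit)
            continue
        # Anything that is not a fresh true positive counts as a false
        # positive, including duplicate reports of the same vulnerability.
        false_positives += 1
        if any(_matches(t, f, line_tolerance) for t in traps):
            traps_triggered += 1

    tp = len(matched)
    fn = len(real) - tp
    precision = tp / (tp + false_positives) if tp + false_positives else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "tp": tp,
        "fp": false_positives,
        "fn": fn,
        "traps_triggered": traps_triggered,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative data only; file names and CWE labels are made up.
example_truth = [
    GroundTruth("app.py", 10, "CWE-89"),
    GroundTruth("app.py", 40, "CWE-78"),
    GroundTruth("util.py", 5, "CWE-89", is_trap=True),
]
example_findings = [
    Finding("app.py", 11, "CWE-89"),   # near a real vulnerability
    Finding("util.py", 5, "CWE-89"),   # lands on a trap
]
example_result = score(example_truth, example_findings)
```

In this sketch a scanner that flags a trap is penalized twice over: the hit lowers precision as an ordinary false positive and is also reported separately under `traps_triggered`, which is one plausible way to distinguish pattern-matching noise from genuine detection.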