Show HN: Jailbreak this model to get 3B tokens
Category: security
Tags: llm-safety, guardrails, jailbreak-challenge
Score: 5.0/10 (Innovation: 5, Technical: 5, Documentation: 3, Utility: 5)
Opir is an open-source, encoder-based guardrail layer for LLM safety, released alongside a jailbreak challenge that tests its robustness. The project serves as a practical benchmark for red-teaming AI safety systems and offers incentives for bypassing its 430M-parameter model.
Target audience: security researchers, AI safety engineers, red teams
Repository: https://opir.ai/challenge