Show HN: Jailbreak this model to get 3B tokens

Category: security

Tags: llm-safety, guardrails, jailbreak-challenge

Score: 5.0/10 (Innovation: 5, Technical: 5, Documentation: 3, Utility: 5)

Opir is an open-source, encoder-based guardrail layer for LLM safety, released alongside a jailbreak challenge that tests its robustness. The challenge gives red teams a practical benchmark for AI safety systems and offers 3B tokens as an incentive for bypassing its 430M-parameter model.

Target audience: security researchers, AI safety engineers, red teams

Repository: https://opir.ai/challenge
