JailbreakDB
JailbreakDB

An indexed catalog of working LLM jailbreak techniques.

An indexed, searchable catalog of LLM jailbreak techniques. Each entry: the prompt pattern, the models it works on, when it stopped working, the behavior it exploits — sourced from primary disclosure where possible, with honest attribution.

Disclosure timeline and policy comparison
Anomaly

Responsible Disclosure Norms for LLM Jailbreaks: What's Emerged and What's Still Disputed

Software vulnerability disclosure has 30 years of evolved norms. LLM jailbreak disclosure is 4 years old and still contested. The current state of practice, and where the field is heading.

Open trace

Anomaly

technique

Encoding and Obfuscation Jailbreaks: The Gap Between What Filters See and What Models Process

Content filters typically operate on decoded, normalized text. LLMs process tokens, not text. The gap between these two layers is an attack surface that remains incompletely addressed.

Compare
research

The Jailbreak Detection Evasion Arms Race: How Attackers Adapt to Defenses

Safety classifiers get deployed; attackers find variants that evade them. This cycle is predictable. Understanding the mechanics of classifier evasion tells defenders what to invest in.

Compare
technique

Roleplay and Persona Jailbreaks: Why They Work and Why They Don't Anymore (Mostly)

DAN, AIM, STAN, and dozens of variants. Persona-based jailbreaks were the dominant technique from 2022-2023. Understanding why they worked — and why current defenses handle them better — is instructive for the next attack class.

Compare

Trace

Subscribe

JailbreakDB — in your inbox

An indexed catalog of working LLM jailbreak techniques. — delivered when there's something worth your inbox.

No spam. Unsubscribe anytime.