Tag
#jailbreak
3 posts tagged jailbreak.
- technique
Encoding and Obfuscation Jailbreaks: The Gap Between What Filters See and What Models Process
Content filters typically operate on decoded, normalized text. LLMs process tokens, not text. The gap between these two layers is an attack surface that remains incompletely addressed.
- research
LLM Jailbreak Taxonomy 2026: How the Techniques Cluster
Six years of jailbreak research has produced a messy literature. This taxonomy organizes working techniques by the behavioral property they exploit — useful for both researchers and defenders.
- technique
Many-Shot Jailbreaking: Why Long Context Windows Created a New Attack Surface
The same architectural decision that makes LLMs better at long-context tasks, extended context windows, also enabled a new class of jailbreak. This post covers how the technique works and what defenses exist.