We use cookies to improve your experience on our site.
AVID-2026-R0099
Description
OpenAI GPT-4o Guardrail Jailbreak via “Layered Summarization” Tactic
Details
A guardrail jailbreak vulnerability has been discovered affecting OpenAI GPT-4o. The specific flaw manifests when the attacker requests multiple layers of summarization with selective focus instructions, combined with demands for direct quotes, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
References
Affected or Relevant Artifacts
| Type | Name |
|---|---|
| System | OpenAI GPT-4o |
Impact
0DIN
- Social Impact Score: Level 3: Significant Risk
- Jailbreak Taxonomy:
| Category | Strategy | Technique |
|---|---|---|
| Fictionalizing | Re-storying | Goal Hijacking |
| Rhetoric | Persuasion and Manipulation | Latent Space Distraction |
Other information
- Report Type: Issue
- Credits: Mike Takahashi (@TakSec)
- Date Reported: 2026-02-19
- Version: 0.3.1
- AVID Entry