AVID-2026-R0099

Description

OpenAI GPT-4o Guardrail Jailbreak via “Layered Summarization” Tactic

Details

A guardrail jailbreak vulnerability has been discovered affecting OpenAI GPT-4o. The specific flaw manifests when the attacker requests multiple layers of summarization with selective focus instructions, combined with demands for direct quotes, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.

References

0din.ai Disclosure

Affected or Relevant Artifacts

Type	Name
System	OpenAI GPT-4o

Impact

0DIN

Social Impact Score: Level 3: Significant Risk
Jailbreak Taxonomy:

Category	Strategy	Technique
Fictionalizing	Re-storying	Goal Hijacking
Rhetoric	Persuasion and Manipulation	Latent Space Distraction

Other information

Report Type: Issue
Credits: Mike Takahashi (@TakSec)
Date Reported: 2026-02-19
Version: 0.3.3
AVID Entry