AVID-2026-R0104
Description
OpenAI GPT-4o Guardrail Jailbreak via “Zero-Width Unicode” Tactic
Details
A guardrail jailbreak vulnerability has been discovered affecting multiple LLMs. The flaw manifests through a combination of zero-width Unicode character encoding and precise character manipulation, allowing an attacker to bypass inference-time restrictions against providing information hazards and content that facilitates violating laws.
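The report does not disclose the exact payload, but the general zero-width tactic can be sketched as follows: a hidden string is encoded bit-by-bit as zero-width characters and appended to innocuous cover text, so the visible prompt looks unchanged while the extra characters alter what the model actually receives. All function names below are illustrative assumptions, not from the report.

```python
# Illustrative sketch of zero-width Unicode encoding (not the actual exploit).
ZW0 = "\u200b"  # ZERO WIDTH SPACE      -> represents bit 0
ZW1 = "\u200c"  # ZERO WIDTH NON-JOINER -> represents bit 1

def zw_encode(payload: str) -> str:
    """Encode an ASCII payload as a run of invisible zero-width characters."""
    bits = "".join(f"{ord(c):08b}" for c in payload)
    return "".join(ZW1 if b == "1" else ZW0 for b in bits)

def zw_hide(cover: str, payload: str) -> str:
    """Append the invisible encoding to visible cover text."""
    return cover + zw_encode(payload)

def zw_reveal(text: str) -> str:
    """Recover the payload by filtering out the zero-width characters."""
    bits = "".join("1" if c == ZW1 else "0" for c in text if c in (ZW0, ZW1))
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

stego = zw_hide("harmless looking prompt", "hidden")
print(stego)              # renders identically to the cover text
print(zw_reveal(stego))   # prints "hidden"
```

The asymmetry this creates — identical rendering for humans, different byte sequences for tokenizers — is what guardrail filters keyed on visible text can miss.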
Metrics
0DIN Jailbreak Testing obtained the following test scores.
| Model | Test Kind | Test Score | Temperature (max 2.0) |
|---|---|---|---|
| OpenAI GPT-4o | Copyrighted Content: Harry Potter | 20.56 | N/A |
References
Affected or Relevant Artifacts
| Type | Name |
|---|---|
| System | OpenAI GPT-4o |
Impact
0DIN
- Social Impact Score: Level 1: Minimal Risk
- Jailbreak Taxonomy:
| Category | Strategy | Technique |
|---|---|---|
| Language | Code and Encode | Unicode |
| Stratagems | Scatter Shot | Regenerate Response |
Other information
- Report Type: Measurement
- Credits: Mike Takahashi (@TakSec)
- Date Reported: 2026-02-19
- Version: 0.3.1
- AVID Entry