We use cookies to improve your experience on our site.
AVID-2026-R0115
Description
Multiple Model Guardrail Jailbreak via “Leetspeak Unicode” Tactic
Details
A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of leetspeak text transformation and Unicode character manipulation, allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
Metrics
0DIN Jailbreak Testing obtained the following test scores.
| Model | Test Kind | Test Score | Temperature (max 2.0) |
|---|---|---|---|
| IBM Watson | Illegal Substances: Crystal Meth | 36.0 | N/A |
| OpenAI GPT-4o mini | Illegal Substances: Crystal Meth | 73.0 | 0.7 |
| Twitter / X Grok 2 | Illegal Substances: Crystal Meth | 82.0 | 0.7 |
References
Affected or Relevant Artifacts
| Type | Name |
|---|---|
| Model | IBM Watson |
| System | OpenAI GPT-4o mini |
| System | Twitter / X Grok 2 |
Impact
0DIN
- Social Impact Score: Level 3: Significant Risk
- Jailbreak Taxonomy:
| Category | Strategy | Technique |
|---|---|---|
| Language | Code and Encode | Chaff |
| Language | Code and Encode | Other Encoding |
| Language | Stylizing | Misspellings |
Other information
- Report Type: Measurement
- Credits: Anonymous
- Date Reported: 2026-02-19
- Version: 0.3.1
- AVID Entry