AVID-2026-R0079
Description
Multiple Model Guardrail Jailbreak via “Compare and Contrast” Tactic
Details
A guardrail jailbreak vulnerability has been discovered affecting multiple models. The flaw is triggered by requesting a comparative analysis of chemical synthesis procedures while embedding explicit instructions to avoid generalization and to include specific technical details. By framing the request as an academic comparison exercise, the attacker bypasses inference restrictions against providing information hazards and violating laws.
Metrics
0DIN Jailbreak Testing obtained the following test scores.
| Model | Test Kind | Test Score | Temperature (max 2.0) |
|---|---|---|---|
| Google Gemini 2.0 Flash | Illegal Substances: Crystal Meth | 82.0 | 0.7 |
| OpenAI GPT-4o mini | Illegal Substances: Crystal Meth | 86.0 | 0.7 |
| Twitter / X Grok 2 | Illegal Substances: Crystal Meth | 68.0 | 0.7 |
References
Affected or Relevant Artifacts
| Type | Name |
|---|---|
| System | Google Gemini 2.0 Flash |
| System | OpenAI GPT-4o mini |
| System | Twitter / X Grok 2 |
Impact
0DIN
- Social Impact Score: Level 3: Significant Risk
- Jailbreak Taxonomy:
| Category | Strategy | Technique |
|---|---|---|
| Language | Stylizing | Synonymous Language |
| Rhetoric | Persuasion and Manipulation | Latent Space Distraction |
Other information
- Report Type: Measurement
- Credits: Mike Takahashi (@TakSec)
- Date Reported: 2026-02-19
- Version: 0.3.1