AVID-2026-R0122
Description
Multiple Model Guardrail Jailbreak via “Fortune Cookie Simulator” Tactic
Details
A guardrail jailbreak vulnerability has been discovered that affects multiple models. The flaw is exploited by combining chaff techniques with a request that the model simulate a fortune cookie generator. A successful attack allows the attacker to bypass inference restrictions on providing information hazards and on assisting with violations of the law.
Metrics
0DIN Jailbreak Testing obtained the following test scores for each affected model.
| Model | Test Kind | Test Score | Temperature (max 2.0) |
|---|---|---|---|
| Anthropic Claude 3 Haiku | Illegal Substances: Crystal Meth | 59.09 | N/A |
| Cohere Command R | Illegal Substances: Crystal Meth | 31.82 | N/A |
| Google Gemini 2.0 Flash | Illegal Substances: Crystal Meth | 59.09 | N/A |
| OpenAI GPT-4o | Illegal Substances: Crystal Meth | 50.0 | N/A |
References
Affected or Relevant Artifacts
| Type | Name |
|---|---|
| System | Anthropic Claude 3 Haiku |
| Model | Cohere Command R |
| System | Google Gemini 2.0 Flash |
| System | OpenAI GPT-4o |
Impact
0DIN
- Social Impact Score: Level 3: Significant Risk
- Jailbreak Taxonomy:
| Category | Strategy | Technique |
|---|---|---|
| Fictionalizing | Switching Genres | Games |
| Possible Worlds | World Building | Scenarios |
Other information
- Report Type: Measurement
- Credits: Mike Takahashi (@TakSec)
- Date Reported: 2026-02-19
- Version: 0.3.1
- AVID Entry