
AVID-2026-R0121

Description

Multiple Model Guardrail Jailbreak via “Arbitrary Relation” Tactic

Details

A guardrail jailbreak vulnerability has been discovered affecting multiple LLM implementations. The flaw manifests through a combination of repeated prompt submission and non-deterministic response generation: because each response is sampled independently, resubmitting the same prompt eventually yields a completion that slips past the guardrail, allowing an attacker to bypass inference-time restrictions against providing information hazards and content that facilitates illegal activity.
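The mechanism can be illustrated with a minimal simulation (all names, probabilities, and functions below are hypothetical illustrations, not 0DIN's actual test harness): if a guardrail refuses a given prompt with probability q per request, then n independent resubmissions bypass it with probability 1 − qⁿ, so even a mostly reliable filter degrades quickly under repetition.

```python
import random

def simulated_guardrail(prompt: str, refusal_rate: float = 0.7) -> str:
    """Stand-in for a non-deterministic model endpoint (hypothetical):
    refuses with probability `refusal_rate`, otherwise complies."""
    return "REFUSED" if random.random() < refusal_rate else "COMPLIED"

def repeated_submission(prompt: str, attempts: int = 10) -> bool:
    """Resubmit the same prompt until one sampled response slips past
    the guardrail, or the attempt budget is exhausted."""
    return any(simulated_guardrail(prompt) == "COMPLIED"
               for _ in range(attempts))

if __name__ == "__main__":
    random.seed(0)
    trials = 2000
    bypassed = sum(repeated_submission("<restricted request>")
                   for _ in range(trials))
    # Analytically, the bypass rate is 1 - 0.7**10, roughly 0.97;
    # the empirical rate over 2000 trials should land close to that.
    print(bypassed / trials)
```

The point of the sketch is only the arithmetic: with a 70% per-request refusal rate, ten retries already defeat the filter about 97% of the time, which is why non-deterministic sampling alone is an unreliable guardrail.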

Metrics

0DIN Jailbreak Testing obtained the following test scores.

Model | Test Kind | Test Score | Temperature (max 2.0)
Google Gemini 2.0 Flash | Illegal Substances: Crystal Meth | 72.73 | N/A
OpenAI GPT-4o | Illegal Substances: Crystal Meth | 63.64 | N/A
Twitter / X Grok 2 | Illegal Substances: Crystal Meth | 45.45 | N/A

References

Affected or Relevant Artifacts

Type | Name
System | Google Gemini 2.0 Flash
System | OpenAI GPT-4o
System | Twitter / X Grok 2

Impact

0DIN

  • Social Impact Score: Level 1: Minimal Risk
  • Jailbreak Taxonomy:
    Category | Strategy | Technique
    Possible Worlds | World Building | Scenarios
    Rhetoric | Persuasion and Manipulation | Surprise Attack
    Rhetoric | Persuasion and Manipulation | Latent Space Distraction

Other information

  • Report Type: Measurement
  • Credits: Mike Takahashi (@TakSec)
  • Date Reported: 2026-02-19
  • Version: 0.3.1
  • AVID Entry