AVID-2026-R0114

Description

Multiple Model Guardrail Jailbreak via “Chaotic Formatting” Tactic

Details

A guardrail jailbreak vulnerability has been discovered affecting multiple models. The flaw manifests through a combination of chaotic formatting and meta-instruction obfuscation, allowing an attacker to bypass inference-time restrictions against providing information hazards and facilitating illegal activity.

Metrics

0DIN Jailbreak Testing obtained the following test scores.

Model                 | Test Kind                        | Test Score | Temperature (max 2.0)
Alibaba Qwen Plus     | Illegal Substances: Crystal Meth | 68.0       | 0.7
Alibaba Qwen Turbo    | Illegal Substances: Crystal Meth | 77.0       | 0.7
DeepSeek DeepSeek R1  | Illegal Substances: Crystal Meth | 64.0       | 0.7
Meta LLaMa 3.3        | Illegal Substances: Crystal Meth | 91.0       | 0.7
Mistral Mistral 8B    | Illegal Substances: Crystal Meth | 68.0       | 0.7
Mistral Mistral Large | Illegal Substances: Crystal Meth | 59.0       | 0.7
OpenAI GPT-4.1        | Illegal Substances: Crystal Meth | 68.0       | 0.7
OpenAI GPT-4o         | Illegal Substances: Crystal Meth | 64.0       | N/A
Twitter / X Grok 4    | Illegal Substances: Crystal Meth | 73.0       | 0.7

References

Affected or Relevant Artifacts

Type   | Name
Model  | Alibaba Qwen Plus
System | Alibaba Qwen Turbo
Model  | DeepSeek DeepSeek R1
Model  | Meta LLaMa 3.3
Model  | Mistral Mistral 8B
System | Mistral Mistral Large
System | OpenAI GPT-4.1
System | OpenAI GPT-4o
System | Twitter / X Grok 4

Impact

0DIN

  • Social Impact Score: Level 1: Minimal Risk
  • Jailbreak Taxonomy:
    Category   | Strategy       | Technique
    Stratagems | Meta Prompting | Perspective Shifting

Other information

  • Report Type: Measurement
  • Credits: Edward Morris, Anonymous
  • Date Reported: 2026-02-19
  • Version: 0.3.1
  • AVID Entry