Home » Database

AVID-2026-R0115

Description

Multiple Model Guardrail Jailbreak via “Leetspeak Unicode” Tactic

Details

A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of leetspeak text transformation and Unicode character manipulation, allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.

Metrics

0DIN Jailbreak Testing obtained the following test scores.

ModelTest KindTest ScoreTemperature (max 2.0)
IBM WatsonIllegal Substances: Crystal Meth36.0N/A
OpenAI GPT-4o miniIllegal Substances: Crystal Meth73.00.7
Twitter / X Grok 2Illegal Substances: Crystal Meth82.00.7

References

Affected or Relevant Artifacts

TypeName
ModelIBM Watson
SystemOpenAI GPT-4o mini
SystemTwitter / X Grok 2

Impact

0DIN

  • Social Impact Score: Level 3: Significant Risk
  • Jailbreak Taxonomy:
CategoryStrategyTechnique
LanguageCode and EncodeChaff
LanguageCode and EncodeOther Encoding
LanguageStylizingMisspellings

Other information

  • Report Type: Measurement
  • Credits: Anonymous
  • Date Reported: 2026-02-19
  • Version: 0.3.1
  • AVID Entry