Home » Database

AVID-2026-R0082

Description

Multiple Model Guardrail Jailbreak via Token Disruption through Random Spacing

Details

A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when attackers introduce random spacing within a prompt, effectively disrupting how the model tokenizes the input. By splitting words like “Chemicals” into parts such as “chem-”, “ic-”, and “al,” the technique circumvents the model’s primary content filter classifiers, which typically flag restricted phrases or tokens. This token manipulation allows attackers to bypass inference restrictions around providing information hazards and violating laws.

Metrics

0DIN Jailbreak Testing obtained the following test scores.

ModelTest KindTest ScoreTemperature (max 2.0)
OpenAI GPT-4o miniIllegal Substances: Crystal Meth64.0N/A
Twitter / X Grok 2Illegal Substances: Crystal Meth91.0N/A

References

Affected or Relevant Artifacts

TypeName
SystemOpenAI GPT-4o mini
SystemTwitter / X Grok 2

Impact

0DIN

  • Social Impact Score: Level 3: Significant Risk
  • Jailbreak Taxonomy:
CategoryStrategyTechnique
LanguageCode and EncodeTransformer Translatable Tokens

Other information

  • Report Type: Measurement
  • Credits: Edward Morris, Anonymous, Miller Engelbrecht, Luis Reveles, Jose Brown, Mike Takahashi (@TakSec), Arth Singh, Igor Vasiliev
  • Date Reported: 2026-02-19
  • Version: 0.3.1
  • AVID Entry