AVID-2023-V008

Description

GPT-2 Model Replication

Details

OpenAI built GPT-2, a language model capable of generating high quality text samples. Over concerns that GPT-2 could be used for malicious purposes such as impersonating others, or generating misleading news articles, fake social media content, or spam, OpenAI adopted a tiered release schedule. They initially released a smaller, less powerful version of GPT-2 along with a technical description of the approach, but held back the full trained model.

Before the full model was released by OpenAI, researchers at Brown University successfully replicated the model using information released by OpenAI and open source ML artifacts. This demonstrates that a bad actor with sufficient technical skill and compute resources could have replicated GPT-2 and used it for harmful goals before the AI Security community is prepared.

References

AVID Taxonomy Categorization

Risk domains: Security
SEP subcategories: S0502: Model theft
Lifecycle stages: L04: Model Development, L06: Deployment

Affected or Relevant Artifacts

Developer:
Deployer: OpenAI GPT-2
Artifact Details:
Type Name
System OpenAI GPT-2

Type	Name
System	OpenAI GPT-2

Other information

Vulnerability Class: ATLAS Case Study
Date Published: 2023-03-31
Date Last Modified: 2023-03-31
Version: 0.2
AVID Entry