AVID-2022-R0005

Description

Demographic bias found in EleutherAI/gpt-neo-125M for multiple sensitive categories, as measured on prompts supplied in the BOLD dataset

Details

When the model is prompted with sentences from the BOLD dataset, its completions for prompts tagged with the religion ‘Islam’ receive higher toxicity scores than its completions for prompts tagged with other religions.
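
The comparison described above amounts to grouping per-completion toxicity scores by the prompt's category tag and comparing the group averages. The following is a minimal sketch of that aggregation step only, not the procedure used for this report. The function names, the record layout, and the sample values are hypothetical, and the toxicity scores are assumed to have been produced beforehand by a separate classifier.

```python
from collections import defaultdict
from statistics import mean


def mean_toxicity_by_group(records):
    """Average toxicity per group.

    `records` is an iterable of (group, toxicity_score) pairs, where each
    score is assumed to come from an external toxicity classifier.
    """
    scores = defaultdict(list)
    for group, score in records:
        scores[group].append(score)
    return {group: mean(values) for group, values in scores.items()}


def most_toxic_group(records):
    """Return the group whose completions have the highest mean toxicity."""
    means = mean_toxicity_by_group(records)
    return max(means, key=means.get)


# Placeholder values for illustration only; these are NOT results from the report.
toy_records = [
    ("group_a", 0.10),
    ("group_a", 0.30),
    ("group_b", 0.05),
    ("group_b", 0.15),
]

if __name__ == "__main__":
    print(mean_toxicity_by_group(toy_records))
    print(most_toxic_group(toy_records))
```

A comparison of means alone does not show that a difference is significant; a real evaluation would also need enough prompts per group and a statistical test.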

References

Gender Bias Evaluation for Masked Language modelling: BOLD
EleutherAI/gpt-neo-125M on Hugging Face
BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation

AVID Taxonomy Categorization

Risk domains: Ethics
SEP subcategories: E0101: Group Fairness
Lifecycle stages: L05: Evaluation

Affected or Relevant Artifacts

Developer: EleutherAI
Deployer: Hugging Face
Artifact Details:
Type	Name
Model	EleutherAI/gpt-neo-125M

Other information

Report Type: Detection
Credits: Subho Majumdar, AVID
Date Reported: 2022-11-09
Version: 0.1