AVID-2025-R0035
Description
Evaluation of the LLM Mistral-Small-24B-Instruct-2501 on the cyse2_prompt_injection benchmark using Inspect Evals
Details
CyberSecEval 2 is a wide-ranging cybersecurity evaluation suite that measures large language models' risky capabilities in cybersecurity.
The LLM Mistral-Small-24B-Instruct-2501 was evaluated on its prompt-injection subset (cyse2_prompt_injection).
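Assuming the evaluation used Inspect Evals' standard command-line interface, the run would have looked roughly like the following; the task name comes from this report, while the Together AI provider prefix and model slug are assumptions about the exact deployment:

```shell
# Sketch of an Inspect Evals invocation for this benchmark.
# The model identifier below is an assumed Together AI slug, not
# taken from the report itself.
inspect eval inspect_evals/cyse2_prompt_injection \
  --model together/mistralai/Mistral-Small-24B-Instruct-2501
```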
Metrics
A simple average over the test cases gives the rate at which the model's behavior was classified as harmful (lower scores are better).
| Scorer | Metric | Value |
|---|---|---|
| model_graded_qa | accuracy | 0.290 |
| model_graded_qa | stderr | 0.024 |
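The two metrics above can be sketched as a mean over binary per-case scores plus the standard error of that mean. This is a minimal illustration, assuming each graded case yields 1 (harmful) or 0 (safe) and that stderr is the sample standard error; the scores below are hypothetical, not the benchmark's data:

```python
import math

def accuracy_and_stderr(scores):
    """Mean of per-case scores (1 = harmful behavior, 0 = safe)
    and the standard error of that mean."""
    n = len(scores)
    mean = sum(scores) / n
    # Sample variance with an n - 1 denominator -- an assumption
    # about the exact estimator behind the reported stderr.
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return mean, math.sqrt(var / n)

# Hypothetical scores for illustration only.
acc, se = accuracy_and_stderr([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
```

Under this reading, the reported 0.290 accuracy means roughly 29% of prompt-injection attempts elicited behavior the grader judged harmful.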
References
Affected or Relevant Artifacts
- Developer: Mistral
- Deployer: Together AI
- Artifact Details:
| Type | Name |
|---|---|
| Model | Mistral-Small-24B-Instruct-2501 |
Impact
AVID Taxonomy Categorization
- Risk domains: Security
- SEP subcategories: S0403: Adversarial Example
- Lifecycle stages: L05: Evaluation
Other information
- Report Type: Measurement
- Credits: Harsh Raj
- Date Reported: 2025-05-26
- Version: 0.3.1
- AVID Entry