AVID-2026-R1692

Description

vLLM Vulnerable to Remote Code Execution via Mooncake Integration (CVE-2025-32444)

Details

vLLM is a high-throughput, memory-efficient inference and serving engine for LLMs. Versions from 0.6.5 up to, but not including, 0.8.5 that use the Mooncake integration are vulnerable to remote code execution because they use pickle-based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, which increases the likelihood that an attacker can reach them and carry out an attack. vLLM instances that do not use the Mooncake integration are not vulnerable. This issue has been patched in version 0.8.5.
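The affected range stated above can be expressed as a small check. This is a minimal sketch, not part of vLLM: the names `parse_release` and `is_affected` are hypothetical, and the parser assumes a plain dotted release string such as `0.8.4` (suffixes such as `.post1` or `rc1` are not handled and raise `ValueError`).

```python
# Hypothetical helper: decides whether a deployment falls in the range
# described in this advisory (0.6.5 <= version < 0.8.5, Mooncake in use).

FIRST_AFFECTED = (0, 6, 5)  # first affected release per the advisory
FIRST_PATCHED = (0, 8, 5)   # release that contains the fix


def parse_release(version: str) -> tuple:
    """Parse a plain dotted release string like '0.8.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version: str, uses_mooncake: bool) -> bool:
    """Return True if the advisory applies to this version and configuration."""
    if not uses_mooncake:
        # Instances without the Mooncake integration are not vulnerable.
        return False
    return FIRST_AFFECTED <= parse_release(version) < FIRST_PATCHED


print(is_affected("0.8.4", uses_mooncake=True))
print(is_affected("0.8.5", uses_mooncake=True))
print(is_affected("0.7.0", uses_mooncake=False))
```

The upper bound is exclusive because 0.8.5 is the patched release.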

Reason for inclusion in AVID: The report describes a targeted security vulnerability in a software component (vLLM) used to deploy and serve AI models. It enables remote code execution via insecure deserialization (CWE-502) over network sockets when the Mooncake integration is used, directly impacting AI serving stacks. It affects a software component of the AI supply chain (the model-serving runtime in AI pipelines) rather than hardware or firmware alone. Patch details and references are provided, including affected versions and mitigation.

References

Affected or Relevant Artifacts

Developer: vllm-project
Deployer: vllm-project
Artifact Details:

Type	Name
System	vllm

Impact

AVID Taxonomy Categorization

Risk domains: Security
SEP subcategories: S0100: Software Vulnerability
Lifecycle stages: L06: Deployment

CVSS

Version	3.1
Vector String	CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H
Base Score	10.0
Base Severity	🔴 Critical
Attack Vector	NETWORK
Attack Complexity	🟢 Low
Privileges Required	NONE
User Interaction	NONE
Scope	CHANGED
Confidentiality Impact	🔴 High
Integrity Impact	🔴 High
Availability Impact	🔴 High
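
The base score can be reproduced from the vector string. The sketch below follows the base-score equations and metric weights of the CVSS v3.1 specification. The function names `roundup` and `base_score` are illustrative, and the code covers base metrics only (no temporal or environmental metrics, no input validation).

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a base-metric vector string."""
    # Skip the leading "CVSS:3.1" prefix and read the metric:value pairs.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]

    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H")
print(score)
```

For this vector the changed-scope total exceeds 10 before the cap, so the score is clamped to the maximum of 10.0.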

CWE

ID	Description
CWE-502	CWE-502: Deserialization of Untrusted Data
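
To illustrate why deserializing untrusted pickle data falls under CWE-502, the following minimal sketch shows that `pickle.loads` invokes a callable chosen by whoever produced the bytes. It is not vLLM or Mooncake code and does not describe the 0.8.5 patch. The `Payload` class and the harmless arithmetic expression are assumptions made for illustration, and JSON is shown only as one common data-only alternative.

```python
import json
import pickle


class Payload:
    """Illustrative object whose pickle form names a callable to run on load."""

    def __reduce__(self):
        # The unpickler calls eval("6 * 7") while reconstructing the object.
        # An attacker could substitute any importable callable and arguments.
        return (eval, ("6 * 7",))


# Bytes like these could arrive over a network socket from any peer.
untrusted_bytes = pickle.dumps(Payload())

# Loading does not return a Payload; it returns whatever the callable produced.
result = pickle.loads(untrusted_bytes)
print(result)

# A data-only format parses values without invoking sender-chosen callables.
message = json.loads(json.dumps({"op": "ping", "seq": 1}))
print(message)
```

The receiver never needs the `Payload` class for the call to run, which is why restricting who can reach the socket matters as much as the choice of format.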

Other information

Report Type: Advisory
Credits:
Date Reported: 2025-04-30
Version: 0.3.3