CVE-2026-34159

llama.cpp

An unauthenticated attacker can achieve remote code execution and ASLR bypass in llama.cpp by exploiting a lack of bounds validation in the RPC backend's tensor deserialization.

Executive summary

The llama.cpp LLM inference engine is vulnerable to unauthenticated remote code execution via its RPC backend, allowing attackers to fully compromise the host system.

Vulnerability

The RPC backend's deserialize_tensor() function fails to perform bounds validation when a tensor's buffer field is zero, so attacker-supplied pointers and offsets are used unchecked. An unauthenticated attacker with TCP access to the RPC server can use this to read and write arbitrary memory, bypass ASLR, and ultimately execute code on the host.
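The defect class can be sketched as follows. This is an illustrative C++ sketch, not the actual llama.cpp code: the struct fields, function name, and buffer registry are assumptions chosen to show the pattern of validating network-supplied tensor metadata before dereferencing it.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical wire-format tensor header received from an RPC client.
// Every field is attacker-controlled and must be validated.
struct rpc_tensor_hdr {
    uint64_t buffer; // remote buffer handle; 0 means "no backing buffer"
    uint64_t offset; // byte offset of the tensor data inside that buffer
    uint64_t size;   // byte length of the tensor data
};

// Hypothetical server-side backing buffer.
struct rpc_buffer {
    std::vector<uint8_t> data;
};

// Deserialize with the checks the advisory says were missing: reject a
// zero buffer handle instead of trusting raw pointers, and reject any
// offset/size pair that escapes the backing buffer.
uint8_t *deserialize_tensor_checked(const rpc_tensor_hdr &hdr, rpc_buffer &buf) {
    if (hdr.buffer == 0) {
        throw std::runtime_error("tensor has no backing buffer");
    }
    // Overflow-safe range check: offset + size must fit inside the buffer.
    // Written as two comparisons so a huge size cannot wrap around.
    if (hdr.offset > buf.data.size() || hdr.size > buf.data.size() - hdr.offset) {
        throw std::runtime_error("tensor data out of buffer bounds");
    }
    return buf.data.data() + hdr.offset;
}
```

The key design point is that the range check subtracts rather than adds: `hdr.offset + hdr.size` could overflow a 64-bit integer and pass a naive comparison, which is exactly the kind of gap this class of bug exploits.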

Business impact

The CVSS score of 9.8 indicates a critical risk. Successful exploitation gives an attacker full control of the AI inference server, from which they can steal proprietary model weights or sensitive data, tamper with inference results, or pivot deeper into the network.

Remediation

Immediate Action: Update llama.cpp to version b8492 or later, which adds the missing bounds validation to the RPC backend.

Proactive Monitoring: Monitor the RPC server port for anomalous traffic patterns and unexpected memory usage spikes that could signal an exploitation attempt.

Compensating Controls: Do not expose the llama.cpp RPC server port to the public internet; use VPNs or SSH tunnels for remote access and implement strict IP whitelisting.
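A complementary control, when the RPC server only needs to serve local clients, is to bind it to the loopback interface so the port is never reachable from other hosts. The sketch below is a generic POSIX C++ illustration of that idea, not llama.cpp's own listener code; the function name and parameters are assumptions.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <stdexcept>

// Open a TCP listening socket bound to 127.0.0.1 only, rather than
// 0.0.0.0, so remote hosts cannot reach the RPC port at all.
int listen_loopback(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) throw std::runtime_error("socket() failed");

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port); // port 0 lets the kernel pick one
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); // loopback, not INADDR_ANY

    if (bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) < 0) {
        close(fd);
        throw std::runtime_error("bind() failed");
    }
    if (listen(fd, 16) < 0) {
        close(fd);
        throw std::runtime_error("listen() failed");
    }
    return fd;
}
```

Loopback-only binding is defense in depth, not a patch: it removes network exposure but does nothing for clients on the same host, so it should accompany the update to b8492, not replace it.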

Exploitation status

Public Exploit Available: No

Analyst recommendation

Given the unauthenticated nature of this RCE, immediate patching is mandatory. Organizations should also ensure that LLM infrastructure is isolated from untrusted networks to reduce the attack surface.