CVE-2025-23310

NVIDIA · NVIDIA Triton Inference Server for Windows and Linux

Executive summary

A critical vulnerability has been discovered in NVIDIA's Triton Inference Server for both Windows and Linux. This flaw, identified as a stack buffer overflow, can be exploited by a remote attacker sending specially crafted inputs to the server. A successful exploit could allow the attacker to execute arbitrary code, leading to a complete system compromise, data theft, or a denial-of-service condition that would disrupt critical AI/ML operations.

Vulnerability

The vulnerability is a stack-based buffer overflow in NVIDIA Triton Inference Server. An unauthenticated, remote attacker can trigger it by sending a specially crafted request with an overly large input to a vulnerable endpoint. The oversized input is written past the bounds of a buffer on the stack, corrupting adjacent memory, including the function's saved return address. By overwriting the return address with a value of their choosing, an attacker can redirect the program's execution flow to malicious code (shellcode) they have injected, resulting in arbitrary code execution with the same privileges as the Triton server process.
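The mechanism described above can be illustrated with a toy model. The sketch below is a generic simulation of the bug class in Python, not Triton's code: the frame layout, the sizes `BUF_SIZE` and `RET_SIZE`, and the addresses are all hypothetical values chosen for illustration.

```python
BUF_SIZE = 16  # hypothetical size of the local buffer, in bytes
RET_SIZE = 8   # hypothetical size of the saved return address stored after it


def new_frame() -> bytearray:
    """Build a toy stack frame: a zeroed buffer followed by a return address."""
    return bytearray(BUF_SIZE) + (0x401000).to_bytes(RET_SIZE, "little")


def return_address(frame: bytearray) -> int:
    """Read the saved return address out of the toy frame."""
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + RET_SIZE], "little")


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Bug pattern: the input length is never compared with BUF_SIZE."""
    n = min(len(data), len(frame))  # keep the toy frame a fixed size
    frame[0:n] = data[:n]


def checked_copy(frame: bytearray, data: bytes) -> None:
    """Safe pattern: reject input that does not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input larger than destination buffer")
    frame[0:len(data)] = data
```

With a payload of `b"A" * 16` followed by eight attacker-chosen bytes, `unchecked_copy` replaces the value returned by `return_address`, while `checked_copy` raises `ValueError` and leaves the frame untouched. The bounds check is the entire difference between the two functions.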

Business impact

This vulnerability is rated as critical severity with a CVSS score of 9.8. A successful exploit poses a significant risk to the organization, potentially leading to a complete compromise of the affected server. An attacker could gain control of the system to steal sensitive data being processed, such as proprietary AI models, intellectual property, or confidential business information. Furthermore, the compromised server could be used as a pivot point to attack other systems within the corporate network, or be leveraged for malicious activities like hosting malware or participating in a botnet. The vulnerability could also be exploited to cause a denial-of-service (DoS) by crashing the server, which would disrupt business-critical applications and services that rely on the AI/ML infrastructure.

Remediation

Immediate Action: Prioritize updating all affected NVIDIA Triton Inference Server instances to the latest patched version as recommended by the vendor. After patching, verify that the service is operating correctly.
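One way to verify the patch level across a fleet is to read each server's self-reported version. The sketch below assumes the server's HTTP endpoint is reachable and answers the server metadata request (`GET /v2`) with a JSON object containing a dotted numeric `version` field. The `minimum_fixed` argument is a placeholder that must be taken from the vendor bulletin; no fixed version is asserted here.

```python
import json
from urllib.request import urlopen


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '2.9' into (2, 9, 0)."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions with zeros
    return tuple(parts)


def is_patched(metadata: dict, minimum_fixed: str) -> bool:
    """Compare the reported version with a caller-supplied fixed version."""
    return parse_version(metadata["version"]) >= parse_version(minimum_fixed)


def fetch_metadata(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch server metadata from the HTTP endpoint (assumed to be reachable)."""
    with urlopen(f"{base_url}/v2", timeout=timeout) as response:
        return json.load(response)
```

For example, `is_patched(fetch_metadata("http://triton.example.internal:8000"), minimum_fixed)` returns `True` when the reported version is at or above the supplied value; the host name is hypothetical. Version strings with non-numeric suffixes would need extra handling.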

Proactive Monitoring: Implement enhanced monitoring on affected systems. Review Triton server access logs for requests that are unusually large, malformed, or result in server errors or crashes. Monitor network traffic for anomalous outbound connections from the Triton server, which could indicate a successful compromise and communication with a command-and-control server. Utilize Intrusion Detection/Prevention Systems (IDS/IPS) with updated signatures to detect and block potential exploitation attempts.
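The log review described above can be partly automated. The following is a minimal sketch that assumes access logs are available as JSON lines with hypothetical `request_length` and `status` fields, for example from a reverse proxy in front of the server. The field names and the default threshold are assumptions to adapt to the actual log format.

```python
import json
from typing import Iterable, Iterator


def flag_suspicious(
    lines: Iterable[str], max_request_bytes: int = 1_000_000
) -> Iterator[tuple[int, str]]:
    """Yield (line number, reason) for log lines that deserve a closer look."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
            size = int(entry["request_length"])
            status = int(entry["status"])
        except (ValueError, KeyError, TypeError):
            # Not valid JSON, or the expected fields are missing or malformed.
            yield line_no, "unparseable log line"
            continue
        if size > max_request_bytes:
            yield line_no, f"oversized request ({size} bytes)"
        if status >= 500:
            yield line_no, f"server error (status {status})"
```

Calling `list(flag_suspicious(open("access.jsonl")))` on a hypothetical log file returns the flagged line numbers with a reason for each. A flagged line is a prompt for investigation, not proof of an exploitation attempt.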

Compensating Controls: If immediate patching is not feasible, implement compensating controls to reduce the risk. Place the Triton Inference Server behind a Web Application Firewall (WAF) or a reverse proxy capable of inspecting and sanitizing input traffic to block malicious requests. Enforce strict network segmentation to isolate the server, limiting its ability to communicate with critical internal network segments and thus containing the potential impact of a compromise. Ensure the server process is running with the lowest possible user privileges.
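As an illustration of input filtering at a proxy, the sketch below shows a header-level admission check that a reverse proxy or gateway could apply before forwarding a request. The function name, the default limit, and the policy of refusing bodies without a declared length are assumptions made for this example. A size cap is generic hardening and is not known to block this specific flaw.

```python
from typing import Mapping


def admit_request(
    headers: Mapping[str, str], max_body_bytes: int = 64 * 1024 * 1024
) -> tuple[bool, str]:
    """Decide from the headers alone whether a request may be forwarded."""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if "transfer-encoding" in normalized:
        # Policy choice for this sketch: refuse bodies without a declared length.
        return False, "body without a declared length"
    raw = normalized.get("content-length")
    if raw is None:
        return True, "no body declared"
    if not (raw.isascii() and raw.isdigit()):
        return False, "invalid Content-Length"
    if int(raw) > max_body_bytes:
        return False, "body exceeds size limit"
    return True, "ok"
```

A real proxy must also enforce the declared length while reading the body, because the header value is supplied by the client. The limit should be set just above the largest legitimate inference payload so that normal traffic is unaffected.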

Exploitation status

Public Exploit Available: False

Analyst recommendation

Given the critical severity (CVSS 9.8) and the potential for remote code execution, this vulnerability represents a direct and severe threat to the organization. We strongly recommend that all system owners identify and patch affected NVIDIA Triton Inference Server instances with the highest priority. Although this CVE is not yet on the CISA KEV list, its high impact score makes it a prime target for future exploitation. Organizations should treat this vulnerability with the utmost urgency and apply remediation actions immediately to prevent potential system compromise and data breaches.