NVIDIA TensorRT-LLM Vulnerability Lets Hackers Run Malicious Code
A critical vulnerability identified as CVE-2025-23254 has been discovered in the NVIDIA TensorRT-LLM framework, affecting versions prior to 0.18.2 on Windows, Linux, and macOS. The vulnerability, rated 8.8 on the CVSS v3.1 scale, arises from improper handling of Inter-Process Communication (IPC) in the Python executor component, where the `pickle` module is used to serialize and deserialize untrusted data. A local attacker could exploit this flaw to execute arbitrary code, alter data, or access sensitive information. NVIDIA has mitigated the issue in TensorRT-LLM version 0.18.2, which by default protects IPC channels with HMAC (a keyed message authentication code that NVIDIA describes as HMAC encryption), so that tampered or injected messages are rejected. Users are strongly advised to upgrade to version 0.18.2 or later and to leave this feature enabled, because disabling it reintroduces the risk. The vulnerability was responsibly reported by Avi Lumelsky from Oligo Security.
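As a rough illustration of the mitigation described above, the following sketch attaches an HMAC tag to each pickled IPC message and refuses to unpickle anything that fails verification. It is not TensorRT-LLM's actual implementation: the names `pack` and `unpack`, the in-memory key, and the tag-then-payload framing are assumptions made for this example.

```python
import hashlib
import hmac
import pickle
import secrets

# Hypothetical shared secret known only to the cooperating processes.
KEY = secrets.token_bytes(32)
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def pack(obj, key):
    """Serialize obj and prepend an HMAC-SHA256 tag over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def unpack(message, key):
    """Verify the tag first; unpickle only if the message is authentic."""
    if len(message) < TAG_LEN:
        raise ValueError("message too short to contain an HMAC tag")
    tag, payload = message[:TAG_LEN], message[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC verification failed; refusing to unpickle")
    return pickle.loads(payload)


message = pack({"request_id": 7, "tokens": [1, 2, 3]}, KEY)
restored = unpack(message, KEY)
```

The key point of the design is ordering: the tag is checked before `pickle.loads` ever sees the bytes, so a process that does not hold the key cannot get its data deserialized.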
This security issue lets an attacker execute arbitrary code, alter data, or access sensitive information on systems running vulnerable NVIDIA TensorRT-LLM versions by exploiting improper `pickle` deserialization in the Python executor's Inter-Process Communication channel. The following protection guardrails can block successive steps of such an attack:
- Python Deserialization Protection: When an attacker attempts to inject a malicious pickled object into the IPC channel, this guardrail intercepts the deserialization process and applies security policies that restrict function calls from the deserialized object, aiming to block functions designed for arbitrary code execution or unauthorized data access before they take full effect.
- Python OS Command Injection Prevention: Should the initial payload attempt to execute operating system commands, for example to gather system information or run further malicious scripts, this guardrail monitors the Python runtime environment and blocks the unauthorized command execution attempts.
- Reverse Shell Protection: If the attacker's code then tries to establish a persistent command and control channel, this guardrail prevents the binding of shell STDIN/STDOUT/STDERR to network sockets, blocking reverse shell attempts.
- Container Drift Protection (Binaries & Scripts): Where the vulnerable software runs within a container and the attacker tries to introduce new malicious binaries or scripts that are not part of the original image, such as a privilege escalation tool, this guardrail blocks their execution.
- Process Path Exec Allow: Complementing drift protection, this guardrail prevents the execution of attacker-introduced executables or scripts that are saved to and run from non-allowlisted filesystem paths, such as a temporary download folder.
- Sensitive File Access: To counter data theft or unauthorized modification, this guardrail monitors and can block attempts by the malicious code to read or alter files designated as sensitive, such as configuration files, private keys, or even the LLM's model weights.
- Process Socket Deny: Finally, if the attacker attempts to exfiltrate stolen information by having the compromised TensorRT-LLM process open outbound network connections to an attacker-controlled server, or attempts to download additional tools, this guardrail can deny the socket operations when the process is not on an explicit allowlist for network activity.
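To make the idea of restricting function calls from deserialized objects more concrete, here is a minimal sketch of the allowlist pattern from the Python documentation, which overrides `Unpickler.find_class`. It only illustrates the general technique and is not a description of how the Python Deserialization Protection guardrail is built; the allowlist contents and the helper name `restricted_loads` are assumptions.

```python
import builtins
import io
import pickle

# Hypothetical allowlist: the only globals a pickle stream may reference.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only allowlisted builtins."""

    def find_class(self, module, name):
        # find_class is consulted for every global the stream references,
        # including callables that a __reduce__ payload wants to invoke.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden"
        )


def restricted_loads(data):
    """Drop-in analogue of pickle.loads that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data and allowlisted types still load.
ok = restricted_loads(pickle.dumps([1, 2, range(3)]))

# A stream that references a non-allowlisted callable is rejected.
try:
    restricted_loads(pickle.dumps(len))
except pickle.UnpicklingError as exc:
    rejection = str(exc)
```

An allowlist is used rather than a denylist because the set of dangerous callables reachable from a Python process is effectively open-ended.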
- T1059.006: Command and Scripting Interpreter: Python: The vulnerability stems from improper handling of Inter-Process Communication (IPC) within the Python executor component, specifically the use of Python's `pickle` module to serialize and deserialize untrusted data. Deserialization flaws of this kind are commonly used to execute arbitrary code, so an attacker who exploits this one runs malicious code through the Python environment, which matches MITRE ATT&CK Technique T1059.006.
- T1003: OS Credential Dumping: A local attacker with access to the system could exploit this vulnerability to tamper with data or disclose sensitive information. Although the article doesn't explicitly mention credential dumping, the sensitive information exposed through the IPC flaw could include credentials, which is why MITRE ATT&CK Technique T1003 is listed as a potential outcome.
F1: Exploitation of CVE-2025-23254 via malicious `pickle` deserialization in the Python executor's IPC channel, leading to arbitrary code execution.
- Attacker gains local access to a system running a vulnerable version of NVIDIA TensorRT-LLM (any version prior to 0.18.2). (Cited from: "A local attacker with access to the system could exploit this vulnerability", "This flaw affects all versions prior to 0.18.2 across Windows, Linux, and macOS platforms.")
- Attacker identifies the Inter-Process Communication (IPC) mechanism utilized by the Python executor component within TensorRT-LLM. (Cited from: "improper handling of Inter-Process Communication (IPC) within the Python executor component.")
- Attacker crafts a malicious payload using Python's `pickle` module. This payload is specifically designed to execute arbitrary commands or code when it is deserialized by a vulnerable application. (Cited from: "the use of the Python pickle module for serialization and deserialization of untrusted data (CWE-502)", "exploit this vulnerability to execute arbitrary malicious code")
- Attacker injects the crafted malicious pickled data into the identified IPC channel, targeting the TensorRT-LLM Python executor. (Cited from: "deserialization of untrusted data", "IPC within the Python executor component")
- The Python executor component of the vulnerable TensorRT-LLM version receives and processes the injected data, attempting to deserialize it using the `pickle` module. (Cited from: "deserialization of untrusted data")
- BR-76: Python Deserialization Protection - This mechanism is applicable because the vulnerability involves Python deserialization (CWE-502). BR-76 intercepts the Python deserialization process and applies security policies to restrict function calls originating from deserialized objects, potentially blocking the execution of malicious functions before full code execution occurs.
- BR-77: Python OS Command Injection Prevention - This mechanism is applicable because the vulnerability involves Python and leads to arbitrary code execution, which can be seen as a form of OS command injection or unauthorized Python command execution. BR-77 monitors Python runtime environments for suspicious patterns indicative of command injection and blocks unauthorized command execution attempts.
- Due to the CWE-502 vulnerability (improper deserialization of untrusted data), the `pickle` module executes the arbitrary malicious code embedded within the payload during the deserialization process, granting the attacker code execution capabilities within the context of the TensorRT-LLM application. (Cited from: "execute arbitrary malicious code", "CWE-502")
- BR-76: Python Deserialization Protection - This mechanism is applicable because it aims to limit the actions that Python deserialized objects can take. By intercepting the deserialization process, it enforces policies to restrict function calls from deserialized objects, potentially blocking the execution of the arbitrary malicious code mentioned.
- BR-77: Python OS Command Injection Prevention - This mechanism is applicable as it monitors Python runtime environments for attempts to execute system-native binaries or shell commands from within Python applications, which aligns with the scenario of arbitrary code execution resulting from the pickle deserialization. It would identify and block such unauthorized command execution.
- BR-55: Reverse Shell Protection - This mechanism is applicable because the vulnerability allows arbitrary code execution. Under its LLM Correlation Rule, once code execution is achieved it is assumed that a reverse shell could be established. BR-55 would prevent the binding of shell STDIN/STDOUT/STDERR to network sockets, blocking such an attempt.
- BR-54: Container Drift Protection (Binaries & Scripts) - If the vulnerable TensorRT-LLM software ran inside a container, then this mechanism applies because the execution of arbitrary malicious code could involve introducing new executable binaries or scripts not part of the original container image. BR-54 would block the execution of such new, unauthorized code.
- BR-62: Linux/Host Drift Protection - This mechanism is applicable if the TensorRT-LLM is running on a Linux host, as arbitrary code execution might involve adding new files/scripts outside trusted package managers and then executing them. BR-62 tracks installations via trusted package managers and blocks execution of code added otherwise.
- BR-88: Process Path Exec Allow - This mechanism is applicable if the arbitrary code execution involves writing a new executable or script to a non-standard or temporary filesystem path and then attempting to execute it from that path. BR-88 would block execution from non-allowlisted paths.
- BR-90: Process Exec Deny - This mechanism is applicable if the arbitrary malicious code attempts to execute a process explicitly on the deny list, such as `/nc`, `/wget`, or `/curl`, by matching the final suffix path of the command.
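The F1 steps above depend on one documented property of `pickle`: an object's `__reduce__` method can tell the unpickler to call an arbitrary callable while loading. The sketch below shows that mechanism with a deliberately harmless callable (`os.getcwd`); the class name `Payload` is hypothetical, and the in-memory round trip stands in for the IPC channel, whose real framing is not described in the source.

```python
import os
import pickle


class Payload:
    """Stand-in for a sender-controlled object (hypothetical name)."""

    def __reduce__(self):
        # Instead of this object's state, the pickle stream records the
        # instruction "call os.getcwd() with no arguments".
        return (os.getcwd, ())


# Step 1: the sender serializes the object into bytes.
blob = pickle.dumps(Payload())

# Step 2: the receiver deserializes the bytes. The callable chosen by the
# sender runs inside the receiver's process as a side effect of loading.
result = pickle.loads(blob)

# The receiver gets back the callable's return value, not a Payload.
print(type(result).__name__, isinstance(result, Payload))
```

Because the sender chooses the callable and its arguments, any receiver that unpickles unauthenticated bytes is in effect letting the sender pick code to run, which is the CWE-502 condition that the mechanisms listed above try to contain.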
F2: Exploitation of CVE-2025-23254 to achieve data tampering or sensitive information disclosure through the Python `pickle` vulnerability in the IPC channel.
- An attacker, having obtained local access, targets a system with an unpatched NVIDIA TensorRT-LLM installation (version prior to 0.18.2). (Cited from: "A local attacker with access to the system could exploit this vulnerability", "all versions prior to 0.18.2")
- The attacker prepares a specialized `pickle` payload. This payload contains instructions not for direct command execution, but for actions such as reading specific files, modifying data structures in memory, or exfiltrating sensitive information accessible to the TensorRT-LLM process. (Cited from: "tamper with data, or disclose sensitive information", "use of the Python pickle module for serialization and deserialization of untrusted data")
- The attacker introduces this data-focused malicious pickled object into the IPC channel of the TensorRT-LLM Python executor. (Cited from: "IPC within the Python executor component", "deserialization of untrusted data")
- The Python executor component deserializes the attacker's pickled object. (Cited from: "deserialization of untrusted data")
- BR-76: Python Deserialization Protection - This mechanism is applicable as it addresses Python deserialization (CWE-502). It restricts the actions of deserialized objects, which could include blocking function calls used for unauthorized file reading or data modification, even if not direct OS command execution.
- Upon deserialization, the embedded instructions are executed, leading to unauthorized access and exfiltration of sensitive information (e.g., model parameters, proprietary data) or the unauthorized modification of data processed or managed by the TensorRT-LLM framework. (Cited from: "tamper with data, or disclose sensitive information")
- BR-76: Python Deserialization Protection - This mechanism is applicable because it limits the actions of deserialized Python objects. The execution of instructions for unauthorized data access or modification would be subject to its policy enforcement, potentially blocking harmful function calls.
- BR-91: Sensitive File Access - This mechanism is applicable if the 'sensitive information' disclosure involves reading files that are on BlueRock's predefined or user-configured list of sensitive files (e.g., configuration files with secrets, private keys). BR-91 monitors and can block access to such files.
- BR-87: Process Socket Deny - This mechanism is applicable if the 'exfiltration of sensitive information' involves the compromised TensorRT-LLM process initiating new, unauthorized network connections. BR-87 can deny socket operations for processes not on an allowlist.
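As a loose illustration of runtime monitoring of file access, in the spirit of the Sensitive File Access mechanism above, the sketch below uses Python's audit hooks (`sys.addaudithook`, Python 3.8+) to abort `open` calls on paths that match a list of patterns. This is not BlueRock's implementation: the patterns, the `set_enforcement` toggle, and the in-process approach are assumptions for the example, and an in-process hook is far weaker than enforcement outside the monitored process.

```python
import fnmatch
import os
import sys
import tempfile

# Hypothetical patterns for files treated as sensitive.
SENSITIVE_PATTERNS = ["*.pem", "*.key", "*/secrets/*"]

# Audit hooks cannot be removed once added, so enforcement is toggled.
_state = {"enforce": False}


def set_enforcement(enabled):
    """Turn blocking of sensitive file opens on or off."""
    _state["enforce"] = bool(enabled)


def _audit_hook(event, args):
    # The "open" audit event carries (path, mode, flags).
    if not _state["enforce"] or event != "open":
        return
    path = args[0]
    if isinstance(path, bytes):
        path = os.fsdecode(path)
    if isinstance(path, str) and any(
        fnmatch.fnmatch(path, pattern) for pattern in SENSITIVE_PATTERNS
    ):
        # Raising from an audit hook aborts the audited operation.
        raise PermissionError(f"blocked access to sensitive file: {path}")


sys.addaudithook(_audit_hook)

# Usage: a temporary file with a sensitive-looking suffix.
fd, demo_path = tempfile.mkstemp(suffix=".pem")
os.close(fd)
set_enforcement(True)
try:
    open(demo_path)
except PermissionError as exc:
    print(exc)
finally:
    set_enforcement(False)
    os.remove(demo_path)
```

The same hook could also watch the `socket.connect` audit event to approximate a socket deny policy, although a compromised process could interfere with any control that lives inside it.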