A critical vulnerability in NVIDIA's Merlin Transformers4Rec library (CVE-2025-23298) allows unauthenticated attackers to achieve remote code execution (RCE) with root privileges through unsafe deserialization in the model checkpoint loader.
The discovery underscores the persistent security risks inherent in ML/AI frameworks' reliance on Python's pickle serialization.
NVIDIA Merlin Vulnerability
Trend Micro's Zero Day Initiative (ZDI) reported that the vulnerability resides in the load_model_trainer_states_from_checkpoint function, which uses PyTorch's torch.load() without safety parameters. Under the hood, torch.load() relies on Python's pickle module, permitting arbitrary object deserialization.
Attackers can embed malicious code in a crafted checkpoint file, triggering execution when the untrusted pickle data is loaded. In the vulnerable implementation, cloudpickle loads the model class directly from the checkpoint.
This approach grants attackers full control of the deserialization process. By defining a custom __reduce__ method, a malicious checkpoint can execute arbitrary system commands upon loading, e.g., calling os.system() to fetch and execute a remote script.
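The mechanism can be reproduced with plain pickle in a few lines. This is a minimal, harmless sketch, not the actual exploit: the class name and the echoed command are illustrative stand-ins for an attacker's payload.

```python
import os
import pickle

class MaliciousCheckpoint:
    """Stand-in for an attacker-crafted object embedded in a checkpoint."""
    def __reduce__(self):
        # pickle records this (callable, args) pair during serialization;
        # on load, pickle *invokes* the callable. A real attacker would use
        # os.system() here to fetch and run a remote script.
        return (os.system, ("echo payload executed",))

blob = pickle.dumps(MaliciousCheckpoint())   # bytes an attacker would ship
result = pickle.loads(blob)                  # deserializing runs the command
# result is os.system's exit status, not a reconstructed model object
```

Nothing about the loading side has to cooperate: any code path that feeds these bytes to pickle (including torch.load() without weights_only=True) executes the command.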
The attack surface is vast: ML practitioners routinely share pre-trained checkpoints via public repositories or cloud storage. Production ML pipelines often run with elevated privileges, meaning a successful exploit not only compromises the model host but can also escalate to root-level access.
To demonstrate the flaw, researchers crafted a malicious checkpoint carrying an embedded shell command.
Loading this checkpoint via the vulnerable function triggers the embedded shell command before any model weights are restored, resulting in immediate RCE in the context of the ML service.
NVIDIA addressed the issue in PR #802 by replacing raw pickle calls with a custom load() function that whitelists permitted classes.
The patched loader in serialization.py enforces input validation, and developers are encouraged to pass weights_only=True to torch.load() to avoid untrusted object deserialization.
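The class-whitelisting idea can be sketched with the standard library's Unpickler hook. This is an illustrative reconstruction under assumptions, not NVIDIA's actual patch; the ALLOWED set and function names are hypothetical.

```python
import io
import pickle

# Illustrative allowlist -- the real set in the patched loader differs.
ALLOWED = {("collections", "OrderedDict")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals explicitly on the allowlist; anything else
        # (os.system, subprocess.Popen, ...) is rejected before execution.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def safe_load(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

For plain tensor state dicts, torch.load(path, weights_only=True) applies a comparable restriction internally.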
Patch adding a custom load function
Developers must never use pickle on untrusted data and should restrict deserialization to known, safe classes.
Alternative formats such as Safetensors or ONNX offer safer model persistence. Organizations should also cryptographically sign model files, sandbox deserialization processes, and include ML pipelines in regular security audits.
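Signing model files need not be elaborate: a detached HMAC verified before the file ever reaches a deserializer already defeats checkpoint tampering in transit. A minimal stdlib sketch (function names are illustrative; key management is out of scope here):

```python
import hashlib
import hmac

def sign_checkpoint(path: str, key: bytes) -> str:
    # Detached HMAC-SHA256 signature over the raw checkpoint bytes.
    with open(path, "rb") as f:
        return hmac.new(key, f.read(), hashlib.sha256).hexdigest()

def verify_checkpoint(path: str, key: bytes, expected: str) -> bool:
    # Constant-time comparison; refuse to deserialize on mismatch.
    return hmac.compare_digest(sign_checkpoint(path, key), expected)
```

The pipeline would call verify_checkpoint() and abort before torch.load() runs, so even a pickle-based loader never sees attacker-modified bytes.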
Risk factors at a glance:
- Affected Products: NVIDIA Merlin Transformers4Rec ≤ v1.5.0
- Impact: Remote code execution as root
- Exploit Prerequisites: Attacker-supplied model checkpoint loaded via torch.load()
- CVSS 3.1 Score: 9.8 (Critical)
The broader community should advocate for security-first design principles and deprecate pickle-based mechanisms altogether.
Until pickle reliance is eliminated, similar vulnerabilities will persist. Vigilance, robust input validation, and a zero-trust mindset remain essential to safeguarding production ML systems against supply-chain and RCE threats.