Critical RCE Vulnerabilities in AI Inference Engines Expose Meta, NVIDIA, and Microsoft Frameworks – Cyber Web Spider Blog

As artificial intelligence infrastructure rapidly expands, critical security flaws threaten the backbone of enterprise AI deployments.

Security researchers at Oligo Security have uncovered a series of dangerous Remote Code Execution (RCE) vulnerabilities affecting major AI frameworks from Meta, NVIDIA, and Microsoft, as well as PyTorch projects including vLLM and SGLang.

The vulnerabilities, collectively termed “ShadowMQ,” stem from the unsafe implementation of ZeroMQ (ZMQ) communications combined with Python’s pickle deserialization.

What makes this threat particularly alarming is how it spread across the AI ecosystem through code reuse and copy-paste development practices.

How the Vulnerability Spread Across Frameworks

The investigation began in 2024 when researchers analyzed Meta’s Llama Stack and discovered the dangerous use of ZMQ’s recv_pyobj() method, which deserializes data using Python’s pickle module.

ShadowMQ Vulnerability CVE Data Table

CVE ID                 Product                   Severity   CVSS Score   Vulnerability Type
CVE-2024-50050         Meta Llama Stack          Critical   9.8          Remote Code Execution
CVE-2025-30165         vLLM                      Critical   9.8          Remote Code Execution
CVE-2025-23254         NVIDIA TensorRT-LLM       Critical   9.3          Remote Code Execution
CVE-2025-60455         Modular Max Server        Critical   9.8          Remote Code Execution
N/A (Unpatched)        Microsoft Sarathi-Serve   Critical   9.8          Remote Code Execution
N/A (Incomplete Fix)   SGLang                    Critical   9.8          Remote Code Execution

Calling recv_pyobj() on a network-exposed ZMQ socket creates an unauthenticated endpoint that can execute arbitrary code during deserialization, enabling remote attackers to compromise systems.
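To see why unpickling untrusted bytes is dangerous, here is a minimal, harmless sketch using only Python's standard pickle module. The class name Payload is hypothetical, and the built-in len() stands in for whatever callable an attacker would actually choose. No ZMQ socket is involved: the call to pickle.loads() plays the role of what recv_pyobj() does with the bytes it receives.

```python
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled data."""

    def __reduce__(self):
        # pickle stores "call this callable with these arguments" and
        # replays that call at load time. len() is harmless here; an
        # attacker could name any importable callable instead.
        return (len, ("abc",))


# Bytes that a sender could put on the wire.
blob = pickle.dumps(Payload())

# The receiver only asked to deserialize data, yet a function call runs.
result = pickle.loads(blob)

# The receiver gets the return value of the call, not a Payload object.
print(type(result).__name__, result)
```

The receiver never sees a Payload instance. It gets back whatever the embedded call returned, which shows that loading a pickle means running code chosen by the sender.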

After Meta patched the vulnerability (CVE-2024-50050), Oligo researchers found identical security flaws across multiple frameworks.

NVIDIA’s TensorRT-LLM, the PyTorch projects vLLM and SGLang, and Modular’s Max Server all contained nearly identical vulnerable patterns.

Oligo’s code analysis revealed that entire files had been copied between projects, spreading the security flaw like a virus. These AI inference servers power critical enterprise infrastructure, processing sensitive data across GPU clusters.

Organizations relying on SGLang include xAI, AMD, NVIDIA, Intel, LinkedIn, Oracle Cloud, Google Cloud, Microsoft Azure, AWS, MIT, Stanford, UC Berkeley, and numerous other major technology companies.

Successful exploitation could allow attackers to execute arbitrary code, escalate privileges, exfiltrate model data, or install cryptocurrency miners.

Oligo researchers identified thousands of exposed ZMQ sockets communicating unencrypted over the public internet. Microsoft’s Sarathi-Serve remains unpatched and SGLang’s fix is incomplete, so both are still vulnerable.
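One common way a socket ends up reachable from the internet is binding to a wildcard address instead of loopback. The following is a rough sketch of a configuration check, not a tool from the Oligo research: the function name is_loopback_only and the set of wildcard hosts are assumptions chosen for illustration, and the check only parses ZMQ-style endpoint strings without opening any socket.

```python
import ipaddress

# Hosts that mean "listen on every interface" (assumed list for this sketch).
WILDCARD_HOSTS = {"*", "0.0.0.0", "::", ""}


def is_loopback_only(endpoint: str) -> bool:
    """Return True if a ZMQ-style endpoint is local-only.

    Anything that cannot be verified (hostnames, interface names,
    unknown transports) is conservatively reported as False.
    """
    scheme, _, rest = endpoint.partition("://")
    if scheme in ("ipc", "inproc"):
        return True  # not reachable over the network
    if scheme != "tcp":
        return False
    host, _, _port = rest.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]
    if host in WILDCARD_HOSTS:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


for ep in ("tcp://*:5555", "tcp://0.0.0.0:5555", "tcp://127.0.0.1:5555"):
    print(ep, "local-only" if is_loopback_only(ep) else "potentially exposed")
```

A check like this could run at startup or in CI to flag endpoints that deserve a firewall rule or an explicit justification. It does not replace network-level controls.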

Organizations should immediately update to patched versions, avoid using pickle with untrusted data, implement authentication for ZMQ communications, and restrict network access to ZMQ endpoints.
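As an illustration of the first two recommendations on serialization and authentication, the sketch below replaces pickle with JSON and adds an HMAC-SHA256 tag so that a receiver rejects messages from anyone who lacks a shared key. This is an application-layer example under stated assumptions, not the fix any of the affected projects shipped: the function names encode_message and decode_message and the tag-then-body framing are hypothetical, key distribution is out of scope, and there is no replay protection. The resulting bytes could be carried over a ZMQ socket with plain send() and recv() calls.

```python
import hashlib
import hmac
import json

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def encode_message(obj, key: bytes) -> bytes:
    """Serialize obj as JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def decode_message(blob: bytes, key: bytes):
    """Verify the tag first, then parse the JSON body.

    Raises ValueError if the message is too short or the tag is wrong.
    """
    if len(blob) < TAG_LEN:
        raise ValueError("message too short")
    tag, body = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    # JSON parsing yields only dicts, lists, strings, numbers, booleans
    # and None. It cannot trigger arbitrary callables the way pickle can.
    return json.loads(body.decode("utf-8"))


key = b"k" * 32  # placeholder key for the example only
wire = encode_message({"op": "generate", "max_tokens": 16}, key)
print(decode_message(wire, key))
```

Verifying the tag before parsing means unauthenticated bytes never reach the deserializer. Transport-level protections, such as ZMQ's built-in security mechanisms or network isolation, remain worthwhile in addition to this.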
