A critical security flaw in SGLang can allow remote code execution (RCE) on vulnerable systems. Tracked as CVE-2026-5760, the vulnerability carries a CVSS score of 9.8 out of 10. It is a command injection issue that lets an attacker execute arbitrary code.
Understanding the SGLang Framework
SGLang is a high-performance, open-source framework for serving large language models and multimodal models. Its GitHub project has been forked more than 5,500 times and starred 26,100 times, which reflects its wide adoption. The vulnerability affects the reranking endpoint “/v1/rerank,” where an attacker can execute arbitrary code by means of a crafted GGUF model file.
Exploitation Methodology
To exploit the flaw, an attacker creates a malicious GPT-Generated Unified Format (GGUF) model file. The file carries a tokenizer.chat_template parameter that contains a Jinja2 server-side template injection (SSTI) payload. Once a victim downloads the model and loads it in SGLang, a request to the “/v1/rerank” endpoint triggers execution of the attacker’s code.
Security researcher Stuart Beck, who reported the flaw, identified the root cause as the use of jinja2.Environment() without sandboxing, which allows arbitrary Python code to execute on the server. The attack relies on a GGUF model file crafted with a specific trigger phrase, which leads to execution of the malicious code on the server.
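The root cause can be illustrated with a harmless probe. The sketch below is not SGLang's code; it assumes only that the Jinja2 package is installed, and the helper name render_unsandboxed is hypothetical. It shows that a template rendered through a plain jinja2.Environment() can reach Python object internals from an ordinary context value, which is the foothold an SSTI payload builds on.

```python
from jinja2 import Environment


def render_unsandboxed(template_source: str, **context) -> str:
    """Render a template with a plain, unsandboxed Jinja2 environment.

    Hypothetical helper for illustration only; not taken from SGLang.
    """
    env = Environment()
    return env.from_string(template_source).render(**context)


# Benign probe: the template reads a dunder attribute of a context value.
# Nothing harmful runs here, but the access is not restricted in any way.
probe = "{{ message.__class__.__name__ }}"
result = render_unsandboxed(probe, message="hello")
print(result)  # prints: str
```

Because the template text comes from the model file rather than from the server operator, anything the template can reach is effectively under the control of whoever authored the model.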
Comparison with Previous Vulnerabilities
This vulnerability belongs to the same class as CVE-2024-34359, known as Llama Drama, a critical flaw in the llama_cpp_python package. That issue, which has a CVSS score of 9.7, has been patched. A similar vulnerability in vLLM, CVE-2025-61620, was also addressed last year, which points to a recurring pattern across similar software.
To prevent exploitation, the recommended fix is to render chat templates with ImmutableSandboxedEnvironment instead of jinja2.Environment(). The sandbox is intended to block execution of arbitrary Python code on the server. Despite coordination efforts, no patch was provided during the vulnerability coordination process.
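A minimal sketch of that mitigation follows, assuming the Jinja2 package is installed. The helper names render_chat_template and is_blocked are hypothetical and do not come from SGLang; the sketch only shows how ImmutableSandboxedEnvironment from jinja2.sandbox behaves when a template is ordinary and when it probes object internals or tries to mutate a context value.

```python
from jinja2.exceptions import SecurityError
from jinja2.sandbox import ImmutableSandboxedEnvironment


def render_chat_template(template_source: str, **context) -> str:
    """Render an untrusted template inside Jinja2's immutable sandbox.

    Hypothetical helper for illustration only; not taken from SGLang.
    """
    env = ImmutableSandboxedEnvironment()
    return env.from_string(template_source).render(**context)


def is_blocked(template_source: str, **context) -> bool:
    """Return True if the sandbox rejects the template with SecurityError."""
    try:
        render_chat_template(template_source, **context)
    except SecurityError:
        return True
    return False


# An ordinary template still renders as usual.
print(render_chat_template("Query: {{ query }}", query="hi"))  # prints: Query: hi

# Walking through dunder attributes is rejected by the sandbox.
print(is_blocked("{{ query.__class__.__mro__ }}", query="hi"))  # prints: True

# The immutable variant also rejects in-place mutation of context objects.
print(is_blocked("{{ docs.append('x') }}", docs=[]))  # prints: True
```

The sandbox restricts what a template can reach, but it is not a substitute for loading models only from trusted sources.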
The discovery and potential impact of CVE-2026-5760 underscore the need for robust security measures in software frameworks, especially those in widespread use. Users and developers are urged to apply recommended mitigations promptly to safeguard against potential exploits.
