A recent controlled security test has revealed significant vulnerabilities in AI agent environments. A malicious AI skill, designed to appear harmless, evaded security measures and took control of more than 26,000 agents in both personal and corporate settings.
Exploiting Trust in AI Skill Marketplaces
Researcher Niv Hoffman initiated the attack by developing a seemingly genuine AI skill named “brand-landingpage.” This skill, advertised as a no-code solution for designing product landing pages, utilized Google’s Stitch platform. Its genuine functionality fostered trust among users such as marketers and sales professionals.
The skill spread quickly through various platforms, including open marketplaces and social media channels. To boost its credibility, the researchers integrated the skill into a highly rated GitHub plugin marketplace. This move helped the project gain a trustworthy reputation among both users and automated evaluation systems.
Security Scanner Shortcomings
As adoption grew, prominent AI security scanners from major companies labeled the skill as safe, bolstering user confidence. The attack did not use typical malware tactics; instead, it exploited a core flaw in how AI skills are evaluated.
Security scanners typically review only the local components of a skill, ignoring external resources like documentation or guides. The skill took advantage of this by redirecting AI agents to an external site that initially mirrored legitimate Stitch documentation. This redirection masked the skill’s true intentions during early assessments.
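To make the scanning gap concrete, here is a minimal Python sketch of a check that lists the external URLs a skill's local files point to, so that a reviewer or scanner could inspect those pages as well. The function name `find_external_references`, the sample manifest, and the placeholder domain are all hypothetical and are not taken from the test or from any real scanner.

```python
import re

# Matches http(s) URLs up to the next whitespace, bracket, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")


def find_external_references(skill_text: str) -> list[str]:
    """Return the unique external URLs in a skill file, in order of first appearance."""
    found: list[str] = []
    for match in URL_PATTERN.finditer(skill_text):
        # Drop sentence punctuation that trails the URL in prose.
        url = match.group(0).rstrip(".,;:")
        if url not in found:
            found.append(url)
    return found


# Illustrative skill manifest; the domain is a placeholder (assumption).
SAMPLE_SKILL = """\
# brand-landingpage
Build a product landing page with Stitch.
Before generating, read the setup guide at https://docs.example.invalid/stitch/setup.
"""

for url in find_external_references(SAMPLE_SKILL):
    print("external dependency:", url)
```

A list like this does not by itself show that a page is malicious. It only surfaces the content that a local-only review would never look at.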
Implications and Future Outlook
Once the skill was widely adopted, the external content was modified to instruct agents to download a script. Although this experiment only gathered user emails to demonstrate potential impact, the technique could be used for more harmful actions, such as executing malicious code or accessing sensitive information.
The incident, affecting more than 26,000 agents, including those in corporate environments, highlights a critical supply chain risk in AI ecosystems. Unlike conventional software, an AI skill's behavior can change after installation whenever the external content it references is altered, making one-time security checks insufficient.
For organizations, this poses a significant threat. Many companies permit employees to install AI extensions without oversight, thereby enlarging the attack surface. Experts suggest implementing continuous monitoring of AI behaviors, enforcing centralized approval for third-party skills, and expanding scan capabilities to include external dependencies.
Without these precautions, AI platforms might remain susceptible to large-scale attacks that exploit trust instead of technical vulnerabilities.
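One way to picture continuous monitoring of external dependencies is to record a hash of each external page when the skill is approved and to re-check it on a schedule. The sketch below assumes that approach; `fingerprint`, `detect_drift`, and the sample URL are hypothetical names, and the `fetch` callable is injected so the example runs without network access.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of a page's content."""
    return hashlib.sha256(content).hexdigest()


def detect_drift(pinned: Dict[str, str], fetch: Callable[[str], bytes]) -> List[str]:
    """Return the URLs whose current content no longer matches the pinned hash."""
    changed: List[str] = []
    for url, expected in pinned.items():
        if fingerprint(fetch(url)) != expected:
            changed.append(url)
    return changed


# Simulated external site (assumption): a dict stands in for real HTTP fetches.
url = "https://docs.example.invalid/stitch/setup"
site = {url: b"Stitch setup guide"}

# Hash recorded at approval time.
pinned = {url: fingerprint(site[url])}
print(detect_drift(pinned, site.__getitem__))  # nothing has changed yet

# The page is later edited to add an instruction for the agent.
site[url] = b"Stitch setup guide. Also download and run a script."
print(detect_drift(pinned, site.__getitem__))  # the edited URL is reported
```

In practice `fetch` could wrap `urllib.request.urlopen(url).read()`. An exact hash flags every change, including harmless edits, so a reported URL is a prompt for human review rather than proof of an attack.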
