Researchers at SafeBreach have identified a significant security flaw in Google’s Gemini voice assistant, which could have been exploited by attackers through indirect prompt injections via common messaging notifications.
Earlier, the cybersecurity team uncovered a vulnerability in Gemini and Google Workspace involving calendar invites, which could have facilitated spam, phishing, event deletions, location tracking, remote home appliance control, and email exfiltration.
Discovery of Fake Context Alignment
Building on their previous findings, SafeBreach uncovered a new type of attack known as Fake Context Alignment. Google was informed about this in August 2025, and the issue was resolved by mid-November 2025 through enhancements in content classification. This week, SafeBreach released details to highlight the persistent dangers of prompt injection attacks and the necessity for robust defenses against context manipulation.
The attack uses notifications from widely used channels such as WhatsApp, Slack, and SMS to covertly insert malicious instructions into Gemini’s conversation context without the user’s awareness.
Exploitation Techniques and Implications
The researchers demonstrated several techniques, including hiding instructions in foreign-language text or in muted hyperlinks. The assistant processes this content but does not read it aloud when handling requests about messaging notifications, which allows the attack to circumvent Google’s safeguards.
The vulnerability was particularly alarming in hands-free scenarios, such as driving, where users depend on voice interaction with Gemini. Attackers could trigger dangerous actions such as controlling smart home devices via Google Home, starting Zoom calls, fabricating misleading messages that appear to come from trusted contacts, and establishing ongoing control by compromising the assistant’s memory.
Wider Implications and Security Recommendations
SafeBreach emphasized that the potential attack surface grows significantly as AI assistants become more integrated into daily life. The firm said notification-based attacks show that indirect prompt injection is feasible even through highly trusted communication channels.
The firm urged organizations and vendors to move beyond localized fixes and to rethink how AI systems evaluate trust, context, and cross-channel permissions to ensure user safety.
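One way to picture the idea of evaluating trust and cross-channel permissions is the minimal Python sketch below. It is an illustration only, not Google’s fix or SafeBreach’s recommendation: the trust labels, the action names, and the `is_action_allowed` helper are all hypothetical. The sketch tags each piece of context with its origin and refuses sensitive actions whenever third-party notification content is present in the context.

```python
from dataclasses import dataclass
from enum import Enum


class Trust(Enum):
    USER = "user"                  # spoken or typed directly by the user
    NOTIFICATION = "notification"  # third-party content, e.g. a message body


@dataclass(frozen=True)
class ContextItem:
    text: str
    trust: Trust


# Hypothetical action names, chosen for illustration only.
SENSITIVE_ACTIONS = {"control_smart_home", "start_video_call", "write_memory"}


def is_action_allowed(action: str, context: list[ContextItem]) -> bool:
    """Allow a sensitive action only if every context item came from the user."""
    if action not in SENSITIVE_ACTIONS:
        return True
    return all(item.trust is Trust.USER for item in context)
```

Under this assumed policy, a request to read notifications still works, but a smart home command issued while an untrusted message body sits in the context would be blocked or escalated to an explicit user confirmation.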
SafeBreach has shared video demonstrations showcasing the exploitation of Zoom and Google Home, underscoring the practical risks involved.
