Cyber Web Spider Blog – News

AI Vision Models Vulnerable to Subtle Image Manipulations

Posted on May 7, 2026 By CWS

Cisco’s AI Threat Intelligence and Security Research division has published new findings about the vulnerabilities of vision-language models (VLMs), AI systems that interpret visual data. The study reveals that these models can be manipulated by attackers through imperceptible alterations to images.

Exploiting AI with Hidden Instructions

The research demonstrates that attackers can embed commands in images, such as webpage banners or document previews, in a form that humans cannot detect. These commands can instruct an AI to carry out harmful actions such as data exfiltration. To a person they look like visual noise, but the AI system can interpret and act on them.

This investigation builds on earlier work that linked visual distortions in text-bearing images to how effective those images are as attacks on VLMs. Distortions such as small fonts and heavy blurring were found to lower the likelihood of a successful attack.

Advancements in Attack Techniques

The second phase of Cisco’s research, released on Thursday, examines whether the mathematical distance between the distorted image and its readable form can be minimized. Researchers applied pixel-level changes to images that had initially failed as attacks, either because the model could not read them or because its safety mechanisms blocked the request.
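The idea of shrinking that distance with small pixel changes can be illustrated with a toy sketch. This is not Cisco's method or code: the linear `embed` function stands in for a real image encoder, and the names `nudge_towards`, `eps`, and `step` and the random data are assumptions chosen only for illustration. The sketch uses signed-gradient steps with a per-pixel budget, a common textbook approach, and keeps to the Python standard library.

```python
import random

def embed(weights, pixels):
    """Toy encoder (assumption): a fixed linear map from pixels to an embedding."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nudge_towards(weights, start, target_emb, eps=0.05, step=0.005, iters=200):
    """Signed-gradient descent on embedding distance. Every pixel stays
    within eps of its starting value and inside [0, 1]."""
    x = list(start)
    best, best_dist = list(x), distance(embed(weights, x), target_emb)
    for _ in range(iters):
        residual = [e - t for e, t in zip(embed(weights, x), target_emb)]
        # Gradient of the squared distance per pixel: 2 * W^T * residual.
        grad = [2.0 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
                for j in range(len(x))]
        for j, g in enumerate(grad):
            moved = x[j] - step * ((g > 0) - (g < 0))  # step against the sign
            moved = min(max(moved, start[j] - eps), start[j] + eps)  # budget
            x[j] = min(max(moved, 0.0), 1.0)  # valid pixel range
        dist = distance(embed(weights, x), target_emb)
        if dist < best_dist:  # keep the closest iterate seen so far
            best, best_dist = list(x), dist
    return best

# Hypothetical data: a "readable" image and a noisy, distorted copy of it.
rng = random.Random(0)
weights = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(16)]
readable = [rng.uniform(0.2, 0.8) for _ in range(64)]
distorted = [min(max(p + rng.gauss(0, 0.1), 0.0), 1.0) for p in readable]

target = embed(weights, readable)
adjusted = nudge_towards(weights, distorted, target)
before = distance(embed(weights, distorted), target)
after = distance(embed(weights, adjusted), target)
largest_change = max(abs(a - d) for a, d in zip(adjusted, distorted))
print(f"distance before: {before:.3f}, after: {after:.3f}, "
      f"largest pixel change: {largest_change:.3f}")
```

The point of the budget `eps` is that no pixel moves far from its starting value, which is why such changes can stay hard for a person to see even as the embedding shifts.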

These changes were refined using four publicly available AI models, including Qwen3-VL-Embedding and OpenAI CLIP ViT-L/14-336, before being tested on proprietary systems like GPT-4o and Claude. This approach revealed two primary failure modes: readability recovery and refusal reduction.

Impact on AI Systems and Defenses

Readability recovery occurs when an image that was too blurred or too small for the AI to read becomes legible to the model without becoming any clearer to human observers. Refusal reduction describes cases where an AI that previously declined to follow embedded instructions is manipulated into compliance without visible changes to the image.
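The two definitions can be expressed as a small classification rule. The following is a minimal sketch, assuming each trial records whether the model read and complied with the embedded instruction before and after optimization. The `Trial` record and its field names are hypothetical and do not come from the research.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trial:
    """One image tested before and after optimization (hypothetical record)."""
    read_before: bool      # model could read the embedded text originally
    complied_before: bool  # model followed the instruction originally
    read_after: bool       # model could read the text after optimization
    complied_after: bool   # model followed the instruction after optimization

def failure_mode(trial: Trial) -> str:
    """Label a trial with the failure mode it illustrates, if any."""
    if not trial.read_before and trial.read_after:
        # Text was illegible to the model and became legible.
        return "readability recovery"
    if trial.read_before and not trial.complied_before and trial.complied_after:
        # Text was always legible, but a refusal turned into compliance.
        return "refusal reduction"
    return "other"

blurred = Trial(read_before=False, complied_before=False,
                read_after=True, complied_after=False)
refused = Trial(read_before=True, complied_before=False,
                read_after=True, complied_after=True)
print(failure_mode(blurred))
print(failure_mode(refused))
```

Note that readability recovery alone does not imply a successful attack: in the `blurred` example the model reads the text after optimization but still does not comply.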

In trials, attack success against Claude rose from 0% to 28% once the blurred images were optimized, though its safety filter still blocked much of the newly readable content. GPT-4o, by contrast, maintained stronger safety alignment, catching most legible requests even after optimization.
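For readers unfamiliar with the metric, attack success rate is simply the share of trials in which the model followed the embedded instruction. The sketch below shows the arithmetic. The trial counts are hypothetical, chosen only so that the result matches the reported percentages; the article does not state how many trials were run.

```python
def attack_success_rate(outcomes):
    """Percentage of trials in which the attack succeeded.

    `outcomes` is a list of booleans, one per trial (True = model complied).
    """
    if not outcomes:
        raise ValueError("need at least one trial")
    return 100.0 * sum(outcomes) / len(outcomes)

# Hypothetical counts (not from the research): 25 trials per condition.
baseline = [False] * 25                 # no successes before optimization
optimized = [True] * 7 + [False] * 18   # 7 of 25 succeed after optimization

print(attack_success_rate(baseline))
print(attack_success_rate(optimized))
```

With these assumed counts, 7 successes out of 25 trials corresponds to a 28% success rate, against 0% at baseline.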

Future Implications and Defense Strategies

Cisco’s findings underscore the need for robust defenses against typographic attacks that evade simple image filters. As AI systems become more integral to operations, enhancing their ability to resist such subtle manipulations is critical to maintaining data security.

Guarding against this kind of exploitation will require continued advances in AI security practice.

Security Week News Tags: AI safety, AI security, AI threats, AI vulnerabilities, Cisco research, cyber attack techniques, Cybersecurity, image manipulation, vision-language models, VLM attacks

Copyright © 2026 Cyber Web Spider Blog – News.
