Skip to content
  • Blog Home
  • Cyber Map
  • About Us – Contact
  • Disclaimer
  • Terms and Rules
  • Privacy Policy
Cyber Web Spider Blog – News

Cyber Web Spider Blog – News

Globe Threat Map provides a real-time, interactive 3D visualization of global cyber threats. Monitor DDoS attacks, malware, and hacking attempts with geo-located arcs on a rotating globe. Stay informed with live logs and archive stats.

  • Home
  • Cyber Map
  • Cyber Security News
  • Security Week News
  • The Hacker News
  • How To?
  • Toggle search form

Echo Chamber Jailbreak Tricks LLMs Like OpenAI and Google into Generating Harmful Content

Posted on June 23, 2025June 23, 2025 By CWS

Jun 23, 2025Ravie LakshmananLLM Safety / AI Safety
Cybersecurity researchers are calling consideration to a brand new jailbreaking methodology known as Echo Chamber that may very well be leveraged to trick standard giant language fashions (LLMs) into producing undesirable responses, regardless of the safeguards put in place.
“In contrast to conventional jailbreaks that depend on adversarial phrasing or character obfuscation, Echo Chamber weaponizes oblique references, semantic steering, and multi-step inference,” NeuralTrust researcher Ahmad Alobaid mentioned in a report shared with The Hacker Information.
“The result’s a delicate but highly effective manipulation of the mannequin’s inner state, steadily main it to supply policy-violating responses.”
Whereas LLMs have steadily integrated varied guardrails to fight immediate injections and jailbreaks, the newest analysis reveals that there exist strategies that may yield excessive success charges with little to no technical experience.

It additionally serves to spotlight a persistent problem related to growing moral LLMs that implement clear demarcation between what subjects are acceptable and never acceptable.
Whereas widely-used LLMs are designed to refuse consumer prompts that revolve round prohibited subjects, they are often nudged in the direction of eliciting unethical responses as a part of what’s known as a multi-turn jailbreaking.
In these assaults, the attacker begins with one thing innocuous after which progressively asks a mannequin a sequence of more and more malicious questions that finally trick it into producing dangerous content material. This assault is known as Crescendo.
LLMs are additionally vulnerable to many-shot jailbreaks, which reap the benefits of their giant context window (i.e., the utmost quantity of textual content that may match inside a immediate) to flood the AI system with a number of questions (and solutions) that exhibit jailbroken conduct previous the ultimate dangerous query. This, in flip, causes the LLM to proceed the identical sample and produce dangerous content material.
Echo Chamber, per NeuralTrust, leverages a mix of context poisoning and multi-turn reasoning to defeat a mannequin’s security mechanisms.
Echo Chamber Assault
“The principle distinction is that Crescendo is the one steering the dialog from the beginning whereas the Echo Chamber is type of asking the LLM to fill within the gaps after which we steer the mannequin accordingly utilizing solely the LLM responses,” Alobaid mentioned in a press release shared with The Hacker Information.
Particularly, this performs out as a multi-stage adversarial prompting method that begins with a seemingly-innocuous enter, whereas steadily and not directly steering it in the direction of producing harmful content material with out making a gift of the tip purpose of the assault (e.g., producing hate speech).
“Early planted prompts affect the mannequin’s responses, that are then leveraged in later turns to bolster the unique goal,” NeuralTrust mentioned. “This creates a suggestions loop the place the mannequin begins to amplify the dangerous subtext embedded within the dialog, steadily eroding its personal security resistances.”

In a managed analysis atmosphere utilizing OpenAI and Google’s fashions, the Echo Chamber assault achieved a hit fee of over 90% on subjects associated to sexism, violence, hate speech, and pornography. It additionally achieved almost 80% success within the misinformation and self-harm classes.

“The Echo Chamber Assault reveals a crucial blind spot in LLM alignment efforts,” the corporate mentioned. “As fashions change into extra able to sustained inference, in addition they change into extra susceptible to oblique exploitation.”

The disclosure comes as Cato Networks demonstrated a proof-of-concept (PoC) assault that targets Atlassian’s mannequin context protocol (MCP) server and its integration with Jira Service Administration (JSM) to set off immediate injection assaults when a malicious assist ticket submitted by an exterior risk actor is processed by a assist engineer utilizing MCP instruments.
The cybersecurity firm has coined the time period “Residing off AI” to explain these assaults, the place an AI system that executes untrusted enter with out enough isolation ensures could be abused by adversaries to realize privileged entry with out having to authenticate themselves.
“The risk actor by no means accessed the Atlassian MCP straight,” safety researchers Man Waizel, Dolev Moshe Attiya, and Shlomo Bamberger mentioned. “As a substitute, the assist engineer acted as a proxy, unknowingly executing malicious directions by way of Atlassian MCP.”

Discovered this text attention-grabbing? Comply with us on Twitter  and LinkedIn to learn extra unique content material we publish.

The Hacker News Tags:Chamber, Content, Echo, Generating, Google, Harmful, Jailbreak, LLMs, OpenAI, Tricks

Post navigation

Previous Post: DHS Warns Pro-Iranian Hackers Likely to Target U.S. Networks After Iranian Nuclear Strikes
Next Post: Critical Teleport Vulnerability Let Attackers Remotely Bypass Authentication Controls

Related Posts

295 Malicious IPs Launch Coordinated Brute-Force Attacks on Apache Tomcat Manager The Hacker News
Microsoft Helps CBI Dismantle Indian Call Centers Behind Japanese Tech Support Scam The Hacker News
Simple Steps for Attack Surface Reduction The Hacker News
Step Into the Password Graveyard… If You Dare (and Join the Live Session) The Hacker News
Türkiye Hackers Exploited Output Messenger Zero-Day to Drop Golang Backdoors on Kurdish Servers The Hacker News
Critical Sudo Vulnerabilities Let Local Users Gain Root Access on Linux, Impacting Major Distros The Hacker News

Categories

  • Cyber Security News
  • How To?
  • Security Week News
  • The Hacker News

Recent Posts

  • Hackers Leveraging WhatsApp That Silently Harvest Logs and Contact Details
  • Elite Cyber Veterans Launch Blast Security with $10M to Turn Cloud Detection into Prevention
  • PoC released for W3 Total Cache Vulnerability that Exposes 1+ Million Websites to RCE Attacks
  • CISA Confirms Exploitation of Recent Oracle Identity Manager Vulnerability
  • 800+ npm Packages and Thousands of GitHub Repos Compromised

Pages

  • About Us – Contact
  • Disclaimer
  • Privacy Policy
  • Terms and Rules

Archives

  • November 2025
  • October 2025
  • September 2025
  • August 2025
  • July 2025
  • June 2025
  • May 2025

Recent Posts

  • Hackers Leveraging WhatsApp That Silently Harvest Logs and Contact Details
  • Elite Cyber Veterans Launch Blast Security with $10M to Turn Cloud Detection into Prevention
  • PoC released for W3 Total Cache Vulnerability that Exposes 1+ Million Websites to RCE Attacks
  • CISA Confirms Exploitation of Recent Oracle Identity Manager Vulnerability
  • 800+ npm Packages and Thousands of GitHub Repos Compromised

Pages

  • About Us – Contact
  • Disclaimer
  • Privacy Policy
  • Terms and Rules

Categories

  • Cyber Security News
  • How To?
  • Security Week News
  • The Hacker News

Copyright © 2025 Cyber Web Spider Blog – News.

Powered by PressBook Masonry Dark