Cyber Web Spider Blog – News

Reddit to Block Internet Archive as AI Companies Have Scraped Data From Wayback Machine

Posted on August 12, 2025 By CWS

Reddit has announced plans to significantly limit the Internet Archive’s Wayback Machine from indexing its platform, citing concerns that AI companies have been exploiting the archival service to circumvent Reddit’s data protection policies.

The move represents another escalation in Reddit’s ongoing battle to control access to its user-generated content amid the AI training data boom.

Key Takeaways
1. The Wayback Machine will only be able to archive Reddit’s homepage, not individual posts or comments.
2. Companies have been using archived data to bypass Reddit’s direct access restrictions.
3. Reddit prefers paid licensing deals over free data access.

Blocking Wayback Machine Access

Starting today, Reddit will implement what it calls “ramping up” restrictions that will block the Wayback Machine from accessing post detail pages, comment threads, and user profiles.

The Internet Archive will only retain the ability to index Reddit’s homepage, effectively limiting historical data to snapshots of trending headlines and popular posts on given dates.

“Internet Archive provides a service to the open web, but we’ve been made aware of instances where AI companies violate platform policies, including ours, and scrape data from the Wayback Machine,” Reddit spokesperson Tim Rathschmidt explained.

The company has identified specific instances where AI training companies have used the robots.txt bypass capabilities inherent in archived content to access Reddit data that would otherwise be restricted by the platform’s existing API rate limiting and crawler blocking mechanisms.

Reddit’s technical implementation will likely involve updating its robots.txt file with specific User-Agent strings targeting Internet Archive crawlers, while potentially implementing server-side blocking based on IP ranges associated with the Wayback Machine’s infrastructure.

This approach mirrors the platform’s existing strategy of blocking search engine crawlers unless companies enter paid licensing agreements.
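
The robots.txt and IP-range approach described above can be sketched in Python using only the standard library. This is a minimal illustration, not Reddit’s actual configuration: the `archive.org_bot` User-Agent token, the disallowed path prefixes, and the `192.0.2.0/24` range (a reserved documentation network) are assumptions chosen for the example, and `build_parser`, `is_blocked_ip`, and `may_crawl` are hypothetical helper names.

```python
import ipaddress
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: post/comment pages and user profiles are
# disallowed for one crawler token, so the homepage stays reachable.
# "archive.org_bot" is an assumed token, not taken from the article.
ROBOTS_TXT = """\
User-agent: archive.org_bot
Disallow: /r/
Disallow: /user/
"""

# Placeholder network (TEST-NET-1); real crawler ranges are not given in the source.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_blocked_ip(client_ip):
    """Return True if the client address falls inside a blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


def may_crawl(parser, user_agent, url, client_ip):
    """Combine the server-side IP check with the robots.txt rule check."""
    if is_blocked_ip(client_ip):
        return False
    return parser.can_fetch(user_agent, url)
```

Note that robots.txt is advisory and only works when a crawler chooses to honor it, which is why the sketch pairs it with a server-side IP check. Python’s `urllib.robotparser` matches rules by path prefix, so the example lists the prefixes to block rather than expressing a “homepage only” rule directly.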

This restriction forms part of Reddit’s comprehensive approach to monetizing its data assets in the AI era.

The platform has entered into significant deals with Google and OpenAI for legitimate data access, while simultaneously pursuing legal action against companies like Anthropic for allegedly continuing to scrape content after claiming to have stopped.

Reddit’s 2023 API pricing changes, which effectively shuttered popular third-party applications, were justified using similar reasoning about preventing unauthorized AI training.

The company has implemented rate limiting, authentication requirements, and usage monitoring across its technical infrastructure to maintain control over data access.
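
To make the combination of rate limiting, authentication, and usage monitoring concrete, here is a small Python sketch built around a token bucket, one common rate-limiting technique. It is an illustration under stated assumptions and says nothing about how Reddit’s systems actually work: the `RateLimiter` class, its parameters, and the `"demo-key"` credential are all hypothetical.

```python
import time
from collections import Counter


class RateLimiter:
    """Token-bucket limiter keyed by API key, with a simple usage counter."""

    def __init__(self, capacity, refill_per_sec, known_keys):
        self.capacity = float(capacity)          # maximum burst size per key
        self.refill_per_sec = float(refill_per_sec)
        self.known_keys = set(known_keys)        # stand-in for real authentication
        self.buckets = {}                        # api_key -> (tokens, last_update)
        self.usage = Counter()                   # api_key -> number of served requests

    def check(self, api_key, now=None):
        """Return "unauthorized", "rate_limited", or "ok" for one request."""
        if api_key not in self.known_keys:
            return "unauthorized"
        now = time.monotonic() if now is None else now

        # New keys start with a full bucket at the time of their first request.
        tokens, last_update = self.buckets.get(api_key, (self.capacity, now))

        # Refill in proportion to elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - last_update)
        tokens = min(self.capacity, tokens + elapsed * self.refill_per_sec)

        if tokens < 1.0:
            self.buckets[api_key] = (tokens, now)
            return "rate_limited"

        # Spend one token and record the request for usage monitoring.
        self.buckets[api_key] = (tokens - 1.0, now)
        self.usage[api_key] += 1
        return "ok"
```

Passing `now` explicitly keeps the behavior deterministic for testing; in normal use the limiter falls back to `time.monotonic()`. The `usage` counter only tallies requests that were actually served, which is one possible design choice among several.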

Mark Graham, director of the Wayback Machine, acknowledged ongoing discussions with Reddit regarding the matter, suggesting potential technical solutions may be explored.

However, Reddit’s position appears firm: until the Internet Archive can guarantee compliance with platform policies regarding user privacy and respect for content deletion, access will remain severely restricted.

This development highlights the growing tension between open web archival principles and commercial data control in the AI training landscape.

Copyright © 2026 Cyber Web Spider Blog – News.