A federal judge in New York has ordered OpenAI to turn over 20 million anonymized user logs from ChatGPT to the plaintiffs in a major copyright lawsuit involving AI. The judge made this decision despite OpenAI's privacy concerns, upholding an earlier ruling on the matter.
District Judge Sidney H. Stein affirmed Magistrate Judge Ona T. Wang's November order on January 5, 2026, in the Southern District of New York, ruling that privacy safeguards adequately balance privacy concerns against the logs' relevance to the infringement claims.
OpenAI objected, arguing that producing the full dataset, which represents 0.5% of its preserved logs, was unduly burdensome and risked exposing user data, and instead proposed searching for conversations that reference the plaintiffs' works. Stein rejected this, noting that no case law mandates the "least burdensome" discovery method.
The saga began in July 2025 when news outlets, including The New York Times Co. and Chicago Tribune Co. LLC, sought 120 million logs to probe whether ChatGPT outputs infringed their copyrights by reproducing trained-on content.
OpenAI offered a dataset of 20 million samples, which the plaintiffs accepted, but the company later resisted full production, arguing that 99.99% of the logs were irrelevant.
Wang sided with the plaintiffs in November, denied OpenAI's motion for reconsideration in December, and Stein's affirmation seals the deal under a protective order with de-identification protocols, Bloomberg reported.
OpenAI invoked a Second Circuit securities case that blocked SEC wiretap disclosures, but Stein sharply distinguished it: ChatGPT logs involve voluntary user inputs and undisputed company possession, unlike surreptitious recordings. "Users' privacy can be protected by the company's exhaustive de-identification," Wang had ruled earlier.
This ruling advances pretrial discovery in In re OpenAI, Inc. Copyright Infringement Litigation (No. 1:25-md-03143), which consolidates 16 suits from news organizations, authors, and others alleging unauthorized use of their works to train large language models.
It mirrors dozens of cases against AI firms such as Microsoft and Meta, testing copyright's application to generative technology amid debates over fair use and data scraping.
Plaintiffs argue the logs are vital to rebut OpenAI's claims that they "hacked" responses to generate evidence, and to assess the scope of infringement. OpenAI maintains that anonymization and court orders suffice, with no user privacy at risk.
The decision spotlights tensions between discovery proportionality and AI data hoards, potentially setting precedents for similar cases. Critics worry that bulk log handovers could chill user trust in chatbots, while supporters see it as essential transparency.
OpenAI, represented by Keker Van Nest, Latham & Watkins, and Morrison & Foerster, faces production deadlines soon.
As AI litigation proliferates, this order underscores courts' willingness to compel the production of expansive evidence, even when anonymized, to probe training data practices. For content creators, it bolsters tools to challenge AI's copyright encroachments; for tech giants, it signals growing scrutiny of user data vaults.
Dr. Kolochenko, CEO at ImmuniWeb, told Cyber Security News that "For OpenAI, this decision is truly a legal debacle, which will encourage other plaintiffs in similar cases to do the same to prevail in courts or to get significantly better settlements from AI companies."
"This case is also a telling reminder that, regardless of your privacy settings, your interactions with AI chatbots and other systems may one day be produced in court. The architecture of modern LLMs and their underlying technology stack is very complex, so even if some user-facing systems are specifically configured to delete chat logs and history, others may inevitably preserve them in one form or another. In some cases, produced evidence may trigger investigations or even criminal prosecution of AI users."
