- Text Ars Technica
- Text Reuters
-
Court Order Forces Indefinite Retention of ChatGPT Data, Including Deleted Chats
-
A court order in the New York Times v. OpenAI copyright case requires OpenAI to indefinitely preserve all ChatGPT output logs, including deleted user chats, as potential evidence.
-
The order affects users of ChatGPT Free, Plus, Pro, and API—excluding ChatGPT Enterprise, Edu, or those under Zero Data Retention agreements.
-
OpenAI must store this data in a segregated, secure system accessible only to a small, audited legal and security team, under legal hold conditions.
-
-
Legal Backlash and Privacy Concerns
-
OpenAI is actively appealing the order, calling it an overreach that undermines user trust and may be incompatible with GDPR, particularly the “right to be forgotten.”
-
CEO Sam Altman and COO Brad Lightcap have stated the company will fight demands that compromise user privacy, positioning this as a core principle.
-
The order arose from speculation that users could have used ChatGPT to circumvent NYT paywalls, prompting fears that deleted chats could show infringement.
-
-
Origins and Broader Legal Implications
-
The lawsuit, filed in 2023 by the New York Times against OpenAI and Microsoft, alleges unauthorized use of millions of NYT articles to train ChatGPT.
-
A federal judge previously ruled that the Times had plausibly shown that OpenAI and Microsoft might have encouraged copyright infringement, citing widely publicized examples of ChatGPT reproducing NYT content.
-
OpenAI argues the preservation order sets a dangerous precedent for how user data may be compelled and retained in future copyright disputes.
-
-
Potential User Fallout
-
OpenAI has issued an FAQ to clarify that retained chats won’t be shared automatically with the NYT and reassures users that it is taking steps to limit data access and ensure compliance with global privacy laws.
-
Nonetheless, the case has triggered concern among users, with some reportedly considering switching platforms over the perceived erosion of privacy guarantees.
-
This summary reflects both the legal stakes and the real-world privacy concerns raised by the order, as well as OpenAI’s evolving response strategy.
- Text WSJ
- Text TechCrunch
-
Reddit Alleges Anthropic Illegally Used Its Data Without a License
-
Reddit filed a lawsuit in California claiming Anthropic scraped its site over 100,000 times despite having no data licensing agreement and after saying it had stopped.
-
The complaint states Anthropic trained its models—including the recently launched Claude Opus 4—on Reddit content without user consent, violating Reddit’s user data policies and terms of service.
-
Reddit emphasizes that its public content policy includes privacy safeguards (e.g., excluding deleted posts from licensing deals).
-
-
Reddit Positions Itself as a Willing Licensor, Not an Open Buffet
-
Reddit has formal licensing deals with OpenAI and Google, allowing them to use Reddit data under terms that protect users’ privacy.
-
Ben Lee, Reddit’s chief legal officer, stated: “We believe in an open internet. That does not mean open for exploitation.”
-
Reddit argues that companies profiting from AI training should compensate platforms that provide the underlying data.
-
-
Anthropic Accused of Ignoring Repeated Warnings
-
Reddit says it clearly notified Anthropic it was not authorized to scrape or use Reddit content and that Anthropic “refused to engage.”
-
The lawsuit claims Anthropic’s bots ignored robots.txt protocols and continued accessing Reddit’s data even after 2024 assurances to the contrary.
-
Reddit seeks compensatory damages, restitution for unjust enrichment, and a court order barring further use of its data.
-
-
Part of a Broader Legal Trend Against AI Training Practices
-
Reddit is the first Big Tech firm to file suit over unauthorized data use for AI training, joining lawsuits from The New York Times, book authors, and music publishers targeting companies like OpenAI, Meta, and others.
-
The case underscores the growing pressure on AI developers to obtain explicit data rights, as more content providers move to monetize or protect their user-generated material.
-
This lawsuit sharpens the divide between platforms licensing data responsibly and AI developers accused of bypassing those channels. The case could set a precedent for how courts handle data scraping vs. licensing in the AI era
- Text The Information
-
Alibaba’s Qwen Models Have Emerged as Global Leaders in Open-Source AI
-
After internal resistance and early reliance on rival models like Meta’s Llama and DeepSeek R1, Alibaba’s Qwen models now outperform Llama and rival DeepSeek in several benchmarks, especially following the April 2024 release of Qwen3, a suite of eight fully open-source models.
-
Qwen models are now used by over 290,000 customers across industries, and Alibaba’s own business units—once skeptical—have switched back from DeepSeek’s R1 to Qwen3, citing performance and versatility.
-
Global adoption is increasing: Japanese firm Abeja built local LLMs using Qwen, and even Nvidia based its Cosmos-Reason1 AI on a Qwen model, according to a recent research paper.
-
-
Strategic Focus on Open-Source Spurred Adoption and Performance Gains
-
Alibaba strategically pivoted from proprietary development to community-driven open-source releases, gaining valuable feedback from researchers and startups.
-
Qwen2.5, released in late 2024, surpassed Llama 3 and signaled Alibaba’s breakthrough in developer credibility; Qwen3 models continue this trajectory.
-
CEO Eddie Wu and CTO Zhou Jingren prioritized open-source as a long-term innovation driver, positioning Qwen as a core platform for startups and enterprise agents.
-
-
China’s AI Ecosystem Now Rivals—and Sometimes Leads—U.S. Development
-
Alibaba and DeepSeek are pushing China to the forefront of open-source AI, a space previously dominated by U.S. firms like Meta, OpenAI, and Google.
-
Widespread use of open-source AI, which is lower-cost and highly adaptable, is helping Chinese firms gain international influence in AI development and deployment—especially in countries and sectors wary of U.S. tech dominance.
-
-
Alibaba’s Decentralized Structure Accidentally Accelerated Model Quality
-
Alibaba’s post-antitrust restructuring into six autonomous units forced Qwen developers to earn internal adoption, intensifying their focus on practical performance and cost-efficiency.
-
This internal competition sharpened Qwen’s capabilities, as model teams had to win over skeptical internal clients before persuading external customers.
-
Now, Qwen3 is not only uniting previously siloed Alibaba teams but also helping to build interoperable agent ecosystems across business units.
-
Alibaba’s success story reflects not just technical execution but also a strategic and organizational evolution—one that U.S. tech giants may have to study closely as open-source becomes a dominant force in global AI.
No comments:
Post a Comment
Your comments will be greatly appreciated ... Or just click the "Like" button above the comments section if you enjoyed this blog note.