Monday, May 20, 2024
Courthouse News Service

Group of daily newspapers hit Microsoft and OpenAI with copyright suit over AI

The newspaper publishers say the artificial intelligence companies siphon off news organizations’ revenues while benefiting from “mass copyright infringement.”

MANHATTAN (CN) — A coalition of eight newspapers owned by the MediaNews Group and Tribune Publishing companies sued ChatGPT makers OpenAI and Microsoft on Tuesday claiming large-scale copyright infringement of the publishers’ articles without permission or payment, to fuel the commercialization of their artificial intelligence products including ChatGPT and Copilot.

The publishers accuse the AI developers of drawing from large swaths of copyrighted articles online to “train” the large language models that enhance the apps’ ability to generate natural language text in a variety of styles.

The civil complaint filed Tuesday morning in the Southern District of New York includes Tribune Publishing’s Chicago Tribune, Orlando Sentinel and South Florida Sun Sentinel and the New York Daily News, and MediaNews Group-owned Mercury News, Denver Post, Orange County Register and St. Paul Pioneer-Press as plaintiffs.

Frank Pine, executive editor for MediaNews Group and Tribune Publishing, said the misappropriation of news content by OpenAI and Microsoft undermines the business model for news publishers.

“We’ve spent billions of dollars gathering information and reporting news at our publications, and we can’t allow OpenAI and Microsoft to expand the Big Tech playbook of stealing our work to build their own businesses at our expense,” he said in a statement Tuesday morning. “They pay their engineers and programmers, they pay for servers and processors, they pay for electricity, and they definitely get paid from their astronomical valuations, but they don’t want to pay for the content without which they would have no product at all. That’s not fair use, and it’s not fair. It needs to stop.”

Represented by Steven Lieberman from the Washington-based firm Rothwell, Figg, Ernst & Manbeck, the publishers accuse OpenAI of reaching its $90 billion valuation as “a joint enterprise based on mass copyright infringement.”

“Despite its early promises of altruism, OpenAI quickly became a multibillion-dollar for-profit business built in large part on the unlicensed exploitation of copyrighted works belonging to publishers and others,” the newspaper publishers say in the complaint.

The publishers claim that collectively, content from their websites accounts for at least 124 million basic pieces of text included in the Common Crawl depository of data constantly dredged from the open internet and used to train the software’s large language models.

The Common Crawl dataset includes 48 million “tokens” (basic units of text) from the Chicago Tribune, 22 million from the New York Daily News, 12 million from the Mercury News, 11 million from the Orlando Sentinel, 11 million from the Sun Sentinel, 9.8 million from the Denver Post, 6.5 million from the Orange County Register, and 3.2 million from the Pioneer Press.

The publishers also accused the AI apps of “actively tarnishing the newspapers’ reputations and spreading dangerous disinformation” by falsely attributing inaccurate reporting to their newspapers. That reporting, they say, consists of AI-generated “hallucinations”: botched replications of articles that result in nonsense text.

The newspapers’ civil suit is the latest in a series of lawsuits brought in Manhattan federal court accusing OpenAI and Microsoft of infringing copyrights when training their large language models to develop algorithms that allow anyone to generate similar texts they would otherwise pay writers to create.

The first of these cases to be filed in the Southern District of New York was brought by the Authors Guild professional organization for writers in September 2023. Seventeen authors joined that suit, including “Game of Thrones” creator George R.R. Martin and authors Elin Hilderbrand, Jonathan Franzen and Jodi Picoult.

The New York Times filed a suit four months later in the same court seeking to curb OpenAI’s practice of using the newspaper’s stories to train its chatbots.

U.S. District Judge Sidney Stein subsequently consolidated the Authors Guild and New York Times’ cases with two others.

Discovery in the consolidated cases is now underway and due to be completed by September, and summary judgment briefing is due in early 2025.

Categories / Media, Technology