Wednesday, November 20, 2024

Content Licensing Deals to Train AI Proliferate

As in earlier generations of conflict between content owners (media firms, for example) and new types of firms (search, social media), disputes over the training of large language models are being resolved in similar fashion: through licensing deals.


Microsoft, for example, recently signed a deal with News Corp.’s HarperCollins allowing “select non-fiction back titles” to be used for training artificial intelligence models, if individual authors agree.


The content is reportedly for a new model Microsoft is building, one that is not intended to “write books.”


Such deals have become more common as model owners work to defuse content owner objections to AI training using their copyrighted works. 


| Content Owner | AI Company | Deal Details | Payments |
| --- | --- | --- | --- |
| News Corp | OpenAI | 5-year deal for access to current and archived content from publications like The Wall Street Journal, The New York Post, The Times, etc. Includes display of content in response to user queries and sharing of journalistic expertise | Over $250 million over 5 years |
| Various Publishers | OpenAI | Annual licensing deals for training AI models, including companies like The Associated Press, Axel Springer, Prisa Media, Le Monde, and Financial Times | $1 million to $5 million per year |
| The Atlantic | OpenAI | Access to archives for AI model training and collaboration on product development, including an experimental microsite | Not specified |
| Vox Media | OpenAI | Access to archives for AI model training and assistance in creating products for consumers and advertising partners | Not specified |
| Hearst | OpenAI | Licensing deal for content use in training AI models | Not specified |
| Mumsnet, The Center for Investigative Reporting | OpenAI | No deal; instead, these entities have initiated legal complaints against OpenAI | - |
| Conde Nast, NBC News, IAC (People and Daily Beast owner) | Apple | Discussions for licensing content archives for AI training, but no public deals announced yet | At least $50 million over a multiyear period (reported offer) |
| Financial Times, Axel Springer, The Atlantic, Fortune | Prorata.ai | Licensing deal with revenue-sharing model; 50% of subscription revenue shared with content creators | Revenue-sharing basis |
| Time, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, Automattic (WordPress.com owner) | Perplexity | Revenue-sharing deal with access to analytics and technology for creating custom answer engines | Revenue-sharing basis |
| Reddit | Google | Licensing deal for user-generated content to train AI models | Not specified |


source: Seeking Alpha 



| Content Owner | AI/Search/Social Media Firm | Deal Details | Payments |
| --- | --- | --- | --- |
| News Corp | OpenAI | Access to current and archived content from publications like The Wall Street Journal, The New York Post, The Times, etc. for training AI models and displaying content in response to user queries. Includes sharing of journalistic expertise | Over $250 million over 5 years |
| The Associated Press | OpenAI | Licensing deal for training AI models and developing technology for news gathering | $1 million to $5 million per year |
| Axel Springer | OpenAI | Licensing deal for training AI models and developing technology for news gathering | $1 million to $5 million per year |
| Prisa Media | OpenAI | Licensing deal for training AI models and developing technology for news gathering | $1 million to $5 million per year |
| Le Monde | OpenAI | Licensing deal for training AI models and developing technology for news gathering | $1 million to $5 million per year |
| Financial Times | OpenAI | Licensing deal for training AI models and developing technology for news gathering | $1 million to $5 million per year |
| Hearst | OpenAI | Licensing deal for training AI models | $1 million to $5 million per year |
| Time, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, Automattic (WordPress.com owner) | Perplexity | Revenue-sharing deal with access to analytics and technology to create custom answer engines. Revenue generated from sponsored related questions will be shared with publishers | Revenue-sharing basis |
| Conde Nast, NBC News, IAC (People and Daily Beast owner) | Apple | Discussions for licensing content archives, but no public deals announced yet. Apple is offering more substantial remuneration for broader rights to use the content | At least $50 million over a multiyear period (reported offer) |
| Reddit | Google | Licensing deal for user-generated content to train AI models | Not specified |
| Mumsnet, The Center for Investigative Reporting | OpenAI | No deal yet. Instead, these entities have initiated legal complaints against OpenAI | |
| The New York Times | OpenAI | No deal yet. The New York Times is suing OpenAI and Microsoft for copyright infringement | |


