Friday, March 14, 2025

"Fair Use" of Content by AI Models is Another Example of Disruptive New Technology

Humans learn by reading books, watching videos, and experiencing the world, often using copyrighted material like textbooks or movies. This learning is generally not considered copyright infringement, and is often loosely described as “fair use,” as it involves personal absorption rather than copying or distributing.


“Fair use” principles and law come into play if humans create new works. It is not the ideas and concepts that are protected, only their form of expression. So new music, writing, songs, movies or TV shows might mirror existing works, but cannot “copy” them. 


The issue for AI training is that AI systems, particularly machine learning models, learn by training on large data sets, which may include copyrighted content that is copied. One early court case not directly involving generative AI suggests the systems do not enjoy “fair use” protection.  


Fair use is a legal doctrine under U.S. copyright law that permits limited use of copyrighted material without permission, for purposes such as criticism, comment, news reporting, teaching, scholarship, or research. 


A student reading a textbook or watching a documentary is not typically seen as infringing copyright, as the act of learning is personal and does not involve making physical copies. But that’s where computers and models, with their efficient “memory,” raise issues. 


We might argue that human memory is porous enough that “copies” of content are never made, with the possible exception of those humans with “photographic memory.” Computers, obviously, suffer no similar issues. 


So human learning is a mental process. “Plagiarism” is the obvious example of a fair use violation, as it represents a purportedly new creation that really is copying.


Proponents argue that AI training is transformative, as the model learns patterns to generate new content, not to reproduce the original works. 


Opponents argue that AI-generated content competes with originals. But that does not inherently strike some of us as a copyright violation, but “merely” as a case of new competition.



| Aspect | Human Learning | AI Training |
| --- | --- | --- |
| Method of Access | Reading, listening, observing | Copying data into memory/storage |
| Copying Involved | No physical copies, mental absorption | Yes, physical copies for processing |
| Purpose | Personal learning, education | Model training, often commercial |
| Fair Use Application | Relevant for new creations, e.g., quoting | Debated for training process itself |
| Market Impact | Minimal, unless new work competes | Potential, if AI output competes with originals |
| Legal Precedent | Generally accepted, no infringement | Ongoing lawsuits, no clear consensus |


Computer efficiency is among the issues, since an AI model can be trained on millions of books in hours, far surpassing human capacity. Because copyright is largely about protecting commercial products, language models raise the issue of market impact. The concern is not so much that humans or AI models “learn,” but that they can create new content that has commercial implications.


The commercial concern seems to center on the potential increase in content competition, not so much the knowledge ingestion. That is essentially what underlies the concern about huge amounts of AI-created content “drowning out” human authors. 


As often happens, the conflict is between legacy interests and innovators whose new products could disrupt existing economic models. Such conflicts are common when disruptive technologies emerge.


| Industry Affected | Disruptive Innovation | Legacy Industry Concerns | Outcome |
| --- | --- | --- | --- |
| Music Industry (2000s) | Digital music streaming and MP3 sharing (Napster, Spotify) | Loss of album sales, piracy concerns | Industry shifted to streaming models, with revenue-sharing for artists and labels |
| Publishing and Journalism | Google Search and News Aggregators | Decline in ad revenue, loss of control over content distribution | Publishers adapted with paywalls, licensing deals |
| TV and Film Industry | Online video streaming (Netflix, YouTube) | Cord-cutting reduced traditional TV revenue | Studios launched their own streaming services (Disney+, HBO Max) |
| Taxis and Transportation | Ride-sharing apps (Uber, Lyft) | Regulation circumvention, lost driver income | Ride-sharing became mainstream; regulations updated over time |
| Retail (Brick-and-Mortar Stores) | E-commerce (Amazon, Shopify) | Store closures, price undercutting | Traditional retailers shifted online or hybrid models |
| Finance and Banking | Cryptocurrencies, Fintech (DeFi, PayPal, Square) | Loss of control over transactions, regulatory concerns | Banks embraced fintech partnerships, crypto regulations emerged |
| Photography and Film | Digital cameras and smartphones | Film sales collapsed, Kodak and Fujifilm disrupted | Kodak filed for bankruptcy; digital photography dominated |
| Telecom (Landlines and SMS) | VoIP, Messaging apps (Skype, WhatsApp) | Decline in SMS and landline revenue | Telcos adapted by offering data-driven pricing models |
| AI and Content Creation | Generative AI (ChatGPT, Midjourney) | Copyright concerns, job displacement fears | Legal battles ongoing; potential for licensing frameworks |


Fair use of content “scraped” by AI models is another example of a clash of perceived business interests.

