Tuesday, November 12, 2024

Commercial Agreements on AI Model Use of Content are Coming

Statements opposing the uncompensated use of creative works for artificial intelligence model training are not unusual. Nor are efforts to defuse the issue through revenue-sharing mechanisms.


One example is a new licensing deal between Meta and Reuters allowing Meta AI to use Reuters content when a user query involves current events and news. Meta's AI chatbot is integrated into the search and messaging features on Facebook, Instagram, WhatsApp and Messenger. 


So even if Meta has been downplaying news content in its main feeds, user inquiries to Meta AI might often be about news. The new agreement means Meta AI can cite Reuters and link to its coverage, while Reuters is compensated for such use.


The ultimate resolution is likely to be some form of payment from model owners to copyright owners, even if an argument can be made that the models are not infringing copyright.


Both AI crawlers and humans consume large amounts of information to learn and develop understanding. The models are different mostly because of their efficiency, compared to human consumption.


AI models and human brains both identify patterns and extract meaningful insights from the content they process, and in both cases the acquired knowledge is used to generate new ideas, solve problems, or create content.


The key difference, one might argue, is that the AI crawlers can process vastly larger amounts of data at much higher speeds than humans. And that might matter for copyright as it is not an infringement to read a book, watch a film or video or listen to a recording. That, in fact, is how people learn. 


And, in fact, one might argue that “copying” or “imitating” an existing bit of content also is part of the training or learning process. In art school, students often have exercises where they attempt to mimic the styles of renowned painters, for example, learning to paint in those styles. 


And since copyright protections do not last forever, even the “good copy” of an old work is not an infringement. The analogy then is that the mere act of ingesting content is not, under existing law, a copyright violation. 


And yet that is essentially what happens when an AI web crawler indexes content on the internet. So some might argue that AI crawlers are not infringing copyright simply by indexing.


AI models also can retain and access all processed information, unlike human memory which is more limited. 


The issue is whether that quantitative capability also is a qualitative difference when applying notions of copyright. 


AI web crawlers can process enormous volumes of web content at speeds far exceeding human capacity. While a human might spend hours or days reading through websites, an AI crawler can scan and index millions of pages in a matter of minutes or hours. 


That quantitative difference might not be a reason to differentiate such operations from the human process of reading, listening or viewing, for purposes of copyright. 


In fact, even the creation of content should not be a copyright infringement if the way ideas are presented is different from that of the crawled content. Recall that ideas themselves cannot be copyrighted, only the form in which ideas are presented. 


That’s the principle some would invoke. Commercial reality is another matter. Licensing might be the less-expensive way to move forward, compared to litigation.

