Sunday, November 26, 2023

What's the Best Analogy for LLMs?

For large language model use cases, one size does not necessarily fit all, all the time. On the other hand, to the extent that LLMs are conceived of as similar to operating systems, one size, and therefore scale, arguably matters a great deal more.


Looking for historical analogies is one way of trying to understand large language models and other forms of artificial intelligence when assessing business model implications. Where and when is scale essential, and where is it merely helpful?


For example, are LLMs more akin to operating systems, platforms or applications? 


Are LLMs in part “picks and shovels,” which are more like OSes, and in part applications designed to run in loosely coupled ways on any compliant platform? Are LLMs sometimes also platforms? The importance of scale or market share might well hinge on which scenario matters most to particular providers.


Feature-by-feature comparison of LLMs and OSs:

Purpose
LLMs: Generate text, translate languages, write different kinds of creative content, and answer questions in an informative way.
OSs: Manage hardware resources, provide a platform for running software applications, and facilitate user interaction with computer systems.

Underlying technology
LLMs: Artificial neural networks.
OSs: Kernel, device drivers, and user interface software.

Scalability
LLMs: Highly scalable; can be trained on massive amounts of data and run on distributed computing systems.
OSs: Limited by hardware resources; require specialized software and configurations for different types of devices.

Applications
LLMs: Chatbots, virtual assistants, machine translation, content creation, code generation, research.
OSs: Personal computers, servers, mobile devices, embedded systems.

Maturity
LLMs: Relatively new technology, still under development.
OSs: Mature technology with a long history of development.

Adoption
LLMs: Growing rapidly, but still not as widely used as OSs.
OSs: Ubiquitous, used by billions of people worldwide.

Scale
LLMs: Potential to be used on a wide range of devices and platforms.
OSs: Typically designed for specific types of devices and platforms.

Niche AI models
LLMs: Possible, as LLMs can be trained on specialized datasets.
OSs: Less likely, as OSs need to be general-purpose and compatible with a wide range of hardware and software.


OSs have tended to be a scale phenomenon, with a few dominant players controlling the market. This is due to the network effects that exist in the OS market: the more people use an OS, the more valuable it becomes to other users. As a result, it has been difficult for new OSs to gain traction.


However, the landscape for AI models may be different. Maybe the better analogy is social media, e-commerce, search or messaging rather than operating systems. In other words, instead of resembling an operating system or device market, LLMs could resemble application markets, where age, culture, language, tone, content and function can vary widely.


Though scale still matters, apps are far less monolithic than operating systems or consumer devices such as smartphones. In other words, LLMs can enable app use cases that are more variegated than the OS market or device markets tend to be. 


Highly specialized LLMs usable by a single company and its own applications will be possible. So too will apps targeted to different age groups, language groups, cultures and functions. 


Edge computing might also mean it is possible to deploy AI models on devices with limited resources, such as smartphones and IoT devices, creating more “niche” use cases.
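As a hedged illustration of that edge scenario, the sketch below runs a small, quantized model locally rather than calling a hosted, general-purpose service. It assumes the llama-cpp-python bindings are installed and a GGUF model file has already been downloaded; the file path and prompt are hypothetical.

```python
# Minimal sketch of on-device inference with a small, quantized model,
# assuming the llama-cpp-python bindings and a locally downloaded GGUF
# model file (the path below is hypothetical).
from llama_cpp import Llama

# Load a compact model that fits within the memory budget of a phone,
# gateway, or other resource-limited device.
llm = Llama(model_path="./models/small-model-q4.gguf", n_ctx=2048)

# Run a narrow, "niche" task entirely on the device, with no cloud call.
result = llm(
    "Summarize these sensor readings in one sentence: 21C, 45% humidity, door closed.",
    max_tokens=64,
)
print(result["choices"][0]["text"])
```

The point is not the particular library; it is that a specialized, modestly sized model can serve a narrow use case without the scale a general-purpose LLM requires.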


So we might argue that operating systems require scale to be successful. Without scale, an operating system would have limited reach and adoption.


In the context of LLMs, scale is crucial for models that aim to be general-purpose solutions, catering to a wide range of tasks and domains. For instance, LLMs used for machine translation or text summarization need to be trained on massive amounts of data from various sources to handle diverse language contexts and content types. Scale allows these models to perform well across a broad spectrum of applications.


Platforms such as social media networks, e-commerce sites and content sharing services benefit from scale but do not necessarily require it. For LLMs, scale arguably is helpful but not essential for models that target specific platforms or applications.


For example, an LLM integrated into a customer service chatbot might not require the same level of scale as a general-purpose language model, though scale generally is helpful. 


End-user applications like productivity tools, creative software, and games can succeed without scale. Similarly, LLMs can be incorporated into end-user applications without requiring massive scale.


As often is the case, LLM scale is a fundamental requirement in some use cases, but not in others. For suppliers of wholesale, general-purpose LLMs, scale likely will matter. 


When used as a platform, maybe not. And when used to enable apps and use cases, scale might not be particularly important. Model owners must care about scale; “computing as a service” providers and data centers can afford to adopt “support all” stances.


App developers might not necessarily care which LLM is used, beyond the obvious matters of supplier stability, cost, reliability, reputation, ease of use, support and other parts of a value bundle. 
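One way to picture that indifference is an application written against a thin, provider-neutral interface, so the backing LLM can be swapped on cost, reliability or support grounds. The sketch below is purely illustrative; every class and function name is hypothetical rather than any vendor's actual API.

```python
# Hypothetical sketch of loose coupling between an app and its LLM supplier:
# the app codes against a minimal text-completion interface, so the model
# behind it can be exchanged without touching application logic.
# All names here are illustrative, not a real vendor API.
from typing import Protocol


class TextModel(Protocol):
    def complete(self, prompt: str) -> str: ...


class VendorAModel:
    def complete(self, prompt: str) -> str:
        # Would call a hosted, general-purpose LLM API here.
        return "response from vendor A"


class OnPremModel:
    def complete(self, prompt: str) -> str:
        # Would call a locally hosted, possibly niche, model here.
        return "response from on-prem model"


def answer_support_ticket(model: TextModel, ticket_text: str) -> str:
    # Application logic is identical regardless of which model backs it.
    return model.complete(f"Draft a polite reply to this ticket:\n{ticket_text}")


if __name__ == "__main__":
    print(answer_support_ticket(VendorAModel(), "My order arrived late."))
    print(answer_support_ticket(OnPremModel(), "My order arrived late."))
```

In that arrangement, the supplier decision turns on the value bundle the text describes, not on which model happens to sit behind the interface.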

