Friday, November 15, 2024

Have LLMs Hit an Improvement Wall, or Not?

Some might argue it is way too early to worry about a slowdown in the rate of improvement of large language model performance. But some already voice concern, as OpenAI appears to be seeing a slowdown in its rate of improvement.


Gemini's rate of improvement might also have slowed, and Anthropic might be facing similar challenges.


To be sure, generative artificial intelligence language model size has so far shown a correlation with performance: more input, such as larger model size and more training data, has resulted in better output.


source: AWS 


In other words, scaling laws exist for LLMs, as they do for machine learning and other aspects of AI. The issue is how long the current rates of improvement can last. 


Scaling laws describe the relationships between a model’s performance and its key attributes: size (number of parameters), training data volume, and computational resources. 


Scaling laws also imply that there are limits. At some point, the gains from improving inputs do not produce proportional output gains. So the issue is how soon LLMs might start to hit scaling law limits. 
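
Scaling-law curves make the diminishing-returns point concrete. The sketch below uses a Chinchilla-style loss formula; the constants are illustrative (loosely based on published fits, not any particular production model) and serve only to show how each doubling of model size buys a smaller improvement than the last.

```python
# Illustrative Chinchilla-style scaling law: loss = E + A / N**alpha + B / D**beta,
# where N is the parameter count and D is the number of training tokens.
# Constants are loosely based on published fits (Hoffmann et al., 2022) and are
# used here only to show the shape of the curve, not to model any specific LLM.

def estimated_loss(params: float, tokens: float,
                   E: float = 1.69, A: float = 406.4, B: float = 410.7,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted training loss as a function of model size and data volume."""
    return E + A / params ** alpha + B / tokens ** beta

# Doubling model size at a fixed (hypothetical) 1-trillion-token budget shows
# each doubling buying a smaller loss reduction than the one before it.
tokens = 1e12
previous = None
for params in [1e9, 2e9, 4e9, 8e9, 16e9]:
    loss = estimated_loss(params, tokens)
    gain = previous - loss if previous is not None else 0.0
    print(f"{params / 1e9:>4.0f}B params -> loss {loss:.3f} (improvement: {gain:.3f})")
    previous = loss
```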


Aside from the cost implications of ever-larger model sizes, there is the related matter of the availability of training data. At some point, as with natural resources (oil, natural gas, copper, gold, silver, rare earth minerals), LLMs will have used all the accessible, low-cost data. 


Other data exists, of course, but might be expensive to ingest. Think about the Library of Congress collection, for example. It is theoretically available, but the cost and time to “mine” it are likely more than any single LLM provider could afford. Nor is it likely any would-be provider could create (digitize) and supply such resources quickly and affordably.

source: Epoch AI 


Consider the cost to digitize and make available the U.S. Library of Congress collection. 


Digitization and metadata creation might cost $1 billion to $2 billion total, spread over five to 10 years, including the cost of digitizing and formatting:

  • Textual Content: $50 million - $500 million.

  • Photographic and Image Content: $75 million - $300 million.

  • Audio-Visual Content: $30 million - $120 million.

  • Metadata Creation and Tagging: Approximately 20-30% of total digitization costs ($200 million - $600 million).


I think the point is that, given the speed of large language model updates (virtually continuous in some cases, with planned model updates at least annually), no single LLM provider could afford to pay that much, and wait that long, for results.


Then there are the additional costs of data storage, maintenance, and infrastructure, which could range from $20 million to $50 million annually. Labor costs might be in the range of $10 million to $20 million annually as well.


Assuming the owner of the asset would want to license access to many other types of firms, sales, marketing, and customer support could add another $5 million to $10 million in annual costs.


The point is that even if an LLM provider wanted to spend $1 billion to $2 billion to gain access to the knowledge embedded in the U.S. Library of Congress, perhaps none could afford to wait five years to a decade to derive the benefits.
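
As a rough back-of-the-envelope check, the component estimates above can be aggregated. The sketch below simply sums the ranges already quoted (one-time digitization plus recurring costs over the assumed five-to-10-year project); it is not an independent cost estimate, and it lands roughly around the $1 billion to $2 billion figure.

```python
# Back-of-the-envelope aggregation of the Library of Congress digitization
# estimates quoted above. All figures are (low, high) ranges in millions of
# dollars and are illustrative, not independent cost estimates.

one_time = {
    "textual content": (50, 500),
    "photographic and image content": (75, 300),
    "audio-visual content": (30, 120),
    "metadata creation and tagging": (200, 600),
}

annual = {
    "storage, maintenance, infrastructure": (20, 50),
    "labor": (10, 20),
    "sales, marketing, support": (5, 10),
}

years_low, years_high = 5, 10  # assumed project duration

one_time_low = sum(low for low, _ in one_time.values())
one_time_high = sum(high for _, high in one_time.values())
annual_low = sum(low for low, _ in annual.values())
annual_high = sum(high for _, high in annual.values())

total_low = one_time_low + annual_low * years_low
total_high = one_time_high + annual_high * years_high

print(f"One-time digitization: ${one_time_low}M - ${one_time_high}M")
print(f"Recurring costs over {years_low}-{years_high} years: "
      f"${annual_low * years_low}M - ${annual_high * years_high}M")
print(f"Rough total: ${total_low / 1000:.1f}B - ${total_high / 1000:.1f}B")
```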


And that is just one example of a scaling law limit. The other issues are energy consumption, computing intensity, and model parameter size. At some point, additional investment produces diminishing returns.


Marginal Cost and ISP Data Caps

Some critics of internet service provider usage-based pricing (buckets of usage) object to the practice as unfair, since the marginal cost of supplying the next unit of consumption is considered quite low. But the marginal cost of the next unit of capacity consumed is not the gating factor in ISP cost structure.


| Product/Service | Consumer Margin | Business Margin |
| --- | --- | --- |
| Mobile Voice | 30-40% | 35-45% |
| Mobile Data | 45-55% | 50-60% |
| Fixed Broadband | 40-50% | 45-55% |
| TV/Video | 25-35% | 30-40% |
| VoIP | 35-45% | 40-50% |
| Cloud Services | 30-40% | 35-45% |
| IoT Connectivity | 40-50% | 45-55% |
| Managed Services | N/A | 30-40% |
| Content Apps | 60-70% | N/A |
| Enterprise 5G | N/A | 50-60% |


Of course, connectivity service is a highly capital-intensive business as well, featuring high dividend payouts, capex, interest, and amortization expenses, in addition to the customary operating costs all businesses incur, so gross profit margin is only part of the story.


Sunk costs, high capital investment, and borrowing costs are the key cost drivers, not the incremental cost of supplying the next unit of consumption.


Consider a sample business model for a firm whose revenue has been simplified from billions of dollars to just $100 million, but which uses the same cost ratios.


The point is that high gross profit margins also come with significant costs. Expenses are high enough to reduce net profit to five percent to 15 percent of revenue, which is broadly representative of many other types of businesses.


| Metric | Description | % of Revenue | Model Impact |
| --- | --- | --- | --- |
| Revenue | Total income from services and products | 100% | $100 million |
| Gross Margin | Revenue minus cost of goods sold (COGS) | 60-70% | $60-70 million |
| Operating Expenses | SG&A, marketing, administrative, salaries | 20-25% | $20-25 million |
| EBITDA | Earnings before interest, taxes, depreciation, and amortization | 35-50% | $35-50 million |
| Depreciation & Amortization | Wear and tear on assets, often high in telcos | 10-15% | $10-15 million |
| Operating Income (EBIT) | EBITDA minus depreciation & amortization | 20-35% | $20-35 million |
| Interest Expense | Payments on debt, which can be high | 5-10% | $5-10 million |
| Pretax Income | EBIT minus interest expense | 10-25% | $10-25 million |
| Tax Expense | Varies by jurisdiction | 2-5% | $2-5 million |
| Net Income | Profit after all expenses and taxes | 8-20% | $8-20 million |
| Dividends | Shareholder payments, often a priority for telcos | 3-5% | $3-5 million |
| Net Margin after Dividends | Net margin after dividends | 5-15% | $5-15 million |
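
A minimal sketch of the simplified $100 million model, using the midpoint of each ratio range in the table above (illustrative figures only, not any actual operator's financials), shows how a 60 to 70 percent gross margin erodes to a net margin after dividends of roughly 15 percent or less; each intermediate line lands within the ranges shown in the table.

```python
# Minimal walk-through of the simplified $100 million model above, using the
# midpoint of each ratio range from the table. Figures are illustrative and
# are not drawn from any actual operator's financials.

revenue = 100.0  # $ millions

gross_margin  = 0.65 * revenue    # midpoint of 60-70%
opex          = 0.225 * revenue   # midpoint of 20-25%
ebitda        = gross_margin - opex
depreciation  = 0.125 * revenue   # midpoint of 10-15%
ebit          = ebitda - depreciation
interest      = 0.075 * revenue   # midpoint of 5-10%
pretax_income = ebit - interest
tax           = 0.035 * revenue   # midpoint of 2-5%
net_income    = pretax_income - tax
dividends     = 0.04 * revenue    # midpoint of 3-5%
retained      = net_income - dividends

for label, value in [("Gross margin", gross_margin), ("EBITDA", ebitda),
                     ("EBIT", ebit), ("Pretax income", pretax_income),
                     ("Net income", net_income),
                     ("Net margin after dividends", retained)]:
    print(f"{label:<28} ${value:5.1f}M  ({value / revenue:.0%} of revenue)")
```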


Connectivity services generally rank in the middle of industries for net profit margin, keeping in mind that participants at different stages of the value chain in each industry can have distinctly different profit margin profiles.


| Industry | Typical Net Profit Margin |
| --- | --- |
| Pharmaceuticals | 15-25% |
| Software & IT Services | 15-30% |
| Banking & Financial Services | 10-20% |
| Telecom Services | 5-15% |
| Consumer Packaged Goods (CPG) | 5-10% |
| Utilities | 5-10% |
| Manufacturing | 5-12% |
| Automobile Manufacturing | 5-10% |
| Retail | 2-6% |
| Airline Transportation | 1-5% |
| Hospitality | 2-6% |
| Construction | 2-8% |

"Winner Takes All" or "Winner Takes Most" Market Structure for LLMs?

According to the Chatbot Arena leaderboard, a platform that ranks AI models based on user votes, Gemini's latest update (Gemini-Exp-1114) ranks best among large language models.


It is worth noting that leaders change somewhat frequently, with the top five models presently all versions of OpenAI or Google models. Perhaps notably, Grok-2-08-13 ranks sixth.


source: Chatbot Arena 


It might also be worth noting that OpenAI's models (such as GPT-4) and Anthropic's Claude models have consistently ranked near the top of the leaderboard.


And leadership seems to have changed since the spring of 2023, when LMSYS published its first leaderboard. ChatGPT 3.5, launched in late 2022, seems to have been in the Arena's top five since that leaderboard's inception.


source: lmsys.org 


Eventually, business history suggests, market leadership will condense, as it has in other technology markets. So the LLM market is likely to evolve into a structure characterized by oligopolistic competition among a few major players, complemented by a range of specialized providers catering to specific industries or use cases.


In 2023 five LLMs had more than 88 percent market share. That leadership group might condense further, eventually. 
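
One way to make "winner takes all" versus "winner takes most" concrete is a standard concentration measure such as the Herfindahl-Hirschman Index. The share splits below are hypothetical; only the general point that a handful of providers hold the large majority of the market echoes the 2023 figure above.

```python
# Herfindahl-Hirschman Index (HHI) for hypothetical LLM market structures.
# The share splits below are made up for illustration; they are not market data.

def hhi(shares_pct):
    """Sum of squared market shares (in percentage points); maximum is 10,000."""
    return sum(s ** 2 for s in shares_pct)

scenarios = {
    "winner takes all (OS-like)":    [70, 15, 8, 4, 3],
    "winner takes most (oligopoly)": [30, 25, 18, 10, 5, 12],  # last entry = long tail
}

for name, shares in scenarios.items():
    top5 = sum(sorted(shares, reverse=True)[:5])
    print(f"{name}: top-5 share {top5}%, HHI {hhi(shares):,.0f}")
```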


On the other hand, room for specialized platforms might remain. How many of us would doubt that OpenAI, Google, Microsoft, Meta, Amazon, Apple, and IBM, for example, could continue as operators of domain-specific LLMs, no matter what happens with the broader market?


And who might doubt that specialized industry-specific platforms could number between 10 and 20 (catering to different sectors such as healthcare, finance, and legal)?


And of the leaders, might open-source initiatives include three to five significant contributors? 


Might AI-as-a-Service providers number 10 to 15 “significant” players, even if the top five or so positions include AWS, Google Cloud, Azure, Meta and Amazon? 


Also, if history is instructive, could there not exist five to 10 integration and orchestration platforms as well?


The issue is what “winner takes all” will mean in the LLM ecosystem and platform markets. Current examples include just one or perhaps two leaders in existing markets, which is closer to the “operating system” model. On the other hand, most of us would have a hard time narrowing the field to fewer than perhaps four leading LLMs for some time to come.


And some structural differences between existing technology markets and LLMs come to mind. Unlike the operating system market, LLMs do not require the same level of user lock-in or hardware integration, so the “two leaders” pattern might not emerge.


Roughly the same argument might be made about the e-commerce or search market structures, where one leader tends to emerge. The competitiveness of existing LLMs, with continual upgrades, tends to dispel the notion that any single provider will achieve technological superiority on a sustainable basis. 


LLMs also lack the network effects and user-generated content central to social media platforms. So it is possible the one leader model might not develop. Right now, differences between leading platforms are relatively subtle. 


So the likely direction is “winner take most” more than “winner take all.” Even if network effects are not so strong, high capital intensity, branding and trust issues, and the ability to vertically integrate with existing ecosystems (Google, Apple, Microsoft, Meta) create enormous advantages for a few contenders.


At least for the moment, “winner take all” is hard to see. A still-oligopolistic, but “winner take most” structure with a handful of leaders might be more plausible. 


Directv-Dish Merger Fails

Directv’s termination of its deal to merge with EchoStar, apparently because EchoStar bondholders did not approve, means EchoStar continue...