Monday, November 4, 2024

Which Firm Will Use AI to Boost Revenue by an Order of Magnitude?

Ultimately, there is only one way for huge AI infrastructure investments--up by an order of magnitude over cloud computing investment--to pay off: revenues will have to increase by an order of magnitude as well. 


It might be a stretch to argue that is possible for most firms investing in generative AI frontier models, for example. But it is almost certain that at least one or two of those firms will manage to do so, emerging as leaders of a new industry. 


And that is the prize.  


Source: Sherwood 


But generative AI, the current focus of capex, might be--if not a full-blown general-purpose technology--the sort of digital product that creates a whole new--and big--industry. Think of the past pattern of new industries built on firms and products including operating systems, e-commerce, search, social media and online advertising in general, plus still-growing businesses such as ride-hailing and peer-to-peer lodging. 


If generative AI winds up being a “winner take all” business, as most other computing segments have been, there will be no prize for third best, and limited advantage for being second best. 


We have already seen that pattern in many other computing markets. The leader in search has 91 percent market share. The browser leader has 65 percent share. The mobile operating system leader has 72 percent share. The U.S. ride-hailing leader has 68 percent share. 


| Market | Dominant Player | Market Share | Runner-up | Runner-up Share |
|---|---|---|---|---|
| Search Engines | Google | 91.9% | Bing | 3.0% |
| Desktop Browsers | Chrome | 65.72% | Safari | 18.22% |
| Mobile Browsers | Chrome | 66.17% | Safari | 23.28% |
| E-commerce | Amazon | 37.8% (US) | Walmart | 6.3% (US) |
| Video Streaming | YouTube | 2.5B users | Netflix | 231M subscribers |
| Music Streaming | Spotify | 31% | Apple Music | 15% |
| Ride-hailing (US) | Uber | 68% | Lyft | 32% |
| Cloud Services | AWS | 32% | Azure | 22% |
| Mobile OS | Android | 71.8% | iOS | 27.6% |
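The concentration in those markets can be checked with a quick calculation: summing the leader and runner-up figures shows how much of each market the top two players already hold. A minimal sketch in Python, using the table's own numbers (video streaming is omitted because its entries are user counts rather than market shares):

```python
# Combined top-two share per market, from the figures cited above.
markets = {
    "Search Engines":    (91.9, 3.0),
    "Desktop Browsers":  (65.72, 18.22),
    "Mobile Browsers":   (66.17, 23.28),
    "E-commerce (US)":   (37.8, 6.3),
    "Music Streaming":   (31.0, 15.0),
    "Ride-hailing (US)": (68.0, 32.0),
    "Cloud Services":    (32.0, 22.0),
    "Mobile OS":         (71.8, 27.6),
}

for market, (leader, runner_up) in markets.items():
    top_two = leader + runner_up
    print(f"{market}: top two hold {top_two:.1f}%")
# e.g. Search Engines: top two hold 94.9%
```

In every percentage-based market in the table, the top two firms together hold a third to nearly all of the market, which is the pattern the "winner take all" argument rests on.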


So, whether investors like it or not, would-be leaders of the generative AI ecosystem are pouring resources into the race for market leadership. And that investment intensity affects investor perceptions, even as the big firms continue to post revenue growth. 


Alphabet reported robust 15-percent revenue growth, 35-percent cloud computing revenue growth and operating income up 34 percent, but also AI-focused capital investment up 72 percent. 


“And as we think into 2025, we do see an increase in AI-focused capital investment coming in 2025,” said Alphabet CFO Anat Ashkenazi. 


So does Amazon, which expects capex to be about $75 billion in 2024 and “more than that in 2025,” according to Amazon CEO Andy Jassy. “And the majority of it is for AWS and specifically, the increased bumps here are really driven by Generative AI.”


“Our AI business is a multi-billion dollar business that's growing triple-digit percentages year-over-year and is growing three times faster at its stage of evolution than AWS did itself,” said Jassy.


Generative AI “is a really unusually large, maybe once-in-a-lifetime type of opportunity,” he said.


All that is fueling investment into generative AI, which, based on recent computing product precedent, will produce a “winner take all” market. 


| Company | 2024 Estimated AI Capex | 2025 Estimated AI Capex |
|---|---|---|
| Microsoft | $80 billion | Significant increase |
| Amazon | $75 billion | Further increase |
| Alphabet | $52 billion | Increase expected |
| Meta | $38-40 billion | Significant growth |


Microsoft and Meta Platforms both beat analyst expectations with their quarterly earnings reports, but also said more AI spending is coming, pushing down share prices for both firms. 


Microsoft CEO Satya Nadella noted continued capacity constraints at data centers amid surging demand, but said the company continues heavy spending on cloud and AI to alleviate those constraints. 


Meta CEO Mark Zuckerberg also forecast a "significant acceleration" in spending on AI-related infrastructure in 2025. Zuckerberg acknowledged that this may not be what investors want to hear in the near term, but insisted that the opportunities here "are really big."


GenAI is a big gamble. Based on history, we might suggest that all but one or two of these efforts will fail; the list of serious contenders also includes OpenAI and others. Should that pattern hold, the top two companies might hold 60 percent to 80 percent of the total market. 


| Market Position | Market Share | Profit Share |
|---|---|---|
| Leader | 70-90% | 80-90% |
| Runner-up | 10-20% | 5-15% |
| Others (3-10) | 5-10% | 0-5% |


AI Model Inference Costs Will Decline 20% to 30% Per Year

Despite concern over the high capital investment in infrastructure to support generative artificial intelligence models, many studies suggest that inference costs, which should ultimately be the primary ongoing cost, will drop over time, as costs have for other computing operations.
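Annual declines of that size compound quickly. A minimal sketch of the arithmetic, assuming a constant annual decline rate:

```python
# Compounded cost decline: remaining unit cost after n years
# at a constant annual percentage decline.
def cost_after(years: int, annual_decline: float, start: float = 1.0) -> float:
    """Unit cost remaining after compounding an annual decline rate."""
    return start * (1 - annual_decline) ** years

for rate in (0.20, 0.30):
    remaining = cost_after(5, rate)
    print(f"At {rate:.0%}/year, cost after 5 years: {remaining:.2f} "
          f"({1 - remaining:.0%} lower)")
# At 20%/year: 0.33 remaining (67% lower)
# At 30%/year: 0.17 remaining (83% lower)
```

So even at the low end of the 20 to 30 percent range, inference costs would fall by roughly two-thirds over five years.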


| Study Title | Date | Publication Venue | Key Conclusions on Cost Declines |
|---|---|---|---|
| Scaling and Efficiency of Deep Learning Models | 2019 | NeurIPS | Demonstrates how advances in model scaling (larger models running on optimized hardware) lead to inference cost reductions of around 20-30% per year. |
| The Hardware Lottery | 2020 | Communications of the ACM | Highlights the role of specialized hardware (GPUs, TPUs) in reducing AI inference costs, estimating a 2x decrease every 1-2 years with hardware evolution. |
| Efficient Transformers: A Survey | 2021 | Journal of Machine Learning Research | Describes optimization techniques (such as pruning, quantization) that contribute to cost declines, estimating an average 30-50% drop in inference costs over two years. |
| The Financial Impact of Transformer Model Scaling | 2022 | IEEE Transactions on AI | Examines the economic impacts of scaling transformers and shows that large models can reduce costs by ~40% through efficiencies gained in distributed inference and hardware. |
| Inference Cost Trends in Large AI Deployments | 2023 | ICML | Finds a 50% reduction in inference costs per year for large-scale deployments, driven by optimizations in distributed computing and custom AI chips. |
| Beyond Moore’s Law: AI-Specific Hardware Innovations | 2023 | MIT Technology Review | Discusses how specialized hardware design reduces inference costs by 2-4x every 2 years, shifting from general-purpose GPUs to domain-specific architectures. |
| Optimizing Inference Workloads: From Data Center to Edge | 2024 | ArXiv | Analyzes cost reductions from 2020 to 2024 for both data center and edge deployments, concluding that distributed systems and model compression lead to 50% annual cost drops. |


The implication is that inference costs should continue to drop. 


| Year | Cost per Inference ($) | Decline vs Prior Year |
|---|---|---|
| 2018 | 1.00 | - |
| 2019 | 0.80 | 20% |
| 2020 | 0.50 | 37.5% |
| 2021 | 0.30 | 40% |
| 2022 | 0.15 | 50% |
| 2023 | 0.08 | 47% |
| 2024 | 0.04 | 50% |
| 2025 | 0.02 | 50% |


Of course, the trend toward larger models with more parameters runs counter to that, on the model-building side. Still, AI model-building (training) costs decline over time, because of hardware acceleration, improved algorithms and model design optimization.


| Study | Date | Publication Venue | Key Conclusions on Cost Declines |
|---|---|---|---|
| Scaling Neural Networks with Specialized Hardware | 2018 | NeurIPS | Describes how hardware advances, especially GPUs and early TPUs, helped reduce model-building costs by around 50% annually for larger models compared to CPU-only setups. |
| Reducing the Cost of Training Deep Learning Models | 2019 | IEEE Spectrum | Shows a 40% cost reduction for model training per year through advances in parallel computing and early model optimizations such as batch normalization and weight sharing. |
| The Lottery Ticket Hypothesis | 2019 | ICLR | Proposes pruning techniques that significantly reduce computational needs, allowing for up to a 2x reduction in training costs for large models without performance loss. |
| Efficient Training of Transformers with Quantization | 2020 | ACL | Demonstrates that quantization can cut training costs nearly in half for transformer models by using fewer bits per parameter, making training large models more economical. |
| Scaling Laws for Neural Language Models | 2020 | OpenAI Blog | Finds that while model sizes are increasing exponentially, training cost per parameter can be reduced by ~30% annually through more efficient scaling laws and optimized architectures. |
| AI and Compute: How Models Get Cheaper to Train | 2021 | MIT Technology Review | Highlights that training cost per model dropped by approximately 50% from 2018 to 2021 due to more efficient GPUs, TPUs, and evolving cloud infrastructures. |
| Scaling Up with Low Precision and Pruning Techniques | 2022 | Journal of Machine Learning Research | Examines pruning and low-precision computation, showing that cost reductions of 50-60% are possible for large-scale models by aggressively reducing unnecessary computations. |
| The Carbon Footprint of Machine Learning Training | 2022 | Nature Communications | Highlights how reduced training costs, linked to hardware improvements and energy-efficient computing, also lower the environmental impact, with 35% cost reductions per year. |
| Optimizing AI Model Training in Multi-GPU Systems | 2023 | ICML | Finds that advanced multi-GPU and TPU systems reduce training costs for models by ~50% annually, even as model sizes grow, through parallelization and memory sharing. |
| Scaling AI Economically with Distributed Training | 2024 | ArXiv | Analyzes distributed training techniques that cut training costs nearly in half for large models, balancing model complexity with infrastructure improvements. |


AI model creation costs are quite substantial, representing perhaps an order of magnitude more capital intensity than did cloud computing, for example. But capital intensity should decline over time, as it has for all other computing operations. 


Directv-Dish Merger Fails

Directv’s termination of its deal to merge with EchoStar, apparently because EchoStar bondholders did not approve, means EchoStar continue...