Tuesday, February 27, 2024

The Cure for "High GPU Prices" and Supply Constraints is "High GPU Prices"

Perhaps you share my astonishment that Nvidia's valuation in February 2024 is higher than that of Google or Amazon, powered by sales of graphics processing units that are in short supply at the moment. 


But markets change. Nvidia’s biggest customers (Meta, Microsoft, AWS, Google and others) now have huge incentives to create their own GPUs and especially specialized processors to support AI operations. 


Obviously, Nvidia’s revenue is its customers’ cost. So big customers have important financial reasons (saving money) and strategic reasons (avoiding reliance on a single supplier; optimizing chips for their own specific needs) to create their own chips.


| Company | Study/Article Name | Publication Date | Publishing Venue | Estimated Spending on Nvidia GPUs |
|---|---|---|---|---|
| Microsoft | Nvidia's AI Boom Lifts Chipmaker's Stock to Record High | Jan 26, 2023 | Seeking Alpha | $3 billion - $5 billion annually |
| | Microsoft Azure Spending on Nvidia GPUs to Reach $10 Billion by 2025 | Nov 21, 2023 | DigiTimes | Up to $10 billion by 2025 |
| Meta | Meta Platforms Spent $10 Billion on Nvidia GPUs in 2022 | Oct 27, 2023 | The Information | $10 billion in 2022 |
| AWS | AWS Spending on Nvidia GPUs Likely Exceeds $5 Billion Annually | Feb 23, 2024 | MarketWatch | $5 billion+ annually (estimated) |
| Google | GCP's AI Spending on Nvidia GPUs Could Reach $4 Billion by 2025 | Jan 12, 2024 | Seeking Alpha | Up to $4 billion by 2025 (estimated) |


The point is that markets change in response to supply and demand. If the "cure for low prices is low prices," we might also note the "cure for high prices is high prices." In other words, low prices drive suppliers out of a market, reducing competition, and can also increase demand, so the market rebalances.

Low prices also encourage cost-side innovations that restore profit margins and remove inefficient competitors from the market.

High prices and excess demand for GPUs encourage new competitors, thereby increasing supply and leading to lower prices. As Jeff Bezos of Amazon is fond of saying, "your margin is my opportunity." So GPU supply will rebalance.
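The rebalancing logic can be sketched with a toy model. This is only an illustration under made-up assumptions: a linear demand curve, a single unit cost, and new supply that enters in proportion to the margin on offer. The function name `simulate_entry` and every parameter value are hypothetical and are not estimates of the actual GPU market.

```python
def simulate_entry(periods=40, demand_intercept=100.0, demand_slope=1.0,
                   unit_cost=20.0, entry_rate=0.3, initial_supply=10.0):
    """Toy model of "your margin is my opportunity".

    All parameters are hypothetical. Price follows a linear inverse
    demand curve, and each period new supply enters in proportion to
    the margin (price minus unit cost) that suppliers can earn.
    """
    supply = initial_supply
    prices = []
    for _ in range(periods):
        # Scarce supply means a high price; abundant supply means a low one.
        price = max(demand_intercept - demand_slope * supply, 0.0)
        prices.append(price)
        # A fat margin attracts new capacity; a thin margin attracts little.
        margin = price - unit_cost
        supply += entry_rate * margin
    return prices


if __name__ == "__main__":
    path = simulate_entry()
    print("first price:", path[0])
    print("last price:", round(path[-1], 4))
```

With these invented numbers the price starts far above unit cost and falls every period as entrants add supply, settling near cost. The real market will be lumpier, but the direction of the adjustment is the point.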

But specialized chips also seem to be of growing importance, another trend that will encourage hyperscale cloud computing providers to continue developing their own AI-related chips, beyond GPUs. 


Generative AI models create demand for chips specifically optimized for tasks like text and image generation. Such chips can differ from general-purpose GPUs and may offer better performance and efficiency for those specific tasks.


Also, specialized chips might be used for different stages of the AI workflow. Training might use dedicated training chips while inference could happen on edge processors (including onboard the device) optimized for power efficiency.


As AI applications move to the edge, chips will need to be smaller, more power-efficient and optimized for specific tasks like sensor data processing or local inference. Think of the ways ASICs and custom chips have been used for decades. 


| Company | Chip Name | Purpose | Status | Technology |
|---|---|---|---|---|
| AWS | Trainium | Training large language models and other AI workloads | Available | High-bandwidth memory, custom compute cores |
| | Inferentia | Inferencing pre-trained models | Available | Tensor cores, optimized data paths |
| | AWS Annapurna Labs | Various custom processors for networking, storage, and other infrastructure | Available/Developing | ARM-based cores, various configurations |
| Meta | AI Engine | Accelerate various AI workloads, including training and inference | Available | Custom architecture with heterogeneous compute cores |
| | Habana Gaudi 2 | Train large language models and other workloads | Available | High-bandwidth memory, tensor cores |
| | Cambrian M1 | High-performance inference for specific tasks | Available | Specialized architecture for inference |
| Google | Tensor Processing Unit (TPU) | Training and inference for various AI models | Available/Developing | Custom architecture with tensor cores, multiple generations available |
| | Cerebras Wafer-Scale Engine | Train and run massive AI models | Available | Wafer-scale design, high interconnect bandwidth |
| | Sycamore | Quantum computing for specific AI tasks | Research/Development | Superconducting qubits |
| Microsoft | Azure Maia | Training large language models | Available/Developing | Optimized for large language models, sub-8-bit data types |
| | Azure Cobalt | General-purpose cloud workloads | Available/Developing | ARM-based cores, optimized for cloud computing |


One might therefore speculate that, in the future, the GPU market as led by Nvidia could change. Hyperscalers and others are likely to explore ways of using CPUs and other specialized chips rather than expensive GPUs to support their AI functions.


Nvidia certainly will move to protect itself from any diminution of its market share in GPUs, partly by becoming a “GPU as a service” provider itself, and partly by increasing its chip fab operations to support others building their own custom chips, or by creating such chips on demand for its customers and prospects.


