
Tuesday, May 19, 2026

Google, Blackstone Create TPU "as a Service" Business

Google and Blackstone’s TPU-as-a-service venture is important for any number of reasons:

  • it turns TPUs from a mostly Google-hosted product into a broader external infrastructure platform

  • it strengthens Google’s push to monetize its custom silicon

  • it gives AI customers a non-Nvidia acceleration path

  • it might clarify the neocloud business model.


Blackstone is committing $5 billion in equity, with an initial 500 MW of capacity coming online in 2027.


The move tends to ratify the GPU-as-a-service market and provides an alternative to the Nvidia ecosystem, at least in the “bare metal” portion of the business.


The venture also might intensify pricing pressure and reduce differentiation in the inference market. 


The venture also tests the durability of the neocloud business itself. Today, a global scarcity of high-end AI training and inference compute creates the basis for the market.


Neoclouds originally emerged as stopgaps to address the GPU shortage, but their bare-metal economics are fragile, being based on what most believe are temporary shortages of capacity. 


Perhaps their long-term viability hinges on their ability to move up the stack into AI-native services, which puts them in direct competition with hyperscalers. And some will note how little protection the business has, given the thin profit margins and high continuing capital investment. 


source: McKinsey 


Neoclouds have a strong demand story, but their business model is structurally difficult because they combine very high capital intensity with fast hardware depreciation and aggressive price competition. The result is a market that can grow fast while still being hard to make sustainably profitable.


The core problem is that graphics processing units are expensive, and their resale or rental value falls quickly as new generations arrive. 


McKinsey notes that over a typical five-year depreciation horizon, GPU-hour pricing can decline by half or more, which forces providers to recover capital quickly or risk stranded assets.


So neoclouds must keep raising capital to buy the next wave of chips even while the prior fleet is losing value. This makes cash flow, financing terms, and utilization rates far more important than simple revenue growth.
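A rough way to see why falling GPU-hour prices push payback out is to run the arithmetic. The sketch below is illustrative only: the capital cost, starting price, utilization, and operating-cost share are hypothetical inputs chosen for the example, not McKinsey figures. The only number taken from the text is the assumption that prices halve over five years.

```python
HOURS_PER_YEAR = 8760


def annual_decline_rate(total_decline: float, years: int) -> float:
    """Constant yearly price decline that compounds to `total_decline` over `years`."""
    return 1.0 - (1.0 - total_decline) ** (1.0 / years)


def yearly_revenue(start_price: float, annual_decline: float,
                   utilization: float, years: int) -> list[float]:
    """Revenue per GPU for each year, with the hourly price falling every year."""
    revenues = []
    price = start_price
    for _ in range(years):
        revenues.append(price * utilization * HOURS_PER_YEAR)
        price *= 1.0 - annual_decline
    return revenues


def payback_year(capex: float, revenues: list[float], opex_share: float):
    """First year (1-based) in which cumulative revenue, net of operating
    cost, covers the capital outlay. Returns None if it never does."""
    cumulative = 0.0
    for year, revenue in enumerate(revenues, start=1):
        cumulative += revenue * (1.0 - opex_share)
        if cumulative >= capex:
            return year
    return None


# Hypothetical inputs, for illustration only.
CAPEX_PER_GPU = 25_000.0   # assumed all-in capital cost per GPU
START_PRICE = 2.50         # assumed starting price per GPU-hour
UTILIZATION = 0.70         # assumed share of hours actually sold
OPEX_SHARE = 0.40          # assumed share of revenue spent on power, cooling, operations

decline = annual_decline_rate(total_decline=0.5, years=5)  # prices halve in five years
flat = yearly_revenue(START_PRICE, 0.0, UTILIZATION, 5)
falling = yearly_revenue(START_PRICE, decline, UTILIZATION, 5)

print("payback with flat prices:   ", payback_year(CAPEX_PER_GPU, flat, OPEX_SHARE))
print("payback with falling prices:", payback_year(CAPEX_PER_GPU, falling, OPEX_SHARE))
```

Lowering the utilization input in the same sketch pushes payback out further, which is the sense in which utilization and financing terms matter more than headline revenue growth.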


GPU clouds are not just chip businesses; they are power, cooling, networking, and operations businesses as well. High energy costs, high-density racks, and increasingly complex cooling requirements raise operating expense and add execution risk.


Up to this point, neoclouds are heavily dependent on Nvidia for the chips, networking ecosystem, and much of the software stack.


Google’s TPU venture will test that dependence.


A big reason neoclouds emerged was that they could undercut hyperscalers on price and provisioning speed, sometimes by large margins. But hyperscalers are responding.


That means the initial “GPU scarcity arbitrage” is not a durable moat by itself.


The strategic tension is that investors often want neoclouds to move up the stack into managed services, orchestration, inference platforms, or sector-specific solutions. Those layers can improve retention and margins, but they also bring neoclouds into direct competition with hyperscalers that have deeper ecosystems and broader product bundles.


So the firms face a hard choice: stay close to bare-metal GPU rental, where margins are thin, or build higher-value services, where competition is tougher and sales cycles are longer.


That suggests a need to pioneer niche markets, such as sovereign compute and specialized workloads.


Sunday, May 10, 2026

AI Ecosystem "Rule of Three" Coming?

The eventual market structure for the artificial intelligence value chain is a reasonable question, as it was for the internet value chain before it and for virtually every value chain, ever. 


The core question for at least a few possible market leaders: should you own the entire value chain (vertical integration) or dominate a single layer exceptionally well (horizontal specialization)? 


And even for possible market leaders, the idea of becoming “a platform” necessarily entails horizontal dominance.


For most firms with less scale, the answer is almost always some form of horizontal specialization. 


Vertical dominance almost always appeals early on, though, as much of the stack does not yet exist, and must be created. 


Early in any market's development, firms face high uncertainty, fragmented or nonexistent supply chains, undefined standards, and limited infrastructure. 


The "value stack" (the full chain of activities from raw inputs to end-customer delivery, including supporting services like logistics, financing, or after-sales) is incomplete or unreliable.


The internet era began with vertically integrated ambitions that mostly failed. Later, many firms prospered by operating “asset-light,” owning as little of the full stack as possible.


The internet's structural lesson might be summarized as “infrastructure commoditizes and value migrates up the stack.”


The winners were companies that owned the layer closest to the user:

  • Google (search/intent)

  • Facebook (social graph)

  • Salesforce (CRM workflow)

  • Microsoft (Office + enterprise identity)

  • Amazon (fulfillment + Prime).


Vertical integration seems to appeal most early in value chain development. 


The PC and semiconductor markets were once vertically integrated. 


But, eventually, the supply chains became more horizontal:

  • Microsoft for operating systems

  • Intel for processors

  • Nvidia for graphics chips

  • Several companies for hard drives.


The single clearest exception to the "horizontal wins" rule was Apple, which maintained radical vertical integration (silicon → OS → apps → retail).


Given the “early” status of AI, you might guess that vertical approaches are favored by would-be future leaders of the market. 


Very high infrastructure costs (GPUs, memory, data centers, energy sources) mean that infrastructure costs scale faster than revenues unless you own the stack.


This creates a structural pressure toward vertical integration, largely because high infrastructure costs and scarcity now rearrange infra value. At least for the moment, what cloud was to software as a service, AI infrastructure is for AI and AI agents.


What remains undetermined are the long-term relationships within the value chain. How important will infra remain, and how much differentiation can it provide? How important will vertical integration remain?


Much depends on how today’s bottlenecks are resolved. 


For full-stack integrators (Google, Microsoft, Amazon, OpenAI), bottlenecks in compute, distribution, and enterprise relationships suggest at least significant vertical integration advantages.


Long term, “a few” ecosystem winners with significant vertical integration are likely to emerge, with partners occupying key horizontal functions. Applications will likely remain an area where the most specialists will emerge, as has been the case for the internet value chain. 


| Internet Layer | Internet Winner | Why They Won | AI Layer | Current Leader(s) | Survivability |
|---|---|---|---|---|---|
| Physical infrastructure | Telecom / cable cos (AT&T, Comcast) | Owned the last mile; regulatory moats | GPU compute & data centers | NVIDIA, CoreWeave | Medium: commoditization risk as custom ASICs proliferate; CUDA moat is real but contested |
| Backbone / routing | Level 3, Cogent (commodity over time) | Traffic volume; peering scale | Cloud hyperscalers (compute fabric) | AWS, Google Cloud, Azure | High for top 3; structural oligopoly with massive switching costs |
| Horizontal platform / OS | Microsoft Windows, then Android/iOS | Developer lock-in; ecosystem flywheel | Foundation model + API platform | OpenAI, Anthropic, Google DeepMind | Medium-high: differentiation real today, commoditization pressure building |
| CDN / performance layer | Akamai, then Cloudflare | Edge distribution; hard-to-replicate infra footprint | Inference optimization / edge AI | Cloudflare Workers AI, Groq | High for winners; latency & cost matter enormously at inference scale |
| Search / intent layer | Google | Owned the demand aggregation point; data flywheel | AI assistant / agent interface | ChatGPT, Perplexity, Google Gemini | Very high: whoever owns the default query interface owns the toll road |
| Vertical SaaS | Salesforce, Workday, Veeva | Deep workflow + data lock-in in specific domains | Vertical AI (legal, medical, finance) | Harvey (legal), Tempus (oncology), Palantir (gov/defense) | Very high: proprietary domain data + workflow integration = durable moat |
| Developer tooling / middleware | Twilio, Stripe, Segment | Abstracted complexity; usage-based pricing | AI orchestration & dev tools | LangChain, Weights & Biases, Hugging Face | Medium: commoditization risk as hyperscalers bundle equivalents |
| Content / media | Netflix, Spotify | Owned the user relationship + proprietary content | AI-native consumer apps | Midjourney, ElevenLabs, Runway | Medium: switching costs low, but brand + proprietary training data matter |
| E-commerce / marketplace | Amazon, Shopify | Demand aggregation + fulfillment infrastructure | Agentic commerce / AI procurement | Amazon Alexa+, emerging agent platforms | Unknown: biggest open question; whoever controls the purchasing agent controls commerce |
| "Picks and shovels" enabling layer | Cisco (networking gear), VMware (virtualization) | Sold to all combatants; infrastructure-agnostic | Memory, packaging, power | SK Hynix (HBM), TSMC (fabrication), Eaton (power) | Very high: scarce physical inputs with no software substitute |

The internet produced one dominant full-stack integrator per consumer surface (Apple in mobile, Google in search/Android, Amazon in commerce/cloud) and many durable horizontal specialists at layers with genuine switching costs.


AI is likely to produce a similar structure. The full-stack integrators with both infrastructure and consumer/enterprise distribution (Google, Microsoft, Amazon) are best positioned for a role Apple almost uniquely pioneered.


Market dynamics tend to create a "Rule of Three" (or "Rule of Three and Four") structure in mature, stable, competitive markets.


Bruce Henderson of BCG hypothesized in 1976 that a stable competitive market never has more than three significant (generalist) competitors, with the largest having no more than four times the market share of the smallest. Shares often stabilize around a 4:2:1 ratio (40-50 percent for the leader, 20-25 percent for number two, and 10-12 percent for number three).
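As a quick illustration, the rule can be turned into a simple check on a list of market shares. This is a sketch under stated assumptions: Henderson's hypothesis as summarized here does not define "significant," so the 10 percent cutoff below is a hypothetical threshold, and the sample shares are invented for the example.

```python
def rule_of_three_and_four(shares_pct, significant_cutoff_pct=10):
    """Check market shares (in percent) against the rule as summarized above.

    `significant_cutoff_pct` is an assumption made for this sketch; the rule
    itself does not specify where "significant" begins.
    """
    significant = sorted(
        (s for s in shares_pct if s >= significant_cutoff_pct), reverse=True
    )
    if not significant:
        return {"significant": [], "at_most_three": True,
                "leader_to_smallest": None, "within_four_to_one": True}
    ratio = significant[0] / significant[-1]
    return {
        "significant": significant,
        "at_most_three": len(significant) <= 3,
        "leader_to_smallest": ratio,
        "within_four_to_one": ratio <= 4,
    }


# Invented shares, for illustration only.
stable_market = [42, 21, 12, 5, 4]       # three generalists plus small specialists
crowded_market = [30, 25, 20, 15]        # four significant competitors
lopsided_market = [60, 20, 12]           # leader more than four times the smallest

for name, shares in [("stable", stable_market),
                     ("crowded", crowded_market),
                     ("lopsided", lopsided_market)]:
    print(name, rule_of_three_and_four(shares))
```

Moving the cutoff changes which firms count as generalists, so the check is only as good as that judgment call.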


That seems reflected in the internet’s “winner takes most” structure. 


Jagdish Sheth and others validated this across hundreds of industries: three full-line generalists dominate 70-90 percent of the market (by share or profit), while the rest consists of niche specialists (product, geographic, or segment-focused) that thrive on margins rather than volume.




Vertical integration is probably going to work for a few big firms. Most long-term providers in the AI ecosystem will be specialists, though. Most markets ultimately develop that way.


Butterfly Effect: 100% Deterministic and Yet 0% Predictable

Maybe you have been puzzled by the butterfly effect, the idea that a tiny flap of a butterfly's wings in one part of the world can fund...