Wednesday, December 3, 2025

Maybe an AI Bubble Exists for Training, Not Inference

Not all compute is the same when it comes to artificial intelligence models, argues entrepreneur Dion Lim, especially given the massive levels of current investment, which some worry represent over-investment of bubble proportions.


Maybe not, he argues.


The first pool of investment is training compute: the massive clusters used to create new AI models. This is where the game of chicken is being played most aggressively by the leading contenders.


No lab has a principled way of deciding how much to spend; each is simply responding to intelligence about competitors’ commitments.


If your rival is spending twice as much, they might pull the future forward by a year.


The result is an arms race governed less by market demand than by competitive fear, with Nvidia sitting in the middle as the arms dealer.


To a great extent, he suggests, that is where the bubble danger exists.


Training the largest foundational AI models (like large language models) requires an extraordinary, one-time investment in specialized hardware (primarily high-end GPUs) to process huge datasets.

The second area of investment is inference compute: the use of AI models in production, serving actual users. Here, the dynamics look entirely different, he argues.


Inference is the phase where the trained model is actually used to generate predictions or responses for users (using a chatbot, running an AI image generator). 

Investment in inference compute is arguably less prone to over-investment because it is tied more closely to actual, measurable customer demand and runs on more flexible infrastructure. 

But that is a general rule that might not be true in all instances, some will argue.


Inference costs are ongoing operational expenses (Opex) that scale directly with usage (the number of user queries or requests).
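One way to see the difference (with purely hypothetical numbers) is a toy cost model: training is a one-time capital outlay, while inference spend grows linearly with query volume. The function name, prices, and volumes below are all illustrative assumptions, not figures from Lim's argument.

```python
# Toy comparison of training capex vs. inference opex.
# All figures are hypothetical, for illustration only.

def inference_opex(queries: int, tokens_per_query: int,
                   cost_per_million_tokens: float) -> float:
    """Ongoing inference cost: scales linearly with usage."""
    total_tokens = queries * tokens_per_query
    return total_tokens / 1_000_000 * cost_per_million_tokens

TRAINING_CAPEX = 100_000_000.0  # one-time cluster build-out (hypothetical)

# Doubling query volume doubles inference spend; training capex is unchanged.
low = inference_opex(queries=1_000_000, tokens_per_query=1_000,
                     cost_per_million_tokens=2.0)
high = inference_opex(queries=2_000_000, tokens_per_query=1_000,
                      cost_per_million_tokens=2.0)
print(low, high)  # 2000.0 4000.0
```

The point of the sketch is only the shape of the curves: opex tracks demand query by query, which is why over-building is harder to do by accident than with a speculative one-time training build-out.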

Some will argue that the scale of inference operations will still come to dominate over time. 


Still, the argument is that inference hardware has more flexibility than training hardware. Companies can often use a wider variety of chips, including older-generation GPUs, CPUs, or specialized, more cost-efficient accelerators (such as TPUs, ASICs, or FPGAs) optimized for running a fixed model.

As GPUs become commoditized and compute abundance arrives, inference capabilities will become the next major market, especially given growing demand for efficient agentic tools.


So LLM inference might not be a "bubble" in the sense that investment professionals worry about.


The companies that can deliver intelligence most efficiently, at the lowest cost per token or per decision, will capture disproportionate value, Lim argues.


Training the biggest model matters less now; running models efficiently at planetary scale matters more.


So the argument is that the magnitude of AI capex might, or will, produce some level of over-investment, which is to be expected when an important new technology emerges.

But this might differ fundamentally from the dot-com bubble, which was fueled primarily by advertising spend by firms and on products that had yet to establish a revenue model. Of course, some claim AI has not established one either, for the time being.


Back then, companies burned cash on Super Bowl commercials to acquire customers they hoped to monetize later. That was speculative demand chasing speculative value.


Many would argue that AI already is producing measurable results for companies that do have revenue models and viable products used at scale.




Why People Talking Politics Cannot Communicate

As a first-semester undergraduate, I thought philosophy was the most useless of all subjects in my curriculum. As an adult, I now believe philosophy is the most important of all subjects.


Epistemology, the study of how we know what we know, shapes how people argue, what counts as evidence, and even whether dialogue is possible. It also explains why we often talk past each other. 


In other words, at a deep level, “political discussions,” which I now avoid, are not based on differences about personalities or policies, but are more deeply grounded in different ways of ascertaining “truth.”


Arguments about difficult subjects such as abortion seem irreconcilable because the philosophic assumptions are different: 

  • “Life begins at conception” is a moral claim.

  • “Women should control their bodies” is an ethical-autonomy claim.

  • “A fetus feels pain at X weeks” is an empirical claim.


Debates on gender identity, race, or cultural identity illustrate epistemological divergences as well:

  • Subjective epistemology: identity is self-defined and validated by experience.

  • Biological epistemology: identity is rooted in physical or genetic reality.

  • Social constructivism: identity categories are created by society and mutable.


When someone says “I am what I say I am” versus “Identity is rooted in biology,” they are not just disagreeing; they are using different knowledge frameworks.


For me, the greatest issue is post-modernism, which asserts that there is no such thing as universal truth, as in “your truth vs. my truth.” For those of us whose intellectual framework is still the Enlightenment, the biggest challenge lies precisely there.


Democracy, law, and social cooperation depend on some common epistemic ground: grammatical rules for language, driving laws, and what constitutes “crime” are examples. 


If truth is subjective, how do we arbitrate disputes? 


So different epistemologies make discourse difficult, or even impossible. The problem is not the existence of different answers, but different ways of determining answers. 


Epistemology is the hidden engine of public conflict.


| Epistemology | What counts as truth? | Authority sources | Typical expression in debates |
| --- | --- | --- | --- |
| Empirical/Scientific | Truth = what can be measured, tested, falsified | Science, data, statistics | “Show me the evidence.” |
| Rationalist/Philosophical | Truth = what follows logically from premises | Logic, argument consistency | “That argument contradicts itself.” |
| Moral/Religious | Truth = grounded in divine authority or natural law | Scripture, tradition, moral principles | “Life is sacred because…” |
| Personal/Subjective | Truth = lived experience and internal perception | Individual narratives and identity | “This is my truth.” |
| Postmodern | Truth = socially constructed power structures | Culture, discourse, ideology | “Truth claims serve power.” |

Monday, December 1, 2025

Enantiodromia

"I hope I die before I get old," Pete Townshend of the Who wrote in the song "My Generation." 

Of course, the ballad of youthful defiance became its opposite.

The youth who once said "don't trust anyone over 30" became the establishment: the leaders of media, academia, politics and culture. 

Ironic. And an example of enantiodromia: something becoming its opposite. 

AI User Experience Will Get Way Better, as Did Internet Experiences

One suspects the user experience of artificial intelligence will change as much as did our experience of internet apps: basic functionality that over time gets really sophisticated.


| AI Evolution (Next 2 Yrs) | Internet Evolution (Past 20 Yrs) | Key Functional Change |
| --- | --- | --- |
| Tools → Companions | Web 1.0 → Web 2.0 (read-only to social) | Focus shifted from delivering documents to enabling user-generated content (UGC) and two-way interaction. |
| Assistants → Agents | CGI Scripts → AJAX/SPA | Functionality moved from server-side, full-page reloads to client-side, asynchronous data processing, creating smooth, native-app-like experiences. |
| Chatbots → Personalities | Generic HTML Sites → Brand/Personal Platforms | Web experiences became highly customizable, responsive, and designed with specific user experience (UX) patterns to elicit certain feelings and behaviors. |
| Models → Ecosystems | Isolated Sites → APIs/Cloud Computing | Applications began integrating services across platforms using APIs (application programming interfaces), enabling collaboration and data sharing. |


We’ll move from using tools to having companions. AI will shift apps from acting as a transactional utility (a tool you use for a specific, one-off task) to becoming an interpersonal entity designed for ongoing engagement and emotional support.


As a corollary, observers also expect AI to shift from generic to personal, where chatbots, for example, have personas. 


Our use of AI assistants will shift toward agents that do not wait for explicit instructions but instead act as autonomous actors.


AI also will move from being a standalone, single-purpose program to a deeply integrated ecosystem that permeates all aspects of the user's digital life. In many instances, that might also mean the AI creates functionality “on the fly.” 


So users might not have to consciously choose an app to “do something,” but tell the AI what is desired and the functionality is produced on the spot, in real time. 


Of course, there are some likely limits. There are casual consumer uses and then different professional use cases that require more granular control. Apps are likely to remain better for the latter. 


For highly detailed or specialized tasks, such as creating a pixel-perfect logo, detailed CAD drawing, or performing precise color grading, the granular control offered by a dedicated app remains superior. 


Originality and Context: AI systems, by nature, train on existing data. Human designers and creators bring unique, cultural, and emotional context to their work that AI struggles to grasp. The "app" becomes the co-pilot that handles the tedious, repetitive tasks (like masking or code completion), freeing the human to focus on high-level creative and strategic decisions.



The New Interface: The "app" might not disappear; its interface is simply changing. Instead of being a canvas full of buttons and menus, the new interface is often a text box: a conversational AI agent that can be queried, much as one interacts with a chatbot.

| Feature | Traditional App (e.g., Photoshop) | AI-Driven Interaction (e.g., AI Generator) |
| --- | --- | --- |
| Input method | Manual operation: clicking tools, adjusting settings. | Natural-language prompt ("Add a realistic-looking alien doing a peace sign," "Change the lighting to sunset"). |
| User skill required | High; requires training and expertise. | Low; requires clarity in expressing the desired outcome. |
| Core value | Provides a toolkit for maximum control and precision. | Provides a direct solution or output, prioritizing speed and accessibility. |
| Goal | Editing/creation process (you control how it's done). | Final output (the AI controls how it's done). |

Friday, November 28, 2025

"May You Stay Forever Young"

I originally set out only to find a version of "Forever Young" by Bob Dylan, as somebody suggested it was a good Thanksgiving expression of gratitude and good wishes. This version by Norah Jones also appears to be a sort of memorial for Steve Jobs, which I found touching as well. 

Blessings to all humankind.  

Coopetition Is a Pretty Old Story in Technology

“Coopetition” happens frequently in many markets, as competitors find they also cooperate with their rivals. But value-chain participants also often move into other parts of the chain, meaning customers become competitors. 


| Original Supplier | Customer (Later Competitor) | What the Customer Originally Bought | How/When Customer Became Competitor | Nature of Competition |
| --- | --- | --- | --- | --- |
| Intel | Apple (M1/M2/M3 chips) | x86 CPUs for Mac computers | 2020: Apple launched Apple Silicon and began replacing Intel CPUs in all Macs | Direct chip-level competitor; vertically integrated into SoC design |
| Qualcomm | Samsung, Huawei (HiSilicon), Apple (modems underway) | Mobile baseband chips | Samsung and Huawei developed in-house modems; Apple pursuing its own modems | Reduced reliance on Qualcomm; internal components compete directly |
| NVIDIA | Amazon AWS, Google, Microsoft Azure | GPUs for cloud AI workloads | Clouds built custom AI chips (AWS Trainium/Inferentia, Google TPU, Microsoft Maia/Cobalt) | Cloud providers become GPU substitutes and new chip vendors |
| Cisco | Amazon AWS (cloud networking), Arista, large enterprises with internal networks | Networking gear for data centers | Hyperscalers built their own switches and disaggregated network OS | Displaces traditional Cisco purchases with in-house designs |
| Oracle, Microsoft | Salesforce, Workday, ServiceNow | Databases and infrastructure for enterprise apps | SaaS firms built full-stack platforms competing with traditional enterprise software | Customers became full software-suite competitors |
| Google Maps API | Uber, Lyft | Location services, navigation APIs | Ride-hailing firms built proprietary mapping to reduce dependence | Competes with mapping providers and reduces reliance on Google |
| Android/Google | Samsung (Tizen), Huawei (HarmonyOS) | Android mobile OS | Developed alternative smartphone OS platforms | Competing mobile ecosystems reducing Android dependency |
| AWS Marketplace vendors | AWS (Basics, managed services) | AWS acted as infrastructure + reseller of partner products | AWS launched services competing directly with partners (e.g., Elasticsearch/OpenSearch, Datadog-like monitoring) | High-profile “customer-turned-competitor” ecosystem conflict |
| IBM, Dell, HP infrastructure | Major banks, retailers, healthcare systems | Enterprise servers, storage, and IT services | Internal cloud teams built private clouds replacing vendor systems | Vertical integration into infrastructure previously purchased |
| Facebook/Meta (mobile platform reliance) | Meta’s VR/AR device program (Quest) | Reliance on Apple/Google mobile platforms | Meta developed its own hardware/software ecosystem | Competes with platform providers to escape dependency |
| Telcos buying vendor gear | AT&T, Verizon, Deutsche Telekom (open RAN initiatives) | Proprietary RAN equipment from Nokia/Ericsson | Built open-source or disaggregated RAN alternatives | Reduces dependence on traditional equipment vendors |
| IBM/Intel server vendors | Google, Amazon, Facebook data-center hardware | Commodity servers | Hyperscalers designed their own servers and power systems | Competing designs via OCP and private supply chain |


That also can be seen in the market for neural processing units, where former customers Google and Amazon have now emerged as important suppliers of NPUs used in place of graphics processing units, even if many of the use cases are internal to those firms. 


Companies such as Google (Tensor Processing Units) and Amazon (Inferentia/Trainium chips) primarily use their NPUs internally or sell access through their cloud services, obscuring any direct "retail" market share comparison.


| NPU Segment / Use Case | Dominant Architecture / Product Type | Market Share Context & Key Vendors | Key Vendor Dominance / Market Share Notes |
| --- | --- | --- | --- |
| Data Center / Cloud AI (Training, Inference) | GPU (for training, general-purpose AI); ASIC/custom NPUs (for specific inference) | This segment includes hyperscalers using hardware internally (like Google's TPUs) or for cloud-based services. | NVIDIA holds a dominant share (often cited as 90%+ for high-end AI training accelerators/GPUs, which are often grouped with NPUs). Google (TPU), Amazon (Trainium/Inferentia), and AMD (Instinct) are the primary competitors in the custom/dedicated space. |
| Edge Devices (Retail/B2C) | Integrated NPUs (AI SoCs) and dedicated edge NPUs | This segment covers chips embedded in consumer products for on-device AI (smartphones, PCs, smart home, automotive). | Qualcomm (Snapdragon), Apple (A/M-series chips), and Samsung (Exynos) dominate the smartphone/tablet space, which accounts for the largest application share (e.g., 37.6% of the total NPU market application in 2024). Intel (Core Ultra) is a major player in the PC NPU market. |
| Edge Market Share (Application) | Smartphones & Tablets | The largest single application area, driving the growth of retail NPU units. | An estimated 35%–40% of NPU market application share is in retail, for smartphones and tablets. |
| Data Center NPU Share (Product Type) | Data Center NPUs | Market share based on the volume of processing units deployed in large-scale data centers. | Data center NPUs held an estimated 51.6% of the neural processor market in 2024, and much of that represents internal consumption by Amazon and Google. |


Google designs and uses TPUs for its own services, including Search, Translate, and Gemini AI. But Google does make its TPUs available to Google Cloud Platform customers as a service, though it does not sell the chips.


Yes, Follow the Data. Even if it Does Not Fit Your Agenda

When people argue we need to “follow the science” that should be true in all cases, not only in cases where the data fits one’s political pr...