Thursday, February 29, 2024

AI: Is it "Different This Time"?

The phrase "it's different this time" has often been uttered when financial markets seem to be behaving in dangerously inflated ways. It refers to a belief that past trends or patterns won't hold true in the present situation and that the current situation is unique and exempt from the lessons learned from historical events.


Some of us last encountered the phrase during the dotcom era, when some argued historical data and traditional methods of analysis were not relevant due to unique and unprecedented circumstances.

As a practical matter, it meant some investors justified continued investment despite unsustainable valuations and the absence of clear revenue or business models.

It might also be seen in periods of merger-and-acquisition frenzy, when high valuation multiples are justified on the basis of asset scarcity.

For the most part, we are not hearing that about firms in the artificial intelligence area. Investors see higher valuations, of course, but are not saying traditional criteria are irrelevant. That is helpful. 

On the other hand, the current high interest in artificial intelligence companies has led some to draw parallels to the dotcom bubble of the late 1990s, where technology stocks saw explosive growth followed by a dramatic crash.

So is AI in an investment bubble? To be sure, many believe AI is the "next big thing," as was the internet. And despite the dotcom excesses, we generally now recognize the internet as a general purpose technology capable of enabling lots of innovation and industries and firms. 

So, excesses aside, a general purpose technology (GPT) is one that has the potential to significantly affect a wide range of industries and sectors across an entire economy, often including massive disruption of existing industries.

Past GPTs have included fire, agriculture, steam power, electricity, the internal combustion engine and mass production. 

Many believe AI is going to emerge as a GPT. And that is where some of the apparent similarities to the dotcom boom and crash might apply. 


The internet boom and bust of the dotcom era featured exuberant expectations about revenue growth; valuation excesses; over-investment; often a focus on “growth” or market share over “profits”; and unclear business models or a lack of sufficient distinctiveness or competitive “moats.”


So much investment capital was wasted. 


To be sure, there is a strong likelihood of over-investment in AI as well. On the other hand, monetization mechanisms are better understood. Infrastructure suppliers of course already are making money selling the products essential to building and operating AI models and apps. 


And that includes the expected development of system integration, consulting and managed services, all provided as professional services.


Others in the AI area, such as OpenAI, earn revenue by licensing their models.


Beyond that, subscription models already are proliferating, allowing direct monetization of AI features and apps. Microsoft has been an early leader in that regard, adding AI subscriptions to its productivity suites. Others will follow. 


So AI software-as-a-service, infrastructure-as-a-service and data-as-a-service models already exist.


Other revenue models will be more indirect, and may take some time to develop. Those might include pay-per-use; micro-transactions; performance-based revenue (revenue tied to specific cost savings, for example); other outcome-based models (revenue earned by enabling specific return-on-investment objectives); or other forms of profit sharing or licensing.


Eventually, AI will support many types of retail transactions or advertising. But the benefits will often be indirect, making the specific AI contribution harder to measure. 


The point is that the analogies to the dotcom era are only partially applicable. Yes, over-investment and business models without “moats” could happen.


Still, we already see tangible and sustainable business models in action, clearly in infrastructure but also in a growing number of subscription-based monetization models.


Though many will argue that only a handful of large language models will ultimately lead the market, it already seems that specialized, smaller models optimized for particular firms or industries also are developing.


Tuesday, February 27, 2024

The Cure for "High GPU Prices" and Supply Constraints is "High GPU Prices"

Perhaps you share my astonishment that Nvidia's valuation in February 2024 is higher than that of Google or Amazon, powered by sales of graphics processing units that are in short supply at the moment. 


But markets change. Nvidia’s biggest customers (Meta, Microsoft, AWS, Google and others) now have huge incentives to create their own GPUs and especially specialized processors to support AI operations. 


Obviously, Nvidia’s revenue is its customers’ cost. So there are important financial reasons (saving money) and strategic reasons (avoiding reliance on a single supplier; optimizing chips for their own specific needs) for big customers to create their own chips.


Company | Study/Article Name | Publication Date | Publishing Venue | Estimated Spending on Nvidia GPUs
--- | --- | --- | --- | ---
Microsoft | Nvidia's AI Boom Lifts Chipmaker's Stock to Record High | Jan 26, 2023 | Seeking Alpha | $3 billion - $5 billion annually
Microsoft | Microsoft Azure Spending on Nvidia GPUs to Reach $10 Billion by 2025 | Nov 21, 2023 | DigiTimes | Up to $10 billion by 2025
Meta | Meta Platforms Spent $10 Billion on Nvidia GPUs in 2022 | Oct 27, 2023 | The Information | $10 billion in 2022
AWS | AWS Spending on Nvidia GPUs Likely Exceeds $5 Billion Annually | Feb 23, 2024 | MarketWatch | $5 billion+ annually (estimated)
Google | GCP's AI Spending on Nvidia GPUs Could Reach $4 Billion by 2025 | Jan 12, 2024 | Seeking Alpha | Up to $4 billion by 2025 (estimated)


The point is that markets change in response to supply and demand. If the "cure for low prices is low prices," we might also note the "cure for high prices is high prices." In other words, low prices drive suppliers and competitors out of a market and can increase demand, so the market rebalances.

Low prices also encourage cost-side innovations that restore profit margins and remove inefficient competitors from the market.

High prices and excess demand for GPUs encourage new competitors, thereby increasing supply and leading to lower prices. As Jeff Bezos of Amazon is fond of saying, "your margin is my opportunity." So GPU supply will rebalance.

But specialized chips also seem to be of growing importance, another trend that will encourage hyperscale cloud computing providers to continue developing their own AI-related chips, beyond GPUs. 


Generative AI models do create demand for chips specifically optimized for tasks like text and image generation that can differ from general-purpose GPUs and may offer better performance and efficiency for specific tasks.


Also, specialized chips might be used for different stages of the AI workflow. Training might use dedicated training chips while inference could happen on edge processors (including onboard the device) optimized for power efficiency.


As AI applications move to the edge, chips will need to be smaller, more power-efficient and optimized for specific tasks like sensor data processing or local inference. Think of the ways ASICs and custom chips have been used for decades. 


Company | Chip Name | Purpose | Status | Technology
--- | --- | --- | --- | ---
AWS | Trainium | Training large language models and other AI workloads | Available | High-bandwidth memory, custom compute cores
AWS | Inferentia | Inferencing pre-trained models | Available | Tensor cores, optimized data paths
AWS | AWS Annapurna Labs | Various custom processors for networking, storage, and other infrastructure | Available/Developing | ARM-based cores, various configurations
Meta | AI Engine | Accelerate various AI workloads, including training and inference | Available | Custom architecture with heterogeneous compute cores
Meta | Habana Gaudi 2 | Train large language models and other workloads | Available | High-bandwidth memory, tensor cores
Meta | Cambrian M1 | High-performance inference for specific tasks | Available | Specialized architecture for inference
Google | Tensor Processing Unit (TPU) | Training and inference for various AI models | Available/Developing | Custom architecture with tensor cores, multiple generations available
Google | Cerebras Wafer-Scale Engine | Train and run massive AI models | Available | Wafer-scale design, high interconnect bandwidth
Google | Sycamore | Quantum computing for specific AI tasks | Research/Development | Superconducting qubits
Microsoft | Azure Maia | Training large language models | Available/Developing | Optimized for large language models, sub-8-bit data types
Microsoft | Azure Cobalt | General-purpose cloud workloads | Available/Developing | ARM-based cores, optimized for cloud computing


One might therefore speculate that, in the future, the GPU market as led by Nvidia could change. Hyperscalers and others are likely to explore ways of using CPUs and other specialized chips rather than expensive GPUs to support their AI functions.


Nvidia certainly will move to protect itself from any diminution of its market share in GPUs, partly by becoming a “GPU as a service” provider itself, and partly by increasing its chip fab operations to support others building their own custom chips, or by creating such chips on demand for its customers and prospects.



Will AI See a Repeat of Wild Dotcom Over-Investment?

The current high interest in artificial intelligence companies has led some to draw parallels to the dotcom bubble of the late 1990s, where technology stocks saw explosive growth followed by a dramatic crash. The danger is over-investment in platforms, apps or use cases of questionable value.


To be sure, that will happen. At such an early stage, we cannot predict ultimate winners and losers, and perhaps seven out of 10 bets will be lost.


But underlying those trends was the emergence of the internet, something we generally now recognize to have been a general purpose technology capable of enabling lots of innovation and industries and firms.  


A general purpose technology (GPT) is one that has the potential to significantly affect a wide range of industries and sectors across an entire economy, often including massive disruption of existing industries. Past GPTs have included fire, agriculture, steam power, electricity, the internal combustion engine and mass production.


Many believe AI is going to emerge as a GPT. And that is where some of the apparent similarities to the dotcom boom and crash might apply. 


The internet boom and bust of the dotcom era featured exuberant expectations about revenue growth; valuation excesses; over-investment; often a focus on “growth” or market share over “profits”; and unclear business models or a lack of sufficient distinctiveness or competitive “moats.”


So much of the investment capital was wasted. But there is an argument to be made that the "expected" AI win rate is about what one would expect from investment banking in general: more failures than successes.


The issue is "bubble" style over-investment, and there the parallels with the dotcom over-investment might not be so large.


Compared to the dotcom period, AI monetization mechanisms are better understood. Infrastructure suppliers of course already are making money selling the products essential to building and operating AI models and apps. 


And that includes the expected development of system integration, consulting and managed services, all provided as professional services.


Others in the AI area, such as OpenAI, earn revenue by licensing their models.


Beyond that, subscription models already are proliferating, allowing direct monetization of AI features and apps. Microsoft has been an early leader in that regard, adding AI subscriptions to its productivity suites. Others will follow. 


So AI software-as-a-service, infrastructure-as-a-service and data-as-a-service models already exist.


Other revenue models will be more indirect, and may take some time to develop. Those might include pay-per-use; micro-transactions; performance-based revenue (revenue tied to specific cost savings, for example); other outcome-based models (revenue earned by enabling specific return-on-investment objectives); or other forms of profit sharing or licensing.


Eventually, AI will support many types of retail transactions or advertising. But the benefits will often be indirect, making the specific AI contribution harder to measure. 


The point is that the analogies to the dotcom era are only partially applicable. Yes, over-investment and business models without “moats” could happen.


Still, we already see tangible and sustainable business models in action, clearly in infrastructure but also in a growing number of subscription-based monetization models.


Though many will argue that only a handful of large language models will ultimately lead the market, it already seems that specialized, smaller models optimized for particular firms or industries also are developing.


The point is that wild or excessive investment in firms without viable business models is much less of an issue with AI, which has reasonable and justified implications for reducing operating costs for almost any business.


OpenAI Sora Scares Hollywood for a Reason


All 40 of these videos were generated directly by OpenAI Sora from prompts such as "A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about."

Sunday, February 25, 2024

Does AI Create a New Rationale for "Smartphone as a Service?"

How far are we away from "smartphone as a service," where the cost of the device, plus mobile service plus AI is a single bundle with a recurring cost?


“Artificial intelligence” smartphones are likely to pose issues--or create opportunities--related to processing tasks and memory on devices, and to the use of edge computing or cloud computing resources. That, in turn, might create an opening for different ways of envisioning device and service business models.


Regarding on-board resources, machine learning models might require more on-device memory just to load, even before we get to running them, although compressed models surely are coming.


Processing also will be an issue. Running an ML model arguably requires more unique arithmetic logic blocks than your typical CPU, so specialized processors are likely necessary.


Smartphone processing likely will continue to be constrained by power consumption and heat generation limits, as well, so there will be some limits on on-board processing power. 


Leveraging cloud or edge computing obviously is a potential solution. Processing of some tasks--such as real-time language translation, camera features and voice-to-text--will continue to make sense as an “on-board processing” capability.


Other features might continue to make sense as an “edge- or cloud-supported” capability. 


The issue is that this could reshape needs for end-user data plan features and higher-speed, latency-bounded networks. Roaming costs also are an issue.


So even if on-board processing is, in principle, ideal, it might not be practical for all devices (mid-range and low-end devices, for example). And heat and processor cost issues must be considered as well. 


As a marketing issue, “subscription phones and service” might need a rethink. To some extent, consumers often take advantage of subsidized phone offers from their service providers. So a service plan with a two-year contract that includes the cost of the device in the recurring cost already is a form of “phone as a service.”


Subscription plans for advanced AI service (Google’s Gemini or Microsoft’s Copilot) already exist. So we might see a rethink of possible product bundles that include, on a subscription basis, the device, the AI capabilities (on-board plus cloud or edge) and the recurring service plan costs. 


Creating such bundles should be easier for consumers to understand once we develop more valuable AI-enhanced apps and features usable on smartphones. People might expect AI features such as camera performance or image editing and translation services to be “bundled” with the device. 


But, eventually, some compelling additional use cases could--and should--develop that require an AI service plan that relies on cloud and edge computing, faster connections and larger data allowances. So think of a 5G service plan using mid-band spectrum (for speed); unlimited data usage (so the external cloud and edge processing can be used); plus an “AI device” supplied on a subscription basis, with a “new” device supplied every two or three years.


Aside from all the practical details of figuring out the service provider’s cost to do so, we still need some new “killer apps” that make the purchase of an AI service plan such as Gemini or Copilot a reasonable and necessary investment by the consumer. 


As a business problem, this is a “logical bundle” issue. What features (device, AI, recurring service cost, features and apps) will make sense for many customers when all of those features are a subscription, not just the mobile service plan and the AI? 


Right now, it is not so clear what new killer value requiring AI--and therefore a more-powerful device plus remote processing--will be the trigger. Still, once one or a few such use cases do develop, with high customer interest, it will be easier to conceive of, and sell, new bundles of device, AI and service, for one recurring monthly price.


AI Impact on Data Centers

source: PTC