Thursday, April 3, 2025

AI Assistant Revenue Upside Mostly Will be Measured Indirectly

Amazon expects Rufus, its AI shopping assistant, to indirectly contribute more than $700 million in operating profits this year, Business Insider reports. 


The expected upside would come in the form of "downstream impact" (DSI), a metric Amazon uses to estimate a product or service's potential to generate additional consumer spending across Amazon's vast offerings. Rufus itself, of course, generates no direct revenue. 


Rufus product recommendations might lead to more purchases on Amazon's marketplace, for example. The value of advertising embedded in Rufus content is another way the indirect revenue upside is measured. 


By 2027, however, Rufus is expected to contribute $1.2 billion in DSI profit, according to Amazon. 


“From broad research at the start of a shopping journey such as ‘what to consider when buying running shoes?’ to comparisons such as ‘what are the differences between trail and road running shoes?’ to more specific questions such as ‘are these durable?’, Rufus meaningfully improves how easy it is for customers to find and discover the best products to meet their needs,” Amazon says. 


Indirect measurement of that sort is likely how most firms will have to quantify revenue gains from their LLM assistants, as the sketch below illustrates. 
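Amazon has not published its downstream-impact methodology, but a common way to quantify indirect contribution is to compare the incremental spending of assistant users against a matched control group and apply an operating margin. Here is a minimal sketch of that kind of calculation; the function name and every figure in it are invented for illustration, not Amazon's actual method:

```python
# Hedged sketch of "downstream impact" style indirect attribution.
# All names and numbers are illustrative, not Amazon's actual method.

def downstream_impact(user_spend: float, control_spend: float,
                      n_users: int, margin: float) -> float:
    """Estimate the indirect profit contribution of an AI assistant.

    user_spend:    average annual spend per assistant user
    control_spend: average annual spend per matched non-user
    n_users:       number of assistant users
    margin:        operating margin applied to incremental revenue
    """
    incremental_spend = user_spend - control_spend
    incremental_revenue = incremental_spend * n_users
    return incremental_revenue * margin

# Example: 50M users spending $100 more each, at a 5% margin,
# would be credited with $250M of indirect operating profit.
print(downstream_impact(1_100.0, 1_000.0, 50_000_000, 0.05))
```

The hard part in practice is the control group: assistant users are self-selected, so naive comparisons overstate the effect, which is one reason to treat such figures with some skepticism.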


| Use Case | Description | Revenue Impact |
| --- | --- | --- |
| Customer Support Automation | AI chatbots handle FAQs and troubleshooting, reducing customer service costs. | Lowers operational costs and improves customer retention. |
| Lead Generation and Qualification | AI assistants engage website visitors, collect data, and qualify leads. | Increases conversion rates and enhances sales pipeline efficiency. |
| E-commerce Upselling and Cross-Selling | AI recommends relevant products based on user behavior and preferences. | Boosts average order value and sales. |
| Content & SEO Optimization | AI generates blog posts, product descriptions, and metadata for SEO. | Increases organic traffic, improving brand visibility and sales. |
| Personalized Marketing and Retargeting | AI-driven chatbots deliver personalized offers and recommendations. | Enhances engagement, conversion rates, and repeat purchases. |
| Employee Productivity Enhancement | AI automates repetitive tasks (e.g., email drafting, summarization, scheduling). | Saves time, allowing employees to focus on high-value tasks. |
| Market Research and Insights | AI collects and analyzes customer feedback for business insights. | Improves decision-making and product-market fit. |
| Training and Onboarding | AI-based interactive training modules for new employees. | Reduces onboarding time and training costs. |
| Subscription and Membership Services | AI chatbots engage users to promote premium subscriptions. | Increases subscription revenue and customer lifetime value. |
| Churn Reduction and Customer Retention | AI proactively engages users before they disengage or cancel services. | Lowers customer acquisition costs by improving retention rates. |


Are Large Language Models Really "10 Times" More Energy Consumptive than Search?

Most of us have heard claims that a single chatbot (Large Language Model or generative AI system) query is significantly more energy-intensive (often cited as roughly 10 times more) than a traditional search query.


Most of us could agree that the claim about energy intensity is directionally correct for most systems at present, though it is perhaps less of a long-term issue, since energy intensity is virtually certain to fall over time. 


Computational complexity obviously is an issue. Traditional search uses pre-computed indexes. Much of the “heavy lifting” (indexing the web) is done beforehand.


Large language models run a generative process through a massive neural network (often with billions or trillions of parameters). Each query requires significant computations to understand a prompt and generate a novel response. This "inference" process is inherently more computationally demanding per query than retrieving indexed information.  


Early energy estimates suggested a "10x" more energy metric. These estimates looked at the computational operations (FLOPs, floating-point operations) required for each type of task and translated that into potential energy use based on typical hardware efficiency.
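To see how such estimates are produced, here is a back-of-the-envelope conversion from FLOPs to watt-hours. Every constant is an assumption chosen for illustration (a GPT-3-scale model on an A100-class accelerator), not a measured figure:

```python
# Back-of-the-envelope FLOPs-to-energy estimate for one LLM query.
# Every constant below is an assumption, not a measurement.

PARAMS = 175e9                # assumed model size (GPT-3 scale)
TOKENS = 1000                 # assumed tokens processed per query
FLOPS_PER_TOKEN = 2 * PARAMS  # rough rule of thumb for decoder inference

PEAK_FLOPS_PER_SEC = 312e12   # e.g., Nvidia A100 dense FP16 peak
POWER_WATTS = 400             # assumed accelerator board power
UTILIZATION = 0.3             # assumed fraction of peak actually achieved

total_flops = FLOPS_PER_TOKEN * TOKENS
flops_per_joule = PEAK_FLOPS_PER_SEC * UTILIZATION / POWER_WATTS
energy_wh = total_flops / flops_per_joule / 3600  # joules -> watt-hours

print(f"~{energy_wh:.2f} Wh per query")  # ~0.42 Wh under these assumptions
```

Change any assumption (model size, utilization, hardware) and the answer moves by an order of magnitude, which is why published per-query estimates span roughly 0.1 Wh to 10 Wh.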


But that probably already is an out-of-date way to make the comparisons. As search engines increasingly integrate generative AI into search, the difference between an LLM query and a search query is likely narrowing quite substantially, in terms of energy consumption. 


| Study/Source | Year | Model(s) Analyzed (Examples) | Key Finding / Estimate per Query | Context / Notes |
| --- | --- | --- | --- | --- |
| Luccioni, Viguier, & Ligozat (NeurIPS 2023; originally arXiv 2022) | 2022/2023 | BLOOM (176B parameters) | Estimated inference energy consumption for BLOOM, varying significantly by hardware (e.g., A100 vs. T4 GPUs). Provided a methodology for carbon-footprint calculation. | Focused on BLOOM, an open model. Emphasized the impact of hardware and location (electricity grid mix) on the carbon footprint. Did not give a single universal Wh/query figure. |
| Patterson et al. (Google Research) (arXiv 2021) | 2021 | LaMDA, MUM (conceptual/internal Google models) | Not a direct per-query energy figure, but stated "some models used by Search are already large," and newer AI features (like MUM) are more compute-intensive. | Context was a broader discussion of model efficiency and training costs. Confirms Google's internal view that advanced AI features increase computational demands over basic search. |
| De Vries (Digiconomist) (Joule, 2023 & ongoing analysis) | 2023 | General LLMs (e.g., ChatGPT/GPT-3 scale) | Estimated a single ChatGPT query could consume ~0.001-0.01 kWh (1-10 Wh) on average, potentially much higher depending on complexity and hardware. Compared this to a Google search (~0.0003 kWh, or 0.3 Wh). | Estimates based on assumed hardware (such as Nvidia A100 GPUs), server power usage, and query processing time. Acknowledges high uncertainty. Helped popularize the ~10x search comparison. |
| Gupta et al. (Stanford HAI) (working paper/estimates) | 2023 | Conceptual LLM (e.g., GPT-3 scale) | Estimated generating a single image with a diffusion model might consume as much energy as charging a smartphone. Extrapolated that text generation is also energy-intensive. | Focused partially on image-generation AI but discussed text AI costs. Used comparisons to relatable actions (phone charging) to illustrate magnitude. Emphasized that inference costs add up globally. |
| Google public statements/reports (various) | Ongoing | Google's AI services (incl. Search Generative Experience) | Repeatedly stated that generative AI queries are more computationally intensive, and thus consume more energy, than traditional search queries. No specific public Wh/query figure released. | Confirms the general premise from the provider's side. Focuses on efforts to improve efficiency via hardware (TPUs) and software optimization. |
| University research (various studies citing FLOPs) | Ongoing | Various (BERT, GPT variants, etc.) | Often estimate FLOPs (floating-point operations) per query/token; a query might require trillions of FLOPs. Converted to energy using hardware efficiency (joules/FLOP), this yields estimates often in the 0.1 Wh to 10 Wh range, depending on assumptions. | Often theoretical calculations based on model architecture and assumed hardware specs (e.g., joules per FLOP for a specific GPU). Highly variable. |


Also, models are becoming more energy efficient, as tends to happen with all computing processes as they mature. 


So, at this point, we really do not know much about energy consumption, except that on today’s hardware, using today’s algorithms and compute intensity, it is logical to believe more energy is required because more computation is required. 


Still, logic also suggests that simple queries will require less computation, and therefore less energy. 

A simple classification task, retrieving a cached answer, or generating a very short response using a smaller, specialized model might have an energy cost that is not dramatically higher than a complex search operation.
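That intuition is why serving stacks increasingly route easy queries away from the largest models. Here is a simplified, hypothetical routing sketch; the `small_model` and `large_model` functions are stubs standing in for real serving back ends, and the word-count threshold is an arbitrary stand-in for a real complexity classifier:

```python
# Hypothetical routing layer: cheap answers for cheap questions.
# The stubs and the word-count threshold are invented for illustration.

CACHE: dict[str, str] = {}

def small_model(query: str) -> str:
    return f"[small-model answer to: {query}]"  # stub

def large_model(query: str) -> str:
    return f"[large-model answer to: {query}]"  # stub

def answer(query: str) -> str:
    # 1. A cache hit costs almost no compute at all.
    if query in CACHE:
        return CACHE[query]
    # 2. Short, simple prompts can go to a smaller, cheaper model.
    if len(query.split()) < 8:
        response = small_model(query)
    # 3. Only complex prompts pay the full large-model energy cost.
    else:
        response = large_model(query)
    CACHE[query] = response
    return response

print(answer("are these durable?"))  # routed to the small model
```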


But actual consumption is certain to vary by model, model architecture, data center and hardware platform. And since no “AI as a service” supplier operating at scale seems to have released any actual studies on the subject, we might assume they already know the energy consumption increase is significant.


Wednesday, April 2, 2025

AI Might Affect the Whole Economy, But Chip Ecosystem Not So Much

Should artificial intelligence emerge as a genuine general-purpose technology, the ramifications for the computing industry will be huge: for chip design and capabilities, for fabrication, for the relative importance of processing functions, and for possible changes in the value chain related to hardware versus software and types of software. 


On the other hand, markets change all the time. It seems less clear that AI-driven changes are qualitative, at the chip end of the business, compared to the software part of the value chain. 


Taiwan’s chip fabrication dominance, largely driven by TSMC, has been tied to the Intel ecosystem for decades, for example. Intel’s x86 architecture powered the PC and server markets. 


But AI arguably is not driven by the Intel ecosystem. As computing pivots toward AI, GPUs, and accelerators like TPUs, the ecosystem arguably is liable to shift. 


Looking only at the “digital infrastructure” value chain (chips, servers, models, training and then the AI impact on software value), chip manufacturing and design likely will continue to represent 55 percent to 65 percent of value within the infrastructure part of the value chain, as the simple tally after the table below illustrates.


| Value Chain Segment | Estimated % of Value (Revenue Share) | Key Players & Examples |
| --- | --- | --- |
| AI Chip Manufacturing | 35-40% | TSMC, Samsung, Intel Foundry |
| AI Chip Design | 20-25% | NVIDIA, AMD, Google, Apple, Amazon (AWS Trainium & Inferentia) |
| Cloud & AI Infrastructure | 15-20% | AWS, Microsoft Azure, Google Cloud, Oracle |
| AI Model Development & Training | 5-10% | OpenAI, Anthropic, Meta, Google DeepMind |
| Enterprise AI Software & Applications | 10-15% | Microsoft (Copilot), OpenAI (ChatGPT API), Salesforce, Adobe, ServiceNow |
| Edge AI & AI-Powered Devices | 5-10% | Tesla (Autopilot AI), Apple (Neural Engine), Qualcomm (Snapdragon AI) |
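As a sanity check on the 55-percent-to-65-percent claim, summing the low and high ends of the two chip rows in the table above:

```python
# Low/high revenue-share estimates from the table above, in percent.
segments = {
    "AI Chip Manufacturing": (35, 40),
    "AI Chip Design": (20, 25),
    "Cloud & AI Infrastructure": (15, 20),
    "AI Model Development & Training": (5, 10),
    "Enterprise AI Software & Applications": (10, 15),
    "Edge AI & AI-Powered Devices": (5, 10),
}

chip_segments = ("AI Chip Manufacturing", "AI Chip Design")
low = sum(segments[s][0] for s in chip_segments)
high = sum(segments[s][1] for s in chip_segments)
print(f"Chips (design + manufacturing): {low}-{high}% of infra value")
```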


Obviously, a “full” value chain would have to include the value contributed by every market for products, used by people and businesses, that includes AI as part of the solution; ultimately, that will be virtually every part of an economy. 


If the x86 ecosystem was driven by standardization, AI, so far, seems less so. AI workloads use, and perhaps can require, specialized silicon, including Nvidia graphics processing units and Google’s Tensor Processing Units.


That doesn’t change some fundamental roles. Chip designers might still be separate from chip manufacturers. Value still will exist in intellectual property and manufacturing efficiency. Some chip run volumes might be smaller, and manufacturing venues could shift away from Taiwan. 


Markets evolve over time, so this might be more a quantitative than qualitative shift. Nobody seems to believe the roles of chip design and manufacturing will fuse or that the need for chip fabs will go away as priorities shift to accelerators and parallel processing. 


Sure, the focus might shift to AI products rather than x86 processors. So the business is reframed rather than revamped. 


We probably cannot say the same about consumer and business software. In the realm of software, AI might indeed be poised to “change everything.” “AI features” are not simply being added to existing software. 


AI might conceivably disrupt entire value propositions, change user expectations and alter the economics of software. AI should make it easier for non-technical people to produce apps, as the internet enabled many content creators to flourish outside the established media firms. 


The cost of creating content or code should drop. And the way people pay for software could keep evolving toward consumption-based pricing rather than flat-fee licenses. Advertising might also become a “pricing” tool, allowing the cost of using software to be defrayed by advertising exposure. 
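To make the pricing shift concrete, here is a toy comparison of flat-fee licensing versus consumption-based (metered) pricing; both rates are invented for illustration:

```python
# Toy comparison of flat-fee licensing vs. consumption-based pricing.
# Both rates are invented for illustration.

FLAT_FEE_PER_MONTH = 30.00    # assumed per-seat monthly license
PRICE_PER_1K_TOKENS = 0.002   # assumed metered rate

def monthly_cost(tokens_used: int) -> dict:
    metered = tokens_used / 1000 * PRICE_PER_1K_TOKENS
    return {"flat": FLAT_FEE_PER_MONTH, "metered": round(metered, 2)}

# Light users pay far less under metering; heavy users pay more.
print(monthly_cost(500_000))     # {'flat': 30.0, 'metered': 1.0}
print(monthly_cost(50_000_000))  # {'flat': 30.0, 'metered': 100.0}
```

That asymmetry is one reason consumption pricing tends to win when usage varies widely across customers.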


For consumers, AI arguably leads to more dynamic, adaptive experiences, shifting focus from manual input to automation and personalization. For business software, the ability to make decisions is probably more important. 


In either case, there might be an argument to be made that software now begins to be experienced more as a “service.” 


Beyond that, software becomes more adaptive, learning from user behavior. Software also becomes less of a tool and more of an “assistant.” 


And it always is possible that whole new categories of apps will be created, as once was the case for search and social media, or for ride-hailing and food delivery.


Monday, March 31, 2025

Why Regulatory Risk Can Influence Model Responses

As someone who uses language models including Gemini, Perplexity and Claude for various research tasks including some that seek to summarize market trends, I have often found that I get answers that use different sources, which is not unexpected.


What has been unexpected is Gemini's frequent refusal to provide estimates the other engines do supply. That leads me to believe Gemini, in particular, uses algorithms intended to limit its use in ways that might create regulatory or other exposure for Alphabet. 


My guess is that Google’s higher regulatory and antitrust exposure, compared to the firms behind Perplexity or Claude, for example, leads to guardrails that instruct the chatbot to avoid anything that might be construed as “financial advice,” even when no personally identifiable information is involved and the questions relate to industry market sizes, revenues and so forth. 


The issue here is not that different training data has been used, that data recency varies or that models use different underlying architectures, algorithms and fine-tuning techniques. To be sure, those differences mean that, even when working from similar base data, different models can generate slightly different conclusions; refusals to answer are a different matter. 


Refusals to forecast (like you sometimes see from Gemini) are “typically due to built-in safety protocols designed to prevent the AI from giving unlicensed financial advice, acknowledge the inherent uncertainty of markets, and avoid potential liability and the spread of misinformation,” Gemini itself says, when asked about “refusal to answer” responses.
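Such guardrails are typically implemented as a policy layer that screens prompts (or draft responses) before anything reaches the user. Here is a simplified, hypothetical sketch of that kind of filter; the patterns and refusal wording are invented, not Google's actual implementation:

```python
import re

# Hypothetical pre-response guardrail of the kind Gemini describes.
# The patterns and refusal wording are invented for illustration.

FINANCIAL_ADVICE_PATTERNS = [
    r"\bshould i (buy|sell|invest)\b",
    r"\b(stock|share) price (forecast|prediction|target)\b",
    r"\bwill .* (outperform|go up|go down)\b",
]

REFUSAL = ("I can't provide financial forecasts or investment advice. "
           "I can share historical, publicly reported figures instead.")

def guarded_answer(prompt: str, model_answer) -> str:
    lowered = prompt.lower()
    if any(re.search(p, lowered) for p in FINANCIAL_ADVICE_PATTERNS):
        return REFUSAL              # the policy layer overrides the model
    return model_answer(prompt)     # otherwise, answer normally

print(guarded_answer("Should I buy NVDA?", lambda p: "..."))  # refused
```

A filter like this is blunt by design, which would explain why it also catches benign questions about industry market sizes and revenue estimates.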


Sunday, March 30, 2025

"World" Wants to be a SuperApp: Who Will Want to Use It?

World is a mobile app, built around a cryptocurrency, that is designed to function as a Super App: a multipurpose app (similar to WeChat) combining a social network with commerce and currency support. 

People might see significance in Sam Altman's (OpenAI) involvement. Elon Musk likewise is said to be interested in developing a Super App built around X (now merged with xAI). 

Such apps typically start with a core service, such as messaging or ride-hailing, and expand to include various other features such as payments, e-commerce and social networking.

(illustration source: Emerline)

Super Apps originally succeeded in Southeast Asia and China, so there obviously is some interest in whether the same popularity can be achieved in other markets, especially where banking, payment and e-commerce ecosystems already are robust. 

Super Apps also might have succeeded first in markets without a developed application ecosystem. The U.S. app market, for example, is mature and highly competitive. Dominant players already exist in virtually every key vertical that might be part of a single Super App:


* Social Media/Messaging: Meta (Facebook, Instagram, WhatsApp), TikTok, X (formerly Twitter), Snapchat.   

* E-commerce: Amazon, Walmart, Target, eBay, specialized retailers.

* Payments: PayPal, Venmo (owned by PayPal), Cash App (Block/Square), Zelle, Apple Pay, Google Pay, credit card companies.

* Transportation: Uber, Lyft.

* Food Delivery: DoorDash, Uber Eats, Grubhub. 


Observers might also note that success so far has been clearest in culturally homogeneous countries, though it remains unclear whether this is a functionally important issue or not. 



Convenience often is said to be the driver behind consumer acceptance of Super Apps, and some might note that adoption to date has been strongest in mobile-first countries in Asia, where banking and payment systems often are less developed. 

As has been the case for digital payment systems generally, countries and markets with well-developed payment and banking systems have not shown the same level of interest. It is likely no accident that Super Apps have not developed, or been widely used, in North America, Europe and other regions with good banking systems and ubiquitous fixed-network internet access.

Though "convenience" is said to be the driver of usage, perhaps it is something else that makes Super App usage attractive. A mobile-frist app might make lots of sense for users who are out and about quite a lot, living in highly-urban areas, always with a smartphone and culturally attuned to mobile app behavior rather than also having easy and convenient access to other device form factors (such as personal computers). 

How much value does conducting any number of operations or experiences (using social media, consuming content, conducting transactions) within a single app actually provide, in markets where the "best of breed" or preferred providers are separate apps?

It often has been claimed that consumers prefer bundled services such as internet access, video entertainment and communications (mobile and fixed) from a single provider because of "convenience." I have often thought that was incorrect.

It is not so much the "convenience" of having a single provider of those services, on one bill, as it is the price discounts such bundling provides. "Lower cost" is the value, not convenience. 

Similarly, it might be that some value other than mere convenience is the driver of Super App usage. Perhaps where it is successful, the various Super App component capabilities actually are the "best of breed" and preferred experiences. 

That said, if the components are "best of breed," or close to it, then the integration of functions (chat, talk, share, schedule, purchase) in a single app might be useful. 

Beyond consumer preferences for "best of breed" or "integrated" approaches, there are many potential softer issues that could limit Super App creation in some markets. U.S. antitrust and privacy advocates would likely try to prevent a dominant supplier in any single market niche from gaining too much share in adjacent and complementary markets. 
