Showing posts sorted by date for query GPT. Sort by relevance Show all posts

Saturday, April 12, 2025

AI Stack Will be Based on Layers, Just Like All Other App Ecosystems

“Digital infrastructure” tends to refer to physical assets like data centers, fiber optic networks, cell tower networks and cloud computing “as a service” providers, at least as seen by investors in and operators of such assets. 


And even the cloud computing providers (AWS, Azure, Google Cloud and so forth) are most often viewed as the key customers for digital infra providers, rather than core parts of the digital infrastructure business itself. 


However, as AI capabilities become more integral to business operations, there is an argument that digital infrastructure should also encompass AI as a service (AIaaS) capabilities which extend beyond hardware to include software models, language models, and AI platforms. 


Right now that is perhaps not the case, as investors and operators of language models tend to be distinct from traditional “digital infra” providers and investors. 


The “AI ecosystem” is different from the “digital infra” ecosystem, as the former necessarily includes chips, servers, language models, other AI systems and apps. 


As a practical matter, the expanded definition probably will not happen, for several reasons. For starters, financial analysts and operators of “computing” businesses tend to be different from “digital infra” analysts and operators. The former tends to be anchored by “software and hardware” interests, while the latter tends to be dominated by “real estate” interests. 


Foundation models and language models such as GPT-4, Gemini, and Claude will tend to be the province of “software” industry analysts and practitioners, “hardware” analysts and suppliers such as Nvidia, and the traditional “computing” ecosystem participants and analysts. 


In other words, the analysts and businesses that are in “middleware” and software stacks and tools, plus applications, are distinct from the analysts and businesses that supply “real estate” functions such as data transmission and data center operations. 


“Cloud computing” might be the function of many data centers, hosting both AI operators and all other software, but there still are key differences between the computing and connectivity real estate businesses and the software applications that use those real estate assets and platforms. 


So “infrastructure” and “applications” (including language models and AI as a service) will likely remain separate areas of interest. 


Thursday, April 3, 2025

Are Large Language Models Really "10 Times" More Energy Consumptive than Search?

Most of us have heard claims that a single chatbot (Large Language Model or generative AI system) query is significantly more energy-intensive (often cited as roughly 10 times more) than a traditional search query.


Most of us could agree that the statement about energy intensity is directionally correct for most systems at the present time, though perhaps not as big a long-term issue, as energy intensity is virtually certain to be reduced over time. 


Computational complexity obviously is an issue. Traditional search uses pre-computed indexes. Much of the “heavy lifting” (indexing the web) is done beforehand.


Large language models run a generative process through a massive neural network (often with billions or trillions of parameters). Each query requires significant computations to understand a prompt and generate a novel response. This "inference" process is inherently more computationally demanding per query than retrieving indexed information.  


Early energy estimates suggested a "10x" energy multiple. These estimates looked at the computational operations (FLOPs, or floating-point operations) required for each type of task and translated that into potential energy use based on typical hardware efficiency.
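The conversion described above, from operation counts to energy, can be sketched in a few lines. This is only an illustration under stated assumptions: the "2 FLOPs per parameter per generated token" rule is a common rough approximation for inference, and the model size, token count, board power, peak throughput and utilization figures are hypothetical inputs, not measurements of any real service.

```python
# A rough sketch of a FLOPs-to-energy estimate. All inputs are assumptions
# chosen for illustration; none are measured values.

def inference_flops(params: float, tokens: int) -> float:
    """Approximate forward-pass cost: ~2 FLOPs per parameter per generated token."""
    return 2.0 * params * tokens


def joules_per_flop(board_watts: float, peak_flops_per_s: float, utilization: float) -> float:
    """Energy per operation, given power draw and the fraction of peak throughput achieved."""
    return board_watts / (peak_flops_per_s * utilization)


def query_energy_wh(params: float, tokens: int, board_watts: float,
                    peak_flops_per_s: float, utilization: float,
                    pue: float = 1.0) -> float:
    """Estimated energy per query in watt-hours (1 Wh = 3600 J).

    pue is a hypothetical data center overhead multiplier (1.0 = no overhead).
    """
    joules = inference_flops(params, tokens) * joules_per_flop(
        board_watts, peak_flops_per_s, utilization) * pue
    return joules / 3600.0


if __name__ == "__main__":
    # Hypothetical scenario: 175B-parameter model, 500 generated tokens,
    # a 400 W accelerator with 312e12 FLOP/s peak running at 30% utilization.
    wh = query_energy_wh(175e9, 500, 400.0, 312e12, 0.30)
    print(f"Estimated energy per query: {wh:.2f} Wh")
```

With these particular assumptions the estimate comes out at roughly 0.2 Wh, but doubling the response length or halving the utilization doubles the result, which is why published per-query figures vary so widely.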


But that probably already is an out-of-date way to make the comparisons. As search engines increasingly integrate generative AI into search, the difference between an LLM query and a search query is likely narrowing quite substantially, in terms of energy consumption. 


| Study/Source | Year | Model(s) Analyzed (Examples) | Key Finding / Estimate per Query | Context / Notes |
| --- | --- | --- | --- | --- |
| Luccioni, Viguier, & Ligozat (NeurIPS 2023 - originally arXiv 2022) | 2022/2023 | BLOOM (176B parameters) | Estimated inference energy consumption for BLOOM, varying significantly based on hardware (e.g., A100 vs. T4 GPUs). Provided methodology for carbon footprint calculation. | Focused on BLOOM, an open model. Emphasized the impact of hardware and location (electricity grid mix) on the carbon footprint. Didn't give a single universal Wh/query figure. |
| Patterson et al. (Google Research) (arXiv 2021) | 2021 | LaMDA, MUM (Conceptual / Internal Google Models) | Not a direct per-query energy figure, but stated "some models used by Search are already large," and newer AI features (like MUM) are more compute-intensive. | Context was broader discussion of model efficiency and training costs. Confirms Google's internal view that advanced AI features increase computational demands over basic search. |
| De Vries (Digiconomist) (Joule, 2023 & ongoing analysis) | 2023 | General LLMs (e.g., based on ChatGPT/GPT-3 scale) | Estimated a single ChatGPT query could consume ~0.001-0.01 kWh (1-10 Wh) on average, potentially much higher depending on complexity & hardware. Compared this to a Google search (~0.0003 kWh or 0.3 Wh). | Estimates based on assumed hardware (like Nvidia A100 GPUs), server power usage, and query processing time. Acknowledges high uncertainty. Helped popularize the ~10x search comparison. |
| Gupta et al. (Stanford HAI) (Working Paper / Estimates) | 2023 | Conceptual LLM (e.g., GPT-3 scale) | Estimated generating a single image with a diffusion model might consume as much energy as charging a smartphone. Extrapolated that text generation is also energy-intensive. | Focused partially on image generation AI but discussed text AI costs. Used comparisons to relatable actions (phone charging) to illustrate magnitude. Emphasized inference costs add up globally. |
| Google Public Statements / Reports (Various) | Ongoing | Google's AI Services (incl. Search Generative Experience) | Repeatedly stated that generative AI queries are more computationally intensive and thus consume more energy than traditional search queries. No specific public Wh/query figure released. | Confirms the general premise from the provider's side. Focuses on efforts to improve efficiency via hardware (TPUs) and software optimization. |
| University Research (Various) (e.g., studies citing FLOPs) | Ongoing | Various (BERT, GPT variants, etc.) | Often estimate FLOPs (floating-point operations) per query/token. E.g., a query might require trillions of FLOPs. This can be converted to energy using hardware efficiency (Joules/FLOP), leading to estimates often in the 0.1 Wh to 10 Wh range depending on assumptions. | These are often theoretical calculations based on model architecture and assumed hardware specs (e.g., Joules per FLOP for a specific GPU). Highly variable. |
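The per-query figures attributed to De Vries in the table (1 to 10 Wh for a chatbot query, about 0.3 Wh for a search) can be turned into a multiple directly. The short sketch below does only that arithmetic; the constant and function names are illustrative, and the inputs are the table's estimates rather than measured values.

```python
# Ratio of estimated chatbot energy to estimated search energy, using the
# per-query figures quoted in the table. Names here are illustrative only.

SEARCH_WH = 0.3                      # estimated Wh per traditional search query
LLM_WH_LOW, LLM_WH_HIGH = 1.0, 10.0  # estimated Wh range per chatbot query


def energy_multiple(llm_wh: float, search_wh: float = SEARCH_WH) -> float:
    """How many times more energy a chatbot query uses than a search query."""
    return llm_wh / search_wh


def llm_wh_for_multiple(multiple: float, search_wh: float = SEARCH_WH) -> float:
    """Chatbot energy per query implied by a given multiple of search energy."""
    return multiple * search_wh


if __name__ == "__main__":
    low = energy_multiple(LLM_WH_LOW)
    high = energy_multiple(LLM_WH_HIGH)
    print(f"Implied multiple: {low:.1f}x to {high:.1f}x")
    print(f"A 10x multiple implies {llm_wh_for_multiple(10):.1f} Wh per chatbot query")
```

On these numbers the multiple runs from about 3x to about 33x, so "10x" is better read as a point inside a wide range than as a fixed constant.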


Also, models are becoming more energy efficient, as tends to happen with all computing processes that become more mature. 


So at this point, we really do not know much about energy consumption, except that, on today’s hardware, using today’s algorithms and compute intensity, it is logical enough to believe that more energy is required, as more computation is required. 


Still, logic also suggests that simple queries will require less computation, and therefore less energy. 

A simple classification task, retrieving a cached answer, or generating a very short response using a smaller, specialized model might have an energy cost that isn't dramatically higher than a complex search operation.


But actual consumption is certain to vary by model, by model architecture, by data center and hardware platforms. And since no “operating at scale” AI “as a service” supplier seems to have released any actual studies on the subject, we might assume they already know the energy consumption increase is significant.


Tuesday, March 25, 2025

Internet and AI: It's "Different This Time"

Investors, like all humans, tend to see the future through the lens of the past. And the thinking that "it is different this time" tends to be dangerous. So many have warned of a “dot com bubble” in artificial intelligence investment.


So some worry about the size of AI infra investments, compared to the near-term and immediate revenue generation from those investments. 

source: Seeking Alpha 


But investment in AI stands on much-firmer ground than did internet startup investing a quarter century ago. 

To be sure, the past emergence of general-purpose technologies, or GPTs (assuming AI will one day be deemed to be one), has led to over-investment. But it also is true that past GPTs did emerge as transformative and profitable, even if there was a period of investment excess. 


And it might also be correct to say concern over the present investment boom is not anchored in the magnitude of the investment so much as the magnitude of the near-term revenues. 


Would-be leaders of the coming AI markets have a different perspective, of course. They believe the future markets will be huge and will be led by just a handful of firms. So the risk of falling behind is commensurately great. 


There is a risk of over-investment, to be sure. But that might be deemed the lesser of evils. The risk of some temporary over-investment has to be weighed against the risk of losing out on permanent, long-term market leadership. 


Some over-investment is temporary and quantitative. Missing out on the chance to lead in AI markets is lasting and qualitative. 


| General-Purpose Technology | Time Period | Investment Boom/Bubble | “Boom” | “Bust” |
| --- | --- | --- | --- | --- |
| Railroads | 1840s | Railroad Mania | Rapid expansion of rail networks, speculative investments | Many companies went bankrupt, but rail infrastructure remained |
| Automobiles | Early 20th century | Automotive boom | Proliferation of car manufacturers, increased road construction | Industry consolidation |
| Internet | Late 1990s | Dot-com Bubble | Excessive speculation in internet-related companies, skyrocketing valuations | NASDAQ crashed 78%, many startups failed |
| Artificial Intelligence | 2020s-present | AI Boom | Massive investments in AI companies, high valuations for AI-related stocks | ? |


But there might also be many differences between the “internet” investment bubble of the last turn of the century and the current AI investment trend. For starters, AI infrastructure is so hugely expensive that most of the leading investors are deep-pocketed, profitable firms with established businesses and huge cash flows. 


The internet investment bubble was much more speculative, with a greater role played by venture capital and even retail investors, whereas AI investment is led by established technology giants and institutional investors. 


Internet firms often raised money on the assumption they would “find a business model.” Today’s AI leaders already have logical avenues to  monetize their investments, for the most part. And, for the most part, all those models hinge on vast improvements to the performance of existing use cases, not the creation of new use cases. 


| Aspect | Internet Bubble (Late 1990s) | AI Investment Wave (2020s) |
| --- | --- | --- |
| Investor Composition | Primarily speculative retail investors and venture capital | Predominantly established, profitable tech giants and institutional investors |
| Company Financials | Many dot-com startups with no proven business models | AI companies backed by companies with substantial existing revenue streams |
| Revenue Potential | Highly speculative, based on potential internet reach | More concrete, with clear applications in existing industries |
| Technology Maturity | Nascent internet infrastructure and capabilities | More advanced technological foundation with demonstrable AI capabilities |
| Valuation Basis | Primarily "eyeballs" and website traffic | Tangible metrics like AI model performance, integration potential, and efficiency gains |
| Market Penetration | Theoretical internet transformation | Proven AI applications across multiple sectors (healthcare, finance, technology) |
| Investment Sources | Retail investors, IPOs, venture capital | Large tech companies (Microsoft, Google, NVIDIA), institutional investors, strategic corporate investments |
| Economic Context | Emerging digital economy | Established digital infrastructure with clear productivity enhancement potential |
| Risk Profile | Extremely high speculative risk | More measured risk with clearer value proposition |
| Competitive Landscape | Numerous undifferentiated internet startups | Fewer, more technologically advanced AI companies with distinct competitive advantages |


And where internet metrics often were indirect or non-financial (usage, attention), AI metrics already are largely operationally quantifiable (time saved, code generated, output per hour increased), even if direct revenue increases are harder to measure. 


And even if some parts of the AI infrastructure must be created (graphics processing unit as a service; model training and inference as a service), most of the rest of the infrastructure (broadband internet access; high-capacity cloud computing and data transport facilities; high existing use of applications and devices) is basically in place. 


The internet investment occurred when broadband access had yet to be created; when smartphones were not common; search, social media, e-commerce and content streaming were still developing; and the widespread availability of cloud computing as a service had yet to develop. 


Perhaps the point is that the internet and AI investment contexts are quite different. There will be over-investment, but by many large, profitable firms that can take the short-term hit. The fate of many would-be startups remains unknown. 


The differences are significant. While firms might still falter for any number of reasons, monetization paths are quite a bit clearer; the finances of big investors are sturdier; and the use cases are clear, in principle. 


We do not have to guess at the value of AI embodied in robo-taxis and other autonomous vehicles, or in factory and other robots. We already know AI can enhance personalization efforts for all types of software and consumer processes. We are aware of the many ways AI can speed up output by automating repetitive processes. 


The value of the internet was far less clear in its early days.


Monday, March 24, 2025

Will AI Exceed Internet in Terms of Productivity Gains?

If the value of the internet had to be summed up in just one word, that would probably be “connectivity:” people to people; people to apps; people to devices; people to information; devices to devices. 


And though we cannot be fully sure yet, if we had to sum up artificial intelligence in just one word, that might be “augmentation” today, but most observers probably would agree we are on a road to some form of  “intelligence” eventually. 


That might raise the question of whether the internet or AI will have more impact on life, business and economies, though few seem to doubt that both are huge innovations.


Today, AI mostly augments human capabilities, which is not so different from other general-purpose technologies of the past that amplified human muscle power, sight, sound, mobility, memory or speech.


But it will be hard to determine whether communication is more important than decision making, or information access more valuable than knowledge creation.


| General-Purpose Technology (GPT) | Amplified Capability or Sense |
| --- | --- |
| Printing Press (15th century) | Knowledge Sharing, Memory |
| Steam Engine (18th century) | Physical Strength, Mobility |
| Electricity (19th century) | Vision (Lighting), Strength (Machines) |
| Telegraph & Telephone (19th century) | Communication (Hearing, Speech) |
| Automobile (19th-20th century) | Mobility, Speed |
| Radio & Television (20th century) | Hearing, Vision |
| Computers (20th century) | Calculation, Memory, Logic |
| Internet (20th century) | Communication, Knowledge Access |
| AI & Machine Learning (21st century) | Pattern Recognition, Decision-Making |


But many observers might already suggest that AI’s potential impact could be greater than the value added by the internet. While the internet broke geographic and physical limitations, connecting people and information faster,  AI has the potential to automate and augment human capabilities across a wider range of tasks and industries.   


AI has the potential to automate cognitive tasks, automate routine processes of all sorts and amplify pattern recognition in almost any sphere of life or industry.


The internet's productivity gains arguably were largely driven by increased connectivity and information access. AI's productivity gains are expected to come from advanced automation and “intelligent” systems. 

   

| Feature | Internet Impact | AI Impact (Expected) |
| --- | --- | --- |
| Primary Productivity Drivers | Increased information access. Enhanced communication. Automation of information-based tasks. E-commerce and digital markets. | Advanced automation of cognitive and physical tasks. Optimization of complex processes. Creation of AI-driven products and services. Data-driven decision making. |
| Quantifiable Productivity Gains | Significant increase in total factor productivity (TFP) during the "internet boom" of the late 1990s and early 2000s. Studies indicate a notable contribution to GDP growth. | Estimates vary widely: some predict a substantial boost to GDP within the next decade (e.g., Goldman Sachs projecting a potential 15% GDP boost). Studies suggest potential increases in annual TFP growth by 0.25 to 0.6 percentage points. Micro-level studies show very high increases in productivity in specific sectors. |
| Key Productivity Sectors | Information technology, Finance, Retail, Communication | Manufacturing, Healthcare, Finance, Transportation, Customer service, Software development |
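The table quotes two kinds of figures: a one-time level boost (a potential 15% GDP gain) and an annual growth increment (0.25 to 0.6 percentage points of TFP growth). A small compounding sketch shows what the annual increments add up to over a decade. It is arithmetic only: TFP and GDP are different measures, the ten-year horizon is an assumption, and the function name is illustrative.

```python
# Cumulative effect of a small annual growth increment, compounded over time.
# Illustrative arithmetic only; the 10-year horizon is an assumption.

def cumulative_gain_pct(annual_pp: float, years: int) -> float:
    """Percent increase in level after compounding annual_pp (in percentage points) for `years` years."""
    return ((1.0 + annual_pp / 100.0) ** years - 1.0) * 100.0


if __name__ == "__main__":
    for pp in (0.25, 0.6):
        gain = cumulative_gain_pct(pp, 10)
        print(f"+{pp} pp per year for 10 years -> about {gain:.1f}% higher level")
```

Compounded over ten years, the quoted increments imply a level roughly 2.5% to 6.2% higher, which gives a sense of how far apart the more cautious and more optimistic estimates are.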


China Approaches AI Diffusion as "Asian Tigers" Approached Economic Development

Among the differences between the U.S. and China frameworks for fostering widespread use of artificial intelligence are the roles of state s...