Monday, April 1, 2024

Are Massive Energy Consumption Forecasts for AI Data Center Operations Correct?

With the caveat that the following chart combines data center energy consumption with cryptocurrency mining, manufacturing, and electrification of fossil fuel operations, energy demand fueled in part by artificial intelligence is projected to double within six or seven years.


How that fits with pledges of net zero carbon dioxide footprints is unclear to some of us. 


source: Financial Times 


That noted, we can never rule out human creativity and technology advances, such as ways to dramatically lower energy consumption for AI-related operations. But some researchers suggest demand for data center energy could grow much more than a doubling over the next six years. 


Study Title | Publisher | Year | Projected Growth
"Power Consumption of Large Scale Data Centers" | Institute of Electrical and Electronics Engineers (IEEE) | 2019 | Up to 10x growth by 2030
"The State of AI 2022" | Stanford University | 2022 | Doubling of training compute every 3-4 months
"AI and the Environmental Footprint of Computing" | Cornell University | 2020 | Up to 1,000x growth
"The AI Revolution's Power Squeeze" | Nature: https://www.nature.com/ | 2020 | Up to 3x by 2025
"How AI is Changing the Data Center" | McKinsey & Company: https://www.mckinsey.com/ | 2022 | 2.5x to 4x by 2025
"The Implications of Deep Learning for Data Center Design" | Joule: https://www.sciencedirect.com/journal/joule | 2019 | Doubling within a few years
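These growth rates compound very differently, which is worth keeping in mind when comparing the studies. The following is illustrative arithmetic only (the numbers are implied by the quoted rates, not taken from the studies themselves): a doubling of training compute every three to four months implies an 8x to 16x annual multiplier, while a doubling of energy consumption within six or seven years works out to a compound annual growth rate of only about 10 to 12 percent.

```python
# Illustrative arithmetic behind two of the projections above.

# "Doubling of training compute every 3-4 months" implies an annual
# multiplier of 2^(12/4) to 2^(12/3), i.e. 8x to 16x per year.
low_annual = 2 ** (12 / 4)   # doubling every 4 months -> 8x per year
high_annual = 2 ** (12 / 3)  # doubling every 3 months -> 16x per year

# By contrast, "doubling within six or seven years" is a far gentler
# compound annual growth rate: 2^(1/6) - 1 to 2^(1/7) - 1.
cagr_7yr = 2 ** (1 / 7) - 1  # ~10.4% per year
cagr_6yr = 2 ** (1 / 6) - 1  # ~12.2% per year

print(f"Training compute: {low_annual:.0f}x to {high_annual:.0f}x per year")
print(f"Energy doubling in 6-7 years: {cagr_7yr:.1%} to {cagr_6yr:.1%} CAGR")
```

The units differ, of course (training compute versus facility energy), which is partly why efficiency gains per unit of compute matter so much to the forecasts.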


Most assumptions we make about future energy consumption are necessarily linear. They assume AI operations at data centers are not, in essence, complicated ecosystems that behave "chaotically," where small changes in the inputs underlying our assumptions produce huge changes in outcomes.


Saturday, March 30, 2024

Which Edge Will Dominate AI Processing?

Edge computing advantages generally are said to revolve around use cases requiring low-latency response, and the same is generally true for artificial intelligence processing as well. 


Some use cases requiring low-latency response will be best executed “on the device” rather than at a remote data center, and often on the device rather than at an “edge” data center. 


That might especially be true as some estimate consumer apps will represent as much as 70 percent of total generative artificial intelligence compute requirements. 


So does that mean we will see graphics processing units on most smartphones? Probably not, even if GPU prices fall over time. We’ll likely see lots of accelerator chips, though, including more use of tensor processing units, neural processing units, and application-specific integrated circuits, for reasons of cost.  


The general principle is always that the cost of computing facilities increases, while efficiency decreases, as computing moves to the network edge. In other words, centralized computing tends to be the most efficient while computing at the edge--which necessarily involves huge numbers of processors--is necessarily more capital intensive. 


For most physical networks, as much as 80 percent of cost is at the network edges. 


Beyond content delivery, however, many have struggled to define the business model for edge computing, whether from an end-user experience perspective or an edge computing supplier perspective. 


Sheer infrastructure cost remains an issue, as do compelling use cases. Beyond those issues, there arguably are standardization and interoperability issues similar to multi-cloud, complexity concerns and fragmented or sub-scale revenue opportunities. 


In many cases, “edge” use cases also make more sense for “on the device” processing, something we already see with image processing, speech-to-text and real-time language translation. 


To be sure, battery drain, processors and memory (and therefore cost) will be issues, initially. 


Image Processing (Basic)
Benefits: Privacy (processes images locally without sending data to servers); offline functionality (works even without an internet connection); low latency (real-time effects and filters).
Considerations: Limited model complexity (simpler tasks such as noise reduction or basic filters work well on-device); battery drain (complex processing can shorten battery life).

Voice Interface (Simple Commands)
Benefits: Privacy (voice data stays on the device for sensitive commands); low latency (faster response for basic commands, such as smart home controls).
Considerations: Limited vocabulary and understanding (on-device models may not handle complex requests); limited customization (pre-trained models offer less user personalization).

Language Translation (Simple Phrases)
Benefits: Offline functionality (translates basic phrases even without internet); privacy (sensitive conversations remain on the device).
Considerations: Limited languages and accuracy (fewer languages and potentially lower accuracy than cloud-based models); storage requirements (larger models for complex languages might not fit on all devices).

Message Autocomplete
Benefits: Privacy (keeps message content on the device); offline functionality (auto-completes even without internet).
Considerations: Limited context understanding (relying solely on local message history might limit accuracy); less personalization (on-device models may not adapt as well to individual writing styles).

Music Playlist Generation (Offline)
Benefits: Offline functionality (creates playlists from the downloaded music library); privacy (no need to send music preferences to the cloud).
Considerations: Limited music library size (on-device storage limits playlist diversity); static recommendations (playlists may not adapt to changing user tastes as effectively).

Maps Features (Limited Functionality)
Benefits: Offline functionality (basic maps and navigation even without internet); privacy (no user location data sent to servers for basic features).
Considerations: Limited features (offline use may lack real-time traffic updates or detailed points of interest); outdated maps (requires periodic updates downloaded to the device).
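The tradeoffs above can be summarized as a rough routing heuristic. This is an illustrative sketch, not any vendor's actual logic; the task attributes and the storage threshold are assumptions chosen for the example.

```python
# Hypothetical heuristic for routing an AI task on-device vs. to an
# edge or cloud data center, reflecting the tradeoffs in the table above.

from dataclasses import dataclass


@dataclass
class Task:
    latency_budget_ms: int   # how fast a response must arrive
    privacy_sensitive: bool  # should the data leave the device?
    needs_connectivity: bool # does the task require fresh remote data?
    model_size_mb: int       # footprint of the model required

DEVICE_MODEL_LIMIT_MB = 500  # assumed on-device storage budget


def route(task: Task) -> str:
    if task.model_size_mb > DEVICE_MODEL_LIMIT_MB:
        return "cloud"      # model too large to ship to handsets
    if task.privacy_sensitive or not task.needs_connectivity:
        return "on-device"  # keep data local; works offline
    if task.latency_budget_ms < 50:
        return "edge"       # needs remote data, but the cloud is too far
    return "cloud"


# Basic image filtering: small model, privacy-sensitive, offline-capable.
print(route(Task(100, True, False, 50)))  # on-device
```

Real systems weigh battery state, network conditions, and per-query cost as well; the point is only that each row in the table maps to a branch in a decision like this one.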


Remote processing, at the edge or at remote data centers, will tend to favor use cases such as augmented reality, advanced image processing, personalized content recommendations, and predictive maintenance. 


Latency requirements for these and other apps will tend to drive the need for edge processing.


From AI Chatbots to AI Agents

The evolution of generative artificial intelligence chatbots to AI agents might be likened to the difference between a customer service representative who follows a script and a knowledgeable personal assistant who understands your needs, accesses information, and completes tasks on your behalf. 


Chatbots are sort of “manual” operators: they respond to user questions, but do nothing unless they are asked, responding using pre-defined knowledge bases and scripts. 


AI agents, in principle, are designed to operate more autonomously, performing tasks on a user’s behalf. They may access and manipulate data to complete tasks (booking a flight, recommending products), and they can handle more complex interactions and conduct more complex tasks. 


Analogy | Chatbot | Agent
Restaurant Server | Takes your order based on a fixed menu | Recommends dishes based on your preferences and dietary restrictions
Customer Service Representative | Follows a script to answer basic questions | Resolves complex issues by accessing customer history and internal databases
Travel Booking Agent | Books flights based on simple criteria (carrier, date, time, fare, direct or connecting flights) | Creates a personalized travel itinerary based on your interests and budget
Library Assistant | Performs keyword-based searches | Researches topics across different resources and summarizes key points
Movie Recommendation System | Suggests popular movies based on genre | Recommends movies based on your past viewing history


Thursday, March 28, 2024

Many Winners and Losers from Generative AI

Perhaps there is no contradiction between low historical total factor productivity gains and the high revenue, productivity, or profit impact expected from generative artificial intelligence for some firms in some industries. 


Keep in mind that total productivity changes include effects from all sources, not just information technology. So unless you believe IT was solely responsible for total productivity change since 1970, the actual impact of IT arguably is rather slight, perhaps on the order of 0.5 percent up to about 1.5 percent per year at most. 


And those years would include the impact of personal computers, the internet and cloud computing, to name a few important information technology advances. 


Industry Sector (NAICS Code) | Description | Percent Change in Labor Productivity (2022)
Total Nonfarm Business (All) | Covers all industries except agriculture, government, and private households | 1.0%
Goods-Producing Industries (11-33) | Includes mining, construction, and manufacturing | 0.4%
Manufacturing (31-33) | Factory production of goods | 0.5%
Construction (23) | Building, renovation, and maintenance of structures | 1.2%
Mining (21) | Extraction of minerals and natural resources | 2.3%
Service-Providing Industries (48-92) | Covers a wide range of service businesses | 1.4%
Wholesale Trade (42) | Selling goods to businesses in bulk | 1.2%
Retail Trade (44-45) | Selling goods directly to consumers | 0.4%
Transportation and Warehousing (48-49) | Moving people and goods | 2.1%
Information (51) | Publishing, broadcasting, and telecommunications | 2.5%
Financial Activities (52) | Banking, insurance, and real estate | 1.2%
Professional and Business Services (54-56) | Legal, accounting, consulting, and scientific services | 2.0%
Education and Health Services (61-62) | Schools, hospitals, and other social services | 1.3%
Leisure and Hospitality (71-72) | Accommodation, food services, and entertainment | 3.9%
Other Services (81-89) | Repair shops, personal care services, and religious organizations | 0.8%


On the other hand, some forecasts of higher impact for some firms in some industries are not necessarily incompatible with the “all industries” trends for productivity improvements. The best firms in the industries most able to use GenAI might well wring more benefit from the technology. 


In fact, that might tend to be the case for the best and worst firms in almost any industry. 


Also, cumulative productivity gains over a period of years will of course be higher than single year gains. 


Revenue gains in excess of 10 percent for some companies in some industries over a multiyear period are conceivable, even if single-year gains are in single digits or less. 
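Simple compounding shows why. A minimal sketch, using illustrative rates rather than any forecast figure:

```python
# Why multi-year cumulative gains can exceed 10 percent even when
# single-year gains are in low single digits (illustrative numbers).

def cumulative_gain(annual_rate: float, years: int) -> float:
    """Total compounded gain from a constant annual growth rate."""
    return (1 + annual_rate) ** years - 1

# A 2% annual gain compounds to roughly 10% over five years...
print(f"{cumulative_gain(0.02, 5):.1%}")   # ~10.4%

# ...while a 1% annual gain takes about a decade to reach the same point.
print(f"{cumulative_gain(0.01, 10):.1%}")  # ~10.5%
```

The same arithmetic applies to productivity: modest annual improvements, sustained, produce the double-digit cumulative figures that headline forecasts cite.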


Manufacturing
Potential impact: increased product design efficiency and innovation; improved production line optimization; reduced waste and defects.
Forecasts: McKinsey & Company: 20% to 40% productivity gains by 2030. PwC: up to $3.7 trillion global GDP impact in manufacturing by 2030.

Retail and Ecommerce
Potential impact: personalized marketing and promotions; enhanced customer experience (chatbots, product recommendations); optimized pricing and inventory management.
Forecasts: J.P. Morgan: up to 10% revenue growth for retailers by 2030. Accenture: up to $1 trillion in annual revenue growth for retailers by 2035.

Financial Services
Potential impact: fraud detection and risk management; algorithmic trading and portfolio management; personalized financial advice and wealth management.
Forecasts: Goldman Sachs: up to $1.2 trillion in annual cost savings for financial institutions. McKinsey: up to $200 billion in annual revenue growth for wealth management by 2030.

Healthcare
Potential impact: drug discovery and development; personalized medicine and treatment plans; improved medical imaging analysis.
Forecasts: PwC: up to $150 billion in annual savings in the US healthcare system. McKinsey: up to $6 trillion in global healthcare productivity gains by 2030.

Media & Entertainment
Potential impact: content creation (music, scripts, video); personalized content recommendations; streamlined content production workflows.
Forecasts: Bain & Company: up to 10% productivity gains in media content creation by 2030.


The point, though, is that big numbers predicted for applied GenAI have to be understood in context. Total-economy gains will be far smaller than many expect, even if some firms, in some industries, will show higher revenue growth, profit rates, or productivity gains. 



Governments Likely Won't be Very Good at AI Regulation

Artificial intelligence regulations are at an early stage, and some typical areas of enforcement, such as copyright or antitrust, will take...