Sunday, March 17, 2024

"Tokens" are the New "FLOPS," "MIPS" or "Gbps"

Modern computing has some virtually universal reference metrics. For Gemini 1.5 and other large language models, tokens are a basic measure of capability. 


In the context of LLMs, a token is the basic unit of content (text, for example) that the model processes and generates; throughput is usually measured in "tokens per second."


For a text-based model, tokens can be whole words; subwords (prefixes, suffixes or character sequences); or special characters such as punctuation marks or spaces. 
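As an illustration only, a toy word-and-punctuation tokenizer can be sketched in a few lines; production tokenizers instead learn subword vocabularies from data (byte-pair encoding, for example), so real token boundaries will differ:

```python
import re

def toy_tokenize(text: str) -> list[str]:
    # Emit words, individual punctuation marks and spaces as tokens.
    # Real LLM tokenizers learn subword units (prefixes, suffixes,
    # character sequences) from a training corpus instead.
    return re.findall(r"\w+|[^\w\s]|\s", text)

tokens = toy_tokenize("Tokens are the new FLOPS.")
# Five words, four spaces and one period: ten tokens in all.
```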


For a multimodal LLM, which must process images, audio and video, content is typically divided into smaller units such as patches or regions, which are then processed by the model. Each patch or region can be considered a token.


Audio can be segmented into short time frames or frequency bands, with each segment serving as a token. Videos can be tokenized by dividing them into frames or sequences of frames, with each frame or sequence acting as a token.


Tokens are not the only metric used to evaluate large and small language models, but they are among the few that are relatively easy to quantify. 


Metric | LLM | SLM
--- | --- | ---
Tokens per second | Important for measuring processing speed | Might be less relevant for real-time applications
Perplexity | Indicates ability to predict next word | Less emphasized due to simpler architecture
Accuracy | Task-specific, measures correctness of outputs | Crucial for specific tasks like sentiment analysis
Fluency and coherence | Essential for generating human-readable text | Still relevant, but might be less complex
Factual correctness | Important to avoid misinformation | Less emphasized due to potentially smaller training data
Diversity | Encourages creativity and avoids repetitive outputs | Might be less crucial depending on the application
Bias and fairness | Critical to address potential biases in outputs | Less emphasized due to simpler models and training data
Efficiency | Resource consumption and processing time are important | Especially crucial for real-time applications on resource-constrained devices

LLMs rely on various techniques to quantify their performance on attributes other than token processing rate. 


Perplexity is measured by calculating the inverse probability of the generated text sequence, normalized by its length. Lower perplexity indicates better performance, as it signifies the model's ability to accurately predict the next word in the sequence.
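Concretely, perplexity is the exponentiated average negative log-probability the model assigned to the tokens of a sequence. A minimal sketch:

```python
import math

def perplexity(token_probs: list[float]) -> float:
    # Exponentiated average negative log-probability of each token.
    # Lower is better; a perfect model (probability 1 everywhere) scores 1.
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model that assigns probability 0.25 to every token has perplexity 4:
# it is, on average, as uncertain as a uniform choice among four options.
```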


Accuracy might compare the LLM-generated output with a reference answer. That can include precision (the proportion of the model's positive predictions that are correct); recall (the proportion of actual correct answers the model identifies); or the F1-score, which combines precision and recall into a single metric.
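A minimal sketch of those three quantities, representing predictions and reference answers as sets purely for illustration (real evaluations are task-specific):

```python
def precision_recall_f1(predicted: set, actual: set) -> tuple[float, float, float]:
    # Precision: share of predictions that are correct.
    # Recall: share of correct answers the model identified.
    # F1: harmonic mean of precision and recall.
    tp = len(predicted & actual)  # true positives
    precision = tp / len(predicted)
    recall = tp / len(actual)
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```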


Fluency and coherence are substantially a matter of human review for readability, grammatical correctness, and logical flow. 


But automated metrics exist as well: the BLEU score (which compares the generated text with reference sentences, considering n-gram overlap); the ROUGE score (similar to BLEU, but focused on recall of n-grams from reference summaries); and METEOR (which considers synonyms and paraphrases alongside n-gram overlap). 
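The common ingredient of these metrics is n-gram overlap. A sketch of clipped n-gram precision, the core of the BLEU computation (the full metric also combines several n-gram orders and applies a brevity penalty):

```python
from collections import Counter

def ngram_precision(candidate: list[str], reference: list[str], n: int = 2) -> float:
    # Fraction of candidate n-grams that also appear in the reference,
    # counting each reference n-gram at most as often as it occurs there
    # ("clipping"), so repeating a matched phrase cannot inflate the score.
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / max(sum(cand.values()), 1)
```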


So get used to hearing about token rates, just as we hear about FLOPS, MIPS, Gbps, clock rates or bit error rates.


  • FLOPS (Floating-point operations per second): Measures the number of floating-point operations a processor can perform in one second.

  • MIPS (Millions of instructions per second): Measures the number of instructions a processor can execute in one second, expressed in millions.

  • Bits per second (bps): Measures data transmission rate, commonly expressed as megabits per second (Mbps) or gigabits per second (Gbps).

  • Bit error rate (BER): Measures the percentage of bits that are transmitted incorrectly.


Token rates are likely to remain a relatively easy-to-understand measure of model performance, compared to the others, much as clock speed (cycles the processor can execute per second) often is the simplest way to describe a processor’s performance, even when there are other metrics. 


Other metrics, such as the number of cores and threads; cache size; instructions per second (IPS) or floating-point operations per second also are relevant, but unlikely to be as relatable, for normal consumers, as token rates.
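Measuring a token rate is straightforward in principle: count the tokens produced and divide by elapsed time. A sketch, using a stub in place of a real model call:

```python
import time

def tokens_per_second(generate, prompt: str) -> float:
    # Time one generation call and divide token count by elapsed seconds.
    # `generate` here is a stand-in for any model call returning a token list.
    start = time.perf_counter()
    generated = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(generated) / elapsed

# Stub "model" (whitespace splitting) standing in for a real LLM endpoint.
rate = tokens_per_second(lambda p: p.split(), "a b c d")
```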


Thursday, March 14, 2024

AI Productivity Gains Might be Hard to Measure

Proponents of applied artificial intelligence generally tout the productivity advantages AI will bring, and with reason. But productivity impact, assuming you believe we can measure it, is the result of many influences. 


The U.S. Bureau of Labor Statistics uses a measure called "multifactor productivity" (MFP). It reflects not only information technology investment, but also advancements in business practices, workforce skills, and other capital investment.


Since 2000, IT investment growth has exceeded MFP growth. 


Year | MFP Growth (%) | IT Investment Growth (%)
--- | --- | ---
2000 | 2.4 | 10.2
2005 | 1.3 | 7.8
2010 | 0.8 | 5.1
2015 | 0.6 | 3.4
2020 | 1.4 | 4.2

source: Bureau of Labor Statistics
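The gap is easy to check directly from the table's figures: in every listed year, IT investment growth exceeded MFP growth by several percentage points.

```python
# Data from the table above (percent growth per year).
mfp = {2000: 2.4, 2005: 1.3, 2010: 0.8, 2015: 0.6, 2020: 1.4}
it = {2000: 10.2, 2005: 7.8, 2010: 5.1, 2015: 3.4, 2020: 4.2}

# Percentage-point gap between IT investment growth and MFP growth.
gap = {year: round(it[year] - mfp[year], 1) for year in mfp}
# The gap is positive in every listed year.
```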


The point is that AI contributions will be hard to identify, in terms of productivity gains.


Wednesday, March 13, 2024

For Most Firms, Sustainable AI Advantage Will Prove Illusory

Most of you are familiar with the concept of "first movers" in new markets. Many of you also are familiar with the notion of "sustainable competitive advantage" (business "moats" protecting firms from competition). 


Sustaining competitive advantage over the long term tends not to be so easy, as any new technological innovation, including artificial intelligence, propagates and becomes mainstream. 


As with past innovations such as electricity, PCs and the internet, early adopters arguably had an edge. 

Over time, as the technologies became widely available and mainstream, the advantage was diminished or largely lost.


But perhaps not completely lost. Consider data centers with access to lots of low-cost electricity. In such cases, competitive advantage might still remain, even for a “commodity” such as power. 


The same might be noted for some manufacturers of products such as aluminum, which is highly energy-intensive as well. 


In similar ways, some firms in some industries might retain competitive advantages in use of computing hardware and software, even if use of computing software and platforms is virtually ubiquitous across all industries. 


High-performance computing, semiconductor design and manufacturing, software as a service and cybersecurity might provide relevant examples. 


The point is that competitive advantage for adopters of artificial intelligence will likely exist for early successful adopters. Over time, the magnitude of advantage, in most cases, will shrink, though still providing some amount of sustainable advantage in some industries, for some firms. 


A few firms in search and social media provide obvious examples. 


Still, most firms will eventually be using AI as a routine part of their businesses, even if, in many cases, such use will not produce sustainable competitive advantage, compared to their key competitors. 


New technologies offering business value will be quickly adopted and improved upon by competitors.


As technology becomes more widespread, it will become standardized, leading to price competition and eroding first movers' initial advantages.


Once the core technology becomes ubiquitous, competition will logically intensify in complementary areas such as service offerings, user experience, and business models. 


So unusual advantage tends to be eroded over time. Still, some firms will likely be able to gain a period of advantage by deploying AI more effectively than competitors, until everyone catches up.


Tuesday, March 12, 2024

GenAI Consumes Lots of Energy, But What is Net Impact?

Much has been made of a recent study suggesting ChatGPT operations consume prodigious amounts of electricity, as exemplified by the claim that ChatGPT operations consume 17,000 times more energy than a typical household.  


No question, cloud computing requires remote data centers, and data centers are big consumers of energy. In the United States, data centers now account for about four per cent of electricity consumption, and that figure is expected to climb to six per cent by 2026, according to reporting by The New Yorker.


But that is not the whole story. Data centers, apps and cloud computing are used to design, manufacture and use all sorts of products that might also decrease energy consumption. Some would argue, for example, that there is a net energy reduction when people use ridesharing instead of driving their personal vehicles. 


Study Title | Location | Key Findings
--- | --- | ---
Life Cycle Energy Consumption of Ride-hailing Services: A Case Study of Taxi and Ride-Hailing Trips in California (2020) | California, USA | Ridesharing resulted in 11-23% lower energy consumption compared to private vehicles, primarily due to higher vehicle occupancy.
The Energy and Environmental Impacts of Shared Autonomous Vehicles Under Different Pricing Strategies (2023) | N/A (hypothetical scenario) | Shared autonomous vehicles (SAVs) with high occupancy rates have the potential for significant energy savings compared to private vehicles.
Future Transportation: The Social, Economic, and Environmental Impacts of Ridesourcing Services: A Literature Review (2022) | N/A (literature review) | Ridesharing can potentially reduce vehicle miles traveled (VMT), lowering energy consumption; however, empty miles driven while searching for passengers, and substitution of public-transit trips, may negate some benefits.
Life-Cycle Energy Assessment of Personal Mobility in China (2020) | China | Ridesharing with three passengers can reduce energy consumption.
The Energy and Environmental Impacts of Shared Autonomous Vehicles (2021) | N/A | Shared autonomous vehicles can reduce energy consumption.
Empty Urban Mobility: Exploring the Energy Efficiency of Ridesharing and Microtransit (2019) | Europe | High-occupancy ridesharing reduces energy consumption compared to private vehicles, but energy consumed while not transporting passengers must also be counted.
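The two effects the studies keep returning to, occupancy and empty ("deadhead") miles, can be put in one purely illustrative formula (this model is my own sketch, not drawn from any of the studies):

```python
def energy_per_passenger_mile(vehicle_energy_per_mile: float,
                              occupancy: float,
                              deadhead_fraction: float = 0.0) -> float:
    # Energy per passenger-mile rises with empty miles driven between
    # fares and falls as more passengers share each revenue mile.
    # deadhead_fraction: extra empty miles per revenue mile (assumption).
    return vehicle_energy_per_mile * (1 + deadhead_fraction) / occupancy

# Two riders sharing a car with 40% deadhead miles vs. a solo driver:
solo = energy_per_passenger_mile(1.0, occupancy=1.0)
shared = energy_per_passenger_mile(1.0, occupancy=2.0, deadhead_fraction=0.4)
# Sharing still wins here, but the deadhead penalty eats into the savings.
```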


So far as I can determine, nobody has really tried to model the net energy impact of generative artificial intelligence, data centers or cloud computing: comparing their energy footprint with the possible reductions throughout an economy when their outputs are used to cut energy consumption elsewhere. 


Study Title | Key Findings
--- | ---
Green Cloud? An Empirical Analysis of Cloud Computing and Energy Efficiency (2020) | Cloud computing adoption improves user-side energy efficiency, particularly after 2006; SaaS (Software-as-a-Service) contributes most to both electric and non-electric energy savings; IaaS (Infrastructure-as-a-Service) primarily benefits industries with high internal IT hardware usage.
The Internet: Explaining ICT Service Demand in Light of Cloud Computing Technologies (2015) | Cloud computing can increase data-center energy consumption, but offers potential savings in other sectors through reduced need for personal computing devices and improved resource utilization and consolidation.
Decarbonizing the Cloud: How Cloud Computing Can Enable a Sustainable Future (McKinsey & Company, 2020) | Cloud adoption powered by renewables can significantly reduce emissions compared to on-premise IT infrastructure; cloud enables sustainability solutions such as smart grids and remote work.
Cloud Computing: Lowering the Carbon Footprint of Manufacturing SMEs? (2013) | Case studies of manufacturing SMEs shifting to cloud-based solutions.


But some related research suggests ways of looking at net energy footprint. 


Industry | Cloud-based Solution | Potential Fuel Savings | Source
--- | --- | --- | ---
Trucking | Route optimization with real-time traffic data | Up to 10% | DoT
Railroad | Predictive maintenance for locomotives | 5-10% | Wabtec
Shipping | Optimized container loading and route planning | 5-15% | Massey Ferguson


The point is that “net” impact is what we are after.


Friday, March 8, 2024

Open Interpreter Can be Used for Writing Code


It is not a capability I am likely to need from my own AI use cases, but generating code is among the salient use cases for large language models. 

Wednesday, March 6, 2024

AI Revenue in the "Picks and Shovels" Phase

Some idea of the ultimate creation of revenue within the artificial intelligence ecosystem, broadly defined to include many of the same participants as the internet ecosystem, will shed light on ultimate outcomes. 


Right now, the clearest identifiable revenues are earned by providers of infrastructure, such as graphics processing units, or providers of compute-, storage-, model- or inference-as-a-service. 


In the U.S. internet ecosystem, for example, some $3.3 trillion is estimated to be earned annually by firms in the ecosystem. As always, many of the products in the ecosystem use the internet in some way to support the features and value of their products. 


Not all semiconductor, device, advertising, platform, e-commerce or content revenue, for example, is directly and fully attributable to "the internet" specifically. 


In some cases, as with home broadband, it is infrastructure that continues to supply much of the revenue. But even there, supporting the use of mobile devices produces the majority of revenue, and internet access is only part of the value proposition. 


Segment | Estimated Annual Revenue (USD Billion) | Source
--- | --- | ---
Semiconductors | 200 | [1, 2]
Devices | 500 | [3, 4]
Apps | 150 | [5, 6]
Advertising | 300 | [7, 8]
Platforms | 500 | [9, 10]
Connectivity | 250 | [11, 12]
E-commerce | 1,200 | [13, 14]
Content | 200 | [15, 16]
Total | 3,300 | 



Sources:

[1] Semiconductor Industry Association: https://www.semiconductors.org/

[2] Gartner: https://www.gartner.com/en/newsroom/press-releases/2023-04-26-gartner-forecasts-worldwide-semiconductor-revenue-to-decline-11-percent-in-2023

[3] Consumer Electronics Association: https://www.cta.tech/

[4] IDC: https://www.idc.com/

[5] App Annie: https://go.appannie.com/AppAnnieLite_IWD_Form.html

[6] Sensor Tower: https://sensortower.com/

[7] Interactive Advertising Bureau: https://www.iab.com/

[8] eMarketer: https://www.insiderintelligence.com/topics/category/emarketer

[9] Meta Platforms Investor Relations: https://investor.fb.com/home/default.aspx

[10] Alphabet Investor Relations: http://abc.xyz/investor/earnings

[11] CTIA - The Wireless Association: https://www.ctia.org/

[12] Federal Communications Commission: https://www.fcc.gov/

[13] Digital Commerce 360: https://www.digitalcommerce360.com/

[14] Statista: https://www.statista.com/

[15] National Endowment for the Arts: https://www.arts.gov/

[16] Bureau of Labor Statistics: https://www.bls.gov/


For such reasons, the ultimate winners within the AI ecosystem will likely be more difficult to identify with precision. Specific AI hardware and software will continue to be a source of revenue for some providers within the value chain. 


But most of the AI-related revenue upside will come in more-indirect ways, as AI becomes a feature of many products and services but not a direct and identifiable revenue stream. 


"Tokens" are the New "FLOPS," "MIPS" or "Gbps"

Modern computing has some virtually-universal reference metrics. For Gemini 1.5 and other large language models, tokens are a basic measure...