Saturday, October 21, 2023

It's Hard to Quantify LLM Costs

Virtually everyone believes artificial intelligence use cases are going to drive important changes in employment, work processes, applications, processing operations, power consumption and data center requirements, though precisely how much change will occur, and when, remains unclear. 


More practically, firms and entities are having to estimate how much it will cost to create generative AI models and then draw inferences from those models. 


The answer, inevitably, is that "it depends": on what one wishes to accomplish; which engines and compute platforms are used; how much data is scraped; how much customization of a generic model is required; the number of users of the model; the complexity of the tasks the model supports; the amount of data needed to train the model; and the cost of computing resources. 


Context length, which determines the amount of information the LLM can consider when formulating an output, also affects pricing. If a generative AI model has a context length of 10, it will consider the 10 previous words when generating the next word.


The context length of GPT-4 is 8,192 tokens for the 8K variant and 32,768 tokens for the 32K variant. This means that GPT-4 can consider up to 8,192 or 32,768 previous tokens (roughly, words) when generating the next token, depending on the variant. 


The cost for using the GPT-4 8K context model API is about $0.03 per 1,000 tokens for input and $0.06 per 1,000 tokens for output. 


Using the 32K context model, the cost is $0.06 per 1,000 tokens for input and $0.12 per 1,000 tokens for output.  
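
Put concretely, the cost of a single API call is simple arithmetic over those per-1,000-token rates. The sketch below is a minimal illustration assuming the GPT-4 rates quoted above (which change over time) and made-up token counts.

```python
# Cost of a single API call at the per-1,000-token rates quoted above.
# Rates and token counts here are illustrative; actual pricing changes over time.

def call_cost(input_tokens: int, output_tokens: int,
              input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    """Estimated cost in dollars for one API call."""
    return (input_tokens / 1000) * input_rate_per_1k + \
           (output_tokens / 1000) * output_rate_per_1k

# GPT-4 8K context: $0.03 per 1,000 input tokens, $0.06 per 1,000 output tokens
print(call_cost(2_000, 500, 0.03, 0.06))   # 0.09 -> about 9 cents

# GPT-4 32K context: $0.06 per 1,000 input tokens, $0.12 per 1,000 output tokens
print(call_cost(2_000, 500, 0.06, 0.12))   # 0.18 -> about 18 cents
```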


Applications | GenAI Costs | Studies
Marketing | $10,000 - $100,000 | Gartner
Sales | $100,000 - $1 million | Forrester
Customer Service | $1 million - $10 million | McKinsey
Product Development | $10 million - $100 million | PwC


And the cost of building a model offered as a platform is not the same as the cost for an entity to use that model when it is offered as a subscription, a pay-per-use model, or bundled as a feature. 


Certainly, everyone expects model building, training and customization costs to come down over time. But the costs appear to be significant, whether enterprises choose to build using their in-house resources or use a cloud computing “as a service” provider. 


Business size | Cost of building generative AI model on-premises | Cost of building generative AI model on the cloud
Fortune 500 | $10 million - $100 million | $5 million - $50 million
Mid-market | $1 million - $10 million | $500,000 - $5 million
Small business | $100,000 - $1 million | $50,000 - $100,000


The costs of building generic models will likely, over time, mostly be the province of LLM platform suppliers, as few entities will have the financial resources to build and train proprietary models. 


Cost estimate | Key assumptions | Study name | Date of publication | Publishing venue
$10M - $100M | 100B parameters, trained on 100 TB of text data, using 1,000 GPUs for 1 month | — | 2022 | OpenAI
$1B - $10B | 1T parameters, trained on 1T TB of text data, using 10,000 GPUs for 1 year | — | 2023 | Google AI
$10B - $100B | 10T parameters, trained on 10T TB of text data, using 100,000 GPUs for 10 years | — | 2024 | Microsoft AI
$10 million | 175B parameter model, trained on 100 TB of text data using 1,024 GPUs for 1 month | "The Cost of Training a Large Language Model" by Brown et al. | 2020 | arXiv
$100 million | 1 trillion parameter model, trained on 100 PB of text data using 10,240 GPUs for 1 month | "Scaling Laws for Neural Language Models" by Chen et al. | 2020 | arXiv
$1 billion | 10 trillion parameter model, trained on 10 EB of text data using 100,240 GPUs for 1 month | "The Cost of Training a Large Language Model" by Webber | 2023 | Forbes
$1 billion | 100 trillion parameters, 1 million GPUs | "The Cost of Large Language Models: A Scaling Law Analysis" | 2022 | Nature
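
A useful back-of-the-envelope check on training estimates like these is GPU count × training hours × a cloud GPU-hour rate. The sketch below assumes a $2 per GPU-hour rate purely for illustration; it is not a figure from any of the studies above, and it captures raw compute only, since repeated runs, experimentation, data preparation and staffing push real budgets higher.

```python
# Back-of-the-envelope training cost: GPUs x hours x price per GPU-hour.
# The $2 per GPU-hour rate is an illustrative assumption, not a quoted cloud price,
# and the result covers raw compute only.

def training_compute_cost(num_gpus: int, days: float,
                          dollars_per_gpu_hour: float = 2.0) -> float:
    """Estimated raw compute cost of one training run, in dollars."""
    return num_gpus * days * 24 * dollars_per_gpu_hour

# Roughly the 175B-parameter row above: 1,024 GPUs for 1 month
print(f"${training_compute_cost(1024, 30):,.0f}")    # ~$1.5 million

# Roughly the 1-trillion-parameter row: 10,240 GPUs for 1 month
print(f"${training_compute_cost(10240, 30):,.0f}")   # ~$15 million
```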


For most entities, the relevant cost question will be “how much will it cost to use an existing platform,” including the cost of adapting (customizing) a generic model for a particular enterprise or entity. 


For example, costs of generating inferences when using "as a service" providers are based on the number of tokens. A generative AI token is a unit of text or code that is used by a generative AI model to generate new text or code. Generative AI tokens can be as small as a single character or as large as a word or phrase.


As a simplified rule, the number of tokens can be likened to the number of words in a generated response; a common rule of thumb is roughly 0.75 English words per token, or about 750 words per 1,000 tokens. 
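
That rule of thumb makes it easy to turn a word count into an approximate token count, as the short sketch below shows; the ratio is an approximation, and actual tokenization varies by model and language.

```python
# Rough word-to-token conversion using the ~0.75 words-per-token heuristic
# for English text (an approximation; real tokenizers vary by model and language).

def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    return round(word_count / words_per_token)

print(estimate_tokens(750))   # ~1,000 tokens
print(estimate_tokens(300))   # ~400 tokens for a 300-word response
```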


OpenAI offers a variety of generative AI models as a service through its API. Licensing costs range from $0.00025 to $0.006 per 1000 tokens for inference.


Google AI Platform offers a variety of generative AI models as a service through its Vertex AI platform. Licensing costs range from $0.005 to $0.02 per 1000 tokens for inference.


Microsoft Azure offers a variety of generative AI models as a service through its Azure Cognitive Services platform. Licensing costs range from $0.005 to $0.02 per 1000 tokens for inference.
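
Taken together, per-token rates turn a usage forecast into a monthly bill. The sketch below compares a hypothetical workload across the rate ranges quoted above; the request volume and tokens-per-request figures are illustrative assumptions, not vendor numbers.

```python
# Monthly inference cost for a hypothetical workload at the per-1,000-token
# rate ranges quoted above. Request volume and tokens per request are assumptions.

RATES_PER_1K_TOKENS = {               # (low, high) in dollars per 1,000 tokens
    "OpenAI API": (0.00025, 0.006),
    "Google Vertex AI": (0.005, 0.02),
    "Microsoft Azure Cognitive Services": (0.005, 0.02),
}

REQUESTS_PER_MONTH = 100_000          # assumed workload
TOKENS_PER_REQUEST = 1_500            # assumed prompt + response size

monthly_tokens = REQUESTS_PER_MONTH * TOKENS_PER_REQUEST

for provider, (low, high) in RATES_PER_1K_TOKENS.items():
    low_cost = monthly_tokens / 1000 * low
    high_cost = monthly_tokens / 1000 * high
    print(f"{provider}: ${low_cost:,.0f} - ${high_cost:,.0f} per month")
```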


Cost estimate | Key assumptions | Study name | Date of publication | Publishing venue
$0.006 per 1,000 tokens | Inference on a single GPU | "Pricing Large Language Models as a Service" | 2022 | arXiv
$0.02 per 1,000 tokens | Inference on multiple GPUs | "The Economics of Large Language Models" | 2023 | Medium
$0.05 per 1,000 tokens | Inference on a TPU | "Comparing the Cost of Different Hardware Platforms for Large Language Models" | 2023 | arXiv
$0.02 per 1,000 tokens | GPT-3.5 model | "The Economics of Large Language Models" | 2023 | Medium
$0.10 per 1,000 tokens | GPT-4 (8K) model | "The Cost of Large Language Models: A Scaling Law Analysis" | 2022 | Nature
$0.40 per 1,000 tokens | GPT-4 (32K) model | "The Cost of Large Language Models: A Scaling Law Analysis" | 2022 | Nature


The point is that the cost of deploying generative AI for any particular business function is highly variable at the moment.

