Saturday, October 21, 2023

It's Hard to Quantify LLM Costs

Virtually everyone believes artificial intelligence use cases are going to drive important changes in employment, work processes, applications, processing operations, power consumption and data center requirements, though precisely how much change will occur, and when, remains unclear. 


More practically, firms and entities are having to estimate how much it will cost to create generative AI models and then draw inferences from those models. 


The answer, inevitably, is that "it depends": on what one wishes to accomplish; which engines and compute platforms are used; how much data is scraped; how much customization of a generic model is required; the number of users of the model; the complexity of the tasks the model supports; the amount of data needed to train the model; and the cost of computing resources. 


Context length, which determines the amount of information the LLM can consider when formulating an output, also affects pricing. If a generative AI model has a context length of 10, it will consider the 10 previous words when generating the next word.


The context length of GPT-4 is 8,192 tokens for the 8K variant and 32,768 tokens for the 32K variant. This means that GPT-4 can consider up to 8,192 or 32,768 previous tokens (roughly, words) when generating the next token, depending on the variant. 


The cost for using the GPT-4 8K context model API is about $0.03 per 1,000 tokens for input and $0.06 per 1,000 tokens for output. 


Using the 32K context model, the cost is $0.06 per 1,000 tokens for input and $0.12 per 1,000 tokens for output.  
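
Put concretely, the cost of a single API call is simple arithmetic over those per-1,000-token rates. The sketch below is a minimal illustration assuming the GPT-4 rates quoted above (which change over time) and made-up token counts.

```python
# Cost of a single API call at the per-1,000-token rates quoted above.
# Rates and token counts here are illustrative; actual pricing changes over time.

def call_cost(input_tokens: int, output_tokens: int,
              input_rate_per_1k: float, output_rate_per_1k: float) -> float:
    """Estimated cost in dollars for one API call."""
    return (input_tokens / 1000) * input_rate_per_1k + \
           (output_tokens / 1000) * output_rate_per_1k

# GPT-4 8K context: $0.03 per 1,000 input tokens, $0.06 per 1,000 output tokens
print(call_cost(2_000, 500, 0.03, 0.06))   # 0.09 -> about 9 cents

# GPT-4 32K context: $0.06 per 1,000 input tokens, $0.12 per 1,000 output tokens
print(call_cost(2_000, 500, 0.06, 0.12))   # 0.18 -> about 18 cents
```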


Applications | GenAI Costs | Studies
Marketing | $10,000 - $100,000 | Gartner
Sales | $100,000 - $1 million | Forrester
Customer Service | $1 million - $10 million | McKinsey
Product Development | $10 million - $100 million | PwC


And the cost of building a model offered as a platform is not the same as the cost for an entity to use that model when it is offered as a subscription, a pay-per-use model, or bundled as a feature. 


Certainly, everyone expects model building, training and customization costs to come down over time. But the costs appear to be significant, whether enterprises choose to build using their in-house resources or use a cloud computing “as a service” provider. 


Business size | Cost of building generative AI model on-premises | Cost of building generative AI model on the cloud
Fortune 500 | $10 million - $100 million | $5 million - $50 million
Mid-market | $1 million - $10 million | $500,000 - $5 million
Small business | $100,000 - $1 million | $50,000 - $100,000


The costs of building generic models will likely, over time, mostly be the province of LLM platform suppliers, as few entities will have the financial resources to build and train proprietary models. 


Cost estimate | Key assumptions | Study name | Date of publication | Publishing venue
$10M - $100M | 100B parameters, trained on 100 TB of text data, using 1,000 GPUs for 1 month | — | 2022 | OpenAI
$1B - $10B | 1T parameters, trained on 1T TB of text data, using 10,000 GPUs for 1 year | — | 2023 | Google AI
$10B - $100B | 10T parameters, trained on 10T TB of text data, using 100,000 GPUs for 10 years | — | 2024 | Microsoft AI
$10 million | 175B parameter model, trained on 100 TB of text data using 1,024 GPUs for 1 month | "The Cost of Training a Large Language Model" by Brown et al. | 2020 | arXiv
$100 million | 1 trillion parameter model, trained on 100 PB of text data using 10,240 GPUs for 1 month | "Scaling Laws for Neural Language Models" by Chen et al. | 2020 | arXiv
$1 billion | 10 trillion parameter model, trained on 10 EB of text data using 100,240 GPUs for 1 month | "The Cost of Training a Large Language Model" by Webber | 2023 | Forbes
$1 billion | 100 trillion parameters, 1 million GPUs | "The Cost of Large Language Models: A Scaling Law Analysis" | 2022 | Nature
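
A useful back-of-the-envelope check on training estimates like these is GPU count × training hours × a cloud GPU-hour rate. The sketch below assumes a $2 per GPU-hour rate purely for illustration; it is not a figure from any of the studies above, and it captures raw compute only, since repeated runs, experimentation, data preparation and staffing push real budgets higher.

```python
# Back-of-the-envelope training cost: GPUs x hours x price per GPU-hour.
# The $2 per GPU-hour rate is an illustrative assumption, not a quoted cloud price,
# and the result covers raw compute only.

def training_compute_cost(num_gpus: int, days: float,
                          dollars_per_gpu_hour: float = 2.0) -> float:
    """Estimated raw compute cost of one training run, in dollars."""
    return num_gpus * days * 24 * dollars_per_gpu_hour

# Roughly the 175B-parameter row above: 1,024 GPUs for 1 month
print(f"${training_compute_cost(1024, 30):,.0f}")    # ~$1.5 million

# Roughly the 1-trillion-parameter row: 10,240 GPUs for 1 month
print(f"${training_compute_cost(10240, 30):,.0f}")   # ~$15 million
```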


For most entities, the relevant cost question will be “how much will it cost to use an existing platform,” including the cost of adapting (customizing) a generic model for a particular enterprise or entity. 


For example, costs of generating inferences when using "as a service" providers are based on the number of tokens. A generative AI token is a unit of text or code that is used by a generative AI model to generate new text or code. Generative AI tokens can be as small as a single character or as large as a word or phrase.


As a simplified rule, the number of tokens can be likened to the number of words in a generated response; a common rule of thumb is roughly 0.75 English words per token, or about 750 words per 1,000 tokens. 
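
That rule of thumb makes it easy to turn a word count into an approximate token count, as the short sketch below shows; the ratio is an approximation, and actual tokenization varies by model and language.

```python
# Rough word-to-token conversion using the ~0.75 words-per-token heuristic
# for English text (an approximation; real tokenizers vary by model and language).

def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    return round(word_count / words_per_token)

print(estimate_tokens(750))   # ~1,000 tokens
print(estimate_tokens(300))   # ~400 tokens for a 300-word response
```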


OpenAI offers a variety of generative AI models as a service through its API. Licensing costs range from $0.00025 to $0.006 per 1000 tokens for inference.


Google AI Platform offers a variety of generative AI models as a service through its Vertex AI platform. Licensing costs range from $0.005 to $0.02 per 1000 tokens for inference.


Microsoft Azure offers a variety of generative AI models as a service through its Azure Cognitive Services platform. Licensing costs range from $0.005 to $0.02 per 1000 tokens for inference.
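
Taken together, per-token rates turn a usage forecast into a monthly bill. The sketch below compares a hypothetical workload across the rate ranges quoted above; the request volume and tokens-per-request figures are illustrative assumptions, not vendor numbers.

```python
# Monthly inference cost for a hypothetical workload at the per-1,000-token
# rate ranges quoted above. Request volume and tokens per request are assumptions.

RATES_PER_1K_TOKENS = {               # (low, high) in dollars per 1,000 tokens
    "OpenAI API": (0.00025, 0.006),
    "Google Vertex AI": (0.005, 0.02),
    "Microsoft Azure Cognitive Services": (0.005, 0.02),
}

REQUESTS_PER_MONTH = 100_000          # assumed workload
TOKENS_PER_REQUEST = 1_500            # assumed prompt + response size

monthly_tokens = REQUESTS_PER_MONTH * TOKENS_PER_REQUEST

for provider, (low, high) in RATES_PER_1K_TOKENS.items():
    low_cost = monthly_tokens / 1000 * low
    high_cost = monthly_tokens / 1000 * high
    print(f"{provider}: ${low_cost:,.0f} - ${high_cost:,.0f} per month")
```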


Cost estimate | Key assumptions | Study name | Date of publication | Publishing venue
$0.006 per 1,000 tokens | Inference on a single GPU | "Pricing Large Language Models as a Service" | 2022 | arXiv
$0.02 per 1,000 tokens | Inference on multiple GPUs | "The Economics of Large Language Models" | 2023 | Medium
$0.05 per 1,000 tokens | Inference on a TPU | "Comparing the Cost of Different Hardware Platforms for Large Language Models" | 2023 | arXiv
$0.02 per 1,000 tokens | GPT-3.5 model | "The Economics of Large Language Models" | 2023 | Medium
$0.10 per 1,000 tokens | GPT-4 (8K) model | "The Cost of Large Language Models: A Scaling Law Analysis" | 2022 | Nature
$0.40 per 1,000 tokens | GPT-4 (32K) model | "The Cost of Large Language Models: A Scaling Law Analysis" | 2022 | Nature


The point is that the cost of deploying generative AI for any particular business function is highly variable at the moment.

