Language models are improving at a blistering pace, far outstripping what we have come to expect from computing in general and Moore's Law in particular. Where Moore's Law describes chip density doubling roughly every two years, the price of AI language-model capability has fallen nearly 300-fold over a comparable span.
The cost of querying an AI model that scores the equivalent of GPT-3.5 (64.8) on MMLU, a popular benchmark of language-model performance, dropped from $20.00 per million tokens in November 2022 to just $0.07 per million tokens by October 2024 (Gemini-1.5-Flash-8B), a more than 280-fold reduction in under two years, according to Stanford University's Institute for Human-Centered AI (HAI).
Source: Stanford University HAI
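To see how these figures fit together, here is a back-of-the-envelope sketch in Python. The prices and dates are the HAI figures quoted above; the 23-month span and the two-year Moore's Law doubling cadence used for comparison are assumptions made for illustration, not numbers from the report.

```python
# Back-of-the-envelope check of the inference-price decline quoted above.
# Prices and dates are the HAI figures; the 23-month span (Nov 2022 -> Oct 2024)
# and the Moore's Law cadence are assumptions for illustration.

price_start = 20.00   # USD per million tokens, Nov 2022 (GPT-3.5-level on MMLU)
price_end = 0.07      # USD per million tokens, Oct 2024 (Gemini-1.5-Flash-8B)
months = 23           # Nov 2022 to Oct 2024

fold_reduction = price_start / price_end                # total price drop
annual_factor = fold_reduction ** (12 / months)         # implied per-year improvement

# Moore's Law comparison: one density doubling roughly every 24 months.
moore_factor = 2 ** (months / 24)

print(f"Total reduction: {fold_reduction:.0f}x")            # ~286x
print(f"Implied annual improvement: {annual_factor:.0f}x")  # ~19x per year
print(f"Moore's Law over the same span: {moore_factor:.1f}x")
```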
Depending on the task, LLM inference prices have fallen anywhere from 9-fold to 900-fold per year.
At the hardware level, costs have declined by 30 percent annually, while energy efficiency has improved by 40 percent each year, HAI’s 2025 AI Index Report says.
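For a sense of how those annual hardware trends compound, here is a small sketch using the same percentages; the five-year horizon is an arbitrary choice for the example, not a projection from the report.

```python
# How the annual hardware trends cited by HAI compound over time.
# The 5-year horizon is an assumption made purely for illustration.

annual_cost_decline = 0.30       # hardware costs fall 30% per year
annual_efficiency_gain = 0.40    # energy efficiency improves 40% per year
years = 5

cost_factor = (1 - annual_cost_decline) ** years            # fraction of today's cost
efficiency_factor = (1 + annual_efficiency_gain) ** years   # multiple of today's efficiency

print(f"After {years} years, hardware costs: {cost_factor:.0%} of today's")  # ~17%
print(f"After {years} years, energy efficiency: {efficiency_factor:.1f}x")   # ~5.4x
```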