IP Carrier: LLM-Assisted Search Would Dramatically Boost GPU Usage and Energy Consumption

Friday, October 13, 2023


Concerns about business and revenue models for large language models and generative AI are logical enough, for a variety of reasons. Model creation and inference both cost money (graphics processing units, GPU as a service, code generation and software engineering support) and consume energy.

SemiAnalysis, for example, has estimated that implementing AI similar to ChatGPT in each Google search would require 512,821 of NVIDIA’s A100 HGX servers, totaling 4,102,568 GPUs.

By some estimates, applying generative AI to nearly every Google search, and to search alone, would dramatically raise the cost of running search.

Source: SemiAnalysis

At a power demand of 6.5 kW per server, the SemiAnalysis server count would translate into a daily electricity consumption of 80 GWh and an annual consumption of 29.2 TWh, an estimate similar to that made by New Street Research.

New Street Research has estimated that Google would need approximately 400,000 servers to handle search queries if each query used an LLM, which would lead to a daily energy consumption of 62.4 GWh and an annual consumption of 22.8 TWh.
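The arithmetic behind both estimates can be checked with a short Python sketch. It assumes every server draws its full 6.5 kW around the clock. The eight-GPUs-per-server figure and the use of the same 6.5 kW for the New Street Research fleet are not stated outright in the estimates; they are inferred here because they reproduce the quoted numbers. The function and variable names are illustrative only.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365
GPUS_PER_SERVER = 8  # assumption: implied by 4,102,568 GPUs / 512,821 servers


def fleet_energy(servers, kw_per_server=6.5):
    """Return (daily GWh, annual TWh) for a fleet running at full power 24/7."""
    power_gw = servers * kw_per_server / 1e6       # kW -> GW
    daily_gwh = power_gw * HOURS_PER_DAY           # GW * h = GWh
    annual_twh = daily_gwh * DAYS_PER_YEAR / 1e3   # GWh -> TWh
    return daily_gwh, annual_twh


semianalysis_gpus = 512_821 * GPUS_PER_SERVER
semianalysis = fleet_energy(512_821)
# Assumption: New Street Research's servers also draw 6.5 kW each.
new_street = fleet_energy(400_000)

print(f"SemiAnalysis GPUs: {semianalysis_gpus:,}")
print(f"SemiAnalysis: {semianalysis[0]:.1f} GWh/day, {semianalysis[1]:.1f} TWh/year")
print(f"New Street:   {new_street[0]:.1f} GWh/day, {new_street[1]:.1f} TWh/year")
```

Under those assumptions the sketch reproduces the figures quoted above: roughly 80 GWh and 29.2 TWh for the SemiAnalysis scenario, and 62.4 GWh and 22.8 TWh for the New Street Research scenario.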

With Google currently processing up to nine billion searches daily, these scenarios work out to an average energy consumption of 6.9–8.9 Wh per search request.
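The per-request range follows from dividing each daily energy estimate by the daily search volume. A minimal sketch, assuming the full nine billion searches per day and treating the two daily totals (62.4 GWh and 80 GWh) as given; the function name is illustrative:

```python
SEARCHES_PER_DAY = 9e9  # upper-end figure cited for daily Google searches


def wh_per_request(daily_gwh, searches_per_day=SEARCHES_PER_DAY):
    """Convert a daily fleet energy total (GWh) into watt-hours per search."""
    return daily_gwh * 1e9 / searches_per_day  # GWh -> Wh, then per search


low = wh_per_request(62.4)   # New Street Research scenario
high = wh_per_request(80.0)  # SemiAnalysis scenario

print(f"{low:.1f} to {high:.1f} Wh per request")
```

Because the divisor is the upper-end search volume, a lower daily search count would push the per-request figure higher for the same fleet.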