Sunday, March 16, 2025

Much of the AI Chip Market Shifting to Inference

The artificial intelligence market changes fast, and not only because new models keep popping up. Inference operations, for example, already appear to be driving much of the chip market.

Inference might already represent up to 90 percent of all machine learning costs.

As AI adoption scales, cloud and data center operators will prioritize inference-driven AI workloads. That highlights a growing need for specialized hardware optimized for inference tasks, which arguably is why large end users (Amazon Web Services, Google Cloud, Meta and others) have been working to create homegrown solutions.

AWS and Google Cloud, for example, have invested heavily in developing their own AI accelerators, specifically designed for inference tasks. 

AWS Inferentia is purpose-built for AI inference workloads, while Google Cloud Tensor Processing Units are designed for AI workloads, including inference. Meta, for its part, is developing its own custom chips for model training. And lots of capital is being invested in startups aiming to improve processing efficiency.

