Tuesday, March 18, 2025

AI Chip Markets and Operations Shifting to "Inference"?


 

The artificial intelligence market changes fast, and not only because new models keep popping up. 

Inference operations, for example, already seem to be emerging as the driver of much of the chip market. 

As AI adoption scales, cloud and data center operations will prioritize inference-driven AI workloads. 

That highlights a growing need for specialized hardware optimized for inference tasks, which arguably is where large end users (Amazon Web Services, Google Cloud, Meta and others) have been working on homegrown solutions. AWS and Google Cloud, for example, have invested heavily in developing their own AI accelerators designed specifically for inference. 

AWS Inferentia is purpose-built for AI inference workloads, while Google Cloud's Tensor Processing Units are designed for AI workloads, including inference. 

By some estimates, inference might already represent up to 90 percent of all machine learning costs. 

And lots of capital is being invested in startups aiming to improve processing efficiency.

