Tuesday, March 18, 2025

AI Chip Markets and Operations Shifting to "Inference"?

The artificial intelligence market changes fast, and not only because new models keep appearing. 

Inference operations, for example, already seem to be becoming the driver of much of the chip market. 

As AI adoption scales, cloud and data center operations will prioritize inference-driven AI workloads. 

That highlights a growing need for specialized hardware optimized for inference tasks, which is arguably why large end users (Amazon Web Services, Google Cloud, Meta and others) have been working on homegrown solutions. AWS and Google Cloud, for example, have invested heavily in developing their own AI accelerators designed specifically for inference. 

AWS Inferentia chips are purpose-built for AI inference workloads, while Google Cloud Tensor Processing Units are designed for AI workloads broadly, including inference. 

Inference may already represent up to 90 percent of all machine learning costs. 

And substantial capital is being invested in startups aiming to improve inference processing efficiency.
