Among the many other changes AI is creating for enterprise technologists and managers, it also provides a new framework for thinking about older questions such as "cloud or on-prem?"
The new question is: "Which workloads justify dedicated GPU ownership, and which should be rented?"
Historically, the decision matrix was fairly simple.
But AI inference operations introduce new variables.
So the new decision-making matrix requires some understanding of when public cloud, private cloud or owned facilities provide the best economics for specific workloads.
For example, public cloud remains an optimal choice when utilization is uncertain or sporadic.
If GPU utilization is below roughly 30 percent to 40 percent, public cloud often is economically attractive.
But private cloud (enterprise-owned infrastructure operated as a cloud) makes sense in other scenarios, such as trials or customer service operations, for reasons including customization, data control or security.
If workloads are predictable and GPU utilization exceeds roughly 50 percent to 60 percent, private infrastructure often becomes economically superior.
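The utilization rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function names, the 0.30 and 0.60 cut-offs (the conservative ends of the rough ranges cited) and the "borderline" label are assumptions, not figures from any vendor or study. The break-even helper adds one further simplifying assumption: owned hardware costs a fixed amount per hour whether busy or idle, while rented capacity is billed only while in use.

```python
# Illustrative cut-offs taken from the rough ranges in the text.
# Choosing 0.30 and 0.60 (the conservative ends) is an assumption.
PUBLIC_CLOUD_BELOW = 0.30
PRIVATE_INFRA_ABOVE = 0.60


def suggest_platform(utilization: float, predictable: bool) -> str:
    """Return a rough platform suggestion for a GPU workload.

    utilization is the fraction of time the GPUs are busy (0.0 to 1.0).
    predictable says whether demand is steady enough to plan capacity.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0 and 1")
    # Uncertain or sporadic demand favors renting, whatever the average load.
    if not predictable or utilization < PUBLIC_CLOUD_BELOW:
        return "public cloud"
    if utilization > PRIVATE_INFRA_ABOVE:
        return "private infrastructure"
    return "borderline: compare costs in detail"


def break_even_utilization(owned_cost_per_hour: float,
                           rental_rate_per_hour: float) -> float:
    """Utilization at which owning and renting cost the same.

    Renting costs rental_rate_per_hour * utilization per wall-clock hour;
    owning costs owned_cost_per_hour regardless of use (an assumption).
    """
    if rental_rate_per_hour <= 0:
        raise ValueError("rental rate must be positive")
    return owned_cost_per_hour / rental_rate_per_hour
```

For example, with hypothetical prices of 2.00 per hour to own a GPU (amortized hardware, power and staff) and 4.00 per hour to rent one, `break_even_utilization(2.0, 4.0)` gives 0.5, which is in line with the 50 percent to 60 percent range mentioned above.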
Owned facilities will make the most sense for hyperscalers such as Amazon Web Services, Microsoft Azure and Google Cloud, as well as large AI labs, major telecom operators and very large enterprises.
As often happens with computing technology, no single solution is right for every use case. For most large enterprises, the most likely long-term architecture will often use public cloud, sometimes private cloud and, in some instances, owned facilities.
Pilots and training will normally be best suited for public cloud platforms. Proprietary models, regulated workloads or internal inference will be suited to private cloud.
Consumer-facing workload spikes are likely suited to public cloud, with dedicated infrastructure (private cloud or owned facilities) likely an option for high-volume, sustained inference operations.