Saturday, March 30, 2024

Which Edge Will Dominate AI Processing?

Edge computing's advantages are generally said to revolve around use cases requiring low-latency response, and the same holds for artificial intelligence processing.


Some use cases requiring low-latency response will be best executed “on the device” rather than at a remote data center, and often on the device rather than at an “edge” data center. 


That might especially be true as some estimate consumer apps will represent as much as 70 percent of total generative artificial intelligence compute requirements. 


So does that mean we will see graphics processing units on most smartphones? Probably not, even if GPU prices fall over time. We’ll likely see lots of accelerator chips, though, including more use of tensor processing units, neural processing units and application-specific integrated circuits, for reasons of cost.


The general principle is that the cost of computing facilities increases, while efficiency decreases, as computing moves to the network edge. In other words, centralized computing tends to be the most efficient, while computing at the edge, which necessarily involves huge numbers of processors, is more capital intensive.


For most physical networks, as much as 80 percent of cost is at the network edges. 


Beyond content delivery, however, many have struggled to define the business model for edge computing, whether from an end user experience perspective or an edge computing supplier perspective.


Sheer infrastructure cost remains an issue, as does the lack of compelling use cases. Beyond those issues, there arguably are standardization and interoperability issues similar to those of multi-cloud, along with complexity concerns and fragmented or sub-scale revenue opportunities.


In many cases, “edge” use cases also make more sense for “on the device” processing, something we already see with image processing, speech-to-text and real-time language translation. 


To be sure, battery drain, processors and memory (and therefore cost) will be issues, initially. 


On-Device Use Case: Image Processing (Basic)
Benefits: Privacy: Processes images locally without sending data to servers. Offline Functionality: Works even without internet connection. Low Latency: Real-time effects and filters.
Considerations: Limited Model Complexity: Simpler tasks like noise reduction or basic filters work well on-device. Battery Drain: Complex processing can drain battery life.

On-Device Use Case: Voice Interface (Simple Commands)
Benefits: Privacy: Voice data stays on device for sensitive commands. Low Latency: Faster response for basic commands (e.g., smart home controls).
Considerations: Limited Vocabulary and Understanding: On-device models may not handle complex requests. Limited Customization: Pre-trained models offer less user personalization.

On-Device Use Case: Language Translation (Simple Phrases)
Benefits: Offline Functionality: Translates basic phrases even without internet. Privacy: Sensitive conversations remain on device.
Considerations: Limited Languages and Accuracy: Fewer languages and potentially lower accuracy compared to cloud-based models. Storage Requirements: Larger models for complex languages might not fit on all devices.

On-Device Use Case: Message Autocomplete
Benefits: Privacy: Keeps message content on device. Offline Functionality: Auto-completes even without internet.
Considerations: Limited Context Understanding: Relying solely on local message history might limit accuracy. Personalized Experience: On-device models may not adapt to individual writing styles as well.

On-Device Use Case: Music Playlist Generation (Offline)
Benefits: Offline Functionality: Creates playlists based on downloaded music library. Privacy: No need to send music preferences to the cloud.
Considerations: Limited Music Library Size: On-device storage limits playlist diversity. Static Recommendations: Playlists may not adapt to changing user tastes as effectively.

On-Device Use Case: Maps Features (Limited Functionality)
Benefits: Offline Functionality: Access basic maps and navigation even without internet. Privacy: No user location data sent to servers for basic features.
Considerations: Limited Features: Offline functionality may lack real-time traffic updates or detailed points of interest. Outdated Maps: Requires periodic updates downloaded to the device.


Remote processing (at an edge or a more distant data center) will tend to favor use cases including augmented reality, advanced image processing, personalized content recommendations and predictive maintenance.


Latency requirements for these and other apps will tend to drive the need for edge processing.
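The tradeoff described above, where centralized computing is the most efficient but latency requirements push work toward the edge or the device, can be sketched as a simple placement rule: pick the most centralized tier that still meets the app's latency budget. This is only an illustration. The tier names come from the discussion above, but the `Tier` class, the `choose_tier` function and every latency figure are hypothetical placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    round_trip_ms: float  # ASSUMED placeholder value, not a measurement


# Ordered from most centralized (assumed most efficient) to least centralized.
# All latency numbers are illustrative assumptions only.
TIERS = [
    Tier("remote data center", 100.0),
    Tier("edge data center", 20.0),
    Tier("on device", 0.0),
]


def choose_tier(latency_budget_ms: float) -> str:
    """Return the most centralized tier whose network round trip fits the budget."""
    for tier in TIERS:
        if tier.round_trip_ms <= latency_budget_ms:
            return tier.name
    # Only reachable with a negative budget: fall back to local processing.
    return "on device"


if __name__ == "__main__":
    for budget in (200.0, 50.0, 5.0):
        print(f"{budget} ms budget: {choose_tier(budget)}")
```

Under these assumed figures, a relaxed budget stays in the remote data center, a tighter one moves to the edge data center, and the tightest budgets land on the device. A real decision would also weigh battery drain, memory, model size and cost.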

