As important as central processing units (CPUs) and graphics processing units (GPUs) are to modern computing, other application-specific integrated circuits, and now neural processing units, are gaining importance as artificial intelligence becomes a fundamental part of computing and computing devices.
A neural processing unit (NPU) is a specialized microprocessor that is designed to accelerate the performance of machine learning algorithms, particularly those involving artificial neural networks (ANNs).
Often called “AI accelerators,” neural processing units are dedicated hardware that handles specific machine learning workloads such as computer vision. You can think of them much like a GPU, but for AI rather than graphics.
Though useful for virtually any AI workload, in any setting, NPUs will be especially vital for on-device smartphone processing, where they reduce power consumption.
NPUs are specifically designed to handle the large matrix operations common in ANNs, making them much faster and more efficient than traditional CPUs or GPUs for these tasks.
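To make the workload concrete, here is a minimal NumPy sketch of the kind of matrix operation described above: a single dense neural-network layer, whose forward pass is a matrix multiply plus a bias and an activation. The shapes and values are arbitrary, chosen purely for illustration; this is what the math looks like, not how any particular NPU executes it.

```python
import numpy as np

# One dense (fully connected) neural-network layer is, at its core,
# a matrix multiplication plus a bias add and an activation function.
# This is the operation NPUs are built to accelerate.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 128))   # a batch of 32 inputs, 128 features each
W = rng.standard_normal((128, 64))   # layer weights (illustrative shape)
b = np.zeros(64)                     # layer biases

def dense_forward(x, W, b):
    """Forward pass of one dense layer: relu(x @ W + b)."""
    return np.maximum(x @ W + b, 0.0)  # ReLU clamps negatives to zero

y = dense_forward(x, W, b)
print(y.shape)  # (32, 64)
```

A real network stacks many such layers, so inference is dominated by these multiply-accumulate operations, which is why hardware that parallelizes them pays off.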
Producers of NPUs already include a “who’s who” list of suppliers:
Google (Tensor Processing Unit)
Intel (Nervana)
NVIDIA (AI Tensor Cores, integrated into NVIDIA's GPUs)
IBM (TruAI)
Graphcore (Intelligence Processing Unit)
Wave Computing (Data Processing Unit)
Cambricon (Machine Learning Unit)
Huawei (NPU)
Qualcomm (AI Engine, integrated into Qualcomm's mobile processors)
Why are they used? Performance, efficiency, latency.
NPUs can deliver significant performance improvements over CPUs and GPUs on machine learning tasks. They are also more efficient, consuming less power and producing less heat, and they can reduce the latency of inference.
NPUs are used for natural language processing, computer vision, and recommendation systems, and to power autonomous vehicles, for example.