Friday, December 1, 2023

Here Come the NPUs

As important as central processing units (CPUs) and graphics processing units (GPUs) are to modern computing, other application-specific integrated circuits, and now neural processing units, are gaining importance as artificial intelligence becomes a fundamental part of computing and computing devices. 


A neural processing unit (NPU) is a specialized microprocessor designed to accelerate machine learning algorithms, particularly those involving artificial neural networks (ANNs). 


Often called “AI accelerators,” neural processing units are dedicated hardware that handles specific machine learning tasks such as computer vision algorithms. You can think of them much like a GPU, but for AI rather than graphics.


Though useful for virtually any AI processing task, in any setting, NPUs will be especially vital for on-device smartphone processing, as they reduce power consumption. 


NPUs are specifically designed to handle the large matrix operations that are common in ANNs, making them much faster and more efficient than traditional CPUs or GPUs for these tasks.
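

To make “large matrix operations” concrete, here is a minimal sketch (plain NumPy, with illustrative layer sizes rather than figures from any particular chip) of the multiply-accumulate workload that dominates a dense neural-network layer, and that NPUs are built to parallelize:

```python
import numpy as np

# A dense (fully connected) layer is essentially one large matrix multiply:
# activations (batch x inputs) times weights (inputs x outputs), plus a bias.
batch, n_in, n_out = 64, 1024, 4096   # illustrative sizes only

x = np.random.rand(batch, n_in).astype(np.float32)   # input activations
W = np.random.rand(n_in, n_out).astype(np.float32)   # layer weights
b = np.zeros(n_out, dtype=np.float32)                # bias

y = np.maximum(x @ W + b, 0)   # matmul + ReLU: ~268 million multiply-adds

# An NPU executes this multiply-accumulate pattern across many parallel
# hardware units, rather than looping through it on general-purpose cores.
```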


Producers of NPUs already include a “who’s who” list of suppliers:


  • Google's Tensor Processing Unit

  • Intel's Nervana

  • NVIDIA's AI Tensor Cores, integrated into NVIDIA's GPUs

  • IBM's TrueNorth

  • Graphcore's Intelligence Processing Unit

  • Wave Computing's Data Processing Unit

  • Cambricon's Machine Learning Unit

  • Huawei's NPU

  • Qualcomm's AI Engine, integrated into Qualcomm's mobile processors 


Why are they used? Performance, efficiency, latency. 


NPUs can provide significant performance improvements over CPUs and GPUs for machine learning tasks. They are also more efficient, consuming less power and producing less heat, and they can reduce the latency of machine learning workloads. 


NPUs are used for natural language processing, computer vision and recommendation systems, and to power autonomous vehicles, for example. 


"Back to the Future" as Extended Edge Develops

In some ways, extended edge AI processing on devices, rather than at remote cloud computing centers, is a form of “back to the future” technology, recalling the days when most data processing happened directly on devices (PCs). 


That suggests a substantial movement back to a distributed, decentralized processing environment, in large part reversing the centralized model the cloud era brought into being. 


Just as certainly, extended edge will reshape the smartphone, smartwatch and PC markets, as those devices are outfitted to handle AI directly on board. 


If one takes the current retail market value of smartphones, PCs and smartwatches, and then assumes adoption rates between 70 percent and 90 percent by 2033, the markets supporting extended edge will be quite substantial, with software, hardware, chip, platform, manufacturing and connectivity revenues increasing anywhere from two to five times. 


Market        | Market Size in 2023 (USD billion) | Estimated Market Size with Edge AI in 2033 (USD billion) | Percentage of AI-Capable Devices in 2033
Smartphones   | 430                               | 1,999                                                    | 90%
PCs           | 250                               | 600                                                      | 80%
Smartwatches  | 30                                | 150                                                      | 70%
Total         | 710                               | 2,749                                                    | 80%
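

As a sanity check on the “two to five times” range, the growth multiples implied by the table's own figures (illustrative estimates from this post, not external data) work out as follows:

```python
# Growth multiples implied by the table above (USD billions).
markets = {
    "Smartphones":  (430, 1_999),
    "PCs":          (250, 600),
    "Smartwatches": (30, 150),
}

for name, (size_2023, size_2033) in markets.items():
    print(f"{name}: {size_2033 / size_2023:.1f}x")   # 4.6x, 2.4x, 5.0x

total_2023 = sum(v[0] for v in markets.values())     # 710
total_2033 = sum(v[1] for v in markets.values())     # 2,749
print(f"Total: {total_2033 / total_2023:.1f}x")      # ~3.9x
```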


Just as PCs made computing power available to anyone with a computer, extended edge AI is making AI capabilities accessible to a wider range of devices and users, right on the device itself.


Extended edge AI also will embed AI operations into everyday objects and environments, enabling a range of new applications requiring immediate response (low latency). 


That will require more powerful processors and more storage, which can be a problem for smartphones and wearable devices with limited resources.


Increased power consumption also will be an issue. And AI models will have to be updated. 


Over-the-air updates, federated learning (where devices train a shared model without exchanging raw data), model compression and quantization, model pruning, knowledge distillation and adaptive learning are all tools designers can use to keep AI models running on extended edge devices up to date. 
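

Of those techniques, quantization is the easiest to illustrate. A minimal sketch (plain NumPy, illustrative only, not any vendor's toolchain): 32-bit float weights are mapped to 8-bit integers plus a scale factor, cutting model storage and update-download size roughly four-fold.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 plus a per-tensor scale factor."""
    scale = np.abs(weights).max() / 127.0         # largest value maps to +/-127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)   # illustrative layer
q, scale = quantize_int8(w)

# Storage drops from 4 bytes to 1 byte per weight; the cost is a small
# rounding error in the recovered values.
err = np.abs(dequantize(q, scale) - w).max()
print(f"4x smaller, max reconstruction error: {err:.4f}")
```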


Model pruning techniques identify and remove redundant or less important connections within the AI model, reducing its complexity without significantly impacting performance. Knowledge distillation involves transferring the knowledge from a large, complex model to a smaller, more efficient model, preserving the original model's capabilities.
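

Two matching sketches, again illustrative rather than a production recipe: magnitude pruning zeroes the smallest-magnitude weights, and a distillation loss trains a small “student” model to match a larger “teacher” model's softened output distribution.

```python
import numpy as np

# Magnitude pruning: zero out the weights with the smallest absolute values.
def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

w = np.random.randn(512, 512).astype(np.float32)
w_pruned = prune_by_magnitude(w, sparsity=0.9)   # keep only the largest 10%

# Knowledge distillation: the loss term that pushes the student's predicted
# distribution toward the teacher's temperature-softened distribution.
def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)   # for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -np.mean(np.sum(p_teacher * np.log(p_student + 1e-9), axis=-1))
```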


Adaptive learning algorithms enable AI models to continuously learn and adapt to changing environments and user behavior.


Thursday, November 30, 2023

Data Center Capex Will More Than Double over 10 Years: "How Much" is the Only Issue

Among the predictions about the impact of artificial intelligence on data centers are a few salient statistics. Sanjay Bhutani, AdaniConneX chief business officer, notes that AI will drive data center capacity from 54 gigawatts to perhaps 90 gigawatts. 


Likewise, Gautham Gnanajothi, Frost & Sullivan global VP, estimates data center investment will grow from about $300 billion to $775 billion over the next 10 years. The eight largest hyperscalers currently represent about $110 billion of that $300 billion in annual global investment. 


Capex estimates vary from firm to firm, depending on assumptions about growth rates. Generally speaking, earlier studies show lower expected capex in 2033, compared to more-recent studies that include assumptions about additional requirements to support AI. 


Publisher                  | Predicted Data Center Capex (2033) | Date of Prediction
Frost & Sullivan           | $610 billion                       | 2023-10-04
IDC                        | $570 billion                       | 2023-09-27
Gartner                    | $530 billion                       | 2023-08-15
MarketsandMarkets          | $590 billion                       | 2023-07-12
Mordor Intelligence        | $630 billion                       | 2023-06-08
Grand View Research        | $550 billion                       | 2023-05-03
Technavio                  | $510 billion                       | 2023-04-19
Fortune Business Insights  | $560 billion                       | 2023-03-07
Allied Market Research     | $540 billion                       | 2023-02-14
Statista                   | $580 billion                       | 2023-01-10



Study Title | Capital Investment Prediction (2033) | Date of Publication | Publisher
Global Data Center Market Size, Trends, and Forecasts, 2022-2030 | $814 billion | September 2023 | Fortune Business Insights
Data Center Market Size, Share, Trends & Growth, 2023-2032 | $795.2 billion | October 2023 | Grand View Research
Data Center Market Global Report 2023 | $768.7 billion | November 2023 | Market Research Future
Data Center Market Size, Share & Trends Analysis Report by Component (IT Infrastructure, Facility Infrastructure), by Deployment Type (On-Premises, Cloud), by Vertical (BFSI, IT & Telecom, Retail, Healthcare), Forecast, 2023-2028 | $742.5 billion | November 2023 | Allied Market Research
Data Center Market - Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2023-2028 | $721.4 billion | November 2023 | Global Market Insights
Future of Data Centers: 2023-2033 | $698.2 billion | October 2023 | Data Center Frontier
Data Center Market Global Trends and Forecast to 2030 | $675.1 billion | September 2023 | IDC
The Future of Data Centers in the Post-Pandemic Era | $652.0 billion | August 2023 | Gartner
Data Center Market Size and Growth Analysis, 2023-2030 | $631.9 billion | July 2023 | ReportLinker
Global Data Center Industry Outlook 2023 | $610.8 billion | June 2023 | Frost & Sullivan


If the same ratios hold in a decade, hyperscalers would be investing about $284 billion annually, while other data centers invest about $491 billion, using the Frost & Sullivan estimate made by Gnanajothi. 
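

The arithmetic behind those figures, using Gnanajothi's numbers from above:

```python
# Hyperscaler vs. other data center capex, per the Frost & Sullivan figures.
total_today = 300        # current annual capex, USD billions
hyperscale_today = 110   # eight largest hyperscalers' share today
total_2033 = 775         # projected annual capex in a decade

share = hyperscale_today / total_today       # ~36.7% of the total
hyperscale_2033 = share * total_2033         # ~$284 billion
other_2033 = total_2033 - hyperscale_2033    # ~$491 billion
print(f"${hyperscale_2033:.0f}B hyperscale, ${other_2033:.0f}B other")
```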


On the other hand, not all of the AI impact will necessarily involve “more” commitment of resources. Since AI model training does not require the same level of redundancy as other operations, AI training might be less resource intensive than other types of workloads, Gnanajothi suggests. 


The denser footprint AI represents might also mean proportionally less demand for land and building space, compared to existing operations. 


Also, AI training operations might not always be so latency dependent, though inference operations often might require edge computing, says Phillip Marangella, EdgeConneX chief marketing officer. 


And data centers, like any other enterprise or organization, should be able to use AI to improve the efficiency of their own operations. In fact, says Bhutani, AdaniConneX already uses AI to improve safety when building or operating a facility. 


Wednesday, November 29, 2023

Here Comes the "Emerging Edge"

When we see new terms emerging, it often is a sign that a category of functions, devices or computing modes is evolving. Consider artificial intelligence and edge computing, which increasingly will be done directly on devices ranging from smartphones to sensors to autos. 


Some might call on-device AI “emerging edge,” for example, where a more traditional use of “edge” might mean processing in an in-building data center, at a metro data center or at the base of a cell tower. 


And where edge computing has been pitched as a solution for very-low-latency computing or conservation of network bandwidth, emerging edge provides ultra-low latency by eliminating the need for connectivity to any remote location at all. 


Emerging edge computing, already used for image processing and natural language processing, also is being envisioned as suitable for other types of machine learning that do not require uploading data to the cloud. 


Augmented reality, virtual reality and autonomous vehicles provide some obvious examples. 


Predictive maintenance, smart homes and smart buildings are other use cases, where temperature, humidity, occupancy and other measurements can be processed right on the appliances collecting the data. 


Likewise, wearable devices and medical sensors, retail checkout functions and logistics functions often will work right on the appliances collecting the data, as could some manufacturing or agricultural sensors as well. 


The obvious analogy is the shift from standalone personal computers to internet-connected appliances that rely on external and remote computing. 


In the case of emerging edge computing, the shift is back to the appliance or device itself. 


In some ways, such a shift also could affect our notions of what “digital infrastructure” is. 


If one assumes “digital infrastructure” refers to the transport and access layers of the computing stack, then digital infra narrowly includes the physical components that enable digital communication and data storage, such as:

  • Internet access networks

  • Data centers

  • Wholesale capacity networks

  • Wireless communication towers


On the other hand, if one considers digital infra the platform for end-user apps and services, then a broad definition of infra could include the entire ecosystem of hardware, software and services that support the digital economy, including:

  • Chips (processors, memory, storage devices)

  • Apps (mobile, web, desktop)

  • Devices (phones, PCs, AR/VR appliances, sensors)

  • Platforms (operating systems, cloud computing, databases)

  • Customer-facing retail networks (internet, mobile, satellite)

  • Cybersecurity infrastructure


“Emerging edge” allows more of the actual customer-used apps to merge with the “infra.” 


The narrow definition of infra might exclude end-user apps, platforms and devices. The broad definition might include them, and “emerging edge” is one trend that will encourage the broader definition.


Will AI Fuel a Huge "Services into Products" Shift?

As content streaming has disrupted music, is disrupting video and television, so might AI potentially disrupt industry leaders ranging from ...