Friday, December 1, 2023

Here Come the NPUs

As important as central processing units (CPUs) and graphics processing units (GPUs) are to modern computing, other application-specific integrated circuits, and now neural processing units, are gaining importance as artificial intelligence becomes a fundamental part of computing and computing devices. 


A neural processing unit (NPU) is a specialized microprocessor designed to accelerate machine learning algorithms, particularly those involving artificial neural networks (ANNs). 


Often called “AI accelerators,” neural processing units are dedicated hardware that handles specific machine learning tasks such as computer vision algorithms. You can think of them much like a GPU, but for AI rather than graphics.


Though useful for virtually any AI processing task, in any setting, NPUs will be especially vital for on-device smartphone processing, as they reduce power consumption. 


NPUs are specifically designed to handle the large matrix operations that are common in ANNs, making them much faster and more efficient than traditional CPUs or GPUs for these tasks.
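

To make “large matrix operations” concrete, here is a minimal sketch (plain NumPy, with illustrative layer sizes rather than figures from any particular chip) of the multiply-accumulate workload that dominates a dense neural-network layer, and that NPUs are built to parallelize:

```python
import numpy as np

# A dense (fully connected) layer is essentially one large matrix multiply:
# activations (batch x inputs) times weights (inputs x outputs), plus a bias.
batch, n_in, n_out = 64, 1024, 4096   # illustrative sizes only

x = np.random.rand(batch, n_in).astype(np.float32)   # input activations
W = np.random.rand(n_in, n_out).astype(np.float32)   # layer weights
b = np.zeros(n_out, dtype=np.float32)                # bias

y = np.maximum(x @ W + b, 0)   # matmul + ReLU: ~268 million multiply-adds

# An NPU executes this multiply-accumulate pattern across many parallel
# hardware units, rather than looping through it on general-purpose cores.
```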


Producers of NPUs already include a “who’s who” list of suppliers:


  • Google's Tensor Processing Unit

  • Intel's Nervana

  • NVIDIA's AI Tensor Cores, integrated into NVIDIA's GPUs

  • IBM's TrueNorth

  • Graphcore's Intelligence Processing Unit

  • Wave Computing's Data Processing Unit

  • Cambricon's Machine Learning Unit

  • Huawei's NPU

  • Qualcomm's AI Engine, integrated into Qualcomm's mobile processors 


Why are they used? Performance, efficiency, latency. 


NPUs can provide significant performance improvements over CPUs and GPUs for machine learning tasks. They are also more efficient, consuming less power and producing less heat, and they can reduce the latency of machine learning workloads. 


NPUs are used for natural language processing, computer vision and recommendation systems, and to power autonomous vehicles, for example. 


"Back to the Future" as Extended Edge Develops

In some ways, extended edge AI processing on devices, rather than at remote cloud computing centers, is a form of “back to the future” technology, recalling the days when most data processing happened directly on devices (PCs). 


That suggests a substantial movement back to a distributed, decentralized processing environment, in large part reversing the centralized model the cloud era brought into being. 


Just as certainly, extended edge will reshape the smartphone, smartwatch and PC markets, as those devices are outfitted to handle AI directly on board. 


If one takes the current retail market value of smartphones, PCs and smartwatches, and then assumes adoption rates between 70 percent and 90 percent by 2033, the markets supporting extended edge will be quite substantial, with software, hardware, chip, platform, manufacturing and connectivity revenues increasing anywhere from two to five times. 


Market        | Market Size in 2023 (USD billion) | Estimated Market Size with Edge AI in 2033 (USD billion) | Percentage of AI-Capable Devices in 2033
Smartphones   | 430                               | 1,999                                                    | 90%
PCs           | 250                               | 600                                                      | 80%
Smartwatches  | 30                                | 150                                                      | 70%
Total         | 710                               | 2,749                                                    | 80%
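

As a sanity check on the “two to five times” range, the growth multiples implied by the table's own figures (illustrative estimates from this post, not external data) work out as follows:

```python
# Growth multiples implied by the table above (USD billions).
markets = {
    "Smartphones":  (430, 1_999),
    "PCs":          (250, 600),
    "Smartwatches": (30, 150),
}

for name, (size_2023, size_2033) in markets.items():
    print(f"{name}: {size_2033 / size_2023:.1f}x")   # 4.6x, 2.4x, 5.0x

total_2023 = sum(v[0] for v in markets.values())     # 710
total_2033 = sum(v[1] for v in markets.values())     # 2,749
print(f"Total: {total_2033 / total_2023:.1f}x")      # ~3.9x
```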


Just as PCs made computing power available to anyone with a computer, extended edge AI is making AI capabilities accessible to a wider range of devices and users, right on the device itself.


Extended edge AI also will embed AI operations into everyday objects and environments, enabling a range of new applications requiring immediate response (low latency). 


That will require more powerful processors and more storage, which can be a problem for smartphones and wearable devices with limited resources.


Increased power consumption also will be an issue. And AI models will have to be updated. 


Over-the-air updates, federated learning (where devices train a shared model without exchanging raw data), model compression and quantization, model pruning, knowledge distillation and adaptive learning are all tools designers can use to keep AI models running on extended edge devices up to date. 
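

Of those techniques, quantization is the easiest to illustrate. A minimal sketch (plain NumPy, illustrative only, not any vendor's toolchain): 32-bit float weights are mapped to 8-bit integers plus a scale factor, cutting model storage and update-download size roughly four-fold.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 plus a per-tensor scale factor."""
    scale = np.abs(weights).max() / 127.0         # largest value maps to +/-127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(1024, 1024).astype(np.float32)   # illustrative layer
q, scale = quantize_int8(w)

# Storage drops from 4 bytes to 1 byte per weight; the cost is a small
# rounding error in the recovered values.
err = np.abs(dequantize(q, scale) - w).max()
print(f"4x smaller, max reconstruction error: {err:.4f}")
```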


Model pruning techniques identify and remove redundant or less important connections within the AI model, reducing its complexity without significantly impacting performance. Knowledge distillation involves transferring the knowledge from a large, complex model to a smaller, more efficient model, preserving the original model's capabilities.
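

Two matching sketches, again illustrative rather than a production recipe: magnitude pruning zeroes the smallest-magnitude weights, and a distillation loss trains a small “student” model to match a larger “teacher” model's softened output distribution.

```python
import numpy as np

# Magnitude pruning: zero out the weights with the smallest absolute values.
def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

w = np.random.randn(512, 512).astype(np.float32)
w_pruned = prune_by_magnitude(w, sparsity=0.9)   # keep only the largest 10%

# Knowledge distillation: the loss term that pushes the student's predicted
# distribution toward the teacher's temperature-softened distribution.
def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)   # for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -np.mean(np.sum(p_teacher * np.log(p_student + 1e-9), axis=-1))
```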


Adaptive learning algorithms enable AI models to continuously learn and adapt to changing environments and user behavior.


Thursday, November 30, 2023

Data Center Capex Will More Than Double over 10 Years: "How Much" is the Only Issue

Among the predictions about the impact of artificial intelligence on data centers are a few salient statistics. Sanjay Bhutani, AdaniConneX chief business officer, notes that AI will drive data center capacity from 54 gigawatts to perhaps 90 gigawatts. 


Likewise, Gautham Gnanajothi, Frost & Sullivan global VP, estimates data center investment will grow from about $300 billion to $775 billion over the next 10 years. The eight largest hyperscalers currently represent about $110 billion of that $300 billion in annual global investment. 


Capex estimates vary from firm to firm, depending on assumptions about growth rates. Generally speaking, earlier studies show lower expected capex in 2033, compared to more-recent studies that include assumptions about additional requirements to support AI. 


Publisher                  | Predicted Data Center Capex (2033) | Date of Prediction
Frost & Sullivan           | $610 billion                       | 2023-10-04
IDC                        | $570 billion                       | 2023-09-27
Gartner                    | $530 billion                       | 2023-08-15
MarketsandMarkets          | $590 billion                       | 2023-07-12
Mordor Intelligence        | $630 billion                       | 2023-06-08
Grand View Research        | $550 billion                       | 2023-05-03
Technavio                  | $510 billion                       | 2023-04-19
Fortune Business Insights  | $560 billion                       | 2023-03-07
Allied Market Research     | $540 billion                       | 2023-02-14
Statista                   | $580 billion                       | 2023-01-10



Study Title | Capital Investment Prediction (2033) | Date of Publication | Publisher
Global Data Center Market Size, Trends, and Forecasts, 2022-2030 | $814 billion | September 2023 | Fortune Business Insights
Data Center Market Size, Share, Trends & Growth, 2023-2032 | $795.2 billion | October 2023 | Grand View Research
Data Center Market Global Report 2023 | $768.7 billion | November 2023 | Market Research Future
Data Center Market Size, Share & Trends Analysis Report by Component (IT Infrastructure, Facility Infrastructure), by Deployment Type (On-Premises, Cloud), by Vertical (BFSI, IT & Telecom, Retail, Healthcare), Forecast, 2023-2028 | $742.5 billion | November 2023 | Allied Market Research
Data Center Market - Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2023-2028 | $721.4 billion | November 2023 | Global Market Insights
Future of Data Centers: 2023-2033 | $698.2 billion | October 2023 | Data Center Frontier
Data Center Market Global Trends and Forecast to 2030 | $675.1 billion | September 2023 | IDC
The Future of Data Centers in the Post-Pandemic Era | $652.0 billion | August 2023 | Gartner
Data Center Market Size and Growth Analysis, 2023-2030 | $631.9 billion | July 2023 | ReportLinker
Global Data Center Industry Outlook 2023 | $610.8 billion | June 2023 | Frost & Sullivan


If the same ratios hold in a decade, hyperscalers would be investing about $284 billion annually, while other data centers invest about $491 billion, using the Frost & Sullivan estimate made by Gnanajothi. 
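

The arithmetic behind those figures, using Gnanajothi's numbers from above:

```python
# Hyperscaler vs. other data center capex, per the Frost & Sullivan figures.
total_today = 300        # current annual capex, USD billions
hyperscale_today = 110   # eight largest hyperscalers' share today
total_2033 = 775         # projected annual capex in a decade

share = hyperscale_today / total_today       # ~36.7% of the total
hyperscale_2033 = share * total_2033         # ~$284 billion
other_2033 = total_2033 - hyperscale_2033    # ~$491 billion
print(f"${hyperscale_2033:.0f}B hyperscale, ${other_2033:.0f}B other")
```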


On the other hand, not all of the AI impact will necessarily involve “more” commitment of resources. Since AI model training does not require the same level of redundancy as other operations, AI training might be less resource intensive than other types of workloads, Gnanajothi suggests. 


The denser footprint AI represents might also mean proportionally less demand for land and building space, compared to existing operations. 


Also, AI training operations might not always be so latency dependent, though inference operations often might require edge computing, says Phillip Marangella, EdgeConneX chief marketing officer. 


And data centers, like any other enterprise or organization, should be able to use AI to improve the efficiency of their own operations. In fact, says Bhutani, AdaniConneX already uses AI to improve safety when building or operating a facility. 


Wednesday, November 29, 2023

Here Comes the "Emerging Edge"

When we see new terms emerging, it often is a sign that a category of functions, devices or computing modes is evolving. Consider artificial intelligence and edge computing, which increasingly will be done directly on devices ranging from smartphones to sensors to autos. 


Some might call on-device AI “emerging edge,” for example, where a more traditional use of “edge” might mean processing in an in-building data center, at a metro data center or at the base of a cell tower. 


And where edge computing has been pitched as a solution for very-low-latency computing or conservation of network bandwidth, emerging edge provides ultra-low latency by eliminating the need for connectivity to any remote location at all. 


Emerging edge computing, already used for image processing and natural language processing, also is being envisioned as suitable for other types of machine learning that do not require uploading data to the cloud. 


Augmented reality, virtual reality and autonomous vehicles provide some obvious examples. 


Predictive maintenance, smart homes and smart buildings are other use cases, where temperature, humidity, occupancy and other measurements can be processed right on the appliances collecting the data. 


Likewise, wearable devices and medical sensors, retail checkout functions and logistics functions often will work right on the appliances collecting the data, as could some manufacturing or agricultural sensors as well. 


The obvious analogy is the shift from standalone personal computers to internet-connected appliances that rely on external and remote computing. 


In the case of emerging edge computing, the shift is back to the appliance or device itself. 


In some ways, such a shift also could affect our notions of what “digital infrastructure” is. 


If one assumes “digital infrastructure” refers to the transport and access layers of the computing stack, then digital infra narrowly includes the physical components that enable digital communication and data storage, such as:

  • Internet access networks

  • Data centers

  • Wholesale capacity networks

  • Wireless communication towers


On the other hand, if one considers digital infra the platform for end-user apps and services, then a broad definition of infra could include the entire ecosystem of hardware, software and services that support the digital economy, including:

  • Chips (processors, memory, storage devices)

  • Apps (mobile, web, desktop)

  • Devices (phones, PCs, AR/VR appliances, sensors)

  • Platforms (operating systems, cloud computing, databases)

  • Customer-facing retail networks (internet, mobile, satellite)

  • Cybersecurity infrastructure


“Emerging edge” allows more of the actual customer-used apps to merge with the “infra.” 


The narrow definition of infra might exclude end-user apps, platforms and devices. The broad definition might include them, and “emerging edge” is one trend that will encourage the broader definition.


Will AI Fuel a Huge "Services into Products" Shift?

As content streaming has disrupted music, is disrupting video and television, so might AI potentially disrupt industry leaders ranging from ...