Friday, September 29, 2023

AI Impact on Data Centers

Just as the use of internet mechanisms for content apps created a need for content delivery networks, it is likely that artificial intelligence, arguably a form of high-performance computing, will shape requirements for data centers. 

 

Area: expected impact

Electricity consumption: AI applications typically are more energy-intensive than traditional IT applications, because they often require high-performance computing (HPC) resources such as GPUs and FPGAs. 

Computing cycles: AI applications typically require more computing cycles than traditional IT applications, because they often involve complex mathematical operations such as matrix multiplication and convolution.

Storage: AI applications typically require more storage space than traditional IT applications, since databases must be accessed to make inferences. 

Data center design: Data centers increasingly are designed to support high-performance computing, including AI training and inference operations. Additionally, data centers are being designed to be more energy-efficient and to provide better cooling for HPC resources. 

Edge computing: Generative AI and other AI applications increasingly are deployed at the “edge” of the network, closer to where the data is generated, as well as “on the device,” in large part because inference operations often require lower latency than many other types of apps. 

Cloud computing: Generative AI and other AI applications also are driving the growth of cloud computing. Cloud providers offer a variety of AI-specific services, such as pre-trained AI models and AI development tools, making it easier for organizations to develop and deploy AI applications without investing in their own computing infrastructure. 

Chips: FPGAs and neuromorphic chips are being developed specifically for AI applications, offering significant performance and energy-efficiency improvements. 

Content delivery networks: CDNs can be used to distribute AI workloads (load balancing) or storage. 
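The “computing cycles” point can be made concrete with back-of-envelope arithmetic: a dense multiply of an m×k matrix by a k×n matrix takes roughly 2·m·k·n floating-point operations. A minimal sketch, using illustrative dimensions rather than any specific model's:

```python
def matmul_flops(m: int, k: int, n: int) -> int:
    """Approximate FLOPs for a dense (m x k) @ (k x n) matrix multiply:
    one multiply and one add for each of the m*n*k term pairs."""
    return 2 * m * k * n

# Illustrative sizes only: one 1,024-token batch against a 4,096 x 4,096
# weight matrix, roughly the shape of a single large-model layer.
flops = matmul_flops(1024, 4096, 4096)

# At a sustained 10 TFLOPS, the time for this single multiply:
milliseconds = flops / 10e12 * 1e3

print(f"{flops:,} FLOPs, about {milliseconds:.2f} ms at 10 TFLOPS")
```

Chains of such multiplies, repeated across layers, tokens and training steps, are why AI workloads consume so many more computing cycles (and watts) than typical IT applications.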


What Would an AI Smartphone Do?

I’m not sure there will ultimately be anything we might all call an “AI smartphone.” Presumably the use of AI to provide rudimentary personalized recommendations or handle natural language processing will be augmented even further. 


Additional contextual awareness and enhanced personalization based on device learning of a user’s preferences and habits seem likely avenues of advance. 


Maybe the device learns which apps you use most often and places them on the home screen more conveniently. Perhaps the device learns your favorite settings and automatically applies them to new apps.


Maybe AI smartphones use information about your location, time of day, and activity level, and then use that information to provide you with relevant information and services in context.


Predictive capabilities might also be explored. AI smartphones could use machine learning to predict your future needs and wants. If you're regularly late for meetings, your phone might suggest setting off earlier. If you're planning a trip, your phone might suggest packing certain items or booking flights and accommodation.


More robust forms of natural language processing also seem likely, beyond today’s voice-to-text features and prompts. 


Improved camera capabilities such as automatically adjusting the exposure, white balance, and focus to produce the best possible results also seem likely.


5G Fixed Wireless Probably Appeals to about 25% of the U.S. Home Broadband Market

Verizon reports that its fixed wireless customers use about 300 GB of data per month. T-Mobile, meanwhile, has reported that in 2022 its fixed wireless customers used an average of 478 GB of data each month. 


According to OpenVault, the average usage by home broadband users is about 567 GB per month. 


source: OpenVault 


One might point out that such levels of fixed wireless usage are characteristic of more than 60 percent of the whole U.S. market. But if one assumes that a fixed wireless connection will tend to run no higher than about 200 Mbps, then fixed wireless arguably appeals to about a quarter of the U.S. home broadband market. 
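The “addressable market” arithmetic can be sketched as follows. The speed-tier shares below are hypothetical placeholders for illustration, not OpenVault figures:

```python
FW_MAX_MBPS = 200  # assumed practical ceiling for a fixed wireless connection

# Hypothetical distribution of U.S. home broadband subscribers by the top
# speed of their purchased tier (illustrative shares only):
speed_tier_share = {
    100: 0.10,   # tiers up to 100 Mbps
    200: 0.15,   # tiers of 101 to 200 Mbps
    400: 0.30,   # tiers of 201 to 400 Mbps
    1000: 0.45,  # tiers above 400 Mbps
}

# Fixed wireless plausibly addresses every tier at or below its ceiling.
addressable = sum(share for top_speed, share in speed_tier_share.items()
                  if top_speed <= FW_MAX_MBPS)

print(f"Addressable share of the market: {addressable:.0%}")
```

With these placeholder shares the answer comes out to about a quarter of the market; the real figure depends on the actual tier distribution.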




Wednesday, September 27, 2023

GPU as a Service and AI Business Models

It was virtually inevitable that the rise of artificial intelligence would create new roles for data centers and computing “as a service.” Consider firms that now offer access to graphics processing units “as a service,” including virtually all leading cloud computing as a service suppliers, as well as newer entrants in the “GPU as a service” or “cloud GPU” space:


Amazon Web Services (AWS)

Microsoft Azure

Google Cloud Platform (GCP)

NVIDIA DGX Cloud

IBM Cloud

Oracle Cloud Infrastructure (OCI)

CoreWeave

Jarvis Labs

Lambda Labs

Paperspace CORE


Perhaps notable on that list are NVIDIA DGX Cloud and firms such as CoreWeave. The former represents a foray by a major GPU supplier into the “services” space, while the latter represents a new type of data center, one focused on GPU as a service. 


Those efforts are part of a broader creation of revenue and business models for AI in general. 


As you might guess, a key question for generative AI and AI overall is “what is the revenue model?” The answers depend on what part of the internet ecosystem a firm operates in, and whether the “customer” is a business or a consumer. 


Chipmakers and server suppliers will sell infrastructure. Connectivity suppliers will sell bandwidth. Data centers will sell compute cycles, storage, interconnection and security “as a service.”


Virtually every firm will embed AI into its existing core business processes, creating revenue models for software and platform suppliers. The point is that direct revenue models (selling chips, servers, software, bandwidth, compute cycles, storage, security, payment processing, analytics) are likely to be common in business-to-business settings. 


The cost of those tools, in turn, will be monetized indirectly in the form of higher profit margins, higher sales, lower operating costs, lower churn, higher add-on sales, greater awareness and longer customer life cycles, for example. 


Indirect models are likely to dominate in consumer markets. Subscriptions, transactions and advertising are the basic consumer revenue models. So, in most cases, AI is unlikely to be a “product” sold separately to consumers. Rather, it is going to be embedded in some other revenue-producing process. 


That is probably what OpenAI CEO Sam Altman means when he says the costs of intelligence are on a path to near-zero. Revenue from applying AI will be embedded in the consumer’s cost to buy things, watch things, listen to things, read things, communicate with others, use social media and find things. 


Some pay-to-play consumer models could develop, much as some firms rely on subscriptions for access to content, services and features. But consumers are price conscious, so advertising and transactions are likely to remain key alternatives to pay-to-play and subscription models. 


But GPU as a service is among the new direct revenue and business models developing around AI.


Tuesday, September 26, 2023

Why TCP/IP Also Was a Business Choice

Though the connectivity industry might well have preferred asynchronous transfer mode (ATM) over TCP/IP as the next-generation network protocol, there are obvious reasons why TCP/IP was chosen. 


Dimension                                          TCP/IP             ATM
Cost                                               Relatively low     Relatively high
Scalability                                        Very high          Not as high
Openness                                           Open standard      Proprietary
Complexity                                         Relatively simple  More complex
Suitability for a variety of network technologies  Yes                No


On the other hand, the choice of TCP/IP also had serious implications for innovation and the distribution of profit within the app ecosystem supported by multi-purpose, multimedia networks. Its layered, loosely-coupled architecture made permissionless innovation possible.


In other words, so long as an app or service is compliant with TCP/IP network protocols, no business relationship needs to exist between any internet service provider and any app creator. So app creation is “open” and permissionless, not closed. App and service innovators do not need permission, or direct business relationships, to be used on any public IP network. 


In other words, nearly all apps and services become a matter of “direct to consumer.”
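That openness can be illustrated with a minimal sketch: a client reaches any public host using nothing but protocol compliance, and no ISP along the path approves, or even sees, a business relationship. The host name is a placeholder and this is an illustration, not a production HTTP client:

```python
import socket

def build_get(host: str, path: str = "/") -> bytes:
    """Construct a bare HTTP/1.1 GET request by hand."""
    return (f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n\r\n").encode("ascii")

def http_get(host: str, path: str = "/") -> bytes:
    """Fetch a resource over a plain TCP socket. Every network in the
    path simply forwards the packets; no permission is involved."""
    with socket.create_connection((host, 80), timeout=10) as sock:
        sock.sendall(build_get(host, path))
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)

# The request below is the entire "agreement" between app and network:
print(build_get("example.com").decode("ascii"))
# Calling http_get("example.com") would return the raw response bytes.
```

Protocol compliance is the whole contract; nothing in the request identifies, or depends on, any access provider in the path.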


Loose coupling: telcos gain increased flexibility and scalability; app creators gain increased flexibility and agility; both face increased complexity.

TCP/IP protocols: telcos gain ubiquity and interoperability; app creators gain ease of use and development; both lose some control over network performance.

Software layers: telcos gain easier upgrades and maintenance; app creators gain easier innovation and feature development; both lose some control over network performance and security.


If you want to know why connectivity providers such as telcos worry so much about being "dumb pipes," the loosely-coupled, layered architecture of TCP/IP and modern software is a chief reason. 

No developer, app or content provider needs an ISP's permission to sell products directly to end users. Everything is "direct to consumer." 

Sunday, September 24, 2023

5G is to Edge Computing as WANs are to Cloud Computing

One way to look at 5G is to examine its role in supporting computing operations, rather than its role in enabling communications, much as global data transport networks can be viewed as essential parts of cloud computing infrastructure, or Wi-Fi can be viewed as a key part of the internet access function. 


Simply stated, in the internet era most “computing” is inseparable from “internet access.” In other words, data processing, storage and app consumption depend on communications. 


And some use cases demand low latency, which drives demand for edge computing. In other cases, edge computing adds value by reducing the amount of wide area network investment that has to be supported. 
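The latency side of that argument is partly simple physics: light in fiber propagates at roughly 200 kilometers per millisecond, so distance to the compute venue sets a floor under round-trip time. A rough sketch with illustrative distances (real paths add routing, queuing and processing delay on top):

```python
FIBER_KM_PER_MS = 200.0  # light in fiber covers roughly 200 km per millisecond

def propagation_rtt_ms(distance_km: float) -> float:
    """Propagation-only round-trip time to a server distance_km away."""
    return 2 * distance_km / FIBER_KM_PER_MS

# Illustrative venue distances, not measurements of any real network:
for venue, km in [("metro edge", 50), ("regional data center", 500),
                  ("remote cloud region", 2500)]:
    print(f"{venue:>20}: {propagation_rtt_ms(km):5.1f} ms round trip")
```

A metro-edge round trip stays well under a millisecond of propagation delay, while a cross-country round trip cannot, which is why latency-critical workloads gravitate toward the edge or the device.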


Use Case                           Value from Low Latency  Value from Reduced Bandwidth
Self-driving cars                  Critical                Significant
Augmented reality/virtual reality  Critical                Significant
Industrial process control         Critical                Significant
Smart cities                       Moderate                Moderate
Content delivery networks (CDNs)   Significant             Critical
Internet of Things (IoT) devices   Moderate                Significant


Edge computing value also varies among use cases. It is hard to imagine successful widespread use of self-driving vehicles without very-low-latency data processing “on the device,” accessible without wires. 


On-device is the place to put real-time language translation activities, such as translation during a voice call or during video or audio playback of content. Untethered access is essential when the devices are smartphones. 


5G also enables new cloud-based gaming services, which allow gamers to play high-end games on their mobile devices without downloading or installing any software. These use cases typically require both low latency and high bandwidth. 


5G is being used to automate industrial processes by connecting untethered devices to local servers for real-time process control. 


Proposed virtual reality use cases typically require both on-device and remote computing support, but with low latency crucial for realism. 


But many edge computing scenarios actually benefit from a mix of low latency and bandwidth-reduction value. 


5G is used to stream high-definition and ultra-high-definition video to mobile devices from edge content servers. Live streaming of sporting events and concerts to mobile devices also requires untethered access. But there the value mostly is bandwidth reduction, not so much latency. 
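The bandwidth-reduction value can be expressed as simple cache arithmetic: with a content server at the edge, only cache misses traverse the wide area network. The demand and hit-ratio figures below are hypothetical illustrations, not measurements:

```python
def wan_gbps(total_demand_gbps: float, cache_hit_ratio: float) -> float:
    """WAN capacity needed when an edge cache serves the hits locally."""
    return total_demand_gbps * (1.0 - cache_hit_ratio)

demand = 100.0  # hypothetical Gbps of video demand in one metro area
for hit_ratio in (0.0, 0.80, 0.95):
    print(f"cache hit ratio {hit_ratio:.0%}: "
          f"{wan_gbps(demand, hit_ratio):.0f} Gbps over the WAN")
```

With a 95 percent hit ratio, popular content served from the edge cuts WAN demand by a factor of 20, which is the core economic logic of content delivery networks.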


Likewise, 5G is being developed to support smart city applications, such as traffic management, public safety, and utility monitoring, though most of these apps will initially use remote computing rather than on-device computing. 


In some cases, as when delivering a self-contained on-device app, perhaps the greatest needs are simply downloading and updating the app, delivering ads and uploading usage and behavior profiles. 


In many other use cases, virtually every keystroke for a document, every frame of a video, every note of a song, every pixel of an image has to be transported to a remote server location. 


Our computing architecture includes processing on the device, on the premises, at metro or regional data centers and at remote data centers. 


Traditionally we’d describe the various key parts of the connectivity network as involving the inside home network (local area network using Ethernet or Wi-Fi); access network (home broadband); middle mile (connection between local network and nearest internet point of presence) and wide area network (long distances between points of presence). 


In all these cases, connectivity to nearby or remote resources is required. 


Edge computing, in general, is driven by the need for processing with low latency, and sometimes by the added advantage of reducing network bandwidth demand. 


Language translation “on the fly” is an example of the former; video content delivery an example of the latter. 


Object detection and recognition: AI can detect and recognize objects in images and videos, for applications such as security, surveillance and quality control. AI-powered cameras can detect intruders in a secure facility, for example, or identify defects in products on a production line.

Natural language processing (NLP): NLP can understand and respond to human language, for applications such as customer service, chatbots and voice assistants. NLP-powered chatbots can answer customer questions about products and services, for example, or help customers book appointments.

Machine learning (ML): ML trains AI models to learn from data and make predictions, for applications such as fraud detection, predictive maintenance and medical diagnosis. ML-powered models can detect fraudulent transactions, for example, or predict when a machine is likely to fail.

Computer vision: Computer vision is a field of AI dealing with the extraction of meaningful information from digital images and videos, for applications such as self-driving cars, facial recognition and medical imaging. Computer vision systems can identify pedestrians and other objects on the road, for example, or detect cancer cells in medical images.

Self-driving autos: AI at the edge powers self-driving features such as lane-keeping assist and adaptive cruise control, which by definition must use untethered mobile access. 

Smart homes: AI powers smart home devices, such as thermostats and security systems, which use untethered access and mobile networks for connectivity. 

Smart cities: AI collects and analyzes data from IoT devices, such as sensors and actuators, which use mobile networks for connectivity.

VR/AR: Uses a mix of edge and remote computing.


The point is that 5G can be viewed through the lens of “compute platform” rather than “communications,” just as cloud computing, data centers, edge computing and devices can be assessed as computing venues.


Directv-Dish Merger Fails

Directv’s termination of its deal to merge with EchoStar, apparently because EchoStar bondholders did not approve, means EchoStar continue...