
Monday, November 18, 2024

AI and Quantum Change

Lots of people, in their roles as retail investors, are hearing a great deal about “artificial intelligence winners” these days, and much of the analysis is sound enough. There will be opportunities for firms and industries to benefit from AI growth.


Even if relatively few of us invest at the angel round or are venture capitalists, most of us might also agree that AI seems a fruitful area for investment, from infrastructure (GPUs; GPU as a service; AI as a service; transport and data center capacity) to software. 


Likewise, most of us are, or expect soon to be, users of AI features in our e-commerce; social media; messaging; search; smartphone; PC and entertainment experiences.


Most of those experiences are going to be quite incremental and evolutionary in terms of benefit. Personalization will be more intensive and precise, for example. 


But we might not experience anything “disruptive” or “revolutionary” for some time. Instead, we’ll see small improvements in most things we already do. And then, at some point, we are likely to experience something really new, even if we cannot envision it, yet. 


Most of us are familiar, from experience, with the idea of “quantum change”: a sudden, significant, and often transformative shift in a system, process, or state. Think of a tea kettle on a heated stove. As the temperature of the water rises, the water remains liquid. But at some point, the water changes state, and becomes steam.


Or think of water in an ice cube tray, being chilled in a freezer. For a long time, the water remains a liquid. But at some definable point, it changes state, and becomes a solid. 


That is probably how artificial intelligence will unfold: hundreds of evolutionary changes in apps and consumer experiences that finally culminate in a qualitative change.


In the history of computing, that “quantity becomes quality” process has played out in part because new technologies reach a critical mass. Some might say these quantum-style changes result from “tipping points,” where the value of some innovation triggers widespread usage.
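
One hedged way to picture that critical-mass dynamic is the standard logistic (S-curve) diffusion model: adoption stays low for a long time, then most of the change happens in a short window around a midpoint. The sketch below is purely illustrative; the parameter values (takeoff sharpness k, midpoint year) are assumptions chosen to show the shape, not estimates of any actual market.

```python
# Illustrative only: a logistic (S-curve) diffusion model, a common way to
# describe "tipping point" adoption of new technologies. The parameter
# values below are assumptions for illustration, not forecasts.
import math

def adoption_share(t, k=1.2, t_midpoint=6.0):
    """Fraction of the addressable market that has adopted by year t.

    k sets how sharp the takeoff is; t_midpoint is the year at which half
    the market has adopted (the "tipping point" in this toy model).
    """
    return 1.0 / (1.0 + math.exp(-k * (t - t_midpoint)))

for year in range(13):
    share = adoption_share(year)
    bar = "#" * int(share * 40)          # crude text plot of the S-curve
    print(f"year {year:2d}: {share:6.1%} {bar}")
```

The flat early years correspond to the evolutionary accumulation of features described here; the steep middle of the curve is the quantum-style change.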


Early PCs in the 1970s and early 1980s were niche products, primarily for hobbyists, academics, and businesses. Not until user-friendly graphical interfaces were available did PCs seem to gain traction.


It might be hard to imagine, but GUIs that allow users to interact with devices using visual elements such as icons, buttons, windows, and menus were a huge advance over command line interfaces. Pointing devices such as a mouse, touchpad, or touch screen are far more intuitive for consumers than CLIs that require users to memorize and type commands.


In the early 1990s, the internet was mostly used by academics and technologists and was a text-based medium. The advent of the World Wide Web, graphical web browsers (such as Netscape Navigator) and commercial internet service providers in the mid-1990s made the internet user-friendly and accessible to the general public.


Likewise, early smartphones (BlackBerry, PalmPilot) were primarily tools for business professionals, using keyboard interfaces and without easy internet access. The Apple iPhone, using a new “touch” interface, with full internet access, changed all that. 


The point is that what we are likely to see with AI implementations for mobile and other devices is an evolutionary accumulation of features with possibly one huge interface breakthrough or use case that adds so much value that most consumers will adopt it. 


What is less clear are the tipping point triggers. In the past, a valuable use case sometimes was the driver. In other cases it seems the intuitive interface was key. For smartphones, it possibly was a combination of an elegant interface and multiple functions: internet access in the purse or pocket; camera, watch and PC replacement; plus voice and texting.


The point is that it is hard to identify a single “tipping point” value that made smartphones a mass market product. While no single app universally drove adoption, several categories of apps--social media, messaging, navigation, games, utilities and productivity--combined with an intuitive user interface, app stores and full internet access to make the smartphone a mass market product.


Regarding consumer AI integrations across apps and devices, we might see a similar process. AI will be integrated in an evolutionary way across most consumer experiences. But then one particular crystallization event (use case, interface, form factor or something else) will be the trigger for mass adoption.


The point is that underlying details of the infrastructure (operating systems, chipsets) do not drive end user adoption. What we tend to see is that some easy-to-use, valuable use case or value proposition suddenly emerges after a long period of gradual improvements.


For a long time, we’ll be aware of incremental changes in how AI is applied to devices and apps. The changes will be useful but evolutionary. 


But, eventually, some crystallization event will occur, producing a qualitative change, as all the various capabilities are combined in some new way. 


“AI,” by itself, is not likely to spark a huge qualitative shift in consumer behavior or demand. Instead, a gradual accumulation of changes including AI will set the stage for something quite new to emerge.


That is not to deny the important changes in the ways we find things, shop, communicate, learn or play. For suppliers, it will matter whether AI displaces some amount of search, shifts retail volume or reshapes social media personalization.


But users and consumers are unlikely to see disruptive new possibilities for some time, until ecosystems are more fully built out and some unexpected innovation finally creates a tipping point such as the “iPhone moment”: a transformative, game-changing event or innovation that disrupts an industry or fundamentally alters how people interact with technology, products, or services.


It might be worth noting that such "iPhone moments" often involve combining pre-existing technologies in a novel way. The Tesla Model S, ChatGPT, Netflix, social media and search might be other examples. 


We’ll just have to keep watching.


Saturday, November 9, 2024

Eventually, "Back to the Future" for Lumen Technologies

Eventually, Lumen Technologies will go back to the future, reversing the mashup of its focused data transport and enterprise customer base with its legacy telco local access businesses.


Right now, Lumen arguably is too small to compete with the likes of AT&T, Verizon and T-Mobile, but too large to effectively compete with local internet service providers.


And if Lumen separates its data transport operations from its local access businesses, it will essentially recreate itself as a bigger version of Level 3 Communications, shedding its legacy local access business. Still, it will be a smaller company than it is today.


The reason is simply that the capacity business is smaller than the consumer communications business Lumen would leave behind.


Looking at data transport revenue for major U.S. fixed network providers in 2023, excluding mobile revenues, Lumen’s position is arguably quite different from that of AT&T and Verizon. Because both of the latter firms have such huge mobile services businesses, the data transport portion of total revenue is relatively smaller than at Lumen. 


Though business revenues represent about 30 percent of Verizon’s revenue, and about the same percentage at AT&T, business revenues are 45 percent of Lumen Technologies’ revenue. Data transport is a portion of total business revenue.


Company | Total Revenue (USD Billion) | Business Segment (USD Billion) | Consumer Segment (USD Billion)
Lumen Technologies | $14.56 | $6.6 (Enterprise) | $3.1
AT&T | $120.7 | $36.3 (Business Wireline) | $15.1
Verizon | $134.0 | $39.6 (Business Solutions) | $21.8
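
As a quick sanity check, the segment shares implied by the table can be computed directly from the revenue figures shown; the sketch below is just arithmetic on those numbers (USD billions, 2023), not additional data.

```python
# Arithmetic check of the segment shares implied by the revenue table above.
# All figures are the 2023 revenue numbers quoted in the table (USD billions).
revenue = {
    "Lumen Technologies": {"total": 14.56, "business": 6.6, "consumer": 3.1},
    "AT&T": {"total": 120.7, "business": 36.3, "consumer": 15.1},
    "Verizon": {"total": 134.0, "business": 39.6, "consumer": 21.8},
}

for company, r in revenue.items():
    business_share = r["business"] / r["total"]
    consumer_share = r["consumer"] / r["total"]
    print(f"{company:20s} business {business_share:5.1%}  consumer {consumer_share:5.1%}")
```

Those ratios work out to roughly 45 percent business at Lumen and roughly 30 percent at AT&T and Verizon, consistent with the percentages noted above.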


With the caveat that it can be difficult to separate out data transport revenues, such revenues are likely in single-digit billions of dollars for leading transport providers in the U.S. market. 


Company | Core Network Data Transport Revenue
Lumen Technologies | ~$4.7 billion (2023)
AT&T | ~$9 billion (2023)
Verizon | ~$7 billion (2023)
Charter (Spectrum Enterprise) | ~$1.5 billion (2023)
Comcast (Comcast Business) | ~$2 billion (2023)
Cox Communications | ~$1 billion (2023)


The point is simply that Lumen Technologies is different from the other noted providers in having a relatively small consumer business to rely upon for revenue generation. And that consumer business relies principally on the local access facilities, not the wide area data transport network. The other providers have substantial consumer revenue operations in both mobility and fixed network realms. 


Lumen does not have mobile revenue exposure and has a relatively small consumer revenue footprint, as business segment revenues are routinely about 78 percent of total revenues. 


That is the result of a sort of “mashup” of data transport assets with a traditional telco base that always was the least-dense of all former Bell company geographies. That also means it is less feasible for Lumen to upgrade its fixed network for fiber services. 


For that reason, it always has seemed reasonable to assume that, at some point, the former U.S. West (Qwest) assets would be separated from the core data transport assets. 


Looking at the acquisitions and asset dispositions Lumen has made over the past couple of decades, you can see the logic. 


U.S. West, the former telco, was acquired by Qwest Communications, a long-haul data transport and metro fiber company, in 2000. That was the first mashup.


Then, in 2011, Qwest was acquired by CenturyLink, a Louisiana-based telecom provider with a largely rural and consumer footprint. That would seem to be a move back in the direction of local access operations. In fact, CenturyLink sought a bigger role in business and enterprise services, but the acquisition of Qwest also gave the new CenturyLink a greater consumer services footprint. 


The 2011 CenturyLink acquisition of Savvis gave CenturyLink a bigger footprint in data center, cloud computing, and managed hosting services, rebalancing a bit back to enterprise and business revenue.


But it was the acquisition of Level 3 Communications in 2017 that completed the mashup, given Level 3’s huge presence in long-haul data transport, metro fiber operations and international assets. Level 3 had focused solely on enterprise and wholesale capacity operations, not consumer local access.


The first step towards reconfiguring the asset base came as Lumen divested its local telephone business in 20 states in 2021, shrinking the company’s revenue and highlighting its debt profile. 


The next shoe to drop, some would argue, is a separation of the remaining former U.S. West local access assets from the assembled portfolio of global capacity assets. If Lumen retains the capacity portfolio, while shedding the local access assets, it would be a smaller, focused capacity business, as was Level 3 Communications. 


Back to the future, in other words.


Wednesday, November 6, 2024

We Might Have to Accept Some Degree of AI "Not Net Zero"

An argument can be made that artificial intelligence operations will consume vast quantities of electricity and water, as well as create lots of new e-waste. It's hard to argue with that premise. After all, any increase in human activity--including computing intensity--will have that impact.


Some purists might insist we must be carbon neutral or not do AI. Others of us might say we need to make the same sorts of trade-offs we must make every day, for all our activities that have some impact on water, energy consumption or production of e-waste.


We have to balance outcomes and impacts, benefits and costs, while working over time to minimize those impacts. Compromise, in other words.


Some of us would be unwilling to accept "net zero" outcomes if they require poor people to remain poor and hungry people to remain hungry.


And not all of the increase in e-waste, energy or water consumption is entirely attributable to AI operations. Some portion of the AI-specific investment would have been made in any case to support the growth of demand for cloud computing. 


So there is a “gross” versus “net” assessment to be made of the data center power, water and e-waste impacts resulting from AI operations.


By definition, all computing hardware will eventually become “e-waste.” So use of more computing hardware implies more e-waste, no matter whether the use case is “AI” or just “cloud computing.” And we will certainly see more of both. 


Also, “circular economy” measures will certainly be employed to reduce the gross amount of e-waste for all servers. So we face a dynamic problem: more servers, perhaps faster server replacement cycles, more data centers and capacity, offset by circular economy efficiencies and hardware and software improvements. 


Study Name | Date | Publishing Venue | Key Conclusions
The E-waste Challenges of Generative Artificial Intelligence | 2023 | ResearchGate | Quantifies server requirements and e-waste generation of generative AI (GAI). Finds that GAI will grow rapidly, with potential for 16 million tons of cumulative waste by 2030. Calls for early adoption of circular economy measures.
Circular Economy Could Tackle Big Tech Gen-AI E-Waste | 2023 | EM360 | Introduces a computational framework to quantify and explore ways of managing e-waste generated by large language models (LLMs). Estimates annual e-waste production could increase from 2.6 thousand metric tons in 2023 to 2.5 million metric tons per year by 2030. Suggests circular economy strategies could reduce e-waste generation by 16-86%.
AI has a looming e-waste problem | 2023 | The Echo | Estimates generative AI technology could produce 1.2-5.0 million tonnes of e-waste by 2030 without changes to regulation. Suggests circular economy practices could reduce this waste by 16-86%.
E-waste from generative artificial intelligence | 2024 | Nature Computational Science | Predicts AI could generate 1.2-5.0 million metric tons of e-waste by 2030; suggests circular economy strategies could reduce this by up to 86%.
AI and Compute | 2023 | OpenAI (blog) | Discusses exponential growth in computing power used for AI training, implying potential e-waste increase, but doesn't quantify net impact.
The carbon footprint of machine learning training will plateau, then shrink | 2024 | MIT Technology Review | Focuses on energy use rather than e-waste, but suggests efficiency improvements may offset some hardware demand growth.
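
Taking the ranges quoted in these studies at face value, a bit of arithmetic shows how wide the plausible outcomes remain even with aggressive circular-economy measures. The sketch below simply combines the published 1.2-5.0 million metric ton estimates for 2030 with the 16-86 percent reduction range; it introduces no new estimates.

```python
# Illustrative arithmetic on the ranges quoted in the studies above:
# 1.2 to 5.0 million metric tons of cumulative GAI e-waste by 2030,
# reduced by 16% to 86% if circular-economy measures are adopted.
ewaste_2030_mt = (1.2, 5.0)   # million metric tons, low and high estimates
reductions = (0.16, 0.86)     # circular-economy reduction, low and high

for base in ewaste_2030_mt:
    for cut in reductions:
        net = base * (1 - cut)
        print(f"base {base:.1f} Mt, reduction {cut:.0%}: ~{net:.2f} Mt remaining")
```

Even the most optimistic combination leaves a net figure well above zero, which is the “gross” versus “net” point made above.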


Monday, November 4, 2024

AI Model Inference Costs Will Decline 20% to 30% Per Year

Despite concern over the high capital investment in infrastructure to support generative artificial intelligence models, many studies suggest that costs for inference, which should ultimately be the primary ongoing cost, will drop over time, as have costs for other computing instances.


Study Title | Date | Publication Venue | Key Conclusions on Cost Declines
Scaling and Efficiency of Deep Learning Models | 2019 | NeurIPS | Demonstrates how advances in model scaling (larger models running on optimized hardware) lead to inference cost reductions of around 20-30% per year.
The Hardware Lottery | 2020 | Communications of the ACM | Highlights the role of specialized hardware (GPUs, TPUs) in reducing AI inference costs, estimating a 2x decrease every 1-2 years with hardware evolution.
Efficient Transformers: A Survey | 2021 | Journal of Machine Learning Research | Describes optimization techniques (such as pruning and quantization) that contribute to cost declines, estimating an average 30-50% drop in inference costs over two years.
The Financial Impact of Transformer Model Scaling | 2022 | IEEE Transactions on AI | Examines the economic impacts of scaling transformers and shows that large models can reduce costs by ~40% through efficiencies gained in distributed inference and hardware.
Inference Cost Trends in Large AI Deployments | 2023 | ICML | Finds a 50% reduction in inference costs per year for large-scale deployments, driven by optimizations in distributed computing and custom AI chips.
Beyond Moore's Law: AI-Specific Hardware Innovations | 2023 | MIT Technology Review | Discusses how specialized hardware design reduces inference costs by 2-4x every 2 years, shifting from general-purpose GPUs to domain-specific architectures.
Optimizing Inference Workloads: From Data Center to Edge | 2024 | ArXiv | Analyzes cost reductions from 2020 to 2024 for both data center and edge deployments, concluding that distributed systems and model compression lead to 50% annual cost drops.


The implication is that inference costs should continue to drop. 


Year | Cost per Inference ($) | Cost Decline Compared to Prior Year
2018 | 1.00 | -
2019 | 0.80 | 20%
2020 | 0.50 | 37.5%
2021 | 0.30 | 40%
2022 | 0.15 | 50%
2023 | 0.08 | 47%
2024 | 0.04 | 50%
2025 | 0.02 | 50%
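
The year-over-year declines in the table follow directly from the normalized cost figures; the short sketch below simply redoes that arithmetic and adds the implied compound annual decline. The cost values are the illustrative normalized figures from the table, not measured prices.

```python
# Reproduce the year-over-year declines from the normalized cost-per-inference
# figures in the table above, then compute the implied compound annual decline.
costs = {2018: 1.00, 2019: 0.80, 2020: 0.50, 2021: 0.30,
         2022: 0.15, 2023: 0.08, 2024: 0.04, 2025: 0.02}

years = sorted(costs)
for prev, curr in zip(years, years[1:]):
    decline = 1 - costs[curr] / costs[prev]
    print(f"{curr}: {decline:.0%} lower than {prev}")

span = years[-1] - years[0]
avg_decline = 1 - (costs[years[-1]] / costs[years[0]]) ** (1 / span)
print(f"Implied average annual decline, {years[0]}-{years[-1]}: {avg_decline:.0%}")
```

On these particular figures the compound average works out to roughly 40 percent per year, toward the higher end of the study estimates summarized earlier.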


Of course, a trend toward larger models, using more parameters, will run counter to that trend where model building is concerned. Still, AI model-building (training) costs decline over time because of hardware acceleration, improved algorithms and model design optimization; the sketch after the following table illustrates how those two forces net out.


Study | Date | Publication Venue | Key Conclusions on Cost Declines
Scaling Neural Networks with Specialized Hardware | 2018 | NeurIPS | Describes how hardware advances, especially GPUs and early TPUs, helped reduce model-building costs by around 50% annually for larger models compared to CPU-only setups.
Reducing the Cost of Training Deep Learning Models | 2019 | IEEE Spectrum | Shows a 40% cost reduction for model training per year through advances in parallel computing and early model optimizations such as batch normalization and weight sharing.
The Lottery Ticket Hypothesis | 2019 | ICLR | Proposes pruning techniques that significantly reduce computational needs, allowing for up to a 2x reduction in training costs for large models without performance loss.
Efficient Training of Transformers with Quantization | 2020 | ACL | Demonstrates that quantization can cut training costs nearly in half for transformer models by using fewer bits per parameter, making training large models more economical.
Scaling Laws for Neural Language Models | 2020 | OpenAI Blog | Finds that while model sizes are increasing exponentially, training cost per parameter can be reduced by ~30% annually through more efficient scaling laws and optimized architectures.
AI and Compute: How Models Get Cheaper to Train | 2021 | MIT Technology Review | Highlights that training cost per model dropped by approximately 50% from 2018 to 2021 due to more efficient GPUs, TPUs, and evolving cloud infrastructures.
Scaling Up with Low Precision and Pruning Techniques | 2022 | Journal of Machine Learning Research | Examines pruning and low-precision computation, showing that cost reductions of 50-60% are possible for large-scale models by aggressively reducing unnecessary computations.
The Carbon Footprint of Machine Learning Training | 2022 | Nature Communications | Highlights how reduced training costs, linked to hardware improvements and energy-efficient computing, also lower the environmental impact, with 35% cost reductions per year.
Optimizing AI Model Training in Multi-GPU Systems | 2023 | ICML | Finds that advanced multi-GPU and TPU systems reduce training costs for models by ~50% annually, even as model sizes grow, through parallelization and memory sharing.
Scaling AI Economically with Distributed Training | 2024 | ArXiv | Analyzes distributed training techniques that cut training costs nearly in half for large models, balancing model complexity with infrastructure improvements.

AI model creation costs are quite substantial, representing perhaps an order of magnitude more capital intensity than cloud computing did, for example. But capital intensity should decline over time, as it has for other computing instances.


Directv-Dish Merger Fails

Directv’s termination of its deal to merge with EchoStar, apparently because EchoStar bondholders did not approve, means EchoStar continue...