Friday, January 31, 2025

Comcast "Low Lag" Consumer Internet Access Service Gets Commercialized

Network neutrality rules have barred the sort of quality-assurance features for consumer service that Comcast now is preparing to introduce nationwide, in stages. But such rules now are in abeyance in the U.S. market. 


The “low-lag” service aims to improve experience for “interactive applications like gaming, videoconferencing, and virtual reality.” 


Of course, behind whatever marketing hype we can expect to hear are some physical realities. Since the internet is a “network of networks,” neither bandwidth nor latency is generally under any single participant’s control. Whatever performance is claimed on any single physical infrastructure, the end-to-end path packets take is non-deterministic.


In other words, the exact path cannot be specified rigorously in all cases, which means actual performance is difficult to guarantee. For Comcast, which uses a hybrid fiber coax access network, there are other practical considerations as well.


It is generally agreed that latency performance is best on a fiber-to-home connection, moderate on a hybrid fiber coax or copper digital subscriber line connection, and worst on a geosynchronous satellite connection (which is one touted advantage of internet access from low-earth-orbit satellite constellations). 


The point is that lots of independent variables must be controlled to ensure low end-to-end latency performance. 

| Latency Source | Typical Contribution (%) | Description |
| --- | --- | --- |
| In-Home Network (Wi-Fi, Router, LAN) | 5–20% | Wi-Fi interference, old routers, and internal LAN delays can introduce latency. Ethernet generally has lower latency than Wi-Fi. |
| ISP Core & Access Network | 10–30% | Delays within the ISP's infrastructure, including fiber, cable, or DSL transmission, routing, and congestion effects. |
| Internet Backbone & Peering | 20–50% | Transit across multiple networks, routing inefficiencies, and the number of hops between ISPs contribute to this latency. |
| Far-End Server Processing | 10–40% | The speed at which the destination server processes and responds to requests, affected by server load, geographic distance, and CDN availability. |


Processing delays in routers or switches can affect latency, but so do the distance a packet has to travel and the actual choice of networks over which any particular packet is forwarded. 


As a rule, observers expect the lowest latency (1–5 ms) on optical fiber networks. HFC and DSL latency is more often characterized as 10 ms to 30 ms. Geosynchronous satellite connections have high latency (500–700 ms).


But latency can arise for any number of reasons. Long distances are an issue. So is network congestion caused by heavy router and switch demand at peak usage hours. Packet routes with more “hops” (segments) increase latency as well. 
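The distance contribution is easy to estimate from first principles: light in optical fiber travels at roughly two-thirds the vacuum speed of light, about 200,000 km per second, or about 200 km per millisecond. A minimal sketch (the 4,000 km path length is an illustrative assumption, not a measured route):

```python
# Light in optical fiber travels at roughly 2/3 the vacuum speed of light,
# about 200,000 km/s, so roughly 1 ms of one-way delay per 200 km of fiber.
SPEED_IN_FIBER_KM_PER_MS = 200.0

def propagation_delay_ms(path_km: float, round_trip: bool = True) -> float:
    """Propagation delay alone, ignoring queuing and processing delays."""
    one_way = path_km / SPEED_IN_FIBER_KM_PER_MS
    return 2 * one_way if round_trip else one_way

# Example: an assumed ~4,000 km long-haul fiber path.
print(propagation_delay_ms(4000))  # 40.0 ms round trip, before any congestion
```

This is a floor, not a forecast: real routes are longer than the straight-line distance, and queuing, processing, and server delays stack on top.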


Latency also can increase when many applications are used concurrently on a bandwidth-limited connection. 


The physical condition of all network elements (switches, routers, cables, connectors) also makes a difference, as do signal interference for Wi-Fi routers and signal barriers such as walls. 


Server-side delays on the far end of a consumer’s internet connection also play a role in latency performance. 


Latency is a different issue from bandwidth, and arguably a more complex problem to solve for an internet access end user.


Latency is the delay in data transmission, measured in milliseconds (ms). It represents how long it takes for a data packet to travel from the source to the destination and back. High latency causes lag, which is especially noticeable in real-time applications like video calls, gaming, or financial trading.
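Because latency is defined as round-trip time, it can be approximated from an end user’s machine without special tools, for instance by timing a TCP handshake. A minimal sketch (the host and port passed in are whatever endpoint the user wants to probe; this measures handshake time, a rough proxy for one round trip):

```python
import socket
import time

def tcp_rtt_ms(host: str, port: int = 443, timeout: float = 2.0) -> float:
    """Estimate round-trip latency by timing a TCP connection handshake.

    The three-way handshake completes in roughly one round trip, so the
    elapsed wall-clock time approximates RTT plus local stack overhead.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close immediately
    return (time.perf_counter() - start) * 1000.0
```

Repeating the measurement and taking the minimum filters out transient queuing delay, which is why tools like ping report minimum as well as average RTT.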


ISPs use several techniques to reduce latency, including optimized packet routing, aided by direct peering arrangements with other transport providers. Tuning routing protocols such as BGP to make path selection more deterministic also helps. 


Since distance contributes to latency, content delivery networks are used to put content closer to actual end users. Edge computing and server colocation are forms of this strategy. 
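The CDN idea reduces the distance term directly: serve each request from the nearest replica rather than a distant origin. A toy sketch (the edge locations and distances are invented for illustration):

```python
# Toy CDN-style server selection: pick the edge site nearest the user,
# since shorter paths mean lower propagation delay. Site names and
# distances (km) are illustrative assumptions.
EDGE_SITES = {"nyc": 50, "chicago": 1200, "la": 4000}

def nearest_edge(sites: dict) -> str:
    """Return the site with the smallest distance to the user."""
    return min(sites, key=sites.get)

print(nearest_edge(EDGE_SITES))  # "nyc"
```

Real CDNs select replicas by measured latency, load, and anycast routing rather than raw distance, but the principle is the same.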


Traffic shaping is another possible tactic, giving some classes of traffic priority over delivery of other, less latency-sensitive traffic. Prioritizing videoconferencing, voice, virtual reality or gaming bits are examples. 
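The prioritization logic can be sketched as a strict-priority queue: latency-sensitive classes always drain before bulk traffic. This is a toy model, not Comcast’s implementation; the class names and priority values are assumptions:

```python
import heapq
import itertools

# Hypothetical traffic classes; lower number = higher scheduling priority.
PRIORITY = {"voice": 0, "gaming": 0, "videoconference": 1, "web": 2, "bulk": 3}

class PriorityScheduler:
    """Strict-priority packet scheduler: sensitive classes dequeue first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # FIFO tie-break within a class

    def enqueue(self, traffic_class: str, packet) -> None:
        heapq.heappush(self._heap, (PRIORITY[traffic_class], next(self._seq), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

sched = PriorityScheduler()
sched.enqueue("bulk", "backup-chunk")
sched.enqueue("gaming", "game-frame")
sched.enqueue("web", "page-request")
print(sched.dequeue())  # "game-frame" leaves first despite arriving later
```

Production schedulers add safeguards (weighted fair queuing, rate limits) so that high-priority classes cannot starve bulk traffic entirely.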


Other methods to avoid excessive buffering or congestion also help. It is not clear which of these techniques Comcast will use, but a reasonable guess is “all of the above.”


Thursday, January 30, 2025

DeepSeek was a Wakeup Call, But Hardly Unusual

Whatever the ultimate resolution of the claimed DeepSeek training and inference cost, ways of cutting inference and training costs already were happening, and always were expected. DeepSeek has been an unexpected element of the timetable, to be sure (assuming the cost advantages prove to be sustainable). 


| Year | Training Cost (Per Billion Parameters) | Inference Cost (Per 1M Tokens) | Key Cost Drivers |
| --- | --- | --- | --- |
| 2020 | $10M - $20M | $1.00 - $2.00 | Expensive GPUs, early transformer models |
| 2022 | $2M - $5M | $0.20 - $0.50 | Hardware efficiency (A100, TPUv4), model optimizations (Mixture of Experts) |
| 2024 | $500K - $2M | $0.05 - $0.20 | Advanced chips (H100, TPUv5), quantization, distillation |
| 2026* | $100K - $500K | $0.01 - $0.05 | Custom silicon (ASICs), edge inference, sparsity techniques |
| 2028* | <$100K | <$0.01 | Breakthroughs in model efficiency, neuromorphic computing |

*Projected


Virtually all computing technologies show such cost declines with time. 

source: Seeking Alpha 


The trend already has been seen for supercomputer cost per cycle, for example. 

| Year | Supercomputer | Cost per FLOP ($/FLOP) | Peak Performance (FLOPS) |
| --- | --- | --- | --- |
| 1960s | IBM 7030 ("Stretch") | ~$1 | ~100 MFLOPS |
| 1980s | Cray-1 | ~$0.10 | ~100 MFLOPS |
| 1997 | ASCI Red | ~$0.001 | ~1 TFLOP |
| 2008 | Roadrunner | ~$0.0001 | ~1 PFLOP |
| 2018 | Summit | ~$0.00001 | ~200 PFLOPS |
| 2022 | Frontier | ~$0.000001 | ~1.2 EFLOPS |
| 2026* | TBD (Projected) | <$0.0000001 | ~10 EFLOPS |


Will Generative AI Leadership Follow the Historic Computing Pattern?

Leadership in the generative artificial intelligence market is far from settled, and history suggests that early leaders do not always emerge as the mature market’s leaders. Still, investors, users and analysts do pay some attention to market share.


Suppliers know that valuation, to say nothing of survival, hinges on market share performance. Investor bets likewise ride on such outcomes. For enterprises, the bigger issue is “betting on the right horse,” or using eventual “industry standard” products. 


source: Financial Times 


DeepSeek jitters aside, GenAI market share is in flux, at least where it comes to enterprise users. OpenAI started with the lead, and retains that lead, but others are emerging. 


But the history of computing innovations suggests that early leaders often do not emerge as the eventual market leaders. 


| Computing Area | Early Leader(s) | Eventual Market Leader(s) |
| --- | --- | --- |
| Personal Computers | MITS (Altair 8800), Tandy (TRS-80) | IBM, then Microsoft & Apple |
| Operating Systems | CP/M (Digital Research) | Microsoft (MS-DOS, Windows) |
| Search Engines | AltaVista, Yahoo!, Lycos | Google |
| Social Media | Friendster, MySpace | Facebook (Meta), Instagram |
| Smartphones | BlackBerry, Nokia | Apple (iPhone), Samsung |
| Online Video | RealNetworks, Metacafe | YouTube |
| Web Browsers | Netscape Navigator | Google Chrome, Mozilla Firefox |
| AI Assistants | Siri (Apple), Cortana (Microsoft) | Google Assistant, ChatGPT |
| Cloud Computing | Sun Microsystems, IBM (early data centers) | Amazon Web Services (AWS), Microsoft Azure |


On the other hand, at least so far, large language models have proven to be exceedingly capital intensive, so all the present share leaders are firms with huge investments and hyperscale app provider sponsorship or ownership. 


That has not generally been the case since the advent of the personal computing era, the internet and cloud computing, where small firms have been able to innovate and succeed without such large capex requirements. 


But high-performance computing has, so far, been an expensive endeavor. A reasonable person might still forecast that the eventual market will be led by a few firms, as is the pattern in virtually all industries that are capital intensive or with scale requirements. 


The issue is whether capex requirements and scale will confer unique and ultimately determinative advantages for the early hyperscaler-backed contenders.


Wednesday, January 29, 2025

DeepSeek Threats and Advantages for Other Models

DeepSeek, the new open source Large Language Model, is challenging conventional wisdom about what it costs to train an LLM and draw inferences from such models, even if there is some debate about the actual cost savings. 


By some estimates, DeepSeek creates models that are 20 to 40 times cheaper (when generating inferences) than competitors like OpenAI, although DeepSeek also claims its costs to train models are lower by about as much. 


The truth might be somewhere between the extremes of huge cost advantages in training and inference and parity with existing models on those scores. 


And, as an open source model, DeepSeek’s work can be used by others. Indeed, Meta software engineers already are said to be looking for ways to incorporate DeepSeek methods into Meta’s own open source Llama models. 


DeepSeek has not provided detailed disclosures on hardware utilization, power efficiency, or software optimizations that would justify significant cost reductions compared to leading AI labs like OpenAI, Google DeepMind, or Anthropic, though it claims an advantage of as much as 20 times over those models. 


Some observers also wonder whether the claimed non-use of advanced Nvidia graphical processing units is substantially true. Some work might have used such advanced GPUs, though the final training run might not have done so. 


Some might suspect “borrowing” of intellectual property as well, which could explain some of the cost advantages. Microsoft and OpenAI also believe that has happened.  


DeepSeek says it uses a Mixture of Experts (MoE) architecture, which allows the model to activate only a small portion of its parameters for any given task. But many existing GenAI models also use MoE. 


By selectively activating only the necessary "experts" within the model, DeepSeek reduces the computational resources required for both training and inference. Since some existing GenAI models (Gemini, for example) also use MoE, perhaps some of the claimed gains lie there, though in principle other models could tweak their approaches as well. 
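The MoE mechanism can be sketched in a few lines: a router scores the experts, only the top-k run, and their outputs are blended by renormalized router weights. This is a toy illustration of the general technique, not DeepSeek’s implementation; the expert functions and router scores are invented:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, router_scores, top_k=2):
    """Toy Mixture-of-Experts step: run only the top-k scoring experts
    and blend their outputs by renormalized router weights. Skipping the
    other experts is where the compute savings come from."""
    ranked = sorted(range(len(experts)), key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_scores[i] for i in chosen])
    return sum(w * experts[i](x) for w, i in zip(weights, chosen))

# Four toy "experts" (simple scalar functions stand in for sub-networks).
experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x, lambda x: x * x]
out = moe_forward(3.0, experts, router_scores=[0.1, 2.0, 0.5, 1.5], top_k=2)
```

With top_k=2 of four experts, only half the expert parameters are touched per input; DeepSeek reportedly activates 37 billion of 671 billion parameters per task, a much more aggressive ratio.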


An optimized training process might be more important. DeepSeek says it has developed efficient training methods that allow training using fewer computational resources and in less time. 


All that challenges existing conventional wisdom about how much AI capex and opex (electricity, for example) will be required to fully use AI “everywhere” in an economy. And since much investment in the AI ecosystem has been based on those assumed costs, DeepSeek is causing consternation in many circles about whether investments were an instance of “buy high, sell low.” 


Much of the immediate commentary along those lines has seemingly assumed that lower computation costs (training and inference) will translate somewhat directly into value creation (so-called “AI leadership”). 


Perhaps that is overreaching. One tends to hear the same thing about investment levels in other information technologies, as though the outcomes are directly related to the magnitude of investments, and that is not often true. 


For example, even without clear causal relationships, policymakers always assume that investing in better broadband, or coding skills, or the latest information technology, necessarily drives higher productivity. 


That might be partly true, but only in the context of other variables that arguably also contribute to higher productivity, including the sum total of all other human and institutional capital already built up. If IT infrastructure alone were able to drive productivity, there would not continue to be large gaps between economic output leaders and laggards. 


As we have seen time and again, mere increase in inputs (IT investment) does not drive productivity and creativity outputs in a linear way. So though lower cost LLM technology will be helpful, it does not necessarily represent an immediate strategic shift in creativity or productivity.


It might devalue already-made capital investments to some extent, but that also remains to be seen. That might damage some investors, to be sure. 


But DeepSeek might mostly be an intensification of the expected cost reduction cycle that all computing technologies undergo. 


Still, there is both concern and hope in different quarters. DeepSeek, it is said, has significantly reduced model building costs through several innovative approaches:

  1. Mixture of Experts (MoE) architecture: DeepSeek uses an MoE system that activates only 37 billion of its 671 billion parameters for any given task, dramatically reducing computational costs. Alphabet’s Gemini uses MoE as well. 

  2. Reinforcement Learning (RL): Instead of relying on supervised fine-tuning, DeepSeek applied pure RL to its base model, allowing the AI to self-discover chain-of-thought reasoning through trial-and-error.

  3. Group relative policy optimization (GRPO): This RL algorithm is built directly into the main model, eliminating the need for a separate model for reinforcement learning and further cutting down training costs.

  4. Efficient hardware: DeepSeek trained its models on less powerful, cheaper chips (Nvidia H800 GPUs), demonstrating the ability to achieve high performance with modest hardware.

  5. Distillation techniques: DeepSeek used strategies like generating its own training data, which requires more compute but can lead to more efficient models.


Keep in mind that DeepSeek is open source, so other models are free to use or modify parts of the system for their own uses.

AI Assistant Revenue Upside Mostly Will be Measured Indirectly

Amazon expects Rufus, its AI shopping assistant, to indirectly contribute over $700 million in operating profits this year, Business Intel...