
Friday, December 4, 2009

No Bandwidth Hogs?

Some would argue there is no "exaflood" and no such thing as a "bandwidth hog." 

I have no more detailed data from any Internet service provider than anybody else does, so I doubt anybody can prove or disprove the thesis definitively. But I also have no reason to think the usage curve will be anything other than a Pareto distribution, since so many common phenomena in the physical and business worlds follow one.

Vilfredo Pareto, an Italian economist, was studying the distribution of wealth in 1906. What he found was a distribution most people would commonly understand as the "80/20 rule," where a disproportionate share of results comes from 20 percent of actions. The Pareto distribution has been found widely in the physical and human worlds. It applies, for example, to the sizes of human settlements (few cities, many hamlets and villages). It fits the sizes of files carried in Internet traffic (many smaller files, few larger ones).

It describes the distribution of oil reserves (a few large fields, many small fields) and of jobs assigned to supercomputers (a few large ones, many small ones). It describes the price returns on individual stocks. It likely holds for total returns from stock investments over a span of several years, as observers often point out that most of the gain, and most of the loss, in a typical portfolio comes from changes on just a few days a year.

The Pareto distribution is what one finds when examining the sizes of sand particles and meteorites, the number of species per genus, the areas burnt in forest fires, and casualty losses in general liability, commercial auto and workers' compensation insurance.

The Pareto distribution also fits sales of music from online music stores and mass market retailer market share. The viewership of a single video over time fits the Pareto curve. Pareto describes the distribution of social networking sites. It describes the readership of books and the lifecycle value of telecom customers.

So knowing nothing other than that the Pareto distribution is so widely represented in the physical world and in business, I would expect to see the same sort of distribution in bandwidth consumption. As applied to users of bandwidth, Pareto would predict that a small number of users do in fact consume a disproportionate share of bandwidth.

I certainly can't say for sure, but would be highly surprised if in fact a Pareto distribution does not precisely describe bandwidth consumption.
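
As a rough illustration of what a Pareto-shaped usage curve would imply, the sketch below uses the standard closed-form result for a Pareto distribution with shape parameter alpha greater than 1: the top fraction p of users accounts for a share p^(1 - 1/alpha) of the total. The function name and the shape value (the one that reproduces an exact 80/20 split) are assumptions chosen for illustration, not figures measured on any network.

```python
import math


def top_share(p, alpha):
    """Share of the total held by the top fraction p of users under a
    Pareto distribution with shape alpha (requires alpha > 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 so the mean is finite")
    return p ** (1 - 1 / alpha)


# Assumed shape: the value that yields an exact 80/20 split.
ALPHA_80_20 = math.log(5) / math.log(4)

if __name__ == "__main__":
    for p in (0.20, 0.05, 0.01):
        share = top_share(p, ALPHA_80_20)
        print(f"top {p:.0%} of users -> {share:.0%} of usage")
```

A different shape value would give a flatter or steeper curve, so the sketch only shows how concentration follows from the distribution, not how concentrated any real network is.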

Saturday, December 3, 2011

Do Heavy Users Cause Congestion; Do Caps Work?

Capacity impact of peak-hour caps
A study of user behavior on one North American mid-tier Internet service provider’s network attempts to answer the question of whether the heaviest users really are responsible for peak-hour congestion, and whether data caps actually do much to manage peak-hour congestion.

It is an unquestioned fact that a small percentage of broadband users, on virtually any network, use vastly more data than typical users do. The top one percent of data consumers account for 20 percent of overall consumption, for example, a fact the study by analyst Benoît Felten confirms.

But overall consumption is not the chief issue on any network. Rather, it is peak-hour usage that is the gating factor when dimensioning network capacity.

It is not the overall monthly data traffic volume generated by a subscriber, but when and where it is generated, that is crucial, argues Monica Paolini of Senza Fili Consulting. Operators would be better off with higher traffic volumes, as long as that traffic does not occur during peak hours.

Let’s imagine that wireless subscribers have a plan with caps that apply to peak times only and unlimited access at other times, she says. During off-peak times, a green dot appears on the smartphone, and subscribers know that their data usage at those times does not count against their data allowance. In this case, we would expect higher overall data consumption, but spread more evenly through the day.

The chart shows what should happen in this scenario, with different rates of peak/off-peak substitution, assuming that the increase in non-peak traffic will be four times as large as the decrease in peak traffic.
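
The arithmetic behind such a scenario can be sketched as follows. The four-times uplift and the 1 GByte monthly base come from the text; the peak share of traffic, the substitution rates and the function name are hypothetical values picked only to make the example concrete.

```python
def shift_scenario(base_gb, peak_share, substitution, uplift=4.0):
    """Return (new_peak_gb, new_off_peak_gb, new_total_gb) after a fraction
    `substitution` of peak traffic leaves the peak and off-peak traffic
    grows by `uplift` times the amount removed from the peak."""
    peak = base_gb * peak_share
    off_peak = base_gb - peak
    removed = peak * substitution            # traffic that leaves the peak
    new_peak = peak - removed
    new_off_peak = off_peak + uplift * removed
    return new_peak, new_off_peak, new_peak + new_off_peak


if __name__ == "__main__":
    BASE_GB = 1.0       # monthly base case from the text
    PEAK_SHARE = 0.4    # hypothetical: share of monthly traffic at peak
    for rate in (0.0, 0.1, 0.25, 0.5):      # hypothetical substitution rates
        peak, off_peak, total = shift_scenario(BASE_GB, PEAK_SHARE, rate)
        print(f"substitution {rate:.0%}: peak {peak:.2f} GB, "
              f"off-peak {off_peak:.2f} GB, total {total:.2f} GB")
```

Under these assumptions total consumption rises while peak traffic falls, which is the outcome the scenario describes.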

The scenario starts from a base case of usage of 1 GByte per month (current average usage is around 500 MBytes and, at current traffic growth rates, average traffic per month will probably hit the 1 GByte mark in 2012).

Peak hour congestion is the issue

The first step, though, is to better understand actual usage, whether by typical users or power users. Analyst Benoît Felten, for example, has wondered for some time about the extent to which power users create out-sized stress on access networks.

Recently, Felten’s firm was able to analyze data from a mid-sized North American ISP to test his hypothesis that “data hogs do not cause unusual congestion” at peak hours, even though some users consume vastly more data than a typical user.

Data caps unnecessary?

The study, Felten says, shows that data consumption, overall, is at best a “poor” proxy for bandwidth usage, despite the clear pattern that “heavy” users consume vastly more data than typical users.

Where average daily data consumption over the period was 290 MBytes, the “very heavy” consumers used 9.6 GBytes a day. This roughly equates to data consumption of 8.7 GBytes and 288 GBytes per month, respectively. So the heaviest users consume about 33 times as much data as the average user, well over an order of magnitude more.
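
The daily-to-monthly conversion can be checked with a short sketch. It assumes a 30-day month and decimal units (1 GByte = 1,000 MBytes); neither assumption is stated in the study summary, but both reproduce the monthly figures quoted above.

```python
DAYS_PER_MONTH = 30   # assumption: 30-day billing month
MB_PER_GB = 1000      # assumption: decimal units

avg_daily_mb = 290    # average daily consumption, from the text
heavy_daily_gb = 9.6  # "very heavy" daily consumption, from the text

avg_monthly_gb = avg_daily_mb * DAYS_PER_MONTH / MB_PER_GB
heavy_monthly_gb = heavy_daily_gb * DAYS_PER_MONTH
ratio = heavy_monthly_gb / avg_monthly_gb

print(f"average user:    {avg_monthly_gb:.1f} GB per month")
print(f"very heavy user: {heavy_monthly_gb:.1f} GB per month")
print(f"ratio:           {ratio:.1f}x")
```

The ratio works out to roughly 33 to 1.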

But Felten argues that bandwidth usage outside of periods when the aggregation link is heavily loaded (which he defines as 75 percent load) has no impact on costs or other users.

Felten is right to focus on peak-hour congestion.
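
A minimal sketch of how the 75 percent threshold might be applied to link measurements follows. The hourly utilization samples are invented for illustration, and treating "at or above 75 percent" as loaded is an assumption, since the text does not say whether the boundary is inclusive.

```python
LOAD_THRESHOLD = 0.75  # "heavily loaded" level cited in the text


def loaded_periods(utilization):
    """Return the indexes of samples at or above the load threshold
    (inclusive comparison is an assumption)."""
    return [i for i, u in enumerate(utilization) if u >= LOAD_THRESHOLD]


if __name__ == "__main__":
    # Hypothetical hourly utilization of an aggregation link, hours 0-23.
    hourly = [0.20, 0.15, 0.10, 0.10, 0.10, 0.15, 0.25, 0.35,
              0.40, 0.45, 0.45, 0.50, 0.50, 0.50, 0.55, 0.60,
              0.65, 0.70, 0.78, 0.85, 0.90, 0.82, 0.60, 0.35]
    print("loaded hours:", loaded_periods(hourly))
```

Only usage that falls inside the flagged periods would count toward congestion under Felten's definition.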

“The results show that while the number of active users does not vary significantly between 8 AM and 1 AM, the average bandwidth usage does vary significantly, especially around late afternoon and evening,” he says.

This suggests that the increase in network load is not a result of more customers connecting at a given time, but a result of customers using their connections more intensively during these hours.

The study does confirm that a small percentage of users dominate peak-hour usage. About six percent of all customers (and 7.5 percent of active customers) are among the top one percent of bandwidth users at one point or another during peak hours. The twist is that Felten’s analysis also suggests that 80 percent of peak load is generated by the heaviest users over a billing cycle.

In other words, perhaps 20 percent of peak-hour demand is created by “typical” users, not the “bandwidth hogs.”

Oddly enough, though Felten and some other observers might say this “confirms” the thesis that heavy users do not cause peak-hour congestion, the data seem to contradict the theory.

In fact, some might argue Felten’s analysis mostly confirms the theory that the heaviest users over a billing cycle are the heaviest contributors to peak-hour congestion as well. Felten, for his part, argues that “the correlation between real-time bandwidth usage and data downloaded over time is weak.”

A reasonable argument can be made, though, that data caps don’t seem to address peak-hour congestion.

But many, including perhaps most economists, would argue that bandwidth consumption is a good like any other, susceptible to price and other policies that can shift demand.

Of course, many policy advocates would not be in favor of such mechanisms, which obviously could include peak-hour pricing that is higher than off-peak pricing.

Users might not prefer such approaches, either, as pricing predictability is a major plus for most users. Many, if not most, access providers might also prefer not to incur the overhead of billing on a differential basis.

But many would note the potential of value-based pricing that can incorporate quality metrics, time-of-day priorities, off-peak pricing or other ways to create incentives for off-peak use and premium pricing for peak-hour use or peak-hour quality.

So are heavy users the problem? Felten seems to argue they are not. The data might suggest they are. The data do show there are indeed heavy users during peak hours, and 80 percent of them are the same people who use the most data over a billing cycle.

The issue might be viewed as determining whether “heavy users, at peak hours” are the same people as “heavy users, over a billing period.” That appears to be the case about 80 percent of the time.
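
One way to frame that comparison is sketched below: rank users by peak-hour usage and by billing-cycle usage, take the top slice of each, and measure how many peak-hour heavy users also appear in the billing-cycle group. The user names, usage figures and the 20 percent cutoff are all hypothetical, and nothing here reflects how Felten's study actually computed its figures.

```python
def top_users(usage, fraction):
    """Return the set of users in the top `fraction` by usage (at least one)."""
    k = max(1, round(len(usage) * fraction))
    ranked = sorted(usage, key=usage.get, reverse=True)
    return set(ranked[:k])


def overlap_rate(peak_usage, cycle_usage, fraction):
    """Fraction of peak-hour heavy users who are also heavy users
    over the billing cycle."""
    peak_heavy = top_users(peak_usage, fraction)
    cycle_heavy = top_users(cycle_usage, fraction)
    return len(peak_heavy & cycle_heavy) / len(peak_heavy)


if __name__ == "__main__":
    # Hypothetical per-user totals in GBytes.
    peak = {"u1": 9.0, "u2": 8.0, "u3": 1.0, "u4": 0.8, "u5": 0.5,
            "u6": 0.4, "u7": 0.3, "u8": 0.2, "u9": 0.1, "u10": 0.1}
    cycle = {"u1": 300.0, "u2": 40.0, "u3": 250.0, "u4": 12.0, "u5": 9.0,
             "u6": 8.0, "u7": 7.0, "u8": 5.0, "u9": 3.0, "u10": 2.0}
    print(f"overlap: {overlap_rate(peak, cycle, 0.20):.0%}")
```

An overlap near 100 percent would mean the two groups are essentially the same people; a low overlap would support the view that billing-cycle volume is a poor proxy for peak-hour load.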

Felten argues that bandwidth caps do not necessarily alleviate congestion problems, and he is right about that. Do data hogs cause congestion? If not, then it makes more sense to use other pricing and value mechanisms to shape demand, one might argue. The question might then be whether other schemes that would work are acceptable to end users and service providers.
