Thursday, May 31, 2012

How to Model Broadband Consumption With Few Data Points

“Nearly all communications traffic, including Internet traffic, can be approximated with high accuracy by the log-normal distribution,” says Phoenix Center Chief Economist Dr. George S. Ford. That’s important, as it means we generally can predict overall end user behavior when we actually know only a couple of key data points.

Among the practical implications are estimates of what is likely to happen when a broadband service provider imposes a monthly usage cap of 250 gigabytes. The log-normal distribution suggests how many customers would hit the limit.

The log-normal distribution also generally allows some estimation of how consumption will vary across the entire customer base, knowing only the consumption of the top one percent, and the consumption of the top 10 percent of users, an analysis by Dr. Ford suggests.

The point is that “averages” (the arithmetic mean) don’t tell an observer very much when any service has an asymmetric distribution, as always seems to be the case for Internet consumption by consumers.

Cisco’s Visual Networking Index reports that the top one percent of users accounted for more than 20 percent of Internet traffic and that the top 10 percent of users accounted for 60 percent of traffic.

That suggests a Pareto-style pattern, in which roughly 20 percent of instances account for 80 percent of the impact, would also likely hold.
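As a rough check on the Cisco figures, one can ask what log-normal shape each reported share implies. Under a log-normal distribution with shape parameter sigma, the heaviest fraction q of users carries a share Phi(sigma - z) of total traffic, where Phi is the standard normal CDF and z is the standard normal quantile at 1 - q. The sketch below is an illustration only, not Ford's own calculation: the function names are hypothetical, and the "more than 20 percent" figure is treated as exactly 20 percent.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def sigma_from_top_share(top_fraction, traffic_share):
    """Log-normal shape implied by 'the top X of users carry Y of traffic'."""
    z = STD_NORMAL.inv_cdf(1.0 - top_fraction)
    return z + STD_NORMAL.inv_cdf(traffic_share)


def top_share(top_fraction, sigma):
    """Share of total traffic carried by the heaviest `top_fraction` of users."""
    z = STD_NORMAL.inv_cdf(1.0 - top_fraction)
    return STD_NORMAL.cdf(sigma - z)


# Cisco figures as quoted above (assumption: 20 percent taken as exact).
sigma_top1 = sigma_from_top_share(0.01, 0.20)
sigma_top10 = sigma_from_top_share(0.10, 0.60)
print(sigma_top1, sigma_top10)

# Share of traffic the top 20 percent would carry under each implied shape.
print(top_share(0.20, sigma_top1), top_share(0.20, sigma_top10))
```

Both Cisco data points imply a shape parameter of roughly 1.5, which is some evidence that a single log-normal curve fits both. With that shape, the top 20 percent of users would carry roughly three quarters of traffic, which is close to the 80/20 rule of thumb but a little short of it.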

Ford notes that Comcast’s 250 GByte per month usage cap on its residential broadband customers, taken with Comcast’s own statements that 99 percent of its residential customers will not approach that cap, suggests that only one percent of Comcast’s residential users consume 250 GBytes per month or more.

Comcast also indicated that its median customer consumes about 8 GBytes to 10 GBytes per month.
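Those two Comcast figures are enough to pin down a log-normal curve, since the distribution has only two parameters. The sketch below shows one way that might be done, under stated assumptions: the median is taken as 9 GBytes (the midpoint of the reported range), the 250 GByte cap is treated as exactly the 99th percentile, and the 100 GByte cap is a hypothetical value chosen for illustration. The function names are illustrative and this is not presented as Ford's own procedure.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def fit_lognormal(median_gb, p99_gb):
    """Return (mu, sigma) of ln(usage) from the median and 99th percentile."""
    mu = math.log(median_gb)  # the median of a log-normal is exp(mu)
    sigma = (math.log(p99_gb) - mu) / STD_NORMAL.inv_cdf(0.99)
    return mu, sigma


def fraction_above(cap_gb, mu, sigma):
    """Fraction of users whose monthly usage exceeds `cap_gb`."""
    return 1.0 - STD_NORMAL.cdf((math.log(cap_gb) - mu) / sigma)


def mean_usage(mu, sigma):
    """Arithmetic mean of a log-normal distribution."""
    return math.exp(mu + sigma ** 2 / 2.0)


# Assumed inputs: 9 GB median, 250 GB at the 99th percentile.
mu, sigma = fit_lognormal(median_gb=9.0, p99_gb=250.0)
print("sigma:", sigma)
print("mean usage (GB):", mean_usage(mu, sigma))
print("fraction above a hypothetical 100 GB cap:", fraction_above(100.0, mu, sigma))
```

Under these assumptions the arithmetic mean comes out at about 25 GBytes, nearly three times the median, which illustrates why an average says little about the typical user of a service with such a skewed distribution.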

The log-normal distribution could well inform many other sorts of policies, such as what amount of consumption a “typical” user requires.

“My approach to approximating usage patterns may be useful for a variety of policy issues,” says Ford. “For example, when addressing universal service for broadband, the level of service that qualifies as ‘broadband’ will have to be parameterized.”

Knowledge of the usage distribution may aid in establishing service level definitions that can be described as “reasonably comparable to those services provided in urban areas,” for example.
