Monday, April 13, 2020

The Use and Misuse of Statistics

The use and misuse of statistics is an ever-recurring issue. Consider only the issue of how to count internet access availability or take rates. The former is a measure of supply, the latter a measure of demand. “Availability” means a consumer can buy a product. “Take rate” means a customer does buy it. People routinely confuse the two concepts.


Ignore sampling errors or limitations for the moment. Ignore the impact of definitions, which change. At one point, 10 Mbps was the top speed on a fiber-to-home network. Today we define “broadband” as a minimum of perhaps 25 Mbps downstream, and the definition will slide higher over time. The point is that we are never comparing apples to apples over time.


In isolated and rural areas, where there often is no business case for supplying such access, service only is available if subsidized, and so the issue of “how fast” is important, but arguably secondary to “availability.” We prefer equivalent grades of service everywhere, but the economics of supply mean that rural availability lags urban, virtually always. 


We also frequently ignore some platforms entirely, as when we measure “fixed network” availability but omit the additional coverage supplied by satellite providers or mobile networks. Sometimes it is not clear whether fixed wireless networks are counted with other fixed networks.


It arguably is one thing if a potential customer cannot buy a product; quite another thing if a potential customer chooses not to buy. The first might be considered a failure of policy; the latter a consumer exercise of choice. 
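
The distinction between "cannot buy" and "chooses not to buy" can be made concrete with a little arithmetic. What follows is only a minimal sketch: the function names and the location counts are hypothetical, chosen for illustration, and are not drawn from any FCC or other dataset.

```python
def availability_rate(locations_served: int, total_locations: int) -> float:
    """Supply side: share of all locations where the service can be bought."""
    return locations_served / total_locations


def take_rate(subscribers: int, locations_served: int) -> float:
    """Demand side: share of served locations that actually buy."""
    return subscribers / locations_served


# HYPOTHETICAL counts, for illustration only.
total_locations = 1000   # all locations in some area
locations_served = 950   # locations where the service is offered
subscribers = 665        # locations that actually buy

availability = availability_rate(locations_served, total_locations)
take = take_rate(subscribers, locations_served)

# Split the non-subscribers into the two very different groups.
cannot_buy = total_locations - locations_served    # possible policy failure
choose_not_to_buy = locations_served - subscribers  # consumer choice

print(f"availability: {availability:.1%}")
print(f"take rate: {take:.1%}")
print(f"cannot buy: {cannot_buy}, choose not to buy: {choose_not_to_buy}")
```

In this made-up example about a third of locations have no subscription, yet only a small fraction of those could not buy the service. Reporting the first number as if it were the second is precisely the confusion at issue.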


One also has to ignore lag times between data collection and publication. Most government data shows what was the case two years ago, not the situation as it stands today. So three years ago, using a minimum standard of 25 Mbps for “high speed,” perhaps 30 percent of “households” did not buy--or perhaps could not buy--the product. 


People often mistake “households” for “people,” as well. This illustration, using 2017 reporting data, says “30 percent of U.S. households don’t have a fixed high-speed internet connection.” 


That is wrong. The Federal Communications Commission figures for 2017 stated that 21.3 million people, not households, did not have access at that speed. Treating a count of people as a count of households overstates the degree of “lack of access” by more than 100 percent, as the typical number of people in a U.S. household is greater than two.
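
A rough sketch of the people-versus-households arithmetic follows. The 21.3 million figure is the FCC number cited above; the average household size of 2.5 persons is my own assumption, used only for illustration, and the function name is hypothetical.

```python
def people_to_households(people: float, persons_per_household: float) -> float:
    """Convert a count of people into an approximate count of households."""
    return people / persons_per_household


people_without_access = 21.3e6  # FCC 2017 figure cited in the text (people)
avg_household_size = 2.5        # ASSUMPTION: illustrative average, not from the source

households_without_access = people_to_households(
    people_without_access, avg_household_size
)

# If the people count is misread as a household count, the error is the
# ratio of the misread figure to the converted figure, minus one.
overstatement = people_without_access / households_without_access - 1

print(f"households without access: {households_without_access / 1e6:.2f} million")
print(f"overstatement from misreading people as households: {overstatement:.0%}")
```

Under that assumed household size, the overstatement from the unit mix-up alone works out to 150 percent, consistent with "more than 100 percent." Any household size above two gives an error above 100 percent.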


source: Karma


This is a common error one sees in reports about the size of the digital divide. If one adds two satellite providers to the mix, there is almost no place within the continental landmass not already served by at least two networks selling 25 Mbps service, whatever the limitations of fixed networks in some locations. 


Of course, our goals always are aspirational. Most urban consumers consider 25 Mbps a problem. In my own household, anything less than 50 Mbps triggers the registration of a service issue report and an immediate reboot of the router. As a practical matter, even speeds below 100 Mbps might trigger a reboot. 


The point is that availability--the ability to buy internet access--is not generally a problem. I know people who live in isolated mountainous areas where neither fixed line service nor mobile service is available. They use their mobiles only when “in town.” But those people also choose not to buy satellite internet access. They could buy it; they simply choose not to do so. 


Speed and cost are issues, to be sure. Rare is the wireless platform that will match a hybrid fiber coax network or a fiber to home network in terms of speed or cost per bit. 


The point is not the definitions we use--as those change over time, and should change--so much as the misuse of terms. Availability is one matter; take rates another. People are one matter; households or locations another. 


One frequently sees and hears figures that confuse those concepts, with real implications for the meaning of the data. In the end, we care about take rates. Availability is a measure of our ability to support take rates. But there are grey areas. 


We want reasonable quality services and reasonable prices. That always is hard to do in rural areas. But even in urban areas, when quality and price are not issues, some customers still choose not to buy some services. They might prefer a mobile-only approach to buying fixed access, for example. 


Assessing trends in the real world is hard enough. It never helps when we misapply statistics (unintentionally, perhaps) to make a point.

