
Is the distribution of the digits 0-9 uniform?

May 8, 2007

Another way of stating this question is "Are all digits equally likely?"

It turns out, no.  For large sets of numbers resulting from measurements of nearly anything, the lower digits are more common.  In fact, they tend to follow a power law (see below).
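
The post does not give a formula for this pattern. One well-known description is Benford's law, which says the leading digit d of a number occurs with probability log10(1 + 1/d). That this is the law the author has in mind is an assumption, and note that it applies to leading digits only, not to every digit in a file. A minimal sketch of the expected shares (the function name is made up for illustration):

```python
import math

def benford_first_digit(d):
    """Expected share of leading digit d (1-9) under Benford's law."""
    return math.log10(1 + 1 / d)

# Expected share for each possible leading digit; 0 cannot lead a number.
expected = {d: benford_first_digit(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.3f}")
```

Under this formula a leading 1 is expected about 30% of the time and a leading 9 under 5%, so the shares fall off steadily rather than sitting at a uniform 1 in 9.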

But saying so doesn’t make it so. How about some examples?

To get some quick results, I wrote a Python script to count digits.  The core counting routine is shown below (download .py, PyX required for making plots).

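import re

# Read the whole input file; options.filename comes from the script's option parsing.
with open(options.filename, 'r') as inf:
    buf = inf.readlines()

# Match every individual digit character.
nre = re.compile('[0-9]')

# One counter per digit, 0 through 9.
hist = {0:0,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0}

for line in buf:
    nlist = nre.findall(line)
    for n in nlist:
        hist[int(n)] += 1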
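
To try the same idea without the option parsing or the plotting, here is a self-contained sketch. The function names `count_digits` and `digit_shares` and the sample lines are made up for illustration and are not part of the original script. It counts every digit in a list of strings, then turns the counts into shares so they can be compared with the 10% per digit a uniform distribution would give.

```python
import re
from collections import Counter

DIGIT_RE = re.compile(r'[0-9]')

def count_digits(lines):
    """Return a dict mapping each digit 0-9 to how often it appears in lines."""
    counts = Counter()
    for line in lines:
        counts.update(int(d) for d in DIGIT_RE.findall(line))
    # Include digits that never appeared, with a count of zero.
    return {d: counts[d] for d in range(10)}

def digit_shares(hist):
    """Convert raw counts to fractions of the total (all zeros if no digits)."""
    total = sum(hist.values())
    return {d: (hist[d] / total if total else 0.0) for d in hist}

# Made-up sample text, only to exercise the functions.
sample = ["Rain: 13 mm on day 3", "Population 1,203"]
hist = count_digits(sample)
shares = digit_shares(hist)

for d in range(10):
    print(f"{d}: {hist[d]:3d}  {shares[d]:.1%}")
```

Like the routine above, this counts every digit in the text, not just the first digit of each number, so it will also pick up digits from dates, labels, and identifiers.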


Next, find some data.  I started close to home with a monthly report of online and financial performance data for my employer.  The histogram for one month of this data is shown in Figure 1 below.


April Performance Report Histogram



Figure 1. Distribution of digits 0-9 in monthly performance data for AdPay, Inc.


For a quick comparison, let’s find some data on the Web for rainfall and population statistics.

Rainfall Histogram

Figure 2.  Distribution of digits 0-9 in rainfall data. Is the digit ‘3’ unusually common in rainfall data?


Population Distribution Histogram

Figure 3.  Distribution of digits 0-9 in population data from a combination of several countries.


For more information:
