Skip to content

Statistics of Coin-Toss Patterns

April 3, 2008

Yesterday, I watched Peter Donnelly’s TED presentation on statistical mind-benders. Among other things (statistician jokes!), Peter observes that humans don’t have good intuition for some kinds of statistical thinking. In the presentation, Donelly posses a coin toss problem to demonstrate his point.  He chooses one that is easy to get wrong.

Consider a series of fair coin tosses. For example, one possible sequence of coin tosses is THTTHHTHTTH. How many tosses are required to get a particular pattern? How does this depend on the length of the pattern?

Peter poses a concrete question as follows.  Consider the pattern HTH. If we do the experiment of tossing a coin repeatedly and counting the number of tosses, we find that the first occurrence of HTH arises in some average number of coin tosses n_{HTH}. For a different pattern, say TTH, we can repeat the experiment and find that the first occurrence of this pattern arises in some average number of coin tosses n_{TTH}.

One of the following statements must be true:

(a) n_{HTH} = n_{TTH}
(b) n_{HTH} \gt n_{TTH}
(c) n_{HTH} \lt n_{TTH}

Which statement is correct?

My reflex reaction was (a). The heuristic leading to this conclusion is that the probability of getting TTH is the same as the probability of getting HTH in any 3 coin tosses, i.e.,


On the other hand, if the pattern was HHH the probability of getting this pattern on any three coin tosses is the same. But intuitively, I expect to have to make more coin tosses on average to get this pattern.

It is difficult to get to the correct answer with this kind of reasoning. 

A little counter intuitively, (b) is the answer. It takes more tosses on average to get the pattern HTH than the pattern TTH. Peter spends some time arguing that this is plausible–watch his presentation to get those arguments. Below, I pursue calculating this for myself…

Remember that the average or expected value of n is calculated by,


The probability of any single coin toss sequence is \frac{1}{2^n} where n is the length of the sequence.

An approach to calculating P(n) is to count the number of permutations of coin-toss sequences of length 0 to n that contain the target pattern only in the last position. Each of these sequences has the length-dependent probability above, so that


To take into account that some target patterns can overlap themselves, I force the last (l-1) places in the sequence to match the first (l-1) places of the pattern.  The chance of getting the pattern on the next toss is1/2.  So multiply by 1/2 to get the formula above.

The table below shows the function N(n-1) for the two patterns HTH and TTH. It is easy to see that for longer sequences, there are many more possible sequences not containing HTH than HHT. Since the mean is calculated as a sum of N(n-1) multiplied by the probability which only depends on the length of the sequence, HTH is going to have the greater expected n.

You can use a pad a paper to count sequences, but this gets out of hand quickly. I wrote a Python script to help keep things sorted out. This script makes all the calculations shown for the rest of this entry and can be downloaded here.


Tosses Coin Toss Sequence
Permutations W/O
Pattern (TTH)
Coin Toss Sequence
Permutations W/O
Pattern (HTH)
4 2 2
5 4 3
6 7 5
7 12 9
8 20 16
9 33 28
10 54 49
11 88 86
12 143 151
13 232 265
14 376 465
15 609 816
16 986 1432
17 1596 2513
18 2583 4410
19 4180 7739
20 6764 13581
21 10945 23833
22 17710 41824
23 28656 73396
24 46367 128801
25 75024 226030


It is easy to use these counts and the formula above to calculate the expected value of n for short patterns. Here is an example for a pattern of length 2. This calculation applies to either TH or HT and shows that the expected value of n is 4.00.

Coin Toss Sequences
without Pattern (TH or HT)
Average Coin Tosses
(successive approximations)
3 2 1.250
4 3 2.000
5 4 2.625
6 5 3.094
7 6 3.422
8 7 3.641
9 8 3.781
10 9 3.869
11 10 3.923
12 11 3.955
13 12 3.974
14 13 3.985
15 14 3.992
16 15 3.995
17 16 3.997
18 17 3.999
19 18 3.999
20 19 4.000

This series converges slowly and the numbre of permutations becomes very large for longer patterns. So, although this brute-force summing is straight forward, it runs up against practical limits surprisingly quickly.

Another approach to calculated the average n for a pattern is to use a Monte Carlo simulation of the coin toss events. The Python script above also performs this calculation. The table below shows the results for patterns up to length 4.

Pattern (Est.) E
Patterns of length 1
T 2.00
H 1.98
Patterns of length 2
TT 5.99
HH 6.00
HT 4.02
TH 4.01
Patterns of length 3
TTT 13.95
HHH 14.07
HTT 8.03
THH 8.02
THT 9.95
HTH 10.04
HHT 8.00
TTH 8.06
Patterns of length 4
TTTT 30.07
HHHH 30.15
HTTT 16.10
THHH 16.07
THTT 17.96
HTHH 17.98
HHTT 15.92
TTHH 16.16
TTHT 18.09
HHTH 18.00
HTHT 20.14
THTH 19.94
THHT 18.12
HTTH 18.20
HHHT 16.13
TTTH 16.0

This table is useful to compare the different results for various patterns of the same length.

Finally, the Monte Carlo method gives an estimate of the distribution of n for the patterns. (The numbers above are based on 20,000 sequences.) The histogram below shows the Distribution of Sequence length for the pattern TTTH.  The average sequence length is estimated to be 16 (from the table above). Because the distribution is asymmetric and has a very long tail to the right, the average doesn’t appear near any interesting features (the peak, for example) of the distribution.

TTTH Distribution (20000 Trials) 


Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: