Geometric distribution "extended"

Posted on January 26, 2011 by phimuemue

Everyone knows the geometric distribution, which is used to model "waiting for the first success". Here, we consider another experiment: We have a Bernoulli random variable with success probability \(p\) and \(q=1-p\) and we count the number of Bernoulli trials until we have \(n\) successive successes (note that \(n=1\) is the special case that corresponds to the usual geometric distribution). More precisely, we are interested in the expected number of trials until \(n\) successive successes. First of all, let us note that it does not suffice to combine each \(n\) trials into one and wait until one of these \(n\)-simple-trials-combos succeeds in all \(n\) trials. This is because the \(n\) successive trials may overlap from one of those combos into another.

So let us look how the experiment works: Let's say we have \(n+1\) states (numbered \(n\) down to 0). We start our experiment in state \(n\), from which we proceed with probability \(p\) to state \(n-1\) and we stay with probability \(q=1-p\) in state \(n\). To be more general, if we are in state \(i\), we proceed to state \(i-1\) with probability \(p\) and come back to state \(n\) with probability \(p\):

[][States for the experiment](http://phimuemue.com/wp-content/uploads/2011/01/pr1-4.png)[/caption]

Now we are going to compute the expected number of trials we need until we reach state 0 (if we start at state n). Therefore, we employ the following formula involving conditional expectation:

\(E(x)=\sum_{i=0}^l E(x|A_i) \cdot P(A_i)\)

Then, we continue like follows: If we are in state \(i\) we can in the next step be either in state \(n \) (the initial state) or in state \(i-1 \).

We denote the expected number of moves if we are currently in state \(i \) by \(E_i(x) \) or simply \(E_i\). Remember that \(q=1-p \)

Now we conclude \(E_i = p\cdot(1+E_{i-1})+q\cdot (1+E_n)=1+p\cdot E_{i-1} + q\cdot E_n\), from which we obtain that. So we have a recursive equation. The base case is \(E_0=0 \), since we do not need any additional trials if we have reached state 0 already.

Then, we prove via induction that (for \(i>0 \)) we have \(E_i=1+\sum_{j=1}^{i-1}p^j + q\cdot E_n\cdot\sum_{j=0}^{i-1}p^j \).

The base case \(i=0 \) can be seen immediately. For \(i>1 \) it can be seen the following way:

\(E_1 \stackrel{Def.}{=} 1+p\cdot E_{i-1}+q\cdot E_n = \\ \stackrel{Ind.hyp.}{=} 1+p\cdot(1+\sum_{j=1}^{i-2}p^j + q\cdot E_n\cdot \sum_{j=0}^{i-2}p^j) + q\cdot E_n = \\ = 1+p+\sum_{j=1}^{i-1}p^j+q\cdot E_n\cdot\sum_{j=1}^{i-1}p^j + q E_n =\\ = 1+ \sum_{j=1}^{i-1}p^j + q\cdot E_n\cdot\sum_{j=0}^{i-1}p^j \)

This concludes the induction. We can furthermore simplify the result as follows:

\(E_i=1+\frac{p^i-p}{p-1}+E_n\cdot(1-p^i) \)

Remember that we are especially interested in \(E_n \), the expected number of trials until \(n \) successive successes. Thus, we compute \(E_n=1+\frac{p^n-p}{p-1}+E_n\cdot(1-p^n) \), from which we resolve that \(E_n=\frac{p^n-1}{p^{n+1}-p^n} \).

As mentioned above, the special case \(n=1 \) yields \(E_1=\frac{p-1}{p^2-p}=\frac{1}{p} \), which logically corresponds with the expected number of turns for a simple geometric distribution.

The following figure shows how the expected number of trials evolves for different values of \(p\). You see that the expected number grows exponentially with \(n\). The greater \(p\), the slower the curve grows.

[][Expected values for different success probabilities depending on n](http://phimuemue.com/wp-content/uploads/2011/01/Unbenannt.png)[/caption]