# Geometric distribution "extended"

Posted on January 26, 2011 by phimuemue

Everyone knows the geometric distribution, which models "waiting for the first success". Here, we consider another experiment: we have a Bernoulli random variable with success probability $$p$$ (and $$q=1-p$$), and we count the number of Bernoulli trials until we see $$n$$ successive successes (note that $$n=1$$ is the special case corresponding to the usual geometric distribution). More precisely, we are interested in the expected number of trials until $$n$$ successive successes occur. First of all, note that it does not suffice to group the trials into blocks of $$n$$ and wait until one such block consists entirely of successes: a run of $$n$$ successive successes may straddle the boundary between two blocks.

So let us look at how the experiment works: say we have $$n+1$$ states (numbered $$n$$ down to 0). We start our experiment in state $$n$$, from which we proceed with probability $$p$$ to state $$n-1$$ and stay with probability $$q=1-p$$ in state $$n$$. More generally, if we are in state $$i$$, we proceed to state $$i-1$$ with probability $$p$$ and fall back to state $$n$$ with probability $$q$$:
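Before doing any math, this state machine can be simulated directly. The following sketch (the function name `simulate_trials` and the Monte Carlo setup are ours, purely for illustration) estimates the expected number of trials for given $$p$$ and $$n$$:

```python
import random

def simulate_trials(p, n, rng=random.Random(0)):
    """Run Bernoulli trials until n successive successes occur;
    return the number of trials used. 'state' counts the successes
    still needed, mirroring the chain described in the text."""
    state = n              # start in state n
    trials = 0
    while state > 0:
        trials += 1
        if rng.random() < p:   # success: advance to state i-1
            state -= 1
        else:                  # failure: fall back to state n
            state = n
    return trials

# average over many runs to estimate the expectation
runs = 100_000
avg = sum(simulate_trials(0.5, 2) for _ in range(runs)) / runs
print(avg)  # close to 6 for p=0.5, n=2 (matching the formula derived below)
```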

Now we are going to compute the expected number of trials we need until we reach state 0 (starting from state $$n$$). To do so, we use the law of total expectation:

$$E(x)=\sum_{i=0}^l E(x|A_i) \cdot P(A_i)$$

We continue as follows: if we are in state $$i$$, then after the next trial we are either in state $$i-1$$ or back in state $$n$$ (the initial state).

We denote the expected number of remaining trials when we are currently in state $$i$$ by $$E_i(x)$$, or simply $$E_i$$. Remember that $$q=1-p$$.

Now we conclude $$E_i = p\cdot(1+E_{i-1})+q\cdot (1+E_n)=1+p\cdot E_{i-1} + q\cdot E_n$$. This gives us a recursive equation. The base case is $$E_0=0$$, since no further trials are needed once we have reached state 0.
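The recursion contains $$E_n$$ on the right-hand side, but it can still be solved numerically: writing $$E_i = a_i + b_i\cdot E_n$$ and unrolling from $$E_0=0$$ leaves a linear equation for $$E_n$$. A quick numeric sketch (the function name `expected_via_recursion` is ours):

```python
def expected_via_recursion(p, n):
    """Unroll E_i = 1 + p*E_{i-1} + q*E_n with E_0 = 0 by keeping
    E_i in the form a_i + b_i*E_n; finally E_n = a_n + b_n*E_n
    yields E_n = a_n / (1 - b_n)."""
    q = 1 - p
    a, b = 0.0, 0.0                 # E_0 = 0 + 0*E_n
    for _ in range(n):
        a, b = 1 + p * a, q + p * b # apply the recursion once
    return a / (1 - b)              # solve E_n = a_n + b_n*E_n

print(expected_via_recursion(0.5, 2))  # 6.0
```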

We now prove by induction that, for $$i>0$$, we have $$E_i=1+\sum_{j=1}^{i-1}p^j + q\cdot E_n\cdot\sum_{j=0}^{i-1}p^j$$.

The base case $$i=1$$ follows immediately from the recursion, since $$E_1=1+p\cdot E_0+q\cdot E_n=1+q\cdot E_n$$. For $$i>1$$ the induction step goes as follows:

$$E_i \stackrel{Def.}{=} 1+p\cdot E_{i-1}+q\cdot E_n = \\ \stackrel{Ind.hyp.}{=} 1+p\cdot(1+\sum_{j=1}^{i-2}p^j + q\cdot E_n\cdot \sum_{j=0}^{i-2}p^j) + q\cdot E_n = \\ = 1+p+\sum_{j=2}^{i-1}p^j+q\cdot E_n\cdot\sum_{j=1}^{i-1}p^j + q\cdot E_n =\\ = 1+ \sum_{j=1}^{i-1}p^j + q\cdot E_n\cdot\sum_{j=0}^{i-1}p^j$$

This concludes the induction. We can furthermore simplify the result as follows:

$$E_i=1+\frac{p^i-p}{p-1}+E_n\cdot(1-p^i)$$
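This simplification is just the geometric sum formula applied twice; a quick spot check with arbitrary sample values (the choices of $$p$$, $$i$$, and $$E_n$$ below are ours) confirms the two forms agree:

```python
# Spot-check that the summed form of E_i equals the simplified
# closed form for arbitrary sample values of p, i, and E_n.
p, i, En = 0.3, 5, 42.0
q = 1 - p
summed = 1 + sum(p**j for j in range(1, i)) + q * En * sum(p**j for j in range(i))
closed = 1 + (p**i - p) / (p - 1) + En * (1 - p**i)
print(abs(summed - closed) < 1e-12)  # True
```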

Remember that we are especially interested in $$E_n$$, the expected number of trials until $$n$$ successive successes. Setting $$i=n$$ yields $$E_n=1+\frac{p^n-p}{p-1}+E_n\cdot(1-p^n)$$, and solving for $$E_n$$ gives $$E_n=\frac{p^n-1}{p^{n+1}-p^n}$$.
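The closed form is a one-liner (the function name `expected_trials` is ours):

```python
def expected_trials(p, n):
    """Closed form E_n = (p^n - 1) / (p^(n+1) - p^n)."""
    return (p**n - 1) / (p**(n + 1) - p**n)

print(expected_trials(0.5, 1))  # 2.0  (ordinary geometric: 1/p)
print(expected_trials(0.5, 2))  # 6.0
print(expected_trials(0.5, 3))  # 14.0
```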

As mentioned above, the special case $$n=1$$ yields $$E_1=\frac{p-1}{p^2-p}=\frac{1}{p}$$, which matches the expected number of trials for the ordinary geometric distribution.

The following figure shows how the expected number of trials evolves for different values of $$p$$. You can see that the expected number grows exponentially in $$n$$; the greater $$p$$, the more slowly the curve grows.
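The same growth behaviour can be tabulated instead of plotted; this small sketch (parameter values chosen by us for illustration) prints $$E_n$$ for $$n=1,\dots,5$$ at a few values of $$p$$:

```python
# Tabulate E_n for a few values of p, illustrating the
# exponential growth of the expected number of trials in n.
def expected_trials(p, n):
    return (p**n - 1) / (p**(n + 1) - p**n)

for p in (0.3, 0.5, 0.7):
    row = [round(expected_trials(p, n), 1) for n in range(1, 6)]
    print(f"p={p}: {row}")
```

For $$p=0.5$$ the row reads 2, 6, 14, 30, 62, i.e. $$E_n = 2^{n+1}-2$$.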