# How to algorithmically simulate probabilistic distributions using other distributions

In this article, we explore (in short) how we can generate random numbers of any distribution if we have a corrupt coin toss generator (i.e. a random coin that produces 1 with (unknown) probability \(p\in(0;1)\) and 0 with (unknown) probability \(q=1-p\)). First of all, we convince ourselves that we can construct a fair coin tosser (i.e. 1 and 0 each occur with probability 0.5) from the corrupt coin tosser mentioned above.

To this end, we do the following: We execute the corrupt coin tosser two times. Then there are four possible outcomes:

01: In this case, we output 0

10: In this case, we output 1

00: In this case, we repeat the experiment

11: In this case, we repeat the experiment

We repeat until we have either 01 or 10 as output. Then we can compute the probability of ending with the result "01": either we reach "01" immediately, or we first get some sequence of "00" and "11" pairs. Thus, the probability of ending with "01" is \(\sum_{i=0}^\infty (p^2+q^2)^i \cdot (qp)\). Using the geometric series and the fact that \(p+q=1\) (so that \(1-(p^2+q^2)=2pq\)), this sum evaluates to \(\frac{qp}{2pq}=0.5\). The same computation works for the outcome "10". Thus, we get a fair random 0/1-number-generator.
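The procedure above (known as the von Neumann extractor) can be sketched in a few lines of Python. The bias \(p=0.3\) below is an arbitrary placeholder; the method works for any \(p\in(0;1)\):

```python
import random

def corrupt_coin(p=0.3):
    """The corrupt coin tosser: returns 1 with probability p, 0 otherwise.
    p = 0.3 is just an example value; in the setting above, p is unknown."""
    return 1 if random.random() < p else 0

def fair_coin(coin=corrupt_coin):
    """Toss the corrupt coin twice; output on "01" or "10", repeat otherwise."""
    while True:
        a, b = coin(), coin()
        if a != b:
            # "01" -> output 0, "10" -> output 1
            return a
```

Note that the loop terminates with probability 1, but the number of tosses needed is itself random (on average \(1/(2pq)\) rounds).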

We saw that we can w.l.o.g. assume that we have a fair 0/1-random-number-generator \(R\). Armed with this knowledge, we can generate any discrete random distribution with a finite number of outcomes. To do this, we employ a binary encoding of successive calls to \(R\). To be more precise, if we want to have \(n\) possible outcomes (namely from 0 to \(n-1\)), we do \(\lceil \log_2 n \rceil\) calls to \(R\). If the binary encoding of the successive \(R\)-calls yields a number less than \(n\), we are done and return that number as the result of our random process. Otherwise, we repeat until we get such a number as the result.
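As a sketch, the rejection step above could look as follows (here `random.getrandbits(1)` stands in for the fair generator \(R\)):

```python
import math
import random

def fair_bit():
    """Stand-in for the fair 0/1 generator R."""
    return random.getrandbits(1)

def uniform_int(n):
    """Draw uniformly from {0, ..., n-1}: read ceil(log2 n) fair bits
    as a binary number, and reject (i.e. retry) any result >= n."""
    k = max(1, math.ceil(math.log2(n)))
    while True:
        x = 0
        for _ in range(k):
            x = (x << 1) | fair_bit()
        if x < n:
            return x
```

Since fewer than half of the \(2^{\lceil \log_2 n \rceil}\) bit patterns are rejected, the expected number of rounds is less than 2.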

So, up to now we saw that we can w.l.o.g. assume that we have random number generators that generate a uniform random number between 0 and \(n-1\).

Now, assume that we want to construct a random variable \(X\) with a specific distribution. To generate such a variable, we need a continuous random variable. That is, we somehow have to convert our discrete variable into a continuous one.

There are several methods to do so; however, we focus on the following: We assume \(R\) is a fair 0/1-random-number-generator. Then, we execute the generator \(R\) \(n\) times and interpret the result as an \(n\)-digit binary number. After that, we divide the result by \(2^n\). Note that this still yields a discrete random variable, but for large \(n\) it "behaves" like a uniformly distributed variable over the interval \([0;1]\). To be honest, I do not know whether there is a (theoretical) way of transforming any discrete random variable into a uniformly distributed continuous random variable.
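This construction is a one-liner loop; with \(n=53\) bits (an arbitrary choice matching the precision of a double) the result is, for practical purposes, uniform on \([0;1)\):

```python
import random

def uniform01(n=53):
    """Approximate a uniform variable on [0,1): interpret n fair bits
    as an n-digit binary number and divide by 2**n."""
    x = 0
    for _ in range(n):
        x = (x << 1) | random.getrandbits(1)
    return x / 2**n
```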

So, we now assume that we have (almost) a random variable \(R'\) that is uniformly distributed over \([0;1]\). If we now want a (continuous) random variable \(X\) with the distribution \(F_X(t)=P(X\leq t)\), we can proceed as follows (we assume \(F_X\) to be strictly increasing, and thus we also assume that there is an inverse \(F_X^{-1}\) with \(F_X(F_X^{-1}(t))=t\)): We define \(X=F_X^{-1}(R')\).

Now we compute:

\(P(X\leq t)=P(F_X^{-1}(R')\leq t)= P(R' \leq F_X(t))\approx F_{R'}(F_X(t))= F_X(t)\)

This tells us that \(X\) is a random variable with the distribution \(F_X\), which is what we wanted.
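As a concrete sketch of this inverse-transform idea, take the exponential distribution (an example not discussed above): there \(F_X(t)=1-e^{-\lambda t}\), so \(F_X^{-1}(u)=-\ln(1-u)/\lambda\). Using `random.random()` as a stand-in for \(R'\):

```python
import math
import random

def exponential(lam=1.0):
    """Inverse-transform sampling for the exponential distribution:
    F(t) = 1 - exp(-lam*t), hence F^{-1}(u) = -log(1-u)/lam."""
    u = random.random()  # stand-in for the uniform variable R'
    return -math.log(1.0 - u) / lam
```

The sample mean of many such draws should approach \(1/\lambda\), which is a quick sanity check for the construction.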

Note that from a continuous random variable it is no problem to derive a discrete random variable that, e.g., takes only integral values (say, by rounding or taking the floor).