Expected number of cycles in a random permutation

Posted on March 20, 2014 by phimuemue

Once again a random experiment and an associated random variable: We generate a random permutation and are interested in the expected number of cycles.

A short intro on cycles

Consider as an example a permutation of the numbers \(1, 2, 3, 4, 5\), namely the permutation \(5,2,4,1,3\). We can write this permutation in another representation as follows: \(\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 2 & 4 & 1 & 3 \end{pmatrix}\). Let's call this permutation \(p\) and interpret it as a bijective function \(\{1,2,3,4,5\} \rightarrow \{1,2,3,4,5\}\). That means, that in our case, \(p(1)=5, p(2)=2\) and so on.

We can now decompose this permutation into cycles the following way: We start with number 1, and see \(p(1)=5\); since the result is 5, we continue with this number and observe that \(p(5)=3\). Again, we continue and see \(p(3)=4\) and \(p(4)=1\) (which was our starting point).

The only number left is 2, and we see that \(p(2)=2\).

That is, to identify cycles, we start with one particular number and "walk" along the permutation until we reach our initial number. All numbers seen during this walk belong to our current cycle. If there are any numbers left, we start the process again and obtain new cycles.

A second example should suffice to explain how this works exactly. We consider the following permutation over the numbers 1 to 10: \(\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 4 & 1 & 9 & 10 & 3 & 6 & 7 & 2 & 5 & 8 \end{pmatrix}\)

We observe that \(p(1)=4, p(4)=10, p(10)=8, p(8)=2, p(2)=1\), from which we get our first cycle \((1, 4, 10, 8, 2)\). We have not yet seen 3, so we examine \(p(3)=9, p(9)=5, p(5)=3\), giving the next cycle \((3,9,5)\). Finally, we see \(p(6)=6\) and \(p(7)=7\) resulting in two cycles of length 1: \((6)\) resp. \((7)\).

We can then write our original permutation in a very compact way - as a sequence of cycles: \((1,4,10,8,2)(3,9,5)(6)(7)\). Sometimes, cycles of length 1 are left out and only given implicitly, so we could reduce further to \((1,4,10,8,2)(3,9,5)\).

A noteworthy special case is the identity permutation having only cycles of length 1.

Here's some python code that can compute the cycles of a permutation (note that this code assumes 0-indexed arrays, thus working with numbers 0 to \(n-1\) instead of numbers 1 to \(n\)):

def cycles(p):
    done = [False for _ in p]
    result = []
    i = 0
    while True:
        while i<len(p) and done[i]:
            i = i + 1
        if i==len(p):
            break
        result.append([])
        while not done[i]:
            result[-1].append(i)
            done[i] = True
            i = p[i]
    return result

What do we want to do?

We now generate a random permutation (in a way such that all permutations are euqally likely) and are interested in the expected number of cycles.

Expected number of cycles of a certain length

As a first step, we compute - for a random permutation of the numbers \(1\) to \(n\) - the expected number of cycles of length \(m\) (where, of course, \(m\leq n\)). To do this, we introduce some indicator variables: \(I_i = \begin{cases} 0, & \text{ if } i \text{ is not on a cycle of length } m \\ 1, & \text{ if } i \text{ is on a cycle of length } m \end{cases}\)

Now, it is clear that \(E[I_i] = P[I_i = 1]\) since \(I_i\) is just an indicator variable. We can compute \(P[I_i = 1]\) using the following considerations.

To satisfy \(I_i = 1\), the number \(i\) must be contained in a cycle of length \(m\). Let's examine how many possibilities we have to generate such a cycle:

These facts boil down to \(\binom{n-1}{m-1} \cdot (m-1)! \cdot (n-m)! = \frac{(n-1)!}{(n-m)! (m-1)!} \cdot (m-1)! \cdot (n-m)! = (n-1)!\) possible ways of generating a cycle of length \(m\) containing the number \(i\).

Since there are \(n!\) possibilities in total, we have \(P[I_i=1]=\frac{(n-1)}{n!} = \frac{1}{n}\): This is a very surprising fact: The probability that \(i\) is contained in a cycle of length \(m\) is given by \(\frac{1}{n}\) - and thus independent of \(m\).

To compute the expected number of cycles of length \(m\), we can say that we just have to compute \(E[\text{no. of cycles of length } m] = E\left[\frac{1}{m}\left(\sum_{i=1}^n I_i\right)\right]\). This is clear because each "member" of a cycle of length \(m\) corresponds to an indicator variable of value 1. Thus, we have to divide ths sum of indicator variables by \(m\) to obtain the number of cycles of length \(m\).

Linearity of expectation now gives us that \(E[\text{no. of cycles of length } m] = \frac{1}{m} \cdot \sum_{i=1}^n E[I_i]\), which can be simplified to \(\frac{1}{m} \cdot \sum_{i=1}^n P[I_i=1] = \frac{1}{m} \cdot \sum_{i=1}^n \frac{1}{n} = \frac{1}{m}\).

We have seen that the expected number of cycles of length \(m\) is given by \(\frac{1}{m}\).

Expected number of cycles

We can now apply our findings to compute the expected number of cycles in general.

Since a cycle's length is between 1 and \(n\), we can say that we get the total number of cycles by adding up the number of length-1-cycles, length-2-cycles, length-3-cycles, ..., length-\(n\)-cycles. Putting this in a formula, we have: \(E[\text{no. of cycles}] = \sum_{m=1}^n E[\text{no. of cycles of length } m]\). Using our knowledge from the previous section, we obtain \(\sum_{m=1}^n \frac{1}{m} = H_m\) - the well-known harmonic numbers.