Finding all paths

Posted on October 23, 2013 by phimuemue

Consider the following problem: You are given a two-dimensional array of integers and you are allowed to place a chess king on one of the arrays. The king can move in any of the eight directions at a time and you want to know how many (simple) paths this king can go that satisfy that the corresponding numbers are strictly increasing and that end in an even number.

As an example, consider the following array:

0 1
2 3

In this array, we have the paths [0], [0 1 2], [0 2], [1 2], [2] (strictly increasing and ending in an even number).

The algorithmic core

First of all note that the number of paths in general is huge (i.e. exponential in the dimensions of the array). Still, we can write an algorithm that solves the problem for us.

We first of all declare a helper routine neighbours(a, x, y) that takes the array a and the coordinates of a cell (x, y). This routine shall iterate over all neighbours in our array. It outputs the coordinates of the respective neighbours.

The central part to solve this problem is a recursive routine generate_all_paths_from(a, X, Y, stack, f, final). It takes the array, the coordinates of the king's starting point, a "stack" storing the current path the king is on and two functions f and final that indicate whether the current path is strictly increasing (this must be done by f) and if the path ends in an even number (done by final).

The function generate_all_paths_from then will examine all paths beginning at coordinates X, Y satisfying the requirements dictated by the functions f and final. It does so by using a variant of depth first search: It adds the current cell's coordinates onto the stack, iterates over all neighbours of the current cell that are not part of the current path (i.e. not within the current stack), and checks whether the path is strictly increasing (using f). If this is the case, it recurses with the respective neighbour and yields the respective paths found.

The recursionn terminates when a cell has no valid neighbours anymore. In this case, the path consisting only of the current cell is returned, and processed within the parent function (i.e. one recursion level higher).

Since generate_all_paths_from only searches for paths beginning in one particular initial cell, we need some wrapper method to call it for the different starting points.

Code

The following code showcases the algorithm.

def neighbours(a, x, y):
    xs = []
    ys = []
    if x > 0:
        xs.append(-1)
    xs.append(0)
    if x < len(a[0]) - 1:
        xs.append(1)
    if y > 0:
        ys.append(-1)
    ys.append(0)
    if y < len(a) - 1:
        ys.append(1)
    for x1 in xs:
        for y1 in ys:
            if x1 != 0 or y1 != 0:
                yield (x+x1, y+y1)
 
def generate_all_paths_from(a, X, Y, stack, f = lambda a, s, x, y : True, final = lambda a, s, x, y : True):
    hadneighbour = False
    s = stack[:] + [(X,Y)]
    for (x, y) in neighbours(a, X, Y):
        if (x, y) not in stack:
            if f(a, s, x, y):
                for p in generate_all_paths_from(a, x, y, s, f, final):
                    hadneighbour = True
                    yield [(X, Y)] + p
    if final(a, s, X, Y):
        yield [(X, Y)]
 
def generate_all_paths(a, f = lambda a, s, x, y : True, final = lambda a, s, x, y : True):
    for x in xrange(len(a[0])):
        for y in xrange(len(a)):
            for p in generate_all_paths_from(a, x, y, [], f, final):
                yield p
 
def is_growing(a, s, x, y):
    if not s:
        return True
    lastelem = a[s[-1][1]][s[-1][0]]
    newelem = a[y][x]
    return newelem > lastelem
 
N = 3
from random import sample
arr = [sample([_ for _ in xrange(N*N)], N) for i in xrange(N)]
print arr
 
for p in generate_all_paths(arr, is_growing, lambda a, s, x, y : a[y][x]%2==0):
    print [arr[y][x] for (x,y) in p]

Remarks and optimizations

I designed the above code such that it works not only for increasing paths ending in a number divisible by 2, but such that it can be adapted to suit your needs. To do this, the signatures for f and final are somewhat complex: They take the array, the current path (i.e. stack of coordinates) and the coordinates to be appended to the path.

Obviously, we could easily speed up the process a bit by avoiding the check whether (x,y) not in stack. This check requires linear run time in the size of the stack, and we could easily acchieve constant run time for it by assigning each cell a property that indicates whether it is currently in the stack. As soon as we push a cell onto the stack, we set this property to true. If we terminate the call for the respective cell, we reset the value to false. However, as I didn't run into problems with this line, I left it for the sake of simplicity.

Moreover, the above implementation is not very memory conservative. In each call to generate_all_paths_from, we create a copy of the whole stack (so we do not modify the original). We could adjust this to work on the stack directly, thereby avoiding copying it, but again, this would make things more complicated.

Another thing is that in some cases we can easily predict the paths for some starting node. Consider e.g. the array

3 2 1
4 1 1
5 6 7

If we already computed the paths starting in 3, we can easily say that the paths beginning in 2 must -- basically -- be the same like the ones beginning in 3 - simply preceeded by 2. However, I do not think those kinds of tricks work for arbitrary functions f and final, which is why I did not research it any further.