8. Recursion#

A very powerful concept in programming is recursion, or functions which call themselves. A simple example is the factorial function, which can be defined as:

\[\begin{split} x_n = \begin{cases} 1 &\text{if }n = 0 \\ n x_{n-1}, &\text{if }n > 0 \end{cases} \end{split}\]

Note that:

  • There is a base case, \(n=0\), which defines the output directly

  • For higher \(n\), the output is defined in terms of lower values

This can be implemented in a Julia function using an if-statement and a call to the function itself:

function recursive_factorial(n)
    if n == 0
        return 1
    else
        return n*recursive_factorial(n-1)
    end
end
recursive_factorial (generic function with 1 method)
recursive_factorial(7)
5040

Note the way this will be evaluated in Julia: Eventually it will call the function with \(n=0\) which will return 1, this will be multiplied by 2, etc, and last it will multiply by 7. This can be demonstrated by printing additional information:

function recursive_factorial_info(n)
    if n == 0
        println("n == 0, returning 1")
        return 1
    else
        println("n == ", n, ", calling itself with parameter ", n-1)
        output = n*recursive_factorial_info(n-1)
        println("n == ", n, ", finished calling itself, multiplying and returning ", n, "! = ", output)
        return output
    end
end
recursive_factorial_info (generic function with 1 method)
recursive_factorial_info(7)
n == 7, calling itself with parameter 6
n == 6, calling itself with parameter 5
n == 5, calling itself with parameter 4
n == 4, calling itself with parameter 3
n == 3, calling itself with parameter 2
n == 2, calling itself with parameter 1
n == 1, calling itself with parameter 0
n == 0, returning 1
n == 1, finished calling itself, multiplying and returning 1! = 1
n == 2, finished calling itself, multiplying and returning 2! = 2
n == 3, finished calling itself, multiplying and returning 3! = 6
n == 4, finished calling itself, multiplying and returning 4! = 24
n == 5, finished calling itself, multiplying and returning 5! = 120
n == 6, finished calling itself, multiplying and returning 6! = 720
n == 7, finished calling itself, multiplying and returning 7! = 5040
5040

This example is highly hypothetical - the factorial function is easier and more efficient to implement using for-loops. But for more complex problems, recursion can be an essential tool.

8.1. Example: The Ackermann function#

Think Julia, Exercise 6-5: The Ackermann function, \(A(m, n)\), is defined:

\[\begin{split} \begin{equation} {A(m, n) = \begin{cases} n+1& \textrm{if}\ m = 0 \\ A(m-1, 1)& \textrm{if}\ m > 0\ \textrm{and}\ n = 0 \\ A(m-1, A(m, n-1))& \textrm{if}\ m > 0\ \textrm{and}\ n > 0. \end{cases}} \end{equation} \end{split}\]

See https://en.wikipedia.org/wiki/Ackermann_function. Write a function named ack that evaluates the Ackermann function. Use your function to evaluate ack(3, 4), which should be 125. What happens for larger values of m and n?

This example is not an obvious for-loop like the factorial function, but very easy to implement using recursion:

function ack(m,n)
    if m == 0
        return n + 1
    elseif m > 0 && n == 0
        return ack(m-1,1)
    else
        return ack(m-1, ack(m,n-1))
    end
end
ack (generic function with 1 method)
ack(3,4)
125

Again, let’s print some additional information to try to understand how the recursive function calls itself:

function ack_info(m,n)
    function printme()
        print("ack(", m, ",", n, "): ")
    end
    if m == 0
        output = n + 1
        printme()
        println("Case 1: returning n + 1 = ", output)
        return output
    elseif m > 0 && n == 0
        printme()
        println("Case 2: calling itself with parameters m-1,1 == ", m-1, ",", 1)
        output = ack_info(m-1,1)
        printme()
        println("Case 2: finished calling itself, returning with output ", output)
        return output
    else
        printme()
        println("Case 3: calling itself for new n-value with parameters m,n-1 == ", m, ",", n-1)
        newn = ack_info(m,n-1)
        printme()
        println("Case 3: finished calling itself for new n-value == ", newn)
        printme()
        println("Case 3: calling itself with parameters m-1, A(m,n-1) == ", m-1, ",", newn)
        output = ack_info(m-1,newn)
        printme()
        println("Case 3: finished calling itself, returning ", output)
        return output
    end
end
ack_info (generic function with 1 method)
ack_info(2,1)
ack(2,1): Case 3: calling itself for new n-value with parameters m,n-1 == 2,0
ack(2,0): Case 2: calling itself with parameters m-1,1 == 1,1
ack(1,1): Case 3: calling itself for new n-value with parameters m,n-1 == 1,0
ack(1,0): Case 2: calling itself with parameters m-1,1 == 0,1
ack(0,1): Case 1: returning n + 1 = 2
ack(1,0): Case 2: finished calling itself, returning with output 2
ack(1,1): Case 3: finished calling itself for new n-value == 2
ack(1,1): Case 3: calling itself with parameters m-1, A(m,n-1) == 0,2
ack(0,2): Case 1: returning n + 1 = 3
ack(1,1): Case 3: finished calling itself, returning 3
ack(2,0): Case 2: finished calling itself, returning with output 3
ack(2,1): Case 3: finished calling itself for new n-value == 3
ack(2,1): Case 3: calling itself with parameters m-1, A(m,n-1) == 1,3
ack(1,3): Case 3: calling itself for new n-value with parameters m,n-1 == 1,2
ack(1,2): Case 3: calling itself for new n-value with parameters m,n-1 == 1,1
ack(1,1): Case 3: calling itself for new n-value with parameters m,n-1 == 1,0
ack(1,0): Case 2: calling itself with parameters m-1,1 == 0,1
ack(0,1): Case 1: returning n + 1 = 2
ack(1,0): Case 2: finished calling itself, returning with output 2
ack(1,1): Case 3: finished calling itself for new n-value == 2
ack(1,1): Case 3: calling itself with parameters m-1, A(m,n-1) == 0,2
ack(0,2): Case 1: returning n + 1 = 3
ack(1,1): Case 3: finished calling itself, returning 3
ack(1,2): Case 3: finished calling itself for new n-value == 3
ack(1,2): Case 3: calling itself with parameters m-1, A(m,n-1) == 0,3
ack(0,3): Case 1: returning n + 1 = 4
ack(1,2): Case 3: finished calling itself, returning 4
ack(1,3): Case 3: finished calling itself for new n-value == 4
ack(1,3): Case 3: calling itself with parameters m-1, A(m,n-1) == 0,4
ack(0,4): Case 1: returning n + 1 = 5
ack(1,3): Case 3: finished calling itself, returning 5
ack(2,1): Case 3: finished calling itself, returning 5
5

This also shows that the function calls itself for the same parameter values many times, suggesting that this could be done more efficiently by storing values that have already been computed.

8.2. Example: The Greatest Common Divisor (GCD)#

Think Julia, Exercise 6-8: The greatest common divisor (GCD) of \(a\) and \(b\) is the largest number that divides both of them with no remainder.

One way to find the GCD of two numbers is based on the observation that if \(r\) is the remainder when \(a\) is divided by \(b\), then gcd(a, b) = gcd(b, r). As a base case, we can use gcd(a, 0) = a.

function my_gcd(a,b)
    if a == 0
        return b
    elseif b == 0
        return a
    else
        return my_gcd(b, a % b)
    end
end
my_gcd (generic function with 1 method)
factor = 123_456_789
prime1 = 67_867_979
prime2 = 86_028_121
my_gcd(prime1*factor, prime2*factor)
123456789

8.3. Example: Recursive triangles#

Consider the shape obtained by the following algorithm:

    if the triangle is big enough
        Connect the midpoints.
        Color the interior triangle mauve.
        Draw smaller versions of the same shape in each of the 3 remaining interior triangles
    else
        Color the whole triangle yellow.
    end

We implement it as follows. “Big enough” is determined by keeping track of the level (depth) of recursion.

using PyPlot

function drawTriangle(x, y, level)
    # Draw recursively colored triangles.
    # x,y are 3-vectors that define the vertices of a triangle.
    
    if level == 0
        # Recursion limit (depth) reached
        fill(x, y, "y") # Color whole triangle yellow
    else
        # Draw the triangle...
        plot(x[[1,2,3,1]], y[[1,2,3,1]], "k", linewidth=0.5)
        # Determine the midpoints...
        a = (x + x[[2,3,1]]) / 2
        b = (y + y[[2,3,1]]) / 2
        # Draw and color the interior triangle mauve
        fill(a, b, "m")
        # Apply the process to the three "corner" triangles...
        newx = [x a a[[3,1,2]]]
        newy = [y b b[[3,1,2]]]
        for i = 1:3
            drawTriangle(newx[i,:], newy[i,:], level - 1)
        end
    end
end
drawTriangle (generic function with 1 method)
# Equilateral triangle
x = [0, 1, 0.5]
y = [0, 0, 1/sqrt(2)]

drawTriangle(x, y, 5)
../../_images/4400fabfe79b7ce1a00f325c971c8478a1a1ad3483be2c2b94c6c82589eda2c2.png

8.4. Merge sort#

Merge sort is a so-called Divide and Conquer algorithm. It divides an input array into two halves, calls itself for these two halves and then merges the two sorted halves. The method is illustrated below, for a simple test case with 7 elements: merge_sort (from https://en.wikipedia.org/wiki/Merge_sort)

In our implementation, the mergeLR!(L, R, x) function is used for merging two halves. It assumes that the arrays L and R are sorted, and merges them into one:

function mergeLR!(L, R, x)
    # Merge the *already sorted arrays* L and R into a sorted array x
    i = j = k = 1
        
    # Merge L and R into x
    while i <= length(L) && j <= length(R)
        if L[i] < R[j]
            x[k] = L[i]
            i += 1
        else
            x[k] = R[j]
            j += 1
        end
        k += 1
    end

    # Copy remaining elements
    while i <= length(L)
        x[k] = L[i]
        i += 1
        k += 1
    end
    while j <= length(R)
        x[k] = R[j]
        j += 1
        k += 1
    end
end
mergeLR! (generic function with 1 method)

This can now be used to recursively split the array into two approximately equal-sized arrays until they have length 1, and then apply the mergeLR! function at each level:

function mergesort!(x)
    # Sort the elements of the array x using the Mergesort algorithm
    if length(x) <= 1
        return x
    else
        mid = length(x) ÷ 2   # Find the midpoint of the array
        L = x[1:mid]          # Divide array into 2 halves
        R = x[mid+1:end]

        mergesort!(L)         # Sort first half
        mergesort!(R)         # Sort second half
        
        mergeLR!(L, R, x)
    end
end
mergesort! (generic function with 1 method)
# Example: Sort random integers
x = rand(1:1000, 10)
println(x)
mergesort!(x)
println(x)
[146, 427, 955, 762, 329, 106, 86, 683, 968, 862]
[86, 106, 146, 329, 427, 683, 762, 862, 955, 968]

It can be shown that the number of operations needed for this algorithm to sort an array of length \(n\) is about a constant times \(n \log_2 n\) (which is optimal for methods based on general comparisons). We can roughly verify this using the @time macro which measures execution time:

for array_size = Int64[1e3, 1e4, 1e5, 1e6, 1e7]
    x = rand(array_size)   # Random floating point numbers
    println("n = ", array_size)
    @time mergesort!(x)
end
n = 1000
  0.047354 seconds (24.08 k allocations: 1.628 MiB, 99.70% compilation time)
n = 10000
  0.001318 seconds (20.00 k allocations: 2.145 MiB)
n = 100000
  0.036682 seconds (200.08 k allocations: 23.977 MiB, 28.63% gc time, 13.57% compilation time)
n = 1000000
  0.199707 seconds (2.00 M allocations: 266.053 MiB, 12.26% gc time)
n = 10000000
  2.144884 seconds (20.01 M allocations: 2.834 GiB, 5.95% gc time)

We can see that at least for the larger values, the execution time is a little more than 10 times larger when \(n\) gets 10 times larger, showing slightly more than a linear dependency on \(n\).

8.5. Saving intermediate values in a recursion#

The function below implements the so-called McCarthy 91 function:

\[\begin{split} M(n) = \begin{cases} n-10, & \text{if }n > 100 \\ M(M(n+11)), & \text{if } n \le 100 \end{cases} \end{split}\]

While trivial to implement using recursion, it is not that easy to trace the recursive calls to the function. Therefore, we define a function Mvalues(n) which creates an empty array returned_values. Inside this function we define the actual recursive function M(n), and each time it is called we push the value that it returns to the returned_values array. Note that the order of these numbers in the array will not be the same as the order in which M(n) is called, since the values are pushed to the array at the end of the function.

function Mvalues(n)
    returned_values = Int64[]

    function M(n)
        if n > 100
            newval = n - 10
        else
            newval = M(M(n + 11))
        end
        push!(returned_values, newval)
        return newval
    end

    M(n)
    return returned_values
end
Mvalues (generic function with 1 method)
Mvalues(105)     # Easy - terminates immediately, M(105) = 95
1-element Vector{Int64}:
 95
Mvalues(97)      # More complex - finally returns M(97) = 91
9-element Vector{Int64}:
  98
  99
 100
 101
  91
  91
  91
  91
  91