4.3 Recursive problem solving

In the previous section, we used functional composition to break a problem into two procedures that can be composed to produce the desired output. A particularly useful variation on this is when we can break a problem into a smaller version of the original problem.

The goal is to be able to feed the output of one application of the procedure back into the same procedure as its input for the next application, as shown in Figure 4.3.

Figure 4.3: Circular Composition.

Here’s a corresponding Scheme procedure:

(define f (lambda (n) (f n)))

Of course, this doesn’t work very well!1 Every application of f results in another application of f to evaluate. This never stops — no output is ever produced and the interpreter will keep evaluating applications of f until it is stopped or runs out of memory.

We need a way to make progress and eventually stop, instead of going around in circles. To make progress, each subsequent application should have a smaller input. Then, the applications stop when the input to the procedure is simple enough that the output is already known. The stopping condition is called the base case, similarly to the grammar rules in Section 2.4. In our grammar examples, the base case involved replacing the nonterminal with nothing (e.g., MoreDigits ::$\Rightarrow $ $\epsilon $) or with a terminal (e.g., Noun ::$\Rightarrow $ Alice). In recursive procedures, the base case will provide a solution for some input for which the problem is so simple we already know the answer. When the input is a number, this is often (but not necessarily) when the input is 0 or 1.

To define a recursive procedure, we use an if expression to test if the input matches the base case input. If it does, the consequent expression is the known answer for the base case. Otherwise, the recursive case applies the procedure again but with a smaller input. That application needs to make progress towards reaching the base case. This means, the input has to change in a way that gets closer to the base case input. If the base case is for 0, and the original input is a positive number, one way to get closer to the base case input is to subtract 1 from the input value with each recursive application.

This evaluation spiral is depicted in Figure 4.4. With each subsequent recursive call, the input gets smaller, eventually reaching the base case. For the base case application, a result is returned to the previous application. This is passed back up the spiral to produce the final output. Keeping track of where we are in a recursive evaluation is similar to keeping track of the subnetworks in an RTN traversal. The evaluator needs to keep track of where to return after each recursive evaluation completes, similarly to how we needed to keep track of the stack of subnetworks to know how to proceed in an RTN traversal.

Figure 4.4: Recursive Composition.

Here is the corresponding procedure:

(define g (lambda (n)
        (if (= n 0) 1
                (g (- n 1)))))

Unlike the earlier circular f procedure, if we apply g to any non-negative integer it will eventually produce an output. For example, consider evaluating (g 2). When we evaluate the first application, the value of the parameter n is 2, so the predicate expression (= n 0) evaluates to false and the value of the procedure body is the value of the alternate expression, (g (- n 1)). The subexpression, (- n 1) evaluates to 1, so the result is the result of applying g to 1. As with the previous application, this leads to the application, (g (- n 1)), but this time the value of n is 1, so (- n 1) evaluates to 0. The next application leads to the application, (g 0). This time, the predicate expression evaluates to true and we have reached the base case. The consequent expression is just 1, so no further applications of g are performed and this is the result of the application (g 0). This is returned as the result of the (g 1) application in the previous recursive call, and then as the output of the original (g 2) application.

We can think of the recursive evaluation as winding until the base case is reached, and then unwinding the outputs back to the original application. For this procedure, the output is not very interesting: no matter what positive number we apply g to, the eventual result is 1. To solve interesting problems with recursive procedures, we need to accumulate results as the recursive applications wind or unwind. Examples 4.1 and 4.2 illustrate recursive procedures that accumulate the result during the unwinding process. Example 4.3 illustrates a recursive procedure that accumulates the result during the winding process.

Example 4.1: Factorial

How many different arrangements are there of a deck of 52 playing cards?

The top card in the deck can be any of the 52 cards, so there are 52 possible choices for the top card.

The second card can be any of the cards except for the card that is the top card, so there are 51 possible choices for the second card. The third card can be any of the 50 remaining cards, and so on, until the last card for which there is only one choice remaining.

$$ 52 * 51 * 50 * \cdots * 2 * 1 $$

This is known as the factorial function (denoted in mathematics using the exclamation point, e.g., $52!$). It can be defined recursively:

$0! = 1$

$n! = n * (n - 1)!$ for all $n > 0$

The mathematical definition of factorial is recursive, so it is natural that we can define a recursive procedure that computes factorials:

(define (factorial n) (if (= n 0) 1 (* n (factorial (- n 1)))))

Evaluating (factorial 52) produces the number of arrangements of a 52-card deck: a sixty-eight digit number starting with an 8.

The factorial procedure has structure very similar to our earlier definition of the useless recursive g procedure. The only difference is the alternative expression for the if expression: in g we used (g (- n 1)); in factorial we added the outer application of *: (* n (factorial (- n 1))). Instead of just evaluating to the result of the recursive application, we are now combining the output of the recursive evaluation with the input n using a multiplication application.

Exercise 4.5. How many different ways are there of choosing an unordered 5-card hand from a 52-card deck?

This is an instance of the “$n$ choose $k$” problem (also known as the binomial coefficient): how many different ways are there to choose a set of $k$ items from $n$ items. There are $n$ ways to choose the first item, $n - 1$ ways to choose the second, $\ldots $, and $n - k + 1$ ways to choose the $k^{th}$ item. But, since the order does not matter, some of these ways are equivalent. The number of possible ways to order the $k$ items is $k!$, so we can compute the number of ways to choose $k$ items from a set of $n$ items as:

[ \frac{n * (n - 1) * \cdots * (n - k + 1)}{k!} = \frac{n!}{(n-k)!k!} ]

a. Define a procedure choose that takes two inputs, $n$ (the size of the item set) and $k$ (the number of items to choose), and outputs the number of possible ways to choose $k$ items from $n$.

b. Compute the number of possible 5-card hands that can be dealt from a 52-card deck.

c. $\left[\star \right]$Compute the likelihood of being dealt a flush (5 cards all of the same suit). In a standard 52-card deck, there are 13 cards of each of the four suits. Hint: divide the number of possible flush hands by the number of possible hands.

Exercise 4.6.  Gauss, Karl Reputedly, when Karl Gauss was in elementary school his teacher assigned the class the task of summing the integers from 1 to 100 (e.g., $1 + 2 + 3 + \cdots + 100$) to keep them busy. Being the (future) “Prince of Mathematics”, Gauss developed the formula for calculating this sum, that is now known as the Gauss sum. Had he been a computer scientist, however, and had access to a Scheme interpreter in the late 1700s, he might have instead defined a recursive procedure to solve the problem. Define a recursive procedure, gauss-sum, that takes a number n as its input parameter, and evaluates to the sum of the integers from 1 to n as its output. For example, (gauss-sum 100) should evaluate to 5050.

Karl Gauss

Exercise 4.7. $\left[\star \right]$accumulate Define a higher-order procedure, accumulate, that can be used to make both gauss-sum (from Exercise 4.6) and factorial. The accumulate procedure should take as its input the function used for accumulation (e.g., * for factorial, + for gauss-sum). With your accumulate procedure, ((accumulate +) 100) should evaluate to 5050 and ((accumulate *) 3) should evaluate to 6. We assume the result of the base case is 1 (although a more general procedure could take that as a parameter).

Hint: since your procedure should produce a procedure as its output, it could start like this:

(define (accumulate f)
    (lambda (n)
        (if (= n 1) 1
...

Example 4.2: Find Maximum

Consider the problem of defining a procedure that takes as its input a procedure, a low value, and a high value, and outputs the maximum value the input procedure produces when applied to an integer value between the low value and high value input. We name the inputs f, low, and high. To find the maximum, the find-maximum procedure should evaluate the input procedure f at every integer value between the low and high, and output the greatest value found.

Here are a few examples:

Scheme output
> (find-maximum (lambda (x) x) 1 20)  
20  
> (find-maximum (lambda (x) (- 10 x)) 1 20)  
9  
> (find-maximum (lambda (x) (* x (- 10 x))) 1 20)  
25

To define the procedure, think about how to combine results from simpler problems to find the result. For the base case, we need a case so simple we already know the answer. Consider the case when low and high are equal. Then, there is only one value to use, and we know the value of the maximum is (f low). So, the base case is (if (= low high) (f low) ...).

How do we make progress towards the base case? Suppose the value of high is equal to the value of low plus 1. Then, the maximum value is either the value of (f low) or the value of (f (+ low 1)). We could select it using the bigger procedure (from Example 3.3): (bigger (f low) (f (+ low 1))). We can extend this to the case where high is equal to low plus 2: (bigger (f low) (bigger (f (+ low 1)) (f (+ low 2))))

The second operand for the outer bigger evaluation is the maximum value of the input procedure between the low value plus one and the high value input. If we name the procedure we are defining find-maximum, then this second operand is the result of (find-maximum f (+ low 1) high). This works whether high is equal to (+ low 1), or (+ low 2), or any other value greater than high.

Putting things together, we have our recursive definition of find-maximum: (define (find-maximum f low high) (if (= low high) (f low) (bigger (f low) (find-maximum f (+ low 1) high))))

Exercise 4.8. To find the maximum of a function that takes a real number as its input, we need to evaluate at all numbers in the range, not just the integers. There are infinitely many numbers between any two numbers, however, so this is impossible. We can approximate this, however, by evaluating the function at many numbers in the range.

Define a procedure find-maximum-epsilon that takes as input a function f, a low range value low, a high range value high, and an increment epsilon, and produces as output the maximum value of f in the range between low and high at interval epsilon. As the value of epsilon decreases, find-maximum-epsilon should evaluate to a value that approaches the actual maximum value.

For example,

(find-maximum-epsilon (lambda (x) (* x (- 5.5 x))) 1 10 1)

evaluates to 7.5. And,

(find-maximum-epsilon (lambda (x) (* x (- 5.5 x))) 1 10 0.01)

evaluates to 7.5625.

Exercise 4.9. $\left[\star \right]$The find-maximum procedure we defined evaluates to the maximum value of the input function in the range, but does not provide the input value that produces that maximum output value. Define a procedure that finds the input in the range that produces the maximum output value. For example, (find-maximum-input inc 1 10) should evaluate to 10 and (find-maximum-input (lambda (x) (* x (- 5.5 x))) 1 10) should evaluate to 3.

Exercise 4.10. $\left[\star \right]$Define a find-area procedure that takes as input a function f, a low range value low, a high range value high, and an increment epsilon, and produces as output an estimate for the area under the curve produced by the function f between low and high using the epsilon value to determine how many regions to evaluate.

Example 4.3: Euclid’s Algorithm

Euclid In Book 7 of the Elements, Euclid describes an algorithm for finding the greatest common divisor of two non-zero integers. The greatest common divisor is the greatest integer that divides both of the input numbers without leaving any remainder. For example, the greatest common divisor of 150 and 200 is 50 since (/ 150 50) evaluates to 3 and (/ 200 50) evaluates to 4, and there is no number greater than 50 that can evenly divide both 150 and 200.

modulo The modulo primitive procedure takes two integers as its inputs and evaluates to the remainder when the first input is divided by the second input. For example, (modulo 6 3) evaluates to 0 and (modulo 7 3) evaluates to 1.

Euclid’s algorithm stems from two properties of integers:

  1. If (modulo a b) evaluates to 0 then b is the greatest common divisor of a and b.

  2. If (modulo a b) evaluates to a non-zero integer r, the greatest common divisor of a and b is the greatest common divisor of b and r.

We can define a recursive procedure for finding the greatest common divisor closely following Euclid’s algorithm2:

(define (gcd-euclid a b)
        (if (= (modulo a b) 0) b (gcd-euclid b (modulo a b))))

The structure of the definition is similar to the factorial definition: the procedure body is an if expression and the predicate tests for the base case. For the gcd-euclid procedure, the base case corresponds to the first property above. It occurs when b divides a evenly, and the consequent expression is b. The alternate expression, (gcd-euclid b (modulo a b)), is the recursive application.

The gcd-euclid procedure differs from the factorial definition in that there is no outer application expression in the recursive call. We do not need to combine the result of the recursive application with some other value as was done in the factorial definition, the result of the recursive application is the final result. Unlike the factorial and find-maximum examples, the gcd-euclid procedure produces the result in the base case, and no further computation is necessary to produce the final result. When no further evaluation is necessary to get from the result of the recursive application to the final result, a recursive definition is said to be tail recursive. Tail recursive procedures have the advantage that they can be evaluated without needing to keep track of the stack of previous recursive calls. Since the final call produces the final result, there is no need for the interpreter to unwind the recursive calls to produce the answer.

Exercise 4.11.  Show the structure of the gcd-euclid applications in evaluating (gcd-euclid 6 9).

Exercise 4.12. $\left[\star \right]$Provide a convincing argument why the evaluation of (gcd-euclid a b) will always finish when the inputs are both positive integers.

Exercise 4.13. Provide an alternate definition of factorial that is tail recursive. To be tail recursive, the expression containing the recursive application cannot be part of another application expression. (Hint: define a factorial-helper procedure that takes an extra parameter, and then define factorial as (define (factorial n) (factorial-helper n 1)).)

Exercise 4.14. Provide a tail recursive definition of find-maximum.

Exercise 4.15. $\left[\star \star \right]$Provide a convincing argument why it is possible to transform any recursive procedure into an equivalent procedure that is tail recursive.

Exploration 4.1: Square Roots

One of the earliest known algorithms is a method for computing square roots. It is known as Heron’s method after the Greek mathematician Heron of Alexandria who lived in the first century AD who described the method, although it was also known to the Babylonians many centuries earlier. Isaac Newton developed a more general method for estimating functions based on their derivatives known as Netwon’s method, of which Heron’s method is a specialization.

Square root is a mathematical function that take a number, $a$, as input and outputs a value $x$ such that $x^2 = a$. For many numbers (including 2), the square root is irrational, so the best we can hope for with is a good approximation. We define a procedure find-sqrt that takes the target number as input and outputs an approximation for its square root.

Heron’s method works by starting with an arbitrary guess, $g_0$. Then, with each iteration, compute a new guess ($g_ n$ is the $n^{th}$ guess) that is a function of the previous guess ($g_{n - 1}$) and the target number ($a$):

[ g_ n = \frac{g_{n - 1} + \frac{a}{g_{n - 1}}}{2} ]

As $n$ increases $g_ n$ gets closer and closer to the square root of $a$.

The definition is recursive since we compute $g_ n$ as a function of $g_{n-1}$, so we can define a recursive procedure that computes Heron’s method. First, we define a procedure for computing the next guess from the previous guess and the target:

(define (heron-next-guess a g) (/ (+ g (/ a g)) 2))

Heron of Alexandria

Next, we define a recursive procedure to compute the $n^{th}$ guess using Heron’s method. It takes three inputs: the target number, $a$, the number of guesses to make, $n$, and the value of the first guess, $g$.

(define (heron-method a n g)
        (if (= n 0)
        g
        (heron-method a (- n 1) (heron-next-guess a g))))

To start, we need a value for the first guess. The choice doesn’t really matter — the method works with any starting guess (but will reach a closer estimate quicker if the starting guess is good). We will use 1 as our starting guess. So, we can define a find-sqrt procedure that takes two inputs, the target number and the number of guesses to make, and outputs an approximation of the square root of the target number.

(define (find-sqrt a guesses)
        (heron-method a guesses 1))

Heron’s method converges to a good estimate very quickly:

Scheme output
> (square (find-sqrt 2 0))
1  
> (square (find-sqrt 2 1))  
2 1/4  
> (square (find-sqrt 2 2))  
2 1/144  
> (square (find-sqrt 2 4))  
2 1/221682772224
> (exact-> inexact (find-sqrt 2 5))  
1.4142135623730951  

The actual square root of 2 is $1.414213562373095048\ldots $, so our estimate is correct to 16 digits after only five guesses.

Users of square roots don’t really care about the method used to find the square root (or how many guesses are used). Instead, what is important to a square root user is how close the estimate is to the actual value. Can we change our find-sqrt procedure so that instead of taking the number of guesses to make as its second input it takes a minimum tolerance value?

Since we don’t know the actual square root value (otherwise, of course, we could just return that), we need to measure tolerance as how close the square of the approximation is to the target number. Hence, we can stop when the square of the guess is close enough to the target value.

(define (close-enough? a tolerance g)
        ( < = (abs (- a (square g)))
tolerance))

The stopping condition for the recursive definition is now when the guess is close enough. Otherwise, our definitions are the same as before.

(define (heron-method-tolerance a tolerance g)
        (if (close-enough? a tolerance g)
                g
                (heron-method-tolerance a tolerance (heron-next-guess a g))))
               
(define (find-sqrt-approx a tolerance)
        (heron-method-tolerance a tolerance 1))

Note that the value passed in as tolerance does not change with each recursive call. We are making the problem smaller by making each successive guess closer to the required answer.

Here are some example interactions with find-sqrt-approx:

Scheme output
> (exact-> inexact (square (find-sqrt-approx 2 0.01)))  
2.0069444444444446  
> (exact-> inexact (square (find-sqrt-approx 2 0.0000001)))  
2.000000000004511
  1. How accurate is the built-in sqrt procedure?
  2. Can you produce more accurate square roots than the built-in sqrt procedure?
  3. Why doesn’t the built-in procedure do better?

  1. Curious readers should try entering this definition into a Scheme interpreter and evaluating (f 0). If you get tired of waiting for an output, in DrRacket you can click the Stop button in the upper right corner to interrupt the evaluation. 

  2. DrRacket provides a built-in procedure gcd that computes the greatest common divisor. We name our procedure gcd-euclid to avoid a clash with the build-in procedure.