7.1 Empirical measurements
We can measure the cost of evaluating a given expression empirically. If
we are primarily concerned with time, we could just use a stopwatch to
measure the evaluation time. For more accurate results, we use the
builtin (time Expression)
special form.^{1} Evaluating
(time Expression)
produces the value of the input expression, but also prints
out the time required to evaluate the expression (shown in our examples
using slanted font). It prints out three time values:
The time in milliseconds the processor ran to evaluate the expression. CPU is an abbreviation for “central processing unit”, the computer’s main processor.
The actual time in milliseconds it took to evaluate the expression. Since other processes may be running on the computer while this expression is evaluated, the real time may be longer than the CPU time, which only counts the time the processor was working on evaluating this expression.
The time in milliseconds the interpreter spent on garbage collection to evaluate the expression. Garbage collection is used to reclaim memory that is storing data that will never be used again.
For example, using the definitions from Chapter 5,
(makeposition 1 1))))
prints: cpu time: 141797 real time: 152063 gc time: 765. The real time is 152 seconds, meaning this evaluation took just over two and a half minutes. Of this time, the evaluation was using the CPU for 142 seconds, and the garbage collector ran for less than one second.
Here are two more examples:
> (time (car (listappend (intsto 1000) (intsto 100))))
cpu time: 531 real time: 531 gc time: 62
1
> (time (car (listappend (intsto 1000) (intsto 100))))
cpu time: 609 real time: 609 gc time: 0
1
The two expressions evaluated are identical, but the reported time varies. Even on the same computer, the time needed to evaluate the same expression varies. Many properties unrelated to our expression (such as where things happen to be stored in memory) impact the actual time needed for any particular evaluation. Hence, it is dangerous to draw conclusions about which procedure is faster based on a few timings.
Another limitation of this way of measuring cost is it only works if we wait for the evaluation to complete. If we try an evaluation and it has not finished after an hour, say, we have no idea if the actual time to finish the evaluation is sixtyone minutes or a quintillion years. We could wait another minute, but if it still hasn’t finished we don’t know if the execution time is sixtytwo minutes or a quintillion years. The techniques we develop allow us to predict the time an evaluation needs without waiting for it to execute.
There’s no sense in being precise when you don’t even know what you’re
talking about.
John von Neumann
Finally, measuring the time of a particular application of a procedure does not provide much insight into how long it will take to apply the procedure to different inputs. We would like to understand how the evaluation time scales with the size of the inputs so we can understand which inputs the procedure can sensibly be applied to, and can choose the best procedure to use for different situations. The next section introduces mathematical tools that are helpful for capturing how cost scales with input size.
Exercise 7.1. Suppose you are defining a procedure that needs to
append two lists, one short list, short
and one very long list,
long
, but the order of elements in the resulting list does not matter.
Is it better to use (listappend short long)
or (listappend long
short)
? (A good answer will involve both experimental results and an
analytical explanation.)
Filius Bonacci was an Italian monk and mathematician in the 12th century. He published a book, Liber Abbaci, on how to calculate with decimal numbers that introduced HinduArabic numbers to Europe (replacing Roman numbers) along with many of the algorithms for doing arithmetic we learn in elementary school. It also included the problem for which Fibonacci numbers are named:^{2}
 A pair of newlyborn male and female rabbits are put in a field. Rabbits mate at the age of one month and after that procreate every month, so the female rabbit produces a new pair of rabbits at the end of its second month. Assume rabbits never die and that each female rabbit produces one new pair (one male, one female) every month from her second month on. How many pairs will there be in one year?
We can define a function that gives the number of pairs of rabbits at the beginning of the $n^{th}$ month as:
The third case follows from Bonacci’s assumptions: all the rabbits alive at the beginning of the previous month are still alive (the $\textrm{Fibonacci}(n1)$ term), and all the rabbits that are at least two months old reproduce (the $\textrm{Fibonacci}(n2)$ term).
The sequence produced is known as the Fibonacci sequence:
[ 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, \ldots ]
After the first two 1s, each number in the sequence is the sum of the previous two numbers. Fibonacci numbers occur frequently in nature, such as the arrangement of florets in the sunflower (34 spirals in one direction and 55 in the other) or the number of petals in common plants (typically 1, 2, 3, 5, 8, 13, 21, or 34), hence the rarity of the fourleaf clover.
Translating the definition of the Fibonacci function into a Scheme
procedure is straightforward; we combine the two base cases using the
or
special form:
(if (or (= n 1) (= n 2)) 1
(+ (fibo ( n 1)) (fibo ( n 2)))))
Applying fibo
to small inputs works fine:
> (time (fibo 10))
cpu time: 0 real time: 0 gc time: 0
55
> (time (fibo 30))
cpu time: 2156 real time: 2187 gc time: 0
832040
But when we try to determine the number of rabbits in five years by
computing (fibo 60)
, our interpreter just hangs without producing a
value. The fibo
procedure is defined in a way that guarantees it
eventually completes when applied to a nonnegative whole number: each
recursive call reduces the input by 1
or 2
, so both recursive calls
get closer to the base case. Hence, we always make progress and must
eventually reach the base case, unwind the recursive applications, and
produce a value. To understand why the evaluation of (fibo 60)
did not
finish in our interpreter, we need to consider how much work is required
to evaluate the expression.
To evaluate (fibo 60)
, the interpreter follows the if expressions to
the recursive case, where it needs to evaluate (+ (fibo 59) (fibo
58))
. To evaluate (fibo 59)
, it needs to evaluate (fibo 58)
again
and also evaluate (fibo 57)
. To evaluate (fibo 58)
(which needs to
be done twice), it needs to evaluate (fibo 57)
and (fibo 56)
. So,
there is one evaluation of (fibo 60)
, one evaluation of (fibo 59)
,
two evaluations of (fibo 58)
, and three evaluations of (fibo 57)
.
The total number of evaluations of the fibo
procedure for each input
is itself the Fibonacci sequence! To understand why, consider the
evaluation tree for (fibo 4)
shown in
Figure 7.1. The only direct number
values are the 1
values that result from evaluations of either (fibo
1)
or (fibo 2)
. Hence, the number of 1
values must be the value of
the final result, which just sums all these numbers. For (fibo 4)
,
there are 5 leaf applications, and 3 more inner applications, for 8 (=
$\textrm{Fibonacci}(5)$) total recursive applications. The
number of evaluations of applications of fibo
needed to evaluate
(fibo 60)
is the 61st Fibonacci number — 2,504,730,781,961 — over two
and a half trillion applications of fibo
!
Although our fibo
definition is correct, it is ridiculously
inefficient and only finishes for input numbers below about 40. It
involves a tremendous amount of duplicated work: for the (fibo 60)
example, there are two evaluations of (fibo 58)
and over a trillion
evaluations of (fibo 1)
and (fibo 2)
.
We can avoid this duplicated effort by building up to the answer starting from the base cases. This is more like the way a human would determine the numbers in the Fibonacci sequence: we find the next number by adding the previous two numbers, and stop once we have reached the number we want.
The fastfibo
procedure computes the $n^{th}$ Fibonacci number, but
avoids the duplicate effort by computing the results building up from
the first two Fibonacci numbers, instead of working backwards.
(define (fiboiter a b left)
(if ( < = left 0) b
(fiboiter b (+ a b) ( left 1))))
(fiboiter 1 1 ( n 2)))
This is a form of what is known as dynamic programming. The definition
is still recursive, but unlike the original definition the problem is
broken down differently. Instead of breaking the problem down into a
slightly smaller instance of the original problem, with dynamic
programming we build up from the base case to the desired solution. In
the case of Fibonacci, the fastfibo
procedure builds up from the two
base cases until reaching the desired answer. The additional complexity
is we need to keep track of when to stop; we do this using the left
parameter.
The helper procedure, fiboiter
(short for iteration), takes three
parameters: a
is the value of the previousprevious Fibonacci number,
b
is the value of the previous Fibonacci number, and left
is the
number of iterations needed before reaching the target. The initial call
to fiboiter
passes in 1
as a
(the value of
$\textrm{Fibonacci}(1)$), and 1
as b
(the value of
$\textrm{Fibonacci}(2)$), and ( n 2)
as left
(we have
$n  2$ more iterations to do to reach the target, since the first two
Fibonacci numbers were passed in as a
and b
we are now working on
$\textrm{Fibonacci}(2)$). Each recursive call to fiboiter
reduces the value passed in as left
by one, and advances the values of
a
and b
to the next numbers in the Fibonacci sequence.
The fastfibo
procedure produces the same output values as the
original fibo
procedure, but requires far less work to do so. The
number of applications of fiboiter
needed to evaluate (fastfibo
60)
is now only 59. The value passed in as left
for the first
application of fiboiter
is 58, and each recursive call reduces the
value of left
by one until the zero case is reached. This allows us to
compute the expected number of rabbits in 5 years is 1548008755920
(over 1.5 Trillion)^{3}.

The time construct must be a special form, since the expression is not evaluated before entering time as it would be with the normal application rule. If it were evaluated normally, there would be no way to time how long it takes to evaluate, since it would have already been evaluated before time is applied. ↩

Although the sequence is named for Bonacci, it was probably not invented by him. The sequence was already known to Indian mathematicians with whom Bonacci studied. ↩

Perhaps Bonacci’s assumptions are not a good model for actual rabbit procreation. This result suggests that in about 10 years the mass of all the rabbits produced from the initial pair will exceed the mass of the Earth, which, although scary, seems unlikely! ↩