# 8.1.1 Best-First Sort

A simple sorting strategy is to find the best element in the list and put that at the front. The best element is an element for which the comparison procedure evaluates to true when applied to that element and every other element. For example, if the comparison function is $<$, the best element is the lowest number in the list. This element belongs at the front of the output list.

The notion of the best element in the list for a given comparison function only makes sense if the comparison function is transitive. This means it has the property that for any inputs a, b, and c, if (cf a b) and (cf b c) are both true, the result of (cf a c) must be true. The $<$ function is transitive: $a$$<$$b$ and $b$$<$$c$ implies $a$$<$$c$ for all numbers $a$, $b$, and $c$. If the comparison function does not have this property, there may be no way to arrange the elements in a single sorted list. All of our sorting procedures require that the procedure passed as the comparison function is transitive.

Once we can find the best element in a given list, we can sort the whole list by repeatedly finding the best element of the remaining elements until no more elements remain. To define our best-first sorting procedure, we first define a procedure for finding the best element in the list, and then define a procedure for removing an element from a list.

Finding the Best. The best element in the list is either the first element, or the best element from the rest of the list. Hence, we define list-find-best recursively. An empty list has no best element, so the base case is for a list that has one element. When the input list has only one element, that element is the best element. If the list has more than one element, the best element is the better of the first element in the list and the best element of the rest of the list.

To pick the better element from two elements, we define the pick-better procedure that takes three inputs: a comparison function and two values.

(define (pick-better cf p1 p2) (if (cf p1 p2) p1 p2))

Assuming the procedure passed as cf has constant running time, the running time of pick-better is constant. For most of our examples, we use the &lt; procedure as the comparison function. For arbitrary inputs, the running time of &lt; is not constant since in the worst case performing the comparison requires examining every digit in the input numbers. But, if the maximum value of a number in the input list is limited, then we can consider &lt; a constant time procedure since all of the inputs passed to it are below some fixed size.

We use pick-better to define list-find-best:

(define (list-find-best cf p)
(if (null? (cdr p)) (car p)
(pick-better cf (car p) (list-find-best cf (cdr p)))))

We use $n$ to represent the number of elements in the input list p. An application of list-find-best involves $n-1$ recursive applications since each one passes in (cdr p) as the new p operand and the base case stops when the list has one element left. The running time for each application (excluding the recursive application) is constant since it involves only applications of the constant time procedures null?, cdr, and pick-better. So, the total running time for list-find-best is in $\Theta(n)$; it scales linearly with the length of the input list.

Deleting an Element. To implement best first sorting, we need to produce a list that contains all the elements of the original list except for the best element, which will be placed at the front of the output list. We define a procedure, list-delete, that takes as inputs a List and a Value, and produces a List that contains all the elements of the input list in the original order except for the first element that is equal to the input value.

(define (list-delete p el)
(if (null? p) null
(if (equal? (car p) el) (cdr p) ; found match, skip this element
(cons (car p) (list-delete (cdr p) el)))))

We use the equal? procedure to check if the element matches instead of = so the list-delete procedure works on elements that are not just Numbers. The equal? procedure behaves identically to = when both inputs are Numbers, but also works sensibly on many other datatypes including Booleans, Characters, Pairs, Lists, and Strings. Since we assume the sizes of the inputs to equal? are bounded, we can consider equal? to be a constant time procedure (even though it would not be constant time on arbitrary inputs).

The worst case running time for list-delete occurs when no element in the list matches the value of el (in the best case, the first element matches and the running time does not depend on the length of the input list at all). We use $n$ to represent the number of elements in the input list. There can be up to $n$ recursive applications of list-delete. Each application has constant running time since all of the procedures applied (except the recursive call) have constant running times. Hence, the total running time for list-delete is in $\Theta(n)$ where $n$ is the length of the input list.

Best-First Sorting. We define list-sort-best-first using list-find-best and list-delete:

(define (list-sort-best-first cf p)
(if (null? p) null
(cons (list-find-best cf p)
(list-sort-best-first cf (list-delete p (list-find-best cf p))))))

The running time of the list-sort-best-first procedure grows quadratically with the length of the input list. We use $n$ to represent the number of elements in the input list. There are $n$ recursive applications since each application of list-delete produces an output list that is one element shorter than its input list. In addition to the constant time procedures (null? and cons), the body of list-sort-best-first involves two applications of list-find-best on the input list, and one application of list-delete on the input list.

Each of these applications has running time in $\Theta(m)$ where $m$ is the length of the input list to list-find-best and list-delete (we use $m$ here to avoid confusion with $n$, the length of the first list passed into list-sort-best-first). In the first application, this input list will be a list of length $n$, but in later applications it will be involve lists of decreasing length: $n-1$, $n-2$, $\cdots$, 1. Hence, the average length of the input lists to list-find-best and list-delete is approximately $\frac{n}{2}$. Thus, the average running time for each of these applications is in $\Theta(\frac{n}{2})$, which is equivalent to $\Theta(n)$.

There are three applications (two of list-find-best and one of list-delete) for each application of list-sort-best-first, so the total running time for each application is in $\Theta(3n)$, which is equivalent to $\Theta(n)$.

There are $n$ recursive applications, each with average running time in $\Theta(n)$, so the running time for list-sort-best-first is in $\Theta(n^2)$. This means doubling the length of the input list quadruples the expected running time, so we predict that sorting a list of 2000 elements to take approximately four times as long as sorting a list of 1000 elements.

Let expression. Each application of the list-sort-best-first procedure involves two evaluations of (list-find-best cf p), a procedure with running time in $\Theta(n)$ where $n$ is the length of the input list.

The result of both evaluations is the same, so there is no need to evaluate this expression twice. We could just evaluate (list-find-best cf p) once and reuse the result. One way to do this is to introduce a new procedure using a lambda expression and pass in the result of (list-find-best cf p) as a parameter to this procedure so it can be used twice:

(define (list-sort-best-first-nodup cf p)
(if (null? p) null
((lambda (best)
(cons best (list-sort-best-first-nodup cf (list-delete p best))))
(list-find-best cf p))))

This procedure avoids the duplicate evaluation of (list-find-best cf p), but is quite awkward to read and understand.

Scheme provides the let expression special form to avoid this type of duplicate work more elegantly. The grammar for the let expression is:

The evaluation rule for the let expression is:

Evaluation Rule 6: Let expression. To evaluate a let expression, evaluate each binding in order. To evaluate each binding, evaluate the binding expression and bind the name to the value of that expression. Then, the value of the let expression is the value of the body expression evaluated with the names in the expression that match binding names substituted with their bound values.

A let expression can be transformed into an equivalent application expression. The let expression

(let ((\Namea \Expressiona) (\Nameb \Expressionb)
\cdots (\Namek \Expressionk))
\ExpressionBody)

is equivalent to the application expression:

((lambda (\Namea \Nameb \ldots \Namek) \ExpressionBody)
\Expressiona \Expressionb \ldots \Expressionk)

The advantage of the let expression syntax is it puts the expressions next to the names to which they are bound. Using a let expression, we define list-sort-best-first-let to avoid the duplicate evaluations:

(define (list-sort-best-first-let cf p)
(if (null? p) null
(let ((best (list-find-best cf p)))
(cons best (list-sort-best-first-let cf (list-delete p best))))))

This runs faster than list-sort-best-first since it avoids the duplicate evaluations, but the asymptotic asymptotic running time is still in $\Theta(n^2)$: there are $n$ recursive applications of list-sort-best-first-let and each application involves linear time applications of list-find-best and list-delete. Using the let expression improves the actual running time by avoiding the duplicate work, but does not impact the asymptotic growth rate since the duplicate work is hidden in the constant factor.

Exercise 8.1. What is the best case input for list-sort-best-first? What is its asymptotic running time on the best case input?

Exercise 8.2. Use the time special form (Section 7.1) to experimentally measure the evaluation times for the list-sort-best-first-let procedure. Do the results match the expected running times based on the $\Theta(n^2)$ asymptotic running time?

You may find it helpful to define a procedure that constructs a list containing $n$ random elements. To generate the random elements use the built-in procedure random that takes one number as input and evaluates to a pseudorandom number between 0 and one less than the value of the input number. Be careful in your time measurements that you do not include the time required to generate the input list.

Exercise 8.3. Define the list-find-best procedure using the list-accumulate procedure from Section 5.4.2 and evaluate its asymptotic running time.

Exercise 8.4. $\left[\star \right]$ Define and analyze a list-sort-worst-last procedure that sorts by finding the worst element first and putting it at the end of the list.