Although building and extracting elements from trees allows us to sort with expected time in $\Theta(n\log n)$, the constant factors involved in building all those trees and extracting the elements from the final tree are high.
In fact, we can use the same approach to sort without needing to build trees. Instead, we keep the two sides of the tree as separate lists, and sort them recursively. The key is to divide the list into halves by value, instead of by position. The values in the first half of the list are all less than the values in the second half of the list, so the lists can be sorted separately.
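For example, splitting by value around the first element of a list looks like this (a sketch using Racket's built-in filter procedure for brevity; the text's list-filter behaves the same way):

```scheme
;; Split the list (5 2 8 1 9 3) by value around its first element, 5:
;; every remaining element goes into the "below" or the "not below" sublist.
(define p (list 5 2 8 1 9 3))
(filter (lambda (el) (< el (car p))) (cdr p))        ; => (2 1 3)
(filter (lambda (el) (not (< el (car p)))) (cdr p))  ; => (8 9)
```

Each sublist can then be sorted independently, and the results joined with the dividing element in the middle.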
The list-quicksort procedure uses list-filter (from Example 5.5) to divide the input list into sublists containing elements below and above the comparison element, and then recursively sorts those sublists:

(define (list-quicksort cf p)
  (if (null? p) null
      (list-append
       (list-quicksort cf
        (list-filter (lambda (el) (cf el (car p))) (cdr p)))
       (cons (car p)
             (list-quicksort cf
              (list-filter (lambda (el) (not (cf el (car p)))) (cdr p)))))))
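A self-contained version can be tried directly; the definitions of list-filter and list-append assumed from earlier chapters are repeated here so the sketch runs on its own (in Racket, where null is the empty list):

```scheme
;; list-filter and list-append as assumed from earlier examples in the text,
;; repeated so this sketch is self-contained.
(define (list-filter test p)
  (if (null? p) null
      (if (test (car p))
          (cons (car p) (list-filter test (cdr p)))
          (list-filter test (cdr p)))))

(define (list-append p q)
  (if (null? p) q
      (cons (car p) (list-append (cdr p) q))))

;; Sort by partitioning around the first element, recursing on each side.
(define (list-quicksort cf p)
  (if (null? p) null
      (list-append
       (list-quicksort cf
        (list-filter (lambda (el) (cf el (car p))) (cdr p)))
       (cons (car p)
             (list-quicksort cf
              (list-filter (lambda (el) (not (cf el (car p)))) (cdr p)))))))

(list-quicksort < (list 5 2 8 1 9 3))   ; => (1 2 3 5 8 9)
```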
This is the famous quicksort algorithm that was invented by Sir C. A. R. (Tony) Hoare while he was an exchange student at Moscow State University in 1959. He was there to study probability theory, but also got a job working on a project to translate Russian into English. The translation depended on looking up words in a dictionary. Since the dictionary was stored on a magnetic tape which could be read in order faster than if it was necessary to jump around, the translation could be done more quickly if the words to translate were sorted alphabetically. Hoare invented the quicksort algorithm for this purpose and it remains the most widely used sorting algorithm.
As with list-sort-tree-insert, the expected running time for a randomly arranged list is in $\Theta(n\log n)$ and the worst case running time is in $\Theta(n^2)$. In the expected case, each recursive call halves the size of the input list (since if the list is randomly arranged we expect about half of the list elements to be below the value of the first element), so the expected depth of the recursion is approximately $\log n$.
Each call involves an application of list-filter, which has running time in $\Theta(m)$ where $m$ is the length of its input list. At each depth $d$ of the recursion, the total length of the inputs to all the calls to list-filter is $n$, since the original list has been subdivided into $2^d$ sublists which together include all of the elements in the original list. Hence, the total running time is in $\Theta(n\log n)$ in the expected case where the input list is randomly arranged. As with list-sort-tree-insert, if the input list is not randomly arranged it is possible that all elements end up in the same partition. Hence, the worst case running time of list-quicksort is still in $\Theta(n^2)$.
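The two cases can be summarized as recurrences (a sketch, writing $T(n)$ for the running time on a list of length $n$; the expected case assumes each call splits the list approximately in half, and the worst case assumes one partition is always empty):

```latex
\text{expected case: } T(n) = 2\,T(n/2) + \Theta(n) \;\Longrightarrow\; T(n) \in \Theta(n \log n)
\qquad
\text{worst case: } T(n) = T(n-1) + \Theta(n) \;\Longrightarrow\; T(n) \in \Theta(n^2)
```

For instance, an already-sorted input with $<$ as the comparison function produces the worst case: every element of the rest of the list falls into the second partition.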
Exercise 8.10. Estimate the time it would take to sort a list of one million elements using list-quicksort.
Exercise 8.11. Both the list-quicksort and list-sort-tree-insert procedures have expected running times in $\Theta(n\log n)$. Experimentally compare their actual running times.
Exercise 8.12. What is the best case input for list-quicksort? Analyze the asymptotic running time for list-quicksort on best case inputs.
Exercise 8.13. $\left[\star \right]$ Instead of using binary trees, we could use ternary trees. A node in a ternary tree has two elements, a left element and a right element, where the left element must be before the right element according to the comparison function. Each node has three subtrees:
left, containing elements before the left element;
middle, containing elements between the left and right elements; and
right, containing elements after the right element. Is it possible to sort faster using ternary trees?