Merge sort (Oz)
From LiteratePrograms
- Other implementations: ACL2 | C | C++ | dc | Eiffel | Erlang | Haskell | Java | JavaScript | Lisp | OCaml | Oz | Perl | Prolog | Python | Ruby | Scheme
This article showcases two implementations of the merge sort algorithm in the programming language Oz. Both implementations are generic: the sorting predicate is user-specified; use Value.'<='
to provide the usual stable sorting of numbers.
Contents |
Implementation
Mergesort applies a predicate to a list of items that can be compared using that predicate. The predicate defines the sorted order produced by the mergesort algorithm. For a simple list (one element or empty), we can just return a duplicate of the current list. For longer lists, we split the list into two halves, recurse on each half, then merge the two halves in accordance with the predicate.
Mergesort using built-in functions
Oz provides a number of list-processing procedures that make the mergesort algorithm very easy to implement: List.length
determines the length of a list; List.takeDrop
splits a list into two smaller lists at a defined point; List.merge
merges two sorted lists together in accordance with a sorting predicate. Using these functions, mergesort can be expressed as:
<<mergesort>>= fun {Mergesort Xs P} case Xs of nil then nil [] [X] then [X] else Len Ys Zs in {List.length Xs Len} {List.takeDrop Xs (Len div 2) Ys Zs} % split the list in two {List.merge {Mergesort Ys P} {Mergesort Zs P} P} end end
Mergesort from first principles
The implementation using built-in functions is quick and easy, but doesn't provide much insight into how the mergesort actually works. Let's now create a mergesort implementation built from custom merge
and split
functions. The signature of this function is identical to our other mergesort function:
<<custom mergesort>>= fun {MergesortCustom Xs P} custom mergesort body end
However the body of the custom mergesort is a little different. To begin with, it includes locally defined split and merge functions. It also contains a locally defined mergesort function, which operates in a slightly different way than the mergesort constructed from built-in functions:
<<custom mergesort body>>= split merge fun {Mergesort Xs} case Xs of nil then nil [] [X] then [X] else Ys # Zs = {Split Xs} in {Merge {Mergesort Ys} {Mergesort Zs}} end end in {Mergesort Xs}
The differences in the locally defined mergesort are a result of two important facts:
- The local
Split
function is specifically designed to split a list in half, regardless of length, so there is no need to find the length of the list first. - The local
Merge
function already has the sorting predicateP
in its scope, so there is no need to pass the predicate as an argument.
Split
The workhorse for splitting a list is the SplitRec
recursive function. This method takes two lists together with a work-in-progress result. In each recursion, two elements are eliminated off the first list, and one element transferred from the second list to the result list. Once the first list is exhausted, we return from the recursion. At this point, we must be n / 2 recursions deep -- where n is the length of the original list. As such, the second list will contain the last n − n / 2 elements from the original list and the working list will contain the first n / 2 elements that were eliminated from the second list. This pair of lists is the ultimate return value. Due to the way the working list is built, the elements it contains are in reverse order. To ensure a stable sort, the order of the working list is reversed (to restore the original list order) before it is returned. The Split
function simply kicks off this recursive chain.
<<split>>= fun {Split Xs} fun {SplitRec Xs Ys Ws} case Xs of nil then {List.reverse Ws} # Ys [] [_] then {List.reverse Ws} # Ys [] _ | _ | Xss then Y1 | Yss = Ys in {SplitRec Xss Yss (Y1 | Ws)} end end in {SplitRec Xs Xs nil} end
A more intuitive way of splitting the list might be to put all odd-positioned elements in one sublist, and the rest in another. For example, we might have written Split
as
fun {Split Xs} case Xs of nil then nil # nil [] [X] then [X] # nil [] X | Y | Zs then Xs # Ys = {Split Zs} in ( X | Xs ) # ( Y | Ys ) end end
This, however, is not stable: a list that is already partially-ordered according to the predicate might be rearranged. For example, given a binary predicate string-length<= that returns true
iff the first argument is not longer than the second, and the list ["aaa" "bbb" "ccc"]
, you get ["aaa" "ccc" "bbb"]
back. The SplitRec
function avoids this problem by using a copy of the list, which is consumed two elements at a time, as a counter to find the midpoint of the list.
Merge
Merging is the heart of the algorithm and operates by interleaving the elements of two ordered lists in such a way that the combined list is ordered. This process takes only linear (O(n)) time. The Merge
function takes two lists. If either list is empty, we return a duplicate of the other as the merged result. Otherwise, the lead elements of each list are compared using the sorting predicate, P
. A true result from the predicate indicates that the first element should precede the second in sorting order. The "winning" element is removed from its list and we recurse on the remainder of the lists. The result, prepended by the removed element, is returned:
<<merge>>= fun {Merge Xs Ys} case Xs # Ys of nil # Ys then Ys [] Xs # nil then Xs [] ( X | Xss ) # ( Y | Yss) then if {P X Y} then X | {Merge Xss Ys} else Y | {Merge Xs Yss} end end end
Packaging
Both of the mergesort functions are packaged inside a functor, which is an Oz module. In this case, the functor exports both merge functions, making them accessible to other Oz modules.
<<mergesort.oz>>= functor export Mergesort MergesortCustom define mergesort custom mergesort end
Testing
We provide a simple test harness for the mergesort implementations. The heart of this test harness is the TestSort
procedure. This procedure accepts a sorting function, an ordering function, a testname, a stimulus, and an expected response as parameters. It applies the sorting function to the stimulus using the provided ordering function, and checks to see whether the actual result matches the expected one. The results of the test are then printed to the console.
<<TestSort>>= proc {TestSort F Order Testname S R} Result in if R == {F S Order} then Result = "passed" else Result = "failed" end {System.showInfo {System.printName F} # " - " # Result # " - " # Testname # "."} end
The test cases provided as part of the test harness are fairly straightforward. One of the tests verifies that the sorting function actually returns a sorted output.
<<TestNumericSort>>= proc {TestNumericSort F} Stimulus = [3 1 4 1 5 9 2 6 5 3 5 8 9 7 9 3] Response = [1 1 2 3 3 3 4 5 5 5 6 7 8 9 9 9] in {TestSort F Value.'=<' "numeric sort" Stimulus Response} end
The second test verifies that the sorting function is stable, i.e. that it does not alter the order of a list that should not be permuted under the provided ordering function. In this case, the test is performed using a list of equal-length strings as the stimulus, and an anonymous function that defines an ordering in terms of string lengths. The expected result is that the stimulus list is not permuted.
<<TestStableSort>>= proc {TestStableSort F} Stimulus = ["bbb" "aaa" "ccc" "zzz" "fff"] Response = ["bbb" "aaa" "ccc" "zzz" "fff"] in {TestSort F fun {$ X Y} {List.length X}=<{List.length Y} end "stable sort" Stimulus Response} end
The complete test harness loads the necessary support modules, and then applies the two test procedures to both the Mergesort
and MergesortCustom
sorting functions. Note that imported functions are invoked using names that have a lower case first letter (and a prefixed module name), despite the fact that they are defined with an upper case letter at the start of their names.
<<mergesort-tests.oz>>= functor import Application System Mergesort define TestSort TestNumericSort TestStableSort in {TestNumericSort Mergesort.mergesort} {TestStableSort Mergesort.mergesort} {TestNumericSort Mergesort.mergesortCustom} {TestStableSort Mergesort.mergesortCustom} {Application.exit 0} end
Building and running
If you have installed the Mozart Programming System, then the mergesort implementations and test harness can be compiled at the command line by issuing the following commands:
ozc -c mergesort.oz ozc -x mergesort-tests.oz
Running mergesort-tests
should produce the following output:
Mergesort - passed - numeric sort. Mergesort - passed - stable sort. MergesortCustom - passed - numeric sort. MergesortCustom - passed - stable sort.
Download code |