# Merge sort (Oz)

Other implementations: ACL2 | C | C++ | dc | Eiffel | Erlang | Haskell | Java | JavaScript | Lisp | OCaml | Oz | Perl | Prolog | Python | Ruby | Scheme

This article showcases two implementations of the merge sort algorithm in the programming language Oz. Both implementations are generic: the sorting predicate is user-specified; use `Value.'<='` to provide the usual stable sorting of numbers.

## Implementation

Mergesort applies a predicate to a list of items that can be compared using that predicate. The predicate defines the sorted order produced by the mergesort algorithm. For a simple list (one element or empty), we can just return a duplicate of the current list. For longer lists, we split the list into two halves, recurse on each half, then merge the two halves in accordance with the predicate.

### Mergesort using built-in functions

Oz provides a number of list-processing procedures that make the mergesort algorithm very easy to implement: `List.length` determines the length of a list; `List.takeDrop` splits a list into two smaller lists at a defined point; `List.merge` merges two sorted lists together in accordance with a sorting predicate. Using these functions, mergesort can be expressed as:

```<<mergesort>>=
fun {Mergesort Xs P}
case Xs
of nil then nil
[] [X] then [X]
else
Len Ys Zs
in
{List.length Xs Len}
{List.takeDrop Xs (Len div 2) Ys Zs}  % split the list in two
{List.merge {Mergesort Ys P} {Mergesort Zs P} P}
end
end
```

### Mergesort from first principles

The implementation using built-in functions is quick and easy, but doesn't provide much insight into how the mergesort actually works. Let's now create a mergesort implementation built from custom `merge` and `split` functions. The signature of this function is identical to our other mergesort function:

```<<custom mergesort>>=
fun {MergesortCustom Xs P}
custom mergesort body
end
```

However the body of the custom mergesort is a little different. To begin with, it includes locally defined split and merge functions. It also contains a locally defined mergesort function, which operates in a slightly different way than the mergesort constructed from built-in functions:

```<<custom mergesort body>>=
split
merge
fun {Mergesort Xs}
case Xs
of nil then nil
[] [X] then [X]
else
Ys # Zs = {Split Xs}
in
{Merge {Mergesort Ys} {Mergesort Zs}}
end
end
in
{Mergesort Xs}
```

The differences in the locally defined mergesort are a result of two important facts:

1. The local `Split` function is specifically designed to split a list in half, regardless of length, so there is no need to find the length of the list first.
2. The local `Merge` function already has the sorting predicate `P` in its scope, so there is no need to pass the predicate as an argument.

#### `Split`

The workhorse for splitting a list is the `SplitRec` recursive function. This method takes two lists together with a work-in-progress result. In each recursion, two elements are eliminated off the first list, and one element transferred from the second list to the result list. Once the first list is exhausted, we return from the recursion. At this point, we must be n / 2 recursions deep -- where n is the length of the original list. As such, the second list will contain the last nn / 2 elements from the original list and the working list will contain the first n / 2 elements that were eliminated from the second list. This pair of lists is the ultimate return value. Due to the way the working list is built, the elements it contains are in reverse order. To ensure a stable sort, the order of the working list is reversed (to restore the original list order) before it is returned. The `Split` function simply kicks off this recursive chain.

```<<split>>=
fun {Split Xs}
fun {SplitRec Xs Ys Ws}
case Xs
of nil then {List.reverse Ws} # Ys
[] [_] then {List.reverse Ws} # Ys
[] _ | _ | Xss then
Y1 | Yss = Ys
in
{SplitRec Xss Yss (Y1 | Ws)}
end
end
in {SplitRec Xs Xs nil} end
```

A more intuitive way of splitting the list might be to put all odd-positioned elements in one sublist, and the rest in another. For example, we might have written `Split` as

```fun {Split Xs}
case Xs
of nil then nil # nil
[] [X] then [X] # nil
[] X | Y | Zs then
Xs # Ys = {Split Zs}
in
( X | Xs ) # ( Y | Ys )
end
end
```

This, however, is not stable: a list that is already partially-ordered according to the predicate might be rearranged. For example, given a binary predicate string-length<= that returns `true` iff the first argument is not longer than the second, and the list `["aaa" "bbb" "ccc"]`, you get `["aaa" "ccc" "bbb"]` back. The `SplitRec` function avoids this problem by using a copy of the list, which is consumed two elements at a time, as a counter to find the midpoint of the list.

#### `Merge`

Merging is the heart of the algorithm and operates by interleaving the elements of two ordered lists in such a way that the combined list is ordered. This process takes only linear (O(n)) time. The `Merge` function takes two lists. If either list is empty, we return a duplicate of the other as the merged result. Otherwise, the lead elements of each list are compared using the sorting predicate, `P`. A true result from the predicate indicates that the first element should precede the second in sorting order. The "winning" element is removed from its list and we recurse on the remainder of the lists. The result, prepended by the removed element, is returned:

```<<merge>>=
fun {Merge Xs Ys}
case Xs # Ys
of nil # Ys then Ys
[] Xs # nil then Xs
[] ( X | Xss ) # ( Y | Yss) then
if {P X Y} then
X | {Merge Xss Ys}
else
Y | {Merge Xs Yss}
end
end
end
```

### Packaging

Both of the mergesort functions are packaged inside a functor, which is an Oz module. In this case, the functor exports both merge functions, making them accessible to other Oz modules.

```<<mergesort.oz>>=
functor
export
Mergesort
MergesortCustom
define
mergesort
custom mergesort
end
```

## Testing

We provide a simple test harness for the mergesort implementations. The heart of this test harness is the `TestSort` procedure. This procedure accepts a sorting function, an ordering function, a testname, a stimulus, and an expected response as parameters. It applies the sorting function to the stimulus using the provided ordering function, and checks to see whether the actual result matches the expected one. The results of the test are then printed to the console.

```<<TestSort>>=
proc {TestSort F Order Testname S R}
Result
in
if
R == {F S Order}
then
Result = "passed"
else
Result =  "failed"
end
{System.showInfo
{System.printName F} # " - " # Result # " - " # Testname # "."}
end
```

The test cases provided as part of the test harness are fairly straightforward. One of the tests verifies that the sorting function actually returns a sorted output.

```<<TestNumericSort>>=
proc {TestNumericSort F}
Stimulus = [3 1 4 1 5 9 2 6 5 3 5 8 9 7 9 3]
Response = [1 1 2 3 3 3 4 5 5 5 6 7 8 9 9 9]
in
{TestSort F Value.'=<' "numeric sort" Stimulus Response}
end
```

The second test verifies that the sorting function is stable, i.e. that it does not alter the order of a list that should not be permuted under the provided ordering function. In this case, the test is performed using a list of equal-length strings as the stimulus, and an anonymous function that defines an ordering in terms of string lengths. The expected result is that the stimulus list is not permuted.

```<<TestStableSort>>=
proc {TestStableSort F}
Stimulus = ["bbb" "aaa" "ccc" "zzz" "fff"]
Response = ["bbb" "aaa" "ccc" "zzz" "fff"]
in
{TestSort F fun {\$ X Y} {List.length X}=<{List.length Y} end
"stable sort" Stimulus Response}
end
```

The complete test harness loads the necessary support modules, and then applies the two test procedures to both the `Mergesort` and `MergesortCustom` sorting functions. Note that imported functions are invoked using names that have a lower case first letter (and a prefixed module name), despite the fact that they are defined with an upper case letter at the start of their names.

```<<mergesort-tests.oz>>=
functor
import
Application
System
Mergesort
define
TestSort
TestNumericSort
TestStableSort
in
{TestNumericSort Mergesort.mergesort}
{TestStableSort Mergesort.mergesort}
{TestNumericSort Mergesort.mergesortCustom}
{TestStableSort Mergesort.mergesortCustom}
{Application.exit 0}
end
```

### Building and running

If you have installed the Mozart Programming System, then the mergesort implementations and test harness can be compiled at the command line by issuing the following commands:

```ozc -c mergesort.oz
ozc -x mergesort-tests.oz
```

Running `mergesort-tests` should produce the following output:

```Mergesort - passed - numeric sort.
Mergesort - passed - stable sort.
MergesortCustom - passed - numeric sort.
MergesortCustom - passed - stable sort.
```