Insertion sort (Python, arrays)

From LiteratePrograms

Jump to: navigation, search
Other implementations: ACL2 | C | C, simple | Eiffel | Erlang | Forth | Haskell | Io | Java | Lisp | OCaml | Python | Python, arrays | Ruby | Scala, list | Smalltalk | Standard ML | Visual Basic .NET

This is a simple example of the insertion sort sorting algorithm, using an array package to sort in-place.

theory

Insertion sort, like selection sort, works by maintaing the invariant that, given a division of the list into a prefix and suffix, the prefix is sorted while the suffix remains unsorted. Initially, the suffix is the entire sequence and the empty list is trivially sorted. At each step, we lengthen the prefix and shorten the suffix. Once the suffix is empty, the prefix is the entire sequence, sorted, and we are done.

Insertion sort is a converse of selection sort in that we form the sorted list by inserting an arbitrary element from the unsorted remainder into position in the sorted prefix, instead of appending the minimum element from the unsorted remainder to the sorted prefix.

Image:Insertion_sort.png

practice

By choosing the arbitrary element to be the element immediately after the sorted prefix, we can perform the insertion by rotating the entire subsequence between the insertion point and element's current location.

We could use searchsorted to find the insertion point in the prefix, but instead generate the index by adding up the number of elements that are less than or equal to the element to be inserted.

Question: why is <= preferable to <?

<<insertion point>>=
sum(seq[:i]<=seq[i])

Now the algorithm becomes a straightforward transcription of the definition: for each element in the list, rotate it into place in the prefix.

<<insertion sort>>=
def isort(seq):
    define rotate
    for i in xrange(len(seq)):
        rotate(seq[insertion point:i+1])

wrapping up

Unlike most array processing packages, Numeric Python doesn't offer an array rotation primitive. We perform the rotation by swapping the tail element with the remainder of the list, using the usual 3-step temporary variable idiom.

Question: why can't we use parallel assignment here?

<<define rotate>>=
def rotate(xs): tmp = xs[-1]; xs[1:] = xs[:-1]; xs[0] = tmp

and then package it up with a few short tests, of a random sequence and an already sorted sequence.

<<insertion_sort.py>>=
from numpy import *
insertion sort
if __name__ == "__main__":
        r, s = random.random(12), arange(12)
        isort(r); isort(s)
        print s
        print r
        print alltrue(argsort(r) == arange(len(r)))

These should produce output similar to the following:

[ 0  1  2  3  4  5  6  7  8  9 10 11]
[ 0.02861689  0.10848212  0.30447467  0.31301701  0.35791055  0.37558012
  0.38740565  0.55887859  0.66199803  0.69475146  0.76803227  0.83034475]
True
Download code
Views