Quicksort (Sed)

From LiteratePrograms

Jump to: navigation, search
Other implementations: AWK | C | C++ | Eiffel | Erlang | Forth | Haskell | Java | JavaScript | Mathematica | Mercury | Oz | Python | Python, arrays | Scala | Sed | Standard ML | Visual Basic .NET | XProc

Contents

theory

Perform a Quicksort:

   q(x/xs) = q(as) x q(zs)

practice

A bog-standard quicksort: partition around a pivot. This is a recursive algorithm, but the state of the recursion is encoded in the data structure, so not only don't we need a code stack, we won't have any explicit code to recurse -- we run until convergence instead.

(to animate the algorithm, uncomment the l command)

<<quicksort>>=
: part
partition
: sort
# l
pivot

First we look for pivots. At this point the pattern space consists of bracketed {input} chunks, interspersed with already sorted output.

To sort a nontrivial chunk, we choose a pivot digit from the front and add a lookup table
so we can invoke the partitioning code.

<<pivot>>=
except for trivial chunks
s/{\(.\)\([^}]*\)}/{\2 \1 9876543210 }/
t part

Of course, every recursion has to bottom out somewhere, and for us that is at empty chunks.

<<except for trivial chunks>>=
s/{}//
t sort

the crux

Actually partitioning the unsorted digit string is the crux move. We use the technique of Greg Ubben's lookup table tutorial to pick out the elements of the lower partition which are greater than the pivot and move them to the upper partition.

<<partition>>=
#   +-1: input before element
#   |         +-2: element to move
#   |         |    +-3: remainder of input
#   |         |    |          +-4: pivot
#   |         |    |          |     +-5: table (codes \2 vs. \4 ordering)
#   |         |    |          |     | 
s/{\([0-9]*\)\(.\)\([0-9]*\) \(.\) \([0-9]*\2[0-9]*\4[0-9]*\) /{\1\3 \4 \5 \2/
t part
cleanup

After all that heavy lifting converges, the cleanup is straightforward: rewrite the pattern space into:
so we can look for pivots in the subchunks.

<<cleanup>>=
s/{\([^ ]*\) \(.\) \([^ ]*\) \([^}]*\)}/{\1}\2{\4}/
b sort

wrapping up

Finally, we arrange for a rudimentary input: find a string of digits in the input line, and turn it into a {chunk} for sorting.

<<quicksort.sed>>=
#!/usr/bin/sed -f
s/[^0-9]*\([0-9]*\).*/{\1}/
b sort
quicksort

Let's try a simple sanity check:

> echo "31415926535897931" | sed -f quicksort.sed
11123334555678999
Download code
Views