# Quicksort (Python, arrays)

### From LiteratePrograms

**Other implementations**: AWK | C | C++ | Eiffel | Erlang | Forth | Haskell | Java | JavaScript | Mathematica | Mercury | Oz | Python |**Python, arrays**| Scala | Sed | Standard ML | Visual Basic .NET | XProc

Quicksort can be implemented for Python lists (see Quicksort (Python)) but due to the inability to efficiently choose good pivots in a list, it suffers from poor performance on sorted or nearly sorted inputs. An alternative is to use one of the many array implementations available for Python, such as the standard library module array, the Numeric library (now called NumPy), or the numarray library. We will also require a source of random numbers as supplied by either the standard library's random module (based on the excellent Mersenne twister generator) or Numeric's RandomArray extension library.

## Contents |

## In-place partition function

The most efficient partition functions are in-place — they just shuffle around the elements of an array. The most common approach is to take two array indexes, one starting at the beginning and moving up and one starting at the end and moving down, and the values at these indexes are sometimes exchanged. The following function partitions the subarray a[start..end] inclusive about the pivot value at `a[pivotIndex]`

:

<<partition function>>=defpartition(a, start, end, pivotIndex): low = start high = end - 1 # After we remove pivot it will be one smaller pivotValue = a[pivotIndex] remove pivot from arraywhileTrue:whilelow <= highanda[low] < pivotValue: low = low + 1whilelow <= highanda[high] >= pivotValue: high = high - 1iflow > high:breaka[low], a[high] = a[high], a[low] insert pivot into final position and return final position

In works by moving the two indexes until they point to two elements in the wrong order. It then exchanges them using tuple syntax. We move the pivot "out of the way" to ensure that it isn't moved around during the process - this would complicate things. At the end we store it where *low* points, which will be one past the end of the lesser elements, its final position:

<<remove pivot from array>>=a[pivotIndex] = a[end]

<<insert pivot into final position and return final position>>=a[end] = a[low] a[low] = pivotValuereturnlow

Because the element the pivot supplants is sent to the end of the array, it's necessary to exchange it with the first element in the sublist of greater-or-equal elements rather than the last element of the lesser elements.

## Main algorithm

Now that we have partition, the main algorithm is simple to write. We are given an array and inclusive start and end indexes. We select a pivot at random using `random.randint()`

, partition about it, then recursively invoke quicksort on the two partitions.

<<quicksort function>>=defqsortRange(a, start, end):ifend - start + 1 < 32: insertionSort(a, start, end)else: pivotIndex = partition(a, start, end, randint(start, end)) qsortRange(a, start, pivotIndex - 1) qsortRange(a, pivotIndex + 1, end)returna

The actual `qsort`

function just sorts the whole range:

<<quicksort function>>=defqsort(a):returnqsortRange(a, 0, len(a) - 1)

If the list size becomes small, we fall back to insertion sort (see Insertion sort (Python) for more):

<<insertion sort function>>=definsertionSort(a, start, end):foriinxrange(start, end + 1): # Insert a[i] into the sorted sublist v = a[i]forjinreversed(xrange(0, i)):ifa[j] <= v: a[j + 1] = vbreaka[j + 1] = a[j]else: a[0] = vreturna

We add 1 to end because the given range is inclusive, while xrange includes its upper limit. We use the `reversed`

filter to iterate *j* from *i*−1 down to zero. The else after the for executes only if it did not break and handles the case where the value is moved all the way to the beginning.

## Test code

We create this helper function to generate a random array for testing:

<<generate random array function>>=defrandarray(typecode, numElements, minValue, maxValue): a = array(typecode)foriinxrange(0, numElements): a.append(randint(minValue, maxValue))returna

To verify the result, we use this helper:

<<check if array is sorted function>>=defcheckArraySorted(a):foriinxrange(1, len(a) - 1):ifa[i] < a[i-1]:returnFalsereturnTrue

Now we can try it out with something like:

<<test code>>=a = randarray('i', 50, 1, 200)ifcheckArraySorted(qsort(randarray('i', 10000, 0, 999999999))):else:

We see that the result is correctly sorted.

## Files

Finally, we wrap all this up in a Python file for distribution. We need to import the random and array modules.

<<qsort.py>>=fromarrayimport*fromrandomimport* partition function insertion sort function quicksort function generate random array function check if array is sorted function test code

## Performance

Since the simplest implementations in Python are list-based, the question remains whether this implementation outperforms them well enough to justify its additional complexity. The below table compares runtime (including setup) for data sets of various sizes using this algorithm and the simple list-comprehension-based quicksort algorithm from Quicksort (Python). Times were taken on a 2.8 GhZ Pentium 4 machine under Gentoo Linux using the standard Python package; we measured both with and without the -O flag, but the difference was not appreciable.

**Random lists**

List size | 100 | 1000 | 10000 | 50000 | 100000 | 250000 | 500000 | 1000000 |

Array quicksort (s) | 0.022 | 0.038 | 0.23 | 1.15 | 2.415 | 6.52 | 13.0 | 27.0 |

List quicksort (s) | 0.021 | 0.034 | 0.19 | 0.98 | 2.157 | 6.37 | 13.7 | 28.3 |

Surprisingly, the very simple list implementation remains competitive with the array implementation for all list sizes. Next, we compare their behavior on sorted lists of sequential integers, which is a worst-case for the naive list algorithm. To account for the difference in setup time we first generate and throw away a random list the same size as before.

**Sorted lists**

List size | 100 | 1000 | 10000 | 50000 | 100000 | 250000 | 500000 | 1000000 |

Array quicksort (s) | 0.028 | 0.033 | 0.17 | 0.867 | 1.72 | 4.46 | 9.58 | 19.1 |

List quicksort (s) | 0.023 | — | — | — | — | — | — | — |

The simple comprehension-based implementation stack overflowed at 1000 elements. The array-based implementation performed better than either of the algorithms did on random lists. In this commonly occurring case, it's clear that the list-based quicksort is not even an option. It's also interesting to compare this with the results in Quicksort (C), which are several orders of magnitude faster than either of these implementations.

Download code |