Nth element (Python)

From LiteratePrograms

Jump to: navigation, search
Other implementations: Java | Python

Find the nth-largest element in an unsorted list.

theory

The obvious way to find the nth-largest element in a list is to sort it first, then examine the nth index. However, that approach does more work than is required, as we only care about the element in the nth place, not the permutation of any elements before or after. By modifying a Quicksort to only recurse on the sublist of interest, we avoid much of the work of a full sort.

(for an alternative approach, see An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan’s worst-case linear selection algorithm)

Question: Suppose n to be the minimum or maximum element. What relation does this algorithm bear to a 1-dimensional version of Quickhull (Python, arrays)?

practice

We start off in classic quicksort style — we choose a pivot, and create lists of elements which sort above or below the pivot.

<<define qnth>>=
def qnth(sample, n):
    pivot = sample[0]
    below = [s for s in sample if s < pivot]
    above = [s for s in sample if s > pivot]
    i, j = len(below), len(sample)-len(above)

At this point, i and j would (if it were sorted) index into sample as follows:

hence we need only recurse on the segment containing the nth element.

<<define qnth>>=
    if n < i:      return qnth(below, n)
    elif n >= j:   return qnth(above, n-j)
    else:          return pivot

wrapping up

Finally we wrap the function up in a module which, if run from the command line, checks (for an element from the middle of a random list) that qnth produces the same result as sorting the list and then selecting the nth element.

<<qnth.py>>=
define qnth
if __name__ == "__main__":
    import random
    n, mid = 64, 32
    sample = [random.random() for _ in range(n)]
    partial = qnth(sample, mid)
    sample.sort(); sorted = sample[mid]
    print partial, sorted, partial == sorted

The output should be similar to:

0.524130542211 0.524130542211 True
Download code
Views