Sieve of Eratosthenes (Haskell)

From LiteratePrograms

Jump to: navigation, search
Other implementations: Alice ML | Bash | C++ | Forth | Haskell | Java | Python | Python, arrays | Scala

The Sieve of Eratosthenes is an algorithm for rapidly locating all the prime numbers in a certain range of integers. It operates by marking as composite all nontrivial multiples of each prime in sequence until only primes remain unmarked. It is most well-suited to implementation using arrays, which are compact and feature random access.


This program is under development.
Please help to debug it. When debugging
is complete, remove the {{develop}} tag.


Contents

A first look

This "Sieve of Eratosthenes" is not the genuine algorithm, but is instead a naive version of the original as (incorrectly) taught in most courses on algorithms. A more comprehensive (and far better-performing!) Sieve will be provided later as infrastructure code (specifically a priority queue) is provided elsewhere.

The core of this naive implementation is the primes function, detailed below. primes is just a convenient wrapper that hides implementation detail behind a facade. Called by itself it provides an infinite list of primes which may be used in other circumstances as desired. It accomplishes this by wrapping a tail-recursive function, sieve, tucked away in a where clause to keep from polluting the namespace.

<<primes_naive>>=
primes :: [Integer]
primes = sieve [2..]
  where
    sieve (p:xs) = p : sieve [x|x <- xs, x `mod` p > 0]

The sieve function in the where-clause takes a single list of integers as its argument and assumes that the first list member is a prime. Since primes passes in the infinite list [2..], it is guaranteed in this code that the first number will always be prime. It returns with the initial prime p glued onto the output of a tail-recursive call to itself with a list comprehension that returns only those integers which are not evenly divisible by the provided prime. To more clearly explain:

  1. sieve is first passed a list of all integers greater than or equal to 2 ([2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ...]);
  2. it uses the 2 as the first prime and then passes itself a list of all remaining integers not divisible by 2 ([3, 5, 7, 9, 11, 13, 15, 17, ...]);
  3. it uses the 3 as the second prime and then passes itself a list of all remaining integers not divisible by 3 ([5, 7, 11, 13, 17, ...]);
  4. this proceeds recursively ad infinitum.

To those unfamiliar with Haskell's lazy semantics, the above can seem like a mystery. It is best to consider the lists involved not as lists in the traditional data structure sense, but instead as list generators -- functions which will calculate on demand the next item in the "list". It is these lazy semantics which permit elegant, readable code like the above to be generated without the fussing around that's usually needed when coding this pared-down "Sieve of Eratosthenes".

Some notes on the provided code:

  • Having the pairing of primes and sieve is a safety issue used to guarantee correctness of the ensuing list at the possible expense of some performance. (Using the GHC compiler there is absolutely no performance degradation, but naive implementations could incur some overhead.)
  • The lines above each function (e.g. primes :: [Integer]) are type declarations and are not necessary for this code to compile and work. Haskell can infer the type of t