Miller-Rabin primality test (C)

Other implementations: C | C, GMP | Clojure | Groovy | Java | Python | Ruby | Scala

The Miller-Rabin primality test is a simple probabilistic algorithm for determining whether a number is prime or composite that is easy to implement. It proves compositeness of a number using the following formulas:

Suppose 0 < a < n is coprime to n (this is easy to test using the GCD). Write the number n−1 as , where d is odd. Then, provided that all of the following formulas hold, n is composite:

for all

If a is chosen uniformly at random and n is prime, these formulas hold with probability 1/4. Thus, repeating the test for k random choices of a gives a probability of 1 − 1 / 4k that the number is prime. Moreover, Gerhard Jaeschke showed that any 32-bit number can be deterministically tested for primality by trying only a=2, 7, and 61.

16-bit integers

We begin with a simple implementation for 16-bit integers, which is easier to implement for reasons that will become apparent. A better test for 16-bit integers would be a simple 8K bit array lookup table, but this is for the purpose of illustration.

First, we'll need a way to perform efficient modular exponentiation on an arbitrary 16-bit integer. We accomplish this using exponentiation by squaring:

<<16-bit modular exponentiation function>>=
u16 modular_exponent_16(u16 base, u16 power, u16 modulus) {
u32 result = 1;
int i;
for (i=15; i>=0; i--) {
result = (result*result) % modulus;
if (power & (1 << i)) {
result = (result*base) % modulus;
}
}
return (u16)result; /* Will not truncate since modulus is a u16 */
}

Here, u16 is a 16-bit unsigned integer type and u32 is a (at least) 32-bit unsigned integer type. Since the product of two 16-bit numbers has at most 32 bits, we know for certain that this will never overflow or truncate bits. To define these integer types portably, we take advantage of the C's guarantee that an unsigned short is at least 16 bits wide and an unsigned long is at least 32 bits wide:

<<type definitions>>=
typedef unsigned short u16;
typedef unsigned long  u32;

Next, we write a function that actually tests the desired formulas for a given a and n. It works by starting with ad and squaring it s−1 times, comparing each to −1 (n−1 actually; we have to add n because we're working with nonnegative numbers):

<<Miller-Rabin results>>=
#define COMPOSITE 0
#define PRIME     1
<<16-bit Miller-Rabin pass>>=
int miller_rabin_pass_16(u16 a, u16 n) {
u16 a_to_power, s, d, i;
16-bit compute s and d
a_to_power = modular_exponent_16(a, d, n);
if (a_to_power == 1) return PRIME;
for(i=0; i < s-1; i++) {
if (a_to_power == n-1) return PRIME;
a_to_power = modular_exponent_16(a_to_power, 2, n);
}
if (a_to_power == n-1) return PRIME;
return COMPOSITE;
}

The call to modular_exponent_16 to square a_to_power could be converted into a call to a more specialized function, but we omit this optimization here. To compute s and d, we just divide n−1 by 2 until it becomes odd:

<<16-bit compute s and d>>=
s = 0;
d = n - 1;
while ((d % 2) == 0) {
d /= 2;
s++;
}

Now, taking advantage of Jaeschke's result, we can write a deterministic, 100% accurate implementation that tries the values of 2, 7, and 61 for a, returning PRIME if they all produce PRIME, and COMPOSITE otherwise:

<<16-bit Miller-Rabin>>=
int miller_rabin_16(u16 n) {
if (n <= 1) return COMPOSITE;
if (n == 2) return PRIME;
if (miller_rabin_pass_16( 2, n) == PRIME &&
(n <= 7  || miller_rabin_pass_16( 7, n) == PRIME) &&
(n <= 61 || miller_rabin_pass_16(61, n) == PRIME))
return PRIME;
else
return COMPOSITE;
}

The tests at the beginning are special cases, since Miller-Rabin requires n ≥ 3. The other tests on n ensure we don't pass an a that greater than or equal to n. Here's some test code that uses it:

<<miller-rabin_16.c>>=
#include <stdio.h>
#include <stdlib.h>
type definitions
Miller-Rabin results
16-bit modular exponentiation function
16-bit Miller-Rabin pass
16-bit Miller-Rabin
int main(int argc, char* argv[]) {
u16 n = (u16)atoi(argv);
puts(miller_rabin_16(n) == PRIME ? "PRIME" : "COMPOSITE");
return 0;
}

Arbitrary-precision integers

The test for arbitrary-precision integers is largely identical except that arbitrary-precision integer operations should be used. Unfortunately, arbitrary-precision mods are relatively complicated, and so it would avail us little to provide a specialized implementation here. Instead, we'll import some basic arbitrary-precision routines from Arbitrary-precision integer arithmetic (C):

<<integer.h>>=
Arbitrary-precision_integer_arithmetic_(C)#6222#integer.h
<<integer.c>>=
Arbitrary-precision_integer_arithmetic_(C)#6222#integer.c

We then rewrite our implementation above using these functions. Our new modular exponentiation function must loop over the components of the power as well as the bits of each component:

<<modular exponentiation function>>=
integer modular_exponent(integer base, integer power, integer modulus) {
int i, bit;
integer result = create_integer(modulus.num_components + 1);
integer temp   = create_integer(modulus.num_components*2 + 1);
set_zero_integer(result);
result.c = 1;
for(i=power.num_components - 1; i>=0; i--) {
for(bit=COMPONENT_BITS-1; bit>=0; bit--) {
multiply_integer(result, result, temp);
mod_integer(temp, modulus, result);
if ((power.c[i] & (1 << bit)) != 0) {
multiply_integer(result, base, temp);
mod_integer(temp, modulus, result);
}
}
}
free_integer(temp);
return result;
}

When performing the Miller-Rabin pass, only a, n, d, and powers of a are large numbers; s and i are logarithmically smaller. We precompute the large integers 1, 2, and n−1 for use in comparisons and squaring:

<<Miller-Rabin pass>>=
int miller_rabin_pass(integer a, integer n) {
int i, s, result;
integer a_to_power, d, one, n_minus_one;
integer temp = create_integer(n.num_components*2 + 1);
one = create_integer(1);
set_zero_integer(one);
one.c = 1;
n_minus_one = create_integer(n.num_components);
subtract_integer(n, one, n_minus_one);
compute s and d
a_to_power = modular_exponent(a, d, n);
if (compare_integers(a_to_power, one) == 0)  { result=PRIME; goto exit; }
for(i=0; i < s-1; i++) {
if (compare_integers(a_to_power, n_minus_one) == 0) { result=PRIME; goto exit; }
multiply_integer(a_to_power, a_to_power, temp);
mod_integer(temp, n, a_to_power);
}
if (compare_integers(a_to_power, n_minus_one) == 0) { result=PRIME; goto exit; }
result = COMPOSITE;
exit:
free_integer(temp);
free_integer(a_to_power);
free_integer(one);
free_integer(n_minus_one);
return result;
}

We've replaced the squaring using modular_exponent with a modular multiply of a_to_power by itself. Finally, we use bit shifts to compute d rapidly, and test just its least significant component to determine if it is odd:

<<compute s and d>>=
s = 0;
d = create_integer(n_minus_one.num_components);
copy_integer(n_minus_one, d);
while ((d.c % 2) == 0) {
shift_right_one_integer(d);
s++;
}

Since our numbers are now arbitrarily large, we can no longer take advantage of Jaeschke's result to produce a deterministic implementation, but we can produce a highly accurate probabilistic implementation by simply running the test enough times to make 1/4k very small, say 20 times.

<<Miller-Rabin>>=
int miller_rabin(integer n) {
integer a = create_integer(n.num_components);
int repeat;
for(repeat=0; repeat<20; repeat++) {
do {
random_integer(n, a);
} while (is_zero_integer(a));
if (miller_rabin_pass(a, n) == COMPOSITE) {
return COMPOSITE;
}
}
return PRIME;
}

For convenience we ignore the special cases where n ≤ 2. To produce a random integer less than n, we simply choose a random value for each component, and for the most significant component mod it to make it less than the most significant component of n. This doesn't choose an integer less than n uniformly at random (we miss the ones where the most significant component has its largest value) but it's not important here as long as we cover a large proportion of the space.

<<random integer>>=
void random_integer(integer max, integer result) {
int i, most_sig = max.num_components - 1;
while (max.c[most_sig] == 0) most_sig--;
for (i=0; i<most_sig; i++) {
result.c[i] = rand() % (MAX_COMPONENT + 1);
}
result.c[most_sig] = rand() % max.c[most_sig];
}

Finally, some test code:

<<miller-rabin.c>>=
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include "integer.h"
Miller-Rabin results
modular exponentiation function
Miller-Rabin pass
random integer
Miller-Rabin
int main(int argc, char* argv[]) {
srand(time(NULL));
if (strcmp(argv, "test") == 0) {
integer n = string_to_integer(argv);
puts(miller_rabin(n) == PRIME ? "PRIME" : "COMPOSITE");
} else if (strcmp(argv, "genprime") == 0) {
integer max = create_integer(atoi(argv)/COMPONENT_BITS);
integer p   = create_integer(max.num_components);
set_zero_integer(max);
max.c[max.num_components-1] = MAX_COMPONENT;
do {
random_integer(max, p);
test for small factors
} while (miller_rabin(p) == COMPOSITE);
puts(integer_to_string(p));
}
return 0;
}

We've augmented main() with the ability to generate a prime number of a specified number of bits. It does this by randomly selecting numbers of that size until it finds a prime one. We could use this to rapidly generate large primes for use in cryptography. Since most random values have small prime factors, we first test some of these to avoid an expensive test:

<<test for small factors>>=
if ((p.c % 2) == 0) continue;
if (mod_small_integer(p, 3) == 0) continue;
if (mod_small_integer(p, 5) == 0) continue;
if (mod_small_integer(p, 7) == 0) continue;

Here's some sample output:

\$ ./miller-rabin test 516119616549881
PRIME
\$ ./miller-rabin test 516119616549887
COMPOSITE
\$ ./miller-rabin genprime 128
277003545840181465186202237665148044033
\$ ./miller-rabin genprime 512
69070670148391210520342329574373473828883687815325200193110546590937028225649711\
62269141121965617545613897865904269544243721348093953041840506282278098321

On a Pentium 4 2.8 GhZ machine, the last run took 20 seconds, since the arbitrary-precision arithmetic implementation we're using is primitive. With a more efficient library, such as GMP, we could produce large primes even more quickly (see Miller-Rabin primality test (C, GMP)). Runtime also varies from run to run depending on how quickly we stumble across a prime.