*Huynh Thiet Dung <[email protected]>
Nov 8, 1994*

Primality testing and prime factorization are ancient problems that
have attracted special attention of mathematicians for centuries. Number
theory is traditionally believed to be the purest of all sciences, and
within it the hunt for large primes and for factors of large numbers has
always seemed particularly remote from applications, even to questions
of a number- theoretic nature. However, recent developments in complexity
theory and cryptology have altered this state of affairs as computational
number theory finds applications in public crypto- systems such as the
RSA (Rivest-Shamir-Adleman) scheme. The **RSA cryptographic system**
is based on the assumption that prime factorization and computing the discrete
logarithm are computa- tionally intractable. This brief summary concerns
the primality problem only.

**PRIMALITY **is the problem of determining whether a given integer
P (written in binary) is prime, and its complement is denoted by **COMPOSITENESS
**which is to determine whether P is a composite number. Although the
two problems are strongly related in that a deterministic algorithm for
one problem also yields a deterministic algorithm for the other, it makes
sense to consider both of them in the framework of complexity theory.

What is the complexity of **PRIMALITY**? As most of us know, existing
algorithms become very inefficient when P is large. Indeed, if we measure
their running times in terms of the length n of the binary representation
of P (i.e. n = log P), then all these methods require more than a polynomial
number of steps. Such computations are said to be infeasible. Therefore,
the question of whether PRIMALITY can be solved in polynomial time has
been an important open problem in computational number theory for quite
some time. As a matter of fact, the exact complexity of PRIMALITY has been
posed as an open problem since the advent of computational complexity theory
in the late 60s.

Following are the most outstanding known results concerning the complexity of PRIMALITY:

(1) Assuming that the Extended Riemann's Hypothesis holds true, PRIMALITY
can be solved in polynomial time (1976). The Riemann's Hypothesis is a
number-theoretic conjecture concerning the roots of the **Riemann zeta
function**. It postulates that the real parts of the zeros of the zeta
function zeta(s) = 1 + (1/2^s) + (1/3^s) + ... are exactly 1/2. It's worth
noting that this hypothesis is item #8 in Hilbert's 1900 lecture on the
most important open problems in mathematics.

(2) There is a fast **Monte Carlo algorithm for PRIMALITY** (Solovay
& Strassen, 1972) as well as a fast **Monte Carlo algorithm for COMPO-
SITENESS** (1987). These two algorithms can be combined to obtain a **Las
Vegas algorithm for PRIMALITY (or COMPOSITENESS)**.

It seems appropiate to have a brief explaination of what Monte Carlo and Las Vegas algorithms are. Monte Carlo algorithms are algorithms that may produce wrong answers, but they err with a small probability. The error probability can be made arbitrarily small when we repeat the algorithm for sufficiently many times.

Solovay & Strassen's paper (1972) is the pioneering work that stirred much interest in research on randomized computations. The idea is truly remarkable. After all, computation itself is inherently randomized. The probability of computer failure is nonzero no matter how small it is. So if a randomized algorithm has an extremely small error probability, then it's quite useful indeed. Besides randomized algorithms may be very helpful in the design of deterministic algorithms.

A Monte Carlo (randomized) algorithm for COMPOSITENESS essentially has the following form. It has in the input the n-bit integer P to be tested and a sequence R of n random bits generated from some random source. Depending on R, the algorithm performs computation on the integer P and outputs "Yes" if R provides a witness for the compositeness of P. If P is prime, then no R will ever produce a witness and the algorithm always outputs "No". However, it P is composite, then some R may not produce any witness at all, and the outputs "No" of the algorithm is false in this case.

The important fact is that if P is composite, then a majority of the n-bit sequences R will provide witness for the compositeness of P. So the error probability is less than 1/2, and it can be made arbitrarily small by running the algorithm a number of times. Thus, if this Monte Carlo algorithm outputs "Yes", then this is a definite answer as P is indeed composite. One should note the asymmetry of Monte Carlo algorithms: They err on one side only. This is one of the many reasons why we need to consider COMPOSITENESS and PRIMALITY separately.

Now, suppose that A1 is a Monte Carlo algorithm for COMPOSITENESS and A2 is another one for PRIMALITY. Then we can run both algorithms repeatedly and wait for a definite answer ("Yes, P is composite") from A1 or "Yes, P is prime" from A2. When we get an answer, it is definite and we are sure about P. The problem here is that we may not obtain a definite answer after k runs of both algorithms, but this is very unlikely if k is sufficiently large. This combined algorithm is called a Las Vegas algorithm: if it outputs an answer, the answer is always correct; and the probability that it will eventually output an answer can be made arbitrarily high by letting k be large enough.

As far as prime factorization is concerned, not much is known except the fact that it can be solved in polynomial time by NONDETERMINISTIC algorithms. To the best of my knowledge, there are currently no randomized algorithms for prime factorization. And the RSA scheme is based on the assumption that such algorithms do not exist. Whether this assumption is a legitimate one or not is certainly debatable. In any case, this problem has resisted numerous attempts of mathematicians over the centuries.