Prime Factorization

In the post “Sieve of Eratosthenes“, I had posted a Java program for the prime sieve. Translating it to Python, I have:

from math import sqrt

def prime_sieve(limit) :
    crosslimit = int(sqrt(limit))
    sieve = [False]*limit

    for n in range(4, limit, 2) :
        sieve[n] = True
    for n in range(3, crosslimit+1, 2) :
        if not sieve[n]:
            for m in range(n*n, limit, 2*n) :
                sieve[m] = True

    num_primes = 0
    primes = []
    for n in range(2, limit) :
        if not sieve[n]:
            num_primes += 1

    for n in range(2, limit) :
        if not sieve[n]:
            primes.append(n)

    return primes

The list of primes generated by this sieve could then be used for factorizing a given number. For calculating the factors, I use the Trial Division algorithm, which isn’t the fastest, but easily implemented and not probabilistic. The algorithm basically tests if a given number (integer) n is divisible by any integer between 1 and n. Since only the prime factors of n are needed, we can use the list of primes generated from the sieve as a base set for the calculation, instead of the set of integers. Furthermore, the trial factors need go no further than \scriptstyle\sqrt{n} because, if n is divisible by some number p, then n = p×q and if q were smaller than pn would have earlier been detected as being divisible by q or a prime factor of q.

A function in Python for prime factorization is:

def trial_division(n) :
    if n == 1: return [1]
    primes = prime_sieve(int(n**0.5) + 1)
    prime_factors = []

    for p in primes:
        if p*p > n: break
        while n % p == 0:
            prime_factors.append(p)
            n //= p
    if n > 1: prime_factors.append(n)

    return prime_factors

This method can be used for primality testing as well. If only one integer is returned by the method above (it would be same as the input number), then the input integer is a prime number. If the number of trial divisions required for an integer x is represented by π(x), then

Number of trial divisions required

for some A>0. The reason I chose to implement this algorithm in Python is that it follows Arbitrary-Precision Arithmetic. It runs satisfactorily well even for large numbers and doesn’t use a large amount of memory. In fact, if the overhead of generating the prime sieve is ignored, it becomes quite fast.