Sieve of Eratosthenes - Primes between X and N
Solution 1:
The implementation you've borrowed is able to start at 3 because, instead of sieving out the multiples of 2, it simply skips all even numbers; that's what the 2*… expressions that appear multiple times in the code are about. The fact that 3 is the next prime is also hardcoded all over the place, but let's ignore that for the moment: if you can't get past the special-casing of 2, the special-casing of 3 doesn't matter.
Skipping even numbers is a special case of a "wheel". You can skip sieving multiples of 2 by always incrementing by 2; you can skip sieving multiples of 2 and 3 by alternately incrementing by 2 and 4; you can skip sieving multiples of 2, 3, 5, and 7 by cycling through the increments 2, 4, 2, 4, 6, 2, 6, … (there are 48 numbers in the sequence), and so on. So you could extend this code by first finding all the primes up to x, then building a wheel, then using that wheel to find all the primes between x and n.
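To make the wheel idea concrete, here is a minimal sketch that computes the repeating increments for a given set of small primes. The names wheel_increments and wheel_candidates are mine, not part of the borrowed implementation, and the cycle is generated starting from 1, so it is a rotation of the sequences quoted above.

```python
from math import gcd

def wheel_increments(base_primes):
    """Return one full cycle of gaps between numbers coprime to all base_primes."""
    modulus = 1
    for p in base_primes:
        modulus *= p
    # Residues below the modulus that share no factor with it ("spokes").
    spokes = [r for r in range(1, modulus) if gcd(r, modulus) == 1]
    # Gaps between consecutive spokes, plus the gap that wraps to the next turn.
    gaps = [b - a for a, b in zip(spokes, spokes[1:])]
    gaps.append(modulus + spokes[0] - spokes[-1])
    return gaps

def wheel_candidates(base_primes, limit):
    """Yield the numbers in (1, limit) that the wheel does not skip."""
    gaps = wheel_increments(base_primes)
    n = 1 + gaps[0]  # first candidate after 1
    i = 1
    while n < limit:
        yield n
        n += gaps[i % len(gaps)]
        i += 1

print(wheel_increments((2, 3)))               # [4, 2]
print(len(wheel_increments((2, 3, 5, 7))))    # 48
print(list(wheel_candidates((2, 3), 30)))     # [5, 7, 11, 13, 17, 19, 23, 25, 29]
```

Note that the candidates still include composites such as 25; the wheel only removes multiples of the base primes, so the sieve itself still has to do the rest.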
But that's adding a lot of complexity. And once you get too far beyond 7, the cost (both in time, and in space for storing the wheel) swamps the savings. And if your whole goal is not to find the primes before x, finding the primes before x so you don't have to find them seems kind of silly. :)
The simpler thing to do is just find all the primes up to n, and throw out the ones below x, which you can do with a trivial change at the end:
primes = numpy.r_[2,result]
return primes[primes>=x]
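For anyone who wants to try the "sieve everything, then filter" idea without the borrowed numpy code, here is a plain-Python stand-in. It is only a sketch: primes_between is a hypothetical name, it uses a basic bytearray sieve rather than the odd-only numpy one, and it assumes the range is half-open, i.e. x <= p < n.

```python
def primes_between(x, n):
    """Return all primes p with x <= p < n by sieving [0, n) and filtering."""
    if n <= 2:
        return []
    is_prime = bytearray([1]) * n
    is_prime[0] = is_prime[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Zero out i*i, i*i + i, ... below n in one slice assignment.
            is_prime[i * i :: i] = bytearray(len(range(i * i, n, i)))
    primes = [i for i, flag in enumerate(is_prime) if flag]
    # The "trivial change at the end": drop everything below x.
    return [p for p in primes if p >= x]

print(primes_between(10, 30))  # [11, 13, 17, 19, 23, 29]
```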
Of course there are ways to do this without wasting storage for those initial primes you're going to throw away. They'd be a bit complicated to work into this algorithm (you'd probably want to build the array in sections, then drop each section that's entirely < x as you go, then stack all the remaining sections); it would be far easier to use a different implementation of the algorithm that isn't designed for speed and simplicity over space…
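As a rough illustration of working in sections, here is one possible shape in plain Python (3.8+ for math.isqrt). It is a variation rather than the exact scheme described above: instead of building and then dropping the sections below x, it only sieves the small primes up to sqrt(n) and then starts its sections at x. The names small_primes, segmented_primes, and segment_size are all hypothetical.

```python
from math import isqrt

def small_primes(limit):
    """Plain sieve: all primes <= limit."""
    if limit < 2:
        return []
    flags = bytearray([1]) * (limit + 1)
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, f in enumerate(flags) if f]

def segmented_primes(x, n, segment_size=1 << 16):
    """Primes p with x <= p < n, holding only one section in memory at a time."""
    x = max(x, 2)
    if n <= x:
        return []
    # Every composite below n has a prime factor no larger than sqrt(n - 1).
    base = small_primes(isqrt(n - 1))
    result = []
    for start in range(x, n, segment_size):
        stop = min(start + segment_size, n)
        flags = bytearray([1]) * (stop - start)
        for p in base:
            if p * p >= stop:
                break
            # First multiple of p that is >= p*p and >= start.
            first = max(p * p, -(-start // p) * p)
            flags[first - start :: p] = bytearray(len(range(first, stop, p)))
        result.extend(start + i for i, f in enumerate(flags) if f)
    return result

print(segmented_primes(10, 50, segment_size=7))
```

The memory cost here is one section plus the primes up to sqrt(n), which is the trade-off the paragraph above is pointing at.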
And of course there are different prime-finding algorithms that don't require enumerating all the primes up to x in the first place. But if you want to use this implementation of this algorithm, that doesn't matter.
Solution 2:
Since you're now interested in looking into other algorithms or other implementations, try this one. It doesn't use numpy, but it is rather fast. I've tried a few variations on this theme, including using sets, and pre-computing a table of low primes, but they were all slower than this one.
#! /usr/bin/env python

''' Prime range sieve.
    Written by PM 2Ring 2014.10.15

    For range(0, 30000000) this is actually _faster_ than the
    plain Eratosthenes sieve in sieve3.py !!!
'''

import sys

def potential_primes():
    ''' Make a generator for 2, 3, 5, & thence all numbers coprime to 30 '''
    s = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
    for i in s:
        yield i
    s = (1,) + s[3:]
    j = 30
    while True:
        for i in s:
            yield j + i
        j += 30

def range_sieve(lo, hi):
    ''' Create a list of all primes in the range(lo, hi) '''
    #Mark all numbers as prime
    primes = [True] * (hi - lo)

    #Eliminate 0 and 1, if necessary
    for i in range(lo, min(2, hi)):
        primes[i - lo] = False

    ihi = int(hi ** 0.5)
    for i in potential_primes():
        if i > ihi:
            break
        #Find first multiple of i: ilo >= i*i and ilo >= lo
        ilo = max(i, 1 + (lo - 1) // i) * i
        #Determine how many multiples of i >= ilo are in range
        n = 1 + (hi - ilo - 1) // i
        #Mark them as composite
        primes[ilo - lo : : i] = n * [False]

    return [i for i, v in enumerate(primes, lo) if v]

def main():
    lo = int(sys.argv[1]) if len(sys.argv) > 1 else 0
    hi = int(sys.argv[2]) if len(sys.argv) > 2 else lo + 30
    #print(lo, hi)
    primes = range_sieve(lo, hi)
    #print(len(primes))
    print(primes)
    #print(primes[:10], primes[-10:])

if __name__ == '__main__':
    main()
And here's a link to the plain Eratosthenes sieve that I mentioned in the docstring, in case you want to compare this program to that one.
You could improve this slightly by getting rid of the loop under #Eliminate 0 and 1, if necessary. And I guess it might be slightly faster if you avoided looking at even numbers; it'd certainly use less memory. But then you'd have to handle the cases when 2 was inside the range, and I figure that the fewer tests you have, the faster this thing will run.
Here's a minor improvement to that code: replace
#Mark all numbers as prime
primes = [True] * (hi - lo)
#Eliminate 0 and 1, if necessary
for i in range(lo, min(2, hi)):
    primes[i - lo] = False
with
#Eliminate 0 and 1, if necessary
lo = max(2, lo)
#Mark all numbers as prime
primes = [True] * (hi - lo)
However, the original form may be preferable if you want to return the plain bool list rather than performing the enumerate to build a list of integers: the bool list is more useful for testing if a given number is prime; OTOH, the enumerate could be used to build a set rather than a list.
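Here's a small sketch of those two alternatives. It is not the range_sieve code above: range_flags is a hypothetical, simplified variant that loops over every i from 2 up to sqrt(hi) instead of using potential_primes(), keeps the original "eliminate 0 and 1" loop so that index k always corresponds to lo + k, and assumes lo >= 0.

```python
def range_flags(lo, hi):
    """Return a list where flags[k] is True exactly when lo + k is prime."""
    if lo < 0:
        raise ValueError("this sketch assumes lo >= 0")
    if hi <= lo:
        return []
    flags = [True] * (hi - lo)
    # 0 and 1 are not prime.
    for i in range(lo, min(2, hi)):
        flags[i - lo] = False
    for i in range(2, int(hi ** 0.5) + 1):
        # First multiple of i that is >= i*i and >= lo.
        ilo = max(i, 1 + (lo - 1) // i) * i
        # Number of multiples of i in range(ilo, hi).
        n = 1 + (hi - ilo - 1) // i
        flags[ilo - lo :: i] = n * [False]
    return flags

lo, hi = 10, 30
flags = range_flags(lo, hi)

# Plain bool list: constant-time primality test for any number in the range.
print(flags[23 - lo])  # True
print(flags[25 - lo])  # False

# The same enumerate, building a set instead of a list.
prime_set = {i for i, v in enumerate(flags, lo) if v}
print(sorted(prime_set))  # [11, 13, 17, 19, 23, 29]
```

The bool list only works as a lookup table if the offset lo is left unchanged, which is why this sketch keeps the loop rather than clamping lo to 2.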