Adding wheel factorization to an indefinite sieve
Solution 1:
The odds, i.e. 2-coprimes, are generated by "rolling the wheel" [2], i.e. by repeated additions of 2, starting from the initial value of 3 (similarly from 5, 7, 9, ...):
n=3; n+=2; n+=2; n+=2; ... # wheel = [2]
3 5 7 9
The 2-3-coprimes are generated by repeated additions of 2, then 4, and again 2, then 4, and so on:
n=5; n+=2; n+=4; n+=2; n+=4; ... # wheel = [2,4]
5 7 11 13 17
Here we do need to know where to start adding the differences from, 2 or 4, depending on the initial value. For 5, 11, 17, ..., it's 2 (i.e. 0-th element of the wheel); for 7, 13, 19, ..., it's 4 (i.e. 1-st element).
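As a minimal sketch of the rolling described above (the helper name `roll` and its `pos` parameter are my own, not from the answer's code), the wheel can be rotated to the right starting position before it is cycled:

```python
from itertools import accumulate, chain, cycle, islice

def roll(start, wheel, pos=0):
    """Roll `wheel` from `start`, taking the first gap from index `pos`."""
    rotated = wheel[pos:] + wheel[:pos]      # begin at the right spot on the wheel
    return accumulate(chain([start], cycle(rotated)))

print(list(islice(roll(3, [2]), 4)))              # [3, 5, 7, 9]
print(list(islice(roll(5, [2, 4]), 5)))           # [5, 7, 11, 13, 17]
print(list(islice(roll(7, [2, 4], pos=1), 4)))    # [7, 11, 13, 17]
```

Starting from 7 with `pos=0` would add 2 first and land on 9, which is not a 2-3-coprime; that is why the position has to travel together with the value.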
How can we know where to start? The point of the wheel optimization is that we work only on this sequence of coprimes (in this example, 2-3-coprimes). So in the part of the code where we get the recursively generated primes, we will also maintain the rolling wheel stream, and advance it until we see that next prime in it. The rolling sequence will need to produce two results: the value and the wheel position. Thus when we see the prime, we also get the corresponding wheel position, and we can start generating its multiples from that position on the wheel. We multiply everything by p, of course, to start from p*p:
for (i, p)                     # the (wheel position, summated value)
        in enumerated roll of the wheel:
    when p is the next prime:
        multiples of p are  m =  p*p;            # map (p*) (roll wheel-at-i from p)
                            m += p*wheel[i];
                            m += p*wheel[i+1];  ...
So each entry in the dict will have to maintain its current value, its base prime, and its current wheel position (wrapping around to 0 for circularity, when needed).
To produce the resulting primes, we roll another coprimes sequence, and keep only those elements of it that are not in the dict, just as in the reference code.
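The description above can be sketched for the small [2,4] wheel as follows. This is only an illustration under my own assumptions: the names `roll`, `wheel_sieve`, `positions` and `mults` are hypothetical, the dict entries are `(base prime, wheel position)` tuples keyed by the current multiple, and no attempt is made at speed.

```python
from itertools import islice

WHEEL = (2, 4)   # gaps between consecutive 2-3-coprimes, starting from 5

def roll(start=5, pos=0):
    """Yield (wheel position, value); position = index of the gap added next."""
    n, i = start, pos
    while True:
        yield i, n
        n += WHEEL[i]
        i = (i + 1) % len(WHEEL)

def wheel_sieve():
    """Primes from 5 up, sieving only the 2-3-coprimes (postponed dict sieve)."""
    yield 5
    base = wheel_sieve()          # recursively generated base primes
    positions = roll()            # separate roll, used to find each base prime's position
    p = next(base)                # 5
    i, _ = next(positions)        # (0, 5): wheel position of 5
    mults = {}                    # current multiple -> (base prime, wheel position)
    candidates = roll()
    next(candidates)              # 5 was already produced
    for _, c in candidates:
        if c in mults:            # c is a known multiple of some base prime
            bp, j = mults.pop(c)
        elif c < p * p:           # not in the dict and below p*p: prime
            yield c
            continue
        else:                     # c == p*p: start the multiples of p at position i
            bp, j = p, i
            p = next(base)
            for i, q in positions:    # advance the roll until the new p is seen
                if q == p:
                    break
        m = c + bp * WHEEL[j]     # next multiple of bp on the wheel
        j = (j + 1) % len(WHEEL)
        while m in mults:         # skip multiples already claimed by another prime
            m += bp * WHEEL[j]
            j = (j + 1) % len(WHEEL)
        mults[m] = (bp, j)

print(list(islice(wheel_sieve(), 10)))   # [5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
```

The primes 2 and 3 are not produced here; a wrapper would have to yield them first, the same way the wheel's own primes are prepended in any wheel sieve.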
update: after a few iterations on codereview (big thanks to the contributors there!) I've arrived at this code, using itertools as much as possible, for speed:
from itertools import accumulate, chain, cycle, count

def wsieve():   # wheel-sieve, by Will Ness.    ideone.com/mqO25A
    wh11 = [ 2,4,2,4,6,2,6,4,2,4,6, 6,2,6,4,2,6,4,6,8,4,2, 4,
             2,4,8,6,4,6,2,4,6,2,6, 6,4,2,4,6,2,6,4,2,4,2, 10,2,10]
    cs = accumulate(chain([11], cycle(wh11)))   # roll the wheel from 11
    yield(next(cs))         # cf. ideone.com/WFv4f,
    ps = wsieve()           # codereview.stackexchange.com/q/92365/9064
    p = next(ps)            # 11
    psq = p**2              # 121
    D = dict(zip(accumulate(chain([0], wh11)), count(0)))   # wheel roll lookup dict
    mults = {}
    for c in cs:            # candidates, coprime with 210, from 11
        if c in mults:
            wheel = mults.pop(c)
        elif c < psq:
            yield c
            continue
        else:   # c==psq:  map (p*) (roll wh from p) = roll (wh*p) from (p*p)
            i = D[(p-11) % 210]                 # look up wheel roll starting point
            wheel = accumulate( chain( [psq],
                        cycle( [p*d for d in wh11[i:] + wh11[:i]])))
            next(wheel)
            p = next(ps)
            psq = p**2
        for m in wheel:     # pop, save in m, and advance
            if m not in mults:
                break
        mults[m] = wheel    # mults[143] = wheel@187

def primes():
    yield from (2, 3, 5, 7)
    yield from wsieve()
Unlike the above description, this code directly calculates where to start rolling the wheel for each prime, to generate its multiples.
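To see what that direct calculation does, here is a small sketch that builds the same lookup dict outside the generator. The wrapper name `wheel_index` is hypothetical; `wh11` and `D` are copied from the code above.

```python
from itertools import accumulate, chain, count

wh11 = [ 2,4,2,4,6,2,6,4,2,4,6, 6,2,6,4,2,6,4,6,8,4,2, 4,
         2,4,8,6,4,6,2,4,6,2,6, 6,4,2,4,6,2,6,4,2,4,2, 10,2,10]

# offset from 11 within one turn of the wheel (mod 210) -> index into wh11
D = dict(zip(accumulate(chain([0], wh11)), count(0)))

def wheel_index(p):
    """Index of the gap that follows p, for any p coprime with 210."""
    return D[(p - 11) % 210]

for p in (11, 13, 17, 19, 223):
    i = wheel_index(p)
    print(p, i, p + wh11[i])   # value, wheel position, next 210-coprime after it
```

Because the offsets repeat every 210, one dict of running sums covers every prime, so no second rolling stream has to be advanced in step with the base primes.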
Solution 2:
This is the version that I had come up with. It's not as clean as Ness' but it works. I'm posting it so there's another example on how to use wheel factorization in case anyone comes by. I've left in the ability to choose what wheel size to use but it's easy to nail down a more permanent one - just generate the size you want and paste that into the code.
from itertools import count

def wpsieve():
    """prime number generator
    call this function instead of roughing or turbo"""
    whlSize = 11
    initPrms, gaps, c = wheel_setup(whlSize)

    for p in initPrms:
        yield p

    primes = turbo(0, (gaps, c))

    for p, x in primes:
        yield p

def prod(seq, factor=1):
    "sequence -> product"
    for i in seq: factor *= i
    return factor

def wheelGaps(primes):
    """returns list of steps to each wheel gap
    that start from the last value in primes"""
    strtPt = primes.pop(-1)  # where the wheel starts
    whlCirm = prod(primes)   # wheel's circumference

    # spokes are every number that are divisible by primes (composites)
    gaps = []  # locate where the non-spokes are (gaps)
    # xrange is Python 2 / PyPy; use range on Python 3
    for i in xrange(strtPt, strtPt + whlCirm + 1, 2):
        if not all(map(lambda x: i%x, primes)): continue  # spoke
        else: gaps.append(i)  # non-spoke

    # find the steps needed to jump to each gap (beginning from the start of the wheel)
    steps = []  # last step returns to start of wheel
    for i, j in enumerate(gaps):
        if i == 0: continue
        steps.append(int(j - gaps[i-1]))
    return steps

def wheel_setup(num):
    "builds initial data for sieve"
    initPrms = roughing(num)       # initial primes from the "roughing" pump
    gaps = wheelGaps(initPrms[:])  # get the gaps
    c = initPrms.pop(-1)           # prime that starts the wheel
    return initPrms, gaps, c

def roughing(end):
    "finds primes by trial division (roughing pump)"
    primes = [2]
    for i in range(3, end + 1, 2):
        if all(map(lambda x: i%x, primes)):
            primes.append(i)
    return primes

def turbo(lvl=0, initData=None):
    """postponed prime generator with wheels (turbo pump)
    Refs:  http://stackoverflow.com/a/10733621
           http://stackoverflow.com/a/19391111"""
    gaps, c = initData

    yield (c, 0)

    compost = {}  # found composites to skip
    # store as current value: (base prime, wheel index)

    ps = turbo(lvl + 1, (gaps, c))

    p, x = next(ps)
    psq = p*p
    gapS = len(gaps) - 1

    ix = jx = kx = 0  # indices for cycling the wheel

    def cyc(x): return 0 if x > gapS else x  # wheel cycler

    while True:
        c += gaps[ix]     # add next step on c's wheel
        ix = cyc(ix + 1)  # and advance c's index

        bp, jx = compost.pop(c, (0,0))  # get base prime and its wheel index

        if not bp:
            if c < psq:      # prime
                yield c, ix  # emit index for above recursive level
                continue
            else:
                jx = kx      # swap indices as a new prime comes up
                bp = p
                p, kx = next(ps)
                psq = p*p

        d = c + bp * gaps[jx]  # calc new multiple
        jx = cyc(jx + 1)

        while d in compost:
            step = bp * gaps[jx]
            jx = cyc(jx + 1)
            d += step

        compost[d] = (bp, jx)
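The remark about generating a wheel once and pasting it into the code can be checked with a short sketch. The function `wheel_steps` below is my own compact stand-in for `wheelGaps` (it uses `math.gcd` instead of trial division and takes the starting point as a separate argument), so treat it as an illustration rather than the answer's method:

```python
from math import gcd

def wheel_steps(wheel_primes, start):
    """Steps between consecutive numbers coprime to all wheel_primes,
    over one full turn of the wheel beginning at `start`
    (`start` is assumed to be coprime to the wheel primes itself)."""
    circumference = 1
    for p in wheel_primes:
        circumference *= p
    non_spokes = [n for n in range(start, start + circumference + 1)
                  if gcd(n, circumference) == 1]
    return [b - a for a, b in zip(non_spokes, non_spokes[1:])]

print(wheel_steps([2, 3], 5))             # [2, 4]
steps = wheel_steps([2, 3, 5, 7], 11)
print(len(steps), sum(steps))             # 48 210
```

For the primes 2, 3, 5, 7 and a start of 11 this reproduces the 48-element `wh11` list hard-coded in Solution 1, which is the kind of list one would paste in to fix the wheel size.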
Leaving in the option for the wheel size also lets you see how quickly the gains from larger wheels taper off. Below is testing code that measures how long it takes to generate the wheel of the selected size and how fast the sieve is with that wheel.
import time

def speed_test(num, whlSize):
    print('-'*50)

    t1 = time.time()
    initPrms, gaps, c = wheel_setup(whlSize)
    t2 = time.time()

    print('2-{} wheel'.format(initPrms[-1]))
    print('setup time: {} sec.'.format(round(t2 - t1, 5)))

    t3 = time.time()
    prm = initPrms[:]
    primes = turbo(0, (gaps, c))
    for p, x in primes:
        prm.append(p)
        if len(prm) > num:
            break
    t4 = time.time()

    # report the elapsed time (not len(prm)) in the single placeholder
    print('run time  : {} sec.'.format(round(t4 - t3, 5)))
    print('prime sum : {}'.format(sum(prm)))

for w in [5, 7, 11, 13, 17, 19, 23, 29]:
    speed_test(1e7-1, w)
Here's how it ran on my computer using PyPy (Python 2.7 compatible) when set to generate ten million primes:
2- 3 wheel
setup time: 0.0 sec.
run time : 18.349 sec.
prime sum : 870530414842019
--------------------------------------------------
2- 5 wheel
setup time: 0.001 sec.
run time : 13.993 sec.
prime sum : 870530414842019
--------------------------------------------------
2- 7 wheel
setup time: 0.001 sec.
run time : 7.821 sec.
prime sum : 870530414842019
--------------------------------------------------
2- 11 wheel
setup time: 0.03 sec.
run time : 6.224 sec.
prime sum : 870530414842019
--------------------------------------------------
2- 13 wheel
setup time: 0.011 sec.
run time : 5.624 sec.
prime sum : 870530414842019
--------------------------------------------------
2- 17 wheel
setup time: 0.047 sec.
run time : 5.262 sec.
prime sum : 870530414842019
--------------------------------------------------
2- 19 wheel
setup time: 1.043 sec.
run time : 5.119 sec.
prime sum : 870530414842019
--------------------------------------------------
2- 23 wheel
setup time: 22.685 sec.
run time : 4.634 sec.
prime sum : 870530414842019
Larger wheels are possible, but you can see they take rather long to set up. There's also the law of diminishing returns as the wheels get larger: there's not much point in going past the 2-13 wheel, as the larger ones don't make it that much faster. I also ended up running into a memory error past the 2-23 wheel (which had some 36 million numbers in its gaps list).
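Both effects, the shrinking gains and the exploding gap list, follow from simple arithmetic, sketched below (the helper `wheel_stats` is hypothetical, not part of the code above). A wheel over primes p1..pk has circumference p1*...*pk and (p1-1)*...*(pk-1) gaps, so the fraction of numbers left as candidates is the ratio of the two:

```python
def wheel_stats(wheel_primes):
    """Return (circumference, number of gaps, fraction of candidates kept)."""
    circumference = 1
    gaps = 1
    for p in wheel_primes:
        circumference *= p
        gaps *= p - 1          # residues coprime to p
    return circumference, gaps, gaps / circumference

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23]
for k in range(2, len(primes) + 1):
    circ, gaps, kept = wheel_stats(primes[:k])
    print('2-{:<2} wheel: {:>9} gaps, {:.3f} of all numbers kept'.format(
        primes[k - 1], gaps, kept))
```

Each added prime p removes only a further 1/p of the remaining candidates, while the gap list grows by a factor of p-1; for the 2-23 wheel the count comes to 36495360, matching the "some 36 million" figure.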