Issues implementing the "Wave Collapse Function" algorithm in Python
The hypothesis suggested by @mbrig and @Leon that the propagation step iterates over a whole stack of cells (instead of being limited to a set of 4 direct neighbors) was correct. The following is an attempt to provide further details while answering my own questions.
The problem occured at step 7, while propagating. The original algorithm does update the 4 direct neighbors of a specific cell BUT:
- the index of that specific cell is in turns replaced by the indices of the previously updated neighbors.
- this cascading process is triggered every time a cell is collapsed
- and last as long as the adjacent patterns of a specific cell are available in 1 of its neighboring cell
In other words, and as mentionned in the comments, this is a recursive type of propagation that updates not only the neighbors of the collapsed cell, but also the neighbors of the neighbors... and so on as long as adjacencies are possible.
Detailed Algorithm
Once a cell is collapsed, its index is put in a stack. That stack is meant later to temporarily store indices of neighoring cells
stack = set([emin]) #emin = index of cell with minimum entropy that has been collapsed
The propagation will last as long as that stack is filled with indices:
while stack:
First thing we do is pop()
the last index contained in the stack (the only one for now) and get the indices of its 4 neighboring cells (E, W, N, S). We have to keep them withing bounds and make sure they wrap around.
while stack:
idC = stack.pop() # index of current cell
for dir, t in enumerate(mat):
x = (idC%w + t[0])%w
y = (idC/w + t[1])%h
idN = x + y * w # index of neighboring cell
Before going any further, we make sure the neighboring cell is not collapsed yet (we don't want to update a cell that has only 1 pattern available):
if H[idN] != 'c':
Then we check all the patterns that could be placed at that location. ex: if the neighboring cell is on the left of the current cell (east side), we look at all the patterns that can be placed on the left of each pattern contained in the current cell.
possible = set([n for idP in W[idC] for n in A[idP][dir]])
We also look at the patterns that are available in the neighboring cell:
available = W[idN]
Now we make sure that the neighboring cell really have to be updated. If all its available patterns are already in the list of all the possible patterns —> there’s no need to update it (the algorithm skip this neighbor and goes on to the next) :
if not available.issubset(possible):
However, if it is not a subset of the possible
list —> we look at the intersection of the two sets (all the patterns that can be placed at that location and that, "luckily", are available at that same location):
intersection = possible & available
If they don't intersect (patterns that could have been placed there but are not available) it means we ran into a "contradiction". We have to stop the whole WFC algorithm.
if not intersection:
print 'contradiction'
noLoop()
If, on the contrary, they do intersect --> we update the neighboring cell with that refined list of pattern's indices:
W[idN] = intersection
Because that neighboring cell has been updated, its entropy must be updated as well:
lfreqs = [freqs[i] for i in W[idN]]
H[idN] = (log(sum(lfreqs)) - sum(map(lambda x: x * log(x), lfreqs)) / sum(lfreqs)) - random(.001)
Finally, and most importantly, we add the index of that neighboring cell to the stack so it becomes the next current cell in turns (the one whose neighbors will be updated during the next while
loop):
stack.add(idN)
Full updated script
from collections import Counter
from itertools import chain
from random import choice
w, h = 40, 25
N = 3
def setup():
size(w*20, h*20, P2D)
background('#FFFFFF')
frameRate(1000)
noStroke()
global W, A, H, patterns, freqs, npat, mat, xs, ys
img = loadImage('Flowers.png')
iw, ih = img.width, img.height
xs, ys = width//w, height//h
kernel = [[i + n*iw for i in xrange(N)] for n in xrange(N)]
mat = ((-1, 0), (1, 0), (0, -1), (0, 1))
all = []
for y in xrange(ih):
for x in xrange(iw):
cmat = [[img.pixels[((x+n)%iw)+(((a[0]+iw*y)/iw)%ih)*iw] for n in a] for a in kernel]
for r in xrange(4):
cmat = zip(*cmat[::-1])
all.append(cmat)
all.append(cmat[::-1])
all.append([a[::-1] for a in cmat])
all = [tuple(chain.from_iterable(p)) for p in all]
c = Counter(all)
patterns = c.keys()
freqs = c.values()
npat = len(freqs)
W = [set(range(npat)) for i in xrange(w*h)]
A = [[set() for dir in xrange(len(mat))] for i in xrange(npat)]
H = [100 for i in xrange(w*h)]
for i1 in xrange(npat):
for i2 in xrange(npat):
if [n for i, n in enumerate(patterns[i1]) if i%N!=(N-1)] == [n for i, n in enumerate(patterns[i2]) if i%N!=0]:
A[i1][0].add(i2)
A[i2][1].add(i1)
if patterns[i1][:(N*N)-N] == patterns[i2][N:]:
A[i1][2].add(i2)
A[i2][3].add(i1)
def draw():
global H, W
emin = int(random(w*h)) if frameCount <= 1 else H.index(min(H))
if H[emin] == 'c':
print 'finished'
noLoop()
id = choice([idP for idP in W[emin] for i in xrange(freqs[idP])])
W[emin] = [id]
H[emin] = 'c'
stack = set([emin])
while stack:
idC = stack.pop()
for dir, t in enumerate(mat):
x = (idC%w + t[0])%w
y = (idC/w + t[1])%h
idN = x + y * w
if H[idN] != 'c':
possible = set([n for idP in W[idC] for n in A[idP][dir]])
if not W[idN].issubset(possible):
intersection = possible & W[idN]
if not intersection:
print 'contradiction'
noLoop()
return
W[idN] = intersection
lfreqs = [freqs[i] for i in W[idN]]
H[idN] = (log(sum(lfreqs)) - sum(map(lambda x: x * log(x), lfreqs)) / sum(lfreqs)) - random(.001)
stack.add(idN)
fill(patterns[id][0])
rect((emin%w) * xs, (emin/w) * ys, xs, ys)
Overall improvements
In addition to these fixes I also did some minor code optimization to speed-up both the observation and propagation steps, and shorten the weighted choice computation.
The "Wave" is now composed of Python sets of indices whose size decrease as cells are "collapsed" (replacing large fixed size lists of booleans).
Entropies are stored in a defaultdict whose keys are progressively deleted.
The starting entropy value is replaced by a random integer (first entropy calculation not needed since equiprobable high level of uncertainty at start)
Cells are diplayed once (avoiding storing them in a array and redrawing at each frame)
The weighted choice is now a one-liner (avoiding several dispensable lines of list comprehension)
While looking at the live demo linked in one of your examples, and based on a quick review of the original algorithm code, I believe your error lies in the "Propagation" step.
The propagation is not just updating the neighbouring 4 cells to the collapsed cell. You must also update all of those cells neighbours, and then the neighbours to those cells, etc, recursively. Well, to be specific, as soon as you update a single neighbouring cell, you then update its neighbour (before getting to the other neighbours of the first cell), i.e. depth-first, not breadth-first updates. At least, that's what I gather from the live demo.
The actual C# code implementation of the original algorithm is quite complicated and I don't fully understand it, but the key points appear to be creation of the "propagator" object here, as well as the Propagate function itself, here.