How to compute a 3D Morton number (interleave the bits of 3 ints)

I'm looking for a fast way to compute a 3D Morton number. This site has a magic-number based trick for doing it for 2D Morton numbers, but it doesn't seem obvious how to extend it to 3D.

So basically I have 3 10-bit numbers that I want to interleave into a single 30 bit number with a minimal number of operations.


You can use the same technique. I'm assuming that variables contain 32-bit integers with the highest 22 bits set to 0 (which is a bit more restrictive than necessary). For each variable x containing one of the three 10-bit integers we do the following:

x = (x | (x << 16)) & 0x030000FF;
x = (x | (x <<  8)) & 0x0300F00F;
x = (x | (x <<  4)) & 0x030C30C3;
x = (x | (x <<  2)) & 0x09249249;

Then, with x,y and z the three manipulated 10-bit integers we get the result by taking:

x | (y << 1) | (z << 2)

The way this technique works is as follows. Each of the x = ... lines above "splits" groups of bits in half such that there is enough space in between for the bits of the other integers. For example, if we consider three 4-bit integers, we split one with bits 1234 into 000012000034 where the zeros are reserved for the other integers. In the next step we split 12 and 34 in the same way to get 001002003004. Even though 10 bits doesn't make for a nice repeated division in two groups, you can just consider it 16 bits where you lose the highest ones in the end.

As you can see from the first line, you actually only need that for each input integer x it holds that x & 0x03000000 == 0.


Here is my solution with a python script:

I took the hint from in his comment: Fabian “ryg” Giesen
Read the long comment below! We need to keep track which bits need to go how far!
Then in each step we select these bits and move them and apply a bitmask (see comment last lines) to mask them!

Bit Distances: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Bit Distances (binary): ['0', '10', '100', '110', '1000', '1010', '1100', '1110', '10000', '10010']
Shifting bits by 1   for bits idx: []
Shifting bits by 2   for bits idx: [1, 3, 5, 7, 9]
Shifting bits by 4   for bits idx: [2, 3, 6, 7]
Shifting bits by 8   for bits idx: [4, 5, 6, 7]
Shifting bits by 16  for bits idx: [8, 9]
BitPositions: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Shifted bef.:   0000 0000 0000 0000 0000 0011 0000 0000 hex: 0x300
Shifted:        0000 0011 0000 0000 0000 0000 0000 0000 hex: 0x3000000
NonShifted:     0000 0000 0000 0000 0000 0000 1111 1111 hex: 0xff
Bitmask is now: 0000 0011 0000 0000 0000 0000 1111 1111 hex: 0x30000ff

Shifted bef.:   0000 0000 0000 0000 0000 0000 1111 0000 hex: 0xf0
Shifted:        0000 0000 0000 0000 1111 0000 0000 0000 hex: 0xf000
NonShifted:     0000 0011 0000 0000 0000 0000 0000 1111 hex: 0x300000f
Bitmask is now: 0000 0011 0000 0000 1111 0000 0000 1111 hex: 0x300f00f

Shifted bef.:   0000 0000 0000 0000 1100 0000 0000 1100 hex: 0xc00c
Shifted:        0000 0000 0000 1100 0000 0000 1100 0000 hex: 0xc00c0
NonShifted:     0000 0011 0000 0000 0011 0000 0000 0011 hex: 0x3003003
Bitmask is now: 0000 0011 0000 1100 0011 0000 1100 0011 hex: 0x30c30c3

Shifted bef.:   0000 0010 0000 1000 0010 0000 1000 0010 hex: 0x2082082
Shifted:        0000 1000 0010 0000 1000 0010 0000 1000 hex: 0x8208208
NonShifted:     0000 0001 0000 0100 0001 0000 0100 0001 hex: 0x1041041
Bitmask is now: 0000 1001 0010 0100 1001 0010 0100 1001 hex: 0x9249249

x &= 0x3ff
x = (x | x << 16) & 0x30000ff   <<< THIS IS THE MASK for shifting 16 (for bit 8 and 9)
x = (x | x << 8) & 0x300f00f
x = (x | x << 4) & 0x30c30c3
x = (x | x << 2) & 0x9249249

So for a 10bit number and 2 interleaving bits (for 32 bit), you need to do the following!:

x &= 0x3ff
x = (x | x << 16) & 0x30000ff   #<<< THIS IS THE MASK for shifting 16 (for bit 8 and 9)
x = (x | x << 8) & 0x300f00f
x = (x | x << 4) & 0x30c30c3
x = (x | x << 2) & 0x9249249

And for a 21bit number and 2 interleaving bits (for 64bit), you need to do the following!:

x &= 0x1fffff
x = (x | x << 32) & 0x1f00000000ffff
x = (x | x << 16) & 0x1f0000ff0000ff
x = (x | x << 8) & 0x100f00f00f00f00f
x = (x | x << 4) & 0x10c30c30c30c30c3
x = (x | x << 2) & 0x1249249249249249

And for a 42bit number and 2 interleaving bits (for 128bit), you need to do the following ( in case you need it ;-)) :

x &= 0x3ffffffffff
x = (x | x << 64) & 0x3ff0000000000000000ffffffffL
x = (x | x << 32) & 0x3ff00000000ffff00000000ffffL
x = (x | x << 16) & 0x30000ff0000ff0000ff0000ff0000ffL
x = (x | x << 8) & 0x300f00f00f00f00f00f00f00f00f00fL
x = (x | x << 4) & 0x30c30c30c30c30c30c30c30c30c30c3L
x = (x | x << 2) & 0x9249249249249249249249249249249L

Python Script to produce and check the Interleaving Patterns!!!

def prettyBinString(x,d=32,steps=4,sep=".",emptyChar="0"):
    b = bin(x)[2:]
    zeros = d - len(b)
    
    
    if zeros <= 0: 
        zeros = 0
        k = steps - (len(b) % steps)
    else:
        k = steps - (d % steps)
        
    s = ""
    #print("zeros" , zeros)
    #print("k" , k)
    for i in range(zeros): 
        #print("k:",k)
        if(k%steps==0 and i!= 0):
            s+=sep
        s += emptyChar
        k+=1
     
    for i in range(len(b)):
        if( (k%steps==0 and i!=0 and zeros == 0) or  (k%steps==0 and zeros != 0) ):
            s+=sep
        s += b[i]
        k+=1
    return s    

def binStr(x): return prettyBinString(x,32,4," ","0")


def computeBitMaskPatternAndCode(numberOfBits, numberOfEmptyBits):
    bitDistances=[ i*numberOfEmptyBits for i in range(numberOfBits) ]
    print("Bit Distances: " + str(bitDistances))
    bitDistancesB = [bin(dist)[2:] for dist in  bitDistances]
    print("Bit Distances (binary): " + str(bitDistancesB))
    moveBits=[] #Liste mit allen Bits welche aufsteigend um 2, 4,8,16,32,64,128 stellen geschoben werden müssen

    maxLength = len(max(bitDistancesB, key=len))
    abort = False
    for i in range(maxLength):
        moveBits.append([])
        for idx,bits in enumerate(bitDistancesB):
            if not len(bits) - 1 < i:
                if(bits[len(bits)-i-1] == "1"):
                    moveBits[i].append(idx)
                    
    for i in range(len(moveBits)):
        print("Shifting bits by " + str(2**i) + "\t for bits idx: " + str(moveBits[i]))
    
    bitPositions = range(numberOfBits);
    print("BitPositions: " + str(bitPositions))
    maskOld = (1 << numberOfBits) -1
    
    codeString = "x &= " + hex(maskOld) + "\n"
    for idx in xrange(len(moveBits)-1, -1, -1):
        if len(moveBits[idx]):
            
           
           shifted = 0
           for bitIdxToMove in moveBits[idx]:
                shifted |= 1<<bitPositions[bitIdxToMove];
                bitPositions[bitIdxToMove] += 2**idx; # keep track where the actual bit stands! might get moved several times
           
           # Get the non shifted part!     
           nonshifted = ~shifted & maskOld
           
           print("Shifted bef.:\t" + binStr(shifted) + " hex: " + hex(shifted))
           shifted = shifted << 2**idx
           print("Shifted:\t" + binStr(shifted)+ " hex: " + hex(shifted))
                
           print("NonShifted:\t" + binStr(nonshifted) + " hex: " + hex(nonshifted))
           maskNew =  shifted | nonshifted
           print("Bitmask is now:\t" + binStr(maskNew) + " hex: " + hex(maskNew) +"\n")
           #print("Code: " + "x = x | x << " +str(2**idx)+ " & " +hex(maskNew))
                
           codeString += "x = (x | x << " +str(2**idx)+ ") & " +hex(maskNew) + "\n"
           maskOld = maskNew
    return codeString


numberOfBits = 10;
numberOfEmptyBits = 2;
codeString = computeBitMaskPatternAndCode(numberOfBits,numberOfEmptyBits);
print(codeString)

def partitionBy2(x):
    exec(codeString)
    return x

def checkPartition(x):
    print("Check partition for: \t" + binStr(x))
    part = partitionBy2(x);
    print("Partition is : \t\t" + binStr(part))
    #make the pattern manualy
    partC = long(0);
    for bitIdx in range(numberOfBits):
        partC  = partC | (x & (1<<bitIdx)) << numberOfEmptyBits*bitIdx
    print("Partition check is :\t" + binStr(partC))
    if(partC == part):
        return True
    else:
        return False

checkError = False        
for i in range(20):
    x = random.getrandbits(numberOfBits);
    if(checkPartition(x) == False):
        checkError = True
        break
if not checkError:
    print("CHECK PARTITION SUCCESSFUL!!!!!!!!!!!!!!!!...")
else:
    print("checkPartition has ERROR!!!!")

The simplest is probably a lookup table, if you've 4K free space:

static uint32_t t [ 1024 ] = { 0, 0x1, 0x8, ... };

uint32_t m ( int a, int b, int c )
{
    return t[a] | ( t[b] << 1 ) | ( t[c] << 2 );
}

The bit hack uses shifts and masks to spread the bits out, so each time it shifts the value and ors it, copying some of the bits into empty spaces, then masking out combinations so only the original bits remain.

for example:

x = 0xabcd;
  = 0000_0000_0000_0000_1010_1011_1100_1101    

x = (x | (x << S[3])) & B[3]; 

  = ( 0x00abcd00 | 0x0000abcd ) & 0xff00ff 
  = 0x00ab__cd & 0xff00ff 
  = 0x00ab00cd
  = 0000_0000_1010_1011_0000_0000_1100_1101
x = (x | (x << S[2])) & B[2]; 
  = ( 0x0ab00cd0 | 0x00ab00cd) & 0x0f0f0f0f 
  =   0x0a_b_c_d & 0x0f0f0f0f 
  = 0x0a0b0c0d 
  = 0000_1010_0000_1011_0000_1100_0000_1101
x = (x | (x << S[1])) & B[1]; 
  = ( 0000_1010_0000_1011_0000_1100_0000_1101 | 
      0010_1000_0010_1100_0011_0000_0011_0100 ) &
      0011_0011_0011_0011_0011_0011_0011_0011
  =   0010_0010_0010_0011_0011_0000_0011_0001
x = (x | (x << S[0])) & B[0]; 
  = ( 0010_0010_0010_0011_0011_0000_0011_0001 | 
      0100_0100_0100_0110_0110_0000_0110_0010 ) &
      0101_0101_0101_0101_0101_0101_0101_0101
  =   0100_0010_0100_0101_0101_0000_0101_0001

In each iteration, each block is split in two, the rightmost bit of the leftmost half of the block moved to its final position, and a mask applied so only the required bits remain.

Once you have spaced the inputs out, shifting them so the values of one fall into the zeros of the other is easy.

To extend that technique for more than two bits between values in the final result, you have to increase the shifts between where the bits end up. It gets a bit trickier, as the starting block size isn't a power of 2, so you could either split it down the middle, or on a power of 2 boundary.

So an evolution like this might work:

0000_0000_0000_0000_0000_0011_1111_1111    
0000_0011_0000_0000_0000_0000_1111_1111    
0000_0011_0000_0000_1111_0000_0000_1111    
0000_0011_0000_1100_0011_0000_1100_0011    
0000_1001_0010_0100_1001_0010_0100_1001    

// 0000_0000_0000_0000_0000_0011_1111_1111    
x = ( x | ( x << 16 ) ) & 0x030000ff;
// 0000_0011_0000_0000_0000_0000_1111_1111    
x = ( x | ( x << 8 ) ) & 0x0300f00f;
// 0000_0011_0000_0000_1111_0000_0000_1111    
x = ( x | ( x << 4 ) ) & 0x030c30c3;
// 0000_0011_0000_1100_0011_0000_1100_0011    
x = ( x | ( x << 2 ) ) & 0x09249249;
// 0000_1001_0010_0100_1001_0010_0100_1001    

Perform the same transformation on the inputs, shift one by one and another by two, or them together and you're done.