2-D convolution as a matrix-matrix multiplication [closed]
I know that, in the 1D case, the convolution between two vectors, a
and b
, can be computed as conv(a, b)
, but also as the product between the T_a
and b
, where T_a
is the corresponding Toeplitz matrix for a
.
Is it possible to extend this idea to 2D?
Given a = [5 1 3; 1 1 2; 2 1 3]
and b=[4 3; 1 2]
, is it possible to convert a
in a Toeplitz matrix and compute the matrix-matrix product between T_a
and b
as in the 1-D case?
Yes, it is possible and you should also use a doubly block circulant matrix (which is a special case of Toeplitz matrix). I will give you an example with a small size of kernel and the input, but it is possible to construct Toeplitz matrix for any kernel. So you have a 2d input x
and 2d kernel k
and you want to calculate the convolution x * k
. Also let's assume that k
is already flipped. Let's also assume that x
is of size n×n
and k
is m×m
.
So you unroll k
into a sparse matrix of size (n-m+1)^2 × n^2
, and unroll x
into a long vector n^2 × 1
. You compute a multiplication of this sparse matrix with a vector and convert the resulting vector (which will have a size (n-m+1)^2 × 1
) into a n-m+1
square matrix.
I am pretty sure this is hard to understand just from reading. So here is an example for 2×2 kernel and 3×3 input.
*
Here is a constructed matrix with a vector:
which is equal to .
And this is the same result you would have got by doing a sliding window of k
over x
.
1- Define Input and Filter
Let I be the input signal and F be the filter or kernel.
2- Calculate the final output size
If the I is m1 x n1 and F is m2 x n2 the size of the output will be:
3- Zero-pad the filter matrix
Zero pad the filter to make it the same size as the output.
4- Create Toeplitz matrix for each row of the zero-padded filter
5- Create a doubly blocked Toeplitz matrix
Now all these small Toeplitz matrices should be arranged in a big doubly blocked Toeplitz matrix.
6- Convert the input matrix to a column vector
7- Multiply doubly blocked toeplitz matrix with vectorized input signal
This multiplication gives the convolution result.
8- Last step: reshape the result to a matrix form
For more details and python code take a look at my github repository:
Step by step explanation of 2D convolution implemented as matrix multiplication using toeplitz matrices in python
If you unravel k to a m^2 vector and unroll X, you would then get:
- a
m**2
vectork
- a
((n-m)**2, m**2)
matrix forunrolled_X
where unrolled_X
could be obtained by the following Python code:
from numpy import zeros
def unroll_matrix(X, m):
flat_X = X.flatten()
n = X.shape[0]
unrolled_X = zeros(((n - m) ** 2, m**2))
skipped = 0
for i in range(n ** 2):
if (i % n) < n - m and ((i / n) % n) < n - m:
for j in range(m):
for l in range(m):
unrolled_X[i - skipped, j * m + l] = flat_X[i + j * n + l]
else:
skipped += 1
return unrolled_X
Unrolling X and not k allows a more compact representation (smaller matrices) than the other way around for each X - but you need to unroll each X. You could prefer unrolling k depending on what you want to do.
Here, the unrolled_X
is not sparse, whereas unrolled_k
would be sparse, but of size ((n-m+1)^2,n^2)
as @Salvador Dali mentioned.
Unrolling k
could be done like this:
from scipy.sparse import lil_matrix
from numpy import zeros
import scipy
def unroll_kernel(kernel, n, sparse=True):
m = kernel.shape[0]
if sparse:
unrolled_K = lil_matrix(((n - m)**2, n**2))
else:
unrolled_K = zeros(((n - m)**2, n**2))
skipped = 0
for i in range(n ** 2):
if (i % n) < n - m and((i / n) % n) < n - m:
for j in range(m):
for l in range(m):
unrolled_K[i - skipped, i + j * n + l] = kernel[j, l]
else:
skipped += 1
return unrolled_K