Numpy Zero Padding to match a certain shape
I have a file with arrays of different shapes. I want to zero-pad all the arrays to match the largest shape. The largest shape is (93,13).
To test this I have the following code:
testarray = np.ones((41,13))
How can I zero-pad this array to match the shape (93,13)? And ultimately, how can I do it for thousands of rows?
Edit: The solution was found in the comments:
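for k, array in enumerate(mfcc):
    testarray = np.zeros((93,13))
    # use separate loop variables so the outer index k is not overwritten
    for index, row in enumerate(array):
        # range(len(row)) covers every column; len(row)-1 would skip the last one
        for i in range(len(row)):
            testarray[index][i] = row[i]
    mfcc[k] = testarray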
Solution 1:
Here's an approach using np.pad that can generalize to an arbitrary target shape. It splits the padding between both sides of each axis, so the original array ends up centered:
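import numpy as np

def to_shape(a, shape):
    y_, x_ = shape          # target rows, target columns
    y, x = a.shape          # current rows, current columns
    y_pad = (y_-y)
    x_pad = (x_-x)
    # when the difference is odd, the extra row/column goes after the data
    return np.pad(a,((y_pad//2, y_pad//2 + y_pad%2),
                     (x_pad//2, x_pad//2 + x_pad%2)),
                  mode = 'constant')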
For the proposed example:
a = np.ones((41,13))
shape = [93, 13]
to_shape(a, shape).shape
# (93, 13)
Let's check with another example:
shape = [100, 121]
to_shape(a, shape).shape
# (100, 121)
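To see where the original values land, here is a small check on a tiny array. It repeats to_shape so the snippet runs on its own; the 2x2 input and the (5, 3) target are made-up values chosen so that both padding differences are odd.

```python
import numpy as np

def to_shape(a, shape):
    y_, x_ = shape
    y, x = a.shape
    y_pad = y_ - y
    x_pad = x_ - x
    return np.pad(a, ((y_pad // 2, y_pad // 2 + y_pad % 2),
                      (x_pad // 2, x_pad // 2 + x_pad % 2)),
                  mode='constant')

a = np.ones((2, 2))
out = to_shape(a, (5, 3))
print(out)
# [[0. 0. 0.]
#  [1. 1. 0.]
#  [1. 1. 0.]
#  [0. 0. 0.]
#  [0. 0. 0.]]
```

The rows get 1 zero row above and 2 below, and the columns get 0 on the left and 1 on the right, so the data is centered rather than anchored at the top-left corner.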
Timings
def florian(array, shape):
    # Loop from the question, kept as it was timed.
    # Note that range(0, len(row)-1) skips the last column.
    testarray = np.zeros(shape)
    for index,row in enumerate(array):
        for i in range(0,len(row)-1):
            testarray[index][i]= row[i]

def to_shape(a, shape):
    y_, x_ = shape
    y, x = a.shape
    y_pad = (y_-y)
    x_pad = (x_-x)
    return np.pad(a,((y_pad//2, y_pad//2 + y_pad%2),
                     (x_pad//2, x_pad//2 + x_pad%2)),
                  mode = 'constant')
a = np.ones((500, 500))
shape = [1000, 1103]
%timeit florian(a, shape)
# 101 ms ± 5.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit to_shape(a, shape)
# 19.8 ms ± 318 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Solution 2:
If you want to pad to the right and to the bottom of your original array in 2D, here's what you want:
import numpy as np
a = np.ones((41,11))
desired_rows = 91
desired_cols = 13
b = np.pad(a, ((0, desired_rows-a.shape[0]), (0, desired_cols-a.shape[1])), 'constant', constant_values=0)
print(b)
"""
prints
[[1. 1. 1. ... 1. 0. 0.]
[1. 1. 1. ... 1. 0. 0.]
[1. 1. 1. ... 1. 0. 0.]
...
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]]
"""
Of course, this is not an error-proof solution: if the desired number of rows or columns is smaller than the corresponding size of the original array, you'll get ValueError: index can't contain negative values.
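One way to guard against that case is to compare the sizes before calling np.pad and raise a clearer error. This is only a sketch: the helper name pad_bottom_right and its error message are made up for illustration and are not part of NumPy.

```python
import numpy as np

def pad_bottom_right(a, desired_rows, desired_cols):
    """Zero-pad a 2-D array at the bottom and on the right (hypothetical helper)."""
    rows, cols = a.shape
    if rows > desired_rows or cols > desired_cols:
        # fail early instead of passing negative widths to np.pad
        raise ValueError(
            f"array of shape {a.shape} does not fit into "
            f"({desired_rows}, {desired_cols})"
        )
    return np.pad(a, ((0, desired_rows - rows), (0, desired_cols - cols)),
                  mode='constant', constant_values=0)

b = pad_bottom_right(np.ones((41, 11)), 91, 13)
print(b.shape)  # (91, 13)
```

Whether an oversized array should raise, be cropped, or be skipped depends on the data; raising is just the most conservative choice here.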
Solution 3:
You could do it like this. Here array is just a test case standing in for your original array; use your own instead.
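import numpy as np
# placeholder test data; replace with your own array
array = [[1.0] * 10 for _ in range(10)]
testarray = np.zeros((93,13))
for index, row in enumerate(array):
    # range(len(row)) covers every column; len(row)-1 would skip the last one
    for i in range(len(row)):
        testarray[index][i] = row[i]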
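The same copy can also be written with slice assignment instead of element-by-element loops, and the target shape can be computed from the data rather than hard-coded. A minimal sketch, assuming all arrays are 2-D; the list named arrays is a made-up stand-in for whatever is read from the file.

```python
import numpy as np

# Hypothetical stand-in for the arrays read from the file
arrays = [np.ones((41, 13)), np.ones((93, 13)), np.ones((60, 10))]

# Largest extent along each axis across all arrays
target = (max(a.shape[0] for a in arrays),
          max(a.shape[1] for a in arrays))

padded = []
for a in arrays:
    out = np.zeros(target, dtype=a.dtype)
    out[:a.shape[0], :a.shape[1]] = a  # copy into the top-left corner
    padded.append(out)

print(target)                     # (93, 13)
print({p.shape for p in padded})  # {(93, 13)}
```

This keeps one Python-level loop over the arrays, but the copy inside each array is done by NumPy, which should scale better to thousands of arrays than nested loops over rows and columns.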