Numpy remove a dimension from np array

I have some images I want to work with. The problem is that there are two kinds of images: both are 106 x 106 pixels, but some are in color and some are black and white.

one with only two (2) dimensions:

(106,106)

and one with three (3)

(106,106,3)

Is there a way I can strip this last dimension?

I tried np.delete, but it did not seem to work.

np.shape(np.delete(Xtrain[0], [2] , 2))
Out[67]: (106, 106, 2)

Solution 1:

You could use NumPy's multidimensional indexing (an extension of Python's built-in slice notation; strictly speaking this is basic indexing, not "fancy" indexing):

x = np.zeros( (106, 106, 3) )
result = x[:, :, 0]
print(result.shape)

prints

(106, 106)

A shape of (106, 106, 3) means you have 3 sets of things that have shape (106, 106). So in order to "strip" the last dimension, you just have to pick one of these (that's what the integer index on the last axis does).

You can keep any slice you want. I arbitrarily chose to keep the 0th, since you didn't specify which one you wanted. result = x[:, :, 1] and result = x[:, :, 2] would give the desired shape as well: it all depends on which slice you need to keep.
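Since the question mixes (106, 106) and (106, 106, 3) images, one way to apply this is to index only the arrays that actually have a third axis. This is a minimal sketch: the helper name drop_color_axis and the sample arrays are made up for illustration.

```python
import numpy as np

def drop_color_axis(img, channel=0):
    """Return a 2-D array; keep a single channel if img is 3-D."""
    if img.ndim == 3:
        return img[:, :, channel]  # returns a view of img, not a copy
    return img  # already 2-D, leave it untouched

gray = np.zeros((106, 106))      # stand-in for a black-and-white image
color = np.zeros((106, 106, 3))  # stand-in for a color image
shapes = [drop_color_axis(im).shape for im in (gray, color)]
print(shapes)
```

Because the result is a view, writing into it also changes the original image; call .copy() on it if that is not wanted.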

Solution 2:

Just take the mean over the color dimension (axis=2) of a color image, for example the first one:

Xtrain_monochrome = Xtrain[0].mean(axis=2)
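One thing to be aware of is that the mean of an integer image comes back as floating point. The sketch below uses a made-up 2 x 2 RGB image (not data from the question) to show the resulting shape and dtype, and how one might cast back if integer pixels are needed.

```python
import numpy as np

# Hypothetical 2x2 RGB image with uint8 pixel values.
img = np.array([[[0, 0, 0], [255, 255, 255]],
                [[30, 60, 90], [10, 20, 30]]], dtype=np.uint8)

mono = img.mean(axis=2)          # average over the color axis, shape (2, 2)
mono_u8 = mono.astype(np.uint8)  # cast back only if integer pixels are needed

print(mono.shape, mono.dtype)
print(mono)
```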

Solution 3:

When the shape of your array is (106, 106, 3), you can visualize it as a table with 106 rows and 106 columns in which each data point is an array of 3 numbers, which we can represent as [x, y, z]. Therefore, if you want to get the shape (106, 106), the data points in your table must be single numbers rather than arrays. You can achieve this either by extracting the x-, y- or z-component of each data point, or by applying a function that aggregates the three components, such as the mean, sum or max. You can extract any single component just like @matt Messersmith suggested above.
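Both routes can be sketched on a tiny array. The 2 x 2 "table" below is invented for illustration; each cell holds one [x, y, z] triple.

```python
import numpy as np

# Made-up table of 2 rows x 2 columns, each cell holding [x, y, z].
pts = np.array([[[1, 2, 3], [4, 5, 6]],
                [[7, 8, 9], [10, 11, 12]]])

x_only = pts[:, :, 0]      # extract one component per cell
total = pts.sum(axis=2)    # aggregate the three components: sum
largest = pts.max(axis=2)  # aggregate the three components: max

for arr in (x_only, total, largest):
    print(arr.shape)  # the last axis is gone in every case
```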

Solution 4:

You should be careful when you reduce the dimensions of an image. A color image is normally a 3-D array that holds the RGB values of each pixel. If you reduce it to 2-D, what you are really doing is converting a color RGB image into a grayscale image.

There are several ways to do this: you can take the maximum, minimum, average or sum of the three channels, depending on how closely the result should match perceived brightness. A common choice is a weighted average of the RGB values (these are the ITU-R BT.601 luma weights), using the formula

Y = 0.299R + 0.587G + 0.114B

where R, G and B are the red, green and blue values of a pixel. In NumPy, this can be written as

new_image = img[:, :, 0]*0.299 + img[:, :, 1]*0.587 + img[:, :, 2]*0.114
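The same weighted average can also be written as a single product against a weight vector, which contracts the last axis. This is a sketch under the assumption that img stores its channels in R, G, B order; the 1 x 3 test image is made up.

```python
import numpy as np

weights = np.array([0.299, 0.587, 0.114])  # R, G, B weights from the formula above

# Made-up 1x3 image: a pure red, a pure green and a white pixel.
img = np.array([[[255, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.uint8)

new_image = img @ weights  # contracts the color axis; float result, shape (1, 3)

# Channel-by-channel version from the answer, for comparison.
by_channel = img[:, :, 0]*0.299 + img[:, :, 1]*0.587 + img[:, :, 2]*0.114

print(new_image.shape)
print(np.allclose(new_image, by_channel))
```

Libraries that load images in a different channel order (for example B, G, R) would need the weights reordered accordingly.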