How to make a CNN model correctly?

Solution 1:

Consider the model you shared, which does not add padding to any layer:

import tensorflow as tf

# Three 3x3 convolutions using the default padding='valid'
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(4, kernel_size=(3, 3), activation='relu', input_shape=(7, 9, 1)))
model.add(tf.keras.layers.Conv2D(4, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.Conv2D(2, kernel_size=(3, 3), activation='relu'))

The model summary is shown below:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_30 (Conv2D)          (None, 5, 7, 4)           40        
                                                                 
 conv2d_31 (Conv2D)          (None, 3, 5, 4)           148       
                                                                 
 conv2d_32 (Conv2D)          (None, 1, 3, 2)           74        
                                                                 
=================================================================
Total params: 262
Trainable params: 262
Non-trainable params: 0
_________________________________________________________________

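In a convolutional neural network, a kernel (filter) slides across the input and computes one output value per position. With no padding (the Keras default, padding='valid'), the kernel can only be placed where it fits entirely inside the input, so each 3x3 convolution with stride 1 trims 2 from both the height and the width. That is why the output of the last layer is (1, 3, 2) while the input is (7, 9, 1).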
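The shrinking shapes in the summary can be checked by hand. Below is a minimal sketch, assuming stride 1 and padding='valid'; the helper name valid_output_shape is made up for illustration and is not part of Keras.

```python
def valid_output_shape(height, width, kernel=(3, 3), stride=1):
    """Spatial output size of a convolution with padding='valid'.

    Hypothetical helper for illustration only, not a Keras function.
    """
    kh, kw = kernel
    out_h = (height - kh) // stride + 1
    out_w = (width - kw) // stride + 1
    return out_h, out_w


# Start from the 7x9 input and apply three 3x3 convolutions in a row.
shape = (7, 9)
shapes = []
for _ in range(3):
    shape = valid_output_shape(*shape)
    shapes.append(shape)

print(shapes)  # [(5, 7), (3, 5), (1, 3)]
```

These match the height and width columns in the summary above. A fourth 3x3 layer would not fit, because the height is already 1.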

Now consider the model with padding:

# Four 3x3 convolutions with padding='same', which keeps height and width unchanged
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(4, kernel_size=(3, 3), padding='same', activation='relu', input_shape=(7, 9, 1)))
model.add(tf.keras.layers.Conv2D(4, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(tf.keras.layers.Conv2D(4, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(tf.keras.layers.Conv2D(4, kernel_size=(3, 3), padding='same', activation='relu'))

The model summary is shown below:

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 7, 9, 4)           40        
                                                                 
 conv2d_1 (Conv2D)           (None, 7, 9, 4)           148       
                                                                 
 conv2d_2 (Conv2D)           (None, 7, 9, 4)           148       
                                                                 
 conv2d_3 (Conv2D)           (None, 7, 9, 4)           148       
                                                                 
=================================================================
Total params: 484
Trainable params: 484
Non-trainable params: 0
_________________________________________________________________

As the example above shows, with padding='same' the layer pads the input with zeros around its border, so the filter can also cover the edges and the height and width are preserved (for stride 1). The first argument to Conv2D, 4 here, sets the number of output filters, which becomes the last (channel) dimension of the output. Hence the output shape of every layer is (7, 9, 4).
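The shapes and parameter counts of the padded model can be reproduced with a little arithmetic. This is a rough sketch; same_output_shape and conv2d_params are hypothetical helper names, not Keras functions, and the parameter formula assumes each filter has one bias term (the Conv2D default, use_bias=True).

```python
import math


def same_output_shape(height, width, stride=1):
    """Spatial output size with padding='same': ceil(input / stride)."""
    return math.ceil(height / stride), math.ceil(width / stride)


def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter (assumes use_bias=True)."""
    kh, kw = kernel
    return (kh * kw * in_channels + 1) * filters


print(same_output_shape(7, 9))  # (7, 9)

# Walk through the four layers of the padded model.
in_channels = 1
counts = []
for filters in (4, 4, 4, 4):
    counts.append(conv2d_params((3, 3), in_channels, filters))
    in_channels = filters  # this layer's filters become the next layer's input channels

print(counts, sum(counts))  # [40, 148, 148, 148] 484
```

The same formula gives 40, 148 and 74 (262 in total) for the unpadded model, since padding changes the output size but not the number of weights.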