Is there a way to fix a batch size mismatch between output from final model layer and input?

What I've done: I have a complete dataset of 898 labels with a total of ~55,000 images. For speed, I took 10 of those labels and ~600 images to test the code below. I've tried changing the batchSize and modifying the data function, but to no avail.

Problem: Error: Batch size mismatch: output dense_Dense1 has 10; expected 500 based on input conv2d_Conv2D1_input.

Goal: Either change the final output of dense_Dense1 to have 500, or change the expected input of conv2d_Conv2D1_input to only 10.

Complete Code:

var tf = require('@tensorflow/tfjs');
var tfnode = require('@tensorflow/tfjs-node');
var fs = require('fs');

const numberOfClasses = 10;

const imageWidth = 500;
const imageHeight = 800;
const imageChannels = 3;

const batchSize = 3;
const epochsValue = 5;

const createImage = async (fileName) => {
  const imageBuffer = fs.readFileSync(fileName);
  const image = await tfnode.node.decodeImage(imageBuffer);
  return image;
}

const labelArray = indice => Array.from({length: numberOfClasses}, (_, k) => k === indice ? 1 : 0)

async function* data() {
  for (let i = 1; i < numberOfClasses + 1; i++) {
    for (let x = 10; x < 40; x++) {
      const feature = await createImage(`./images/${i}/${i}-${x}.png`) ;
      const label = tf.tensor1d(labelArray(i))
      yield {xs: feature, ys: label};
    }
  }
}

function onBatchEnd(batch, logs) {
  console.log('Accuracy', logs.acc);
}

const main = async () => {
  const model = tf.sequential();

  model.add(tf.layers.conv2d({
    inputShape: [imageWidth, imageHeight, imageChannels],
    filters: 8,
    kernelSize: 5,
    padding: 'same',
    activation: 'relu'
  }));
  model.add(tf.layers.maxPooling2d({
    poolSize: 2,
    strides: 2
  }));

  model.add(tf.layers.conv2d({
    filters: 16,
    kernelSize: 5,
    padding: 'same',
    activation: 'relu'
  }));
  model.add(tf.layers.maxPooling2d({
    poolSize: 3,
    strides: 3
  }));
  
  model.add(tf.layers.flatten());

  model.add(tf.layers.dense({
    units: numberOfClasses,
    activation: 'softmax'
  }));

  model.compile({
    optimizer: 'sgd',
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy']
  });

  model.summary()

  const ds = tf.data.generator(data);

  model.fitDataset(ds, {
    epochs: 5,
    batchSize: 10,
    callbacks: {onBatchEnd}
  }).then(info => {
    console.log('Final accuracy', info.history.acc);
  });

}
main()


Solution 1:

The error is self-explanatory:

Error: Batch size mismatch: output dense_Dense1 has 10; expected 500 based on input conv2d_Conv2D1_input.

There is a mismatch between the shape the model expects and the shape the dataset provides. dense_Dense1 is the last layer, and 10 is the number of classes (model.summary() is handy for finding the layers' names).

500 is the batch size - at least, that is what the model infers from what fitDataset() hands it. Here is the confusion: 500 is the size of the first axis of the feature. The feature is a 3d tensor whose first axis (the image width) is 500. If that 3d tensor is used directly for prediction, the model treats the image width as the batch size and each feature as a 2d tensor. The feature should therefore be a 4d tensor with an explicit batch axis.
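
As a minimal illustration of the shapes involved (dummy tensors only, no image files; a sketch, not part of the original code):

const tf = require('@tensorflow/tfjs');

// One decoded image: a 3d tensor [width, height, channels]
const feature = tf.zeros([500, 800, 3]);
console.log(feature.shape);              // [500, 800, 3] -> the model reads 500 as the batch size

// expandDims() adds a leading batch axis of size 1, giving the expected 4d shape
console.log(feature.expandDims().shape); // [1, 500, 800, 3]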

To go further: the model does not actually predict on a single image at a time, since the samples are later grouped into a batch of 10. At each iteration of the generator data*, one sample is yielded; the samples are stacked, and after 10 iterations the resulting batch is used for the prediction. This is handy because other dataset operators can be chained in, for example to reshuffle the batch.
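
For instance, a small sketch with a dummy generator standing in for data() (the shapes are the ones assumed in the question) shows how batch(10) stacks the yielded samples before they reach the model:

const tf = require('@tensorflow/tfjs');

// Dummy generator yielding single, unbatched samples
function* samples() {
  for (let n = 0; n < 30; n++) {
    yield {
      xs: tf.zeros([500, 800, 3]), // one image, no batch axis
      ys: tf.zeros([10])           // one one-hot label
    };
  }
}

// batch(10) stacks 10 consecutive samples into one element
tf.data.generator(samples).batch(10).forEachAsync(batch => {
  console.log(batch.xs.shape); // [10, 500, 800, 3]
  console.log(batch.ys.shape); // [10, 10]
});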

Here is the fix:

yield {xs: feature.expandDims(), ys: label.expandDims()};

Or, even better, you can batch the samples before fitting the dataset to the model:

const ds = tf.data.generator(data).batch(1);
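
Putting it together inside main(), a sketch of the corrected training call could look like the following (the names are the ones already defined in the question; it assumes every decoded image really has the [500, 800, 3] shape declared in inputShape, and it relies on fitDataset() taking its batch size from the dataset itself rather than from a batchSize option):

// The generator keeps yielding single, unbatched samples; the dataset does the batching
const ds = tf.data.generator(data).batch(batchSize); // xs: [batchSize, 500, 800, 3], ys: [batchSize, 10]

model.fitDataset(ds, {
  epochs: epochsValue,
  callbacks: {onBatchEnd}
}).then(info => {
  console.log('Final accuracy', info.history.acc);
});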