Why does prediction need a batch size in Keras?

Solution 1:

Keras can make predictions for many inputs in one call. For example, if you pass an array of 100 samples, Keras computes one prediction per sample and returns 100 outputs. This computation can also be split into batches, with the size of each batch set by batch_size.

This matters when you cannot fit all the data in CPU/GPU memory at the same time, so it has to be processed batch by batch.
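To make the idea concrete, here is a minimal pure-Python sketch of batched prediction. It is not Keras's actual implementation: `predict_in_batches` and `toy_model` are hypothetical names, and `toy_model` is just a stand-in for a model's forward pass that returns one output per sample.

```python
def predict_in_batches(predict_fn, samples, batch_size):
    """Apply predict_fn to samples in chunks of at most batch_size."""
    outputs = []
    for start in range(0, len(samples), batch_size):
        # Only this slice is handed to the model at a time.
        batch = samples[start:start + batch_size]
        outputs.extend(predict_fn(batch))
    return outputs


def toy_model(batch):
    # Hypothetical stand-in for a forward pass: one output per sample.
    return [2 * x + 1 for x in batch]


samples = list(range(100))
all_at_once = toy_model(samples)
batched = predict_in_batches(toy_model, samples, batch_size=32)
print(len(batched), batched == all_at_once)  # 100 True
```

In this sketch the batch size changes only how much data is processed at once, not the predictions themselves.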

Solution 2:

The reason is the same as for training: you need a batch size because you cannot always fit all the data into a single batch.

Similarly, if you have millions of data points to predict, you will most likely not be able to pass them all in one go (as a single batch).

After all, training and prediction both run a forward pass over a batch of data.

Hence, you need the batch size to limit the number of data points in a single batch and to spread the rest of the data across multiple prediction batches.
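As a rough illustration of how the data gets spread across batches, the sketch below computes the size of each batch for a given number of data points. The helper `batch_sizes` is hypothetical, and the values 1,000,000 and 32 are only example numbers.

```python
import math


def batch_sizes(num_samples, batch_size):
    """Return the size of each batch needed to cover num_samples."""
    num_batches = math.ceil(num_samples / batch_size)
    # Every batch is full except possibly the last one.
    return [min(batch_size, num_samples - i * batch_size)
            for i in range(num_batches)]


sizes = batch_sizes(1_000_000, 32)
print(len(sizes), sizes[-1])  # 31250 32
print(batch_sizes(100, 32))   # [32, 32, 32, 4]
```

So a million data points at a batch size of 32 means 31,250 forward passes, each holding only 32 samples at a time.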