Tensorflow 2.4.1 - using tf.data.Dataset to fit a Keras Sequential Model

I'm trying to fit a tf.data.Dataset as follows:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

INPUT_NEURONS = 10
OUTPUT_NEURONS = 1

features = tf.random.normal((1000, INPUT_NEURONS))
labels = tf.random.normal((1000, OUTPUT_NEURONS))
dataset = tf.data.Dataset.from_tensor_slices((features, labels))

def build_model():

  model = keras.Sequential(
    [
        layers.Dense(3, input_shape=[INPUT_NEURONS]),
        layers.Dense(OUTPUT_NEURONS),
    ]
  )

  optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

  model.compile(loss='mse',
                optimizer=optimizer,
                metrics=['mae', 'mse'])

  return model

model = build_model()

model.fit(dataset, epochs=2, verbose=2)

However, I'm getting the following error:

ValueError: Input 0 of layer sequential is incompatible with the layer: expected axis -1 of input shape to have value 10 but received input with shape (10, 1)

The model.summary() looks good though:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 3)                 33        
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 4         
=================================================================
Total params: 37
Trainable params: 37
Non-trainable params: 0
_________________________________________________________________

Is Keras Model.fit() actually suitable for a tf.data.Dataset? If so, what am I doing wrong here?


Solution 1:

"as far as I know, training using batches is optional, a hyperparameter to use or not during model development"

Not exactly; it is not optional. TF-Keras is designed to work with batches. The first dimension in the summary always corresponds to the batch size, and None indicates that any batch size is accepted by the model.

Most of the time you want your model to accept any batch size. However, if you use stateful LSTMs, then you need to define a static batch size.
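For that stateful case, a minimal sketch of fixing the batch size; the layer sizes, sequence length, and batch size of 32 are arbitrary choices for illustration, not values from the question:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A stateful LSTM carries its cell state across batches, so the batch size
# must be fixed up front via batch_input_shape=(batch_size, timesteps, features)
stateful_model = keras.Sequential([
    layers.LSTM(8, stateful=True, batch_input_shape=(32, 5, 10)),
    layers.Dense(1),
])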

After you put your data into a tf.data.Dataset with from_tensor_slices, the elements have no batch dimension:

dataset.element_spec
>> (TensorSpec(shape=(10,), dtype=tf.float32, name=None),
    TensorSpec(shape=(1,), dtype=tf.float32, name=None))

And when using tf.data, the batch_size argument of Model.fit() is ignored, so batching has to be done manually on the dataset itself. More to the point, you may not always know in advance how many elements a tf.data.Dataset contains.
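Applied to the code in the question, a minimal sketch of the fix looks like this (the batch size of 32 is an arbitrary choice; dataset and model refer to the objects defined in the question):

# Batch the dataset yourself; fit() will not add a batch dimension for you
batched_dataset = dataset.batch(32)
model.fit(batched_dataset, epochs=2, verbose=2)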

In this situation it makes sense (I'll explain why) to batch after creating the dataset:

dataset.batch(3).element_spec
>> (TensorSpec(shape=(None, 10), dtype=tf.float32, name=None),
 TensorSpec(shape=(None, 1), dtype=tf.float32, name=None))

tf.data is generally used with medium-to-large datasets, so batching after creation allows vectorized transformations. Consider these scenarios:

  1. You have 5M rows of signal data and want to apply an FFT. If you don't batch the dataset before the FFT step, the transform is applied one element at a time.

  2. You have a dataset of 100K images and want to apply some transformations or operations to them. A batched dataset allows faster, vectorized transformations and saves a lot of time (see the sketch below).
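As a rough illustration of the second scenario (the image shapes, dataset size, and normalization step are made up for the example):

import tensorflow as tf

# Placeholder "images" standing in for a real image dataset
images = tf.random.uniform((1000, 28, 28))
ds = tf.data.Dataset.from_tensor_slices(images)

# Unbatched: the map function is called once per image
per_image = ds.map(lambda x: x / 255.0)

# Batched first: each map call normalizes a whole batch of 32 images at once
per_batch = ds.batch(32).map(lambda x: x / 255.0)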