ValueError: Can not squeeze dim[1], expected a dimension of 1 for '{{node binary_crossentropy/weighted_loss/Squeeze}}

Solution 1:

I don't think the problem is the Masking layer. Since you set return_sequences=True in the LSTM layer, you get a sequence with the same number of timesteps as your input and a 100-dimensional output for each timestep, hence the shape (128, 4, 100), where 128 is the batch size. The subsequent BatchNormalization layer preserves that shape, and the final Dense layer reduces it to (128, 4, 1). The problem is that your labels have the 2D shape (128, 1), while your model produces a 3D output due to the return_sequences parameter. So, simply setting this parameter to False should solve your problem. See also this post.
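A quick way to see what return_sequences changes is to compare the output shapes directly. A minimal sketch (the layer sizes mirror the model below; the model names are just for illustration):

from tensorflow.keras.layers import LSTM
from tensorflow.keras.models import Sequential

# return_sequences=True: one 100-dim vector per timestep -> 3D output
model_3d = Sequential()
model_3d.add(LSTM(100, return_sequences=True, input_shape=(4, 99)))
print(model_3d.output_shape)   # (None, 4, 100)

# return_sequences=False: only the last timestep's output -> 2D output
model_2d = Sequential()
model_2d.add(LSTM(100, return_sequences=False, input_shape=(4, 99)))
print(model_2d.output_shape)   # (None, 100)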

Here is a working example:

from tensorflow.keras.layers import LSTM, Dense, BatchNormalization, Masking
from tensorflow.keras.losses import BinaryCrossentropy
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Nadam
import numpy as np

if __name__ == '__main__':
    
    # define stub data
    samples, timesteps, features = 128, 4, 99
    X = np.random.rand(samples, timesteps, features)
    Y = np.random.randint(0, 2, size=(samples, 1))  # 2D labels, matching the (128, 1) shape from the question
    
    # create model
    model = Sequential()
    model.add(Masking(mask_value=0., input_shape=(None, features)))
    # return_sequences=False: only the last timestep's output is returned,
    # so the model output is 2D and matches the 2D labels
    model.add(LSTM(100, return_sequences=False))
    model.add(BatchNormalization())
    model.add(Dense(1, activation='sigmoid'))
    optimizer = Nadam(learning_rate=0.0001)
    loss = BinaryCrossentropy(from_logits=False)
    model.compile(loss=loss, optimizer=optimizer)

    # train model
    model.fit(
        X,
        Y,
        batch_size=128)
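
If, on the other hand, you actually want one prediction per timestep, you can keep return_sequences=True and instead make the labels 3D so they match the (128, 4, 1) model output. A hedged sketch reusing the imports and stub data from the example above (the per-timestep labels Y_seq are invented for illustration):

# per-timestep labels of shape (samples, timesteps, 1)
Y_seq = np.random.randint(0, 2, size=(samples, timesteps, 1))

seq_model = Sequential()
seq_model.add(Masking(mask_value=0., input_shape=(None, features)))
seq_model.add(LSTM(100, return_sequences=True))
seq_model.add(BatchNormalization())
seq_model.add(Dense(1, activation='sigmoid'))  # Dense is applied to each timestep
seq_model.compile(loss=BinaryCrossentropy(from_logits=False),
                  optimizer=Nadam(learning_rate=0.0001))
seq_model.fit(X, Y_seq, batch_size=128)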