Get class labels from Keras functional model

I have a functional model in Keras (ResNet50 from the repo examples). I trained it with ImageDataGenerator and flow_from_directory data and saved the model to an .h5 file. When I call model.predict I get an array of class probabilities, but I want to associate them with class labels (in my case, folder names). How can I get them? I found that I could use model.predict_classes and model.predict_proba, but these functions only exist on Sequential models, not on functional ones.


y_prob = model.predict(x) 
y_classes = y_prob.argmax(axis=-1)

As suggested here.


When one uses flow_from_directory, the problem is how to interpret the probability outputs, i.e. how to map them to class labels, since the order in which flow_from_directory builds its one-hot vectors is not known in advance.

We can get a dictionary that maps the class labels to the indices of the prediction vector returned as output when we use

generator = train_datagen.flow_from_directory("train", batch_size=batch_size)
label_map = generator.class_indices

The label_map variable is a dictionary like this

{'class_14': 5, 'class_10': 1, 'class_11': 2, 'class_12': 3, 'class_13': 4, 'class_2': 6, 'class_3': 7, 'class_1': 0, 'class_6': 10, 'class_7': 11, 'class_4': 8, 'class_5': 9, 'class_8': 12, 'class_9': 13}

From this, the relation between the probability scores and the class names can be derived.
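
For example, a minimal sketch of that derivation, assuming the trained model, an input batch x, and the label_map from above:

import numpy as np

idx_to_class = {v: k for k, v in label_map.items()}  # invert {name: index} into {index: name}
y_prob = model.predict(x)                            # (num_samples, num_classes) probabilities
y_pred = np.argmax(y_prob, axis=-1)                  # predicted index per sample
y_pred_labels = [idx_to_class[i] for i in y_pred]    # predicted folder names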

Basically, you can recreate this dictionary with the following code.

from glob import glob
class_names = glob("*")  # Reads all the class folders (run this inside the "train" directory)
class_names = sorted(class_names)  # Sort them alphabetically, as flow_from_directory does
name_id_map = dict(zip(class_names, range(len(class_names))))

The name_id_map variable in the above code contains the same dictionary as the one obtained from the class_indices attribute of flow_from_directory.

Hope this helps!


UPDATE: This is no longer valid for newer Keras versions. Please use argmax() as in the answer from Emilia Apostolova.

Functional API models have only the predict() function, which for classification returns the class probabilities. You can then select the most probable classes using the probas_to_classes() utility function. Example:

y_proba = model.predict(x)
y_classes = keras.utils.np_utils.probas_to_classes(y_proba)

This is equivalent to model.predict_classes(x) on the Sequential model.

The reason for this is that the functional API supports a more general class of tasks, where predict_classes() would not make sense.

More info: https://github.com/fchollet/keras/issues/2524
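
For reference, Sequential.predict_classes() was essentially an argmax over predict(), so a hand-rolled equivalent for a functional model can look like this (a minimal sketch, not the exact library source):

import numpy as np

def predict_classes(model, x):
    # Mirrors Sequential.predict_classes(): argmax for multi-class softmax outputs,
    # a 0.5 threshold for a single sigmoid output.
    proba = model.predict(x)
    if proba.shape[-1] > 1:
        return np.argmax(proba, axis=-1)
    return (proba > 0.5).astype("int32")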


In addition to @Emilia Apostolova's answer, to get the ground truth labels from

generator = train_datagen.flow_from_directory("train", batch_size=batch_size)

just call

y_true_labels = generator.classes
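
Note that these line up with the prediction order only if the generator does not shuffle, so create it with shuffle=False for evaluation. A minimal sketch comparing ground truth and predictions under that assumption (predict_generator and len(generator) here assume a Keras 2 style directory iterator):

import numpy as np

eval_generator = train_datagen.flow_from_directory("train", batch_size=batch_size, shuffle=False)
y_true = eval_generator.classes
y_prob = model.predict_generator(eval_generator, steps=len(eval_generator))
y_pred = np.argmax(y_prob, axis=-1)
print("accuracy:", np.mean(y_pred == y_true))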

You must use the label index you have; here is what I do for text classification:

import numpy as np
from keras.utils.np_utils import to_categorical

# data labels = [1, 2, 1, ...]
labels_index = {"website": 0, "money": 1, ...}
# one-hot encode the labels to feed the model
label_categories = to_categorical(np.asarray(labels))

Then, for predictions:

texts = ["hello, rejoins moi sur skype", "bonjour comment ça va ?", "tu me donnes de l'argent"]

sequences = tokenizer.texts_to_sequences(texts)

data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)

predictions = model.predict(data)

t = 0

for text in texts:
    i = 0
    print("Prediction for \"%s\": " % (text))
    for label in labels_index:
        print("\t%s ==> %f" % (label, predictions[t][i]))
        i = i + 1
    t = t + 1

This gives:

Prediction for "hello, rejoins moi sur skype": 
    website ==> 0.759483
    money ==> 0.037091
    under ==> 0.010587
    camsite ==> 0.114436
    email ==> 0.075975
    abuse ==> 0.002428
Prediction for "bonjour comment ça va ?": 
    website ==> 0.433079
    money ==> 0.084878
    under ==> 0.048375
    camsite ==> 0.036674
    email ==> 0.369197
    abuse ==> 0.027798
Prediction for "tu me donnes de l'argent": 
    website ==> 0.006223
    money ==> 0.095308
    under ==> 0.003586
    camsite ==> 0.003115
    email ==> 0.884112
    abuse ==> 0.007655