How to export Keras .h5 to tensorflow .pb?

Keras does not itself include any means of exporting a TensorFlow graph as a protocol buffers file, but you can do it using regular TensorFlow utilities. Here is a blog post explaining how to do it with the utility script freeze_graph.py included in TensorFlow, which is the "typical" way it is done.

However, I personally find it a nuisance to have to save a checkpoint and then run an external script to obtain a model; I prefer to do it from my own Python code, so I use a function like this:

import tensorflow as tf

def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
    Freezes the state of a session into a pruned computation graph.

    Creates a new computation graph where variable nodes are replaced by
    constants taking their current value in the session. The new graph will be
    pruned so subgraphs that are not necessary to compute the requested
    outputs are removed.
    @param session The TensorFlow session to be frozen.
    @param keep_var_names A list of variable names that should not be frozen,
                          or None to freeze all the variables in the graph.
    @param output_names Names of the relevant graph outputs.
    @param clear_devices Remove the device directives from the graph for better portability.
    @return The frozen graph definition.
    """
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = tf.graph_util.convert_variables_to_constants(
            session, input_graph_def, output_names, freeze_var_names)
        return frozen_graph

This function is inspired by the implementation of freeze_graph.py, and its parameters are similar to the script's. session is the TensorFlow session object. keep_var_names is only needed if you want to keep some variables unfrozen (e.g. for stateful models), so generally you don't need it. output_names is a list with the names of the operations that produce the outputs you want. clear_devices just removes any device directives to make the graph more portable. So, for a typical Keras model with one output, you would do something like:

from keras import backend as K

# Create, compile and train model...

frozen_graph = freeze_session(K.get_session(),
                              output_names=[out.op.name for out in model.outputs])

Then you can write the graph to a file as usual with tf.train.write_graph:

tf.train.write_graph(frozen_graph, "some_directory", "my_model.pb", as_text=False)
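
To sanity-check the exported file, you can load it back and run it. A minimal sketch (TF 1.x API; 'input:0' and 'output:0' are placeholder tensor names, so replace them with your model's actual op names, printed as above, with ':0' appended):

import tensorflow as tf

# Read the frozen GraphDef back from disk.
with tf.gfile.GFile('some_directory/my_model.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# Import it into a fresh graph and run inference.
with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')
    x = graph.get_tensor_by_name('input:0')   # placeholder name
    y = graph.get_tensor_by_name('output:0')  # placeholder name
    with tf.Session(graph=graph) as sess:
        # Example input; the shape must match your model's input.
        print(sess.run(y, feed_dict={x: [[0, 1]]}))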

The freeze_session method works fine. But saving to a checkpoint file and then using the freeze_graph tool that comes with TensorFlow seems simpler to me, as it's easier to maintain. All you need to do is the following two steps:

First, add the following after your Keras code's model.fit(...) and train your model:

from keras import backend as K
import tensorflow as tf
print(model.output.op.name)
saver = tf.train.Saver()
saver.save(K.get_session(), '/tmp/keras_model.ckpt')

Then cd to your TensorFlow root directory and run:

python tensorflow/python/tools/freeze_graph.py \
--input_meta_graph=/tmp/keras_model.ckpt.meta \
--input_checkpoint=/tmp/keras_model.ckpt \
--output_graph=/tmp/keras_frozen.pb \
--output_node_names="<output_node_name_printed_in_step_1>" \
--input_binary=true

Update for TensorFlow 2

Saving everything into a single archive in the TensorFlow SavedModel format (which contains a saved_model.pb file):

model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')

or in the older Keras H5 format:

model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('model.h5')

The recommended format is SavedModel.

Loading the model back:

from tensorflow import keras
model = keras.models.load_model('path/to/location')
model = keras.models.load_model('model.h5')

A SavedModel contains a complete TensorFlow program, including trained parameters (i.e., tf.Variables) and computation. It does not require the original model-building code to run, which makes it useful for sharing or deploying with TFLite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub.
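
For instance, the TFLite deployment path mentioned above works directly from the SavedModel directory. A minimal sketch, reusing the 'path/to/location' directory passed to model.save above:

import tensorflow as tf

# Convert the SavedModel to a TFLite flatbuffer (no original model code needed).
converter = tf.lite.TFLiteConverter.from_saved_model('path/to/location')
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)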

  • Save and load Keras models: https://www.tensorflow.org/guide/keras/save_and_serialize
  • Using the SavedModel format: https://www.tensorflow.org/guide/saved_model
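
Note that the saved_model.pb inside a SavedModel is not a frozen graph like the ones produced above. If you still need a single frozen .pb file in TensorFlow 2, here is a sketch using convert_variables_to_constants_v2 (a helper living in a private tensorflow.python module, so its import path may change between versions):

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

model = ...  # Get model (Sequential, Functional Model, or Model subclass)

# Wrap the model in a tf.function and get a concrete function
# with a fixed input signature.
full_model = tf.function(lambda x: model(x)).get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

# Replace the variables with constants, yielding a frozen ConcreteFunction.
frozen_func = convert_variables_to_constants_v2(full_model)

# Serialize the frozen GraphDef to a single .pb file.
tf.io.write_graph(frozen_func.graph, '.', 'frozen_model.pb', as_text=False)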

Example for TensorFlow 2

The following simple example (XOR example) shows how to export a Keras model (in both .h5 and .pb formats) and how to use the model from Python and C++:


train.py:

import numpy as np
import tensorflow as tf

print(tf.__version__)  # 2.4.1

x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], 'float32')
y_train = np.array([[0], [1], [1], [0]], 'float32')

inputs = tf.keras.Input(shape=(2,), name='input')
x = tf.keras.layers.Dense(64, activation='relu')(inputs)
x = tf.keras.layers.Dense(64, activation='relu')(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
outputs = tf.keras.layers.Dense(1, activation='sigmoid', name='output')(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs, name='xor')

model.summary()

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['binary_accuracy'])

model.fit(x_train, y_train, epochs=100)

model.save('./xor/')  # SavedModel format

model.save('./xor.h5')  # Keras H5 format

After running the above script:

.
├── train.py
├── xor
│   ├── assets
│   ├── saved_model.pb
│   └── variables
│       ├── variables.data-00000-of-00001
│       └── variables.index
└── xor.h5

predict.py:

import numpy as np
import tensorflow as tf

print(tf.__version__)  # 2.4.1

model = tf.keras.models.load_model('./xor/')  # SavedModel format
# model = tf.keras.models.load_model('./xor.h5')  # Keras H5 format

# 0 xor 0 = [[0.11921611]] ~= 0
print('0 xor 0 = ', model.predict(np.array([[0, 0]])))

# 0 xor 1 = [[0.96736085]] ~= 1
print('0 xor 1 = ', model.predict(np.array([[0, 1]])))

# 1 xor 0 = [[0.97254556]] ~= 1
print('1 xor 0 = ', model.predict(np.array([[1, 0]])))

# 1 xor 1 = [[0.0206149]] ~= 0
print('1 xor 1 = ', model.predict(np.array([[1, 1]])))
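
If you need the low-level signature names inside saved_model.pb (e.g. for TensorFlow Serving), the SavedModel can also be loaded with the generic tf.saved_model API. A short sketch:

import tensorflow as tf

# Load the SavedModel without Keras and inspect its default serving signature.
loaded = tf.saved_model.load('./xor/')
infer = loaded.signatures['serving_default']
print(infer.structured_input_signature)  # input tensor specs
print(infer.structured_outputs)          # output tensor specs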

Convert Model to ONNX:

ONNX is a new standard for exchanging deep learning models. It promises to make deep learning models portable, thus preventing vendor lock-in.

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.

$ pip install onnxruntime
$ pip install tf2onnx
$ python -m tf2onnx.convert --saved-model ./xor/ --opset 9 --output xor.onnx

# INFO - Successfully converted TensorFlow model ./xor/ to ONNX
# INFO - Model inputs: ['input:0']
# INFO - Model outputs: ['output']
# INFO - ONNX model is saved at xor.onnx

By specifying --opset, the user can override the default to generate a graph with the desired opset. For example, --opset 13 would create an ONNX graph that uses only ops available in opset 13. Because older opsets in most cases have fewer ops, some models might not convert with an older opset.

  • https://onnx.ai/
  • https://github.com/onnx/onnx
  • https://github.com/onnx/tensorflow-onnx
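
The onnxruntime package installed above can also run the converted model directly, which is a quick way to verify the export. A minimal sketch (the names 'input:0' and 'output' are the ones reported by tf2onnx during the conversion above):

import numpy as np
import onnxruntime as ort

# Run the converted model with ONNX Runtime on CPU.
session = ort.InferenceSession('./xor.onnx', providers=['CPUExecutionProvider'])
x = np.array([[0, 1]], dtype=np.float32)

# Feed the input by name and fetch the named output; expect a value close to 1.
print(session.run(['output'], {'input:0': x}))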

opencv-predict.py:

import numpy as np
import cv2

print(cv2.__version__)  # 4.5.1

model = cv2.dnn.readNetFromONNX('./xor.onnx')

# 0 xor 0 = [[0.11921611]] ~= 0
model.setInput(np.array([[0, 0]]), name='input:0')
print('0 xor 0 = ', model.forward(outputName='output'))

# 0 xor 1 = [[0.96736085]] ~= 1
model.setInput(np.array([[0, 1]]), name='input:0')
print('0 xor 1 = ', model.forward(outputName='output'))

# 1 xor 0 = [[0.97254556]] ~= 1
model.setInput(np.array([[1, 0]]), name='input:0')
print('1 xor 0 = ', model.forward(outputName='output'))

# 1 xor 1 = [[0.02061491]] ~= 0
model.setInput(np.array([[1, 1]]), name='input:0')
print('1 xor 1 = ', model.forward(outputName='output'))

predict.cpp:

#include <cstdlib>
#include <iostream>
#include <opencv2/opencv.hpp>

int main(int argc, char **argv)
{
    std::cout << CV_VERSION << std::endl; // 4.2.0

    cv::dnn::Net net;

    net = cv::dnn::readNetFromONNX("./xor.onnx");

    // 0 xor 0 = [0.11921611] ~= 0
    float x0[] = { 0, 0 };
    net.setInput(cv::Mat(1, 2, CV_32F, x0), "input:0");
    std::cout << "0 xor 0 = " << net.forward("output") << std::endl;

    // 0 xor 1 = [0.96736085] ~= 1
    float x1[] = { 0, 1 };
    net.setInput(cv::Mat(1, 2, CV_32F, x1), "input:0");
    std::cout << "0 xor 1 = " << net.forward("output") << std::endl;

    // 1 xor 0 = [0.97254556] ~= 1
    float x2[] = { 1, 0 };
    net.setInput(cv::Mat(1, 2, CV_32F, x2), "input:0");
    std::cout << "1 xor 0 = " << net.forward("output") << std::endl;

    // 1 xor 1 = [0.020614909] ~= 0
    float x3[] = { 1, 1 };
    net.setInput(cv::Mat(1, 2, CV_32F, x3), "input:0");
    std::cout << "1 xor 1 = " << net.forward("output") << std::endl;

    return EXIT_SUCCESS;
}

Compile and Run:

$ sudo apt install build-essential pkg-config libopencv-dev
$ g++ predict.cpp `pkg-config --cflags --libs opencv4` -o predict
$ ./predict

Original Answer

The following simple example (XOR example) shows how to export a Keras model (in both .h5 and .pb formats) and how to use the model from Python and C++:


train.py:

import numpy as np
import tensorflow as tf


def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
    Freezes the state of a session into a pruned computation graph.

    Creates a new computation graph where variable nodes are replaced by
    constants taking their current value in the session. The new graph will be
    pruned so subgraphs that are not necessary to compute the requested
    outputs are removed.
    @param session The TensorFlow session to be frozen.
    @param keep_var_names A list of variable names that should not be frozen,
                          or None to freeze all the variables in the graph.
    @param output_names Names of the relevant graph outputs.
    @param clear_devices Remove the device directives from the graph for better portability.
    @return The frozen graph definition.
    """
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ''
        frozen_graph = tf.graph_util.convert_variables_to_constants(
            session, input_graph_def, output_names, freeze_var_names)
        return frozen_graph


X = np.array([[0,0], [0,1], [1,0], [1,1]], 'float32')
Y = np.array([[0], [1], [1], [0]], 'float32')

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(64, input_dim=2, activation='relu'))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['binary_accuracy'])

model.fit(X, Y, batch_size=1, epochs=100, verbose=0)

# inputs:  ['dense_input']
print('inputs: ', [input.op.name for input in model.inputs])

# outputs:  ['dense_4/Sigmoid']
print('outputs: ', [output.op.name for output in model.outputs])

model.save('./xor.h5')

frozen_graph = freeze_session(tf.keras.backend.get_session(), output_names=[out.op.name for out in model.outputs])
tf.train.write_graph(frozen_graph, './', 'xor.pbtxt', as_text=True)
tf.train.write_graph(frozen_graph, './', 'xor.pb', as_text=False)

predict.py:

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('./xor.h5')

# 0 ^ 0 =  [[0.01974997]]
print('0 ^ 0 = ', model.predict(np.array([[0, 0]])))

# 0 ^ 1 =  [[0.99141496]]
print('0 ^ 1 = ', model.predict(np.array([[0, 1]])))

# 1 ^ 0 =  [[0.9897714]]
print('1 ^ 0 = ', model.predict(np.array([[1, 0]])))

# 1 ^ 1 =  [[0.00406971]]
print('1 ^ 1 = ', model.predict(np.array([[1, 1]])))

opencv-predict.py:

import numpy as np
import cv2 as cv


model = cv.dnn.readNetFromTensorflow('./xor.pb')

# 0 ^ 0 =  [[0.01974997]]
model.setInput(np.array([[0, 0]]), name='dense_input')
print('0 ^ 0 = ', model.forward(outputName='dense_4/Sigmoid'))

# 0 ^ 1 =  [[0.99141496]]
model.setInput(np.array([[0, 1]]), name='dense_input')
print('0 ^ 1 = ', model.forward(outputName='dense_4/Sigmoid'))

# 1 ^ 0 =  [[0.9897714]]
model.setInput(np.array([[1, 0]]), name='dense_input')
print('1 ^ 0 = ', model.forward(outputName='dense_4/Sigmoid'))

# 1 ^ 1 =  [[0.00406971]]
model.setInput(np.array([[1, 1]]), name='dense_input')
print('1 ^ 1 = ', model.forward(outputName='dense_4/Sigmoid'))

predict.cpp:

#include <cstdlib>
#include <iostream>
#include <opencv2/opencv.hpp>

int main(int argc, char **argv)
{
    cv::dnn::Net net;

    net = cv::dnn::readNetFromTensorflow("./xor.pb");

    // 0 ^ 0 = [0.018541215]
    float x0[] = { 0, 0 };
    net.setInput(cv::Mat(1, 2, CV_32F, x0), "dense_input");
    std::cout << "0 ^ 0 = " << net.forward("dense_4/Sigmoid") << std::endl;

    // 0 ^ 1 = [0.98295897]
    float x1[] = { 0, 1 };
    net.setInput(cv::Mat(1, 2, CV_32F, x1), "dense_input");
    std::cout << "0 ^ 1 = " << net.forward("dense_4/Sigmoid") << std::endl;

    // 1 ^ 0 = [0.98810625]
    float x2[] = { 1, 0 };
    net.setInput(cv::Mat(1, 2, CV_32F, x2), "dense_input");
    std::cout << "1 ^ 0 = " << net.forward("dense_4/Sigmoid") << std::endl;

    // 1 ^ 1 = [0.010002014]
    float x3[] = { 1, 1 };
    net.setInput(cv::Mat(1, 2, CV_32F, x3), "dense_input");
    std::cout << "1 ^ 1 = " << net.forward("dense_4/Sigmoid") << std::endl;

    return EXIT_SUCCESS;
}