Keras replacing input layer
The code that I have (which I can't change) uses ResNet50 with my_input_tensor as the input_tensor:
model1 = keras.applications.resnet50.ResNet50(input_tensor=my_input_tensor, weights='imagenet')
Investigating the source code, I found that the ResNet50 function creates a new Keras Input layer from my_input_tensor and then builds the rest of the model on top of it. This is the behavior I want to copy with my own model, which I load from an h5 file:
model2 = keras.models.load_model('my_model.h5')
Since this model already has an Input layer, I want to replace it with a new Input layer defined with my_input_tensor.
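For reference, the Input-layer creation I am referring to in the applications source looks roughly like this (paraphrased sketch, not the exact code):
# inside keras.applications.resnet50.ResNet50 (roughly)
if input_tensor is None:
    img_input = layers.Input(shape=input_shape)
else:
    img_input = layers.Input(tensor=input_tensor, shape=input_shape)
# ... the rest of the network is then built on top of img_input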
How can I replace an input layer?
Solution 1:
When you saved your model using:
old_model.save('my_model.h5')
it saves the following:
- The architecture of the model, allowing the model to be re-created.
- The weights of the model.
- The training configuration of the model (loss, optimizer).
- The state of the optimizer, allowing training to resume exactly where you left off.
So then, when you load the model:
res50_model = load_model('my_model.h5')
you should get the same model back, which you can verify using:
res50_model.summary()
res50_model.get_weights()
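As an optional check of my own (assuming old_model is still in memory), you can compare the reloaded weights against the originals:
import numpy as np

# the reloaded weights should match the ones that were saved
for w_old, w_new in zip(old_model.get_weights(), res50_model.get_weights()):
    assert np.allclose(w_old, w_new)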
Now you can pop the input layer and add your own:
res50_model.layers.pop(0)
res50_model.summary()
Then add a new input layer and wrap the old model:
from keras.layers import Input
from keras.models import Model

newInput = Input(batch_shape=(None, 299, 299, 3))  # the new input layer
newOutputs = res50_model(newInput)
newModel = Model(newInput, newOutputs)
newModel.summary()
res50_model.summary()
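As a quick sanity check (my own addition, using a hypothetical batch of random 299x299x3 images), push a dummy batch through the rebuilt model:
import numpy as np

# hypothetical smoke test with the new input shape
dummy_batch = np.random.rand(4, 299, 299, 3)
preds = newModel.predict(dummy_batch)
print(preds.shape)  # the output shape depends on your model's final layer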
Solution 2:
model.layers.pop(0) or anything like that doesn't work.
You have two options that you can try:
1.
You can create a new model with the required layers.
A relatively easy way to do this is to i) extract the model json configuration, ii) change it appropriately, iii) create a new model from it, and then iv) copy over the weights. I'll just show the basic idea.
i) extract the configuration
model_config = model.get_config()
ii) change the configuration
input_layer_name = model_config['layers'][0]['name']  # keep the old input layer's name in case you need to update references to it
model_config['layers'][0] = {
    'name': 'new_input',
    'class_name': 'InputLayer',
    'config': {
        'batch_input_shape': (None, 300, 300),
        'dtype': 'float32',
        'sparse': False,
        'name': 'new_input'
    },
    'inbound_nodes': []
}
model_config['layers'][1]['inbound_nodes'] = [[['new_input', 0, 0, {}]]]
model_config['input_layers'] = [['new_input', 0, 0]]
iii) create a new model
new_model = model.__class__.from_config(model_config, custom_objects={}) # change custom objects if necessary
iv) copy the weights
# iterate over all the layers that we want to get weights from
weights = [layer.get_weights() for layer in model.layers[1:]]
for layer, weight in zip(new_model.layers[1:], weights):
    layer.set_weights(weight)
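To convince yourself that the rebuild worked, a small verification sketch (my addition, assuming the shapes used above) could look like this:
import numpy as np

# the new input shape should now be reflected in the model...
print(new_model.input_shape)  # expected: (None, 300, 300)

# ...and the copied weights should match the originals layer by layer
for old_layer, new_layer in zip(model.layers[1:], new_model.layers[1:]):
    for w_old, w_new in zip(old_layer.get_weights(), new_layer.get_weights()):
        assert np.array_equal(w_old, w_new)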
2.
You can try a library like kerassurgeon (I am linking to a fork that works with the tensorflow keras version). Note that insertion and deletion operations only work under certain conditions such as compatible dimensions.
from kerassurgeon.operations import delete_layer, insert_layer
# delete layer_1 from a model
model = delete_layer(model, layer_1)
# insert new_layer_1 before layer_2 in a model
model = insert_layer(model, layer_2, new_layer_1)
Solution 3:
The solution from @MilindDeore did not work for me, unfortunately. While I can print the summary of the new model, I receive a "Matrix size incompatible" error upon prediction. I guess this makes sense, since the new input shape of the dense layer does not match the shape of the old dense layer weights.
Thus, here is another solution. The key for me was to use "_layers" instead of "layers". The latter only seems to return a copy.
import keras
import numpy as np


def get_model():
    old_input_shape = (20, 20, 3)
    model = keras.models.Sequential()
    model.add(keras.layers.Conv2D(9, (3, 3), padding="same", input_shape=old_input_shape))
    model.add(keras.layers.MaxPooling2D((2, 2)))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['acc'])
    model.summary()
    return model


def change_model(model, new_input_shape=(None, 40, 40, 3)):
    # replace the input shape of the first layer
    model._layers[1].batch_input_shape = new_input_shape

    # feel free to modify additional parameters of other layers, for example...
    model._layers[2].pool_size = (8, 8)
    model._layers[2].strides = (8, 8)

    # rebuild the model architecture by exporting and importing via json
    new_model = keras.models.model_from_json(model.to_json())
    new_model.summary()

    # copy weights from the old model to the new one
    for layer in new_model.layers:
        try:
            layer.set_weights(model.get_layer(name=layer.name).get_weights())
        except:
            print("Could not transfer weights for layer {}".format(layer.name))

    # test the new model on a random input image
    X = np.random.rand(10, 40, 40, 3)
    y_pred = new_model.predict(X)
    print(y_pred)

    return new_model


if __name__ == '__main__':
    model = get_model()
    new_model = change_model(model)
Solution 4:
Unfortunately, kerassurgeon did not support my model, as I had frozen layers. I had to make a small change to @MilindDeore's solution: replace model.layers.pop(0) with model._layers.pop(0), and it worked for me. Note that I am using tf.keras in TF 2.0.
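For completeness, here is a minimal sketch of that tweak (assuming tf.keras in TF 2.0, a model loaded from an h5 file, and a hypothetical 299x299x3 replacement input):
import tensorflow as tf

model = tf.keras.models.load_model('my_model.h5')

# pop the original InputLayer via the private _layers list (tf.keras, TF 2.0)
model._layers.pop(0)

# wire the remaining layers to a new input tensor
new_input = tf.keras.Input(shape=(299, 299, 3))
new_outputs = model(new_input)
new_model = tf.keras.Model(new_input, new_outputs)
new_model.summary()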