TF Keras v 1.14+: subclass model or subclass layer for "module"

TensorFlow has some docs for subclassing the (tf) Keras Model and Layer classes.

However, it is unclear which to use for "modules" or "blocks" (e.g. several layers collectively).

Since a block is technically several layers, I feel that subclassing Layer would be misleading. Subclassing Model works, but I am unsure whether doing so carries any penalties.

e.g.

x = inputs
a = self.dense_1(x) # <--- self.dense_1 = tf.keras.layers.Dense(...)
b = self.dense_2(a)
c = self.add([x, b])

Which is appropriate to use?


Solution 1:

(Please note that this answer is old. Keras later changed to allow and regularly use subclassing.)

To begin with, there is no need to subclass anything with Keras. Unless you have a particular reason for it (and building, training, or predicting is not one), you don't subclass in Keras.

Building a Keras model:

Either using Sequential (the model is ready already, just add layers), or using Model (create a graph with layers and finally call Model(inputs, outputs)), you don't need to subclass.
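As a minimal sketch of the functional approach, assuming TensorFlow 2.x, the block from the question can be written as a plain function that wires layers together. The helper name residual_block and the sizes are hypothetical, chosen only for illustration:

```python
import tensorflow as tf


def residual_block(x, units):
    """Hypothetical helper: two Dense layers plus a skip connection."""
    a = tf.keras.layers.Dense(units, activation="relu")(x)
    b = tf.keras.layers.Dense(units)(a)
    # The skip connection needs x and b to have the same last dimension.
    return tf.keras.layers.Add()([x, b])


# Build the graph, then wrap it with Model(inputs, outputs).
inputs = tf.keras.Input(shape=(16,))
outputs = residual_block(inputs, 16)
model = tf.keras.Model(inputs, outputs)

# The model is immediately usable, no subclass involved.
out = model(tf.zeros((2, 16)))
```

Calling residual_block several times in a row stacks several blocks, each with its own weights.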

The moment you create an instance of Sequential or Model, you have a fully defined model, ready to use in all situations.

This model can even be used as part of other models, and its layers can easily be accessed to get intermediate outputs and create new branches in your graph.
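A short sketch of both uses, assuming TensorFlow 2.x. The layer names ("hidden", "head", "branch") and the shapes are made up for the example:

```python
import tensorflow as tf

# A small functional model with named layers.
inputs = tf.keras.Input(shape=(8,))
h = tf.keras.layers.Dense(4, activation="relu", name="hidden")(inputs)
outputs = tf.keras.layers.Dense(1, name="head")(h)
base = tf.keras.Model(inputs, outputs, name="base")

# Branch from an intermediate layer without touching the original model.
hidden_out = base.get_layer("hidden").output
branch = tf.keras.layers.Dense(3, name="branch")(hidden_out)
two_headed = tf.keras.Model(base.input, [base.output, branch])

# Use the whole model as if it were a single layer inside another model.
new_inputs = tf.keras.Input(shape=(8,))
stacked_outputs = tf.keras.layers.Dense(1)(base(new_inputs))
stacked = tf.keras.Model(new_inputs, stacked_outputs)

head_out, branch_out = two_headed(tf.zeros((2, 8)))
stacked_out = stacked(tf.zeros((2, 8)))
```

Here two_headed shares the "hidden" and "head" weights with base, so training one also updates the other.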

So, I don't see any reason at all to subclass Model, unless you are using some additional framework that requires it (and I don't think that is the case). This seems to come from PyTorch users, because this kind of model building is typical for PyTorch: create a subclass of Module, add layers, and write a call method. But PyTorch doesn't offer the same ease as Keras does for getting intermediate results.

The main advantage of using Keras is exactly this: you can easily access layers from blocks and models and instantly start branching from that point, without needing to rewrite any call methods or add any extra code to the models. So, when you subclass Model, you just defeat the purpose of Keras and make everything more difficult.

The docs you mentioned say:

Model subclassing is particularly useful when eager execution is enabled since the forward pass can be written imperatively.

I don't really understand what "imperatively" means, and I don't see how it would be easier than just building a model with regular layers.
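For comparison, here is a sketch of the subclassed style the docs refer to, assuming TensorFlow 2.x. As far as I can tell, "imperatively" refers to writing the forward pass as ordinary Python statements inside call, which run step by step under eager execution. The class name ResidualBlock is hypothetical; the body follows the snippet from the question:

```python
import tensorflow as tf


class ResidualBlock(tf.keras.Model):
    """Hypothetical subclassed version of the block from the question."""

    def __init__(self, units):
        super().__init__()
        self.dense_1 = tf.keras.layers.Dense(units, activation="relu")
        self.dense_2 = tf.keras.layers.Dense(units)
        self.add = tf.keras.layers.Add()

    def call(self, inputs):
        # Plain Python: each line executes immediately in eager mode.
        x = inputs
        a = self.dense_1(x)
        b = self.dense_2(a)
        return self.add([x, b])


block = ResidualBlock(16)
y = block(tf.zeros((2, 16)))  # weights are created on this first call
```

Note that the intermediate tensors a and b exist only inside call, so branching from them means editing the class, which is the drawback described above.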

Another quote from the docs:

Key Point: Use the right API for the job. While model subclassing offers flexibility, it comes at a cost of greater complexity and more opportunities for user errors. If possible, prefer the functional API.

Well... it's always possible.

Subclassing layers

Here, there may be good reasons to do so. And these reasons are:

  • You want a layer that performs custom calculations that are not available with regular layers.
  • This layer must have persistent weights.

If you don't need "both" things above, you don't need to subclass a layer. If you just want "custom calculations" without weights, a Lambda layer is enough.
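The two cases can be sketched as follows, assuming TensorFlow 2.x. The layer name ScaleShift and its computation are invented for illustration:

```python
import tensorflow as tf


class ScaleShift(tf.keras.layers.Layer):
    """Hypothetical layer: custom calculation AND persistent weights."""

    def build(self, input_shape):
        dim = input_shape[-1]
        # Trainable weights that persist across calls and get saved.
        self.scale = self.add_weight(
            name="scale", shape=(dim,), initializer="ones", trainable=True
        )
        self.shift = self.add_weight(
            name="shift", shape=(dim,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        return inputs * self.scale + self.shift


# Custom calculation without weights: a Lambda layer is enough.
double = tf.keras.layers.Lambda(lambda t: t * 2.0)

x = tf.ones((2, 3))
scale_shift = ScaleShift()
y = scale_shift(x)   # scale=1, shift=0 at initialization, so y equals x
z = double(x)        # every element becomes 2.0
```

Only ScaleShift contributes trainable weights; the Lambda layer is stateless.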