Can I train the initial hidden state of an RNN to represent the initial conditions of my model?
I have some time-series data related to a bioreactor. Every 24h I feed glucose to the bioreactor and measure how much of certain substances it produced since the last feed.
Input: Glucose feed.
Output: Production of substances.
Objective: Estimate these substances' concentrations over time, given the glucose I fed.
This bioreactor has some initial conditions, like initial concentration of glucose and substances. Each experiment has a different initial condition. In one experiment I can start with 10mM of a substance, and in another I can start with 100mM, so knowing the starting point is important.
I wanted to use this initial condition to train the initial hidden state of my RNN.
Is there any way that I can do that? If not, are there other ways to express initial conditions to an RNN? I am using Python with Keras. Thanks!
In code, I believe it would look something like this:
from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
input_layer = Input(shape=(16,3))
hidden_state_dim = 7
mlp_inp = Input(batch_shape=(hidden_state_dim,1))
mlp_dense_h = Dense(hidden_state_dim, activation='relu')(mlp_inp)
mlp_dense_c = Dense(hidden_state_dim, activation='relu')(mlp_inp)
x = LSTM(7, return_sequences=True)(input_layer, initial_state=[mlp_dense_h, mlp_dense_c])
model = Model(input_layer, x)
But I receive a ValueError: Graph disconnected, probably because there is no backpropagation path to mlp_dense_h/c.
The reason you receive the error is that you did not include mlp_inp as one of the inputs to the Model. The following revised code works without that error:
from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import plot_model  # use tensorflow.keras consistently; mixing in plain keras can cause errors
input_layer = Input(shape=(16,3))
hidden_state_dim = 7
mlp_inp = Input(batch_shape=(hidden_state_dim, 1))  # note: this fixes the batch size to 7; shape=(1,) would keep it flexible
mlp_dense_h = Dense(hidden_state_dim, activation='relu')(mlp_inp)
mlp_dense_c = Dense(hidden_state_dim, activation='relu')(mlp_inp)
x = LSTM(7, return_sequences=True)(input_layer, initial_state=[mlp_dense_h, mlp_dense_c])
model = Model(inputs=[input_layer, mlp_inp], outputs=x) # Here is the change
plot_model(model, to_file='IC.png')
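In case it helps, here is a fuller sketch of how I would wire this up end to end, with one initial-condition vector per experiment (per sample) so the batch dimension stays flexible, and a trainable `fit` call. The dimensions (`timesteps`, `n_features`, `n_ic`, `units`) and the random toy data are all assumptions for illustration, not from your setup:

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model

# Hypothetical dimensions for illustration
timesteps = 16   # e.g. 16 daily feed/measurement steps
n_features = 3   # glucose-feed features per step
n_ic = 2         # initial concentrations (e.g. glucose, substance)
units = 7        # LSTM hidden size

seq_inp = Input(shape=(timesteps, n_features))
ic_inp = Input(shape=(n_ic,))  # one IC vector per experiment (per sample)

# Learn a mapping from the initial conditions to the LSTM's initial h and c.
# tanh keeps the values in the same range as LSTM states.
init_h = Dense(units, activation='tanh')(ic_inp)
init_c = Dense(units, activation='tanh')(ic_inp)

x = LSTM(units, return_sequences=True)(seq_inp, initial_state=[init_h, init_c])
out = Dense(1)(x)  # predicted substance concentration at each step

model = Model(inputs=[seq_inp, ic_inp], outputs=out)
model.compile(optimizer='adam', loss='mse')

# Toy data: 4 experiments, each with its own initial-condition vector
X_seq = np.random.rand(4, timesteps, n_features)
X_ic = np.random.rand(4, n_ic)
y = np.random.rand(4, timesteps, 1)
model.fit([X_seq, X_ic], y, epochs=1, verbose=0)
```

Because both Dense layers sit inside the Model's input/output graph, gradients flow back into them during training, so the IC-to-state mapping is learned jointly with the LSTM.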
I am currently working on a similar RNN problem. Maybe we can discuss this interesting problem further.