Can I train the initial hidden state of a RNN to represent the initial conditions of my model?

I have some time-series data related to a bioreactor. Every 24h I feed glucose to the bioreactor and measure how much of some substances it produced since last feed.

Input: Glucose feed.

Output: Production of substances.

Objective: Estimate these substances concentrations over time, given the glucose I fed.

This bioreactor has some initial conditions, like initial concentration of glucose and substances. Each experiment has a different initial condition. In one experiment I can start with 10mM of a substance, and in another I can start with 100mM, so knowing the starting point is important.

I wanted to use this initial condition to train the initial hidden state of my RNN.


Is there any way that I can do that? If not, are there other ways to express initial conditions to an RNN? I am using Python with Keras. Thanks!

In code, I believe it would look something like this:

from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

input_layer = Input(shape=(16,3))
hidden_state_dim = 7

mlp_inp = Input(batch_shape=(hidden_state_dim,1))
mlp_dense_h = Dense(hidden_state_dim, activation='relu')(mlp_inp)
mlp_dense_c = Dense(hidden_state_dim, activation='relu')(mlp_inp)

x = LSTM(7, return_sequences=True)(input_layer, initial_state=[mlp_dense_h, mlp_dense_c])

model = Model(input_layer, x)

But I receive a `ValueError: Graph disconnected`. Probably because there is no backpropagation path to mlp_dense_h/c.


You receive the error because you did not include mlp_inp as one of the inputs of the Model. The following revised code works without the error:

from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model  # use tensorflow.keras consistently

input_layer = Input(shape=(16, 3))
hidden_state_dim = 7

# One scalar initial condition per sample. Using shape=(1,) leaves the batch
# dimension unspecified, so it matches input_layer's batch dimension
# (batch_shape=(hidden_state_dim, 1) would fix the batch size at 7).
mlp_inp = Input(shape=(1,))
mlp_dense_h = Dense(hidden_state_dim, activation='relu')(mlp_inp)
mlp_dense_c = Dense(hidden_state_dim, activation='relu')(mlp_inp)

x = LSTM(7, return_sequences=True)(input_layer, initial_state=[mlp_dense_h, mlp_dense_c])

model = Model(inputs=[input_layer, mlp_inp], outputs=x)  # Here is the change
plot_model(model, to_file='IC.png')
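
To check that gradients actually flow into the state-initializing Dense layers, you can compile the two-input model and fit it on toy data. This is a minimal sketch; the array shapes and the random data are assumptions chosen to match the snippet above (32 experiments, one scalar initial condition each):

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model

# Rebuild the two-input model: sequence input plus initial-condition input.
seq_input = Input(shape=(16, 3))   # 16 timesteps, 3 features per step
ic_input = Input(shape=(1,))       # one scalar initial condition per sample
h0 = Dense(7, activation='relu')(ic_input)  # learned initial hidden state
c0 = Dense(7, activation='relu')(ic_input)  # learned initial cell state
out = LSTM(7, return_sequences=True)(seq_input, initial_state=[h0, c0])
model = Model(inputs=[seq_input, ic_input], outputs=out)
model.compile(optimizer='adam', loss='mse')

# Toy data: 32 experiments, each with its own initial condition.
X_seq = np.random.rand(32, 16, 3)
X_ic = np.random.rand(32, 1)
y = np.random.rand(32, 16, 7)

# Both inputs are passed together; the Dense layers are trained end to end.
model.fit([X_seq, X_ic], y, epochs=1, verbose=0)
```

Because the batch dimension of both inputs is left unspecified, the same model accepts any number of experiments per batch.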


I am currently working on a similar RNN problem. Maybe we can discuss this interesting problem further.