Why can't I get reproducible results in Keras even though I set the random seeds?
Solution 1:
You can find the answer at Keras docs: https://keras.io/getting-started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development.
In short, to be absolutely sure that you will get reproducible results with your python script on one computer's/laptop's CPU then you will have to do the following:
- Set
PYTHONHASHSEED
environment variable at a fixed value - Set
python
built-in pseudo-random generator at a fixed value - Set
numpy
pseudo-random generator at a fixed value - Set
tensorflow
pseudo-random generator at a fixed value - Configure a new global
tensorflow
session
Following the Keras
link at the top, the source code I am using is the following:
# Seed value
# Apparently you may use different seed values at each stage
seed_value= 0
# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)
# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)
# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)
# 4. Set the `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
tf.random.set_seed(seed_value)
# for later versions:
# tf.compat.v1.set_random_seed(seed_value)
# 5. Configure a new global `tensorflow` session
from keras import backend as K
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)
# for later versions:
# session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
# sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
# tf.compat.v1.keras.backend.set_session(sess)
It is needless to say that you do not have to to specify any seed
or random_state
at the numpy
, scikit-learn
or tensorflow
/keras
functions that you are using in your python script exactly because with the source code above we set globally their pseudo-random generators at a fixed value.
Solution 2:
The key point of making result reproducible is to disable GPU. See my answer at another question (link https://stackoverflow.com/a/57121117/9501391) which has been accepted.