Python: Fixing the Random Seed for Reproducibility

This post shows how to fix the random seed to get reproducible results with Keras.



Fixing the Random Seed for Reproducible Work




Fixing the random seeds


To illustrate, a simple DNN model is implemented in a Jupyter Notebook, together with a helper function that fixes the random seeds. The function seed_everything() is borrowed from https://dacon.io/codeshare/2363.

Every time we run this code as a whole, from top to bottom, the prediction is always 96.562645.


# Import library and define functions
 
# In[1]:
_________________________________________________________________
from keras.models import Sequential
from keras.layers import Dense
 
import tensorflow as tf
import numpy as np
import random
import os
 
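def seed_everything(seed: int = 42):
    random.seed(seed)                         # Python's built-in RNG
    np.random.seed(seed)                      # NumPy's global RNG
    os.environ["PYTHONHASHSEED"] = str(seed)  # hash seed environment variable
    tf.random.set_seed(seed)                  # TensorFlow's global seed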
_________________________________________________________________
 
# Simple DNN model with a fixed random seed 
# In[2]:
_________________________________________________________________
x = np.array([[10, 20, 30], [20, 30, 40], [30, 40, 50], [40, 50, 60]])
y = np.array([40, 50, 60, 70])
 
seed_everything(1)  # fix the random seed
 
model = Sequential()
model.add(Dense(50, activation='relu', input_dim=3))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(x, y, epochs=1000, verbose=0)
 
x_input = np.array([60, 70, 80])
x_input = x_input.reshape((1,3))
pred = model.predict(x_input, verbose=0)
print(pred)
_________________________________________________________________
 
[[96.562645]]
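
Why a full run is repeatable can be seen without Keras at all. The sketch below is a simplification that uses only Python's built-in random module (seed_everything() above also seeds NumPy and TensorFlow), and the helper name draw() is made up for this illustration. Re-seeding restarts the pseudo-random stream, so the same draws come back, while continuing without re-seeding moves further along the stream.

```python
import random

def draw(n=3):
    """Return n pseudo-random floats from Python's global generator."""
    return [random.random() for _ in range(n)]

random.seed(1)
first = draw()

# No re-seeding here: the generator simply continues its stream.
second = draw()

random.seed(1)   # restart the stream from the same seed
third = draw()

print(first == third)    # same seed, same position in the stream
print(first == second)   # same seed, but a later position in the stream
```

The same idea carries over to a notebook: what matters is not only which seed is set, but also how many random draws have happened since it was set.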
 



Robustness checks


However, when we rerun only some parts of the above code in the same notebook, the results can differ from the first run, as follows.

# rerun model fit
# In[3]:
_________________________________________________________________
model.fit(x, y, epochs=1000, verbose=0)
x_input = np.array([60, 70, 80])
x_input = x_input.reshape((1,3))
pred = model.predict(x_input, verbose=0)
print(pred)
_________________________________________________________________
[[90.000145]]
 
# rerun model.fit with a fixed random seed
# In[4]:
_________________________________________________________________
seed_everything(1)
 
model.fit(x, y, epochs=1000, verbose=0)
x_input = np.array([60, 70, 80])
x_input = x_input.reshape((1,3))
pred = model.predict(x_input, verbose=0)
print(pred)
_________________________________________________________________
[[90.00001]]
 
 
# rerun model.compile & model.fit with a fixed random seed
# In[5]:
_________________________________________________________________
seed_everything(1)
 
model.compile(optimizer='adam', loss='mse')
model.fit(x, y, epochs=1000, verbose=0)
x_input = np.array([60, 70, 80])
x_input = x_input.reshape((1,3))
pred = model.predict(x_input, verbose=0)
print(pred)
_________________________________________________________________
[[89.99999]]
 
# call seed_everything(1) before model = Sequential()
# In[6]:
_________________________________________________________________
seed_everything(1)
 
model = Sequential()
model.add(Dense(50, activation='relu', input_dim=3))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(x, y, epochs=1000, verbose=0)
x_input = np.array([60, 70, 80])
x_input = x_input.reshape((1,3))
pred = model.predict(x_input, verbose=0)
print(pred)
_________________________________________________________________
[[96.562645]]
 


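The bottom line is that, to reproduce the same result from the same model several times in one file, seed_everything() needs to be called before model = Sequential(). Re-seeding before model.fit() or model.compile() alone is not enough: the existing model object keeps its already-trained weights, so training continues from where it left off instead of restarting from the seeded initial weights.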
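
The same conclusion can be sketched with a toy stand-in that needs no Keras. Everything here is hypothetical and only for illustration: init_weight() plays the role of model = Sequential(), fit() plays the role of model.fit() as a few epochs of shuffled gradient descent on a single weight, and the data values are invented. It is not how Keras works internally, only the same ordering issue in miniature.

```python
import random

# Toy data, roughly y = 2x (invented values for illustration)
DATA = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]

def init_weight():
    """Stand-in for model creation: draws a random initial weight."""
    return random.uniform(-1.0, 1.0)

def fit(w, epochs=5, lr=0.01):
    """Stand-in for model.fit(): shuffled SGD on the model y = w * x."""
    for _ in range(epochs):
        order = DATA[:]
        random.shuffle(order)
        for x, y in order:
            w -= lr * 2 * (w * x - y) * x  # gradient step for squared error
    return w

def train_from_scratch(seed):
    random.seed(seed)    # seed first ...
    w = init_weight()    # ... then create the "model"
    return fit(w)

first = train_from_scratch(1)

random.seed(1)           # re-seeding alone ...
continued = fit(first)   # ... still continues from the trained weight

again = train_from_scratch(1)

print(first == again)      # seed + fresh model: identical result
print(first == continued)  # seed + old model: a different result
```

Wrapping seeding, model creation, and fitting in one function, as train_from_scratch() does here, is one way to make sure a notebook cell cannot be rerun with a stale model.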

