Keras Tuner for the optimization of hyperparameters
To run the Keras Tuner with Yahoo Finance data (Tesla stock price), two packages should be installed: keras-tuner and yfinance.
To install these packages, type the following commands in a Jupyter notebook.
```python
!pip install yfinance
!pip install keras-tuner
```
Data and some useful functions
I apply a simple Keras deep learning model to Tesla stock returns for illustration purposes. After the data is loaded and transformed into returns, the lagged input sequences are constructed with the f_sequence_to_supervised() function and an 80:20 train-test split is performed.
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf   #!pip install yfinance
%matplotlib inline

import tensorflow as tf
import random
import os

def seed_everything(seed: int = 42):
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    tf.random.set_seed(seed)

# Construct X and y from a sequence using a sliding window of n_steps lags
def f_sequence_to_supervised(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        end_ix = i + n_steps
        if end_ix > len(sequence)-1:
            break
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x); y.append(seq_y)
    return np.array(X), np.array(y)

# read the stock price
df_stkp = yf.download('TSLA', start='2020-01-01', end='2023-01-31', progress=False)
df_stkp = df_stkp['Close']

# convert the stock price to daily stock returns and plot them
df_stkr = df_stkp.pct_change().dropna()*100   # percent
df_stkr.plot(title="TSLA's stock return")

n_steps = 12   # stock return (%) with its lags
X, y = f_sequence_to_supervised(df_stkr.tolist(), n_steps)

# 80:20 train-test split
index1 = int(round(len(X)*0.8))
X_train, X_test = X[:index1], X[index1:]
y_train, y_test = y[:index1], y[index1:]
```
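To see what f_sequence_to_supervised() returns, a quick toy check (not part of the original code) can help: with n_steps=3, each row of X contains three consecutive lags and the corresponding entry of y is the next value in the sequence.

```python
# toy check of the sliding-window construction (illustrative addition)
X_toy, y_toy = f_sequence_to_supervised([1, 2, 3, 4, 5, 6], n_steps=3)
print(X_toy)   # [[1 2 3]
               #  [2 3 4]
               #  [3 4 5]]
print(y_toy)   # [4 5 6]
```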
Keras model fitting and forecasting with predetermined hyperparameters
As a simple example, the following deep learning model is constructed and fitted; the fitted model is then used to forecast on the test data.
```python
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from keras.callbacks import EarlyStopping
from keras import optimizers

# define a simple Sequential model
seed_everything(1)   # fix the random seed

model = Sequential()
model.add(LSTM(50, input_shape=(n_steps, 1)))
model.add(Dropout(0.2))
model.add(Dense(6))
model.add(Dense(1))

model.compile(loss='mean_squared_error',
              optimizer='adam',
              metrics=['mse'])

# fit the model
early_stop = EarlyStopping(monitor='loss', patience=50, verbose=0)
model.fit(X_train, y_train, epochs=1000, batch_size=32,
          verbose=2, callbacks=[early_stop])

# Forecast using test data
y_pred = model.predict(X_test, verbose=0)

plt.figure().set_figwidth(12)
plt.plot(np.c_[y_test, y_pred])
plt.legend(('Test data', 'Forecast'))
plt.show()
```
Fitting the above model produces the training log below; the forecast on the test data is then plotted.
```
Epoch 1/1000
20/20 - 2s - loss: 25.8200 - mse: 25.8200 - 2s/epoch - 84ms/step
Epoch 2/1000
20/20 - 0s - loss: 25.6928 - mse: 25.6928 - 75ms/epoch - 4ms/step
Epoch 3/1000
20/20 - 0s - loss: 25.6435 - mse: 25.6435 - 83ms/epoch - 4ms/step
Epoch 4/1000
20/20 - 0s - loss: 25.4286 - mse: 25.4286 - 84ms/epoch - 4ms/step
⋮
Epoch 532/1000
20/20 - 0s - loss: 1.8256 - mse: 1.8256 - 103ms/epoch - 5ms/step
Epoch 533/1000
20/20 - 0s - loss: 1.6459 - mse: 1.6459 - 103ms/epoch - 5ms/step
Epoch 534/1000
20/20 - 0s - loss: 1.8238 - mse: 1.8238 - 107ms/epoch - 5ms/step
Epoch 535/1000
20/20 - 0s - loss: 1.7009 - mse: 1.7009 - 114ms/epoch - 6ms/step
Epoch 536/1000
20/20 - 0s - loss: 1.6138 - mse: 1.6138 - 110ms/epoch - 6ms/step
```
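To complement the plot, the fit on the test data can also be quantified. The snippet below is a small illustrative addition (mse_test and rmse_test are not in the original code); the exact values depend on the training run.

```python
# quantify the baseline forecast on the test data (illustrative addition)
mse_test = np.mean((y_test - y_pred.flatten())**2)   # mean squared error
rmse_test = np.sqrt(mse_test)                         # in return (%) units
print(f"Test MSE  : {mse_test:.4f}")
print(f"Test RMSE : {rmse_test:.4f}")
```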
Set up and Run Keras Tuner
Keras Tuner is applied in three stages in the next code. First, the hyperparameters of interest are selected together with their candidate values using methods such as hp.Int(), hp.Float(), and so on. More methods are described on the Keras website (https://keras.io/api/keras_tuner/hyperparameters/).
Second, the Keras model that uses these methods is implemented in a user-defined build function (f_build_model() in our case; the name can, of course, be changed).
Third, using this information, a tuner is defined with kt.RandomSearch() and its tuner.search() function is called. The role of the latter is essentially the same as model.fit() in a standard Keras model. In particular, I use overwrite=True to avoid an unexpected reloading of previous output.
```python
import keras.backend as K
import keras_tuner as kt   #!pip install keras-tuner
from keras_tuner.tuners import RandomSearch
from keras_tuner.engine.hyperparameters import HyperParameters

# options for the hyperparameters "hp" are specified
def f_build_model(hp):
    seed_everything(1)   # fix the random seed

    model = Sequential()
    model.add(LSTM(50, input_shape=(n_steps, 1)))

    # model.add(Dropout(0.2))
    hp_rate = hp.Float('dropout1_rate', min_value=0.0, max_value=0.4, step=0.2)
    model.add(Dropout(hp_rate))

    # model.add(Dense(6))
    hp_unit = hp.Int('dense1_units', min_value=3, max_value=9, step=3)
    model.add(Dense(hp_unit))

    model.add(Dense(1))

    model.compile(loss='mean_squared_error',
                  optimizer='adam',
                  metrics=['mse'])
    return model

# define the tuner
tuner = kt.RandomSearch(f_build_model,
                        overwrite=True,
                        objective='mse',
                        max_trials=50,
                        executions_per_trial=1)

# instead of fitting the model,
# run the search function on the tuner object
tuner.search(x=X_train, y=y_train,
             epochs=1000, batch_size=32,
             verbose=1, callbacks=[early_stop])
```
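Besides hp.Float() and hp.Int(), the other methods listed on the Keras Tuner hyperparameters page (hp.Choice(), hp.Boolean(), and so on) are used in the same way inside a build function. The following is only a hypothetical sketch; f_build_model_more, lstm1_units, use_dropout, and lr are illustrative names that do not appear in the model above.

```python
# hypothetical sketch of other hyperparameter methods (not used in this post's model)
def f_build_model_more(hp):
    model = Sequential()
    # hp.Choice: pick from an explicit list of candidate values
    model.add(LSTM(hp.Choice('lstm1_units', values=[25, 50, 100]),
                   input_shape=(n_steps, 1)))
    # hp.Boolean: switch an optional layer on or off
    if hp.Boolean('use_dropout'):
        model.add(Dropout(0.2))
    model.add(Dense(1))
    # hp.Float with sampling='log': search the learning rate on a log scale
    lr = hp.Float('lr', min_value=1e-4, max_value=1e-2, sampling='log')
    model.compile(loss='mean_squared_error',
                  optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  metrics=['mse'])
    return model
```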
Because Keras Tuner performs a random search, it does not consider all combinations of hyperparameters, which would be time-consuming and practically ineffective; some restriction on the number of combinations is therefore needed. This is max_trials, the maximum number of hyperparameter combinations to try. In this example the grid contains only 3 × 3 = 9 combinations (three dropout rates and three unit counts), so the search stops after 9 trials even though max_trials=50.
It is standard practice to run model.fit() several times and average the output to reduce the effect of parameter initialization. For this kind of robustness, executions_per_trial is used; this parameter is the number of models that should be built and fitted for each trial.
Running the above code produces the following output at the end of the search.
```
Trial 9 Complete [00h 01m 05s]
mse: 1.3944592475891113

Best mse So Far: 0.004801980219781399
Total elapsed time: 00h 10m 19s
INFO:tensorflow:Oracle triggered exit
```
Predicting with the best model using the selected hyperparameters
We can retrieve the final model as well as the optimized hyperparameters. The following code shows how to extract them and how to predict with the best model.
```python
# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

# Read the selected parameters
print(f"""
The hyperparameter search is complete.
The optimal dropout rate : {best_hps.get('dropout1_rate')}.
The optimal number of units : {best_hps.get('dense1_units')}.
""")

# summary of the hyperparameter optimization results
tuner.results_summary()

# Now get the best model
best_model = tuner.get_best_models(num_models=1)[0]
best_model.summary()

# Predict using the best model
y_pred_best = best_model.predict(X_test, verbose=0)

plt.figure().set_figwidth(12)
plt.plot(np.c_[y_test, y_pred, y_pred_best])
plt.legend(('Test data', 'Forecast', 'Forecast (best)'))
plt.show()
```
After the hyperparameter search is complete, the optimal dropout rate and the number of units in the fully connected dense layer are 0.0 and 9, respectively.
```
The hyperparameter search is complete.
The optimal dropout rate : 0.0.
The optimal number of units : 9.

Results summary
Results in .\untitled_project
Showing 10 best trials
<keras_tuner.engine.objective.Objective object at 0x0000022E49675E50>
Trial summary
Hyperparameters:
dropout1_rate: 0.0
dense1_units: 9
Score: 0.004801980219781399
⋮
Trial summary
Hyperparameters:
dropout1_rate: 0.4
dense1_units: 6
Score: 3.4320671558380127

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 lstm (LSTM)                 (None, 50)                10400
 dropout (Dropout)           (None, 50)                0
 dense (Dense)               (None, 9)                 459
 dense_1 (Dense)             (None, 1)                 10
=================================================================
Total params: 10,869
Trainable params: 10,869
Non-trainable params: 0
_________________________________________________________________
```
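As an alternative to tuner.get_best_models(), which returns a model saved during the search, the best model can also be rebuilt from best_hps and refitted on the training data. A brief sketch under the same settings follows (model_best and y_pred_refit are illustrative names).

```python
# rebuild the model from the selected hyperparameters and refit it (sketch)
model_best = tuner.hypermodel.build(best_hps)
model_best.fit(X_train, y_train, epochs=1000, batch_size=32,
               verbose=0, callbacks=[early_stop])
y_pred_refit = model_best.predict(X_test, verbose=0)
```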
Concluding Remarks
This post introduced Keras Tuner for hyperparameter optimization. As the search is somewhat time-consuming, I used a very small set of candidate hyperparameters for illustration; a relatively large set is recommended for real applications. Furthermore, you can use validation data and minimize the validation loss to prevent overfitting and improve the model's ability to generalize to new data, as in the sketch below.
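A minimal sketch of such a validation-based setup, assuming the same f_build_model() build function, is as follows: 'val_loss' is used as the objective and the last 20% of the training data is held out via validation_split (tuner_val and early_stop_val are illustrative names).

```python
# sketch: tune on validation loss instead of the training mse
tuner_val = kt.RandomSearch(f_build_model,
                            overwrite=True,
                            objective='val_loss',   # minimize validation loss
                            max_trials=9,
                            executions_per_trial=1)

early_stop_val = EarlyStopping(monitor='val_loss', patience=50, verbose=0)

tuner_val.search(x=X_train, y=y_train,
                 validation_split=0.2,   # hold out the last 20% for validation
                 epochs=1000, batch_size=32,
                 verbose=1, callbacks=[early_stop_val])
```

\(\blacksquare\)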