Model Training Patterns - Hyperparameter Tuning
Model parameters refer to the weights and biases learned by the model as it goes through training iterations.
Hyperparameters, on the other hand, are parameters that we as model builders can control.
Model architecture hyperparameters - Hyperparameters that control the model's underlying mathematical function (e.g., the number of layers or the number of neurons in each layer)
Model training hyperparameters - Hyperparameters that control the training loop and the way the optimizer works (e.g., the learning rate, batch size, or number of epochs)
Random search - A faster alternative to grid search. Unlike grid search, which exhaustively tries every combination of values in the search space, random search samples a random value for each hyperparameter and tries that combination.
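To make the sampling idea concrete, here is a minimal sketch of a random search loop (a hypothetical helper, not from the book; the toy objective stands in for a full training run):

import random

def random_search(objective, space, n_trials=20):
    # Each trial draws one random value per hyperparameter and scores the combination
    best_score, best_params = float('inf'), None
    for _ in range(n_trials):
        params = {name: random.choice(values) for name, values in space.items()}
        score = objective(**params)
        if score < best_score:  # keep the best (lowest-loss) combination seen so far
            best_score, best_params = score, params
    return best_params

# Toy stand-in for a validation loss; a real objective would train and evaluate a model
space = {'units': [32, 64, 128, 256], 'learning_rate': [0.005, 0.0075, 0.01]}
best = random_search(
    lambda units, learning_rate: abs(units - 128) + 100 * abs(learning_rate - 0.0075),
    space)
print(best)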
Keras Tuner - This library provides a solution that scales and learns from previous trials to find an optimal combination of hyperparameter values.
EXAMPLE - tuning the number of neurons in the first and second hidden layers, and the learning rate, of an MNIST classification model
import keras_tuner as kt
from tensorflow import keras

def build_model(hp):
    # The layer widths (architecture) and the learning rate (training) are all tunable
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(units=hp.Int('first_hidden', min_value=32,
                                        max_value=256, step=32), activation='relu'),
        keras.layers.Dense(units=hp.Int('second_hidden', min_value=32,
                                        max_value=256, step=32), activation='relu'),
        keras.layers.Dense(units=10, activation='softmax')
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Float('learning_rate', min_value=.005, max_value=.01, sampling='log')),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model

# Load and scale MNIST so the search below is runnable end to end
(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train / 255.0

tuner = kt.BayesianOptimization(
    build_model,
    objective='val_accuracy',
    max_trials=10,
)
tuner.search(x_train, y_train, validation_split=0.1, epochs=10)
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
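After the search completes, a common next step (a short sketch using Keras Tuner's tuner.hypermodel.build, with the hyperparameter names from the example above) is to rebuild the model with the winning values and retrain it in full:

# Rebuild the model with the best hyperparameters and retrain it from scratch
model = tuner.hypermodel.build(best_hps)
model.fit(x_train, y_train, validation_split=0.1, epochs=10)
print(best_hps.get('first_hidden'), best_hps.get('learning_rate'))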
One of the issues with grid search and random search is that every new combination of hyperparameters means running the model through an entire training loop. This is what Bayesian optimization tries to solve: it learns from previous trials by fitting a cheaper surrogate model of the objective function and using it to pick the next combination to try.
Goal of this optimization approach - Call the objective function (i.e., run a full training of the ML model) as few times as possible, since each call is a costly operation.
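As an illustration only (a minimal sketch, not the book's code; it assumes scikit-learn and uses a cheap stand-in objective), Bayesian optimization fits a surrogate, here a Gaussian process, to the trials so far and picks the next trial by maximizing an expected-improvement score:

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(lr):
    # Stand-in for one expensive training run; returns a "validation loss"
    return (np.log10(lr) + 2.3) ** 2

candidates = np.logspace(-4, -1, 200).reshape(-1, 1)  # learning-rate search space

# Seed the surrogate with three random trials
rng = np.random.default_rng(0)
X = rng.choice(candidates.ravel(), size=3, replace=False).reshape(-1, 1)
y = np.array([objective(x[0]) for x in X])

gp = GaussianProcessRegressor(normalize_y=True)
for _ in range(10):  # each iteration costs exactly one real objective call
    gp.fit(np.log10(X), y)
    mu, sigma = gp.predict(np.log10(candidates), return_std=True)
    sigma = np.maximum(sigma, 1e-9)  # avoid division by zero at observed points
    # Expected improvement: prefer candidates likely to beat the best loss so far
    z = (y.min() - mu) / sigma
    ei = (y.min() - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

print('best learning rate found:', X[np.argmin(y)][0])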