Deep learning hyperparameter tuning is an essential step in optimizing the performance of neural networks. It involves searching for the hyperparameter values that yield the best accuracy and efficiency for the model. In this tutorial, we will discuss various hyperparameters that can be tuned in deep learning models using Python, TensorFlow, and Keras.
- Understanding Hyperparameters:
Hyperparameters are the parameters that are set before the training process begins. They control the behavior of the model and impact how the model learns and generalizes from the data. Some common hyperparameters that can be tuned in deep learning models include learning rate, batch size, number of epochs, activation functions, optimizer, dropout rate, and number of layers.
- Setting up the Environment:
Before we begin hyperparameter tuning, we need to set up our environment with Python, TensorFlow, and Keras. Keras ships with TensorFlow 2.x, so installing TensorFlow is usually enough; the standalone package can be installed as well:
pip install tensorflow
pip install keras
- Loading the Dataset:
For this tutorial, we will use the Fashion MNIST dataset, which consists of 70,000 28x28 grayscale images of clothing items across 10 classes (60,000 for training and 10,000 for testing). We can load the dataset using Keras:
from keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
- Preprocessing the Data:
Before training the model, we need to preprocess the data by scaling the pixel values to the [0, 1] range and reshaping the images to add a channel dimension:
# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
# Add a channel dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
- Building the Model:
Next, we need to build the deep learning model using Keras. For this tutorial, we will create a simple convolutional neural network (CNN) with two convolutional layers and two fully connected layers:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
- Compiling the Model:
Before training the model, we need to compile it with an optimizer, loss function, and metrics:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
- Defining the Hyperparameter Space:
Now, we can define the hyperparameter space that we want to search through using a grid search or random search. For this tutorial, we will tune the learning rate and batch size:
learning_rate = [0.001, 0.01, 0.1]
batch_size = [32, 64, 128]
- Performing Hyperparameter Tuning:
We can use the GridSearchCV or RandomizedSearchCV classes from scikit-learn to perform hyperparameter tuning. These classes let us define a search space and evaluate the model across different hyperparameter combinations. We will use GridSearchCV here; a RandomizedSearchCV variant is sketched after the grid search code below:
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier  # removed in newer Keras versions; use the scikeras package instead
from keras.optimizers import Adam

def create_model(learning_rate=0.01):
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax')
    ])
    model.compile(optimizer=Adam(learning_rate), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# learning_rate is routed to create_model; batch_size is passed to fit() by the wrapper
model = KerasClassifier(build_fn=create_model, verbose=0)
grid = GridSearchCV(estimator=model, param_grid=dict(learning_rate=learning_rate, batch_size=batch_size), cv=3)
grid_result = grid.fit(x_train, y_train)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
- Evaluating the Model:
Finally, we can evaluate the model with the best hyperparameters on the test set:
best_model = grid_result.best_estimator_
scores = best_model.score(x_test, y_test)
print("Test Accuracy: %.2f%%" % (scores*100))
By following this tutorial, you can successfully tune the hyperparameters of your deep learning model to achieve higher accuracy and efficiency. Experiment with different hyperparameters and search strategies to find the best combination for your specific task.
Take my courses at https://mlnow.ai/!
That was excellent. Need more videos on DL
Side comment – we divide x by 255 because pixel intensities range from 0 to 255, so dividing scales them to the [0, 1] range: white (255,255,255) becomes (1,1,1) and black stays (0,0,0). Note that the same normalization works for RGB images, applied per channel, so converting images to grayscale first is not required.
@GreggHogg Hi,
I got stuck with Keras Tuner. It seems the code below only creates the 'model_builder' function once. If I change anything, like adding a Dropout layer, and rerun the function, it keeps displaying the message shown below the code, as if it kept using the first version of the function.
Any clues on how to fix that? I would like to experiment with the 'model_builder' function (add/remove layers, dropouts, etc.) and then observe what parameters the tuner generates.
import tensorflow as tf
import keras_tuner as kt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def model_builder(hp):
    model = Sequential()
    hp_activation = hp.Choice('activation', values=['relu', 'tanh'])
    hp_layer_1 = hp.Int('layer_1', min_value=2, max_value=32, step=2)
    hp_layer_2 = hp.Int('layer_2', min_value=2, max_value=32, step=2)
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
    model.add(Dense(units=hp_layer_1, activation=hp_activation))
    model.add(Dense(units=hp_layer_2, activation=hp_activation))
    model.add(Dense(units=1, activation='sigmoid'))
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss='binary_crossentropy',
                  metrics=[tf.keras.metrics.Recall()])
    return model

tuner = kt.Hyperband(model_builder,
                     objective=kt.Objective("val_recall", direction="max"),
                     max_epochs=50,
                     factor=3,
                     seed=42)
Comment: Reloading Tuner from ./untitled_project/tuner0.json
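In case it helps anyone with the same issue: Keras Tuner saves its search state in a project directory and reloads it when the tuner is re-created, which is what the "Reloading Tuner" message indicates. Passing overwrite=True (or a fresh project_name) should force a new search after model_builder changes; a minimal sketch, assuming keras_tuner is imported as kt:
tuner = kt.Hyperband(model_builder,
                     objective=kt.Objective("val_recall", direction="max"),
                     max_epochs=50,
                     factor=3,
                     seed=42,
                     overwrite=True)  # discard any cached search state instead of reloading it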
Good afternoon, I have a task and have not been able to create the Keras Tuner for it: 5,000 rows with 4 columns, where each column holds random numbers from 0 to 9, and I need an output of only 4 numbers. This is the code:
# Initialising the RNN
model = Sequential()
# Adding the input layer and the LSTM layer
model.add(Bidirectional(LSTM(neurons1,
                             input_shape=(window_length, number_of_features),
                             return_sequences=True)))
# Adding a first Dropout layer
model.add(Dropout(0.2))
# Adding a second LSTM layer
model.add(Bidirectional(LSTM(neurons2,
                             input_shape=(window_length, number_of_features),
                             return_sequences=True)))
# Adding a second Dropout layer
model.add(Dropout(0.2))
# Adding a third LSTM layer
model.add(Bidirectional(LSTM(neurons3,
                             input_shape=(window_length, number_of_features),
                             return_sequences=True)))
# Adding a fourth LSTM layer
model.add(Bidirectional(LSTM(neurons4,
                             input_shape=(window_length, number_of_features),
                             return_sequences=False)))
# Adding a fourth Dropout layer
model.add(Dropout(0.2))
# Adding the first output layer with ReLU activation function
model.add(Dense(output_neurons, activation='relu'))
# Adding the last output layer with softmax activation function
model.add(Dense(number_of_features, activation='softmax'))
Thank you very much
Great video man, but tbh I was actually expecting some sort of automation of the hyperparameter tuning.
Simple explanation, awesome video!
Thanks for an amazing video! Is there a way to tune hyperparameters like in sklearn without using keras-tuner?
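One option, sketched under the assumption that the create_model function from the tutorial above is available: a plain Python loop over candidate values, tracking the best validation accuracy, with no tuning library required (the epoch count and validation split are illustrative choices):
best_acc, best_params = 0.0, None
for lr in [0.001, 0.01, 0.1]:
    for bs in [32, 64, 128]:
        m = create_model(learning_rate=lr)
        # validation_split holds out 10% of the training data for scoring
        history = m.fit(x_train, y_train, batch_size=bs, epochs=3, validation_split=0.1, verbose=0)
        val_acc = history.history['val_accuracy'][-1]  # key is 'val_acc' in very old Keras versions
        if val_acc > best_acc:
            best_acc, best_params = val_acc, (lr, bs)
print("Best val accuracy %.4f with lr=%s, batch_size=%s" % ((best_acc,) + best_params))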
Thank you. I am learning deep learning. This helped me a lot.
Thank you for this video! I have been learning about deep learning algorithms over the holiday break! Hope we see more videos from you! I love your channel and content! Keep up the awesome work, happy holidays and happy new year! 🙂
Can you suggest a data science course?
I have already covered NumPy, pandas, and Matplotlib.
Awesome video!!