Optimizing Hyperparameters for Deep Learning in Python Using TensorFlow & Keras


Deep learning hyperparameter tuning is an essential step in optimizing the performance of neural networks: it is the search for the combination of hyperparameter values that gives the model the best accuracy and efficiency. In this tutorial, we will walk through the main hyperparameters that can be tuned in deep learning models using Python, TensorFlow, and Keras.

  1. Understanding Hyperparameters:

Hyperparameters are the parameters that are set before the training process begins. They control the behavior of the model and impact how the model learns and generalizes from the data. Some common hyperparameters that can be tuned in deep learning models include learning rate, batch size, number of epochs, activation functions, optimizer, dropout rate, and number of layers.
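
To make this concrete, here is a small illustrative sketch: the usual hyperparameters gathered into one configuration dictionary. The names are our own illustration, not a Keras API.

# Illustrative only: typical hyperparameter choices collected in one place.
# These names are not a Keras API; they are decisions made before training.
hyperparams = {
    'learning_rate': 0.001,  # step size for each optimizer update
    'batch_size': 64,        # samples per gradient update
    'epochs': 10,            # full passes over the training data
    'dropout_rate': 0.5,     # fraction of units dropped during training
    'activation': 'relu',    # non-linearity in the hidden layers
    'optimizer': 'adam',     # weight update rule
}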

  2. Setting up the Environment:

Before we begin hyperparameter tuning, we need to set up our environment with Python, TensorFlow, and Keras. You can install both libraries using pip (recent TensorFlow releases already bundle Keras, but this tutorial imports from the standalone keras package):

pip install tensorflow
pip install keras

  3. Loading the Dataset:

For this tutorial, we will use the Fashion MNIST dataset, which consists of grayscale images of 10 different types of clothing items. We can load the dataset using Keras:

from keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
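
A quick sanity check shows what we loaded: 60,000 training images and 10,000 test images, each 28x28 pixels:

print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)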

  4. Preprocessing the Data:

Before training the model, we need to preprocess the data by scaling the pixel values into the range [0, 1] and reshaping the images so they carry the explicit grayscale channel dimension the convolutional layers expect:

x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
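
A quick check confirms the preprocessing: pixel values now lie in [0, 1] and each image has an explicit channel dimension:

print(x_train.min(), x_train.max())  # 0.0 1.0
print(x_train.shape)                 # (60000, 28, 28, 1)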

  5. Building the Model:

Next, we need to build the deep learning model using Keras. For this tutorial, we will create a simple convolutional neural network (CNN) with two convolutional layers and two fully connected layers:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
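
To double-check the architecture, Keras can print a layer-by-layer summary with output shapes and parameter counts:

model.summary()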

  6. Compiling the Model:

Before training the model, we need to compile it with an optimizer, loss function, and metrics:

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
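
Note that the string 'adam' uses the optimizer's default learning rate. To set it explicitly, which is exactly what we will do when tuning, pass an optimizer instance instead:

from keras.optimizers import Adam

model.compile(optimizer=Adam(0.001),  # explicit learning rate instead of the default
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])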

  7. Defining the Hyperparameter Space:

Now, we can define the hyperparameter space that we want to search through using a grid search or random search. For this tutorial, we will tune the learning rate and batch size:

learning_rate = [0.001, 0.01, 0.1]
batch_size = [32, 64, 128]
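
If the grid grows large, a randomized search samples the space rather than exhausting it. As a sketch (assuming scipy is installed), RandomizedSearchCV accepts distributions as well as lists:

from scipy.stats import loguniform

# Sampled rather than exhaustive: a possible search space for RandomizedSearchCV.
param_distributions = {
    'learning_rate': loguniform(1e-4, 1e-1),  # sampled log-uniformly
    'batch_size': [32, 64, 128],              # lists are sampled uniformly
}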

  8. Performing Hyperparameter Tuning:

We can use the GridSearchCV or RandomizedSearchCV classes from scikit-learn to perform hyperparameter tuning. These classes let us define a search space and evaluate the model across different hyperparameter combinations. To make a Keras model look like a scikit-learn estimator, we wrap it in KerasClassifier (note: in recent releases this wrapper has moved out of Keras into the separate scikeras package):

from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
from keras.optimizers import Adam

# batch_size is not a build-time argument: KerasClassifier routes it to
# model.fit(), so the build function only needs the learning rate.
def create_model(learning_rate=0.01):
    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax')
    ])
    model.compile(optimizer=Adam(learning_rate),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, verbose=0)
grid = GridSearchCV(estimator=model,
                    param_grid=dict(learning_rate=learning_rate, batch_size=batch_size),
                    cv=3)
# Each candidate trains for one epoch by default; pass epochs=... to train longer.
grid_result = grid.fit(x_train, y_train)

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

  9. Evaluating the Model:

Finally, we can evaluate the model with the best hyperparameters on the test set:

best_model = grid_result.best_estimator_
scores = best_model.score(x_test, y_test)  # accuracy for a classifier
print("Test Accuracy: %.2f%%" % (scores * 100))

By following this tutorial, you can successfully tune the hyperparameters of your deep learning model to achieve higher accuracy and efficiency. Experiment with different hyperparameters and search strategies to find the best combination for your specific task.

12 Comments
@GregHogg
1 month ago

Take my courses at https://mlnow.ai/!

@iftekharanam8980
1 month ago

That was excellent. Need more videos on DL

@BB-2383
1 month ago

Side comment – we divide x by 255 because the image is grayscale. An RGB value for white is (255, 255, 255), so dividing converts it to (1, 1, 1) and leaves black as (0, 0, 0). So an important note when training on images: first convert the images to grayscale.

@tomaszzielonka9808
1 month ago

@GregHogg Hi,
I got stuck with keras tuner. It seems the code below only creates the 'model_builder' function once. If I change anything, like adding a Dropout layer, and rerun the function, it keeps displaying the message shown below the code, as if it were consistently reaching back to the first version of the function.
Any clues on how to fix that? I would like to experiment with the 'model_builder' function (add/remove layers, dropouts, etc.) and then observe what parameters the tuner generates.

def model_builder(hp):
    model = Sequential()

    hp_activation = hp.Choice('activation', values=['relu', 'tanh'])
    hp_layer_1 = hp.Int('layer_1', min_value=2, max_value=32, step=2)
    hp_layer_2 = hp.Int('layer_2', min_value=2, max_value=32, step=2)
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

    model.add(Dense(units=hp_layer_1, activation=hp_activation))
    model.add(Dense(units=hp_layer_2, activation=hp_activation))
    model.add(Dense(units=1, activation='sigmoid'))

    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate),
                  loss='binary_crossentropy',
                  metrics=[tf.keras.metrics.Recall()])

    return model

tuner = kt.Hyperband(model_builder,
                     objective=kt.Objective("val_recall", direction="max"),
                     max_epochs=50,
                     factor=3,
                     seed=42)

Comment: Reloading Tuner from ./untitled_project/tuner0.json

@luisalbertoburbano9295
1 month ago

Good afternoon. I have a task and have not been able to create the keras tuner for 5,000 rows with 4 columns, where each column holds random numbers from 0 to 9, and I need an output of only 4 numbers. This is the code:

# Initialising the RNN
model = Sequential()

# Adding the input layer and the LSTM layer
model.add(Bidirectional(LSTM(neurons1,
                             input_shape=(window_length, number_of_features),
                             return_sequences=True)))

# Adding a first Dropout layer
model.add(Dropout(0.2))

# Adding a second LSTM layer
model.add(Bidirectional(LSTM(neurons2,
                             input_shape=(window_length, number_of_features),
                             return_sequences=True)))

# Adding a second Dropout layer
model.add(Dropout(0.2))

# Adding a third LSTM layer
model.add(Bidirectional(LSTM(neurons3,
                             input_shape=(window_length, number_of_features),
                             return_sequences=True)))

# Adding a fourth LSTM layer
model.add(Bidirectional(LSTM(neurons4,
                             input_shape=(window_length, number_of_features),
                             return_sequences=False)))

# Adding a fourth Dropout layer
model.add(Dropout(0.2))

# Adding the first output layer with ReLU activation function
model.add(Dense(output_neurons, activation='relu'))

# Adding the last output layer with softmax activation function
model.add(Dense(number_of_features, activation='softmax'))

Thank you very much

@dakshbhatnagar
1 month ago

Great Video Man but tbh I was actually expecting some sort of automation of the hyperparameter tuning.

@tigjuli
1 month ago

Simple explanation, awesome video!

@haneulkim4902
1 month ago

Thanks for an amazing video! Is there a way to tune hyperparameters like in sklearn without using keras-tuner?

@rudrathakkar56
1 month ago

Thank you. I am learning deep learning. This helped me a lot.

@billybobandboshow
1 month ago

Thank you for this video! I have been learning about deep learning algorithms over the holiday break! Hope we see more videos from you! I love your channel and content! Keep up the awesome work, happy holidays and happy new year! 🙂

@prabinbasyal1049
1 month ago

Can you suggest a data science course?
I have already covered numpy, pandas, and matplotlib.

@arsheyajain7055
1 month ago

Awesome video!!