A Beginner’s Guide to Hyperparameter Tuning with PyTorch

Hyperparameter tuning is an essential step in training machine learning models. It involves selecting the set of hyperparameters that gives the best performance for a given model and dataset. In this tutorial, we will discuss hyperparameter tuning with PyTorch, a popular open-source library for building deep learning models.

Step 1: Installing PyTorch
Before we begin with hyperparameter tuning, you need to have PyTorch installed on your system. You can install PyTorch and torchvision with pip by running the following command:

pip install torch torchvision
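To confirm that the installation worked, a quick check along these lines can help. It is only a sketch: it prints the installed version and whether a CUDA-capable GPU is visible. A GPU is not required for this tutorial.

```python
import torch

# Report the installed PyTorch version string.
version = torch.__version__
print("PyTorch version:", version)

# True only when a CUDA build of PyTorch can see a compatible GPU.
has_gpu = torch.cuda.is_available()
print("CUDA available:", has_gpu)
```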

Step 2: Creating a neural network model
In this tutorial, we will use a simple neural network model to demonstrate hyperparameter tuning. We will create a basic fully connected network with two hidden layers using PyTorch.

Here is an example of a simple neural network model in PyTorch:

import torch
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        # 784 inputs = one flattened 28x28 image
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        # 10 outputs = one score (logit) per class
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        # No softmax here: nn.CrossEntropyLoss expects raw logits
        x = self.fc3(x)
        return x
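Before training, it can be worth checking that the network accepts the expected input shape. The sketch below repeats the model definition so it runs on its own, then passes a hypothetical batch of four random "images" through it and counts the trainable parameters. The batch of random values is an assumption made purely for illustration.

```python
import torch
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

model = NeuralNetwork()

# Hypothetical batch of 4 flattened 28x28 images filled with random values.
dummy_batch = torch.randn(4, 784)
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([4, 10])

# Trainable parameters: weights plus biases of the three linear layers.
# (784*128 + 128) + (128*64 + 64) + (64*10 + 10) = 109386
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(num_params)  # 109386
```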

Step 3: Defining hyperparameters
Now, we need to define the hyperparameters for the neural network model. Hyperparameters are settings that are chosen before the training process begins, rather than learned from the data. They include the learning rate, batch size, number of epochs, etc.

In this example, we will define the following hyperparameters:

learning_rate = 0.001
batch_size = 64
epochs = 10
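One optional way to keep these values together is a small dataclass, which makes it easier to pass a configuration around later. This is a sketch, not part of PyTorch: the names HyperParams and steps_per_epoch are hypothetical. The helper shows how batch size determines the number of optimizer updates per epoch, using the 60,000 images in the FashionMNIST training split.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class HyperParams:
    """Hypothetical container for the three hyperparameters used here."""
    learning_rate: float = 0.001
    batch_size: int = 64
    epochs: int = 10

def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    # The last batch may be smaller, so round up
    # (this matches a DataLoader with the default drop_last=False).
    return math.ceil(num_samples / batch_size)

hp = HyperParams()
steps = steps_per_epoch(60_000, hp.batch_size)
total_updates = steps * hp.epochs
print(steps, total_updates)  # 938 9380
```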

Step 4: Creating a dataset and dataloader
Next, we need to create a dataset and dataloader for training our neural network model. We will use the FashionMNIST dataset, which contains 28x28 grayscale images of clothing items in 10 classes. PyTorch provides pre-built datasets through torchvision that can be easily used for training models.

import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_dataset = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
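To see what the dataloader hands to the model, the sketch below uses randomly generated tensors as a stand-in for FashionMNIST so that it runs without a download. The fake images and labels are assumptions for illustration only. It also reproduces the arithmetic behind Normalize((0.5,), (0.5,)), which maps pixel values from [0, 1] to [-1, 1].

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for ToTensor() output: 256 fake grayscale images with pixels in [0, 1).
images = torch.rand(256, 1, 28, 28)
labels = torch.randint(0, 10, (256,))

# Same arithmetic as transforms.Normalize((0.5,), (0.5,)): (x - mean) / std.
normalized = (images - 0.5) / 0.5

loader = DataLoader(TensorDataset(normalized, labels), batch_size=64, shuffle=True)
batch_images, batch_labels = next(iter(loader))

print(batch_images.shape)  # torch.Size([64, 1, 28, 28])
print(batch_labels.shape)  # torch.Size([64])
print(len(loader))         # 4 (256 samples / 64 per batch)

# After normalization every pixel lies within [-1, 1].
in_range = normalized.min().item() >= -1.0 and normalized.max().item() <= 1.0
print(in_range)            # True
```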

Step 5: Training the model
Now, we can train our neural network model using the defined hyperparameters and the dataset. We will also define a loss function and an optimizer for training the model.

model = NeuralNetwork()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

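for epoch in range(epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Clear gradients left over from the previous step
        optimizer.zero_grad()
        # Flatten each 28x28 image into a 784-element vector
        outputs = model(images.view(-1, 28*28))
        loss = criterion(outputs, labels)
        # Backpropagate, then update the weights
        loss.backward()
        optimizer.step()

        # Report the loss every 100 batches
        if (i+1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, epochs, i+1, len(train_loader), loss.item()))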
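Because tuning repeats this loop many times, it may help to wrap one epoch in a function. The sketch below is one possible way to do that. The function name train_one_epoch, the small nn.Sequential model, and the random data are all assumptions chosen so that the example runs quickly on its own.

```python
import math
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_one_epoch(model, loader, criterion, optimizer):
    """Run one pass over loader and return the sample-weighted mean loss."""
    model.train()
    running_loss, seen = 0.0, 0
    for inputs, targets in loader:
        optimizer.zero_grad()
        # Flatten everything except the batch dimension.
        outputs = model(inputs.view(inputs.size(0), -1))
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * targets.size(0)
        seen += targets.size(0)
    return running_loss / seen

torch.manual_seed(0)
# Random stand-in data: 128 fake images with random labels.
data = TensorDataset(torch.randn(128, 1, 28, 28), torch.randint(0, 10, (128,)))
loader = DataLoader(data, batch_size=32, shuffle=True)
model = nn.Sequential(nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

losses = [train_one_epoch(model, loader, criterion, optimizer) for _ in range(3)]
print(losses)
```

Returning the mean loss per epoch gives a single number to compare across runs, which is more convenient than reading per-batch printouts.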

Step 6: Hyperparameter tuning
To perform hyperparameter tuning, we need to experiment with different hyperparameter values to find the combination that yields the highest accuracy on held-out data. This can be done using a technique called grid search, which trains a model for every combination of values drawn from a predefined list of candidates for each hyperparameter.
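The size of a grid grows multiplicatively with each hyperparameter, which is worth checking before starting a long run. As a small sketch, the combinations can be enumerated with itertools.product from the standard library. The candidate values below are the same ones used in this tutorial.

```python
from itertools import product

params = {
    'learning_rate': [0.001, 0.01, 0.1],
    'batch_size': [32, 64, 128],
    'epochs': [5, 10, 15],
}

# Cartesian product of the value lists, one dict per combination.
names = list(params)
grid = [dict(zip(names, values)) for values in product(*params.values())]

print(len(grid))  # 27 (3 * 3 * 3)
print(grid[0])    # {'learning_rate': 0.001, 'batch_size': 32, 'epochs': 5}
print(grid[-1])   # {'learning_rate': 0.1, 'batch_size': 128, 'epochs': 15}
```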

Here is an example of grid search for hyperparameter tuning in PyTorch:

params = {
    'learning_rate': [0.001, 0.01, 0.1],
    'batch_size': [32, 64, 128],
    'epochs': [5, 10, 15]
}

for lr in params['learning_rate']:
    for bs in params['batch_size']:
        # Rebuild the dataloader so the candidate batch size is actually used
        train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=bs, shuffle=True)

        for ep in params['epochs']:
            # Start from a fresh model and optimizer for every combination
            model = NeuralNetwork()
            optimizer = torch.optim.Adam(model.parameters(), lr=lr)

            for epoch in range(ep):
                for i, (images, labels) in enumerate(train_loader):
                    optimizer.zero_grad()
                    outputs = model(images.view(-1, 28*28))
                    loss = criterion(outputs, labels)
                    loss.backward()
                    optimizer.step()

            # Evaluate the model on held-out data here and keep the best hyperparameters

In this example, we define a grid search with three candidate values each for learning rate, batch size, and epochs, which gives 3 x 3 x 3 = 27 combinations. We train a fresh model for each combination and evaluate the model’s performance to select the best hyperparameters.
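The evaluation step is left open in the loop above. One common approach, sketched here under several assumptions, is to hold out part of the training data as a validation set and keep the combination with the highest validation accuracy. To stay fast and self-contained, the sketch uses random stand-in data, a smaller grid, and a small nn.Sequential model. The helper names make_model and accuracy are hypothetical.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Random stand-in for FashionMNIST so the sketch runs in seconds.
features = torch.randn(600, 784)
targets = torch.randint(0, 10, (600,))
train_set, val_set = random_split(TensorDataset(features, targets), [500, 100])
val_loader = DataLoader(val_set, batch_size=100)

def make_model():
    return nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))

def accuracy(model, loader):
    """Fraction of samples in loader that the model classifies correctly."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.size(0)
    return correct / total

criterion = nn.CrossEntropyLoss()
best = {'accuracy': -1.0, 'config': None}

for lr in [0.001, 0.01]:
    for bs in [32, 64]:
        train_loader = DataLoader(train_set, batch_size=bs, shuffle=True)
        model = make_model()  # fresh weights for every combination
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        model.train()
        for epoch in range(2):
            for x, y in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(x), y)
                loss.backward()
                optimizer.step()

        # Score on data the model has not been trained on.
        val_acc = accuracy(model, val_loader)
        if val_acc > best['accuracy']:
            best = {'accuracy': val_acc,
                    'config': {'learning_rate': lr, 'batch_size': bs}}

print(best)
```

Selecting on a validation split rather than the test set keeps the test set untouched, so the final test accuracy remains an honest estimate. With random labels, as here, the accuracies will sit near chance; only the selection mechanics are being illustrated.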

Step 7: Evaluating the model
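Finally, after hyperparameter tuning, we need to evaluate the model on a separate test dataset to see how well it performs on data it has not seen. Note that the model variable refers to whichever model was trained most recently, so make sure it is the one trained with the best hyperparameters before running this step. We can calculate metrics such as accuracy, precision, recall, etc., to assess the model’s performance.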

test_dataset = torchvision.datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

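# Switch to evaluation mode (matters for layers such as dropout or batch norm)
model.eval()

correct = 0
total = 0
# Gradients are not needed for evaluation
with torch.no_grad():
    for images, labels in test_loader:
        outputs = model(images.view(-1, 28*28))
        # The predicted class is the index of the largest logit
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()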

accuracy = 100 * correct / total
print('Accuracy of the model on the test dataset: {} %'.format(accuracy))
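Precision and recall can be derived from a confusion matrix built from the same predicted and labels tensors. The sketch below works through a small hypothetical 3-class example. The label and prediction values are made up for illustration and are not FashionMNIST results.

```python
import torch

# Hypothetical true labels and predictions for a 3-class problem.
labels = torch.tensor([0, 0, 1, 1, 2, 2, 2, 1])
predicted = torch.tensor([0, 1, 1, 1, 2, 0, 2, 1])
num_classes = 3

# confusion[i, j] counts samples whose true class is i and predicted class is j.
confusion = torch.zeros(num_classes, num_classes, dtype=torch.long)
for t, p in zip(labels.tolist(), predicted.tolist()):
    confusion[t, p] += 1

true_pos = torch.diagonal(confusion).float()
# Precision: of the samples predicted as class c, how many really are c.
precision = true_pos / confusion.sum(dim=0).clamp(min=1)
# Recall: of the samples that really are class c, how many were found.
recall = true_pos / confusion.sum(dim=1).clamp(min=1)
accuracy = true_pos.sum().item() / labels.numel()

print(confusion.tolist())  # [[1, 1, 0], [0, 3, 0], [1, 0, 2]]
print(precision.tolist())  # [0.5, 0.75, 1.0]
print(accuracy)            # 0.75
```

The clamp(min=1) calls avoid division by zero for a class that is never predicted or never present; in that case the metric is reported as 0.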

In this tutorial, we discussed how to perform hyperparameter tuning in PyTorch using grid search. We built a simple model, trained it on FashionMNIST, and looped over candidate values for the learning rate, batch size, and number of epochs. By experimenting with different hyperparameter values in this way, we can find the combination that optimizes the model’s performance.