Hyperparameter tuning is a crucial step in optimizing a neural network. PyTorch provides a flexible framework for building and training deep learning models, and its plain Python training loops make it straightforward to write a tuning loop by hand.
In this tutorial, we will walk through the process of hyperparameter tuning in PyTorch, focusing on how to optimize the learning rate for a simple neural network. We will use the famous MNIST dataset for this demonstration.
Step 1: Setting up your environment
Before we begin, make sure you have PyTorch installed on your system. You can install PyTorch using pip:
pip install torch torchvision
Next, import the necessary libraries:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms
Step 2: Loading the dataset
Load the MNIST dataset using torchvision. We will normalize the data and create data loaders for the training and test sets:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)
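The code above builds loaders for the official MNIST training and test splits. If a separate validation set is wanted for comparing hyperparameters, one option is to hold out part of the training data with torch.utils.data.random_split. The sketch below uses a small random TensorDataset as a stand-in for MNIST so that it runs without a download; the 80/20 ratio, the seed, and the names train_subset, val_subset, demo_train_loader, and demo_val_loader are assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in for MNIST: 1000 random "images" with random labels (not real data).
images = torch.randn(1000, 1, 28, 28)
labels = torch.randint(0, 10, (1000,))
full_train = TensorDataset(images, labels)

# Hold out 20% of the training data for validation; the ratio is an assumption.
val_size = len(full_train) // 5
train_size = len(full_train) - val_size
generator = torch.Generator().manual_seed(0)  # fixed seed so the split is repeatable
train_subset, val_subset = random_split(
    full_train, [train_size, val_size], generator=generator
)

demo_train_loader = DataLoader(train_subset, batch_size=64, shuffle=True)
demo_val_loader = DataLoader(val_subset, batch_size=64, shuffle=False)
print(len(train_subset), len(val_subset))
```

With the real dataset, full_train would be replaced by train_dataset, and the validation loader would be used for comparing learning rates while the test set is kept for a final check.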
Step 3: Define the neural network
Next, define a simple neural network architecture. For this tutorial, we will use a fully connected network with one hidden layer:
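class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)  # 28 x 28 input pixels -> 128 hidden units
        self.fc2 = nn.Linear(128, 10)   # 128 hidden units -> 10 digit classes

    def forward(self, x):
        x = x.view(-1, 784)             # flatten each image into a vector
        x = torch.sigmoid(self.fc1(x))
        x = self.fc2(x)                 # raw logits; CrossEntropyLoss applies softmax
        return x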
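Before training, it can help to confirm that the network accepts MNIST-shaped input and produces one score per class. The following is a minimal sketch that repeats the class definition so it runs on its own; dummy_batch is a hypothetical batch of random values, not real images.

```python
import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(-1, 784)
        x = torch.sigmoid(self.fc1(x))
        return self.fc2(x)

model = SimpleNN()
dummy_batch = torch.randn(4, 1, 28, 28)  # hypothetical batch of 4 MNIST-sized images
with torch.no_grad():
    logits = model(dummy_batch)

# Count trainable parameters: weights and biases of both linear layers.
num_params = sum(p.numel() for p in model.parameters())
print(logits.shape, num_params)
```

The output has one row per image and ten columns, one logit per digit class. The parameter count is 784 × 128 + 128 + 128 × 10 + 10 = 101,770.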
Step 4: Define the training loop
Now, define a training function that takes a learning rate as an argument. Inside this function, instantiate the model, the loss function, and the optimizer, then train the model for five epochs:
def train_model(lr):
    model = SimpleNN()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)
    for epoch in range(5):
        model.train()
        running_loss = 0.0
        for inputs, targets in train_loader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, targets)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        # Report the mean batch loss for the epoch, not just the last batch
        print(f'Epoch {epoch+1}, Loss: {running_loss / len(train_loader):.4f}')
    return model
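Because train_model creates a fresh SimpleNN on every call, runs with different learning rates also start from different random weights. Seeding the random number generator before each run is one way to remove that source of variation. The sketch below is an illustration under that assumption, using a single nn.Linear layer and a hypothetical helper make_layer rather than the full training function.

```python
import torch
import torch.nn as nn

def make_layer(seed):
    # Seeding immediately before construction fixes the random initial weights.
    torch.manual_seed(seed)
    return nn.Linear(784, 128)

layer_a = make_layer(0)
layer_b = make_layer(0)
layer_c = make_layer(1)

# Same seed should give identical weights; a different seed should not.
same_seed_match = torch.equal(layer_a.weight, layer_b.weight)
different_seed_match = torch.equal(layer_a.weight, layer_c.weight)
print(same_seed_match, different_seed_match)
```

Calling torch.manual_seed at the start of train_model would have the same effect on the model's initial weights, so that differences in accuracy are more likely to come from the learning rate itself.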
Step 5: Hyperparameter tuning
To tune the learning rate, define a list of learning rates to try:
learning_rates = [0.001, 0.01, 0.1, 1]
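The values in this list are spaced by factors of ten, which is a common choice because useful learning rates can differ by orders of magnitude. A minimal sketch of generating such a list instead of typing it by hand follows; the helper name log_spaced is hypothetical.

```python
def log_spaced(low_exp, high_exp):
    """Return one learning rate per power of ten from 10**low_exp to 10**high_exp."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]

# Exponents -3 through 0 correspond to the list used in this tutorial.
learning_rates = log_spaced(-3, 0)
print(learning_rates)
```

Widening the exponent range is then a one-argument change, for example log_spaced(-5, 0) to include smaller learning rates.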
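Loop through the list of learning rates, train the model with each one, and evaluate it on the test set. Note that choosing a hyperparameter based on test accuracy makes the reported accuracy optimistic; a validation split held out from the training data is the safer basis for the comparison: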
for lr in learning_rates:
    model = train_model(lr)
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for inputs, targets in test_loader:
            outputs = model(inputs)
            predicted = outputs.argmax(dim=1)  # index of the highest logit per sample
            total += targets.size(0)
            correct += (predicted == targets).sum().item()
    print(f'Learning Rate: {lr}, Accuracy: {100 * correct / total:.2f}%')
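The loop above only prints each accuracy. One way to keep the results and pick the best candidate programmatically is sketched below. The helper tune and the scoring function toy_score are hypothetical; toy_score is a placeholder that stands in for "train and evaluate" and does not produce real accuracies.

```python
import math

def tune(candidates, train_and_score):
    """Score every candidate and return the best one together with all scores."""
    results = {c: train_and_score(c) for c in candidates}
    best = max(results, key=results.get)
    return best, results

def toy_score(lr):
    # Placeholder, NOT a real accuracy: peaks at lr = 0.1 by construction.
    return -abs(math.log10(lr) + 1.0)

best_lr, scores = tune([0.001, 0.01, 0.1, 1], toy_score)
print(best_lr)
```

In the tutorial's setting, train_and_score would wrap train_model and the evaluation code and return the measured accuracy for a given learning rate.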
By comparing the accuracy for each learning rate, you can identify the best of the values you tried. It is not guaranteed to be the optimum, only the best candidate in the list. You can explore other hyperparameters, such as batch size, number of hidden units, and number of layers, using the same approach.
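When several hyperparameters are tuned together, the candidates form a grid of combinations. The sketch below enumerates such a grid with itertools.product; the dictionary search_space and every value in it are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical search space; the specific values are assumptions.
search_space = {
    "lr": [0.001, 0.01, 0.1],
    "batch_size": [32, 64],
    "hidden_units": [64, 128],
}

# Build one dict per combination, e.g. {"lr": ..., "batch_size": ..., "hidden_units": ...}.
names = list(search_space)
grid = [dict(zip(names, values)) for values in product(*search_space.values())]

print(len(grid))
print(grid[0])
```

Each configuration would then be passed to a training function that accepts these settings; train_model as written takes only a learning rate, so it would need batch_size and hidden_units arguments added. The number of runs is the product of the list lengths (3 × 2 × 2 = 12 here), so the grid grows quickly as hyperparameters are added.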
In this tutorial, we have demonstrated how to perform hyperparameter tuning in PyTorch using the learning rate as an example. Experiment with different hyperparameters and neural network architectures to optimize your models for better performance. Happy coding!