Implementing LeNet-5 using PyTorch

LeNet-5 is a classic convolutional neural network architecture introduced by Yann LeCun and his collaborators in 1998. It was one of the first successful CNNs and was widely used for handwritten digit and document recognition. In this tutorial, we will implement the LeNet-5 architecture in PyTorch.

Step 1: Install PyTorch

Before we get started with implementing LeNet-5 in PyTorch, make sure you have PyTorch installed on your system. You can install it using the following command:

pip install torch torchvision
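
To confirm the installation, you can print the installed versions from Python (the exact version numbers will depend on your environment):

import torch
import torchvision

print(torch.__version__)
print(torchvision.__version__)
print(torch.cuda.is_available())  # True only if a CUDA-capable GPU is set up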

Step 2: Import Libraries

Once you have PyTorch installed, you can start by importing the necessary libraries:

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

Step 3: Define the LeNet-5 Architecture

Next, we will define the LeNet-5 architecture in PyTorch. LeNet-5 consists of 7 layers: 2 convolutional layers, 2 subsampling (pooling) layers, and 3 fully connected layers. The original network expects 32×32 inputs, while MNIST images are 28×28, so we pad the first convolution by 2 pixels; this keeps the feature maps entering the classifier at 16 × 5 × 5. Here’s how you can define the architecture in PyTorch:

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        # MNIST images are 28x28, so we pad the first convolution by 2 pixels
        # to match the 32x32 inputs of the original LeNet-5 (and 16 * 5 * 5 below).
        self.conv1 = nn.Conv2d(1, 6, 5, padding=2)   # 1x28x28 -> 6x28x28
        self.conv2 = nn.Conv2d(6, 16, 5)             # 6x14x14 -> 16x10x10
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)                 # 10 output classes (digits 0-9)

    def forward(self, x):
        x = nn.functional.relu(self.conv1(x))
        x = nn.functional.max_pool2d(x, 2)           # 6x28x28 -> 6x14x14
        x = nn.functional.relu(self.conv2(x))
        x = nn.functional.max_pool2d(x, 2)           # 16x10x10 -> 16x5x5
        x = x.view(-1, 16 * 5 * 5)                   # flatten to (batch, 400)
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = self.fc3(x)                              # raw logits (no softmax)
        return x
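
As a quick sanity check (purely illustrative, not part of the training script), you can pass a dummy batch through the network and confirm that it produces one logit per class:

net = LeNet()
dummy = torch.randn(4, 1, 28, 28)   # a fake batch of 4 grayscale 28x28 images
print(net(dummy).shape)             # expected: torch.Size([4, 10])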

Step 4: Define the Loss Function and Optimizer

After defining the LeNet-5 architecture, we need to define the loss function and optimizer. We will use CrossEntropyLoss as the loss function and SGD as the optimizer. Note that nn.CrossEntropyLoss combines LogSoftmax and NLLLoss internally, which is why the network’s forward pass returns raw logits rather than probabilities:

model = LeNet()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)
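
To see what the criterion expects (a tiny illustrative example with random values, not part of the training script): it takes raw logits of shape (batch, num_classes) and integer class labels of shape (batch,):

logits = torch.randn(3, 10)               # raw scores for 3 samples, 10 classes
targets = torch.tensor([2, 7, 0])         # ground-truth class indices
print(criterion(logits, targets).item())  # a single scalar loss value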

Step 5: Load and Preprocess the Data

Next, we will load and preprocess the data. We will use the MNIST dataset for this tutorial. You can load the MNIST dataset using torchvision.datasets and preprocess it using torchvision.transforms:

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=transform, download=True)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=64, shuffle=False)
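
If you want to verify that the loaders are set up correctly, you can peek at one batch (an optional check, not required for training):

images, labels = next(iter(train_loader))
print(images.shape)   # torch.Size([64, 1, 28, 28])
print(labels.shape)   # torch.Size([64])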

Step 6: Train the Model

Now we can train the LeNet-5 model on the MNIST dataset. You can train the model by iterating over the training data and updating the weights using the optimizer:

num_epochs = 10

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i+1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, num_epochs, i+1, len(train_loader), loss.item()))
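
The loop above runs on the CPU, which is fine for a small model like this. If you have a GPU, a minimal sketch of the usual device handling looks like the following (assuming you create the optimizer after moving the model, and move each batch inside the loop):

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = LeNet().to(device)                           # move the model first...
optimizer = optim.SGD(model.parameters(), lr=0.001)  # ...then build the optimizer

for epoch in range(num_epochs):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)  # move each batch
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()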

Step 7: Evaluate the Model

Finally, we can evaluate the model on the test dataset and calculate the accuracy:

model.eval()
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest logit = predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the model on the test images: {} %'.format(100 * correct / total))
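
If you want to inspect an individual prediction (purely illustrative; the index and output will vary), you can run a single test image through the model:

image, label = test_dataset[0]              # one preprocessed MNIST image and its label
with torch.no_grad():
    logits = model(image.unsqueeze(0))      # add a batch dimension: 1x1x28x28
    predicted = logits.argmax(dim=1).item()
print('Predicted: {}, actual: {}'.format(predicted, label))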

And that’s it! You have now successfully implemented and trained the LeNet-5 architecture in PyTorch. Feel free to experiment with different hyperparameters and architectures to improve the model’s performance. Happy coding!

4 Comments
@prakhargupta1745
3 months ago

Hey Professor, I have been somewhat confused about the dimensions of the input and output features in a CNN, so can you please confirm whether my understanding is correct?
Let's suppose we have an input to a convolutional layer of size M*N with P channels, and in the output we need to create C channels of size A*B. So, for this to work, we will need C kernels in the layer, and each kernel will be of size A*B*P? Is my understanding correct?

@sweet_func9924
3 months ago

Many thanks for the explanation, it's very helpful. The confusion I'm having is that some sources put a third conv layer instead of your first fully-connected one. Like Conv2d(16, 120, kernel_size=5, stride=1). Could you comment on it maybe?

@kenbobcorn
3 months ago

Maybe it's a version issue, but using torch==1.10.2 the script throws an error when it runs torch.flatten(x, 1). It is returning a tuple of size 1 instead of a tensor. This can be resolved using the built-in method Tensor.flatten(start_dim=1), which works for me. This issue may come up with new students who are using a newer version of PyTorch.

@saifalkhalidy1793
3 months ago

Thanks a lot, it is a very useful tutorial. Please share the link for the helper function, and do more classification models with a confusion matrix.