Partner Talk at PT Conf. 2022: Optimizing PyTorch Training Speed on Mac Platforms with the MPS Backend

At the PyTorch Conference 2022, a partner talk highlighted the use of the Metal Performance Shaders (MPS) backend to accelerate training on Mac platforms. This tutorial shows how to use the MPS backend to speed up PyTorch training on your Mac.

What is Metal Performance Shaders (MPS)?
Metal Performance Shaders (MPS) is Apple's framework of GPU compute kernels, built on Metal and tuned for Apple hardware. It provides optimized primitives for machine learning and computer vision tasks. PyTorch's MPS backend runs tensor operations on the Mac's GPU through this framework, which can make training faster and more efficient than running on the CPU alone.

Prerequisites:
Before getting started with accelerating PyTorch training using MPS, make sure you have the following prerequisites:

  1. A Mac with a supported GPU (Apple silicon such as the M1 chip, or a compatible AMD GPU) running macOS 12.3 or later.
  2. The Xcode command-line tools installed on your Mac (xcode-select --install).
  3. PyTorch 1.12 or later, the first release that includes the MPS backend. You can install PyTorch using pip:
    pip install torch torchvision

Steps to Accelerate PyTorch Training Using MPS:

Step 1: Enable the MPS backend in PyTorch
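The MPS backend does not need to be switched on with an environment variable. It is included in PyTorch 1.12 and later on macOS, and you use it by placing tensors and modules on the mps device. Before going further, confirm that your installation can see the device by running the following command in your terminal:

python -c "import torch; print(torch.backends.mps.is_available())"

The command prints True when the MPS device can be used. If it prints False, check your macOS and PyTorch versions against the prerequisites above.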
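
In Python, an availability check can drive device selection directly. The following is a minimal sketch, assuming PyTorch 1.12 or later; the helper name pick_device is hypothetical and not part of PyTorch. It falls back to the CPU when MPS is unavailable, so the same script also runs on machines without a supported GPU.

```python
import torch


def pick_device() -> torch.device:
    """Return the MPS device when usable, otherwise the CPU (hypothetical helper)."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()

# Tensors created on the device are computed there; the result stays on that device.
a = torch.ones(2, 3, device=device)
b = torch.ones(3, 4, device=device)
c = a @ b

print(device.type, c.shape)
```

Any tensor or module that should run on the GPU has to be placed on this device explicitly; PyTorch does not move data between devices on its own.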

Step 2: Create a PyTorch model
Next, you need to create a PyTorch model that you want to train using the MPS backend. For this tutorial, let’s create a simple neural network model with a single hidden layer:

import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

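# Use the MPS device when it is available; otherwise fall back to the CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model = SimpleNN().to(device)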
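
Before loading real data, it can help to confirm that the network accepts a batch of flattened 28x28 images and returns one score per class. The following is a minimal sketch that uses an nn.Sequential stand-in with the same layer sizes as SimpleNN so that it runs on its own; the name net and the batch of random values are purely illustrative.

```python
import torch
import torch.nn as nn

# Stand-in with the same layers as SimpleNN: 784 -> 128 -> 10
net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
net = net.to(device)

# A dummy batch of 64 flattened images, created directly on the device
dummy = torch.randn(64, 784, device=device)
logits = net(dummy)

# Trainable parameters: (784*128 + 128) + (128*10 + 10) = 101,770
num_params = sum(p.numel() for p in net.parameters())
print(logits.shape, num_params)
```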

Step 3: Prepare the data for training
Before training the model, you need to prepare the data by loading a dataset and creating data loaders. For this tutorial, let’s use the MNIST dataset:

import torchvision
from torchvision import transforms

transform = transforms.Compose([transforms.ToTensor()])
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
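
Each batch from the loader holds images of shape (batch_size, 1, 28, 28) and labels of shape (batch_size,). The sketch below illustrates this with a synthetic TensorDataset standing in for MNIST (random tensors, so no download is needed). The names fake_images, fake_labels, and fake_loader, and the dataset size of 256, are arbitrary choices for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 256 grayscale 28x28 images with labels 0-9
fake_images = torch.rand(256, 1, 28, 28)
fake_labels = torch.randint(0, 10, (256,))
fake_loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=64, shuffle=True)

images, labels = next(iter(fake_loader))
flat = images.view(-1, 784)  # the same flattening the training loop applies

print(images.shape, labels.shape, flat.shape, len(fake_loader))
```

Batches produced by a DataLoader live on the CPU, which is why they have to be moved to the GPU device inside the training loop.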

Step 4: Initialize the optimizer and loss function
Next, you need to initialize the optimizer and loss function for training the model. For this tutorial, let’s use the Adam optimizer and cross-entropy loss:

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
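
nn.CrossEntropyLoss expects raw logits and integer class labels. It applies log-softmax internally and, by default, averages the negative log-probability of the correct class over the batch, which is why the model above has no softmax layer. A small sketch with made-up logits checks this against a manual computation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for a batch of 2 samples and 3 classes
logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 1.5, 0.3]])
labels = torch.tensor([0, 1])

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)

# Manual version: log-softmax, pick the log-probability of the true class, negate, average
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(loss.item(), manual.item())
```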

Step 5: Train the model using the MPS backend
Now that you have set up the model, data loaders, optimizer, and loss function, you can train the model using the MPS backend. The model and every batch must live on the same device, so each batch is moved to the model's device before the forward pass. The loop then runs the forward pass, computes the loss, backpropagates the gradients, and updates the model parameters:

num_epochs = 5
# Run on whichever device holds the model parameters
device = next(model.parameters()).device

model.train()
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Flatten each 28x28 image to 784 values and move the batch to the device
        images = images.view(-1, 784).to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()

        if i % 100 == 0:
            print(f'Epoch [{epoch + 1}/{num_epochs}], Step [{i + 1}/{len(train_loader)}], Loss: {loss.item():.4f}')
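
To see whether MPS actually helps for a given model, the time per training step can be measured on each device. The following is a rough sketch under stated assumptions: it uses random data instead of MNIST, the helper time_steps is hypothetical, and torch.mps.synchronize() is guarded because older PyTorch releases may not provide it. GPU work is queued asynchronously, so the timer waits for it to finish before reading the clock.

```python
import time

import torch
import torch.nn as nn


def time_steps(device: torch.device, steps: int = 20, batch_size: int = 64) -> float:
    """Return average seconds per training step on `device` (hypothetical helper)."""
    net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10)).to(device)
    optimizer = torch.optim.Adam(net.parameters(), lr=0.001)
    criterion = nn.CrossEntropyLoss()
    x = torch.randn(batch_size, 784, device=device)
    y = torch.randint(0, 10, (batch_size,), device=device)

    def sync() -> None:
        # Wait for queued GPU work to finish; does nothing on the CPU
        if device.type == "mps" and hasattr(torch, "mps") and hasattr(torch.mps, "synchronize"):
            torch.mps.synchronize()

    def step() -> None:
        optimizer.zero_grad()
        loss = criterion(net(x), y)
        loss.backward()
        optimizer.step()

    step()  # warm-up step, excluded from the timing
    sync()
    start = time.perf_counter()
    for _ in range(steps):
        step()
    sync()
    return (time.perf_counter() - start) / steps


timings = {"cpu": time_steps(torch.device("cpu"))}
if torch.backends.mps.is_available():
    timings["mps"] = time_steps(torch.device("mps"))

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.3f} ms per step")
```

For a network this small, per-step overhead may dominate and the gap between devices can be modest; larger models and batch sizes typically benefit more from the GPU.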

Step 6: Evaluate the model
After training the model, you can evaluate its performance on a validation set. For simplicity, let’s evaluate the model on the test set of the MNIST dataset:

test_dataset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)

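model.eval()
# Evaluate on whichever device holds the model parameters
device = next(model.parameters()).device
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        images = images.view(-1, 784).to(device)
        labels = labels.to(device)
        outputs = model(images)
        # The predicted class is the index of the largest logit
        predicted = outputs.argmax(dim=1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy on the test set: {100 * correct / total:.2f}%')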
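
The accuracy computation reduces to comparing the index of the largest logit with the label for each sample. A tiny worked example with made-up values (three samples, two classes) shows the arithmetic:

```python
import torch

# Made-up logits for 3 samples and 2 classes
outputs = torch.tensor([[0.1, 0.9],
                        [0.8, 0.2],
                        [0.3, 0.7]])
labels = torch.tensor([1, 0, 0])

predicted = outputs.argmax(dim=1)             # tensor([1, 0, 1])
correct = (predicted == labels).sum().item()  # 2 of the 3 predictions match
accuracy = 100 * correct / labels.size(0)

print(f"{accuracy:.2f}%")  # prints 66.67%
```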

Conclusion:
In this tutorial, you learned how to accelerate PyTorch training on Mac platforms using the Metal Performance Shaders (MPS) backend. You set up a simple neural network model, data loaders, an optimizer, and a loss function, trained the model on the MNIST dataset, and evaluated its performance on the test set. By running this workflow on the GPU through MPS, you can achieve faster and more efficient training of deep learning models on your Mac.
