In this tutorial, Part 10 of the PyTorch FSDP series, we walk end to end through PyTorch FSDP (Fully Sharded Data Parallel). FSDP is a data-parallel wrapper that ships with PyTorch in the torch.distributed.fsdp module. It shards a model's parameters, gradients, and optimizer state across workers, and it can be combined with mixed-precision training. We will go through the steps required to use FSDP in your deep learning projects.
Step 1: Install PyTorch FSDP
FSDP is part of PyTorch itself (the torch.distributed.fsdp module), so there is no separate package to install. Install PyTorch, together with torchvision for the MNIST example used later, using pip:
pip install torch torchvision
Step 2: Import PyTorch FSDP
Once PyTorch is installed, you can import FSDP in your Python code with the following import statements:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
Step 3: Define your model
Next, you will need to define your deep learning model. You can use any PyTorch model architecture, such as a ResNet, DenseNet, or a custom model. Here is an example of how you can define a simple feedforward neural network using PyTorch:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)  # flatten (N, 1, 28, 28) images to (N, 784)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
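Before wrapping the model, it can help to check it on a single process with random input. The following is a minimal sketch, not part of the original walkthrough. It repeats the model definition so that it runs on its own, flattens image-shaped input before the first linear layer, and uses an arbitrary batch size of 4 with random tensors standing in for MNIST images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)  # (N, 1, 28, 28) -> (N, 784)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


model = Net()
batch = torch.randn(4, 1, 28, 28)  # random stand-ins for 4 MNIST images
log_probs = model(batch)

# Total number of trainable parameters in the model
num_params = sum(p.numel() for p in model.parameters())
print(log_probs.shape)  # torch.Size([4, 10])
print(num_params)       # 101770
```

Each row of the output holds log-probabilities over the 10 classes, so exponentiating a row and summing it should give a value close to 1.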
Step 4: Initialize FSDP
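To use FSDP with your model, wrap the model in the FSDP class. The wrapper takes the model and FSDP-specific options; the optimizer and loss function are not passed to it. FSDP needs an initialized torch.distributed process group (for example, when the script is launched with torchrun), and the optimizer should be created after wrapping so that it operates on the sharded parameters. Here is an example of how to initialize FSDP with your model:
# Initialize FSDP (assumes the process group is already initialized)
model = Net().to(device)
model = FSDP(model)
# Create the optimizer after wrapping; SGD and lr=0.01 are example choices
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)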
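To give a rough sense of what sharding buys you, the arithmetic below estimates how many parameter elements each rank would hold for the small network in this tutorial. This is a simplified sketch: it assumes an even split with padding up to a multiple of the world size and 4-byte float32 parameters, and the helper name shard_size is hypothetical. FSDP's real bookkeeping (flattening per wrapped unit, gradients, optimizer state) is more involved.

```python
import math

# Parameter count of the tutorial's network:
# fc1: 784*128 weights + 128 biases, fc2: 128*10 weights + 10 biases
NUM_PARAMS = 784 * 128 + 128 + 128 * 10 + 10  # 101770


def shard_size(num_params: int, world_size: int) -> int:
    """Elements held per rank if parameters are split evenly (with padding)."""
    return math.ceil(num_params / world_size)


for world_size in (1, 2, 4, 8):
    per_rank = shard_size(NUM_PARAMS, world_size)
    kib = per_rank * 4 / 1024  # 4 bytes per float32 element
    print(f"world_size={world_size}: {per_rank} params, {kib:.1f} KiB per rank")
```

For a model this small the saving is negligible; the same proportional reduction is what matters for models that do not fit on a single device.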
Step 5: Data Loading and Training
After wrapping your model with FSDP, you can load your data and train much as you would with a regular PyTorch model. In a multi-process run, each process should read a different slice of the dataset (typically through a DistributedSampler), and mixed precision is configured when the model is wrapped rather than in the training loop. Here is an example of how you can load your data and train your model using FSDP:
# Data loading
from torchvision import datasets, transforms

train_dataset = datasets.MNIST('../data', train=True, download=True,
                               transform=transforms.Compose([
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.1307,), (0.3081,))
                               ]))
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True)
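When several processes train together, each one usually reads a disjoint slice of the dataset. One common way to do this in PyTorch is DistributedSampler, which is not used in the loader above. The sketch below uses a tiny stand-in dataset and passes num_replicas and rank explicitly so that it runs in a single process without a process group. The dataset contents, the world size of 2, and the seed are assumptions made for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Tiny stand-in dataset: 10 samples with 1 feature each (hypothetical data)
dataset = TensorDataset(torch.arange(10).float().unsqueeze(1))

world_size = 2  # assumed number of training processes
seen = []
for rank in range(world_size):
    sampler = DistributedSampler(
        dataset, num_replicas=world_size, rank=rank, shuffle=True, seed=0
    )
    sampler.set_epoch(0)  # call once per epoch so shuffling differs between epochs
    # Pass the sampler instead of shuffle=True; the two options are mutually exclusive
    loader = DataLoader(dataset, batch_size=5, sampler=sampler)
    indices = list(sampler)
    seen.append(set(indices))
    print(f"rank {rank} reads indices {sorted(indices)}")
```

In a real run each process builds only its own sampler, and num_replicas and rank can be omitted so that they are read from the initialized process group.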
# Training loop
model.train()
for data, target in train_loader:
    optimizer.zero_grad()
    data, target = data.to(device), target.to(device)
    output = model(data)
    loss = F.nll_loss(output, target)
    loss.backward()  # backward is called on the loss tensor, not on the model
    optimizer.step()
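The training loop itself contains nothing specific to mixed precision. With FSDP, mixed precision is usually requested at wrap time through a MixedPrecision policy. The following is a minimal sketch of such a policy; the choice of bfloat16 for all three dtypes and the variable name bf16_policy are assumptions for illustration, and whether bfloat16 is appropriate depends on your hardware.

```python
import torch
from torch.distributed.fsdp import MixedPrecision

# Policy describing which dtypes FSDP should use internally
bf16_policy = MixedPrecision(
    param_dtype=torch.bfloat16,   # dtype of parameters during forward/backward
    reduce_dtype=torch.bfloat16,  # dtype used for gradient reduction
    buffer_dtype=torch.bfloat16,  # dtype of module buffers
)

# The policy is passed when wrapping, inside an initialized process group:
#   model = FSDP(Net(), mixed_precision=bf16_policy)
print(bf16_policy)
```

Building the policy object does not require a process group, so it can be constructed and inspected before the distributed setup is in place.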
Step 6: Evaluation
After training your model, you can evaluate its performance on a test set using the FSDP module. You can do this by disabling the gradient calculation during evaluation and running your model on the test dataset. Here is an example of how you can evaluate your model using FSDP:
# Evaluation loop
model.eval()
test_loss = 0
correct = 0
with torch.no_grad():
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        output = model(data)
        test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
        pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
        correct += pred.eq(target.view_as(pred)).sum().item()
test_loss /= len(test_loader.dataset)
print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
    test_loss, correct, len(test_loader.dataset),
    100. * correct / len(test_loader.dataset)))
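If the test set is split across processes, the totals computed in the loop above are per-process values and need to be summed across ranks before reporting. The sketch below shows one way to do that with all_reduce. The helper name reduce_eval_totals and the example numbers are hypothetical, and the function falls back to the local values when no process group is initialized, which is what happens when it is run as a plain single-process script.

```python
import torch
import torch.distributed as dist


def reduce_eval_totals(loss_sum: float, correct: int, count: int):
    """Sum per-rank evaluation totals across ranks and return the global metrics.

    Uses the local values unchanged when no process group is initialized.
    """
    totals = torch.tensor([loss_sum, float(correct), float(count)], dtype=torch.float64)
    if dist.is_available() and dist.is_initialized():
        # With the NCCL backend, move the tensor to the local GPU before this call
        dist.all_reduce(totals, op=dist.ReduceOp.SUM)
    loss_sum, correct, count = totals.tolist()
    return loss_sum / count, int(correct), int(count)


# Example with made-up local totals: summed loss 12.5 over 100 samples, 90 correct
avg_loss, correct, count = reduce_eval_totals(loss_sum=12.5, correct=90, count=100)
print(f"Average loss: {avg_loss:.4f}, Accuracy: {correct}/{count}")
```

Reducing sums and dividing afterwards gives the right average even when ranks process different numbers of samples.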
And that’s it! You have successfully completed an end-to-end walkthrough of PyTorch FSDP. In this tutorial, we covered the installation of PyTorch FSDP, how to import it in your code, how to define your model, how to initialize FSDP with your model, how to load data and train your model, and how to evaluate your model. Start using PyTorch FSDP in your deep learning projects today to take advantage of its distributed and mixed-precision training capabilities.