Coding Llama 2 from scratch in PyTorch – Part 3
Welcome to Part 3 of our Coding Llama 2 tutorial series on how to build a PyTorch model from scratch. In this part, we set up a small model, train it with a standard training loop, and evaluate it on a test set.
Setting Up the Model
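Let’s start by setting up our PyTorch model. We define a small fully connected network with three linear layers (784 to 256 to 128 to 10 features) and ReLU activations after the first two layers. The last layer returns raw logits; the loss function is defined later, in the training step.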
import torch
import torch.nn as nn


class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x


model = NeuralNetwork()
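A quick way to sanity-check the definition is to push a dummy batch through the network and look at the output shape. The sketch below repeats the class so it runs on its own; the batch size of 32 and the random inputs are assumptions made only for illustration, since no dataset is named here.

```python
import torch
import torch.nn as nn


class NeuralNetwork(nn.Module):
    """Same three-layer network as in the section above."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)  # raw logits, no softmax applied


model = NeuralNetwork()

# Hypothetical batch: 32 random samples with 784 features each.
dummy_batch = torch.randn(32, 784)
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([32, 10])

# Count trainable parameters (weights plus biases of the three linear layers).
num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(num_params)  # 235146
```

If the inputs are images rather than flat vectors, they would need to be flattened to 784 features per sample before being passed to the model.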
Training the Model
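Next, we will train our model using a training loop. We use the Adam optimizer with a learning rate of 0.001 and cross-entropy loss. The number of epochs (num_epochs) and the data loader (train_loader, which also determines the batch size) must be defined before the loop runs.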
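import torch.optim as optim

# num_epochs (an int) and train_loader (a DataLoader yielding (inputs, labels)
# batches, with inputs of shape (batch, 784)) are assumed to be defined already.
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(num_epochs):
    for inputs, labels in train_loader:
        optimizer.zero_grad()              # clear gradients from the previous step
        outputs = model(inputs)            # forward pass, returns logits
        loss = criterion(outputs, labels)  # labels are integer class indices
        loss.backward()                    # backpropagate
        optimizer.step()                   # update the weights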
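The loop above relies on num_epochs and train_loader, which are not shown. As a minimal sketch, they could be set up like this. The synthetic random data, batch_size = 64 and num_epochs = 2 are placeholder assumptions, and the nn.Sequential model is only a stand-in with the same layer sizes so that the block runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data (an assumption): 256 random samples, 10 classes.
features = torch.randn(256, 784)
targets = torch.randint(0, 10, (256,))

batch_size = 64  # placeholder value
num_epochs = 2   # placeholder value
train_loader = DataLoader(
    TensorDataset(features, targets), batch_size=batch_size, shuffle=True
)

# Stand-in model with the same layer sizes as the network above.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

epoch_losses = []
for epoch in range(num_epochs):
    running_loss = 0.0
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size to get a per-sample mean.
        running_loss += loss.item() * inputs.size(0)
    epoch_losses.append(running_loss / len(train_loader.dataset))
    print(f"epoch {epoch + 1}: mean loss {epoch_losses[-1]:.4f}")
```

Tracking the mean loss per epoch is optional, but it gives a quick signal of whether training is making progress.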
Evaluating the Model
Finally, we will evaluate our model on the test dataset to see how well it performs.
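# test_loader is assumed to be a DataLoader over the test set.
correct = 0
total = 0
model.eval()  # switch to evaluation mode
with torch.no_grad():  # gradients are not needed for evaluation
    for inputs, labels in test_loader:
        outputs = model(inputs)
        predicted = outputs.argmax(dim=1)  # class with the highest logit
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = correct / total
print(f'Accuracy: {accuracy}')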
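The same evaluation logic can be wrapped in a small helper so it is easy to reuse. This is a sketch; the function name evaluate is hypothetical. The toy check uses an identity "model" on one-hot inputs, a setup chosen only because the expected accuracy is known in advance.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader):
    """Return the fraction of samples whose arg-max prediction matches the label."""
    correct = 0
    total = 0
    model.eval()
    with torch.no_grad():
        for inputs, labels in loader:
            predicted = model(inputs).argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return correct / total


# Toy check: with an identity model, one-hot inputs act as their own logits,
# so every prediction matches its label.
labels = torch.tensor([0, 2, 1, 2])
inputs = F.one_hot(labels, num_classes=3).float()
test_loader = DataLoader(TensorDataset(inputs, labels), batch_size=2)

accuracy = evaluate(nn.Identity(), test_loader)
print(f"Accuracy: {accuracy}")  # Accuracy: 1.0
```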
Congratulations! You have successfully built and trained a PyTorch model from scratch. Stay tuned for more advanced topics in our Coding Llama 2 series.
So in this series, you don't use any pre-trained weights? You build and train the model from scratch on a custom dataset?
@user-vd7im8gc2w
Why do you need position ids?
You use them to map each input id to its position in the sequence.
Example:
input_ids = [100, 20, 4, 50]
position_ids = torch.arange(input_ids.shape…)
print(position_ids)
>> [0, 1, 2, 3]
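A runnable version of that example might look like the sketch below. The argument to torch.arange is cut off above, so this assumes the sequence length is read from the last dimension of input_ids, and that input_ids is a tensor with a leading batch dimension rather than a plain list.

```python
import torch

# A batch containing one sequence of four token ids (values from the example above).
input_ids = torch.tensor([[100, 20, 4, 50]])

# Assumption: the sequence length is the size of the last dimension.
seq_len = input_ids.shape[-1]

# One position index per token, with a leading batch dimension to match input_ids.
position_ids = torch.arange(seq_len, dtype=torch.long).unsqueeze(0)

print(position_ids)  # tensor([[0, 1, 2, 3]])
```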
First time watching your video. Keep going bro 💪, it's your friend Afzal
Really great job!