Build and Train a PyTorch LSTM in Under 100 Lines of Code
PyTorch is a popular open-source machine learning library widely used for building deep learning models. In this article, we will build and train a Long Short-Term Memory (LSTM) neural network with PyTorch in under 100 lines of code.
Step 1: Import PyTorch and Prepare the Data
We start by importing the necessary libraries and preparing the data for training. This includes loading the dataset, transforming the data into tensors, and splitting it into training and testing sets.
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
from sklearn.model_selection import train_test_split
# Load the dataset (here: rows of 5 feature columns followed by 1 target column)
data = np.loadtxt('dataset.csv', delimiter=',')
# Transform the data into tensors; the LSTM expects input of shape
# (batch, seq_len, input_size), so we add a sequence dimension of length 1,
# and we give the targets shape (batch, 1) to match the model's output
X = torch.tensor(data[:, :-1], dtype=torch.float32).unsqueeze(1)
y = torch.tensor(data[:, -1], dtype=torch.float32).unsqueeze(1)
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
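The code above treats each CSV row as a single-timestep sequence of five features. If your data is instead one long univariate time series and you want the LSTM to learn from a window of past values, you have to build the sequences yourself. The sketch below is a minimal example of that, reusing the imports from above; the file name series.csv and the window length of 10 are illustrative assumptions, not part of the original code, and with this shape you would create the model with input_size=1.
# Hypothetical alternative: turn a univariate series into (samples, seq_len, 1) windows
series = np.loadtxt('series.csv', delimiter=',')  # assumed single-column file, shape (T,)
seq_len = 10  # arbitrary window length
windows, next_values = [], []
for i in range(len(series) - seq_len):
    windows.append(series[i:i + seq_len])      # the past seq_len values are the input
    next_values.append(series[i + seq_len])    # the following value is the target
X_seq = torch.tensor(np.array(windows), dtype=torch.float32).unsqueeze(-1)  # (N, seq_len, 1)
y_seq = torch.tensor(np.array(next_values), dtype=torch.float32).unsqueeze(-1)  # (N, 1)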
Step 2: Define the LSTM Model
Next, we define the architecture of the LSTM model using the nn.LSTM module provided by PyTorch. We also define the forward method that specifies how the data flows through the network.
class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(LSTM, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Initialize the hidden and cell states with zeros on the same device as the input
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        out, _ = self.lstm(x, (h0, c0))
        # Use the output of the last time step for the final prediction
        out = self.fc(out[:, -1, :])
        return out
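Before moving on to training, it can be worth sanity-checking the model with a dummy batch to confirm that the shapes line up. This is just an illustrative check; the batch size of 8 is arbitrary, and the sequence length of 1 matches the (batch, seq_len, input_size) layout produced in Step 1.
# Quick shape check with random data (illustrative only)
check_model = LSTM(input_size=5, hidden_size=64, num_layers=2, output_size=1)
dummy_input = torch.randn(8, 1, 5)      # (batch=8, seq_len=1, input_size=5)
print(check_model(dummy_input).shape)   # expected: torch.Size([8, 1])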
Step 3: Train the Model
Finally, we define the loss function and optimizer, train the LSTM model on the training data, and evaluate its performance on the test data.
# Define the device for training the model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Initialize the model, loss function, and optimizer
model = LSTM(input_size=5, hidden_size=64, num_layers=2, output_size=1).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Train the model
num_epochs = 100
for epoch in range(num_epochs):
    # Forward pass on the full training set
    inputs = X_train.to(device)
    targets = y_train.to(device)
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    # Backward pass and parameter update
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')
# Evaluate the model on the held-out test set
model.eval()
with torch.no_grad():
    inputs = X_test.to(device)
    targets = y_test.to(device)
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    print(f'Test Loss: {loss.item():.4f}')
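Once the test loss looks reasonable, you will usually want to save the trained weights and reuse the model for inference. Below is a minimal sketch of both; the file name lstm.pt and the randomly generated new_sample are placeholders for illustration.
# Save the trained weights (state_dict only)
torch.save(model.state_dict(), 'lstm.pt')
# Reload into a fresh model instance and run a single prediction
loaded = LSTM(input_size=5, hidden_size=64, num_layers=2, output_size=1).to(device)
loaded.load_state_dict(torch.load('lstm.pt', map_location=device))
loaded.eval()
new_sample = torch.randn(1, 1, 5).to(device)  # placeholder input: (batch=1, seq_len=1, features=5)
with torch.no_grad():
    prediction = loaded(new_sample)
print(prediction.item())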
By following these steps, you can build and train a PyTorch LSTM model in under 100 lines of code, which shows how little boilerplate PyTorch requires for common deep learning tasks.