Creating and Training a PyTorch LSTM in Less Than 100 Lines of Code


PyTorch is a popular open-source machine learning library that is widely used for building deep learning models. In this article, we will learn how to build and train a Long Short-Term Memory (LSTM) neural network using PyTorch in under 100 lines of code.

Step 1: Import PyTorch and Prepare the Data

We start by importing the necessary libraries and preparing the data for training. This includes loading the dataset, transforming the data into tensors, and splitting it into training and testing sets.


import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
from sklearn.model_selection import train_test_split

# Load the dataset
data = np.loadtxt('dataset.csv', delimiter=',')

# Transform the data into tensors; nn.LSTM with batch_first=True expects
# input of shape (batch, seq_len, input_size), so we add a length-1
# sequence dimension
X = torch.tensor(data[:, :-1], dtype=torch.float32).unsqueeze(1)
# Keep the targets as a (batch, 1) column to match the model's output shape
y = torch.tensor(data[:, -1], dtype=torch.float32).unsqueeze(1)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
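
If you don't have a dataset.csv on hand, the snippet below is a minimal sketch for generating a synthetic one. The file name, the 1,000 rows, and the layout of five feature columns followed by one target column are assumptions chosen to match the input_size=5 used in Step 3.


# Minimal sketch: write a synthetic dataset.csv (assumed layout: 5 feature
# columns followed by 1 target column, matching input_size=5 in Step 3)
rng = np.random.default_rng(0)
features = rng.standard_normal((1000, 5))
# Toy target: a noisy linear combination of the features
target = features @ np.array([0.5, -1.0, 2.0, 0.1, 0.3])
target += 0.1 * rng.standard_normal(1000)
np.savetxt('dataset.csv', np.column_stack([features, target]), delimiter=',')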

Step 2: Define the LSTM Model

Next, we define the architecture of the LSTM model using the nn.LSTM module provided by PyTorch. We also define the forward method that specifies how the data flows through the network.


class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(LSTM, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Initialize the hidden and cell states with zeros on the same
        # device as the input (avoids relying on a global `device`)
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device)

        # Run the LSTM; out has shape (batch, seq_len, hidden_size)
        out, _ = self.lstm(x, (h0, c0))

        # Feed the hidden state of the last time step through the linear layer
        out = self.fc(out[:, -1, :])
        return out
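
Before training, a quick shape check with a dummy batch can catch wiring mistakes early. This is just a sketch; the sizes are illustrative, and input_size must match the number of feature columns in your data.


# Sanity check with a dummy batch (illustrative sizes)
check_model = LSTM(input_size=5, hidden_size=64, num_layers=2, output_size=1)
dummy = torch.randn(8, 1, 5)       # (batch, seq_len, input_size)
print(check_model(dummy).shape)    # expected: torch.Size([8, 1])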

Step 3: Train the Model

Finally, we pick a device, define the loss function and optimizer, train the LSTM model on the training data, and evaluate its performance on the test data.


# Define the device for training the model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Initialize the model, loss function, and optimizer
# (input_size=5 must match the number of feature columns in dataset.csv)
model = LSTM(input_size=5, hidden_size=64, num_layers=2, output_size=1).to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Move the full training set to the device once, outside the loop
inputs = X_train.to(device)
targets = y_train.to(device)

# Train the model
num_epochs = 100
for epoch in range(num_epochs):
    outputs = model(inputs)
    loss = criterion(outputs, targets)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Evaluate the model
model.eval()
with torch.no_grad():
    inputs = X_test.to(device)
    targets = y_test.to(device)
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    print(f'Test Loss: {loss.item():.4f}')
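
To use the trained model on new data, here is a minimal single-sample inference sketch; the feature values are placeholders, and the sample must have the same five features as the training data.


# Run the trained model on one new sample (placeholder feature values)
with torch.no_grad():
    sample = torch.tensor([[0.1, 0.2, 0.3, 0.4, 0.5]], dtype=torch.float32)
    sample = sample.unsqueeze(1).to(device)   # shape (1, 1, 5)
    prediction = model(sample)
    print(f'Predicted value: {prediction.item():.4f}')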

By following these simple steps, you can build and train a PyTorch LSTM model in under 100 lines of code, which shows how approachable PyTorch makes deep learning tasks.

3 Comments
@AmritBasi
6 months ago

recently came across this channel and I'm enjoying this style of content, many thanks!

@777TYT
6 months ago

A GitHub link is missing, so the code has to be copied from the screen by hand; please consider adding one in the future. The video also opens rather abruptly, as if earlier videos had to be watched first, even though nothing says this is part 2 or similar.

@AlexeyMatushevsky
6 months ago

Why do we need to initialize the hidden_states and cell_states of the LSTM every training step?