Creating a neural network from scratch using only numpy and math

Building a neural network from scratch using only numpy and math may seem like a daunting task, but it can actually be a very enlightening and rewarding experience. In this tutorial, we will go through the step-by-step process of building a simple neural network with one hidden layer to classify handwritten digits from the MNIST dataset.

First, let’s start by understanding the basic structure of a neural network. A neural network is made up of layers of neurons, each of which takes in inputs, performs some computations, and produces an output. The basic building block of a neural network is the perceptron, which takes in inputs, multiplies them by weights, adds a bias, and passes the result through an activation function.
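
As a quick illustration (a toy sketch with made-up inputs and weights, not part of the MNIST model we build below), a single perceptron in numpy looks like this:

import numpy as np

# A single perceptron: weighted sum of inputs, plus a bias,
# passed through an activation function.
inputs = np.array([0.5, 0.3, 0.2])    # hypothetical input values
weights = np.array([0.4, 0.7, -0.2])  # hypothetical learned weights
bias = 0.1

z = np.dot(inputs, weights) + bias  # 0.5*0.4 + 0.3*0.7 + 0.2*(-0.2) + 0.1 = 0.47
output = 1 / (1 + np.exp(-z))       # sigmoid activation, about 0.615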

To build a neural network from scratch, we need to define the following components:

  1. Input Layer: This is the layer that takes in the input data. In our case, we will be using the images of handwritten digits from the MNIST dataset, which are 28×28 pixels each.

  2. Hidden Layer: This is the layer in between the input and output layers that performs computations to extract features from the input data.

  3. Output Layer: This is the final layer that produces the output of the neural network. In our case, we will have 10 neurons, each corresponding to a digit from 0 to 9.

  4. Weights and Biases: These are the parameters of the neural network that are learned during the training process.

  5. Activation Function: This is the non-linear function that introduces non-linearity into the neural network. In this tutorial, we will be using the sigmoid activation function.
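
Before writing any code, it helps to see how data will flow through these pieces. As a sketch of the array shapes involved (using the layer sizes we choose in Step 2 below):

# Data flow through the network, shapes only:
# input:   (batch, 784)
# hidden:  (batch, 784) @ (784, 128) + (128,)  -> sigmoid -> (batch, 128)
# output:  (batch, 128) @ (128, 10)  + (10,)   -> sigmoid -> (batch, 10)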

Now, let’s move on to the implementation of the neural network.

Step 1: Load the MNIST dataset
First, we need to load the MNIST dataset. Here we use the python-mnist package to read the raw files, then convert everything to numpy arrays. The dataset consists of 60,000 training images and 10,000 test images, each of which is 28×28 pixels. We normalize the pixel values to be between 0 and 1, and we also one-hot encode the labels, since the loss function and backward pass below compare them element-wise against the network's 10 outputs.

import numpy as np
from mnist import MNIST  # the python-mnist package (pip install python-mnist)

mndata = MNIST('path_to_mnist_data')
train_images, train_labels = mndata.load_training()
test_images, test_labels = mndata.load_testing()

# Convert images and labels to numpy arrays
train_images = np.array(train_images)
test_images = np.array(test_images)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)

# Normalize pixel values to [0, 1]
train_images = train_images / 255.0
test_images = test_images / 255.0

# One-hot encode the labels: digit 3 becomes [0,0,0,1,0,0,0,0,0,0]
train_labels_one_hot = np.eye(10)[train_labels]
test_labels_one_hot = np.eye(10)[test_labels]
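
It is worth confirming the shapes at this point (assuming the python-mnist package above, which returns each image already flattened to 784 values):

print(train_images.shape)          # (60000, 784)
print(train_labels_one_hot.shape)  # (60000, 10)
print(test_images.shape)           # (10000, 784)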

Step 2: Define the neural network architecture
Next, we need to define the architecture of the neural network. We will have 784 input neurons (28×28 pixels), 128 neurons in the hidden layer, and 10 output neurons.

input_size = 784
hidden_size = 128
output_size = 10

Step 3: Initialize the weights and biases
We need to initialize the weights and biases of the neural network. We will initialize the weights to small random values (scaling them down keeps the sigmoids out of their flat, saturated regions at the start of training) and the biases to zeros.

# Small random weights; the 0.01 factor keeps the initial
# pre-activations small so the sigmoids do not saturate
weights_input_hidden = np.random.randn(input_size, hidden_size) * 0.01
biases_input_hidden = np.zeros(hidden_size)

weights_hidden_output = np.random.randn(hidden_size, output_size) * 0.01
biases_hidden_output = np.zeros(output_size)

Step 4: Implement the sigmoid activation function
Next, we need to implement the sigmoid activation function, which squashes the output of a neuron to be between 0 and 1.

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
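
A quick check of what this does to a few values (illustrative only):

sigmoid(np.array([-5.0, 0.0, 5.0]))
# -> approximately [0.0067, 0.5, 0.9933]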

Step 5: Implement the forward pass
Now, we can implement the forward pass of the neural network, which computes the output of the neural network given an input. We also return the hidden layer's activations, because the backward pass in Step 7 will need them.

def forward_pass(input_data):
    # Hidden layer: affine transform followed by sigmoid
    hidden_layer_input = np.dot(input_data, weights_input_hidden) + biases_input_hidden
    hidden_layer_output = sigmoid(hidden_layer_input)

    # Output layer: affine transform followed by sigmoid
    output_layer_input = np.dot(hidden_layer_output, weights_hidden_output) + biases_hidden_output
    output_layer_output = sigmoid(output_layer_input)

    # Return the hidden activations too; backpropagation needs them
    return hidden_layer_output, output_layer_output
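
For example, running a small batch through the untrained network (assuming the arrays from the earlier steps are defined):

hidden, probs = forward_pass(train_images[:5])
print(hidden.shape)  # (5, 128)
print(probs.shape)   # (5, 10)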

Step 6: Implement the loss function
We need to define a loss function to measure how well the neural network is performing. In this tutorial, we will use the cross-entropy loss function.

def cross_entropy_loss(predictions, targets):
    # Clip predictions away from 0 and 1 so np.log never sees 0
    predictions = np.clip(predictions, 1e-12, 1 - 1e-12)
    return -np.mean(targets * np.log(predictions) + (1 - targets) * np.log(1 - predictions))
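
To see the loss in action on a single made-up example (a one-hot target for class 1, and a prediction that puts most of its mass there):

targets = np.array([[0.0, 1.0, 0.0]])
predictions = np.array([[0.1, 0.8, 0.1]])
print(cross_entropy_loss(predictions, targets))  # about 0.1446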

Step 7: Implement the backward pass (backpropagation)
Next, we need to implement the backward pass of the neural network, which uses the chain rule to compute the gradients of the loss with respect to the weights and biases. It takes the hidden activations cached by the forward pass as an extra argument.

def backward_pass(input_data, targets, predictions, hidden_layer_output):
    # Gradient of the loss w.r.t. the output pre-activations. For sigmoid +
    # cross-entropy this simplifies to (predictions - targets); dividing by
    # targets.size matches the np.mean in cross_entropy_loss.
    output_error = (predictions - targets) / targets.size

    # Chain rule: propagate the error back through the output weights,
    # then multiply by the sigmoid derivative h * (1 - h)
    hidden_error = np.dot(output_error, weights_hidden_output.T) * hidden_layer_output * (1 - hidden_layer_output)

    d_weights_hidden_output = np.dot(hidden_layer_output.T, output_error)
    d_biases_hidden_output = np.sum(output_error, axis=0)

    d_weights_input_hidden = np.dot(input_data.T, hidden_error)
    d_biases_input_hidden = np.sum(hidden_error, axis=0)

    return d_weights_input_hidden, d_biases_input_hidden, d_weights_hidden_output, d_biases_hidden_output
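
Before training, it is worth sanity-checking the analytic gradients against a finite-difference estimate on a small batch. This is a hypothetical check, not part of the original tutorial; it reuses the names defined in the steps above:

# Finite-difference gradient check on a small batch
eps = 1e-5
x, y = train_images[:64], train_labels_one_hot[:64]

hidden, preds = forward_pass(x)
analytic = backward_pass(x, y, preds, hidden)[0][400, 0]  # one weight for a pixel near the image centre

weights_input_hidden[400, 0] += eps
loss_plus = cross_entropy_loss(forward_pass(x)[1], y)
weights_input_hidden[400, 0] -= 2 * eps
loss_minus = cross_entropy_loss(forward_pass(x)[1], y)
weights_input_hidden[400, 0] += eps  # restore the original weight

numeric = (loss_plus - loss_minus) / (2 * eps)
print(analytic, numeric)  # the two values should agree to several decimal places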

Step 8: Train the neural network
Now that we have implemented all the necessary components of the neural network, we can train it by running multiple iterations of the forward and backward passes. Note that this is full-batch gradient descent over all 60,000 images on every iteration; it is simple but slow, and in practice you would usually train on smaller mini-batches.

learning_rate = 0.01

for i in range(1000):
    # Forward pass: hidden activations and predictions
    hidden_layer_output, predictions = forward_pass(train_images)
    loss = cross_entropy_loss(predictions, train_labels_one_hot)

    # Backward pass: gradients of the loss w.r.t. all parameters
    d_weights_input_hidden, d_biases_input_hidden, d_weights_hidden_output, d_biases_hidden_output = backward_pass(train_images, train_labels_one_hot, predictions, hidden_layer_output)

    # Gradient descent update
    weights_input_hidden -= learning_rate * d_weights_input_hidden
    biases_input_hidden -= learning_rate * d_biases_input_hidden
    weights_hidden_output -= learning_rate * d_weights_hidden_output
    biases_hidden_output -= learning_rate * d_biases_hidden_output

    if i % 100 == 0:
        print("Iteration {}, Loss: {}".format(i, loss))

Step 9: Make predictions on the test set
Finally, we can make predictions on the test set using the trained neural network and evaluate its performance.

# forward_pass returns (hidden activations, predictions); we only need the predictions here
_, test_predictions = forward_pass(test_images)
test_accuracy = np.mean(np.argmax(test_predictions, axis=1) == test_labels)

print("Test Accuracy: {}".format(test_accuracy))

Congratulations! You have successfully built a neural network from scratch using only numpy and math. You can now experiment with different architectures, activation functions, and optimization algorithms to improve the performance of your neural network. Happy coding!
