Creating a neural network from scratch using only numpy and math

Building a neural network from scratch using only numpy and math may seem like a daunting task, but it can actually be a very enlightening and rewarding experience. In this tutorial, we will go through the step-by-step process of building a simple neural network with one hidden layer to classify handwritten digits from the MNIST dataset.

First, let’s start by understanding the basic structure of a neural network. A neural network is made up of layers of neurons, each of which takes in inputs, performs some computations, and produces an output. The basic building block of a neural network is the perceptron, which takes in inputs, multiplies them by weights, adds a bias, and passes the result through an activation function.
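
As a quick illustration (a toy sketch with made-up inputs and weights, not part of the MNIST model we build below), a single perceptron in numpy looks like this:

import numpy as np

# A single perceptron: weighted sum of inputs, plus a bias,
# passed through an activation function.
inputs = np.array([0.5, 0.3, 0.2])    # hypothetical input values
weights = np.array([0.4, 0.7, -0.2])  # hypothetical learned weights
bias = 0.1

z = np.dot(inputs, weights) + bias  # 0.5*0.4 + 0.3*0.7 + 0.2*(-0.2) + 0.1 = 0.47
output = 1 / (1 + np.exp(-z))       # sigmoid activation, about 0.615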

To build a neural network from scratch, we need to define the following components:

  1. Input Layer: This is the layer that takes in the input data. In our case, we will be using the images of handwritten digits from the MNIST dataset, which are 28×28 pixels each.

  2. Hidden Layer: This is the layer in between the input and output layers that performs computations to extract features from the input data.

  3. Output Layer: This is the final layer that produces the output of the neural network. In our case, we will have 10 neurons, each corresponding to a digit from 0 to 9.

  4. Weights and Biases: These are the parameters of the neural network that are learned during the training process.

  5. Activation Function: This is the non-linear function that introduces non-linearity into the neural network. In this tutorial, we will be using the sigmoid activation function.
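
Before writing any code, it helps to see how data will flow through these pieces. As a sketch of the array shapes involved (using the layer sizes we choose in Step 2 below):

# Data flow through the network, shapes only:
# input:   (batch, 784)
# hidden:  (batch, 784) @ (784, 128) + (128,)  -> sigmoid -> (batch, 128)
# output:  (batch, 128) @ (128, 10)  + (10,)   -> sigmoid -> (batch, 10)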

Now, let’s move on to the implementation of the neural network.

Step 1: Load the MNIST dataset
First, we need to load the MNIST dataset. Here we use the python-mnist package to read the raw files, then convert everything to numpy arrays. The dataset consists of 60,000 training images and 10,000 test images, each of which is 28×28 pixels. We normalize the pixel values to be between 0 and 1, and we also one-hot encode the labels, since the loss function and backward pass below compare them element-wise against the network's 10 outputs.

import numpy as np
from mnist import MNIST  # the python-mnist package (pip install python-mnist)

mndata = MNIST('path_to_mnist_data')
train_images, train_labels = mndata.load_training()
test_images, test_labels = mndata.load_testing()

# Convert images and labels to numpy arrays
train_images = np.array(train_images)
test_images = np.array(test_images)
train_labels = np.array(train_labels)
test_labels = np.array(test_labels)

# Normalize pixel values to [0, 1]
train_images = train_images / 255.0
test_images = test_images / 255.0

# One-hot encode the labels: digit 3 becomes [0,0,0,1,0,0,0,0,0,0]
train_labels_one_hot = np.eye(10)[train_labels]
test_labels_one_hot = np.eye(10)[test_labels]
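
It is worth confirming the shapes at this point (assuming the python-mnist package above, which returns each image already flattened to 784 values):

print(train_images.shape)          # (60000, 784)
print(train_labels_one_hot.shape)  # (60000, 10)
print(test_images.shape)           # (10000, 784)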

Step 2: Define the neural network architecture
Next, we need to define the architecture of the neural network. We will have 784 input neurons (28×28 pixels), 128 neurons in the hidden layer, and 10 output neurons.

input_size = 784
hidden_size = 128
output_size = 10

Step 3: Initialize the weights and biases
We need to initialize the weights and biases of the neural network. We will initialize the weights to small random values (scaling them down keeps the sigmoids out of their flat, saturated regions at the start of training) and the biases to zeros.

# Small random weights; the 0.01 factor keeps the initial
# pre-activations small so the sigmoids do not saturate
weights_input_hidden = np.random.randn(input_size, hidden_size) * 0.01
biases_input_hidden = np.zeros(hidden_size)

weights_hidden_output = np.random.randn(hidden_size, output_size) * 0.01
biases_hidden_output = np.zeros(output_size)

Step 4: Implement the sigmoid activation function
Next, we need to implement the sigmoid activation function, which squashes the output of a neuron to be between 0 and 1.

def sigmoid(x):
    return 1 / (1 + np.exp(-x))
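
A quick check of what this does to a few values (illustrative only):

sigmoid(np.array([-5.0, 0.0, 5.0]))
# -> approximately [0.0067, 0.5, 0.9933]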

Step 5: Implement the forward pass
Now, we can implement the forward pass of the neural network, which computes the output of the neural network given an input. We also return the hidden layer's activations, because the backward pass in Step 7 will need them.

def forward_pass(input_data):
    # Hidden layer: affine transform followed by sigmoid
    hidden_layer_input = np.dot(input_data, weights_input_hidden) + biases_input_hidden
    hidden_layer_output = sigmoid(hidden_layer_input)

    # Output layer: affine transform followed by sigmoid
    output_layer_input = np.dot(hidden_layer_output, weights_hidden_output) + biases_hidden_output
    output_layer_output = sigmoid(output_layer_input)

    # Return the hidden activations too; backpropagation needs them
    return hidden_layer_output, output_layer_output
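
For example, running a small batch through the untrained network (assuming the arrays from the earlier steps are defined):

hidden, probs = forward_pass(train_images[:5])
print(hidden.shape)  # (5, 128)
print(probs.shape)   # (5, 10)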

Step 6: Implement the loss function
We need to define a loss function to measure how well the neural network is performing. In this tutorial, we will use the cross-entropy loss function.

def cross_entropy_loss(predictions, targets):
    # Clip predictions away from 0 and 1 so np.log never sees 0
    predictions = np.clip(predictions, 1e-12, 1 - 1e-12)
    return -np.mean(targets * np.log(predictions) + (1 - targets) * np.log(1 - predictions))
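
To see the loss in action on a single made-up example (a one-hot target for class 1, and a prediction that puts most of its mass there):

targets = np.array([[0.0, 1.0, 0.0]])
predictions = np.array([[0.1, 0.8, 0.1]])
print(cross_entropy_loss(predictions, targets))  # about 0.1446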

Step 7: Implement the backward pass (backpropagation)
Next, we need to implement the backward pass of the neural network, which uses the chain rule to compute the gradients of the loss with respect to the weights and biases. It takes the hidden activations cached by the forward pass as an extra argument.

def backward_pass(input_data, targets, predictions, hidden_layer_output):
    # Gradient of the loss w.r.t. the output pre-activations. For sigmoid +
    # cross-entropy this simplifies to (predictions - targets); dividing by
    # targets.size matches the np.mean in cross_entropy_loss.
    output_error = (predictions - targets) / targets.size

    # Chain rule: propagate the error back through the output weights,
    # then multiply by the sigmoid derivative h * (1 - h)
    hidden_error = np.dot(output_error, weights_hidden_output.T) * hidden_layer_output * (1 - hidden_layer_output)

    d_weights_hidden_output = np.dot(hidden_layer_output.T, output_error)
    d_biases_hidden_output = np.sum(output_error, axis=0)

    d_weights_input_hidden = np.dot(input_data.T, hidden_error)
    d_biases_input_hidden = np.sum(hidden_error, axis=0)

    return d_weights_input_hidden, d_biases_input_hidden, d_weights_hidden_output, d_biases_hidden_output
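
Before training, it is worth sanity-checking the analytic gradients against a finite-difference estimate on a small batch. This is a hypothetical check, not part of the original tutorial; it reuses the names defined in the steps above:

# Finite-difference gradient check on a small batch
eps = 1e-5
x, y = train_images[:64], train_labels_one_hot[:64]

hidden, preds = forward_pass(x)
analytic = backward_pass(x, y, preds, hidden)[0][400, 0]  # one weight for a pixel near the image centre

weights_input_hidden[400, 0] += eps
loss_plus = cross_entropy_loss(forward_pass(x)[1], y)
weights_input_hidden[400, 0] -= 2 * eps
loss_minus = cross_entropy_loss(forward_pass(x)[1], y)
weights_input_hidden[400, 0] += eps  # restore the original weight

numeric = (loss_plus - loss_minus) / (2 * eps)
print(analytic, numeric)  # the two values should agree to several decimal places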

Step 8: Train the neural network
Now that we have implemented all the necessary components of the neural network, we can train it by running multiple iterations of the forward and backward passes. Note that this is full-batch gradient descent over all 60,000 images on every iteration; it is simple but slow, and in practice you would usually train on smaller mini-batches.

learning_rate = 0.01

for i in range(1000):
    # Forward pass: hidden activations and predictions
    hidden_layer_output, predictions = forward_pass(train_images)
    loss = cross_entropy_loss(predictions, train_labels_one_hot)

    # Backward pass: gradients of the loss w.r.t. all parameters
    d_weights_input_hidden, d_biases_input_hidden, d_weights_hidden_output, d_biases_hidden_output = backward_pass(train_images, train_labels_one_hot, predictions, hidden_layer_output)

    # Gradient descent update
    weights_input_hidden -= learning_rate * d_weights_input_hidden
    biases_input_hidden -= learning_rate * d_biases_input_hidden
    weights_hidden_output -= learning_rate * d_weights_hidden_output
    biases_hidden_output -= learning_rate * d_biases_hidden_output

    if i % 100 == 0:
        print("Iteration {}, Loss: {}".format(i, loss))

Step 9: Make predictions on the test set
Finally, we can make predictions on the test set using the trained neural network and evaluate its performance.

# forward_pass returns (hidden activations, predictions); we only need the predictions here
_, test_predictions = forward_pass(test_images)
test_accuracy = np.mean(np.argmax(test_predictions, axis=1) == test_labels)

print("Test Accuracy: {}".format(test_accuracy))

Congratulations! You have successfully built a neural network from scratch using only numpy and math. You can now experiment with different architectures, activation functions, and optimization algorithms to improve the performance of your neural network. Happy coding!
