Recognizing Handwritten Digits with Scikit-Learn: A Project in Machine Learning

Posted by Alfalfa on October 8, 2024

Handwritten digit recognition is a classic machine learning problem. In this tutorial, we will use the Scikit-Learn library in Python to build a handwritten digit recognition system. The project serves as a simple example of how machine learning algorithms work and how they can be applied to real-life problems.

To get started, you will need a basic understanding of the Python programming language and of machine learning concepts, plus some familiarity with the Scikit-Learn library. If you are new to machine learning, it is recommended that you first go through introductory tutorials on supervised learning, classification algorithms, and Scikit-Learn itself.

Step 1: Importing Required Libraries

The first step is to import the required libraries. We need NumPy for numerical operations, Matplotlib for displaying images, and Scikit-Learn for the dataset, the model, and the evaluation metric.

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

Step 2: Loading and Preprocessing the Dataset

Scikit-Learn provides a built-in dataset called ‘digits’, which contains 1,797 grayscale images of handwritten digits, each 8×8 pixels, along with their labels (0 through 9). We will load this dataset and split it into training and test sets before training our model.

digits = datasets.load_digits()

X = digits.data
y = digits.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
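Before training, it can help to check what the loader and the split actually return. The snippet below is a minimal sketch that repeats the loading step and prints the array shapes; it assumes the same `test_size=0.2` and `random_state=42` used above.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()

# digits.images holds the 8x8 pixel grids; digits.data holds the same
# images flattened to 64 features per sample.
print("images:", digits.images.shape)
print("data:", digits.data.shape)
print("labels:", sorted(set(digits.target.tolist())))

X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# The two subsets together account for every sample in the dataset.
print("train/test sizes:", len(X_train), len(X_test))
```

The flattened `data` array is what the model consumes; the `images` array is only needed when you want to display a digit.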

Step 3: Training the Machine Learning Model

For this project, we will train a logistic regression model, a popular classification algorithm that handles multi-class problems like this one.

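# max_iter is raised from the default of 100 so the solver has room to converge on the unscaled pixel values
model = LogisticRegression(max_iter=10000)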

model.fit(X_train, y_train)
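Logistic regression solvers tend to converge more easily when the features are on a similar scale. As an optional variation, not part of the steps above, the sketch below standardizes the pixel values with `StandardScaler` inside a pipeline before fitting. The pipeline, the `scaled_model` name, and the `max_iter=1000` setting are assumptions chosen for illustration.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# The scaler is fitted on the training data only; the pipeline then applies
# the same transformation automatically whenever it predicts.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_train, y_train)

test_accuracy = scaled_model.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```

Wrapping the scaler and the classifier in one pipeline keeps the test set out of the scaling statistics, which avoids leaking information from the test data into training.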

Step 4: Evaluating the Model Performance

Once the model is trained, we need to evaluate its performance on the test set. We will calculate the accuracy score, which is the fraction of test images whose digit the model predicts correctly.

y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
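A single accuracy number does not show which digits get confused with each other. The sketch below goes one step beyond the tutorial and prints a confusion matrix and a per-class report using `confusion_matrix` and `classification_report` from `sklearn.metrics`. It retrains the model so that it runs on its own; the `max_iter=10000` value is an assumption.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=10000)  # assumed setting, see lead-in
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true digits, columns are predicted digits; off-diagonal
# entries count the misclassifications.
cm = confusion_matrix(y_test, y_pred, labels=model.classes_)
print(cm)

# Precision, recall, and F1 score for each digit.
print(classification_report(y_test, y_pred))
```

A large off-diagonal entry points to a pair of digits that the model mixes up, which is a useful starting point when trying to improve it.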

Step 5: Making Predictions

Now that our model is trained and evaluated, we can use it to make predictions. The model expects each image as a flattened array of 64 pixel values, in the same format as the training data, and returns the predicted label. Here we pick a random image from the test set.

# Randomly select an image from the test set
random_idx = np.random.randint(0, len(X_test))
input_img = X_test[random_idx].reshape(8, 8)

# Plot the input image
plt.imshow(input_img, cmap='gray')
plt.show()

# Make a prediction on the input image
prediction = model.predict([X_test[random_idx]])
print("Predicted digit:", prediction[0])
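Besides the predicted label, a logistic regression model can report how confident it is through `predict_proba`. The sketch below prints the three most likely digits for one test image. The fixed seed and the choice to show three candidates are assumptions made for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=10000)  # assumed setting
model.fit(X_train, y_train)

# Pick one test image reproducibly (the seed is arbitrary).
rng = np.random.default_rng(0)
idx = int(rng.integers(0, len(X_test)))
sample = X_test[idx].reshape(1, -1)  # the model expects a 2D array

# One probability per class, in the order given by model.classes_.
probs = model.predict_proba(sample)[0]
top3 = np.argsort(probs)[::-1][:3]
for i in top3:
    print(f"digit {model.classes_[i]}: probability {probs[i]:.3f}")
print("True label:", y_test[idx])
```

When the top probability is only slightly higher than the runner-up, the image is ambiguous to the model, and its prediction deserves less trust.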

Conclusion

In this tutorial, we built a simple handwritten digit recognition system using the Scikit-Learn library in Python. We trained a logistic regression model on the built-in digits dataset, measured its accuracy on a held-out test set, and used it to predict the label of an individual image. This project is a good starting point for beginners who want hands-on experience with machine learning algorithms and with applying them to real-world problems.


30 Comments


@srinusabbavarapu6509

1 month ago

I want to predict double digits. Is that possible, sir? Not just 0-9, but numbers like 10, 11, 12.

@himanshubhawnani401

1 month ago

It was very confusing; I didn't understand it properly.

@anwaydeepnath5017

1 month ago

Sir which IDE is this???

@kalotheo

1 month ago

Do you know why in get_dataset in line 40 the compiler show this error? TypeError: a bytes-like object is required, not 'str'. I'm using Python 3.8.

@mistershort10

1 month ago

This is totally outdated. A total waste of time.

@heratpatel6284

1 month ago

How long does it take to train an SVM on the MNIST dataset? It is taking too long to run. I tried Google Colab and used its GPU as well, but I'm not sure whether it was actually being used. Any help would be appreciated. Thank you in advance.

@minkiaggarwall5529

1 month ago

plz give step 3 of svm_starter

@maeeshameem1578

1 month ago

I run the gen_dataset code, and no error showed. but still the dataset is not generated. can plz someone tell the reason?? plz plz :(((((

@mkarthik4768

1 month ago

Hello, how do we run this in jupyter Notebook instead of sublime

@danishtasheikh8341

1 month ago

Is Python 3.8.1 compatible with these library versions?

@jayeshkulkarni9602

1 month ago

Trying to unpickle estimator SVC from version 0.19.2 when using version 0.20.4. This might lead to breaking code or invalid results…Want to get rid of this error

@forampattha6183

1 month ago

can you please explain why you have to used temp folder ?

@rahulsolankib

1 month ago

Can anyone tell me how to do the same for handwritten digits? I tried above solution but there every pixel value becomes 1 that means it is not able to classify it from image sample

@milanpatel8034

1 month ago

Is the file named "svm_starter.py" used to create the model which is further used to predict/test the model?

@krishanbhadana5308

1 month ago

Thankyou

@swathimparamesh8366

1 month ago

Unable to print the dataframe..what to do?

@harshinisewani800

1 month ago

AttributeError: '_csv.writer' object has no attribute 'writerrows'
How to solve this error ?

@mayanktripathi4u

1 month ago

Hi Sir,
I am using PIL to load the image, and facing issue. Please help.

I am loading the image , and reshaping it to 28*28.. where as when converting it to numpy array at-time it convert into 28*28*3 and at-times into 28*28*4… how to standardize it.
Below is the code.

`from PIL import Image

import numpy as np

size = 28, 28
img = Image.open("handwritten_image_256x256.png")

img
img = img.resize(size, Image.ANTIALIAS)

display(img.size)

img
img_array = np.array(img)

display(img_array.shape, img_array)`

@vinaypalnati8117

1 month ago

After forking the content, I am not able to clone it to my local repo. What might be the problem?

@palaksharma4334

1 month ago

how can i download the dataset?
