Handwritten digit recognition is a classic problem in machine learning. In this tutorial, we will use the Scikit-Learn library in Python to build a handwritten digit recognition system. The project serves as a simple example of how machine learning algorithms work and how they can be applied to real-life problems.
To get started, you will need a basic understanding of the Python programming language and of machine learning concepts, plus some familiarity with the Scikit-Learn library. If you are new to machine learning, it’s recommended to first go through some introductory tutorials on supervised learning, classification algorithms, and the Scikit-Learn library.
Step 1: Importing Required Libraries
The first step is to import the required libraries for our project. We need NumPy for numerical operations, Matplotlib for displaying images, and Scikit-Learn for the dataset, the model, and the evaluation metric.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
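Several version-related problems can be ruled out early by checking which versions of these libraries are installed. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_versions is hypothetical and not part of the tutorial.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("numpy", "matplotlib", "scikit-learn")):
    """Return a dict mapping each package name to its installed version, or None."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


versions = installed_versions()
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

If any package is reported as not installed, install it before running the remaining steps.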
Step 2: Loading and Preprocessing the Dataset
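Scikit-Learn provides a built-in dataset called ‘digits’, which contains 1,797 grayscale images of handwritten digits, each 8×8 pixels, along with their labels. We will load the dataset, take the flattened 64-pixel feature vectors from digits.data, and split them into training and test sets before training our model.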
digits = datasets.load_digits()
X = digits.data
y = digits.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
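Before training, it can help to look at what was actually loaded. The sketch below prints the array shapes, the pixel value range, and the number of samples per digit. It also passes stratify=y to train_test_split, which is an addition not used in the tutorial's own split; it keeps the class proportions similar in both sets.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X, y = digits.data, digits.target

# digits.data holds flattened images; digits.images holds the same pixels as 8x8 arrays.
print("Feature matrix shape:", X.shape)
print("Image array shape:", digits.images.shape)
print("Pixel value range:", X.min(), "to", X.max())

# Count how many samples exist for each digit class.
labels, counts = np.unique(y, return_counts=True)
print("Samples per digit:", dict(zip(labels.tolist(), counts.tolist())))

# Assumption: a stratified split, which differs from the plain split used above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print("Training samples:", len(X_train), "| Test samples:", len(X_test))
```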
Step 3: Training the Machine Learning Model
For this project, we will train a simple logistic regression model. Logistic regression is a popular classification algorithm that is widely used for problems like this; here it learns to assign each 64-pixel feature vector to one of the ten digit classes.
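# Raise max_iter above the default of 100 to give the solver room to converge on unscaled pixel values
model = LogisticRegression(max_iter=10000)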
model.fit(X_train, y_train)
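With default settings, the solver may emit a ConvergenceWarning on raw pixel values. One common remedy, sketched here as an alternative rather than as the tutorial's own method, is to standardize the features in a pipeline before fitting. The name scaled_model and the max_iter value are illustrative assumptions.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# The scaler is fit on the training data only; the pipeline reuses it at predict time.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_train, y_train)

test_accuracy = scaled_model.score(X_test, y_test)
print("Test accuracy with scaling:", test_accuracy)
```

Wrapping both steps in one pipeline means the same scaling is applied automatically whenever the model is used for prediction.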
Step 4: Evaluating the Model Performance
Once the model is trained, we need to evaluate its performance on the test set, which it has not seen during training. We will calculate the accuracy score, that is, the fraction of test images whose predicted digit matches the true label.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
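A single accuracy number does not show which digits get confused with each other. As a hedged sketch, the confusion matrix and per-class report from sklearn.metrics give a more detailed picture. The max_iter value here is an assumption chosen to help the solver converge.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true digits, columns are predicted digits; the diagonal holds correct predictions.
cm = confusion_matrix(y_test, y_pred, labels=list(range(10)))
print(cm)

# Precision, recall, and F1 score for each digit class.
print(classification_report(y_test, y_pred))

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

Large off-diagonal entries in the matrix point to pairs of digits that the model tends to mix up.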
Step 5: Making Predictions
Now that our model is trained and evaluated, we can use it to make predictions. The model expects each input as a flattened vector of 64 pixel values and returns the predicted label. Here we pick a random image from the test set as a stand-in for a new handwritten digit.
# Randomly select an image from the test set
random_idx = np.random.randint(0, len(X_test))
input_img = X_test[random_idx].reshape(8, 8)
# Plot the input image
plt.imshow(input_img, cmap='gray')
plt.show()
# Make a prediction on the input image
prediction = model.predict([X_test[random_idx]])
print("Predicted digit:", prediction[0])
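To see how confident the model is in a prediction, predict_proba returns one probability per digit class. The sketch below compares the prediction with the true label and lists the three most likely digits. The fixed seed and the max_iter value are assumptions made so the example is repeatable.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

# Pick one test image with a seeded generator so the choice is reproducible.
rng = np.random.default_rng(0)
idx = int(rng.integers(0, len(X_test)))
sample = X_test[idx].reshape(1, -1)  # the model expects a 2D array of shape (1, 64)

probabilities = model.predict_proba(sample)[0]
predicted = model.classes_[np.argmax(probabilities)]
top3 = np.argsort(probabilities)[::-1][:3]

print("True digit:", y_test[idx])
print("Predicted digit:", predicted)
for k in top3:
    print(f"  digit {model.classes_[k]}: probability {probabilities[k]:.3f}")
```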
Conclusion
In this tutorial, we built a simple handwritten digit recognition system using the Scikit-Learn library in Python. We loaded the built-in digits dataset, trained a logistic regression model on it, measured its accuracy on a held-out test set, and used it to predict the label of an individual image. This project is a good starting point for beginners to get hands-on experience with machine learning algorithms and to understand how they can be applied to real-world problems.
I want to predict double-digit numbers. Is that possible, sir? Not just 0-9, but numbers like 10, 11, 12.
It was very confusing; I didn't understand it properly.
Sir, which IDE is this?
Do you know why line 40 of get_dataset shows this error? TypeError: a bytes-like object is required, not 'str'. I'm using Python 3.8.
This is totally outdated. A total waste of time.
How long does it take to train an SVM on the MNIST dataset? It is taking too long to run. I tried Google Colab and its GPU as well, but I'm not sure whether the GPU was actually being used. Any help would be appreciated. Thank you in advance.
Please give step 3 of svm_starter.
I ran the gen_dataset code and no error was shown, but the dataset is still not generated. Can someone please tell me the reason?
Hello, how do we run this in Jupyter Notebook instead of Sublime?
Is Python 3.8.1 compatible with these library versions?
"Trying to unpickle estimator SVC from version 0.19.2 when using version 0.20.4. This might lead to breaking code or invalid results…" I want to get rid of this error.
Can you please explain why you had to use a temp folder?
Can anyone tell me how to do the same for handwritten digits? I tried the above solution, but every pixel value becomes 1, which means it is not able to classify the digit from the image sample.
Is the file named "svm_starter.py" used to create the model which is further used to predict/test the model?
Thank you
Unable to print the dataframe. What should I do?
AttributeError: '_csv.writer' object has no attribute 'writerrows'
How do I solve this error?
Hi Sir,
I am using PIL to load the image and am facing an issue. Please help.
I am loading the image and resizing it to 28*28, but when I convert it to a NumPy array, the shape is sometimes 28*28*3 and sometimes 28*28*4. How can I standardize it?
Below is the code.
`from PIL import Image
import numpy as np
size = 28, 28
img = Image.open("handwritten_image_256x256.png")
img
img = img.resize(size, Image.ANTIALIAS)
display(img.size)
img
img_array = np.array(img)
display(img_array.shape, img_array)`
After forking the repository, I am not able to clone it to my local machine. What might be the problem?
How can I download the dataset?