Training a K-Nearest Neighbors (KNN) Classifier with Scikit-learn in Python

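In this tutorial, we will walk through the process of training a K-Nearest Neighbors (KNN) classifier using the Scikit-learn library in Python. KNN is a simple and intuitive machine learning algorithm that is commonly used for classification tasks: to classify a new sample, it finds the k training samples closest to it and assigns the class that is most common among them.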

Before we get started, make sure you have Scikit-learn installed in your Python environment. If not, you can install it using pip:

pip install -U scikit-learn
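
To confirm that the installation worked, one quick check is to import the package and print its version. This is only a small sanity check, not a required step:

```python
# Import the package and print the installed version
import sklearn

print(sklearn.__version__)
```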

Now, let’s dive into the implementation:

Step 1: Import the necessary libraries
First, we need to import the required libraries for our implementation:

import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

Step 2: Load the dataset
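For this tutorial, we will use the Iris dataset, which is a popular dataset for classification tasks. It contains 150 flower samples, each described by 4 numeric features (sepal and petal length and width) and labeled with one of 3 species. You can load the dataset using the following code: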

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
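
Before going further, it can help to look at what was loaded. The short sketch below prints the shape of the feature matrix, the feature and class names, and the number of samples per class:

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

print(X.shape)               # (150, 4): 150 samples, 4 features
print(iris.feature_names)    # names of the 4 measurements
print(iris.target_names)     # names of the 3 species
print(np.bincount(y))        # [50 50 50]: samples per class
```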

Step 3: Split the dataset
Next, we need to split the dataset into training and testing sets. This can be done using the train_test_split function from Scikit-learn:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

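In this code snippet, we are splitting the dataset into 80% training data and 20% testing data. Setting random_state=42 fixes the random shuffle, so the same split is produced every time the code runs.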
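
As an optional variation that is not part of the original steps, train_test_split also accepts a stratify argument that keeps the class proportions the same in both sets. The sketch below checks the resulting sizes:

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# stratify=y keeps the share of each class equal in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
print(np.bincount(y_test))          # [10 10 10]
```

With a small dataset like Iris, stratifying is one way to avoid a test set that happens to contain very few samples of one class.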

Step 4: Train the KNN classifier
Now, we can train the KNN classifier using the training data. The KNeighborsClassifier class in Scikit-learn can be used to create a KNN classifier:

# Create and train the KNN classifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

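In this code snippet, we are creating a KNN classifier with k=3 (i.e., 3 nearest neighbors) and fitting it on the training data. Unlike many other models, KNN does not learn parameters during fitting: it stores the training samples and does the real work at prediction time, when it searches for the nearest neighbors of each new sample.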
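
To make the idea behind the classifier concrete, here is a minimal from-scratch sketch of KNN prediction in NumPy. The function knn_predict and the toy data are hypothetical illustrations, not Scikit-learn's implementation; the sketch assumes Euclidean distance, a simple majority vote, and integer class labels starting at 0:

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Illustrative KNN: majority vote among the k closest training samples."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)          # assumed: non-negative integer labels
    X_query = np.asarray(X_query, dtype=float)

    # Euclidean distance from every query point to every training point,
    # giving an array of shape (n_query, n_train)
    diffs = X_query[:, None, :] - X_train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # Indices of the k closest training samples for each query point
    nearest = np.argsort(dists, axis=1)[:, :k]

    # Majority vote over the labels of those neighbors
    neighbor_labels = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])

# Toy data: two well-separated clusters
X_toy = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_toy = [0, 0, 0, 1, 1, 1]

print(knn_predict(X_toy, y_toy, [[0.2, 0.2], [5.5, 5.5]], k=3))  # [0 1]
```

In practice, KNeighborsClassifier is the better choice, since it offers faster neighbor search and more options; the sketch only shows the core logic.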

Step 5: Make predictions
Once the classifier has been trained, we can make predictions on the test data using the predict method:

# Make predictions
y_pred = knn.predict(X_test)
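
Besides hard class labels, the classifier can also report how the neighbors voted. The sketch below uses predict_proba, which for the default uniform weighting returns the fraction of the k neighbors that belong to each class:

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# One row per test sample, one column per class; each row sums to 1
proba = knn.predict_proba(X_test)
print(proba.shape)   # (30, 3)
print(proba[:5])     # vote fractions for the first five test samples
```

With k=3, each entry is a multiple of 1/3, so these values are coarse vote shares rather than calibrated probabilities.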

Step 6: Evaluate the model
Finally, we can evaluate the performance of the model by calculating the accuracy score on the test data:

# Calculate the accuracy score
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

This code snippet calculates the accuracy of the model, which is the fraction of test samples whose predicted label matches the actual label.
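
Accuracy is a single number, so it does not show which classes get confused with each other. As an optional extension, the sketch below prints a confusion matrix and a per-class report using confusion_matrix and classification_report from sklearn.metrics:

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall, and F1 score for each species
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# The diagonal of the confusion matrix counts the correct predictions
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
```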

And that’s it! You have successfully trained a K-Nearest Neighbors classifier using Scikit-learn. Feel free to experiment with different values of k and with other hyperparameters, such as weights and metric, to see how they affect the model’s performance. Happy coding!
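
One possible way to compare values of k is 5-fold cross-validation with cross_val_score. In the sketch below, the list of candidate values is an arbitrary choice made for illustration:

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = datasets.load_iris(return_X_y=True)

# Candidate values of k (an arbitrary selection for this example)
candidate_ks = [1, 3, 5, 7, 9, 11]

# Mean accuracy over 5 folds for each candidate
mean_scores = {}
for k in candidate_ks:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    mean_scores[k] = scores.mean()
    print(f"k={k}: mean accuracy = {mean_scores[k]:.3f}")

best_k = max(mean_scores, key=mean_scores.get)
print(f"Best k among the candidates: {best_k}")
```

Cross-validation averages over several splits, so it usually gives a more stable comparison than a single train/test split.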
