In this tutorial, we will cover the basics of Support Vector Machines (SVMs) using Python. SVMs are a popular and powerful family of algorithms for classification and regression tasks. They work particularly well when the classes can be separated by a clear margin.
Before we dive into the code, let’s first understand what Support Vector Machines are. An SVM is a supervised machine learning algorithm, meaning it learns from labeled examples. For classification, the main idea is to find the hyperplane that best separates the data into different classes.
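The best hyperplane is the one that maximizes the margin between the classes. In two dimensions a hyperplane is simply a line, in three dimensions it is a plane, and the same idea carries over to higher dimensions. The data points that are closest to the hyperplane are called support vectors, and the margin is the distance between the hyperplane and those support vectors.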
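To make the geometry concrete, here is a minimal sketch in plain Python. The hyperplane (`w`, `b`) and the six toy points are made up for illustration, and nothing is learned here: we only measure each point's distance to a hand-picked boundary and read off the margin and the closest points.

```python
import math

# Hypothetical hand-picked hyperplane w . x + b = 0 (here the vertical line x = 2).
w = (1.0, 0.0)
b = -2.0

# Toy 2-D points with labels -1 / +1, invented for this illustration.
points = [
    ((0.0, 1.0), -1), ((1.0, 0.0), -1), ((1.0, 2.0), -1),
    ((3.0, 0.0), 1), ((3.0, 2.0), 1), ((4.0, 1.0), 1),
]

def distance_to_hyperplane(x, w, b):
    """Perpendicular distance |w . x + b| / ||w|| from point x to the hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(score) / math.hypot(*w)

distances = [distance_to_hyperplane(x, w, b) for x, _ in points]

# The margin is the smallest distance; the points that achieve it are the
# candidates for support vectors.
margin = min(distances)
closest = [x for (x, _), d in zip(points, distances) if math.isclose(d, margin)]

print("margin:", margin)
print("closest points:", closest)
```

With these made-up values the margin is 1.0 and four points sit exactly at that distance. In a trained SVM, the closest points are the ones that end up as support vectors.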
Now, let’s dive into the code. We will start by importing the necessary libraries:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
Next, we will load the Iris dataset that ships with scikit-learn and split it into training and testing sets, holding out 20% of the samples for testing:
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
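It can help to confirm what the split actually produced. The sketch below repeats the split from above and checks the resulting shapes. The second call, which passes `stratify=y`, is an optional variation that is not part of the original walkthrough; it keeps the three classes equally represented in the test set. The variable names ending in `_s` are chosen here for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Same split as in the tutorial: 150 samples, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)

# Optional variation (an assumption, not in the original code):
# stratify on y so each class keeps its share of the test set.
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(np.bincount(y_test_s))  # [10 10 10]
```

Without stratification the per-class counts in the test set depend on the random shuffle, which matters more on small or imbalanced datasets than on Iris.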
Now, we will create an instance of the SVM classifier and train it on the training data:
clf = SVC(kernel='linear')
clf.fit(X_train, y_train)
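Once the model is fitted, we can look at the support vectors it selected. The following sketch repeats the loading and training steps so it runs on its own, then reads a few fitted attributes of `SVC`. Note that `coef_` is only available with a linear kernel.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

clf = SVC(kernel='linear')
clf.fit(X_train, y_train)

# Number of support vectors per class, and the vectors themselves.
n_per_class = clf.n_support_
support_vectors = clf.support_vectors_
print("support vectors per class:", n_per_class)
print("support vector matrix shape:", support_vectors.shape)

# For three classes, SVC trains one binary classifier per pair of classes,
# so a linear kernel exposes three weight vectors, one per pair.
coef_shape = clf.coef_.shape
print("coef_ shape:", coef_shape)
```

Only the support vectors influence where the boundary lies, so removing other training points would typically leave the fitted model unchanged.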
Next, we will make predictions on the test data and calculate the accuracy of the model:
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")
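Accuracy is simply the fraction of predictions that match the true labels. As a rough illustration of what `accuracy_score` computes in this setting, here is a plain-Python sketch. The helper `accuracy` and the demo label lists are hypothetical and exist only for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Made-up labels: four of the five predictions are correct.
y_true_demo = [0, 1, 2, 2, 1]
y_pred_demo = [0, 1, 1, 2, 1]

demo_accuracy = accuracy(y_true_demo, y_pred_demo)
print(f"Accuracy: {demo_accuracy}")  # Accuracy: 0.8
```

A single accuracy number hides which classes are being confused, so it is worth comparing `y_test` and `y_pred` class by class when the classes are imbalanced.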
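Finally, we can visualize the decision boundary of an SVM classifier. The model trained above uses all four features, so it cannot be evaluated on a two-dimensional grid. For the plot we therefore fit a second linear SVM on the first two features only (sepal length and sepal width):
# Fit a separate classifier on two features so it can be drawn in 2-D.
X_2d = X[:, :2]
clf_2d = SVC(kernel='linear')
clf_2d.fit(X_2d, y)
plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap=plt.cm.Paired, edgecolors='k')
ax = plt.gca()
xlim = ax.get_xlim()
ylim = ax.get_ylim()
# Build a grid covering the plot area and predict a class for every grid point.
xx = np.linspace(xlim[0], xlim[1], 200)
yy = np.linspace(ylim[0], ylim[1], 200)
XX, YY = np.meshgrid(xx, yy)
xy = np.vstack([XX.ravel(), YY.ravel()]).T
Z = clf_2d.predict(xy).reshape(XX.shape)
# Shade the predicted regions; their edges are the decision boundaries.
ax.contourf(XX, YY, Z, alpha=0.3, cmap=plt.cm.Paired)
ax.set_xlabel(iris.feature_names[0])
ax.set_ylabel(iris.feature_names[1])
plt.show()
This code will create a scatter plot of the data points, where each class is represented by a different color. The shaded regions show which class the two-feature classifier predicts at each location, and the straight edges between regions are its decision boundaries. Because Iris has three classes, there are several boundaries rather than a single separating line.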
This concludes our tutorial on Support Vector Machines using Python. We looked at the idea of a maximum-margin hyperplane, trained a linear SVM on the Iris dataset, measured its accuracy on held-out data, and visualized its decision boundary. The same steps apply to other classification problems, and SVMs can also be used for regression.