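In this tutorial, we cover how to perform logistic regression using the Scikit-learn library in Python. Logistic regression is a supervised learning algorithm used for classification tasks. In its basic form the output variable is binary (0 or 1), but Scikit-learn's LogisticRegression also handles targets with more than two classes, such as the three-class Iris dataset used here.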
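To make the binary case concrete, here is a minimal sketch of the logistic (sigmoid) function that turns a real-valued score into a probability. The helper name `sigmoid`, the example scores, and the 0.5 threshold are illustrative assumptions, not part of Scikit-learn's API.

```python
import numpy as np


def sigmoid(z):
    """Map any real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical linear scores (w . x + b) for three samples
scores = np.array([-4.0, 0.0, 4.0])
probabilities = sigmoid(scores)

# Assumed decision rule: predict class 1 when the probability is at least 0.5
labels = (probabilities >= 0.5).astype(int)

print(probabilities)
print(labels)
```

A strongly negative score maps to a probability near 0, a strongly positive one to a probability near 1, and a score of zero sits exactly at 0.5.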
Step 1: Import the necessary libraries
First, we import the libraries needed for our logistic regression model: Scikit-learn for the machine learning tasks and NumPy for numerical operations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
Step 2: Load and preprocess the data
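Next, we load our dataset and prepare it for training. For this tutorial, we use the famous Iris dataset that ships with Scikit-learn. It contains 150 samples with four numeric features each, labeled with one of three species. The data needs no cleaning, so the only preparation here is splitting it into training and test sets.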
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
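It can help to check what the split actually produced. The sketch below is a variation on the step above, not a replacement for it: it assumes you want each class represented in the same proportion in both sets, which is what the `stratify=y` argument of `train_test_split` requests.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

# stratify=y is an optional addition that keeps class proportions equal
# in the training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Inspect the sizes of both sets and the number of samples per class
print(X_train.shape, X_test.shape)
print(np.bincount(y_train), np.bincount(y_test))
```

Without `stratify`, the per-class counts in the test set can differ slightly from one class to the next, which matters more on small or imbalanced datasets.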
Step 3: Train the logistic regression model
Now that the data is split, we can train our logistic regression model on the training set.
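# Initialize the logistic regression model
# (max_iter is raised above the default of 100 to give the solver room to converge on unscaled features)
model = LogisticRegression(max_iter=200)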
# Fit the model on the training data
model.fit(X_train, y_train)
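Logistic regression solvers tend to converge more reliably when features are on a similar scale. As an optional variation, and not part of the steps above, the following sketch wraps a `StandardScaler` and the classifier in a pipeline, then inspects the fitted coefficients. The variable names `scaled_model` and `clf` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The scaler is fitted on the training data only, then applied before the classifier
scaled_model = make_pipeline(StandardScaler(), LogisticRegression())
scaled_model.fit(X_train, y_train)

# make_pipeline names each step after its lowercased class name
clf = scaled_model.named_steps["logisticregression"]
print(clf.classes_)     # class labels seen during training
print(clf.coef_.shape)  # one row of weights per class, one column per feature
print(clf.intercept_)   # one intercept per class
```

Because the scaler lives inside the pipeline, calling `scaled_model.predict(X_test)` applies the same scaling to new data automatically.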
Step 4: Make predictions and evaluate the model
Once we have trained our model, we can make predictions on the test data and evaluate the performance of our logistic regression model.
# Make predictions on the test data
predictions = model.predict(X_test)
# Evaluate the model (score returns the mean accuracy on the given data)
accuracy = model.score(X_test, y_test)
print(f"Model accuracy: {accuracy}")
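Accuracy is a single number and can hide which classes get confused with each other. The sketch below is one possible way to look closer, using `predict_proba`, `confusion_matrix`, and `classification_report` from Scikit-learn. The `max_iter=200` setting is an assumption made here to avoid convergence warnings.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# max_iter=200 is an assumed setting, not a tuned value
model = LogisticRegression(max_iter=200).fit(X_train, y_train)
predictions = model.predict(X_test)

# One row per test sample, one column per class; each row sums to 1
probabilities = model.predict_proba(X_test)
print(probabilities[:3])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, predictions)
print(cm)

# Per-class precision, recall, and F1 score
print(classification_report(y_test, predictions, target_names=iris.target_names))
```

Off-diagonal entries in the confusion matrix show exactly which species the model mistakes for which.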
Step 5: Save and deploy the model (optional)
If you are satisfied with the performance of your logistic regression model, you can save it to a file and deploy it in your applications. This step is optional but can be useful if you want to reuse your model later.
import joblib
# Save the model to a file
joblib.dump(model, 'logistic_regression_model.pkl')
# Load the model from the file
loaded_model = joblib.load('logistic_regression_model.pkl')
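A quick sanity check after reloading is to confirm that the restored model predicts the same labels as the original. This is a minimal sketch under a few assumptions: it writes to a temporary directory so nothing is left on disk, and the file name and the `max_iter=200` setting are illustrative.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Save and reload inside a temporary directory (hypothetical file name)
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "logistic_regression_model.joblib")
    joblib.dump(model, model_path)
    loaded_model = joblib.load(model_path)

# The reloaded model should reproduce the original predictions
same_predictions = np.array_equal(model.predict(X), loaded_model.predict(X))
print(same_predictions)
```

Models saved with joblib are generally expected to be loaded with the same Scikit-learn version that created them, so it is worth recording that version alongside the file.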
And that’s it! You have successfully trained a logistic regression model using the Scikit-learn library in Python. Logistic regression is a simple yet powerful algorithm for classification tasks, both binary and multiclass, and Scikit-learn makes it easy to implement and train. Feel free to experiment with different datasets and hyperparameters, such as the regularization strength C, to improve the performance of your model.
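One possible way to experiment with hyperparameters is a cross-validated grid search. The sketch below tries a few values of the regularization parameter `C`. The candidate values, `cv=5`, and `max_iter=1000` are arbitrary choices made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Smaller C means stronger regularization; these candidates are illustrative
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validation on the training data only
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
print(f"Cross-validated accuracy: {search.best_score_:.3f}")

# The search refits the best model on all training data, so it can score the test set
test_accuracy = search.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Keeping the test set out of the search means the final test accuracy remains an honest estimate of how the chosen settings generalize.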