Tutorial 31: Introduction to Logistic Regression Machine Learning Method with Scikit Learn and Pandas in Python

Posted by Alfalfa – September 3, 2024

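In this tutorial, we will cover how to use the Logistic Regression machine learning method in Python using Scikit Learn and Pandas. Logistic Regression is a classification algorithm that models the probability of each class from one or more independent variables. It is best known for binary outcomes, and its multinomial form extends to targets with more than two classes, such as the three-species dataset used in this tutorial.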
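To make "predicting a probability" concrete, here is a minimal sketch of the two functions that sit underneath logistic regression: the sigmoid for a binary outcome and the softmax for several classes. The function names and the example scores are illustrative assumptions, not part of Scikit Learn.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# A score of 0 sits exactly on the binary decision boundary
print(sigmoid(0.0))

# Hypothetical scores for three classes; the largest score gets the largest probability
print(softmax([2.0, 1.0, 0.1]))
```

In a fitted model the scores come from a weighted sum of the features plus an intercept; the sketch only shows how such scores become probabilities.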

We will use the iris dataset for this tutorial, which is a commonly used dataset in machine learning. The iris dataset contains 150 samples of iris flowers, each with four features: sepal length, sepal width, petal length, and petal width. The target variable is the species of the iris flower, which can be one of three classes: setosa, versicolor, or virginica.

Let’s get started by importing the necessary libraries:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

Next, we will load the iris dataset into a Pandas DataFrame and split it into features and target variables:

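# The raw UCI file has no header row, so supply column names explicitly
columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
iris = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None, names=columns)
X = iris.iloc[:, :-1]  # the four measurement columns
y = iris.iloc[:, -1]   # the species label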
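If the download is unavailable, the copy of the iris data bundled with Scikit Learn can stand in for the CSV. This is a sketch of that alternative, not the loading method the tutorial uses; the names X_local and y_local are placeholders, and the bundled labels are integer codes rather than the species strings in the UCI file.

```python
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects instead of NumPy arrays
bunch = load_iris(as_frame=True)
X_local = bunch.data      # DataFrame with the four measurements
y_local = bunch.target    # Series of integer class codes 0, 1, 2

print(X_local.shape)                          # rows and columns
print(y_local.value_counts().sort_index())    # samples per class
print(list(bunch.target_names))               # species name for each code
```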

Now, we will split the data into training and testing sets using the train_test_split function from Scikit Learn:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
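One optional refinement is the stratify argument of train_test_split, which keeps the class proportions the same in both halves of the split. The sketch below uses the bundled iris data and placeholder variable names so that it runs on its own; the tutorial's split above does not stratify.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X_demo, y_demo = load_iris(return_X_y=True, as_frame=True)

# stratify=y_demo asks for the same class balance in the train and test sets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

print(len(X_tr), len(X_te))                # sizes of the two sets
print(y_te.value_counts().sort_index())    # test samples per class
```

With only 30 test samples, an unstratified split can leave one species under-represented, which makes the per-class scores noisier.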

Next, we will create an instance of the Logistic Regression model and fit it to the training data:

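# Raise max_iter above the default of 100 so the solver has room to converge on the unscaled features
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)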
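After fitting, the model exposes what it has learned. The following sketch, which assumes the bundled iris data and uses placeholder names such as clf, looks at the learned coefficients and at the class probabilities behind the predictions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

clf = LogisticRegression(max_iter=200).fit(X_tr, y_tr)

print(clf.classes_)        # the class labels the model learned
print(clf.coef_.shape)     # one row of weights per class, one column per feature

# predict_proba returns one probability per class for each sample
proba = clf.predict_proba(X_te[:3])
print(proba.round(3))
```

The predict method simply returns the class with the highest probability in each row.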

Now that our model is trained, we can make predictions on the test set and evaluate its performance using the classification_report and confusion_matrix functions:

y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))

The classification_report function displays precision, recall, F1-score, and support for each class in the target variable. The confusion_matrix function returns a table in which each row is a true class and each column is a predicted class: the diagonal counts correct predictions, and every off-diagonal entry counts samples of one species that were mistaken for another.
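To see how the two outputs relate, the sketch below derives per-class true positives, false positives, false negatives, precision, and recall from a confusion matrix. The matrix values are made up for illustration and are not results from the model above.

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows are true classes, columns are predictions
cm = np.array([[10, 0, 0],
               [0, 9, 1],
               [0, 2, 8]])

tp = np.diag(cm)                 # correct predictions for each class
fp = cm.sum(axis=0) - tp         # predicted as the class but actually another
fn = cm.sum(axis=1) - tp         # actually the class but predicted as another
tn = cm.sum() - (tp + fp + fn)   # everything else

precision = tp / (tp + fp)
recall = tp / (tp + fn)

print("precision:", precision.round(3))
print("recall:   ", recall.round(3))
```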

That’s it! You have successfully implemented the Logistic Regression algorithm using Scikit Learn and Pandas in Python. Logistic Regression is a widely used algorithm for classification, both binary and, as in this tutorial, multiclass. Feel free to experiment with different datasets and hyperparameters to improve the performance of your model.
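As one possible starting point for experimenting with hyperparameters, the sketch below searches over the regularization strength C with cross-validation. The grid values, the scaling step, and the use of the bundled iris data are assumptions chosen for illustration, not recommendations from the tutorial.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = load_iris(return_X_y=True)

# Scale the features, then fit the classifier; make_pipeline names the
# second step "logisticregression", which prefixes the parameter below
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validation over every value in the grid
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Smaller values of C mean stronger regularization; the search reports whichever value gave the best mean cross-validated accuracy on this data.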



18 Comments


@user-pw9wz2fo3o

25 days ago

BTW its spelt as logistic not logestic.

@danielngigi2608

25 days ago

Great video 👍

@haoyuan92

25 days ago

hello, to perform logistic regression, can my predictor variable be binary (0 or 1) or categorical (1,2,3,etc.) ?

@zoombiz3365

25 days ago

Sir can we predict the gender
If so what would be the changes that we have to make, like how to solve the value error
Anyone please help

@Majaroshimi

25 days ago

Thank you very much sir!!

@Drkalaamarab

25 days ago

Great share 👍, I will work on this data. What is the best way to contact you?

@7singar7

25 days ago

How do I find the best hyperparameters for logistic regression in Python?

@nehasheth3680

25 days ago

Sir, could you please share the link of the customer dataset?

@amioza73

25 days ago

can i run this directly on python

@kirank1923

25 days ago

hi how tune hyper parameters

@naveenkumargandla5386

25 days ago

why didn't you split the data into train and test . whatever the error metrics you checked that is for the data which was used in the model.

@danielsloan20

25 days ago

Amazing video thanks!!!

@electrology

25 days ago

I am unable to get anything working in the data set that I am using to build the logistic regression model after the step "Deploying and evaluating your model". I get the following error.

---------------------------------------------------------------------------
NotFittedError                            Traceback (most recent call last)
<ipython-input-14-e080ceade2d3> in <module>()
----> 1 y_pred = LogReg.predict(X)
      2 from sklearn.metrics import classification_report
      3 print(classification_report(Y, y_pred))

~\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in predict(self, X)
    322             Predicted class label per sample.
    323         """
--> 324         scores = self.decision_function(X)
    325         if len(scores.shape) == 1:
    326             indices = (scores > 0).astype(np.int)

~\Anaconda3\lib\site-packages\sklearn\linear_model\base.py in decision_function(self, X)
    296         if not hasattr(self, 'coef_') or self.coef_ is None:
    297             raise NotFittedError("This %(name)s instance is not fitted "
--> 298                                  "yet" % {'name': type(self).__name__})
    299
    300         X = check_array(X, accept_sparse='csr')

NotFittedError: This LogisticRegression instance is not fitted yet

Not sure what is going on. I imported all the necessary dependencies etc. and followed the instructions step by step but this is not working. In my data set, the X (independent variables) are in float64 formats. The Y (binary dependent variable) is in int64 format. Is there anything going wrong with the formats? CAN YOU PLEASE HELP!

@NadyaPena-01

25 days ago

Thank you for this video. Very helpful and clear. However, please share the files via Github or some other platform that's not 4Shared if possible. 4Shared is pretty inconvenient because they won't let me download the data unless I sign up with them and even after I did that, they made me wait on their page to download (navigating away from the page also stopped their download timer). Totally put off by that.

@joedandantech

25 days ago

Can you put your files public on GitHub? no one likes to download from links…

@datascienceds7965

25 days ago

Thanks for the video. It was very well explained. How can we plot the values for precision, recall and support?

@battlemoose1594

25 days ago

Great video, explains exactly how to do it. Doesn't discuss any of the theory though.

@ankitadwivedi1183

25 days ago

Hello sir,

In classification report, I am getting 0.00 for precision, recall, f1score for true values (row as 1).
Please help me in finding where am I going wrong.
