Introduction to Linear Regression with Scikit-Learn: Single Variable Analysis

Scikit Learn Tutorial: Linear Regression Single Variable

In this tutorial, we will learn how to perform linear regression with a single variable using scikit-learn, a popular machine learning library for Python. Linear regression fits a straight line, y = intercept + slope * x, to the data. It is a simple, widely used technique for predictive modeling and a good starting point for understanding the basics of machine learning.

1. Importing Necessary Libraries

Before we start with the implementation, we need to import the necessary libraries. We will use NumPy to generate the data and scikit-learn to fit the model. Matplotlib is imported later, in the plotting step.

        
import numpy as np
from sklearn.linear_model import LinearRegression
        
    

2. Creating the Data

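Next, we will create some synthetic data for our linear regression model. We will generate 100 input values (X) drawn uniformly from the interval [0, 2) and the corresponding targets (y), which follow the line y = 4 + 3x plus Gaussian noise with standard deviation 1. Because we know the true intercept (4) and slope (3), we can later check how well the model recovers them.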

        
# Generate some random data
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
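
As a quick sanity check on the generated data, the following sketch regenerates it with the same seed and inspects the array shapes and the range of X. It only restates what the snippet above does. The point to notice is that X is a 2-D column of shape (100, 1), which is the layout scikit-learn estimators expect for the feature matrix.

```python
import numpy as np

# Same data generation as in the tutorial
np.random.seed(0)
X = 2 * np.random.rand(100, 1)           # uniform values in [0, 2)
y = 4 + 3 * X + np.random.randn(100, 1)  # line plus standard normal noise

# Both arrays are 2-D columns with one row per sample
print(X.shape, y.shape)  # (100, 1) (100, 1)

# All inputs fall inside the interval [0, 2)
print(X.min() >= 0, X.max() < 2)  # True True
```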
        
    

3. Fitting the Model

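Now, we will use scikit-learn to fit a linear regression model to our data. Calling fit estimates the intercept and slope that minimize the sum of squared differences between the predicted and observed values of y.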

        
# Create a linear regression model
model = LinearRegression()

# Fit the model to our data
model.fit(X, y)
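
To see what the fit learned, one option is to read the estimated parameters from the model's `intercept_` and `coef_` attributes. The sketch below repeats the earlier steps so that it runs on its own. Because y was passed as a column of shape (100, 1), `intercept_` is a one-element array and `coef_` is a 1x1 array, so the code indexes into them. The estimates should land near the true values of 4 and 3, though not exactly on them, because of the noise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Recreate the tutorial data
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

model = LinearRegression()
model.fit(X, y)

# y is 2-D, so intercept_ has shape (1,) and coef_ has shape (1, 1)
intercept = model.intercept_[0]
slope = model.coef_[0, 0]

print(f"Estimated intercept: {intercept:.3f} (true value 4)")
print(f"Estimated slope:     {slope:.3f} (true value 3)")
```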
        
    

4. Making Predictions

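Once the model is trained, we can use it to make predictions on new data. Here we predict at x = 0 and x = 2, the two ends of the input range, so these two predictions are also enough to draw the regression line later.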

        
# Make predictions
X_new = np.array([[0], [2]])
y_pred = model.predict(X_new)
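
For a single-variable model, `predict` simply evaluates intercept + slope * x. The following sketch, which repeats the setup so that it is self-contained, checks this by computing the predictions by hand from `intercept_` and `coef_` and comparing them with the output of `predict`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Recreate the tutorial data and fit the model
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
model = LinearRegression().fit(X, y)

X_new = np.array([[0], [2]])
y_pred = model.predict(X_new)

# Manual prediction: intercept + x * slope, written as a matrix product
y_manual = model.intercept_ + X_new @ model.coef_.T

print(y_pred.shape)                   # (2, 1)
print(np.allclose(y_pred, y_manual))  # True
```

The prediction at x = 0 is just the estimated intercept, which is why it should be close to 4.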
        
    

5. Visualizing the Results

Finally, we can visualize the results of our linear regression model by plotting the original data points and the regression line.

        
import matplotlib.pyplot as plt

# Plot the original data
plt.scatter(X, y)

# Plot the regression line
plt.plot(X_new, y_pred, 'r-')

# Show the plot
plt.show()
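
The plot above has no axis labels or legend. A slightly more descriptive variant might look like the sketch below. It makes two assumptions that are not part of the original tutorial: it selects the non-interactive "Agg" backend so the script also runs without a display, and it saves the figure to a hypothetical file name, regression_fit.png. For an interactive session, drop the backend line and call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Recreate the tutorial data, model, and predictions
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
model = LinearRegression().fit(X, y)
X_new = np.array([[0], [2]])
y_pred = model.predict(X_new)

fig, ax = plt.subplots()
ax.scatter(X, y, label="Data")                         # original points
ax.plot(X_new, y_pred, "r-", label="Regression line")  # fitted line
ax.set_xlabel("X")
ax.set_ylabel("y")
ax.legend()

# Hypothetical output file name, chosen for illustration
fig.savefig("regression_fit.png")
```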
        
    

That’s it! You have now learned how to perform linear regression with a single variable using scikit-learn: generate or load the data, fit a LinearRegression model, make predictions, and plot the result. You can apply the same steps to real-world regression problems and continue to explore more advanced techniques in machine learning.