Building a Basic Linear Regression Model with scikit-learn

Training a Simple Linear Regression Model in scikit-learn

Linear regression is a fundamental machine learning technique for predicting a numerical outcome from one or more input features. With a single feature, the model fits a straight line, y = slope * x + intercept, to the data. In this article, we will walk through the steps of training a simple linear regression model using the popular Python library scikit-learn.

Step 1: Import necessary libraries

First, we need to import the necessary libraries for our linear regression model. We will be using NumPy for numerical computations, pandas for loading and manipulating the data, and the LinearRegression class from scikit-learn for the model itself.

<pre>
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
</pre>

Step 2: Load the dataset

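Next, we need to load our dataset into a pandas DataFrame. For this example, let’s assume we have a CSV file, dataset.csv, with two columns: x (input feature) and y (output variable). scikit-learn estimators expect the feature matrix to be two-dimensional (samples by features), so we reshape the single x column with reshape(-1, 1); the target y can stay one-dimensional.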

<pre>
data = pd.read_csv('dataset.csv')
X = data['x'].values.reshape(-1, 1)
y = data['y'].values
</pre>
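
If you do not have a dataset.csv at hand, a small synthetic dataset is enough to follow along. The sketch below is only an illustration: the relationship y = 3x + 2, the noise level, the sample count, and the seed are all assumptions made up for this example, not part of any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: y = 3x + 2 plus Gaussian noise (values chosen for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
noise = rng.normal(0, 1.0, size=100)
data = pd.DataFrame({"x": x, "y": 3.0 * x + 2.0 + noise})

# Optional: save it so that pd.read_csv('dataset.csv') works as in the step above.
# data.to_csv("dataset.csv", index=False)

X = data["x"].values.reshape(-1, 1)  # 2D feature matrix: (n_samples, 1)
y = data["y"].values                 # 1D target vector: (n_samples,)
print(X.shape, y.shape)              # (100, 1) (100,)
```

The printed shapes confirm the layout scikit-learn expects: one row per sample, one column per feature.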

Step 3: Split the data into training and testing sets

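Before training our model, we need to split our data into training and testing sets. This will allow us to evaluate the performance of our model on data it has not seen during training. Here test_size=0.2 holds out 20% of the rows for testing, and random_state=42 fixes the shuffle so the split is reproducible.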

<pre>
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
</pre>
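
To see what the split produces, it can help to check the shapes on a tiny array. The following is a minimal sketch using ten made-up samples (the placeholder values are an assumption for illustration); it also shows that repeating the call with the same random_state yields the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten placeholder samples, purely for illustration.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)  # (8, 1) (2, 1)

# Same random_state, same split: the held-out rows are identical.
_, X_test_again, _, _ = train_test_split(X, y, test_size=0.2, random_state=42)
print(np.array_equal(X_test, X_test_again))  # True
```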

Step 4: Train the linear regression model

Now, we can instantiate a LinearRegression object and fit it to our training data. Calling fit estimates the slope and intercept of the line by ordinary least squares.

<pre>
model = LinearRegression()
model.fit(X_train, y_train)
</pre>
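
After fitting, the learned parameters are available as the coef_ and intercept_ attributes. The sketch below uses synthetic data generated from an assumed line (slope 3, intercept 2, both invented for this example) so you can check that the fitted values land close to the ones used to generate the data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: y = 3x + 2 plus a little Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X.ravel() + 2.0 + rng.normal(0, 0.5, size=200)

model = LinearRegression()
model.fit(X, y)

# coef_ holds one slope per feature; intercept_ is the constant term.
print("slope:", model.coef_[0])
print("intercept:", model.intercept_)
```

The printed values should be near 3 and 2 respectively, though not exactly equal because of the noise.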

Step 5: Make predictions

Finally, we can make predictions using our trained model on the test set.

<pre>
predictions = model.predict(X_test)
</pre>
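
The same predict method works for inputs that were never in the dataset, as long as they have the same two-dimensional shape as the training features. As a rough sketch, here is a model fitted on four noise-free toy points (made up for this example) and asked about two new x values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noise-free toy data lying exactly on y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
model = LinearRegression().fit(X, y)

# New inputs must be 2D: one row per sample, one column per feature.
new_x = np.array([[4.0], [5.0]])
new_predictions = model.predict(new_x)
print(new_predictions)  # approximately [9, 11]
```

Passing a flat array such as np.array([4.0, 5.0]) would raise an error, which is why the reshape in Step 2 matters.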

Congratulations! You have successfully trained a simple linear regression model in scikit-learn. You can now evaluate the performance of your model by comparing its predictions with y_test, using metrics such as Mean Squared Error (lower is better) or R-squared (closer to 1 is better).
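
One possible way to compute those two metrics is with mean_squared_error and r2_score from sklearn.metrics. The sketch below repeats the whole workflow on synthetic data so that it runs on its own; the generating line (slope 3, intercept 2), the noise level, and the seed are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical data: y = 3x + 2 plus Gaussian noise with standard deviation 1.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X.ravel() + 2.0 + rng.normal(0, 1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

# Compare predictions against the held-out targets.
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f"MSE: {mse:.3f}")
print(f"R-squared: {r2:.3f}")
```

For a regressor, model.score(X_test, y_test) returns the same R-squared value, which is a convenient shortcut when you only need that one number.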