Quick Linear Regression with Scikit-Learn in Python
Linear regression is a fundamental and widely used statistical method for modeling the relationship between a dependent variable and one or more independent variables. In this article, we will look at how to quickly fit and evaluate a linear regression model using the Scikit-Learn library in Python.
Setting Up the Environment
First, make sure you have Python and the Scikit-Learn library installed. You can install Scikit-Learn using pip:
pip install scikit-learn
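To confirm the installation worked, one quick check is to import the package and print its version. This is only a minimal sketch; the version printed depends on your environment.

```python
# Import the package under its module name, sklearn (not scikit-learn).
import sklearn

# Print the installed version to confirm the import succeeded.
installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```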
Loading the Data
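Next, let’s load our data. For this example, we will use the built-in diabetes dataset from Scikit-Learn. It contains ten baseline variables (age, sex, body mass index, average blood pressure, and six blood serum measurements) for 442 diabetes patients. The target is a quantitative measure of disease progression one year after baseline. The feature values returned by load_diabetes are already mean-centered and scaled, which is why they appear as small numbers around zero.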
from sklearn import datasets
diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target
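Before fitting anything, it can help to check what was loaded. The following sketch reloads the dataset so it runs on its own, then prints the array shapes and the feature names:

```python
from sklearn import datasets

# Load the dataset again so this snippet is self-contained.
diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target

# One row per patient, one column per baseline variable.
print(X.shape)  # (442, 10)
print(y.shape)  # (442,)

# Short names for the ten columns of X.
print(diabetes.feature_names)
```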
Fitting the Model
Now that we have our data, we can fit a linear regression model to it using the LinearRegression class from Scikit-Learn:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X, y)
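After fitting, the learned parameters are available on the model as the coef_ and intercept_ attributes. As a small sketch of how you might inspect them, the snippet below refits the model so it is self-contained and prints one coefficient per feature:

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression

# Refit on the full dataset so this snippet runs on its own.
diabetes = datasets.load_diabetes()
model = LinearRegression()
model.fit(diabetes.data, diabetes.target)

# coef_ holds one weight per feature, in the same order as feature_names.
for name, coef in zip(diabetes.feature_names, model.coef_):
    print(f"{name:>4}: {coef:10.2f}")

# intercept_ is the predicted value when every feature is zero.
print(f"intercept: {model.intercept_:.2f}")
```

Because the features in this dataset are mean-centered, the intercept should come out essentially equal to the mean of the target.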
Making Predictions
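Once the model is trained, we can use it to make predictions on new data. The input must be a 2D array-like with one row per sample, containing the same ten features in the same order and on the same scale as the training data: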
new_data = [[0.03906215, 0.05068012, 0.06169621, 0.02187235, -0.0442235, -0.03482076, -0.04340085, -0.00259226, 0.01990842, -0.01764613]]
prediction = model.predict(new_data)
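The predict method returns a NumPy array with one value per input row, even when only a single sample is passed. The sketch below refits the model so it runs on its own, reuses the sample row from above, and shows how to read the result:

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression

# Refit the model so this snippet is self-contained.
X, y = datasets.load_diabetes(return_X_y=True)
model = LinearRegression().fit(X, y)

# A single sample: note the double brackets, which make it a 2D input (1 row, 10 columns).
new_data = [[0.03906215, 0.05068012, 0.06169621, 0.02187235, -0.0442235,
             -0.03482076, -0.04340085, -0.00259226, 0.01990842, -0.01764613]]
prediction = model.predict(new_data)

print(prediction.shape)  # (1,)
print(f"Predicted disease progression: {prediction[0]:.1f}")
```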
Evaluating the Model
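Finally, we can evaluate the performance of our model using metrics such as mean squared error or R-squared. Here the predictions are made on the same data the model was fitted to, so these scores describe how well the model fits the training data rather than how it performs on unseen data: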
from sklearn.metrics import mean_squared_error, r2_score
y_pred = model.predict(X)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
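To estimate how the model performs on data it has not seen, a common approach is to hold out part of the dataset for testing. The following is a minimal sketch using train_test_split; the 80/20 split and random_state=0 are arbitrary choices made for illustration and are not part of the original example:

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = datasets.load_diabetes(return_X_y=True)

# Hold out 20% of the rows for testing (assumed split; fixed seed for repeatability).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training portion only.
model = LinearRegression().fit(X_train, y_train)

# Score on the held-out portion.
y_test_pred = model.predict(X_test)
test_mse = mean_squared_error(y_test, y_test_pred)
test_r2 = r2_score(y_test, y_test_pred)

print(f"Test MSE: {test_mse:.2f}")
print(f"Test R-squared: {test_r2:.3f}")
```

The held-out scores are usually somewhat worse than the in-sample ones, and they vary with the particular split.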
That’s it! We’ve successfully performed quick linear regression using the Scikit-Learn library in Python.