Stacking Ensemble Learning with Python and scikit-learn: A Demo

Ensemble learning is a powerful technique in machine learning in which multiple models are combined to make more accurate predictions. One popular ensemble method is stacking, where the predictions of multiple base models are combined by a meta-model.
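Before reaching for scikit-learn's built-in class, the idea can be sketched by hand: collect out-of-fold predictions from each base model, then train the meta-model on those predictions as features. The base models chosen here (a decision tree and k-nearest neighbors) are arbitrary illustrations, not a recommendation:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Out-of-fold class-probability predictions from two base models.
# cross_val_predict ensures the meta-model never trains on predictions
# a base model made for its own training samples.
p1 = cross_val_predict(DecisionTreeClassifier(random_state=0), X, y,
                       cv=5, method="predict_proba")
p2 = cross_val_predict(KNeighborsClassifier(), X, y,
                       cv=5, method="predict_proba")

# The base-model outputs become the feature matrix for the meta-model.
meta_features = np.hstack([p1, p2])
meta_model = LogisticRegression(max_iter=1000).fit(meta_features, y)
```

This is essentially what scikit-learn's StackingClassifier automates, as we will see below.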

In this tutorial, we will walk through how to implement stacking using Python and scikit-learn, building a demo that stacks multiple classification models to make predictions.

Step 1: Import the necessary libraries
First, we need to import the necessary libraries to build our stacking model. We will use scikit-learn for building our base models and meta-model.

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

Step 2: Load the dataset
For this demo, we will use the famous Iris dataset, which is included in scikit-learn.

from sklearn.datasets import load_iris
data = load_iris()
X = data.data
y = data.target

Step 3: Split the dataset into training and testing sets
Next, we will split the dataset into training and testing sets using the train_test_split function.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 4: Define the base and meta-models
We will define three base models (RandomForestClassifier, GradientBoostingClassifier, and LogisticRegression) and one meta-model (LogisticRegression) that will combine the predictions of the base models.

base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=42)),
    ("lr", LogisticRegression(max_iter=1000))  # raise max_iter to avoid convergence warnings
]

meta_model = LogisticRegression(max_iter=1000)

Step 5: Create the stacking model
Now, we will create a StackingClassifier object, passing in the base models as estimators and the meta-model as final_estimator.

stacking_model = StackingClassifier(estimators=base_models, final_estimator=meta_model)
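StackingClassifier also accepts a couple of options worth knowing: cv controls the internal cross-validation used to generate the base-model predictions that train the meta-model, and passthrough=True feeds the original features to the meta-model alongside those predictions. The parameter values below are illustrative, not tuned:

```python
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
    ("lr", LogisticRegression(max_iter=1000)),
]

stacking_model = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,               # 5-fold CV for the base-model predictions
    passthrough=False,  # set True to also pass the raw features to the meta-model
)
```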

Step 6: Train the stacking model
We will train the stacking model on the training data.

stacking_model.fit(X_train, y_train)

Step 7: Make predictions
Finally, we will make predictions using the stacking model on the test data and evaluate its accuracy.

y_pred = stacking_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

That’s it! You have successfully built a stacking ensemble model using Python and scikit-learn. Stacking is a versatile technique that can often improve predictions across a wide range of machine learning problems, and you can experiment with different base models and meta-models to further improve the accuracy of your stacking model.
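One caveat when experimenting: on a dataset as small as Iris, a single train/test split gives a noisy accuracy estimate. Cross-validating the whole stacking model gives a more reliable number for comparing variants. A sketch:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)

# 5-fold cross-validation of the entire stacking pipeline.
cv_scores = cross_val_score(stack, X, y, cv=5)
print("Mean CV accuracy:", cv_scores.mean())
```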
