Save classifier to disk using scikit-learn. #quicktip

Save classifier to disk in scikit-learn

An important part of working with machine learning models is being able to save a trained model and load it again later. In scikit-learn, this is commonly done with the joblib library, which is installed alongside scikit-learn as one of its dependencies.

Here’s an example of how you can save a trained classifier to disk:


from sklearn import svm
from sklearn import datasets
# Import joblib directly; sklearn.externals.joblib was removed in scikit-learn 0.23
import joblib

# Load a sample dataset
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Train a classifier
clf = svm.SVC()
clf.fit(X, y)

# Save the classifier to disk
joblib.dump(clf, 'classifier.pkl')

In the above example, we first load a sample dataset (the Iris dataset), then train a Support Vector Machine (SVM) classifier on the data. Finally, we use the joblib.dump() function to save the trained classifier to a file called classifier.pkl.

To load the saved classifier back into memory, you can use the joblib.load() function:


# joblib must be imported if this runs in a new session
import joblib

# Load the classifier from disk
clf = joblib.load('classifier.pkl')
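
A quick way to convince yourself that nothing was lost in the round trip is to compare the predictions of the original and the restored classifier. The following is a minimal sketch of that check, not a required step. The temporary directory and the names `model_path`, `restored_clf` and `predictions_match` are assumptions made for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn import datasets, svm

# Train the same kind of classifier as in the example above
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC()
clf.fit(X, y)

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "classifier.pkl")
    joblib.dump(clf, model_path)
    restored_clf = joblib.load(model_path)

# Compare the predictions of the original and the restored classifier
predictions_match = np.array_equal(clf.predict(X), restored_clf.predict(X))
print("Predictions match:", predictions_match)
```

If the comparison fails, the usual suspects are a different scikit-learn version or a file that was overwritten by another model.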
	

Saving and loading trained classifiers lets you reuse them in different applications or environments without retraining them each time. This is particularly useful in production systems where you deploy machine learning models.
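
When a model moves between environments, keep in mind that the scikit-learn documentation does not guarantee that a model saved with one version will load correctly in another. One possible safeguard, sketched below, is to store the library version next to the model and check it on load. The helper names `save_model` and `load_model` and the dictionary layout are hypothetical and are not part of scikit-learn or joblib.

```python
import os
import tempfile

import joblib
import sklearn
from sklearn import datasets, svm


def save_model(model, path):
    """Save the model together with the scikit-learn version that trained it."""
    # Hypothetical bundle layout: a plain dict holding the model and the version
    bundle = {"model": model, "sklearn_version": sklearn.__version__}
    joblib.dump(bundle, path)


def load_model(path):
    """Load a bundle written by save_model and reject a version mismatch."""
    bundle = joblib.load(path)
    saved_version = bundle["sklearn_version"]
    if saved_version != sklearn.__version__:
        raise RuntimeError(
            f"Model was saved with scikit-learn {saved_version}, "
            f"but {sklearn.__version__} is installed."
        )
    return bundle["model"]


# Usage: train, save, and restore within a temporary directory
iris = datasets.load_iris()
clf = svm.SVC().fit(iris.data, iris.target)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "classifier_bundle.pkl")
    save_model(clf, path)
    restored_clf = load_model(path)

print(restored_clf.predict(iris.data[:3]))
```

Raising an error is a deliberately strict choice. Depending on the deployment, a warning or a log message may be enough.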