Understanding Confusion Matrix in Machine Learning using Python

Confusion Matrix in Machine Learning

A confusion matrix is a table that summarizes the performance of a classification model on a set of data for which the true labels are known. Each cell counts how often instances of one true class were predicted as a given class, so the table shows not only how many mistakes the model makes but also which kinds of mistakes.

Understanding Confusion Matrix

For a binary classifier, the confusion matrix is a 2×2 table that holds four counts (a classifier with N classes produces an N×N table):

  • True Positives (TP) – The number of positive instances that the model correctly predicted as positive.
  • False Positives (FP) – The number of negative instances that the model incorrectly predicted as positive.
  • True Negatives (TN) – The number of negative instances that the model correctly predicted as negative.
  • False Negatives (FN) – The number of positive instances that the model incorrectly predicted as negative.
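To make the four definitions concrete, here is a minimal pure-Python sketch that tallies them from two label lists. The function name `confusion_counts` and its `positive` parameter are our own illustrative choices, not part of any library, and the labels are the ones used in this article's scikit-learn example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, FP, TN, FN for a binary problem (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, really positive
            else:
                fp += 1  # predicted positive, really negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, really positive
            else:
                tn += 1  # predicted negative, really negative
    return tp, fp, tn, fn


counts = confusion_counts([1, 0, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
print(counts)  # (2, 2, 1, 1) -> TP, FP, TN, FN
```

Every prediction falls into exactly one of the four cells, so the counts always sum to the number of instances.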

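From these four counts we can calculate the usual evaluation metrics: accuracy (the share of all predictions that are correct), precision (the share of predicted positives that are actually positive), recall (the share of actual positives that the model finds), and F1-score (the harmonic mean of precision and recall).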
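The standard formulas for these metrics can be sketched in a few lines of plain Python. The helper name `classification_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for simplicity; other conventions exist. The counts passed in are the ones that this article's example labels produce.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common metrics from the four confusion-matrix counts."""
    total = tp + fp + tn + fn
    # Guard each division so an empty cell combination does not raise.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


metrics = classification_metrics(tp=2, fp=2, tn=1, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, accuracy and precision are both 0.5, recall is 2/3, and F1 is 4/7.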

Confusion Matrix with Python

In Python, we can easily create a confusion matrix using libraries such as scikit-learn. Here’s a simple example of creating a confusion matrix:


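from sklearn.metrics import confusion_matrix

# Ground-truth labels (1 = positive, 0 = negative)
y_true = [1, 0, 1, 1, 0, 0]

# Labels predicted by the model
y_pred = [1, 1, 1, 0, 0, 1]

# Build the confusion matrix: rows are true labels, columns are predicted labels
cm = confusion_matrix(y_true, y_pred)

print(cm)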

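Running this snippet prints the matrix [[1 2] [1 2]]. scikit-learn places the true labels on the rows and the predicted labels on the columns, ordered by label value, so with the labels 0 and 1 the layout is [[TN, FP], [FN, TP]]. Here TN = 1, FP = 2, FN = 1, and TP = 2.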
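If we want the four counts as named values, one common pattern is to flatten the 2×2 matrix with `ravel()`, which for binary labels 0 and 1 yields the order TN, FP, FN, TP. The sketch below reuses the same labels and then lets scikit-learn's own scoring functions compute the metrics, assuming the default setting in which label 1 is treated as the positive class.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

# Flatten the 2x2 matrix row by row: [[TN, FP], [FN, TP]] -> TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")  # TN=1 FP=2 FN=1 TP=2

# Metrics computed by scikit-learn from the same labels
accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

print(f"accuracy={accuracy:.3f} precision={precision:.3f}")
print(f"recall={recall:.3f} f1={f1:.3f}")
```

Unpacking in the wrong order is an easy mistake to make, so it is worth checking the result against a small example like this one before relying on it.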

Conclusion

Understanding the confusion matrix is essential for evaluating a classification model in machine learning. Because it separates false positives from false negatives, it shows which kind of error a model makes, something a single accuracy figure cannot do, and that helps us compare models and decide where to improve them.