How to Create a Classification Report in Python using sklearn

In this article, we show how to create a classification report in Python using the sklearn module.

A classification report is a summary of several metrics that describe how well a machine learning model performed. These include precision (the fraction of predicted positives that are actually positive) and accuracy (the fraction of all predictions that are correct).

It also goes more in depth and tells us things such as recall and F1 score. Recall is the ability of a machine learning model to identify true positives (recall = true positives / (true positives + false negatives)). The highest recall score is 1, which means the program identifies all positives correctly, and the lowest score is 0. The F1 score is calculated with the following formula: F1 = 2 * (precision * recall) / (precision + recall). It is the harmonic mean of precision and recall. The best score is 1 and the worst score is 0.
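As a quick check of these formulas, here is a minimal sketch in plain Python. The counts (4 true positives, 2 false positives, 1 false negative) and the function names are made up for illustration; they are not part of sklearn.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, chosen only to illustrate the arithmetic
tp, fp, fn = 4, 2, 1

p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)

print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# prints: precision=0.67 recall=0.80 f1=0.73
```

Note that the F1 score (0.73) sits closer to the lower of the two values than a plain average (0.735) would, which is the usual behavior of a harmonic mean.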

So the classification report reveals important information that lets us know how well a machine learning model is performing.

If the classification report gives us low scores, then the model may not be appropriate for the problem, or there may not be enough training data given to the program.

Below we have a decision tree classifier that classifies outcomes based on given data. It predicts whether children will go outside to play (positive result) or not (negative result) based on a few weather variables (temperature, humidity, and whether it is windy).

The CSV file used with this machine learning program can be found at the following link: Play.csv

As an output, we create a classification report which represents a metric that allows us to see how well the program performed.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, confusion_matrix

# Load the data set (adjust the file name to match your download)
df = pd.read_csv('Played.csv')

# Features are every column except the target column 'Played'
X = df.drop(columns=['Played'])
y = df['Played']

# Hold out 60% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.6)

# Train the decision tree and predict the held-out rows
dtree = DecisionTreeClassifier()
dtree.fit(X_train, y_train)
predictions = dtree.predict(X_test)

# Print the confusion matrix and the classification report
print(confusion_matrix(y_test, predictions))
print(classification_report(y_test, predictions))

So in order to create a classification report, we have to import classification_report from sklearn.metrics.
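To try the call without the CSV file, here is a small sketch that passes hand-made labels to classification_report. The two label lists are made up for illustration (9 negatives and 5 positives, with 3 wrong predictions); the output_dict=True argument is an sklearn option that returns the same numbers as a dictionary instead of a formatted string.

```python
from sklearn.metrics import classification_report

# Hypothetical true labels: 9 negatives (0) followed by 5 positives (1)
y_true = [0] * 9 + [1] * 5

# Hypothetical predictions: 2 negatives and 1 positive are misclassified
y_pred = [0] * 7 + [1] * 2 + [0] * 1 + [1] * 4

# Formatted text report, suitable for printing
report = classification_report(y_true, y_pred)
print(report)

# Same metrics as a nested dictionary, keyed by class label as a string
report_dict = classification_report(y_true, y_pred, output_dict=True)
print(report_dict["1"]["recall"])
```

The dictionary form is convenient when the scores need to be logged or compared in code rather than read by eye.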

A classification report lets us see whether our machine learning program is effective. For each class, it reports how reliably the program predicts positives and negatives, and it also gives averages over both classes along with the overall accuracy.

The results of our program are shown below.

              precision    recall  f1-score   support

           0       0.88      0.78      0.82         9
           1       0.67      0.80      0.73         5

    accuracy                           0.79        14
   macro avg       0.77      0.79      0.78        14
weighted avg       0.80      0.79      0.79        14

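So we can see in the above results that there was a precision of 0.88 for identifying negatives (class 0) and 0.67 for identifying positives (class 1). In other words, 88% of the samples predicted as negative were actually negative, and 67% of the samples predicted as positive were actually positive.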

The overall accuracy was 0.79. The macro (unweighted) average precision was 0.77 and the weighted average precision, which weights each class by its support, was 0.80.
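The difference between the two averages can be sketched in plain Python. The per-class counts below are an assumption: they are inferred from the rounded figures in the report (7 of 8 predicted negatives correct, 4 of 6 predicted positives correct) and are not stated in the article.

```python
# Per-class counts inferred from the report above (an assumption)
classes = {
    0: {"correct": 7, "predicted": 8, "support": 9},
    1: {"correct": 4, "predicted": 6, "support": 5},
}

# Precision of each class: correct predictions / all predictions of that class
precisions = {label: c["correct"] / c["predicted"] for label, c in classes.items()}

total_support = sum(c["support"] for c in classes.values())

# Macro average: every class counts equally
macro_avg = sum(precisions.values()) / len(precisions)

# Weighted average: every class counts in proportion to its support
weighted_avg = sum(
    precisions[label] * c["support"] for label, c in classes.items()
) / total_support

# Accuracy: all correct predictions / all samples
accuracy = sum(c["correct"] for c in classes.values()) / total_support

print(f"macro avg precision:    {macro_avg:.2f}")     # 0.77
print(f"weighted avg precision: {weighted_avg:.2f}")  # 0.80
print(f"accuracy:               {accuracy:.2f}")      # 0.79
```

The weighted average is higher here because the larger class (the negatives) also has the higher precision.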

Then other data is given such as the recall and f1-score.

The support is the number of actual samples of each class in the test set. There are 14 total data points, 9 of which have negative outcomes and 5 of which have positive outcomes.
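Support depends only on the true labels, not on the predictions, so it can be checked by counting them. A minimal sketch, assuming a hypothetical list of test labels with the same class sizes as above:

```python
from collections import Counter

# Hypothetical test labels: 9 negatives (0) and 5 positives (1)
y_test = [0] * 9 + [1] * 5

# Support of each class is simply how often it occurs in the true labels
support = Counter(y_test)

print(support[0], support[1], sum(support.values()))
# prints: 9 5 14
```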

And this is how to create a classification report in Python using the sklearn module.
