TLDR
In this Mage Academy lesson on model evaluation, we’ll learn how to calculate specificity using a confusion matrix and understand what it means for a model to have a high or low value.
Glossary
Definition
Calculation
How to code
Definition
Specificity is a classification metric used to measure your model’s performance. It’s represented by the total number of true negatives divided by the total number of actual negatives. This matters the most when a negative result represents something important, like determining the mortality rate of a drug.
A high specificity value means that “no means no” in a model, meaning that you can trust it to avoid false negatives. This is great as the model will have a very low chance of having errors and is important especially in healthcare or medical diagnosis.
(Source: ClinicalOdyssey)
A low specificity can mean that a model is overzealous, and will be more likely to have false negatives. While not the worst, it varies by use case whether you’ll want to focus on optimizing for specificity or
sensitivity
.
A low specificity will be ready to reject instead of saying yes to the wrong candidate. (Source: WEF)
Calculation
In our
, our true negatives lie in the fourth quadrant, and our actual negatives are the sum of the True Negatives and False Positives, quadrants 3 and 4.
Using this example, we can calculate it as (8/8+32). Hence we get a recall of 20%.
How to code
Specificity can be calculated by scratch for binary classification and can also be calculated similarly to
recall
using SKLearn.
Example data
From scratch
Let’s start by calculating the true negatives, which is when the model successfully predicted a negative value. In this case, 1 will be positive and 0 will be negative.
Next, we’ll calculate the actual negatives, this is when the result, y_true, is supposed to be negative, aka 0.
Finally, we’ll use the two values to calculate the metrics for specificity.
Scikit Learn
Specificity in this case, is changing the value of true and false in the recall case.
SciKitLearn handles this with the recall method, but in our case of binary classification, we’ll specify the pos_label parameter to avoid getting the
sensitivity
.
Related Lessons
Confusion matrix (Beginner)
Sensibility (Intermediate)
F1-Score (Advanced)
Want to learn more about machine learning (ML)? Visit
! ✨🔮