Model Evaluation - Specificity

First published on February 7, 2022

Last updated on April 19, 2022

 

Nathaniel Tjandra

TLDR

In this Mage Academy lesson on model evaluation, we’ll learn how to calculate specificity using a confusion matrix and understand what it means for a model to have a high or low value.

Glossary

  • Definition

  • Calculation

  • How to code

Definition

Specificity is a classification metric used to measure your model’s performance. It’s calculated as the number of true negatives divided by the number of actual negatives, TN / (TN + FP), so it tells you how well the model recognizes negative cases. This matters the most when a negative result represents something important, like determining the mortality rate of a drug.

A high specificity value means that the model correctly says “no” to almost every actual negative, meaning that you can trust it to avoid false positives. This is great as the model will rarely raise a false alarm, which is especially important in healthcare or medical diagnosis.

(Source: ClinicalOdyssey)

A low specificity can mean that a model is overzealous and more likely to produce false positives. This isn’t always a problem: whether you’ll want to focus on optimizing for specificity or sensitivity varies by use case.

A high-specificity model will be ready to reject instead of saying yes to the wrong candidate. (Source: WEF)

Calculation

In our confusion matrix, our true negatives lie in the fourth quadrant, and our actual negatives are the sum of the true negatives and false positives, quadrants 3 and 4.

Using this example, with 8 true negatives and 32 false positives, we can calculate it as 8 / (8 + 32). Hence we get a specificity of 20%.
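To double-check the arithmetic, here’s a minimal sketch that computes specificity straight from the two counts. The helper name specificity_from_counts is hypothetical and only used for illustration; the counts are the ones from the example above.

```python
def specificity_from_counts(true_negatives: int, false_positives: int) -> float:
    """Specificity = TN / (TN + FP), where TN + FP is the number of actual negatives."""
    actual_negatives = true_negatives + false_positives
    if actual_negatives == 0:
        raise ValueError("Specificity is undefined when there are no actual negatives.")
    return true_negatives / actual_negatives


# 8 true negatives out of 8 + 32 = 40 actual negatives
result = specificity_from_counts(8, 32)
print(result)  # 0.2, i.e. 20%
```

Note the parentheses around the denominator: without them, 8 / 8 + 32 would evaluate to 33.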

How to code

Specificity can be calculated from scratch for binary classification, and can also be calculated similarly to recall using scikit-learn.

Example data
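As a stand-in for the example data, here’s a small illustrative dataset. These labels are made up for this sketch and are not the values from the original lesson; y_true holds the actual labels and y_pred holds the model’s predictions.

```python
# Illustrative data only (an assumption, not the lesson's original values).
# 1 = positive class, 0 = negative class.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]  # actual labels
y_pred = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]  # model predictions

# Both lists must line up one-to-one: each prediction is compared to its label.
assert len(y_true) == len(y_pred)
```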

From scratch

Let’s start by counting the true negatives, which are the cases where the model correctly predicted a negative value. In this case, 1 will be positive and 0 will be negative.
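One way to count the true negatives in plain Python is sketched below. The y_true and y_pred lists are illustrative values assumed for this example, not data from the original lesson.

```python
# Illustrative data (assumed): 1 = positive, 0 = negative.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

# A true negative is a row where both the actual label and the prediction are 0.
true_negatives = sum(
    1 for actual, predicted in zip(y_true, y_pred)
    if actual == 0 and predicted == 0
)
print(true_negatives)  # 3
```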

Next, we’ll count the actual negatives. These are the cases where the true label, y_true, is negative, aka 0.
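Counting the actual negatives only needs the true labels, as in this short sketch. The y_true list is again an illustrative assumption.

```python
# Illustrative data (assumed): 1 = positive, 0 = negative.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]

# Actual negatives are all rows whose true label is 0, regardless of the prediction.
# This equals true negatives + false positives.
actual_negatives = sum(1 for actual in y_true if actual == 0)
print(actual_negatives)  # 5
```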

Finally, we’ll divide the true negatives by the actual negatives to get the specificity.
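Putting the pieces together, a from-scratch version might look like the sketch below. The function name specificity and the sample labels are assumptions made for illustration.

```python
def specificity(y_true, y_pred, negative_label=0):
    """Share of actual negatives that the model also predicted as negative."""
    true_negatives = sum(
        1 for actual, predicted in zip(y_true, y_pred)
        if actual == negative_label and predicted == negative_label
    )
    actual_negatives = sum(1 for actual in y_true if actual == negative_label)
    if actual_negatives == 0:
        raise ValueError("Specificity is undefined when there are no actual negatives.")
    return true_negatives / actual_negatives


# Illustrative data (assumed): 1 = positive, 0 = negative.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

score = specificity(y_true, y_pred)
print(score)  # 0.6 (3 true negatives out of 5 actual negatives)
```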

Scikit Learn

Specificity, in this case, is simply recall with the roles of the positive and negative classes swapped: it’s the recall of the negative class.

scikit-learn handles this with the recall_score function. In our case of binary classification, we’ll set the pos_label parameter to the negative class (0) so that we get the specificity instead of the sensitivity.
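Here’s a minimal sketch of the scikit-learn approach, assuming scikit-learn is installed and using illustrative labels that are not from the original lesson. It also cross-checks the result against the counts from confusion_matrix.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Illustrative data (assumed): 1 = positive, 0 = negative.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

# Treating the negative class (0) as the "positive" label makes recall equal specificity.
specificity = recall_score(y_true, y_pred, pos_label=0)

# With the default pos_label=1, the same function returns sensitivity instead.
sensitivity = recall_score(y_true, y_pred)

# Cross-check: for binary labels, ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity_from_matrix = tn / (tn + fp)

print(specificity, sensitivity, specificity_from_matrix)
```

For this sample data, both specificity values come out to 3 / 5 and the sensitivity to 4 / 5.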

Related Lessons

  • Confusion matrix (Beginner)

  • Sensitivity (Intermediate)

  • F1-Score (Advanced)
