Overview

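The population stability index (PSI) is a metric for detecting concept drift. The PSI originally gained prominence as a metric for triggering changes in credit scores. It is most commonly given as a discrete measure over the histograms of two distributions $P$ and $Q$, where $p_i$ and $q_i$ are the proportions of observations that fall in bin $i$:

$$\mathrm{PSI}(P, Q) = \sum_i (p_i - q_i) \ln \frac{p_i}{q_i}$$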
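As a rough illustration, the discrete PSI can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: it assumes both histograms are already expressed as bin proportions over the same bins, and the `eps` floor used to avoid taking the logarithm of zero is an assumption of this example rather than part of the definition. The function name `psi` and the sample histograms are hypothetical.

```python
import math

def psi(p, q, eps=1e-6):
    """Population stability index between two histograms.

    p and q are sequences of bin proportions over the same bins.
    eps is an assumed floor that keeps empty bins from producing log(0).
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i = max(p_i, eps)
        q_i = max(q_i, eps)
        # Each bin contributes (p_i - q_i) * ln(p_i / q_i), which is never negative.
        total += (p_i - q_i) * math.log(p_i / q_i)
    return total

# Hypothetical histograms: a uniform baseline and a skewed query.
baseline = [0.25, 0.25, 0.25, 0.25]
query = [0.10, 0.20, 0.30, 0.40]

print(psi(baseline, query))
print(psi(query, baseline))  # same value up to rounding: the PSI is symmetric
```

Swapping the arguments changes the sign of both factors in every term, so the result is unchanged, which is the symmetry discussed below.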

The PSI’s symmetry makes it a convenient metric for alerting on drift because it detects a wide range of issues. However, its symmetry limits its ability to reveal the nature of the detected problem, meaning that additional metrics are needed. It can also be noisy, particularly when some bins contain very few observations, because the log ratio is sensitive to small proportions.

Comparison to Jensen-Shannon Divergence

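Both the PSI and the Jensen-Shannon divergence (JSD) are closely related to the KL divergence, and both are symmetric. The key difference is that JSD compares each distribution to the average of the baseline and query distributions, rather than comparing the two distributions to each other directly. Because that average is nonzero wherever either distribution is nonzero, the log ratios in JSD stay bounded. This means that, if the query distribution observes an event that is rare in the baseline, PSI will respond much more strongly than JSD.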
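A small numerical sketch of this difference follows. The histograms are hypothetical: the third bin is nearly empty in the baseline but holds 2% of the query. The helper names (`kl`, `psi`, `jsd`) are chosen for this example, natural logarithms are assumed throughout, and all bins are assumed strictly positive so that no smoothing is needed.

```python
import math

def kl(p, q):
    """Discrete KL divergence; assumes all bins are strictly positive."""
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q))

def psi(p, q):
    """PSI, written as the sum of the KL divergence in both directions."""
    return kl(p, q) + kl(q, p)

def jsd(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(p_i + q_i) / 2 for p_i, q_i in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical data: the third bin is a rare event in the baseline.
baseline = [0.4999995, 0.4999995, 0.000001]
query = [0.49, 0.49, 0.02]

psi_val = psi(baseline, query)
jsd_val = jsd(baseline, query)
print("PSI:", psi_val)
print("JSD:", jsd_val)
print("ratio:", psi_val / jsd_val)
```

With these numbers the PSI comes out more than an order of magnitude larger than the JSD. Shrinking the baseline mass in the third bin further makes the PSI grow without bound, while the JSD can never exceed ln 2.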

See Response of JSD and PSI to a rare event for a worked example.

Connection to the Kullback-Leibler divergence

The population stability index is a discretization of a distance measure based on the KL divergence, called the Jeffreys distance.

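Recall the definition of the KL divergence between two distributions $P$ and $Q$ with densities $p(x)$ and $q(x)$:

$$D_{\mathrm{KL}}(P \,\|\, Q) = \int p(x) \ln \frac{p(x)}{q(x)} \, dx$$

We can create a symmetric measure, the Jeffreys distance $J(P, Q)$, by taking the sum of the KL divergence in both directions:

$$J(P, Q) = D_{\mathrm{KL}}(P \,\|\, Q) + D_{\mathrm{KL}}(Q \,\|\, P)$$

Substituting the definition of the KL divergence, and using $\ln \frac{q(x)}{p(x)} = -\ln \frac{p(x)}{q(x)}$, we get

$$J(P, Q) = \int \left( p(x) - q(x) \right) \ln \frac{p(x)}{q(x)} \, dx$$

which is equivalent to the PSI in the limit where the number of bins approaches infinity.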
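The limiting behaviour can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it picks two unit-variance Gaussians, N(0, 1) and N(1, 1), for which the KL divergence in each direction is 0.5 and the Jeffreys distance is therefore exactly 1. Bin probabilities are computed from the normal CDF over the truncated range [-5, 6], which discards a negligible amount of tail mass. The function names are hypothetical.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def binned_psi(n_bins, lo=-5.0, hi=6.0):
    """PSI between N(0, 1) and N(1, 1) over n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    total = 0.0
    for i in range(n_bins):
        a = lo + i * width
        b = a + width
        # Exact probability mass of each distribution inside the bin [a, b].
        p_i = normal_cdf(b, 0.0) - normal_cdf(a, 0.0)
        q_i = normal_cdf(b, 1.0) - normal_cdf(a, 1.0)
        total += (p_i - q_i) * math.log(p_i / q_i)
    return total

# The binned PSI should approach the Jeffreys distance (1.0 here) from below.
for n in (5, 10, 100, 1000):
    print(n, binned_psi(n))
```

Coarser binnings give smaller values because merging bins can only discard information about how the two distributions differ, so the PSI of a histogram never exceeds the Jeffreys distance of the underlying densities.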

Sources