The Wasserstein metric
Given distributions
where
Earth mover’s distance
The Wasserstein metric is also called the “Earth mover’s distance,” based on a metaphor about shoveling dirt. Imagine that each distribution is a pile of dirt. The Wasserstein distance is then the minimum amount of work required to reshape one pile into the other, where work is defined as mass times the distance it is moved.
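To make the metaphor concrete, here is a minimal sketch for the simplest case: two histograms over the same unit-spaced bins, holding the same total amount of dirt. The function name `earth_movers_distance_1d` and the example piles are my own illustrations, and the restriction to one dimension with unit bin spacing is an assumption made to keep the sketch short. The idea is to walk across the bins from left to right, carrying whatever surplus or deficit has accumulated so far. Each step of one bin costs work equal to the amount carried.

```python
def earth_movers_distance_1d(p, q):
    """Work needed to reshape pile p into pile q.

    Assumes p and q are histograms over the same unit-spaced bins
    and hold the same total mass (illustrative simplification).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("p and q must hold the same total mass")

    carry = 0.0  # dirt carried across the boundary to the next bin
    work = 0.0   # running total of mass times distance
    for p_i, q_i in zip(p, q):
        carry += p_i - q_i   # surplus (or deficit) after filling this bin
        work += abs(carry)   # moving it one bin over costs |carry| * 1
    return work


# Three units of dirt moved two bins to the right: 3 * 2 = 6.
print(earth_movers_distance_1d([3, 0, 0], [0, 0, 3]))  # 6.0
```

In the example, all three units start in the first bin and end in the last, so the work is the mass (3) times the distance (2 bins). Identical piles give zero, since nothing needs to be carried anywhere.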
I’ll admit, I find the link between this metaphor and the definition a little hard to see. It helps to notice that the sum over the recursive steps is a bit like multiplication, such that we can think of