While typically associated with probability distributions, moments are defined for any real-valued function $f$: the $n$th moment is $\mu_n = \int x^n f(x)\,dx$.
Physical interpretation
Moments are best understood by analogy to physical bodies: if the function itself describes a distribution of mass in space, then
- the 0th moment is total mass;
- the 1st moment is the center of mass; and
- the 2nd moment is the moment of inertia.
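As a minimal sketch of the physical analogy, using made-up point masses on a 1-D rod (positions and masses are illustrative):

```python
import numpy as np

# Hypothetical mass distribution: point masses along a 1-D rod.
x = np.array([0.0, 1.0, 2.0, 3.0])  # positions
m = np.array([2.0, 1.0, 1.0, 2.0])  # masses

total_mass = m.sum()                          # 0th moment: total mass
center_of_mass = (m * x).sum() / total_mass   # 1st moment / total mass
# 2nd moment taken about the center of mass: the moment of inertia
inertia = (m * (x - center_of_mass) ** 2).sum()
```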
(I don’t know what special name, if any, the higher moments have in a physical context.)
Utility in statistics
The moments of a probability distribution reveal its shape, and are used in a variety of far-flung applications (such as the Adam optimizer). Each has an interpretation:
- 0th: sanity check (must integrate, or sum, to 1 if it’s a probability distribution)
- 1st: mean of the PDF
- 2nd: variance of the PDF
- 3rd: skewness (how much it tilts to one side)
- 4th: kurtosis (breadth of the tails)
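As a sketch, these can all be estimated from samples; the distribution and sample size here are assumed for illustration (a normal with mean 3 and standard deviation 2, whose skewness is 0 and kurtosis is 3):

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=3.0, scale=2.0, size=100_000)

mean = samples.mean()  # 1st moment
var = samples.var()    # 2nd central moment
std = samples.std()
# 3rd and 4th standardized moments
skew = ((samples - mean) ** 3).mean() / std ** 3
kurt = ((samples - mean) ** 4).mean() / std ** 4
```

With this many samples the estimates land close to the true values (mean ≈ 3, variance ≈ 4, skewness ≈ 0, kurtosis ≈ 3).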
(For future me: if I want to build out the details of each of these, consult Gundersen 2020 to do so.)
Normalization of moments
For the second and higher moments, we typically apply normalizations built from the preceding moments:
- 0th: doesn’t matter
- 1st: let $\mu = \mu_1$, the raw first moment (the mean)
- 2nd: subtract $\mu$ from the base, giving the central moment $\mu_2 = \mathbb{E}[(X - \mu)^2]$ (the variance)
- 3rd and onwards: subtract $\mu$ and divide the base through by $\sigma = \sqrt{\mu_2}$, giving the standardized moment $\tilde{\mu}_n = \mathbb{E}\left[\left(\frac{X - \mu}{\sigma}\right)^n\right]$
While not strictly part of the definition of the moment, these normalizations make higher moments more informative.
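A sketch of the raw/central/standardized distinction, with hypothetical helper names; note that the standardized second moment is identically 1, which is why the variance is reported as a central rather than standardized moment:

```python
import numpy as np

def raw_moment(x, n):
    # E[X^n]: no normalization
    return np.mean(x ** n)

def central_moment(x, n):
    # E[(X - mu)^n]: subtract the mean first
    return np.mean((x - x.mean()) ** n)

def standardized_moment(x, n):
    # E[((X - mu) / sigma)^n]: also divide the base through by sigma
    return central_moment(x, n) / x.std() ** n

x = np.array([1.0, 2.0, 2.0, 3.0, 7.0])
# central_moment(x, 2) recovers the (population) variance;
# standardized_moment(x, 2) is always 1.
```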
Moment-generating function
See Moment-generating functions