Warning! See the Information theory notation cheat sheet.

Conditional entropy is exactly what it sounds like: it's the Shannon entropy of the distribution of a dependent random variable given a known condition. Recall that

$$H(X) = -\sum_{x} p(x) \log p(x).$$
If we let $p(y) = p(y \mid x)$, then we have the conditional entropy of $Y$ given $X = x$:

$$H(Y \mid X = x) = -\sum_{y} p(y \mid x) \log p(y \mid x).$$

Averaging over $x$ gives the conditional entropy of $Y$ given $X$:

$$H(Y \mid X) = \sum_{x} p(x)\, H(Y \mid X = x) = -\sum_{x, y} p(x, y) \log p(y \mid x).$$
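As a sanity check, the conditional entropy of $Y$ given $X$ can be computed directly on a small joint distribution. The sketch below uses a made-up joint table `p_xy` (purely illustrative) and shows that the joint-sum form and the weighted-average form of $H(Y \mid X)$ agree:

```python
import math

# Hypothetical joint distribution p(x, y) -- values chosen for illustration only.
p_xy = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.2, (1, 1): 0.3,
}

# Marginal p(x) = sum_y p(x, y).
p_x = {}
for (x, _), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p

# Joint-sum form: H(Y|X) = -sum_{x,y} p(x, y) log2 p(y|x),
# using p(y|x) = p(x, y) / p(x).
h_joint_form = -sum(
    p * math.log2(p / p_x[x]) for (x, _), p in p_xy.items() if p > 0
)

# Weighted-average form: H(Y|X) = sum_x p(x) * H(Y | X = x).
h_avg_form = 0.0
for x, px in p_x.items():
    cond = [p / px for (xx, _), p in p_xy.items() if xx == x and p > 0]
    h_avg_form += px * -sum(q * math.log2(q) for q in cond)

print(h_joint_form, h_avg_form)  # the two forms match
```

Using `log2` puts the answer in bits; any other base only rescales it.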
The chain rule for conditional entropy is

$$H(Y \mid X) = H(X, Y) - H(X),$$

where $H(X, Y)$ is the joint entropy of random variables $X$ and $Y$.
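The chain rule is easy to verify numerically. This sketch (again with a made-up joint table, assumed only for illustration) computes the joint entropy, the marginal entropy, and the conditional entropy from the same table and checks that they satisfy the identity:

```python
import math

# Hypothetical joint distribution p(x, y), for illustration only.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Marginal p(x) = sum_y p(x, y).
p_x = {}
for (x, _), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p

h_xy = -sum(p * math.log2(p) for p in p_xy.values())  # joint entropy H(X, Y)
h_x = -sum(p * math.log2(p) for p in p_x.values())    # marginal entropy H(X)
h_y_given_x = -sum(p * math.log2(p / p_x[x])          # conditional entropy H(Y|X)
                   for (x, _), p in p_xy.items())

# Chain rule: H(Y|X) = H(X, Y) - H(X).
print(h_y_given_x, h_xy - h_x)  # the two sides match
```

The identity holds for any joint distribution, not just this toy one; conditioning can only remove the uncertainty that $X$ already accounts for.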