- Say you want to place k objects into n bins. If n < k, then at least one bin must contain at least two objects.
The lemma is obvious almost to the point of triviality, so it comes as a surprise that you can use it to prove powerful results; there are some examples in the book.
Lately I've been wondering whether it can be used to say anything interesting about entropy. Here's the first thing I thought of: let X and Y be random variables on a discrete alphabet, and let S(X) and S(Y) denote the supports of X and Y, respectively (i.e., x is in S(X) if and only if p(x) > 0). Then
- Theorem. If |S(X)| > |S(Y)|, then H(X|Y) > 0.
- Proof. Conditional entropy is nonnegative, so H(X|Y) >= 0. Suppose H(X|Y) = 0. Then X is (almost surely) a deterministic function of Y: there is a map f : S(Y) --> S(X) with X = f(Y), and f must be surjective, since every x in S(X) occurs with p(x) > 0. Choosing one preimage for each x then gives an injective map S(X) --> S(Y). However, since |S(X)| > |S(Y)|, no such map exists (by the pigeonhole principle). Thus, H(X|Y) != 0, and the theorem follows.
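As a quick numerical sanity check on the theorem, here is a small sketch (the joint distribution and the helper `conditional_entropy` are my own, not from the text) that computes H(X|Y) for a case with |S(X)| = 3 > |S(Y)| = 2:

```python
import math

def conditional_entropy(joint):
    """H(X|Y) in bits, given a joint pmf as a dict {(x, y): p}."""
    # Marginal distribution of Y.
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # H(X|Y) = -sum_{x,y} p(x,y) log2 p(x|y), where p(x|y) = p(x,y)/p(y).
    h = 0.0
    for (x, y), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / p_y[y])
    return h

# Hypothetical example: |S(X)| = 3 but |S(Y)| = 2, so H(X|Y) must be > 0.
joint = {
    ('a', 0): 0.25, ('b', 0): 0.25,  # given Y=0, X is uniform on {a, b}
    ('c', 1): 0.5,                   # given Y=1, X is determined
}
print(conditional_entropy(joint))  # → 0.5
```

Here H(X|Y) = 0.5 bits: Y=0 leaves one bit of uncertainty about X half the time, exactly because two elements of S(X) were forced to share the single "bin" Y=0.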