Wednesday, November 10, 2010

Capacity in the presence of intractability

Shannon's channel capacity, C = max_p(x) I(X;Y), gives the highest rate at which information can be reliably transmitted over a noisy channel.  But the simplicity of the equation is deceptive: there are very few channels for which capacity is known.  Normally, the difficulty is in optimizing over all input distributions p(x), but what if I(X;Y) is itself intractable?

I(X;Y) can be hard to calculate if f(y) or f(y|x) are hard.  The former can happen in intersymbol interference channels with long memories, especially under a binary (or discrete) input constraint.  The latter can happen in timing channels, when x is a vector of inputs, and the output is of the form y = sort(x+n), where n is the vector of timing noise. In this case, calculating f(y|x) requires the calculation of a matrix permanent, a classic intractable problem for vectors of practical length. (See also: the Bapat-Beg theorem.)

However, there is an elegant trick you can use to find a lower bound on I(X;Y), as long as you can (tractably) generate random variables x and y according to their true distributions f(x) and f(y|x).  Let g(y|x) be any valid conditional distribution of y given x.  Further, let


where expectation is always taken with respect to the true distribution.  That is, if you can pick a tractable approximate distribution g(y|x), and you can simulate both x and y, then you can perform a Monte Carlo expectation on log g(y|x)/g(y), and get a lower bound on I(X;Y).  You can then optimize with respect to the bound to find a lower bound on capacity.

The proof is straightforward.  First let g(x|y) = g(y|x)f(x)/g(y). Then:

So the accuracy of the bound is equal to the relative entropy between the true f(x|y) and the approximate g(x|y).

This result has been around for a while, but a nice recent application is here:

D. M. Arnold, H.-A. Loeliger, P. O. Vontobel, A. Kavcic, and W. Zeng, “Simulation-based computation of information rates for channels with memory,” IEEE Trans. Inform. Theory, vol. 52, pp. 3498–3508, Aug. 2006.

No comments: