Andrew Eckford: The Blog: November 2010

Thursday, November 25, 2010

Sixty years of Hamming codes

Via Slashdot, this year marks the 60th anniversary of the venerable Hamming code. Arriving just two years after Shannon's seminal paper, the Hamming codes were the first nontrivial error-correcting code, and Hamming's contribution to the theory of error correcting codes (using metrics which are now known as Hamming distance and Hamming weight) are still taught today.

Here (pdf) is Hamming's original paper (from BSTJ), and here is my Hamming code lecture from last spring's information theory course:

Wednesday, November 10, 2010

Capacity in the presence of intractability

Shannon's channel capacity, C = max_p(x) I(X;Y), gives the highest rate at which information can be reliably transmitted over a noisy channel. But the simplicity of the equation is deceptive: there are very few channels for which capacity is known. Normally, the difficulty is in optimizing over all input distributions p(x), but what if I(X;Y) is itself intractable?

I(X;Y) can be hard to calculate if f(y) or f(y|x) are hard. The former can happen in intersymbol interference channels with long memories, especially under a binary (or discrete) input constraint. The latter can happen in timing channels, when x is a vector of inputs, and the output is of the form y = sort(x+n), where n is the vector of timing noise. In this case, calculating f(y|x) requires the calculation of a matrix permanent, a classic intractable problem for vectors of practical length. (See also: the Bapat-Beg theorem.)

However, there is an elegant trick you can use to find a lower bound on I(X;Y), as long as you can (tractably) generate random variables x and y according to their true distributions f(x) and f(y|x). Let g(y|x) be any valid conditional distribution of y given x. Further, let

Then

where expectation is always taken with respect to the true distribution. That is, if you can pick a tractable approximate distribution g(y|x), and you can simulate both x and y, then you can perform a Monte Carlo expectation on log g(y|x)/g(y), and get a lower bound on I(X;Y). You can then optimize with respect to the bound to find a lower bound on capacity.

The proof is straightforward. First let g(x|y) = g(y|x)f(x)/g(y). Then:

So the accuracy of the bound is equal to the relative entropy between the true f(x|y) and the approximate g(x|y).

This result has been around for a while, but a nice recent application is here:

D. M. Arnold, H.-A. Loeliger, P. O. Vontobel, A. Kavcic, and W. Zeng, “Simulation-based computation of information rates for channels with memory,” IEEE Trans. Inform. Theory, vol. 52, pp. 3498–3508, Aug. 2006.

Tuesday, November 2, 2010

Andrew Eckford: The Blog

Thursday, November 25, 2010

Sixty years of Hamming codes

Friday, November 12, 2010

Ph.D. position available (Update: No longer available)

Wednesday, November 10, 2010

Capacity in the presence of intractability

Tuesday, November 2, 2010

List of blogs in machine learning

Blog Archive

Links