# Log-Sum-Exp Trick

The Log-Sum-Exp trick is a really cool computational trick to prevent overflow or underflow when computing the log of a sum of exponentials (exactly as its name suggests!). I got to know about it while trying to code up mixture density networks, whose log-likelihood required me to calculate the log of a sum of Gaussian densities.

Suppose you would like to calculate a quantity that looks like the following:

\[
\mathcal{J} = \log \sum_{i=1}^{N} e^{x_i}
\]

If \(x_i\) happens to be a very large positive or negative number, the exponential makes it larger or smaller still. When calculating with limited precision, for example GPU calculations in `float32`, this problem is exacerbated and can lead to overflow or underflow.
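To see the problem concretely, here's a quick NumPy sketch (the example values are my own): even in `float64`, `exp` overflows once its argument exceeds roughly 709, so the naive computation blows up to `inf`.

```python
import numpy as np

x = np.array([1000.0, -1000.0, 1000.0])

# exp(1000) overflows even float64 (whose max exponent is ~709),
# so the naive log(sum(exp(x))) comes out as inf:
naive = np.log(np.sum(np.exp(x)))
print(naive)  # inf
```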

Thankfully, we can re-write \(\mathcal{J}\) as

\[
\mathcal{J} = a + \log \sum_{i=1}^{N} e^{x_i - a}
\]

where \(a = \max \left(x_i;~i=1\text{ ... }N \right)\). If \(x_i\) is a very large positive or negative number, then \(x_i - a\) brings the exponential into more manageable territory for computation: the largest exponent is exactly zero, so the sum can never overflow.
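The re-arrangement above translates directly into code. Here's a minimal NumPy sketch (the function name is my own choice); on the same values that broke the naive computation, the shifted version returns a finite answer:

```python
import numpy as np

def logsumexp(x):
    """Numerically stable log(sum(exp(x))) via the max-shift trick."""
    a = np.max(x)
    # Shift by the max so the largest exponent is exactly 0.
    return a + np.log(np.sum(np.exp(x - a)))

x = np.array([1000.0, 1001.0, 1002.0])
# Naive np.log(np.sum(np.exp(x))) would overflow to inf here;
# the shifted version computes 1002 + log(e^-2 + e^-1 + e^0):
print(logsumexp(x))  # ≈ 1002.4076
```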

Also, the above re-arrangement shows that \(\mathcal{J}\) is a smooth approximation of the max of the \(x_i\), since it lies within the tight bounds

\[
\max_i x_i \;\le\; \mathcal{J} \;\le\; \max_i x_i + \log N
\]
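These bounds are easy to check numerically. A small sketch (random inputs of my own choosing), reusing the same max-shift implementation:

```python
import numpy as np

def logsumexp(x):
    """Stable log(sum(exp(x))) via the max-shift trick."""
    a = np.max(x)
    return a + np.log(np.sum(np.exp(x - a)))

rng = np.random.default_rng(0)
x = 10.0 * rng.normal(size=100)

lse = logsumexp(x)
# max(x) <= LSE(x) <= max(x) + log(N): the log-sum-exp is a
# smooth upper approximation of the max.
assert np.max(x) <= lse <= np.max(x) + np.log(len(x))
```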

See Wikipedia for more details.

Incidentally, TensorFlow also provides an implementation of the Log-Sum-Exp trick (`tf.reduce_logsumexp`) to make things convenient for TensorFlow users. See here.

Here's a YouTube video on the Log-Sum-Exp trick (not by me), if reading's not your thing...