In mathematics, the log sum inequality is very useful for proving several theorems in information theory.


Suppose ai and bi are nonnegative numbers. The log sum inequality states that

$\sum_{i=1}^{n}a_{i}log \frac{a_{i}}{b_{i}} \seq a log \frac{a}{b}$,
where $a=\sum a_{i}$ and $b=\sum b_{i}$. The equality established if and only if $a_{i}/b{i}$ are equal for all $i$.


(1)Convex functioin

In mathematics, a real-valued function F(x) defined on an interval is called convex (or convex downward or concave upward) if the graph of the function lies below the line segment joining any two points of the graph. Equivalently, a function is convex if its epigraph (the set of points on or above the graph of the function) is a convex set.

The mathematical expressions are as follows,
A real valued function $f: X → R$ defined on a convex set $X$ in a vector space is called convex if, for any two points $x_{1}$ and $x_{2}$ in $X$ and any $t \in [0,1]$,

$f(tx_{1}+(1-t)x_{2}) \leq tf(x_{1}) + (1-t)f(x_{2})$.
The function is called strictly convex if remove the equal in the expression.

Examples of convex functions are the quadratic function $F(x)=x_{2}$ and the exponential function $F(x)=e^{x}$ for any real number $x$. For more examples and properties of this type of function, please refer to them on Wikipedia.

(2)Jensen's inequality

Jensen's inequality generalizes the statement that the secant line of a convex function lies above the graph of the function. There are also converses of the Jensen's inequality, which estimate the upper bound of the integral of the convex function.

In the context of probability theory, it is generally stated in the following form: if $X$ is a random variable
and $\phi$ is a convex function, then

$\phi(E[X]) \leq E[\phi(X)]$.
For a real convex function $\phi$ , numbers $x1, x2, ..., xn$ in its domain, and positive weights $ai$, Jensen's inequality can be stated as:
$\phi(\frac{\sum a{i} x{i}}{\sum a{j}}) \leq \frac{a{i}\phi (x{i})}{\sum a{j}}$.

(3)The Log Sum Inequality

Notice that after setting $f(x) = xlog x$ we have
$\begin{align} \sum{i=1}^{n}a{i}llog\frac{a{i}}{b{i}} =& \sum{i=1}^{n}b{i}f(\frac{a{i}}{b{i}}) = b\sum{i=1}^{n}\frac{b{i}}{b}f(\frac{a{i}}{b{i}}) \ \geq & bf(\sum{i=1}^{n}\frac{b{i}}{b}\frac{a{i}}{b{i}}) = bf(\frac{1}{b}\sum{i=1}^{n}a{i}) = bf(\frac{a}{b})\ = & alog \frac{a}{b} \end{align}$.

where the inequality follows from Jensen's inequality since $\frac{b{i}}{b} \geq 0, \sum {i} \frac{b_{i}}{b} = 1$, and $f$ is convex.


The log sum inequality can be used to prove several inequalities in information theory such as Gibbs' inequality or the convexity of Kullback-Leibler divergence.

For example to prove Gibbs' inequality it is enough to substitute $p{i}$s for $a{i}$s, and $q{i}$s for $b{i}$s to get

$D{KL}(P\parallel Q) \equiv \sum{i=1}^{n}p{i}log{2}\frac{p{i}}{q{i}} \geq 1log\frac{1}{1} = 0$.


[1]Convex function,
[2]Jensen's inequality,
[3]The Log Sum Inequality,

Original Link:
Attribution - NON-Commercial - ShareAlike - Copyright © Bill Xia