Concentration Inequalities

Concentration inequalities bound the probability that a random variable deviates far from its expected value. These tools are indispensable in probabilistic combinatorics, allowing one to go beyond expectation arguments and establish sharp thresholds.

Chernoff-Type Bounds

Definition 7.7 (Chernoff Bound)

Let $X = \sum_{i=1}^n X_i$ where $X_i$ are independent Bernoulli random variables with $\Pr[X_i = 1] = p_i$. Let $\mu = \mathbb{E}[X] = \sum_{i=1}^n p_i$. Then for $\delta > 0$:

$$\Pr[X \geq (1 + \delta)\mu] \leq \left(\frac{e^\delta}{(1+\delta)^{(1+\delta)}}\right)^\mu,$$

and for $0 < \delta < 1$:

$$\Pr[|X - \mu| \geq \delta\mu] \leq 2\exp\left(-\frac{\delta^2 \mu}{3}\right).$$
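As a sanity check on the two bounds above, the following minimal sketch compares them with exact binomial tail probabilities in the special case where all $p_i$ are equal. The function names and the parameters `n = 200`, `p = 0.1` are illustrative choices, not part of the statement.

```python
import math

def binomial_tail(n, p, lo, hi):
    """Exact Pr[lo <= X <= hi] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(max(lo, 0), min(hi, n) + 1))

def chernoff_upper_bound(mu, delta):
    """(e^delta / (1+delta)^(1+delta))^mu, evaluated in log space."""
    return math.exp(mu * (delta - (1 + delta) * math.log(1 + delta)))

def chernoff_two_sided_bound(mu, delta):
    """2 * exp(-delta^2 * mu / 3), stated for 0 < delta < 1."""
    return 2 * math.exp(-delta**2 * mu / 3)

def exact_upper_tail(n, p, delta):
    """Exact Pr[X >= (1 + delta) * mu] with mu = n * p."""
    mu = n * p
    return binomial_tail(n, p, math.ceil((1 + delta) * mu), n)

def exact_two_sided_tail(n, p, delta):
    """Exact Pr[|X - mu| >= delta * mu] with mu = n * p, for 0 < delta < 1."""
    mu = n * p
    upper = binomial_tail(n, p, math.ceil((1 + delta) * mu), n)
    lower = binomial_tail(n, p, 0, math.floor((1 - delta) * mu))
    return upper + lower

# Illustrative parameters (assumed for this example): mu = 20.
n, p = 200, 0.1
for delta in (0.25, 0.5, 0.75):
    print(delta,
          exact_upper_tail(n, p, delta), chernoff_upper_bound(n * p, delta),
          exact_two_sided_tail(n, p, delta), chernoff_two_sided_bound(n * p, delta))
```

In each row the exact tail should sit below the corresponding bound, usually with noticeable room: the Chernoff bound trades tightness for a clean closed form.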

Example (Degree Concentration in $G(n, p)$)

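In $G(n, p)$ with $p = d/n$ and $d = \omega(\log n)$, the degree of each vertex is a $\mathrm{Binomial}(n-1, p)$ random variable with mean $\mu = (n-1)p = (1 - 1/n)d$, so it is concentrated around $d$. For fixed $0 < \varepsilon < 1$, the Chernoff bound gives $\Pr[|\deg(v) - \mu| \geq \varepsilon \mu] \leq 2e^{-\varepsilon^2 \mu / 3}$. A union bound over all $n$ vertices bounds the probability that some vertex fails by $2n e^{-\varepsilon^2 \mu / 3}$, which tends to $0$ because $\mu = \omega(\log n)$. Hence, with high probability, all degrees are $(1 \pm \varepsilon)\mu = (1 \pm \varepsilon)(1 - o(1))d$.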
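To see why the growth condition $d = \omega(\log n)$ matters, the sketch below evaluates the union-bound quantity $2n e^{-\varepsilon^2 d/3}$, treating the mean degree as $d$. The choice $d = (\ln n)^2$ and the value $\varepsilon = 0.5$ are assumptions made only for illustration.

```python
import math

def union_bound(n, d, eps):
    """Union bound over n vertices: n * 2 * exp(-eps^2 * d / 3)."""
    return n * 2 * math.exp(-eps**2 * d / 3)

eps = 0.5  # illustrative accuracy parameter
for n in (10**3, 10**6, 10**9, 10**12):
    d = math.log(n) ** 2  # one example of d = omega(log n)
    print(n, round(d, 1), union_bound(n, d, eps))
```

For small $n$ the bound exceeds 1 and says nothing; it only becomes useful once $\varepsilon^2 d / 3$ outgrows $\ln(2n)$. With $d = C \log n$ for a constant $C$, the same calculation needs $C > 3/\varepsilon^2$ before the bound tends to 0.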

Martingale Methods

Definition 7.8 (Azuma-Hoeffding Inequality)

Let $X_0, X_1, \ldots, X_n$ be a martingale with $|X_i - X_{i-1}| \leq c_i$ almost surely. Then for $t > 0$:

$$\Pr[|X_n - X_0| \geq t] \leq 2\exp\left(-\frac{t^2}{2\sum_{i=1}^n c_i^2}\right).$$
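The simplest martingale with bounded increments is the symmetric random walk $S_n$, a sum of $n$ independent uniform $\pm 1$ steps, for which $c_i = 1$ and $X_0 = 0$. The sketch below (function names are illustrative) compares the exact tail of $S_n$ with the Azuma-Hoeffding bound.

```python
import math

def walk_tail_exact(n, t):
    """Exact Pr[|S_n| >= t] for S_n a sum of n independent uniform +/-1 steps."""
    # S_n = 2*B - n, where B ~ Binomial(n, 1/2) counts the +1 steps.
    favourable = sum(math.comb(n, j) for j in range(n + 1) if abs(2 * j - n) >= t)
    return favourable / 2**n

def azuma_bound(t, c):
    """2 * exp(-t^2 / (2 * sum of c_i^2)) for increment bounds c."""
    return 2 * math.exp(-t**2 / (2 * sum(ci**2 for ci in c)))

n = 100  # illustrative number of steps
for t in (10, 20, 30, 40):
    print(t, walk_tail_exact(n, t), azuma_bound(t, [1] * n))
```

Both quantities decay on the scale $t \approx \sqrt{n}$, which is the typical size of the deviations the inequality controls.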

Definition 7.9 (Method of Bounded Differences)

If $f: \Omega_1 \times \cdots \times \Omega_n \to \mathbb{R}$ satisfies the Lipschitz condition $|f(x) - f(x')| \leq c_k$ whenever $x$ and $x'$ differ only in coordinate $k$, and $X_1, \ldots, X_n$ are independent random variables, then:

$$\Pr[|f(X_1, \ldots, X_n) - \mathbb{E}[f]| \geq t] \leq 2\exp\left(-\frac{2t^2}{\sum_{k=1}^n c_k^2}\right).$$
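A standard illustration, not taken from the text above, is the number of empty bins after throwing $m$ balls independently and uniformly into $n$ bins: moving a single ball changes the count by at most 1, so $c_k = 1$ for each of the $m$ coordinates. The sketch below estimates the tail by simulation and compares it with the bound; the parameters, the seed, and the function names are assumptions made for the example.

```python
import math
import random

def empty_bins(assignment, num_bins):
    """f(x): number of bins that receive no ball under this assignment."""
    return num_bins - len(set(assignment))

def bounded_differences_bound(t, c):
    """2 * exp(-2 t^2 / sum of c_k^2) for Lipschitz constants c."""
    return 2 * math.exp(-2 * t**2 / sum(ck**2 for ck in c))

def empirical_tail(num_balls, num_bins, t, trials, rng):
    """Monte Carlo estimate of Pr[|f - E[f]| >= t]."""
    # E[f] = n * (1 - 1/n)^m, since each bin is empty with probability (1 - 1/n)^m.
    mean = num_bins * (1 - 1 / num_bins) ** num_balls
    hits = 0
    for _ in range(trials):
        assignment = [rng.randrange(num_bins) for _ in range(num_balls)]
        if abs(empty_bins(assignment, num_bins) - mean) >= t:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is reproducible
m = n = 100
for t in (5, 10, 15):
    print(t, empirical_tail(m, n, t, 2000, rng), bounded_differences_bound(t, [1] * m))
```

The empirical frequencies are estimates rather than exact probabilities, so they only suggest, not prove, that the bound holds; in this example the bound is typically quite loose because the true fluctuations are smaller than the worst case the Lipschitz constants allow.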

Example (Chromatic Number Concentration)

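Expose $G(n, 1/2)$ one vertex at a time, letting the $i$-th coordinate record the edges from vertex $i$ to the earlier vertices; these coordinates are independent. Changing the edges at a single vertex changes the chromatic number $\chi(G(n, 1/2))$ by at most 1, since that vertex can always be given a colour of its own. By the method of bounded differences with $c_i = 1$: $\Pr[|\chi - \mathbb{E}[\chi]| \geq t] \leq 2e^{-2t^2/n}.$ This shows $\chi(G(n, 1/2))$ is concentrated in an interval of width $O(\sqrt{n})$, a result due to Shamir and Spencer. Alon improved the width to $O(\sqrt{n}/\log n)$. Constant-width concentration does not hold, however: Heckel, and later Heckel and Riordan, showed that $\chi(G(n, 1/2))$ is not concentrated on any sequence of intervals of length $n^{1/2 - \varepsilon}$.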

Remark (Talagrand's Inequality)

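Talagrand's concentration inequality provides much stronger bounds when large values of the function $f$ can be certified by few coordinates. For product spaces, if $f$ is Lipschitz (changing one coordinate changes $f$ by a bounded amount) and certifiable (the event $f \geq s$ can be verified by exhibiting $O(s)$ coordinates), then $f$ is concentrated within $O(\sqrt{\mathbb{E}[f]})$ of its median. This is far tighter than the $O(\sqrt{n})$ window from Azuma-Hoeffding whenever $\mathbb{E}[f]$ is much smaller than $n$.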