
Introduction to Ergodic Theory - Key Proof

Proof Sketch of Birkhoff's Ergodic Theorem

We outline the proof of Birkhoff's ergodic theorem, emphasizing key ideas over technical details.

Theorem Statement: For a measure-preserving system $(X, \mu, T)$ on a probability space and $f \in L^1(X, \mu)$, the limit $\lim_{n \to \infty} \frac{1}{n}\sum_{k=0}^{n-1} f(T^k(x)) = f^*(x)$ exists almost everywhere, with $f^*$ invariant and $\int f^* \, d\mu = \int f \, d\mu$.

Step 1: Maximal ergodic lemma

Define the maximal function:

$$f^+(x) = \sup_{n \geq 1} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x))$$

The maximal ergodic lemma states that, for the set $E = \{x : f^+(x) > 0\}$:

$$\int_E f \, d\mu \geq 0$$
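Before the proof, the lemma can be sanity-checked on a finite system. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $X = \{0, \dots, m-1\}$ with the uniform measure and the cyclic shift $T(x) = x + 1 \bmod m$, and the name `maximal_lemma_sum` is hypothetical. On a cycle of length $m$, some average is positive exactly when one of the partial sums $S_1, \dots, S_m$ is positive, so only those need to be scanned.

```python
import random


def maximal_lemma_sum(f):
    """Sum of f over E = {x : f^+(x) > 0} for the cyclic shift T(x) = x + 1 mod m.

    f is a list of integers indexed by the points of X = {0, ..., m-1}.
    The result equals m times the integral of f over E (uniform measure).
    """
    m = len(f)
    total = 0
    for x in range(m):
        partial = 0
        for n in range(m):  # partial sums S_1(x), ..., S_m(x)
            partial += f[(x + n) % m]
            if partial > 0:  # some average is positive, so x lies in E
                total += f[x]
                break
    return total


print(maximal_lemma_sum([3, -1, -4, 2]))  # 1

# Random trials with integer values, so the arithmetic is exact.
rng = random.Random(0)
for _ in range(100):
    values = [rng.randint(-5, 5) for _ in range(rng.randint(1, 12))]
    assert maximal_lemma_sum(values) >= 0
```

In the printed example, $E = \{0, 2, 3\}$ and the sum over $E$ is $3 - 4 + 2 = 1 \geq 0$, as the lemma predicts.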

Proof of lemma: Define $S_n = \sum_{k=0}^{n-1} f \circ T^k$ and note that, for $n \geq 2$:

$$\max_{1 \leq k \leq n} S_k(x) = \max\{f(x), f(x) + \max_{1 \leq k \leq n-1} S_k(T(x))\}$$

Using this recursion and the fact that $T$ preserves $\mu$, one compares the running maximum of the partial sums at $x$ with the same maximum at $T(x)$ and integrates over the set where it is positive, which yields $\int_E f \, d\mu \geq 0$. This lemma is the technical heart of the proof.
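The estimate can be spelled out in a few lines. The following is one standard way to organize it; the symbols $M_n$ and $E_n$ are notation introduced here for illustration and do not appear elsewhere in the text.

```latex
\begin{align*}
M_n(x) &:= \max\{0, S_1(x), \dots, S_n(x)\}, \qquad E_n := \{x : M_n(x) > 0\}, \\
S_k(x) &= f(x) + S_{k-1}(T(x)) \le f(x) + M_n(T(x)) \quad (1 \le k \le n,\ S_0 := 0), \\
M_n(x) &\le f(x) + M_n(T(x)) \quad \text{for } x \in E_n, \\
\int_{E_n} f \, d\mu &\ge \int_{E_n} M_n \, d\mu - \int_{E_n} M_n \circ T \, d\mu
  \ge \int_X M_n \, d\mu - \int_X M_n \circ T \, d\mu = 0.
\end{align*}
```

The second inequality in the last line uses $M_n = 0$ outside $E_n$ and $M_n \circ T \geq 0$; the final equality is measure preservation. Since the sets $E_n$ increase to $E$, dominated convergence with dominating function $|f|$ gives $\int_E f \, d\mu \geq 0$.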

Step 2: Limsup and liminf

Define:

$$\overline{f}(x) = \limsup_{n \to \infty} \frac{1}{n}\sum_{k=0}^{n-1} f(T^k(x))$$

$$\underline{f}(x) = \liminf_{n \to \infty} \frac{1}{n}\sum_{k=0}^{n-1} f(T^k(x))$$

Both $\overline{f}$ and $\underline{f}$ are $T$-invariant: replacing $x$ by $T(x)$ shifts the orbit by one step, which changes the $n$-th average only by a term that vanishes as $n \to \infty$.
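The invariance can be read off from a one-line identity relating the averages at $T(x)$ and at $x$, written here with the partial sums $S_n$ from Step 1:

```latex
\[
\frac{1}{n} S_n(T(x)) \;=\; \frac{n+1}{n} \cdot \frac{1}{n+1} S_{n+1}(x) \;-\; \frac{f(x)}{n}
\]
```

Since $\frac{n+1}{n} \to 1$ and $\frac{f(x)}{n} \to 0$ wherever $f(x)$ is finite, the averages at $T(x)$ and at $x$ have the same limsup and the same liminf.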

Step 3: Showing $\overline{f} = \underline{f}$ almost everywhere

For rationals $a < b$, consider the set $A = \{x : \underline{f}(x) < a < b < \overline{f}(x)\}$, which is $T$-invariant because $\overline{f}$ and $\underline{f}$ are. Apply the maximal lemma to $(f - b)\mathbf{1}_A$. On $A$ the averages of $f - b$ have positive supremum because $\overline{f} > b$ there, and off $A$ the function vanishes along the whole orbit, so the set $E$ of the lemma is exactly $A$:

$$\int_A (f - b) \, d\mu \geq 0$$

Similarly, applying the lemma to $(a - f)\mathbf{1}_A$ and using $\underline{f} < a$ on $A$:

$$\int_A (f - a) \, d\mu \leq 0$$

Together these give:

$$b \, \mu(A) \leq \int_A f \, d\mu \leq a \, \mu(A)$$

Since $a < b$ and $\mu(A)$ is finite (as it is when $\mu$ is a probability measure), this forces $\mu(A) = 0$. By density of the rationals, $\{\overline{f} > \underline{f}\}$ is the countable union of such sets $A$ over rational pairs $a < b$, so $\mu(\{\overline{f} > \underline{f}\}) = 0$.

Thus $\overline{f} = \underline{f} =: f^*$ almost everywhere, so the limit exists.

Step 4: Invariance and integral preservation

$f^*$ is invariant: $f^* \circ T = f^*$ by construction (time averages are shift-invariant).

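For integral preservation, interchange the limit and the integral. For bounded $f$ this is justified by the bounded convergence theorem, since the averages are bounded by $\|f\|_\infty$ and $\mu$ is finite. For general $f \in L^1$, approximate $f$ in $L^1$ by bounded functions and use Fatou's lemma, which gives $\|f^*\|_1 \leq \|f\|_1$, to control the error: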

$$\int f^* \, d\mu = \int \lim_{n \to \infty} \frac{1}{n}\sum_{k=0}^{n-1} f \circ T^k \, d\mu = \lim_{n \to \infty} \frac{1}{n}\sum_{k=0}^{n-1} \int f \circ T^k \, d\mu$$

Since $T$ preserves measure, $\int f \circ T^k \, d\mu = \int f \, d\mu$, yielding:

$$\int f^* \, d\mu = \lim_{n \to \infty} \frac{1}{n} \cdot n \int f \, d\mu = \int f \, d\mu$$

Conclusion: The time average $f^*$ exists almost everywhere, is invariant, and has the same integral as $f$.

■

This proof combines measure theory, functional analysis, and clever inequalities. The maximal ergodic lemma is the key technical tool, controlling fluctuations in partial sums. Once this is established, the rest follows from standard arguments.

Remark

For ergodic $T$, every invariant function is constant almost everywhere (this is one of the equivalent characterizations of ergodicity). Thus $f^* = c$ a.e., and integrating against the probability measure $\mu$ gives $c = \int f \, d\mu$. This completes the classical statement: for ergodic systems, time averages equal the space average.
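A non-ergodic contrast case may help show what ergodicity contributes. The sketch below assumes the rotation $T(x) = x + \tfrac{1}{2} \bmod 1$ on $[0, 1)$ with Lebesgue measure, which is measure-preserving but not ergodic, and the test function $f(x) = x^2$; the names `orbit_average` and `square` are hypothetical. Every orbit has period 2, so the time average still exists and is invariant, but it depends on the starting point instead of equaling the space average $\tfrac{1}{3}$.

```python
def orbit_average(f, x0, n_steps, shift):
    """Average of f along x0, T(x0), ..., T^(n_steps-1)(x0) for T(x) = x + shift mod 1."""
    total = 0.0
    x = x0
    for _ in range(n_steps):
        total += f(x)
        x = (x + shift) % 1.0
    return total / n_steps


def square(x):
    return x * x  # space average over [0, 1) is 1/3


# Period-2 orbits: the limit is (f(x0) + f(x0 + 1/2)) / 2, which varies with x0.
for x0 in (0.1, 0.3):
    print(x0, orbit_average(square, x0, 1000, shift=0.5))
```

For $x_0 = 0.1$ the limit is $(0.01 + 0.36)/2 = 0.185$, and for $x_0 = 0.3$ it is $(0.09 + 0.64)/2 = 0.365$, so $f^*$ is invariant but not constant.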

The proof extends to $L^p$ spaces and more general settings (amenable groups, noncommutative spaces), demonstrating the robustness of the ergodic theorem beyond its original formulation.

Example: Application to Monte Carlo Methods

Birkhoff's theorem justifies Monte Carlo integration. To compute $\int_X f \, d\mu$ for an ergodic system:

  1. Choose a $\mu$-typical initial point $x_0$
  2. Compute the time average: $\frac{1}{N}\sum_{k=0}^{N-1} f(T^k(x_0))$
  3. As $N \to \infty$, this converges to $\int f \, d\mu$ for $\mu$-almost every choice of $x_0$
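The three steps above can be tried on a concrete system. This is a minimal sketch under assumptions that are not in the original text: the system is the rotation $T(x) = x + \alpha \bmod 1$ on $[0, 1)$ with Lebesgue measure, $\alpha$ is the golden-ratio conjugate (irrational, so the rotation is ergodic), and $f(x) = x^2$, whose space average is $\tfrac{1}{3}$. The names `ALPHA`, `time_average`, and `f` are hypothetical.

```python
import math

# Assumed rotation number: (sqrt(5) - 1) / 2, an irrational number.
ALPHA = (math.sqrt(5) - 1) / 2


def time_average(f, x0, n_steps, alpha=ALPHA):
    """Average of f over x0, T(x0), ..., T^(n_steps-1)(x0) for T(x) = x + alpha mod 1."""
    total = 0.0
    for k in range(n_steps):
        # T^k(x0) is computed directly to avoid accumulating rounding error.
        total += f((x0 + k * alpha) % 1.0)
    return total / n_steps


def f(x):
    return x * x  # integral over [0, 1) is 1/3


# Step 1: pick a starting point. Steps 2 and 3: average along the orbit.
for n_steps in (100, 10_000, 100_000):
    estimate = time_average(f, x0=0.2, n_steps=n_steps)
    print(n_steps, estimate, abs(estimate - 1 / 3))
```

The printed error should shrink as `n_steps` grows. Floating-point arithmetic only approximates an irrational rotation, so this is a numerical illustration of the theorem, not a verification of it.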

This provides a theoretical foundation for Markov chain Monte Carlo methods, which are widely used in statistics, physics, and machine learning.

Birkhoff's ergodic theorem stands among the great results of 20th-century mathematics, connecting dynamics, probability, and analysis. It provides rigorous foundations for statistical mechanics, justifies computational methods, and reveals deep connections between individual trajectories and ensemble statistics.