TheoremComplete

Trees and Networks - Applications

Kruskal's algorithm efficiently finds minimum spanning trees, with correctness following from the greedy choice property and the cut property.

DefinitionKruskal's Algorithm

Given a connected weighted graph G=(V,E,w)G = (V, E, w), Kruskal's algorithm finds a minimum spanning tree:

  1. Sort edges by weight: e1,e2,,eme_1, e_2, \ldots, e_m where w(e1)w(e2)w(em)w(e_1) \leq w(e_2) \leq \cdots \leq w(e_m)
  2. Initialize T=T = \emptyset (empty forest)
  3. For each edge eie_i in sorted order:
    • If adding eie_i to TT doesn't create a cycle, add it: TT{ei}T \leftarrow T \cup \{e_i\}
  4. Return TT

Correctness Proof (Cut Property):

Lemma (Cut Property): Let SVS \subset V be a proper subset. Let ee be a minimum-weight edge with one endpoint in SS and one in VSV \setminus S. Then some MST contains ee.

Proof of lemma: Let TT be any MST. If eTe \in T, we're done. Otherwise, adding ee to TT creates a cycle. This cycle must cross the cut (S,VS)(S, V \setminus S) at least twice (once via ee, once via another edge ee').

Remove ee' from T{e}T \cup \{e\} to get a spanning tree TT'. Since w(e)w(e)w(e) \leq w(e') (by minimality of ee across the cut), we have w(T)w(T)w(T') \leq w(T). Since TT is an MST, TT' is also an MST, and it contains ee. \square

Main Proof: Kruskal's algorithm maintains a forest FF. We prove by induction that at each step, FF is contained in some MST.

Base: F=F = \emptyset is in every MST.

Inductive step: Suppose FF is in some MST TT. When we consider edge e={u,v}e = \{u,v\}:

  • If ee creates a cycle in FF, skip it (can't be in any spanning tree extending FF)
  • If ee doesn't create a cycle, uu and vv are in different components of FF. Let SS be the component containing uu.

The edge ee crosses the cut (S,VS)(S, V \setminus S). Among edges considered so far (including ee), ee is the minimum-weight edge crossing this cut (edges are sorted). By the cut property, some MST contains ee.

Moreover, this MST can be chosen to contain FF: if an MST contains FF, we can modify it to include ee (as shown in the cut property proof). Thus F{e}F \cup \{e\} is in some MST.

By induction, when the algorithm terminates with a spanning tree, it's an MST. \square

ExampleRuntime Analysis

With V=n|V| = n and E=m|E| = m:

  1. Sorting edges: O(mlogm)O(m \log m)
  2. Processing each edge with Union-Find: O(mα(n))O(m \cdot \alpha(n)) where α\alpha is the inverse Ackermann function (effectively constant)

Total: O(mlogm)=O(mlogn)O(m \log m) = O(m \log n) since mn2m \leq n^2, so logm2logn\log m \leq 2 \log n.

For dense graphs (mn2m \approx n^2), Prim's algorithm with Fibonacci heaps achieves O(m+nlogn)O(m + n \log n), which is better.

Remark

The greedy approach works for MST because the matroid structure: forests form an independence system where the greedy algorithm yields optimal solutions. This extends to other matroid optimization problems. The reverse-delete algorithm (dual to Kruskal's) also works: sort edges in decreasing weight order, remove edges that don't disconnect the graph. Minimum spanning tree algorithms have applications in network design, clustering (single-linkage hierarchical clustering), and approximation algorithms (e.g., 2-approximation for TSP on metric spaces via MST).