Trees and Networks - Applications

Kruskal's algorithm efficiently finds minimum spanning trees, with correctness following from the greedy choice property and the cut property.

Definition (Kruskal's Algorithm)

Given a connected weighted graph $G = (V, E, w)$, Kruskal's algorithm finds a minimum spanning tree (MST) as follows:

1. Sort the edges by weight: $e_1, e_2, \ldots, e_m$ where $w(e_1) \leq w(e_2) \leq \cdots \leq w(e_m)$.
2. Initialize $T = \emptyset$ (the empty forest).
3. For each edge $e_i$ in sorted order:
   - If adding $e_i$ to $T$ doesn't create a cycle, add it: $T \leftarrow T \cup \{e_i\}$.
4. Return $T$.
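The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `kruskal`, the `(weight, u, v)` edge format, and the vertex labels `0..num_vertices-1` are assumptions made for the example. The cycle test uses a simple union-find structure with path halving.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, tree_edges) for a connected weighted graph.

    edges: list of (weight, u, v) tuples, vertices labelled 0..num_vertices-1
    (an assumed input format for this sketch).
    """
    parent = list(range(num_vertices))

    def find(x):
        # Walk up to the root of x's component, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    total = 0
    for w, u, v in sorted(edges):      # step 1: edges in nondecreasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                   # step 3: endpoints in different components, so no cycle
            parent[ru] = rv            # merge the two components
            tree.append((u, v, w))
            total += w
    return total, tree


example_edges = [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 1, 3), (5, 2, 3)]
total, tree = kruskal(4, example_edges)
print(total, tree)  # 7 [(0, 1, 1), (1, 2, 2), (1, 3, 4)]
```

On this four-vertex example the edges of weight 3 and 5 are skipped because each would close a cycle.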

Correctness Proof (Cut Property):

Lemma (Cut Property): Let $S \subset V$ be a nonempty proper subset. Let $e$ be a minimum-weight edge with one endpoint in $S$ and one in $V \setminus S$. Then some MST contains $e$.

Proof of lemma: Let $T$ be any MST. If $e \in T$, we're done. Otherwise, adding $e$ to $T$ creates a cycle. A cycle that leaves $S$ must also return to it, so this cycle crosses the cut $(S, V \setminus S)$ at least twice: once via $e$, and once via another edge $e' \in T$.

Remove $e'$ from $T \cup \{e\}$ to get $T'$. Deleting an edge of the cycle keeps the graph connected, so $T'$ is a spanning tree. Since $w(e) \leq w(e')$ (by minimality of $e$ across the cut), we have $w(T') \leq w(T)$. Since $T$ is an MST, $T'$ is also an MST, and it contains $e$. $\square$
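The exchange step in this proof can be made concrete with a short Python sketch. The helper names `tree_path` and `exchange`, and the `(u, v, weight)` edge format, are hypothetical choices for illustration. The example graph is assumed to have edges 0-1 (weight 1), 1-2 (weight 5), 2-3 (weight 2), and 0-3 (weight 5), so the tree below is one of its MSTs.

```python
from collections import defaultdict

def tree_path(tree_edges, src, dst):
    """Return the edges on the unique src-dst path in a tree."""
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, (u, v, w)))
        adj[v].append((u, (u, v, w)))
    stack = [(src, None, [])]
    while stack:
        node, prev, path = stack.pop()
        if node == dst:
            return path
        for nxt, edge in adj[node]:
            if nxt != prev:            # in a tree, avoiding the parent avoids revisits
                stack.append((nxt, node, path + [edge]))
    raise ValueError("dst is not reachable from src")

def exchange(tree_edges, e, S):
    """Swap e into the tree in place of a tree edge e' crossing the cut given by S."""
    u, v, _ = e
    if (u in S) == (v in S):
        raise ValueError("e does not cross the cut")
    # The tree path from u to v, together with e, is the cycle created by adding e.
    for e_prime in tree_path(tree_edges, u, v):
        a, b, _ = e_prime
        if (a in S) != (b in S):       # e' has exactly one endpoint in S
            return [x for x in tree_edges if x != e_prime] + [e], e_prime
    raise ValueError("no tree edge on the cycle crosses the cut")


T = [(0, 1, 1), (1, 2, 5), (2, 3, 2)]   # an MST of the assumed example graph
S = {0, 1}
e = (0, 3, 5)                           # a minimum-weight edge across the cut
T_new, removed = exchange(T, e, S)
print(removed, sum(w for _, _, w in T_new))  # (1, 2, 5) 8
```

Because $T$ is already minimum, the swapped edges necessarily have equal weight here; the exchange yields a different MST that contains $e$.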

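Main Proof: Kruskal's algorithm maintains a forest $F$ (the set called $T$ in the definition above; we write $F$ here so that $T$ can denote an MST). We prove by induction that at each step, $F$ is contained in some MST.

Base: $F = \emptyset$ is contained in every MST.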

Inductive step: Suppose $F$ is contained in some MST $T$. When we consider edge $e = \{u,v\}$:

- If $e$ creates a cycle in $F$, the algorithm skips it, since $e$ can't be in any spanning tree extending $F$. The forest $F$ is unchanged, so it is still contained in $T$.
- If $e$ doesn't create a cycle, $u$ and $v$ are in different components of $F$. Let $S$ be the vertex set of the component containing $u$.

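The edge $e$ crosses the cut $(S, V \setminus S)$, and no edge of $F$ crosses it, since $S$ is an entire component of $F$. Moreover, $e$ is a minimum-weight edge among all edges crossing this cut: any crossing edge considered earlier would have joined two different components of the forest at that time, so it would have been added and would now lie in $F$, a contradiction. Hence every crossing edge comes no earlier than $e$ in the sorted order.

It remains to show that some MST contains $F \cup \{e\}$. If $e \in T$, we are done. Otherwise, apply the exchange from the cut property proof to $T$: the cycle in $T \cup \{e\}$ contains another edge $e'$ crossing the cut, and $e' \notin F$ because no edge of $F$ crosses the cut. Then $T' = (T \setminus \{e'\}) \cup \{e\}$ is a spanning tree with $w(T') \leq w(T)$, so $T'$ is an MST containing $F \cup \{e\}$.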

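By induction, the final forest is contained in some MST. Since $G$ is connected, the final forest is a spanning tree: any edge joining two of its components would have been added when it was considered. A spanning tree contained in an MST must equal that MST, so the output is an MST. $\square$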
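As a sanity check on the invariant used in this proof, the following Python sketch enumerates every MST of a tiny graph by brute force and confirms that the growing forest is always contained in at least one of them. The function names and the `(weight, u, v)` edge format are assumptions for illustration, and the enumeration is exponential, so it only suits very small inputs.

```python
from itertools import combinations

def is_forest(n, edge_subset):
    """True if the given (weight, u, v) edges contain no cycle."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for _, u, v in edge_subset:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False               # u and v already connected: cycle
        parent[ru] = rv
    return True

def all_msts(n, edges):
    """Enumerate every MST by brute force (tiny graphs only)."""
    # An acyclic set of n - 1 edges on n vertices is a spanning tree.
    trees = [frozenset(c) for c in combinations(edges, n - 1) if is_forest(n, c)]
    best = min(sum(w for w, _, _ in t) for t in trees)
    return [t for t in trees if sum(w for w, _, _ in t) == best]

def check_invariant(n, edges):
    """Run Kruskal's loop and test 'F is contained in some MST' after each addition."""
    msts = all_msts(n, edges)
    forest = set()
    for e in sorted(edges):
        if is_forest(n, forest | {e}):
            forest.add(e)
            if not any(forest <= t for t in msts):
                return False
    return frozenset(forest) in msts


# A 4-cycle with unit weights plus one heavier diagonal: several MSTs exist.
square = [(1, 0, 1), (1, 1, 2), (1, 2, 3), (1, 0, 3), (2, 0, 2)]
print(len(all_msts(4, square)), check_invariant(4, square))  # 4 True
```

The tied weights matter: with four distinct MSTs, the forest stays inside *some* MST at every step even though it is not inside all of them.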

Example (Runtime Analysis)

With $|V| = n$ and $|E| = m$ :

- Sorting edges: $O(m \log m)$
- Processing each edge with Union-Find: $O(m \cdot \alpha(n))$, where $\alpha$ is the inverse Ackermann function (effectively constant)

Total: $O(m \log m) = O(m \log n)$, since $m \leq n^2$ implies $\log m \leq 2 \log n$.
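The $O(m \cdot \alpha(n))$ bound for the edge-processing phase relies on a Union-Find structure that combines union by rank with path compression. A minimal sketch follows; the class and method names are assumptions for illustration.

```python
class UnionFind:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # First pass: locate the root of x's set.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if they were already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


uf = UnionFind(4)
results = [uf.union(0, 1), uf.union(2, 3), uf.union(1, 3), uf.union(0, 2)]
print(results)  # [True, True, True, False]
```

In Kruskal's algorithm, a `union` call that returns `False` corresponds exactly to an edge that would create a cycle.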

For dense graphs ($m \approx n^2$), Prim's algorithm with Fibonacci heaps achieves $O(m + n \log n) = O(n^2)$, which is asymptotically better than Kruskal's $O(n^2 \log n)$.

Remark

The greedy approach works for MST because of the underlying matroid structure: the forests of a graph are the independent sets of a matroid (the graphic matroid), and on any matroid the greedy algorithm yields an optimal solution. This extends to other matroid optimization problems. The reverse-delete algorithm (dual to Kruskal's) also works: consider the edges in order of decreasing weight, and remove each edge whose removal does not disconnect the graph. Minimum spanning tree algorithms have applications in network design, clustering (single-linkage hierarchical clustering), and approximation algorithms (e.g., the MST-based 2-approximation for metric TSP).
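The reverse-delete algorithm mentioned in the remark can be sketched as follows. This is a deliberately naive version that rechecks connectivity from scratch after each tentative removal; the function names and the `(weight, u, v)` edge format are assumptions for illustration.

```python
def is_connected(n, edges):
    """True if the graph on vertices 0..n-1 with the given edges is connected."""
    adj = {v: [] for v in range(n)}
    for _, u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == n

def reverse_delete(n, edges):
    """Drop edges from heaviest to lightest whenever the graph stays connected."""
    kept = sorted(edges, reverse=True)
    for e in list(kept):               # iterate over a snapshot while kept shrinks
        trial = [x for x in kept if x != e]
        if is_connected(n, trial):
            kept = trial               # e was not needed for connectivity
    return kept


example_edges = [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 1, 3), (5, 2, 3)]
mst = reverse_delete(4, example_edges)
print(sorted(mst), sum(w for w, _, _ in mst))  # [(1, 0, 1), (2, 1, 2), (4, 1, 3)] 7
```

On this example the edges of weight 5 and 3 are deleted and the remaining three edges form a spanning tree of total weight 7.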