Minimum Spanning Tree: Building the Perfect Network
The Story of the Lazy Builder
Imagine you’re a builder who needs to connect 5 houses with roads. You want everyone connected, but you’re lazy (smart!) — you want to use the least amount of road possible.
That’s exactly what a Minimum Spanning Tree (MST) does!
What is a Spanning Tree?
Think of a tree in your backyard. It has:
- One trunk (root)
- Branches connecting everything
- No loops — you can’t walk in circles
A Spanning Tree connects all points (nodes) in a graph with the minimum number of edges — and no cycles.
graph TD A["House A"] --- B["House B"] B --- C["House C"] C --- D["House D"] D --- E["House E"]
5 houses, 4 roads. Everyone connected. No loops!
What Makes it “Minimum”?
Each road has a cost (distance, money, time).
A Minimum Spanning Tree picks roads so the total cost is smallest.
| Road | Cost |
|---|---|
| A-B | 2 |
| A-C | 6 |
| B-C | 3 |
| B-D | 8 |
| C-D | 5 |
MST picks: A-B (2) + B-C (3) + C-D (5) = 10 total
Prim’s Algorithm: The Greedy Gardener
The Story
Imagine you’re growing a tree. You start with one seed (any node). Then you look around and ask:
“What’s the cheapest branch I can add right now?”
You pick it. Repeat until your tree reaches everyone.
How Prim’s Works
- Start at any node
- Look at all edges connecting your tree to outside nodes
- Pick the cheapest edge
- Add that node to your tree
- Repeat until all nodes are included
Example
Nodes: A, B, C, D
Edges: A-B(1), A-C(4), B-C(2), B-D(5), C-D(3)
Step by step:
| Step | Tree has | Cheapest edge | Add |
|---|---|---|---|
| 1 | {A} | A-B (1) | B |
| 2 | {A,B} | B-C (2) | C |
| 3 | {A,B,C} | C-D (3) | D |
MST edges: A-B, B-C, C-D Total cost: 1 + 2 + 3 = 6
Python Code
import heapq
def prims(graph, start):
mst = []
visited = {start}
edges = [(cost, start, to)
for to, cost in graph[start]]
heapq.heapify(edges)
while edges:
cost, frm, to = heapq.heappop(edges)
if to in visited:
continue
visited.add(to)
mst.append((frm, to, cost))
for next_to, next_cost in graph[to]:
if next_to not in visited:
heapq.heappush(edges,
(next_cost, to, next_to))
return mst
Key idea: Always pick the cheapest edge that expands your tree!
Kruskal’s Algorithm: The Edge Collector
The Story
Imagine you have all the roads in a pile. You sort them from cheapest to most expensive. Then you pick roads one by one:
“Does this road create a loop? No? Great, I’ll use it!”
How Kruskal’s Works
- Sort all edges by cost (smallest first)
- Pick the next cheapest edge
- Check if it creates a cycle
- If no cycle → add it to MST
- If cycle → skip it
- Repeat until you have (n-1) edges
Example
Edges sorted: A-B(1), B-C(2), C-D(3), A-C(4), B-D(5)
| Edge | Cost | Creates cycle? | Action |
|---|---|---|---|
| A-B | 1 | No | Add |
| B-C | 2 | No | Add |
| C-D | 3 | No | Add |
| A-C | 4 | Yes! (A-B-C-A) | Skip |
MST: A-B, B-C, C-D — Same result as Prim’s!
The Big Question
How do we quickly check if an edge creates a cycle?
Answer: Union-Find! (Coming next…)
Union-Find: The Family Reunion Tracker
The Story
Imagine a big family reunion. At first, everyone is a stranger. When two people shake hands, they become family friends.
You need to quickly answer:
“Are Alice and Bob already connected through friends?”
That’s what Union-Find does!
Two Operations
Find: “Who’s your family leader?”
Every person points to a leader. Follow the chain to find the root.
Union: “Let’s merge families!”
When two people connect, their families become one.
Basic Implementation
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
def find(self, x):
# Follow chain to root
while self.parent[x] != x:
x = self.parent[x]
return x
def union(self, x, y):
# Connect two families
root_x = self.find(x)
root_y = self.find(y)
if root_x != root_y:
self.parent[root_x] = root_y
Example: Tracking Connections
Initial: [0, 1, 2, 3] (everyone is their own parent)
union(0, 1): [1, 1, 2, 3] (0's parent is now 1)
union(2, 3): [1, 1, 3, 3] (2's parent is now 3)
union(1, 3): [1, 3, 3, 3] (1's parent is now 3)
find(0) → 0→1→3 → returns 3
find(2) → 2→3 → returns 3
Same root = Same family!
Union-Find Optimizations: Making it Lightning Fast
Problem with Basic Union-Find
Chains can get very long! Finding the root becomes slow.
Worst case: 0 → 1 → 2 → 3 → 4 → ... → 1000
That's 1000 steps just to find the root!
Optimization 1: Path Compression
Idea: While finding root, make everyone point directly to root!
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x])
return self.parent[x]
Before: 0 → 1 → 2 → 3 After find(0): 0 → 3, 1 → 3, 2 → 3
Now everyone reaches root in 1 step!
Optimization 2: Union by Rank
Idea: Always attach the shorter tree under the taller tree.
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def union(self, x, y):
root_x = self.find(x)
root_y = self.find(y)
if root_x == root_y:
return
# Attach smaller to larger
if self.rank[root_x] < self.rank[root_y]:
self.parent[root_x] = root_y
elif self.rank[root_x] > self.rank[root_y]:
self.parent[root_y] = root_x
else:
self.parent[root_y] = root_x
self.rank[root_x] += 1
How Fast Is It?
With both optimizations, Union-Find is nearly O(1) per operation!
Technically it’s O(α(n)) where α is the inverse Ackermann function — so slow-growing that for any practical n, it’s basically constant.
Kruskal’s with Union-Find: Complete Code
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(
self.parent[x])
return self.parent[x]
def union(self, x, y):
px, py = self.find(x), self.find(y)
if px == py:
return False
if self.rank[px] < self.rank[py]:
px, py = py, px
self.parent[py] = px
if self.rank[px] == self.rank[py]:
self.rank[px] += 1
return True
def kruskal(n, edges):
# edges: [(cost, u, v), ...]
edges.sort()
uf = UnionFind(n)
mst = []
for cost, u, v in edges:
if uf.union(u, v):
mst.append((u, v, cost))
return mst
Prim’s vs Kruskal’s: When to Use Which?
| Feature | Prim’s | Kruskal’s |
|---|---|---|
| Best for | Dense graphs | Sparse graphs |
| Approach | Grow one tree | Merge forests |
| Data structure | Priority queue | Union-Find |
| Time (with heap) | O(E log V) | O(E log E) |
Rule of thumb:
- Many edges? → Prim’s
- Few edges? → Kruskal’s
Real World Examples
Internet Cables
Connect data centers with minimum cable length → MST
Power Grid
Connect cities with minimum wire → MST
Road Networks
Connect villages with minimum road construction → MST
Summary
graph TD MST["Minimum Spanning Tree"] MST --> P["Prim&#39;s Algorithm] MST --> K[Kruskal&#39;s Algorithm"] K --> UF["Union-Find"] UF --> PC["Path Compression"] UF --> UR["Union by Rank"] P --> |Grow tree| G["Greedy expansion"] K --> |Sort edges| S["Pick cheapest safe edge"]
You learned:
- MST connects all nodes with minimum total edge weight
- Prim’s grows a tree from one node
- Kruskal’s sorts edges and picks safe ones
- Union-Find tracks connected components
- Path compression + union by rank make it super fast!
You’re now ready to build efficient networks!
