Minimum Spanning Tree


Minimum Spanning Tree: Building the Perfect Network

The Story of the Lazy Builder

Imagine you’re a builder who needs to connect 5 houses with roads. You want everyone connected, but you’re lazy (smart!) — you want to use the least amount of road possible.

That’s exactly what a Minimum Spanning Tree (MST) does!


What is a Spanning Tree?

Think of a tree in your backyard. It has:

  • One trunk (root)
  • Branches connecting everything
  • No loops — you can’t walk in circles

A Spanning Tree connects all points (nodes) in a graph using as few edges as possible: exactly n - 1 edges for n nodes, with no cycles.

graph TD
    A["House A"] --- B["House B"]
    B --- C["House C"]
    C --- D["House D"]
    D --- E["House E"]

5 houses, 4 roads. Everyone connected. No loops!
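
If you like to see it in code, here is a tiny check of that "n nodes, n - 1 edges" property, using the houses and roads from the diagram above:

houses = ['A', 'B', 'C', 'D', 'E']
roads = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E')]

# A spanning tree of n nodes always uses exactly n - 1 edges
print(len(roads) == len(houses) - 1)  # True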


What Makes it “Minimum”?

Each road has a cost (distance, money, time).

A Minimum Spanning Tree picks roads so the total cost is smallest.

| Road | Cost |
|------|------|
| A-B  | 2    |
| A-C  | 6    |
| B-C  | 3    |
| B-D  | 8    |
| C-D  | 5    |

MST picks: A-B (2) + B-C (3) + C-D (5) = 10 total
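
A quick sanity check of that total in Python, using the costs from the table above:

costs = {('A', 'B'): 2, ('A', 'C'): 6, ('B', 'C'): 3,
         ('B', 'D'): 8, ('C', 'D'): 5}
mst_edges = [('A', 'B'), ('B', 'C'), ('C', 'D')]

print(sum(costs[edge] for edge in mst_edges))  # 10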


Prim’s Algorithm: The Greedy Gardener

The Story

Imagine you’re growing a tree. You start with one seed (any node). Then you look around and ask:

“What’s the cheapest branch I can add right now?”

You pick it. Repeat until your tree reaches everyone.

How Prim’s Works

  1. Start at any node
  2. Look at all edges connecting your tree to outside nodes
  3. Pick the cheapest edge
  4. Add that node to your tree
  5. Repeat until all nodes are included

Example

Nodes: A, B, C, D
Edges: A-B(1), A-C(4), B-C(2), B-D(5), C-D(3)

Step by step:

| Step | Tree has | Cheapest edge | Add |
|------|----------|---------------|-----|
| 1    | {A}      | A-B (1)       | B   |
| 2    | {A,B}    | B-C (2)       | C   |
| 3    | {A,B,C}  | C-D (3)       | D   |

MST edges: A-B, B-C, C-D Total cost: 1 + 2 + 3 = 6

Python Code

import heapq

def prims(graph, start):
    # graph: {node: [(neighbor, cost), ...]} adjacency list
    mst = []
    visited = {start}
    # Min-heap of candidate edges leaving the tree: (cost, from, to)
    edges = [(cost, start, to)
             for to, cost in graph[start]]
    heapq.heapify(edges)

    while edges:
        cost, frm, to = heapq.heappop(edges)
        if to in visited:
            continue  # this edge leads back into the tree
        visited.add(to)
        mst.append((frm, to, cost))
        # The new node brings fresh candidate edges to the frontier
        for next_to, next_cost in graph[to]:
            if next_to not in visited:
                heapq.heappush(edges,
                    (next_cost, to, next_to))
    return mst
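
Here is a quick run on the example graph from above (the function expects an adjacency list of the form {node: [(neighbor, cost), ...]}):

graph = {
    'A': [('B', 1), ('C', 4)],
    'B': [('A', 1), ('C', 2), ('D', 5)],
    'C': [('A', 4), ('B', 2), ('D', 3)],
    'D': [('B', 5), ('C', 3)],
}

print(prims(graph, 'A'))
# [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 3)] -> total cost 6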

Key idea: Always pick the cheapest edge that expands your tree!


Kruskal’s Algorithm: The Edge Collector

The Story

Imagine you have all the roads in a pile. You sort them from cheapest to most expensive. Then you pick roads one by one:

“Does this road create a loop? No? Great, I’ll use it!”

How Kruskal’s Works

  1. Sort all edges by cost (smallest first)
  2. Pick the next cheapest edge
  3. Check if it creates a cycle
  4. If no cycle → add it to MST
  5. If cycle → skip it
  6. Repeat until you have (n-1) edges

Example

Edges sorted: A-B(1), B-C(2), C-D(3), A-C(4), B-D(5)

| Edge | Cost | Creates cycle? | Action |
|------|------|----------------|--------|
| A-B  | 1    | No             | Add    |
| B-C  | 2    | No             | Add    |
| C-D  | 3    | No             | Add    |
| A-C  | 4    | Yes (A-B-C-A)  | Skip   |

B-D (5) never gets examined: once three edges (n - 1 for four nodes) are in the MST, the tree is complete and we stop.

MST: A-B, B-C, C-D — Same result as Prim’s!

The Big Question

How do we quickly check if an edge creates a cycle?

Answer: Union-Find! (Coming next…)


Union-Find: The Family Reunion Tracker

The Story

Imagine a big family reunion. At first, everyone is a stranger. When two people shake hands, they become family friends.

You need to quickly answer:

“Are Alice and Bob already connected through friends?”

That’s what Union-Find does!

Two Operations

Find: “Who’s your family leader?”

Every person points to a leader. Follow the chain to find the root.

Union: “Let’s merge families!”

When two people connect, their families become one.

Basic Implementation

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        # Follow chain to root
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        # Connect two families
        root_x = self.find(x)
        root_y = self.find(y)
        if root_x != root_y:
            self.parent[root_x] = root_y

Example: Tracking Connections

Initial: [0, 1, 2, 3]  (everyone is their own parent)

union(0, 1):  [1, 1, 2, 3]  (0's parent is now 1)
union(2, 3):  [1, 1, 3, 3]  (2's parent is now 3)
union(1, 3):  [1, 3, 3, 3]  (1's parent is now 3)

find(0) → 0→1→3 → returns 3
find(2) → 2→3 → returns 3
Same root = Same family!
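
The same sequence, run with the UnionFind class above (the nodes are just the integers 0 to 3):

uf = UnionFind(4)
uf.union(0, 1)   # parent: [1, 1, 2, 3]
uf.union(2, 3)   # parent: [1, 1, 3, 3]
uf.union(1, 3)   # parent: [1, 3, 3, 3]

print(uf.find(0))                # 3
print(uf.find(2))                # 3
print(uf.find(0) == uf.find(2))  # True: same family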

Union-Find Optimizations: Making it Lightning Fast

Problem with Basic Union-Find

Chains can get very long! Finding the root becomes slow.

Worst case: 0 → 1 → 2 → 3 → 4 → ... → 1000
That's 1000 steps just to find the root!

Optimization 1: Path Compression

Idea: While finding root, make everyone point directly to root!

def find(self, x):
    if self.parent[x] != x:
        # Find the root, then point x straight at it (path compression)
        self.parent[x] = self.find(self.parent[x])
    return self.parent[x]

Before:        0 → 1 → 2 → 3
After find(0): 0 → 3, 1 → 3, 2 → 3

Now everyone reaches the root in one step!
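
To watch the flattening happen, here is a minimal sketch; it assumes a UnionFind class whose find is the path-compressed version above (the complete class in the Kruskal's section below does exactly that), and the parent list is hand-built just to create a chain:

uf = UnionFind(4)
uf.parent = [1, 2, 3, 3]   # hand-built chain: 0 -> 1 -> 2 -> 3

uf.find(0)                 # walks the chain and compresses it
print(uf.parent)           # [3, 3, 3, 3]: everyone now points straight at the root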

Optimization 2: Union by Rank

Idea: Always attach the shorter tree under the taller tree.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def union(self, x, y):
        root_x = self.find(x)
        root_y = self.find(y)
        if root_x == root_y:
            return
        # Attach smaller to larger
        if self.rank[root_x] < self.rank[root_y]:
            self.parent[root_x] = root_y
        elif self.rank[root_x] > self.rank[root_y]:
            self.parent[root_y] = root_x
        else:
            self.parent[root_y] = root_x
            self.rank[root_x] += 1

How Fast Is It?

With both optimizations, Union-Find is nearly O(1) per operation!

Technically it’s O(α(n)) where α is the inverse Ackermann function — so slow-growing that for any practical n, it’s basically constant.


Kruskal’s with Union-Find: Complete Code

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(
                self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        # Returns True if a merge happened, False if x and y
        # were already connected (the edge would create a cycle)
        px, py = self.find(x), self.find(y)
        if px == py:
            return False
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

def kruskal(n, edges):
    # edges: [(cost, u, v), ...]
    edges.sort()
    uf = UnionFind(n)
    mst = []
    for cost, u, v in edges:
        if uf.union(u, v):  # True means no cycle: keep the edge
            mst.append((u, v, cost))
    return mst
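
A quick run on the Kruskal's example from earlier, with A, B, C, D relabeled as 0, 1, 2, 3 since this version works with integer node ids:

edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 2), (5, 1, 3)]

print(kruskal(4, edges))
# [(0, 1, 1), (1, 2, 2), (2, 3, 3)] -> A-B, B-C, C-D, total cost 6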

Prim’s vs Kruskal’s: When to Use Which?

| Feature          | Prim's         | Kruskal's     |
|------------------|----------------|---------------|
| Best for         | Dense graphs   | Sparse graphs |
| Approach         | Grow one tree  | Merge forests |
| Data structure   | Priority queue | Union-Find    |
| Time (with heap) | O(E log V)     | O(E log E)    |

(O(E log E) and O(E log V) are the same asymptotically, since E ≤ V² means log E ≤ 2 log V; the practical difference comes from the data structures and how dense the graph is.)

Rule of thumb:

  • Many edges? → Prim’s
  • Few edges? → Kruskal’s

Real World Examples

Internet Cables

Connect data centers with minimum cable length → MST

Power Grid

Connect cities with minimum wire → MST

Road Networks

Connect villages with minimum road construction → MST


Summary

graph TD
    MST["Minimum Spanning Tree"]
    MST --> P["Prim's Algorithm"]
    MST --> K["Kruskal's Algorithm"]
    K --> UF["Union-Find"]
    UF --> PC["Path Compression"]
    UF --> UR["Union by Rank"]
    P -->|Grow tree| G["Greedy expansion"]
    K -->|Sort edges| S["Pick cheapest safe edge"]

You learned:

  • MST connects all nodes with minimum total edge weight
  • Prim’s grows a tree from one node
  • Kruskal’s sorts edges and picks safe ones
  • Union-Find tracks connected components
  • Path compression + union by rank make it super fast!

You’re now ready to build efficient networks!
