Minimum Spanning Trees

A minimum spanning tree is the same as any other fully connected graph, except that each of the edges (not paths) is minimal. Note that this does not mean all the paths are minimal. The minimum spanning tree is the (minimal, in terms of cost) set of edges required to join the graph. A further constraint is that the minimum spanning tree contains no cycles. This is because a cycle means there's an extra edge somewhere that isn't required.

Take for example this graph. It has 7 nodes, most of which with 3 or 4 edges connecting them to other nodes. Clearly, this is not a minimum spanning tree. However, if we remove all the edges labeled with red, we will have a min spanning tree.

 

Take, for example, this simple graph. First, note that the shortest path from any one node to any other node is 1. For a minimum spanning tree, we aren't interested in shortest paths. Our concern is having each node connected to the minimum spanning tree with the cheapest edge possible.

This is one possibility of a minimum spanning tree.
There are two other correct solutions.
One with edges { (A,B), (A,C) } and { (A,C), (B,C) }
The solution set is dependent on how a minimum path is chosen.


Kruskal's Algorithm

Kruskal's algorithm searches through the edges, picks the smallest one, and adds it to the tree, provided that edge doesn't create a cycle. Recall that a cycle is the condition where you can follow a series of edges and arrive back where you started without backtracking. Pseudo-code for Kruskal's Algorithm follows:

T = {}       // T is the set of pairs (edges) that make up the min span tree
 
while ( T has < n-1 edges and E not empty ) {
 
  pick (u,v) from E such that its weight is minimal
 
  delete (u,v) from E
 
  if adding (u,v) doesn't create a cycle, add it to T
 
}

Remarks

On completion, T will contain a list of pairs. E is a list of pairs, sorted in ascending order according to weights. For the graph above, we have the following list for E:

Weight (edge)

1  ( 1, 7 )
1  ( 2, 4 )
2  ( 0, 2 )
2  ( 4, 5 )
3  ( 0, 3 )
3  ( 2, 5 )
4  ( 1, 2 )
4  ( 4, 6 )
5  ( 0, 1 )
5  ( 5, 7 )
7  ( 0, 4 )
10 ( 3, 4 )

21 ( 3, 6 )

Trace through Kruskal's algorithm:

·  Initially, T is empty, so the first edge (1,7) is added.

·  The next edge, (2,4) is also added, since its addition to T does not create a cycle. At this point the spannning tree contains two, unconnected edges.

·  The process continues, adding the edges (0,2), (4,5), (0,3)

·  The addition of next edge, (2,5) will create a cycle (T already contains (2,4) and (4,5), so adding (2,5) is unnecessary).

·  The algorithm terminates when we've emptied out the list E. The list T will contain { (1,7), (2,4), (0,2), (4,5), (0,3), (1,2), (4,6) }, which are exactly the edges that make up the minimum spanning tree. Each edge is the cheapest one that connects that node to the rest of the tree.


Prim's Algorithm

Prim's algorithm takes a different approach. It maintains two lists, TV which is the nodes already in the tree, and T, the list of edges that makes up the spanning tree. In determining candidate edges for the tree, we look for a node that's in TV, and one that isn't, such that its path is minimum.

TV = { 0 }
 
T = { }
 
while( T has < n-1 edges ) {
 
  find (u,v) with least cost, such that
               u is in TV and v isn't in TV
 
  if no such edge exists, break
 
  add v to TV
 
  add (u,v) to T
}

On termination, if the graph is connected, TV will contain all the nodes in the graph, and T will contain the set edges comprising the minimum spanning tree.
Tracing through the algorithm, we get the following:

TV

T

{ 0 }

{ }

{ 0, 2 }

{ (0,2) }

{ 0, 2, 4 }

{ (0,2), (2,4) }

{ 0, 2, 4, 5 }

{ (0,2), (2,4), (4,5) }

{ 0, 2, 4, 5, 3 }

{ (0,2), (2,4), (4,5), (0,3) }

{ 0, 2, 4, 5, 3, 1 }

{ (0,2), (2,4), (4,5), (0,3), (2,1) }

{ 0, 2, 4, 5, 3, 1, 7 }

{ (0,2), (2,4), (4,5), (0,3), (2,1), (1,7) }

{ 0, 2, 4, 5, 3, 1, 7, 6 }

{ (0,2), (2,4), (4,5), (0,3), (2,1), (1,7), (4,6) }

On the last step, it is impossible to find a pair (u,v) where u is in TV, and v isn't, because we have all the nodes already in TV.


This page and contents maintained by Erik Huizing