Strongly Connected Components

1. Introduction

A strongly connected component (SCC) of a directed graph is a maximal set of vertices such that there is a directed path from every vertex in the set to every other vertex in the set. In other words, within an SCC, every vertex is reachable from every other vertex by following the direction of edges.

Decomposing a directed graph into its strongly connected components is a fundamental operation in graph theory with wide-ranging applications. The decomposition partitions the vertex set into disjoint groups, and the resulting structure -- called the condensation graph -- is always a directed acyclic graph (DAG), which can be processed using topological ordering.

"The strongly connected components of a directed graph represent the finest partition of the vertices such that within each part, every vertex can reach every other." -- Robert Tarjan

2. Definitions and Properties

Strongly Connected

A directed graph is strongly connected if there exists a directed path from every vertex to every other vertex. A strongly connected component is a maximal subgraph with this property -- meaning no additional vertices can be added to it while maintaining strong connectivity.

Key Properties

Every vertex belongs to exactly one SCC.
A single vertex with no self-loop forms its own SCC (it is trivially strongly connected).
If vertex u can reach vertex v and v can reach u, then u and v are in the same SCC.
The condensation graph (SCCs as nodes) is always a DAG.
A directed graph is strongly connected if and only if it has exactly one SCC.

Relationship to undirected graphs: The undirected analog of strongly connected components is simply "connected components." In undirected graphs, connectivity is symmetric, so any two vertices that can reach each other are automatically in the same component.

3. Kosaraju's Algorithm

Kosaraju's algorithm (also known as Kosaraju-Sharir algorithm) finds all SCCs using two passes of depth-first search. It relies on the key insight that the SCCs of a graph and its transpose (reversed edges) are identical.

Steps

First DFS pass: Perform DFS on the original graph, recording the finish order of vertices in a stack.
Transpose the graph: Reverse the direction of every edge to create the transpose graph G^T.
Second DFS pass: Process vertices from the stack (in reverse finish order) and perform DFS on the transpose graph. Each DFS tree in this pass forms one SCC.

functionKosaraju(Graph): stack = empty visited = empty set // Pass 1: DFS on original graph, record finish orderfor each vertex v in Graph.vertices: if v not in visited: DFS1(Graph, v, visited, stack) // Transpose the graph G_T = Transpose(Graph) // Pass 2: DFS on transposed graph in reverse finish order visited = empty set SCCs = [] while stack is not empty: v = stack.pop() if v not in visited: component = [] DFS2(G_T, v, visited, component) SCCs.append(component) return SCCsfunctionDFS1(Graph, v, visited, stack): visited.add(v) for each neighbor u of v: if u not in visited: DFS1(Graph, u, visited, stack) stack.push(v)functionDFS2(Graph, v, visited, component): visited.add(v) component.append(v) for each neighbor u of v: if u not in visited: DFS2(Graph, u, visited, component)

4. Tarjan's Algorithm

Tarjan's algorithm finds all SCCs in a single DFS pass, making it more elegant and slightly more efficient in practice. It maintains a stack and uses two values per vertex: a discovery index and a "low-link" value.

disc[v]: The order in which vertex v was discovered during DFS.
low[v]: The smallest discovery index reachable from v through the DFS subtree and back edges.

A vertex v is the root of an SCC if low[v] == disc[v]. When such a root is found, all vertices on the stack above and including v form one SCC.

functionTarjan(Graph): index = 0 stack = empty SCCs = [] for each vertex v in Graph.vertices: if disc[v] is undefined: StrongConnect(v) return SCCsfunctionStrongConnect(v): disc[v] = low[v] = index index += 1 stack.push(v) onStack[v] = truefor each neighbor w of v: if disc[w] is undefined: StrongConnect(w) low[v] = min(low[v], low[w]) else if onStack[w]: low[v] = min(low[v], disc[w]) // If v is a root of an SCCif low[v] == disc[v]: component = [] do: w = stack.pop() onStack[w] = false component.append(w) while w != v SCCs.append(component)

5. Complexity Analysis

Algorithm	Time	Space	DFS Passes
Kosaraju's	O(V + E)	O(V + E)	2 (plus graph transpose)
Tarjan's	O(V + E)	O(V)	1

Both algorithms run in linear time, which is optimal since every vertex and edge must be examined at least once. Tarjan's algorithm has a slight practical advantage because it requires only one DFS traversal and does not need to construct the transpose graph, saving both time and memory.

6. Worked Example

Consider a directed graph with 8 vertices (A-H) and the following edges:

From	To
A	B
B	C
C	A
B	D
D	E
E	F
F	D
G	F
G	H
H	G

Tracing Tarjan's Algorithm

Starting DFS from vertex A:

Visit A (disc=0, low=0)

Stack: [A]. Explore neighbor B.

Visit B (disc=1, low=1)

Stack: [A, B]. Explore neighbor C.

Visit C (disc=2, low=2)

Stack: [A, B, C]. Neighbor A is on stack: low[C] = min(2, 0) = 0. Backtrack.

Back at B

low[B] = min(1, low[C]=0) = 0. Explore neighbor D.

Visit D (disc=3, low=3)

Stack: [A, B, C, D]. Explore neighbor E.

Visit E (disc=4, low=4)

Stack: [A, B, C, D, E]. Explore neighbor F.

Visit F (disc=5, low=5)

Stack: [A, B, C, D, E, F]. Neighbor D is on stack: low[F] = min(5, 3) = 3. Backtrack.

Back at E

low[E] = min(4, 3) = 3. Backtrack.

Back at D

low[D] = min(3, 3) = 3. Since low[D] == disc[D], D is an SCC root. Pop stack until D: SCC 1 = {F, E, D}

Back at B, then A

low[A] = 0, disc[A] = 0. SCC root. Pop stack until A: SCC 2 = {C, B, A}

Start new DFS from G (disc=6, low=6)

Stack: [G]. Explore H.

Visit H (disc=7, low=7)

Stack: [G, H]. Neighbor G is on stack: low[H] = min(7, 6) = 6. Backtrack.

Back at G

low[G] = min(6, 6) = 6. Since low[G] == disc[G], SCC root. Pop: SCC 3 = {H, G}

Result: Three SCCs

SCC	Vertices	Description
SCC 1	{A, B, C}	Cycle: A->B->C->A
SCC 2	{D, E, F}	Cycle: D->E->F->D
SCC 3	{G, H}	Cycle: G->H->G

7. Condensation Graph

The condensation graph (also called the component graph or meta-graph) is formed by contracting each SCC into a single vertex. Edges between SCCs in the original graph become edges between the corresponding vertices in the condensation.

From our worked example, the condensation graph has three vertices:

SCC{A,B,C} -> SCC{D,E,F} (via edge B->D)
SCC{G,H} -> SCC{D,E,F} (via edge G->F)

Key property: The condensation graph is always a DAG (directed acyclic graph). If there were a cycle among SCCs, all the vertices in those SCCs could reach each other, contradicting the maximality of each SCC. This DAG structure enables topological sorting and further analysis.

8. Advantages and Disadvantages

Kosaraju's Algorithm

Advantage: Conceptually simpler to understand and prove correct.
Advantage: Naturally identifies SCCs in reverse topological order of the condensation.
Disadvantage: Requires two DFS passes and construction of the transpose graph.
Disadvantage: Higher memory usage due to storing the transpose graph.

Tarjan's Algorithm

Advantage: Single DFS pass, more efficient in practice.
Advantage: Lower memory usage -- no transpose graph needed.
Advantage: Can be extended to find articulation points and bridges.
Disadvantage: More complex to implement and understand (low-link concept).
Disadvantage: The recursive implementation can cause stack overflow on very large graphs.

9. Comparison of Approaches

Feature	Kosaraju's	Tarjan's	Path-Based
DFS Passes	2	1	1
Transpose Graph	Required	Not needed	Not needed
Extra Data	Finish-time stack	Low-link values	Two stacks
Time	O(V + E)	O(V + E)	O(V + E)
Ease of Proof	Straightforward	Moderate	Moderate
Practical Speed	Good	Best	Good

10. Applications

2-SAT solver: The satisfiability of 2-SAT formulas can be determined in linear time by computing the SCCs of the implication graph.
Compiler optimization: Identifying cycles in call graphs and data dependency graphs for optimization.
Web graph analysis: Analyzing the structure of the World Wide Web, where SCCs represent "communities" of interlinked pages.
Social network analysis: Finding tightly-knit groups where every member can reach every other member through directed connections.
Model checking: Verifying temporal properties of finite-state systems by analyzing the SCC structure of state-transition graphs.
Deadlock detection: Identifying potential deadlocks in concurrent systems by finding cycles in resource allocation graphs.

11. References

Tarjan, R. E. (1972). "Depth-First Search and Linear Graph Algorithms." SIAM Journal on Computing, 1(2), 146-160.
Sharir, M. (1981). "A Strong-Connectivity Algorithm and Its Applications in Data Flow Analysis." Computers and Mathematics with Applications, 7(1), 67-72.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press. Chapter 22.
Sedgewick, R. and Wayne, K. (2011). Algorithms (4th ed.). Addison-Wesley. Section 4.2.
Aspvall, B., Plass, M. F., and Tarjan, R. E. (1979). "A Linear-Time Algorithm for Testing the Truth of Certain Quantified Boolean Formulas." Information Processing Letters, 8(3), 121-123.