!main_tags! Graphs - Data Structures | What's Your IQ !main_header!

Introduction

Graph models relationships between entities. Vertices (nodes) represent entities, edges represent relationships. Fundamental: networks (social networks, computer networks, road networks), dependencies (software modules, course prerequisites), knowledge (semantic networks, knowledge graphs).

Key abstraction: more flexible than trees (edges can form cycles, multiple paths between vertices). Enables modeling real-world complexity. Algorithms: finding shortest paths (GPS navigation), detecting cycles (deadlock detection), finding connected components (social network clustering).

Challenges: large graphs (billions of vertices, trillion edges). Algorithms must scale. Representations trade-offs: adjacency list (space-efficient for sparse graphs) vs. adjacency matrix (access-efficient for dense graphs).

"Graphs are everywhere. From social networks to protein interactions, from road systems to neural networks, graphs are the language of connected worlds." -- Graph theory in application

Graph Terminology and Types

Core Concepts

  • Vertex (Node): Entity in graph
  • Edge: Relationship between two vertices
  • Degree: Number of edges incident to vertex
  • Path: Sequence of vertices connected by edges
  • Cycle: Path starting and ending at same vertex
  • Connected: Path exists between any two vertices

Graph Types

Type Edges Properties Examples
Undirected Bidirectional Symmetric: (u,v) = (v,u) Friendship, roads
Directed (Digraph) One direction In-degree, out-degree Follow relationship, dependencies
Weighted Numeric value Edge weight matters Distances, costs, capacities
Cyclic Has cycles General graphs Most real graphs
DAG No cycles Acyclic directed Precedence, inheritance

Density

Sparse: few edges (O(V) edges). Dense: many edges (O(V^2) edges). Sparse graphs common in practice. Representation choice affects performance.

Graph Representations: Adjacency List and Matrix

Adjacency List

For each vertex, store list of adjacent vertices.
graph = {
 0: [1, 2],
 1: [0, 3],
 2: [0, 3],
 3: [1, 2]
}

Space: O(V + E)
Edge lookup: O(degree of vertex), typically O(average degree)
Iteration over edges: O(E)

Adjacency Matrix

V x V boolean matrix. matrix[i][j] = True if edge (i,j).
 0 1 2 3
0 [0, 1, 1, 0]
1 [1, 0, 0, 1]
2 [1, 0, 0, 1]
3 [0, 1, 1, 0]

Space: O(V^2)
Edge lookup: O(1)
Iteration over edges: O(V^2)

Comparison

Adjacency List: Space-efficient for sparse graphs. Iteration faster. Edge lookup slower.

Adjacency Matrix: Fast edge lookup. Space wasteful for sparse graphs. Cache-friendly (contiguous memory).

Weighted Graphs

Adjacency list: store (neighbor, weight) pairs. Adjacency matrix: store weight instead of boolean.

Depth-First Search (DFS)

Algorithm (Recursive)

def dfs(vertex, visited, graph):
 visited.add(vertex)
 print(vertex)
 for neighbor in graph[vertex]:
 if neighbor not in visited:
  dfs(neighbor, visited, graph)

Call: visited = set(); dfs(start, visited, graph)

Time: O(V + E), Space: O(V) for visited set + recursion

Iterative DFS (Using Stack)

def dfs_iterative(start, graph):
 visited = set()
 stack = Stack()
 stack.push(start)
 while not stack.is_empty():
 vertex = stack.pop()
 if vertex not in visited:
  print(vertex)
  visited.add(vertex)
  for neighbor in graph[vertex]:
  if neighbor not in visited:
   stack.push(neighbor)

Applications

  • Topological sort (DAG)
  • Cycle detection
  • Connected components
  • Strongly connected components (SCC)

Breadth-First Search (BFS)

Algorithm (Using Queue)

def bfs(start, graph):
 visited = {start}
 queue = Queue()
 queue.enqueue(start)
 while not queue.is_empty():
 vertex = queue.dequeue()
 print(vertex)
 for neighbor in graph[vertex]:
  if neighbor not in visited:
  visited.add(neighbor)
  queue.enqueue(neighbor)

Time: O(V + E), Space: O(V)

Shortest Path in Unweighted Graphs

Track parent pointers. Reconstruct path by backtracking from destination to source.

BFS naturally finds shortest path (explores by levels).

Applications

  • Shortest path (unweighted)
  • Connectivity testing
  • Level-order tree traversal

Connectivity and Components

Connected Components

Partition vertices into groups where internal vertices connected, external disconnected. DFS or BFS from unvisited vertex finds one component.

def find_components(graph):
 visited = set()
 components = []
 for vertex in graph:
 if vertex not in visited:
  component = []
  dfs_collect(vertex, visited, graph, component)
  components.append(component)
 return components

Time: O(V + E)

Strongly Connected Components (SCC)

Directed graph: SCC is maximal set of vertices mutually reachable. Kosaraju's or Tarjan's algorithm: O(V + E).

Bipartiteness

Graph bipartite if vertices 2-colorable (no adjacent same color). BFS coloring: assign colors level-by-level, check no conflicts.

Shortest Path: Dijkstra and Bellman-Ford

Dijkstra's Algorithm

Single-source shortest path (all destinations). Greedy: always expand shortest unknown distance. Requires non-negative weights.

Algorithm:
1. Initialize distances: source = 0, others = infinity
2. While unvisited vertices:
 - Pick unvisited with smallest distance
 - For each neighbor:
 - Relax edge: if distance[neighbor] > distance[vertex] + weight(edge):
  distance[neighbor] = distance[vertex] + weight(edge)

Time: O((V + E) log V) with heap, O(V^2) with array

Bellman-Ford Algorithm

Handles negative weights (not negative cycles). Relaxes all edges V-1 times. Can detect negative cycles.

Time: O(V * E), Space: O(V)

For each of V-1 iterations:
 For each edge (u, v, w):
 relax(u, v, w)

All-Pairs Shortest Path

Floyd-Warshall: O(V^3) time, O(V^2) space. Dynamic programming on intermediate vertices. Simpler than Dijkstra for all-pairs.

Minimum Spanning Trees (MST)

Definition

Subset of edges connecting all vertices with minimum total weight. Unique if weights distinct. Tree: V vertices, V-1 edges, acyclic, connected.

Kruskal's Algorithm (Edge-Based)

1. Sort edges by weight ascending
2. For each edge:
 - If endpoints not connected, add edge
 - Use union-find for connectivity

Time: O(E log E) for sorting + O(E α(V)) for union-find ≈ O(E log E)

Prim's Algorithm (Vertex-Based)

1. Start at arbitrary vertex
2. While tree incomplete:
 - Add cheapest edge connecting tree to non-tree
 - Expand tree

Time: O((V + E) log V) with heap

Applications

  • Network design (minimize cable cost)
  • Cluster analysis
  • Image processing (superpixels)

Topological Sorting

Definition

Linear ordering of vertices such that for every edge (u, v), u comes before v. Only possible for DAGs (no cycles).

DFS-Based Algorithm

def topological_sort(graph):
 visited = set()
 order = []
 def dfs(vertex):
 visited.add(vertex)
 for neighbor in graph[vertex]:
  if neighbor not in visited:
  dfs(neighbor)
 order.append(vertex) # Add after visiting all descendants
 for vertex in graph:
 if vertex not in visited:
  dfs(vertex)
 return reversed(order)

Time: O(V + E)

Applications

  • Build systems (compile dependencies)
  • Course prerequisites
  • Task scheduling

Cycle Detection

Undirected Graphs

DFS: if reach visited non-parent vertex, cycle exists. Equivalently: number edges > V-1 in connected component.

Directed Graphs

DFS with colors: white (unvisited), gray (visiting), black (visited). Back edge (gray to gray) indicates cycle.

def has_cycle_directed(graph):
 color = {v: 'white' for v in graph}
 def dfs(vertex):
 color[vertex] = 'gray'
 for neighbor in graph[vertex]:
  if color[neighbor] == 'gray':
  return True # Back edge = cycle
  if color[neighbor] == 'white' and dfs(neighbor):
  return True
 color[vertex] = 'black'
 return False
 for vertex in graph:
 if color[vertex] == 'white':
  if dfs(vertex):
  return True
 return False

Applications

  • Deadlock detection (circular wait in resource requests)
  • Dependency breaking (break cycles in dependency graphs)

Applications in Networks and Routing

Social Networks

Vertices: users. Edges: friendships. Analyze: clustering coefficient (friend groups), diameter (degrees of separation), influence propagation.

Network Routing

Dijkstra finds shortest path (IP routing uses variations). BGP uses modified shortest path with policy constraints.

Recommendation Systems

User-item graph: edges between users and items they interact with. Collaborative filtering: recommend items connected to user's items.

Knowledge Graphs

Entities (people, places) as vertices. Relations as edges. Traversal finds knowledge (path from person A to person B via "friend of" edges).

Computer Networks

Vertices: computers, routers. Edges: connections. Analysis: redundancy (multiple paths), connectivity, bottlenecks.

References

  • Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. "Introduction to Algorithms." MIT Press, 3rd edition, 2009.
  • Diestel, R. "Graph Theory." Springer, 5th edition, 2016.
  • Sedgewick, R., and Wayne, K. "Algorithms, Part II." Addison-Wesley, 4th edition, 2011.
  • Bondy, J. A., and Murty, U. S. R. "Graph Theory." Springer, 2008.
  • West, D. B. "Introduction to Graph Theory." Prentice Hall, 2nd edition, 2001.
!main_footer!