Introduction
Linked lists are dynamic data structures whose elements (nodes) are connected via pointers/references. Unlike arrays (fixed contiguous block), linked lists grow dynamically, with nodes scattered in memory. Trade-off: O(1) insertion/deletion at a known position vs. O(n) search (no random access).
Variants: singly linked (each node points to next), doubly linked (each node points to prev and next), circular (last node points back to first). Choice depends on access patterns and operations needed.
Rarely used in modern high-performance code (arrays/vectors superior for most tasks due to cache locality). But conceptually important: foundation for other structures (hash tables with chaining, graph adjacency lists), educational value (pointer manipulation, dynamic memory).
"Linked lists teach pointer mastery and dynamic thinking. While arrays outperform in practice, understanding linked lists deepens understanding of memory and structure design." -- Algorithms educators
Node Structure and Pointers
Basic Node
class Node:
    def __init__(self, data):
        self.data = data  # Element value
        self.next = None  # Pointer to next node
List Representation
Linked list defined by head pointer (first node). Traversal: follow next pointers until None (end). Example: 1 → 2 → 3 → None represents list [1, 2, 3].
Memory Layout
Nodes scattered throughout memory (unlike arrays). Each node stores data + pointer. Pointer enables connection despite non-contiguity. Flexible growth (allocate new node as needed, link it).
Pointer Operations
Create node: Node(value). Follow pointer: current = current.next. Modify pointer: node.next = new_node. Insert between: node1.next = new_node; new_node.next = node2.
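These operations can be sketched end-to-end in a toy walkthrough (the Node class is repeated here so the snippet stands alone):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

# Build 1 -> 3, then splice 2 in between
node1 = Node(1)           # create node
node2 = Node(3)
node1.next = node2        # modify pointer

new_node = Node(2)
node1.next = new_node     # insert between: node1 -> new_node
new_node.next = node2     #                 new_node -> node2

# Follow pointers to collect values
values, current = [], node1
while current:
    values.append(current.data)
    current = current.next
# values is now [1, 2, 3]
```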
Singly Linked Lists
Structure
Each node has data and next pointer. Traversal unidirectional (forward only). Head points to first node, None marks end.
head → [1|•] → [2|•] → [3|•] → [None]
node1 node2 node3
Insertion at Head
def insert_head(head, data):
    new_node = Node(data)
    new_node.next = head
    return new_node
Time: O(1), Space: O(1)
Insertion at Position
def insert_at(head, position, data):
    if position == 0:
        return insert_head(head, data)
    current = head
    for i in range(position - 1):
        if current is None:
            return head  # Invalid position
        current = current.next
    if current is None:
        return head  # Position past end of list
    new_node = Node(data)
    new_node.next = current.next
    current.next = new_node
    return head
Time: O(position), Space: O(1)
Deletion
def delete_head(head):
    if head is None:
        return None
    return head.next

def delete_at(head, position):
    if position == 0:
        return delete_head(head)
    current = head
    for i in range(position - 1):
        if current is None:
            return head
        current = current.next
    if current and current.next:  # Guard against positions past the end
        current.next = current.next.next
    return head
Time: O(position), Space: O(1)
Search
def search(head, value):
    current = head
    while current:
        if current.data == value:
            return current
        current = current.next
    return None
Time: O(n), Space: O(1)
Doubly Linked Lists
Structure
Each node has data, next pointer, and previous pointer. Bidirectional traversal.
head ↔ [1] ↔ [2] ↔ [3] ↔ None
prev/next pointers in each direction
Node Definition
class DNode:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None
Insertion at Position
More complex: must update both next and prev pointers of affected nodes.
new_node.next = current.next
new_node.prev = current
if current.next:
    current.next.prev = new_node
current.next = new_node
Time: O(position) to reach position, O(1) to insert
Space: O(1)
Advantages Over Singly
- Backward traversal possible (find previous node O(1) vs. O(n) in singly)
- Deletion given node pointer O(1) (don't need previous node)
- Reverse iteration
Disadvantages
- Extra memory: prev pointer per node (~2x pointer overhead)
- More complex operations (update two pointers)
- Not worth overhead for simple sequential access
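The O(1) deletion advantage can be sketched directly: given only a pointer to a node, its prev/next links let us unlink it without traversal (DNode is repeated here so the snippet stands alone):

```python
class DNode:
    def __init__(self, data):
        self.data = data
        self.next = None
        self.prev = None

def delete_node(node):
    """Unlink node in O(1) using its prev/next pointers."""
    if node.prev:
        node.prev.next = node.next
    if node.next:
        node.next.prev = node.prev

# Build 1 <-> 2 <-> 3, then delete the middle node directly
a, b, c = DNode(1), DNode(2), DNode(3)
a.next, b.prev = b, a
b.next, c.prev = c, b
delete_node(b)  # no search for the predecessor needed
```

In a singly linked list the same deletion requires an O(n) walk from the head to find the predecessor.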
Circular Linked Lists
Structure
Last node points back to first (instead of None). No explicit end. Circular traversal: loop infinitely or detect cycle by tracking visited.
head → [1|•] → [2|•] → [3|•]
          ↑_______________│   (last node's next points back to node 1)
Insertion
Insert at head: new_node.next = head, last_node.next = new_node. Must maintain last pointer for O(1) tail insertion.
Traversal Caution
Loop condition matters: `current is not None` is never false in a circular list (there is no None terminator), so a plain `while current` loop never ends. Instead, advance first and stop when current returns to the starting node (`current == head`).
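The insertion and safe-traversal patterns above can be sketched together (a minimal version that tracks the `last` pointer, as suggested; names are illustrative):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def insert_head_circular(last, data):
    """Insert at head of a circular list; `last` points to the tail.
    Returns the (possibly updated) last pointer."""
    new_node = Node(data)
    if last is None:
        new_node.next = new_node  # single node points to itself
        return new_node
    new_node.next = last.next     # old head
    last.next = new_node          # tail now points to new head
    return last

def to_list(last):
    """Traverse exactly once around; stop on returning to the head."""
    if last is None:
        return []
    values = []
    head = last.next
    current = head
    while True:
        values.append(current.data)
        current = current.next
        if current is head:       # back at the start: stop
            break
    return values

last = None
for x in [3, 2, 1]:
    last = insert_head_circular(last, x)
# to_list(last) is [1, 2, 3]
```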
Use Cases
- Round-robin scheduling: keep pointer to current, advance on each scheduling
- Game turn order: circular player list
- Buffer rings (circular buffers)
Basic Operations: Insert, Delete, Search
| Operation | Singly (Best) | Singly (Worst) | Doubly (Best) | Doubly (Worst) |
|---|---|---|---|---|
| Insert head | O(1) | O(1) | O(1) | O(1) |
| Insert tail | O(n) | O(n) | O(1)* | O(1)* |
| Insert position i | O(i) | O(n) | O(min(i,n-i)) | O(n) |
| Delete head | O(1) | O(1) | O(1) | O(1) |
| Delete position i | O(i) | O(n) | O(1)* | O(n) |
| Search | O(1) | O(n) | O(1) | O(n) |
| Access index i | O(i) | O(n) | O(i) | O(n) |
*O(1) with a maintained tail pointer (inserts) or a direct pointer to the node (deletes); both add maintenance overhead
Time and Space Complexity Analysis
Time Complexity
Access: O(n) (no random access, must traverse). Insert/Delete: O(n) worst case (if position unknown or at end), O(1) if at known pointer. Search: O(n) (linear search).
Space Complexity
Storage: O(n) for n elements. Each node: data + pointer(s). Singly: 1 pointer per node. Doubly: 2 pointers. Auxiliary: O(1) for most operations (in-place modifications).
Why Linked Lists Underperform
- Cache misses: Non-contiguous memory. Next pointer often points to distant address. CPU cache ineffective.
- Pointer dereferencing: Extra CPU cycles per access (follows pointer).
- Memory overhead: Pointers per node (8-16 bytes each). For small data (integer: 4 bytes), pointer overhead significant.
Practical Performance
Linked list often 10-100x slower than array for same operations despite theoretical equivalence. Real-world: arrays dominate. Linked lists justified only when: frequent insertion/deletion at known positions, size wildly variable, avoiding pre-allocation.
Traversal and Iteration
Forward Traversal (Singly)
def traverse(head):
    current = head
    while current:
        print(current.data)
        current = current.next
Backward Traversal (Doubly)
def traverse_backward(tail):
    current = tail
    while current:
        print(current.data)
        current = current.prev
Iterator Pattern
Encapsulate traversal: an iterator object keeps the current position and yields successive elements (in Python via the __iter__/__next__ protocol or a generator). Enables for-loop syntax:
for data in LinkedList(head):
    print(data)
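A minimal sketch of this iterator pattern (the LinkedList wrapper is an assumption for illustration, not a standard class):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    """Thin wrapper whose __iter__ yields node data in order."""
    def __init__(self, head):
        self.head = head

    def __iter__(self):
        current = self.head
        while current:
            yield current.data
            current = current.next

# Build 1 -> 2 -> 3 and iterate with a for-loop
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)
collected = [data for data in LinkedList(head)]
```

Using a generator keeps the traversal state implicit; an explicit iterator class with __next__ would work the same way.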
List Manipulation: Reverse, Merge, Sort
Reverse
def reverse(head):
    prev = None
    current = head
    while current:
        next_node = current.next
        current.next = prev
        prev = current
        current = next_node
    return prev
Time: O(n), Space: O(1) in-place
Merge (Two Sorted Lists)
def merge(l1, l2):
    dummy = Node(0)
    current = dummy
    while l1 and l2:
        if l1.data <= l2.data:
            current.next = l1
            l1 = l1.next
        else:
            current.next = l2
            l2 = l2.next
        current = current.next
    current.next = l1 or l2
    return dummy.next
Time: O(n+m), Space: O(1) (reuses nodes)
Sort (Merge Sort)
Merge sort natural for linked lists (unlike quicksort which wants random access). O(n log n) time, O(log n) space (recursion stack). Split list by slow/fast pointers, recurse, merge.
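A sketch of list merge sort combining the slow/fast split with the merge pattern shown above (from_list/to_list are helpers added here for illustration):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None

def merge_sort(head):
    """Sort a singly linked list: split via slow/fast pointers, recurse, merge."""
    if head is None or head.next is None:
        return head
    # Split: slow stops at the middle while fast runs to the end
    slow, fast = head, head.next
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    mid = slow.next
    slow.next = None
    left, right = merge_sort(head), merge_sort(mid)
    # Merge the two sorted halves (same pattern as merge() above)
    dummy = Node(0)
    current = dummy
    while left and right:
        if left.data <= right.data:
            current.next, left = left, left.next
        else:
            current.next, right = right, right.next
        current = current.next
    current.next = left or right
    return dummy.next

def from_list(values):
    head = None
    for v in reversed(values):
        node = Node(v)
        node.next = head
        head = node
    return head

def to_list(head):
    out = []
    while head:
        out.append(head.data)
        head = head.next
    return out
```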
Memory Management and Allocation
Manual Memory Management (C)
Allocate: malloc(sizeof(Node)). Deallocate: free(node_ptr). Forgetting free() causes memory leak. Double-free causes crash. Error-prone.
Garbage Collection (Python, Java)
Automatic deallocation: when no references to node, garbage collector frees. Programmer doesn't explicitly free. Safer, simpler.
Smart Pointers (C++)
Ownership semantics: unique_ptr (exclusive ownership), shared_ptr (shared ownership). Automatic deallocation when ownership released. Modern C++ preferred over manual.
Memory Fragmentation
Scattered allocations fragment heap. Allocating/deallocating many small objects creates gaps. Arrays avoid (single contiguous allocation). Linked lists suffer (many small allocations).
Linked Lists vs. Arrays
| Aspect | Array | Linked List |
|---|---|---|
| Random Access | O(1) | O(n) |
| Insert/Delete Head | O(n) | O(1) |
| Insert/Delete Middle | O(n) | O(n) search + O(1) modification |
| Space | O(n) + capacity overhead | O(n) + pointer overhead |
| Cache Performance | Excellent | Poor |
| Practical Speed | Fast | Slow (10-100x) |
| Sortability | Efficient (quicksort, mergesort) | Limited (merge sort only) |
Conclusion
Arrays win in practice. Linked lists educational, useful in specific scenarios (LRU cache, undo stacks via doubly-linked). Modern preference: vectors/dynamic arrays (combine array benefits + growth).
Applications
LRU Cache (Doubly Linked List + Hash Map)
Doubly-linked list maintains access order. Hash map enables O(1) lookup. Evict LRU (oldest): remove from head.
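A minimal sketch of this design, with the oldest entry kept just after a sentinel head and the newest just before a sentinel tail (class and method names are illustrative):

```python
class _LRUNode:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    """Hash map for O(1) lookup + doubly linked list for access order."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}
        self.head = _LRUNode(None, None)  # sentinel: oldest side
        self.tail = _LRUNode(None, None)  # sentinel: newest side
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_back(self, node):
        node.prev, node.next = self.tail.prev, self.tail
        self.tail.prev.next = node
        self.tail.prev = node

    def get(self, key):
        if key not in self.map:
            return None
        node = self.map[key]
        self._unlink(node)            # move to most-recent position
        self._push_back(node)
        return node.value

    def put(self, key, value):
        if key in self.map:
            self._unlink(self.map.pop(key))
        elif len(self.map) >= self.capacity:
            oldest = self.head.next   # evict LRU from the front
            self._unlink(oldest)
            del self.map[oldest.key]
        node = _LRUNode(key, value)
        self.map[key] = node
        self._push_back(node)
```

Sentinel nodes mean every real node always has non-None prev/next, so _unlink and _push_back need no edge-case branches.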
Undo/Redo (Doubly Linked List)
Sequence of operations. Undo: traverse backward. Redo: traverse forward.
Skip List
Probabilistic data structure. Multiple levels of linked lists. Level 0: full list. Level 1: every 2nd element. Level 2: every 4th. Binary search-like performance O(log n).
Hash Table (with Chaining)
Collision resolution: store colliding elements in linked list at hash index. O(1) average insertion, O(length of chain) search.
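A sketch of chaining with a fixed number of buckets (a toy illustration, not a production table; no resizing):

```python
class _ChainNode:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.next = None

class ChainedHashTable:
    """Fixed-size table; colliding keys share a linked chain per bucket."""
    def __init__(self, size=8):
        self.buckets = [None] * size

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        i = self._index(key)
        node = self.buckets[i]
        while node:                        # update an existing key in place
            if node.key == key:
                node.value = value
                return
            node = node.next
        new_node = _ChainNode(key, value)  # prepend: O(1) insertion
        new_node.next = self.buckets[i]
        self.buckets[i] = new_node

    def get(self, key):
        node = self.buckets[self._index(key)]
        while node:                        # O(chain length) search
            if node.key == key:
                return node.value
            node = node.next
        return None
```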
Graph Adjacency Lists
Sparse graph representation. For each vertex: linked list of neighbors. Memory-efficient compared to adjacency matrix.
Polynomial Representation
Polynomial terms: linked list of (coefficient, exponent) pairs. Efficient sparse representation.
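The sparse representation can be sketched as a term list kept in descending-exponent order, with addition as a single merge-style pass (names are illustrative):

```python
class Term:
    def __init__(self, coeff, exp):
        self.coeff, self.exp = coeff, exp
        self.next = None

def poly_from_pairs(pairs):
    """Build a term list from (coeff, exp) pairs, highest exponent first."""
    head = None
    # Prepend in ascending-exponent order so the list ends highest-first
    for coeff, exp in sorted(pairs, key=lambda p: p[1]):
        node = Term(coeff, exp)
        node.next = head
        head = node
    return head

def poly_add(p, q):
    """Add two term lists kept in descending-exponent order."""
    dummy = Term(0, 0)
    current = dummy
    while p and q:
        if p.exp == q.exp:
            coeff = p.coeff + q.coeff
            if coeff != 0:               # drop cancelled terms
                current.next = Term(coeff, p.exp)
                current = current.next
            p, q = p.next, q.next
        elif p.exp > q.exp:
            current.next = Term(p.coeff, p.exp)
            current = current.next
            p = p.next
        else:
            current.next = Term(q.coeff, q.exp)
            current = current.next
            q = q.next
    rest = p or q
    while rest:                          # append whichever list remains
        current.next = Term(rest.coeff, rest.exp)
        current = current.next
        rest = rest.next
    return dummy.next

def poly_pairs(head):
    out = []
    while head:
        out.append((head.coeff, head.exp))
        head = head.next
    return out
```

Only nonzero terms are stored, so a polynomial like x^1000 + 1 needs two nodes rather than a 1001-slot array.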