Introduction
Directory structure: system organizing files in hierarchical manner. Purpose: enable efficient file storage, retrieval, and management. Essential for multi-user, multitasking OS environments. Enables namespace management, access control, and metadata storage.
"A well-designed directory structure is fundamental to file system integrity and performance." -- Andrew S. Tanenbaum
Basic Concepts
Directory
Container mapping file names to file metadata or locations. Abstracts physical storage details from users. Supports hierarchical organization: directories contain files and subdirectories.
File System
Logical structure managing files and directories on storage devices. Includes naming, storage allocation, and access control mechanisms.
Hierarchy
Tree-like structure with root directory as origin. Nodes represent directories or files. Enables path-based addressing and inheritance of permissions.
Directory Types
Single-Level Directory
Flat structure: all files in one directory. Simple, but name collisions common. Poor scalability for large systems.
Two-Level Directory
Separate directory per user. Reduces collisions, improves organization. Limited inter-user collaboration.
Hierarchical (Tree) Directory
Multiple levels of directories and subdirectories. Supports complex organization, sharing, and path navigation.
Acyclic-Graph Directory
Allows shared subdirectories or files via multiple parent directories. Prevents cycles to maintain hierarchy integrity.
General Graph Directory
Supports cycles using reference counts or garbage collection. Complex but allows maximum flexibility.
Directory Organizations
Linear List
Directory entries stored sequentially. Simple but inefficient for large directories due to linear search.
Hash Table
Directory entries stored via hash keys. Enables O(1) average lookup time. Collision handling needed.
Tree Structures
B-Trees or balanced trees for indexing directory entries. Efficient for very large directories.
Path Structures
Absolute Path
Full path from root directory to target file/directory. Unambiguous, system-wide unique.
Relative Path
Path relative to current working directory. Allows flexible navigation and command execution.
Path Components
Sequence of directory names separated by delimiters (e.g., '/'). Includes special entries '.' (current) and '..' (parent).
Directory Operations
Create
Allocate new directory entry. Initialize metadata. Update parent directory accordingly.
Delete
Remove directory entry. Deallocate associated storage. Ensure no dangling references.
Search
Locate file/directory by name using directory structure. Efficiency depends on organization.
Rename
Modify entry name or move entry between directories. Update links and metadata.
Directory Implementation
Entry Structure
Contains file name, pointer to inode or metadata block, timestamps, permissions.
Storage
Stored in fixed-size blocks on disk. May use linked lists, indexed blocks, or hash tables internally.
Caching
Directory entries cached in memory to reduce disk I/O latency. Uses buffer cache or page cache.
Access Control in Directories
Permissions
Read, write, execute permissions managed per directory. Control file visibility and modification.
Ownership
Directories owned by users/groups. Ownership defines default permissions and inheritance.
Access Control Lists (ACLs)
Fine-grained permission control listing specific user/group rights.
Directory Algorithms
Search Algorithm
Linear search for lists. Hash lookup for hash tables. Tree traversal for B-trees.
Insertion Algorithm
Append to list or insert in hash bucket/tree. Maintain ordering or balancing.
Deletion Algorithm
Remove entry, update pointers or rehash. Rebalance trees if necessary.
Algorithm: Directory Search (Hash Table)Input: directory D, file name F1. Compute hash index h = hash(F)2. Access bucket B = D[h]3. For each entry E in B: if E.name == F: return E.pointer4. Return not foundAlgorithm: Directory Insertion (B-Tree)Input: directory BTree, entry E1. Locate leaf node L where E.name fits2. Insert E into L in sorted order3. If L overflows: Split L into L1, L2 Promote middle key to parent4. Repeat splits up to root if necessaryAdvantages and Limitations
Advantages
Efficient file organization. Simplifies file management. Supports name uniqueness. Enables access control and sharing.
Limitations
Complexity increases with depth. Performance degrades with large directories if poorly organized. Maintaining consistency and preventing cycles challenging.
Comparison of Directory Structures
| Structure | Advantages | Disadvantages |
|---|---|---|
| Single-Level | Simple, easy implementation | Name conflicts, poor scalability |
| Two-Level | User isolation, fewer conflicts | Limited sharing, duplication |
| Hierarchical | Flexible organization, sharing | Complexity, potential cycles |
| Acyclic Graph | Supports sharing, no cycles | Complex reference management |
Case Studies
Unix File System (UFS)
Hierarchical directory structure. Uses inodes for metadata. Supports hard and symbolic links. Pathnames absolute or relative. Permission bits and ACLs for access control.
Windows NTFS
Hierarchical directories stored as B+ trees. Master File Table (MFT) stores file metadata. Supports complex ACLs, encryption, and journaling. Efficient large directory management.
Google File System (GFS)
Flat namespace with directory-like abstractions. Metadata server manages file to chunk mappings. Designed for distributed large-scale data storage with fault tolerance.
References
- Silberschatz, A., Galvin, P. B., Gagne, G. "Operating System Concepts," 10th Ed., Wiley, 2018, pp. 250-280.
- Tanenbaum, A. S., Bos, H. "Modern Operating Systems," 4th Ed., Pearson, 2015, pp. 130-170.
- Stallings, W. "Operating Systems: Internals and Design Principles," 9th Ed., Pearson, 2018, pp. 310-350.
- McKusick, M. K., Neville-Neil, G. V. "The Design and Implementation of the FreeBSD Operating System," 2nd Ed., Addison-Wesley, 2014, pp. 100-140.
- Ghemawat, S., Gobioff, H., Leung, S.-T. "The Google File System," ACM SIGOPS Operating Systems Review, Vol. 37, No. 5, 2003, pp. 29-43.