RAID 6 provides more redundancy than RAID 5 by storing two independent parity blocks per stripe instead of one, so the array can survive two simultaneous disk failures.
There are two general types of files: stream/sequential and random access.
Accessing file contents can be done through I/O system calls such as read, write, lseek, etc., or through memory mapped files.
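The two access styles can be compared side by side. The following is a minimal Python sketch, assuming a small throwaway file created just for the demonstration; the helper names read_with_syscalls and read_with_mmap are illustrative and do not come from the notes.

```python
import mmap
import os
import tempfile


def read_with_syscalls(path, offset, length):
    """Read `length` bytes at `offset` using open, lseek and read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)  # moves the kernel's read/write offset
        return os.read(fd, length)
    finally:
        os.close(fd)


def read_with_mmap(path, offset, length):
    """Read the same bytes by mapping the file into memory and slicing it."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Hypothetical demo file, created only for this example.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello, file systems")
    demo_path = tmp.name

via_syscalls = read_with_syscalls(demo_path, 7, 4)
via_mmap = read_with_mmap(demo_path, 7, 4)
os.remove(demo_path)
```

Both calls return the same four bytes; the difference is that the system call version goes through the kernel's per-file offset, while the mapped version treats the file as an ordinary byte array.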
The File Descriptor Table holds the kernel state maintained for each open file, including the read/write offset.
The File Descriptor Table is kept as an array of structures inside kernel memory to prevent user programs from modifying it directly.
User programs specify files using a file descriptor, which is an index into the kernel’s array.
The File Descriptor Table also contains function pointers for the Virtual File System (VFS) abstraction.
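As a rough illustration of the points above, the table can be modeled as an array of entries indexed by file descriptor. This is a toy user-space model, not kernel code: the class names OpenFile and FdTable, the MEM_OPS dictionary standing in for VFS function pointers, and the lowest-free-slot policy are all assumptions made for the sketch.

```python
class OpenFile:
    """Toy stand-in for one open-file entry."""

    def __init__(self, data, ops):
        self.data = data    # file contents (a real kernel would reference an inode)
        self.offset = 0     # read/write offset, kept in the table, not by the program
        self.ops = ops      # "function pointers" chosen by the file system type


def mem_read(entry, n):
    """Read up to n bytes from the entry and advance its offset."""
    chunk = entry.data[entry.offset:entry.offset + n]
    entry.offset += len(chunk)
    return chunk


MEM_OPS = {"read": mem_read}  # hypothetical operations table for an in-memory file


class FdTable:
    """Array of open-file entries; a file descriptor is an index into it."""

    def __init__(self):
        self.entries = []

    def open(self, data):
        entry = OpenFile(data, MEM_OPS)
        for fd, slot in enumerate(self.entries):
            if slot is None:            # reuse the lowest free slot
                self.entries[fd] = entry
                return fd
        self.entries.append(entry)
        return len(self.entries) - 1

    def _lookup(self, fd):
        if not 0 <= fd < len(self.entries) or self.entries[fd] is None:
            raise OSError("bad file descriptor")
        return self.entries[fd]

    def read(self, fd, n):
        entry = self._lookup(fd)
        return entry.ops["read"](entry, n)  # dispatch through the ops table

    def close(self, fd):
        self._lookup(fd)
        self.entries[fd] = None
```

The caller only ever holds the integer; the offset and the operations table stay inside the table, which is the property the notes attribute to keeping the structure in kernel memory.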
A bitmap is a common file system data structure for tracking free blocks: it plays the role of a free list, much as in memory management (and in the final program). The File Allocation Table (FAT) is another allocation structure, used for increased performance.
An inode (index node) here means a real inode, not what the final program calls an inode; it contains the file's metadata.
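A bitmap used as a free list can be sketched in a few lines. This is a minimal illustration, assuming one bit per block with 0 meaning free and a first-fit scan; the class name BlockBitmap is hypothetical.

```python
class BlockBitmap:
    """One bit per disk block: 0 = free, 1 = in use."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Return the lowest-numbered free block and mark it used."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("no free blocks")

    def free(self, block):
        """Clear the block's bit so it can be allocated again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
```

A bitmap costs one bit per block regardless of how many blocks are free, which keeps the structure small and makes it cheap to find neighbouring free blocks.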
Consistency checks on blocks: each block should appear only once in a file and only in one file, and should not appear in both a file and the free list.
Disk snapshots: an easy way to protect against accidentally deleted files (and an easy way to recover them).
After a crash, ‘replay’ the log instead of running a full consistency check.
Fixing the file system's data structures doesn’t guarantee that the file contents are correct.
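Log replay can be illustrated with a toy journal. The record layout below (tuples of "write" and "commit" with a transaction id) is an assumption made for this sketch; the notes only state that the log is replayed instead of running a full check.

```python
def replay(log, disk):
    """Apply logged writes that belong to committed transactions.

    log  : list of ("write", txid, block, data) or ("commit", txid) records
    disk : dict mapping block number -> bytes, modified in place
    """
    committed = {record[1] for record in log if record[0] == "commit"}
    for record in log:
        if record[0] == "write" and record[1] in committed:
            _, _, block, data = record
            disk[block] = data  # redo the write; repeating it is harmless
    return disk


# Hypothetical state after a crash: transaction 1 committed, transaction 2 did not.
demo_disk = {0: b"old0", 1: b"old1"}
demo_log = [
    ("write", 1, 0, b"new0"),
    ("commit", 1),
    ("write", 2, 1, b"new1"),
]
replay(demo_log, demo_disk)
```

Recovery time depends on the length of the log rather than the size of the disk, which is why replaying is much cheaper than scanning every block.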
Deep copy is too expensive, so snapshots use a shallow copy
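The shallow-copy idea can be sketched as follows: a snapshot copies only the table that maps file names to block ids, while the block contents stay shared. This is a toy model under stated assumptions: the Volume class and its methods are hypothetical, new writes always go to fresh blocks, and blocks are never reclaimed.

```python
class Volume:
    """Toy volume: a shared block store plus a name -> block-id table."""

    def __init__(self):
        self.blocks = {}   # block id -> bytes, shared by live files and snapshots
        self.files = {}    # file name -> list of block ids
        self.next_id = 0

    def write_file(self, name, chunks):
        ids = []
        for chunk in chunks:
            self.blocks[self.next_id] = chunk  # always a fresh block, never overwrite
            ids.append(self.next_id)
            self.next_id += 1
        self.files[name] = ids

    def delete_file(self, name):
        del self.files[name]   # blocks stay in place (this toy never frees them)

    def snapshot(self):
        # Shallow copy: duplicates the block-id lists, not the block contents.
        return {name: list(ids) for name, ids in self.files.items()}

    def read(self, table, name):
        return b"".join(self.blocks[i] for i in table[name])


vol = Volume()
vol.write_file("notes.txt", [b"he", b"llo"])
snap = vol.snapshot()
shared_ids = snap["notes.txt"] == vol.files["notes.txt"]
vol.delete_file("notes.txt")
recovered = vol.read(snap, "notes.txt")
```

Taking the snapshot costs time proportional to the metadata, not the data, and a deleted file remains readable through the snapshot's table.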
In inodes, file and directory extents form a tree instead of a list, which gives faster random access.
Inodes provide varying levels of tree depth to handle large and small files efficiently.
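To make the varying tree depth concrete, the sketch below computes how many levels of indirection are needed to reach a given block of a file. The layout is an assumption in the style of classic Unix inodes (12 direct pointers, then single, double and triple indirect blocks, with 1024 pointers per indirect block); the notes do not specify these numbers.

```python
DIRECT_POINTERS = 12     # assumed number of direct block pointers in the inode
PTRS_PER_BLOCK = 1024    # assumed: 4096-byte blocks holding 4-byte pointers


def indirection_level(block_index):
    """Return 0 for a direct block, 1-3 for single/double/triple indirect."""
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    if block_index < DIRECT_POINTERS:
        return 0
    block_index -= DIRECT_POINTERS
    span = PTRS_PER_BLOCK            # blocks reachable at the current level
    for level in (1, 2, 3):
        if block_index < span:
            return level
        block_index -= span
        span *= PTRS_PER_BLOCK
    raise ValueError("block index beyond maximum file size")
```

Under these assumptions a small file is reached with no extra disk reads for pointers, while only very large files pay for two or three levels of lookup.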
Soft links store the target file path in the inode (final program), work across different file systems, and break when the target file is moved or renamed.
Hard links allow multiple directory entries to refer to the same inode number. The inode keeps a reference count (nlinks) so that data isn’t deleted while still in use, and undeleting files involves checking the nlinks field.
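The difference between the two link types can be observed from user space. This is a small sketch for a POSIX system, using a temporary directory and file names invented for the demonstration.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()                 # scratch directory for the demo
target = os.path.join(workdir, "target.txt")
hard = os.path.join(workdir, "hard.txt")
soft = os.path.join(workdir, "soft.txt")

with open(target, "w") as f:
    f.write("data")

os.link(target, hard)      # second directory entry for the same inode
os.symlink(target, soft)   # separate file that stores the target's path

same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
nlinks_before = os.stat(target).st_nlink

# Rename the original name: the hard link still works, the soft link breaks.
moved = os.path.join(workdir, "moved.txt")
os.rename(target, moved)

with open(hard) as f:
    hard_contents = f.read()
soft_is_broken = os.path.islink(soft) and not os.path.exists(soft)
stored_path = os.readlink(soft)

# Removing one name only decrements the count; the data survives via `hard`.
os.remove(moved)
nlinks_after = os.stat(hard).st_nlink

shutil.rmtree(workdir)
```

The soft link keeps pointing at the old path string, while the hard link follows the inode, so only the soft link is affected by the rename.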
Disk blocks are shared between different snapshots
Performance considerations: disks are orders of magnitude slower than RAM, so contents are kept in a cache. This raises the question of what happens when cache contents are inconsistent with disk contents, for example when a drive is disconnected.
Minimizing data corruption involves writing important metadata through the cache, providing a system call that lets user-level programs force a write to disk (fsync), and nagging users.
Fixing corruption involves reading through every disk block and looking for inconsistencies, such as leaked blocks that belong to no file and are missing from the free list, and cross-checking nlinks counts.
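From a user-level program, forcing a write out of the cache looks roughly like this. The helper name durable_write and the demo file are illustrative; os.fsync is Python's wrapper around the fsync system call.

```python
import os
import tempfile


def durable_write(path, data):
    """Write `data` to `path` and ask the kernel to flush it to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):            # os.write may write fewer bytes
            written += os.write(fd, data[written:])
        os.fsync(fd)                          # block until cached data is flushed
    finally:
        os.close(fd)


# Hypothetical demo: write a small record and read it back.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "record.bin")
durable_write(demo_file, b"important record")
with open(demo_file, "rb") as f:
    read_back = f.read()
os.remove(demo_file)
os.rmdir(demo_dir)
```

Without the fsync call the write may sit in the cache for some time, which is exactly the window in which a disconnected drive loses data.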
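The block-level checks can be sketched as a single pass over a toy file system. The input format (a dict of file names to block lists plus a set of free blocks) and the problem labels are assumptions made for this example, not the layout of any real checker.

```python
from collections import Counter


def check_blocks(files, free_list, num_blocks):
    """Return a list of (problem, block, detail) tuples.

    files      : dict of file name -> list of block numbers
    free_list  : set of block numbers marked free
    num_blocks : total number of blocks on the toy disk
    """
    problems = []
    owners = {}  # block -> names of files that reference it
    for name, blocks in files.items():
        for block, count in Counter(blocks).items():
            if count > 1:
                problems.append(("duplicate_in_file", block, name))
            owners.setdefault(block, []).append(name)

    for block, names in owners.items():
        if len(names) > 1:
            problems.append(("shared_between_files", block, tuple(sorted(names))))
        if block in free_list:
            problems.append(("used_and_free", block, names[0]))

    for block in range(num_blocks):
        if block not in owners and block not in free_list:
            problems.append(("leaked", block, None))  # in no file and not free
    return problems


# Hypothetical damaged state exercising each kind of inconsistency.
demo_problems = check_blocks(
    files={"a": [0, 1, 1], "b": [1, 2]},
    free_list={2, 3},
    num_blocks=5,
)
```

The scan has to visit every block, which is why a full check is slow on large disks and why replaying a log is preferred when one is available.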