

  • Storage management
    • Oversight of the data storage equipment used to store user- and computer-generated data
    • Involves the tools and processes administrators use to keep data and storage equipment safe
    • Primary objectives: optimizing storage device usage and protecting data integrity
    • Subcategories include security, virtualization, provisioning, and automation
    • Attributes include performance, reliability, recoverability, and capacity management
  • Mass-storage structure

    The bulk of secondary storage for modern computers is provided by hard disk drives (HDDs) and nonvolatile memory (NVM) devices
  • Hard Disk Drives (HDDs)

    • Consist of disk platters with magnetic surfaces for storing information
    • Information is magnetically recorded and read using read-write heads that fly just above the platter surfaces
    • Platters are logically divided into circular tracks, subdivided into sectors; the set of tracks at the same arm position across all platters forms a cylinder
    • Sector size transitioned from 512 bytes to 4 KB around 2010
    • Drives spin at speeds measured in RPM (revolutions per minute); common speeds include 5,400, 7,200, 10,000, and 15,000 RPM
    • Transfer rate relates to rotation speed; positioning time involves seek time and rotational latency
    • Typical transfer rates range from tens to hundreds of megabytes per second, with seek times and rotational latencies in milliseconds
    • DRAM buffers in drive controllers enhance performance
    • Disk heads float on a thin cushion of air or gas, risking head crashes if they make contact with the disk surface
    • Head crashes usually require disk replacement, resulting in data loss unless backed up or RAID protected
    • HDDs are sealed units; some chassis allow removal without system shutdown for expansion or replacement
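    The latency figures above follow directly from the rotation speed. A quick sketch (the 7,200 RPM and 150 MB/s numbers are assumed example values, not from a specific drive):

    ```python
    # Rough HDD timing arithmetic (illustrative numbers, not a specific drive).
    def avg_rotational_latency_ms(rpm: int) -> float:
        # One full rotation takes 60/rpm seconds; on average the head
        # waits half a rotation for the target sector to come around.
        return (60.0 / rpm) / 2 * 1000

    def transfer_time_ms(nbytes: int, mb_per_s: float) -> float:
        # Time to move nbytes at a sustained transfer rate of mb_per_s MB/s.
        return nbytes / (mb_per_s * 1_000_000) * 1000

    print(f"avg rotational latency at 7,200 RPM: {avg_rotational_latency_ms(7200):.2f} ms")
    print(f"4 KB transfer at 150 MB/s:           {transfer_time_ms(4096, 150):.3f} ms")
    ```

    Note how the mechanical delay (about 4 ms of latency, plus a comparable seek) dwarfs the sub-millisecond transfer time of a single sector, which is why scheduling and buffering matter so much for HDDs.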
  • Nonvolatile Memory Devices (NVM)

    • Comprised of a controller and flash NAND semiconductor chips
    • Utilized in various forms such as SSDs, USB drives, DRAM-stick-style (DIMM) modules, and smartphone storage
    • More reliable and faster than HDDs due to lack of moving parts and absence of seek time or rotational latency
    • Consumes less power but is more expensive per megabyte compared to HDDs
    • Capacity has increased rapidly while prices have dropped, leading to increased usage
    • Some NVM devices connect directly to the system bus (e.g., PCIe) for enhanced throughput
    • NAND semiconductors have specific characteristics, including page-based reads and writes, block-based erasure, and a limited number of program-erase cycles; endurance is commonly rated in drive writes per day (DWPD)
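    The DWPD rating reduces to simple arithmetic; the capacity, rating, and warranty figures below are assumed examples:

    ```python
    # DWPD (drive writes per day) endurance sketch; 1 TB, 0.5 DWPD and a
    # 5-year warranty are illustrative assumptions, not a real drive's spec.
    def total_write_endurance_tb(capacity_tb: float, dwpd: float, years: float) -> float:
        # A drive rated at D DWPD may absorb D full-capacity writes per day
        # for every day of the warranty period.
        return capacity_tb * dwpd * years * 365

    print(total_write_endurance_tb(1.0, 0.5, 5), "TB written over the warranty")
    ```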
  • Volatile Memory Devices
    • Used as temporary mass-storage devices despite being volatile
    • RAM drives, created by device drivers, carve out a section of the system's DRAM and present it as a storage device
    • Contents do not survive system crashes or power loss
    • Found in major operating systems (Linux, macOS, Windows) for temporary data storage
    • Useful for high-speed temporary storage, faster even than NVM devices
  • Secondary Storage Connection Methods
    • Secondary storage devices connect via system or I/O buses such as ATA, SATA, eSATA, SAS, USB, and FC
    • NVMe is a specialized interface for NVM devices, offering increased throughput and decreased latency
    • Data transfers are managed by controllers or host-bus adapters (HBA), with host controllers at the computer end and device controllers in storage devices
    • Mass storage I/O operations involve commands sent from host to device controllers, with data transfer occurring between cache and storage media, and between cache and host DRAM via DMA
  • Magnetic Tapes
    • Early secondary-storage medium with slow access times compared to HDDs and SSDs
    • Mainly used for backup, storing infrequently accessed data, and transferring information between systems
    • Tape is wound on spools and moved past a read-write head; capacities exceed several terabytes
    • Some tapes feature built-in compression for increased effective storage
    • Categorized by width (e.g., 4mm, 8mm, 19mm, 1/4 inch, 1/2 inch) or technology (e.g., LTO, SDLT)
  • HDD Scheduling
    1. FCFS (first-come, first-served)
    2. SCAN
    3. C-SCAN
  • FCFS Scheduling
    • First-Come, First-Served (FCFS) or FIFO scheduling
    • Requests are serviced in the order they arrive
    • Fair but generally slow, since servicing requests strictly in arrival order can drag the arm back and forth across the disk
  • SCAN Scheduling
    • Also known as the elevator algorithm
    • Disk arm moves from one end of the disk to the other, servicing requests along the way
    • After reaching the end, the direction is reversed, and servicing continues
    • Provides faster service compared to FCFS by minimizing seek time
  • C-SCAN Scheduling
    • Circular SCAN (C-SCAN) variant designed for more uniform wait times
    • Similar to SCAN but does not service requests on the return trip
    • After reaching the end, immediately returns to the beginning of the disk, avoiding delays for requests waiting at the end
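    The three algorithms above can be compared on one request queue; the 0-199 cylinder range, queue, and starting head position below are assumed example values:

    ```python
    # Total head movement (in cylinders) for FCFS, SCAN, and C-SCAN.

    def fcfs(head, requests):
        total = 0
        for r in requests:           # service strictly in arrival order
            total += abs(head - r)
            head = r
        return total

    def scan(head, requests, max_cyl=199):
        # Sweep toward higher cylinders, run to the end of the disk, then reverse.
        up = sorted(r for r in requests if r >= head)
        down = sorted((r for r in requests if r < head), reverse=True)
        total, pos = 0, head
        for r in up:
            total += r - pos
            pos = r
        if down:
            total += max_cyl - pos   # travel to the disk's last cylinder
            pos = max_cyl
            for r in down:
                total += pos - r
                pos = r
        return total

    def c_scan(head, requests, max_cyl=199):
        # Like SCAN, but wrap back to cylinder 0 instead of servicing on the return.
        up = sorted(r for r in requests if r >= head)
        low = sorted(r for r in requests if r < head)
        total, pos = 0, head
        for r in up:
            total += r - pos
            pos = r
        if low:
            total += (max_cyl - pos) + max_cyl  # to the end, then the wrap to 0
            pos = 0
            for r in low:
                total += r - pos
                pos = r
        return total

    queue, head = [98, 183, 37, 122, 14, 124, 65, 67], 53
    print("FCFS  :", fcfs(head, queue))    # 640 cylinders
    print("SCAN  :", scan(head, queue))    # 331 cylinders
    print("C-SCAN:", c_scan(head, queue))  # 382 cylinders
    ```

    SCAN here runs to the physical end of the disk before reversing (the LOOK variant would stop at the last request), and the C-SCAN total counts the wrap-around travel; conventions for counting the wrap vary between texts.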
  • Error Detection and Correction
    • Error detection identifies problems like bit flipping using methods such as parity or CRC
    • Error-correction codes (ECC) can detect and correct some errors, distinguishing between soft and hard errors
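    A parity check and a CRC can be sketched in a few lines; Python's `binascii.crc32` stands in here for the CRC logic a drive controller would implement in hardware:

    ```python
    import binascii

    def even_parity_bit(data: bytes) -> int:
        # Choose the parity bit so data + parity contains an even number of 1 bits.
        return sum(bin(b).count("1") for b in data) % 2

    block = b"hello"
    parity = even_parity_bit(block)
    crc = binascii.crc32(block)

    # Flip a single bit: both checks now disagree with the stored values.
    corrupted = bytes([block[0] ^ 0x01]) + block[1:]
    assert even_parity_bit(corrupted) != parity
    assert binascii.crc32(corrupted) != crc
    ```

    A single parity bit detects any one-bit flip but misses a two-bit flip; CRCs also catch burst errors, and ECC schemes (e.g. Hamming codes) go further by locating the bad bit so it can be corrected, not just detected.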
  • Storage Device Management
    1. New storage devices require low-level formatting, dividing into sectors/pages and initializing
    2. Partitioning divides the device into groups of blocks/pages, with file systems treating each partition as a separate device
    3. Volume creation and management may involve multiple partitions or devices used together, such as RAID sets
    4. Logical formatting creates file system data structures on the device, including maps of free and allocated space
    5. File systems group blocks into clusters for efficiency, and some systems allow raw I/O bypassing file-system services for specific applications like swap space or databases
  • Partition
    • Divide the device into one or more groups of blocks or pages
    • Each partition is treated as a separate device
    • Can allocate partitions for various purposes like storing executable code, swap space, or user files
    • Some systems perform partitioning automatically when managing an entire device
    • Partition information is stored in a fixed format at a fixed location on the storage device
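    "Fixed format at a fixed location" can be made concrete with the classic MBR layout: sector 0 of the device holds four 16-byte partition entries starting at byte offset 446, followed by the 0x55AA signature. The entry parsed below is synthetic, built just for illustration:

    ```python
    import struct

    # One 16-byte MBR partition entry: boot flag (0x80 = bootable), 3-byte CHS
    # start (skipped), partition type, 3-byte CHS end (skipped), then the
    # little-endian starting LBA and sector count.
    def parse_mbr_entry(entry: bytes) -> dict:
        boot, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", entry)
        return {"bootable": boot == 0x80, "type": ptype,
                "lba_start": lba_start, "sectors": sectors}

    # Synthetic entry: bootable, type 0x83 (Linux), starting at LBA 2048,
    # 2,097,152 sectors of 512 bytes = 1 GiB.
    entry = struct.pack("<B3xB3xII", 0x80, 0x83, 2048, 2 * 1024 * 1024)
    print(parse_mbr_entry(entry))
    ```

    Modern systems usually use GPT instead of MBR, but the principle is the same: the partition map lives at a well-known offset in a well-known binary format so any OS can find it.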
  • Volume
    • Create and manage volumes, which may involve multiple partitions or devices
    • Volumes can be implicit, with file systems placed directly within partitions, or explicit, especially in RAID setups
    • Volume creation prepares the volume for mounting and use
  • Logical Formatting
    • Creation of a file system on the device
    • Initial file-system data structures are stored onto the device, including maps of free and allocated space and an empty directory
    • Partition information indicates if a partition contains a bootable file system, establishing the root of the file system
    • Mounted file systems are made available for use by the system and its users
    • Most file systems group blocks into larger chunks called clusters for efficiency
    • File-system I/O is done in clusters to favor sequential access and reduce random accesses
    • Some systems allow partitions to be used as large sequential arrays of logical blocks, termed raw disks, bypassing file-system services for specific applications like swap space or databases
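    The block-to-cluster grouping is simple integer arithmetic; the 8-blocks-per-cluster figure below (4 KB clusters over 512-byte blocks) is an assumed example:

    ```python
    # Map a logical block number to (cluster index, block offset within cluster),
    # assuming 512-byte blocks grouped 8 per 4 KB cluster.
    BLOCKS_PER_CLUSTER = 8

    def block_to_cluster(block: int) -> tuple[int, int]:
        return divmod(block, BLOCKS_PER_CLUSTER)

    print(block_to_cluster(19))  # (2, 3): block 19 is the 4th block of cluster 2
    ```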
  • Boot Block

    • Initial bootstrap loader stored in NVM flash memory firmware on the motherboard
    • Instructions for initializing system components and loading the full bootstrap program
    • Full bootstrap program stored in boot blocks at a fixed location on the device
    • Boot partition contains the operating system and device drivers
    • Boot process involves loading boot code from the master boot record (MBR) or boot sector into memory
    • Failure scenarios may require disk replacement or handling of defective sectors
  • Storage Attachment
    • Computers access secondary storage in three ways: via host-attached storage, network-attached storage, and cloud storage
  • Host-Attached Storage
    • Storage accessed through local I/O ports, often using SATA
    • Additional storage can be connected via USB, FireWire, Thunderbolt, or FC
    • Suitable storage devices include HDDs, NVM devices, optical drives, tape drives, and SANs
    • I/O commands for host-attached storage involve reads and writes of logical data blocks
  • Network-Attached Storage
    • Provides access to storage across a network
    • Accessed via protocols like NFS (Network File System) for UNIX/Linux and CIFS (Common Internet File System) for Windows
    • Commonly implemented as storage arrays with software implementing the RPC interface
    • Supports file sharing between hosts with locking features
    • Less efficient and lower performance compared to host-attached storage
  • Cloud Storage
    • Provides access to storage over the Internet or WAN to remote data centers
    • Accessed via APIs instead of traditional protocols like NFS or CIFS
    • Examples include Amazon S3, Dropbox, Microsoft OneDrive, and Apple iCloud
    • Designed to handle latency and failure scenarios of WANs
  • Storage-Area Networks
    • Private network connecting servers and storage units
    • Flexibility to connect multiple hosts and storage arrays
    • Dynamically allocates storage to hosts and supports RAID protection
    • SAN connectivity over short distances with no routing
    • Typically more ports and higher cost compared to NAS
  • Storage Array
    • Purpose-built device with SAN ports, network ports, drives, and controller(s)
    • Controllers manage storage and provide access across networks
    • Features may include RAID protection, snapshots, replication, compression, and encryption
    • Can include SSDs for performance or a mix of SSDs and HDDs for capacity
  • SAN interconnects
    FC and iSCSI are common, with InfiniBand (IB) also used
  • RAID
    Redundant arrays of independent disks: multiple drives combined to improve performance and reliability
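    The reliability half of RAID can be illustrated with the XOR parity used by RAID 4/5: the parity block is the bytewise XOR of the data blocks, so any single lost block can be rebuilt from the survivors.

    ```python
    # RAID-style XOR parity sketch over equal-sized blocks.
    def xor_blocks(*blocks: bytes) -> bytes:
        out = bytearray(len(blocks[0]))
        for blk in blocks:
            for i, b in enumerate(blk):
                out[i] ^= b
        return bytes(out)

    d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
    parity = xor_blocks(d0, d1, d2)

    # Simulate losing d1: XOR of the surviving blocks plus parity rebuilds it.
    rebuilt = xor_blocks(d0, d2, parity)
    assert rebuilt == d1
    ```

    This is why a RAID 5 set survives exactly one drive failure: XOR parity can reconstruct one missing block, but not two.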