distributed c5

Cards (32)

  • Synchronization
    How processes can coordinate and cooperate to access shared resources or agree on the ordering of events
  • Synchronization in distributed systems is often much more difficult compared to synchronization in uniprocessor or multiprocessor systems
  • Synchronization based on actual time
    1. Discussion of the issue
    2. Synchronization in which only relative ordering matters rather than ordering in absolute time
  • Clock synchronization
    Achieving agreement on time in a distributed system
  • In a centralized system, time is unambiguous
  • In a distributed system, achieving agreement on time is not trivial
  • Physical clocks
    Quartz crystal oscillators that generate clock ticks at a well-defined frequency
  • With a single computer and a single clock, it does not matter much if the clock is off by a small amount
  • As soon as multiple CPUs are introduced, each with its own clock, the situation changes radically
  • Clock skew
    Difference in time values between clocks on different machines
  • External physical clocks are needed in some systems where the actual clock time is important
  • Logical clocks
    Clocks that track the ordering of events rather than absolute time
  • Happens-before relation

    Relation that captures the ordering of events in a distributed system
  • Lamport's logical clock algorithm
    1. Assign logical timestamps to events based on the happens-before relation
    2. Ensure that if event a happens before event b, then a's timestamp is less than b's timestamp
  • Vector clocks
    Clocks that capture causality between events, allowing more precise reasoning about the ordering of events
  • Vector clocks enable ensuring that a message is delivered only if all messages that causally precede it have also been received
  • Mutual exclusion
    Granting exclusive access to shared resources to prevent concurrent corruption or inconsistency
  • Distributed mutual exclusion algorithms
    • Token-based solutions
    • Permission-based solutions
  • Token-based solutions can fairly easily ensure that every process will get a chance at accessing the resource, and can easily avoid deadlocks
  • The main drawback of token-based solutions is that when the token is lost, an intricate distributed procedure is needed to ensure a new token is created
  • Token-based solutions
    • They can fairly easily ensure that every process will get a chance at accessing the resource, avoiding starvation
    • They can easily avoid deadlocks by which several processes are waiting for each other to proceed, contributing to their simplicity
    • The main drawback is that when the token is lost (e.g., because the process holding it crashed), an intricate distributed procedure needs to be started to ensure that a new token is created, but above all, that it is also the only token
  • Permission-based approach

    A process wanting to access the resource first requires the permission of other processes
  • Centralized algorithm
    1. Process sends request message to coordinator
    2. Coordinator grants permission if no other process is accessing the resource
    3. When process is finished, it sends release message to coordinator
    4. Coordinator grants permission to next queued request
  • The centralized approach has shortcomings: the coordinator is a single point of failure, processes cannot distinguish a dead coordinator from "permission denied", and the coordinator can become a performance bottleneck
  • Decentralized algorithm
    1. Each resource is replicated n times, each with its own coordinator
    2. Process needs to get a majority vote from m>n/2 coordinators to access the resource
    3. When a coordinator crashes, it recovers quickly but forgets any previous votes, potentially granting permission to another process
  • Distributed algorithm
    1. Process sends request message with resource name, process number, and timestamp to all other processes
    2. If receiver is not accessing the resource and does not want to, it sends OK message
    3. If receiver is accessing the resource, it queues the request
    4. If receiver wants to access the resource as well, it compares timestamps and sends OK if incoming message has lower timestamp
    5. Process waits for all permissions before accessing the resource, and sends OK messages to queued processes when finished
  • Token ring algorithm
    1. Processes are arranged in a logical ring, with a token circulating among them
    2. If a process is handed the token and is not interested in the resource, it just passes the token along
    3. When no processes need the resource, the token just circulates at high speed around the ring
    • The centralized algorithm is simplest and most efficient, requiring only 3 messages to enter and leave a critical region
    • The decentralized algorithm requires m attempts, each with 3m messages
    • The distributed algorithm requires 2(n-1) messages
    • The token ring algorithm has a variable number of messages, from 1 per critical region entry to unbounded if the token circulates without anyone needing it
    • The centralized algorithm has the shortest delay to enter a critical region, taking only 2 message times, while the decentralized algorithm takes 3mk times and the distributed algorithm takes 2(n-1) message times
    • The token ring algorithm has a delay varying from 0 (token just arrived) to n-1 (token just departed)
    • All algorithms except the decentralized one suffer badly from crashes, requiring special measures to avoid bringing down the entire system
    • The decentralized algorithm is less sensitive to crashes, but processes may suffer from starvation and special measures are needed to guarantee efficiency
  • Geometric overlay networks

    Each node is given a position in an m-dimensional geometric space, such that the distance between two nodes reflects a real-world performance metric like internode latency
  • Geometric overlay networks enable efficient execution of distributed algorithms like routing, multicasting, data placement, and searching, by using the geometric positions to make local decisions