Formulations in terms of components, how components are connected, data exchanged, and how these elements are configured into a system
Architectural styles for distributed systems
Layered architectures
Object-based architectures
Data-centered architectures
Event-based architectures
Layered architectures
Components are organized in a layered fashion where a component at layer L is allowed to call components at the underlying layer L-1, but not the other way around
Control generally flows from layer to layer: requests go down the hierarchy whereas the results flow upward
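The downward flow of requests and upward flow of results can be sketched as follows. This is a minimal illustration, not a reference design: the Layer class, its handle method, and the trace list are hypothetical names introduced here.

```python
class Layer:
    """One layer in a strictly layered stack (hypothetical illustration)."""

    def __init__(self, name, lower=None):
        self.name = name
        self.lower = lower  # the only component this layer is allowed to call

    def handle(self, request, trace):
        trace.append(f"down:{self.name}")       # the request travels downward
        if self.lower is not None:
            result = self.lower.handle(request, trace)
        else:
            result = request.upper()            # the bottom layer does the actual work
        trace.append(f"up:{self.name}")         # the result travels upward
        return f"{self.name}({result})"


# Layer 3 calls layer 2, which calls layer 1; no layer ever calls upward.
layer1 = Layer("L1")
layer2 = Layer("L2", lower=layer1)
layer3 = Layer("L3", lower=layer2)

trace = []
reply = layer3.handle("ping", trace)
```

Because each layer holds a reference only to the layer below it, an upcall cannot be expressed in this sketch at all.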
Data-centered architectures
Processes communicate through a common (passive or active) repository
A wealth of networked applications have been developed that rely on a shared distributed file system in which virtually all communication takes place through files
Web-based distributed systems are largely data-centric: processes communicate through the use of shared Web-based data services
Event-based architectures
Processes communicate through the propagation of events, which optionally also carry data
Processes are loosely coupled and need not explicitly refer to each other (decoupled in space or referentially decoupled)
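Referential decoupling can be illustrated with a small in-process publish/subscribe sketch. The EventBus class and the event names are assumptions made for this example; a real system would deliver events across a network.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-process event bus: publishers never name their receivers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, data=None):
        # The event optionally carries data; the publisher only names the event type.
        for callback in self._subscribers[event_type]:
            callback(data)
        return len(self._subscribers[event_type])


bus = EventBus()
received = []
bus.subscribe("temperature", lambda data: received.append(("logger", data)))
bus.subscribe("temperature", lambda data: received.append(("alarm", data)))

delivered = bus.publish("temperature", 21.5)   # reaches both subscribers
unheard = bus.publish("humidity")              # nobody subscribed, nothing happens
```

The publisher refers only to an event type, so subscribers can be added or removed without changing it.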
Can be combined with data-centered architectures, yielding shared data spaces where processes are decoupled in time (need not both be active when communication takes place)
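Decoupling in time can be sketched with a tiny shared data space. This is an illustrative assumption in the spirit of tuple spaces: the SharedDataSpace class, its write and take methods, and the wildcard convention are hypothetical.

```python
class SharedDataSpace:
    """Hypothetical shared data space holding tuples until someone takes them."""

    def __init__(self):
        self._tuples = []

    def write(self, item):
        self._tuples.append(item)

    def take(self, pattern):
        """Remove and return the first tuple matching pattern (None is a wildcard)."""
        for index, item in enumerate(self._tuples):
            if len(item) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, item)
            ):
                return self._tuples.pop(index)
        return None


def producer(space):
    space.write(("job", 42))    # the producer writes and then terminates


space = SharedDataSpace()
producer(space)                      # the producer is no longer active after this call
found = space.take(("job", None))    # the consumer starts later and still finds the tuple
missing = space.take(("job", None))  # the tuple was removed by the first take
```

The consumer selects data by content rather than by naming the producer, and the two never need to run at the same moment.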
Software architectures aim at achieving distribution transparency, but require trade-offs between performance, fault tolerance, ease-of-programming, etc.
System architecture
An instance of a software architecture, considering where software components are placed
System architectures
Centralized
Decentralized
Hybrid
Centralized (client-server) architecture
Processes are divided into clients that request services and servers that implement specific services
Communication can be implemented using connectionless or connection-oriented protocols
Application layering
Distinction between user-interface level, processing level, and data level
User-interface level
Contains programs that allow end users to interact with applications, ranging from simple character-based screens to advanced graphical interfaces
Processing level
Contains the core functionality of applications, such as information retrieval, data analysis, etc.
Data level
Contains the programs that maintain the actual persistent data, often organized as a relational database
Responsible for keeping data consistent across different applications
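The three levels can be sketched in a single program, using Python's built-in sqlite3 module as a stand-in for the relational database. The table, the sample rows, and the function names are made up for illustration.

```python
import sqlite3


# Data level: maintains the persistent data (an in-memory database in this sketch).
def open_data_level():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (title TEXT NOT NULL, year INTEGER NOT NULL)")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?)",
        [("layered styles", 2002), ("event styles", 2001), ("peer overlays", 2003)],
    )
    conn.commit()
    return conn


# Processing level: core functionality, here a simple information-retrieval query.
def search_titles(conn, keyword):
    cursor = conn.execute(
        "SELECT title, year FROM documents WHERE title LIKE ? ORDER BY year",
        (f"%{keyword}%",),
    )
    return cursor.fetchall()


# User-interface level: turns results into something an end user can read.
def render(results):
    if not results:
        return "no matches"
    return "\n".join(f"{year}: {title}" for title, year in results)


conn = open_data_level()
output = render(search_titles(conn, "styles"))
empty = render(search_titles(conn, "missing"))
```

Each function touches only the level directly beneath it, so the levels could later be placed on different machines.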
Client-server model
The data level is typically implemented at the server side
Data level
Stores metadata such as table descriptions, entry constraints and application-specific metadata
Relational database
Organized independent of the applications in such a way that changes in that organization do not affect applications, and neither do the applications affect the data organization
Relational databases in the client-server model help separate the processing level from the data level, as processing and data are considered independent
Relational databases are not always the ideal choice for applications that operate on complex data types that are more easily modeled in terms of objects than in terms of relations
Object-oriented or object-relational database
Implemented for data levels where data operations are more easily expressed in terms of object manipulations
Two-tiered architecture
Distinction between only two kinds of machines: client machines and server machines
Possible organizations of client-server architecture
Only the terminal-dependent part of the user interface on the client machine, with applications having remote control over the presentation of their data
Entire user-interface software on the client side, with the front end communicating with the rest of the application (residing at the server) through an application-specific protocol
Part of the application moved to the front end, with the front end checking the correctness and consistency of the form, and where necessary interacting with the user
Most of the application running on the client machine, but all operations on files or database entries going to the server
Client's local disk containing part of the data, such as when browsing the Web and building a cache of recent Web pages
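The organization in which the front end checks a form before contacting the server can be sketched as follows. The field names, validation rules, and the Server class are hypothetical and chosen only for illustration.

```python
def validate_form(form):
    """Client-side check: return a list of problems, empty if the form is acceptable."""
    problems = []
    if not form.get("name", "").strip():
        problems.append("name is required")
    quantity = form.get("quantity", "")
    if not quantity.isdecimal() or int(quantity) == 0:
        problems.append("quantity must be a positive integer")
    return problems


class Server:
    """Stand-in for the server side, which keeps the actual data."""

    def __init__(self):
        self.orders = []

    def submit(self, form):
        self.orders.append((form["name"], int(form["quantity"])))
        return len(self.orders)     # order number


def front_end_submit(server, form):
    problems = validate_form(form)
    if problems:
        # Handled entirely on the client: no request is sent to the server.
        return {"ok": False, "problems": problems}
    return {"ok": True, "order_id": server.submit(form)}


server = Server()
bad = front_end_submit(server, {"name": " ", "quantity": "0"})
good = front_end_submit(server, {"name": "alice", "quantity": "3"})
```

Only the well-formed request reaches the server, which is the point of moving part of the application to the front end.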
There is a trend to move away from configurations that place substantial client software on end-user machines; instead, most of the processing and data storage is handled at the server side, because client machines are more problematic to manage
Thin clients
Easier to manage from a system-management perspective, but at the cost of less sophisticated user interfaces and lower client-perceived performance
Server-side solutions are becoming increasingly distributed as a single server is being replaced by multiple servers running on different machines
Three-tiered architecture
Programs that form part of the processing level reside on a separate server, but may additionally be partly distributed across the client and server machines
Examples of three-tiered architecture
Transaction processing, with a separate process called the transaction processing monitor coordinating transactions across different data servers
Web site organization, with a Web server acting as an entry point, passing requests to an application server where the actual processing takes place, which then interacts with a database server
Vertical distribution
Achieved by placing logically different components on different machines
Horizontal distribution
A client or server may be physically split up into logically equivalent parts, each operating on its own share of the complete data set, thus balancing the load
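Horizontal distribution of a data store can be sketched as hash-based partitioning. This is one possible scheme, assumed here for illustration; the ShardedStore class and its method names are hypothetical, and each shard stands in for a separate machine.

```python
import hashlib


class ShardedStore:
    """Hypothetical key-value store split into logically equivalent shards."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash decides which shard owns the key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)


store = ShardedStore(shard_count=4)
for i in range(100):
    store.put(f"user-{i}", i)

sizes = [len(shard) for shard in store.shards]   # each shard holds its own share
```

Every shard runs the same logic on a disjoint part of the data, in contrast with vertical distribution, where the parts differ in function.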
Peer-to-peer systems
Processes that constitute the system are all equal, with each process acting as both a client and a server
Structured peer-to-peer architectures
Overlay network is constructed using a deterministic procedure, typically a distributed hash table (DHT)
Data items and nodes are assigned random identifiers, with an efficient and deterministic scheme mapping data item keys to node identifiers
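One common deterministic mapping, assumed here in the style of Chord, assigns a key to the node with the smallest identifier greater than or equal to the key, wrapping around the identifier space. The function names, the 16-bit identifier space, and the sample node identifiers are illustrative assumptions.

```python
import bisect
import hashlib

M = 16  # number of bits in the identifier space (an assumption for this sketch)


def identifier(name):
    """Hash a name into the identifier space, giving a pseudo-random identifier."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def succ(node_ids, key):
    """Return the smallest node id >= key, wrapping around; node_ids must be sorted."""
    index = bisect.bisect_left(node_ids, key)
    return node_ids[index % len(node_ids)]


nodes = [1, 4, 9, 11, 14, 18, 20, 21, 28]   # sample identifiers, small for readability
owner_of_26 = succ(nodes, 26)               # 28
owner_of_12 = succ(nodes, 12)               # 14
owner_of_29 = succ(nodes, 29)               # wraps around to 1

key = identifier("report.pdf")              # a data item's key in the same space
```

This sketch looks the owner up in a global sorted list; an actual DHT resolves the same mapping by routing between nodes that each know only part of the ring.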
Unstructured peer-to-peer architectures
Rely on randomized algorithms for constructing the overlay network, with each node maintaining a list of neighbors in a more or less random way, and data items assumed to be randomly placed on nodes
In a structured, DHT-based system such as Chord, leaving is just as simple as joining: node id informs its predecessor and successor of its departure, and transfers its data items to succ(id). Similar approaches are followed in other DHT-based systems.
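The departure step can be sketched on a small in-memory ring. The Ring class is a hypothetical stand-in: it keeps all nodes in one dictionary, whereas real nodes would exchange messages, and it assumes at least two nodes are present.

```python
class Ring:
    """Hypothetical identifier ring: node id -> data items stored at that node."""

    def __init__(self, nodes):
        self.nodes = {node_id: dict(items) for node_id, items in nodes.items()}

    def successor(self, node_id):
        ids = sorted(self.nodes)
        return ids[(ids.index(node_id) + 1) % len(ids)]

    def predecessor(self, node_id):
        ids = sorted(self.nodes)
        return ids[(ids.index(node_id) - 1) % len(ids)]

    def leave(self, node_id):
        """Remove node_id, hand its items to its successor, return who was informed."""
        pred = self.predecessor(node_id)
        nxt = self.successor(node_id)
        items = self.nodes.pop(node_id)
        self.nodes[nxt].update(items)      # data items move to succ(id)
        return pred, nxt


ring = Ring({4: {3: "a"}, 9: {7: "b", 9: "c"}, 14: {12: "d"}})
informed = ring.leave(9)
```

After the call, node 14 holds the keys that node 9 used to be responsible for, so lookups for those keys still resolve.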
CAN deploys a d-dimensional Cartesian coordinate space, which is completely partitioned among all the nodes that participate in the system.
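A two-dimensional version of such a partitioned space can be sketched as follows. The classes, the unit-square coordinates, and the rule of splitting a zone in half along its longer side when a node joins are simplifying assumptions for illustration, not a full description of CAN.

```python
class Zone:
    def __init__(self, owner, x0, y0, x1, y1):
        self.owner = owner
        self.x0, self.y0, self.x1, self.y1 = x0, y0, x1, y1

    def contains(self, x, y):
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1


class Can2D:
    """Hypothetical 2-dimensional space [0, 1) x [0, 1) fully partitioned into zones."""

    def __init__(self, first_node):
        self.zones = [Zone(first_node, 0.0, 0.0, 1.0, 1.0)]

    def _zone_at(self, x, y):
        for zone in self.zones:
            if zone.contains(x, y):
                return zone
        raise ValueError("point lies outside [0, 1) x [0, 1)")

    def owner_of(self, x, y):
        return self._zone_at(x, y).owner

    def join(self, new_node, x, y):
        """new_node picks point (x, y); the zone containing that point is split in two."""
        zone = self._zone_at(x, y)
        if zone.x1 - zone.x0 >= zone.y1 - zone.y0:      # split the longer side
            mid = (zone.x0 + zone.x1) / 2
            halves = [Zone(None, zone.x0, zone.y0, mid, zone.y1),
                      Zone(None, mid, zone.y0, zone.x1, zone.y1)]
        else:
            mid = (zone.y0 + zone.y1) / 2
            halves = [Zone(None, zone.x0, zone.y0, zone.x1, mid),
                      Zone(None, zone.x0, mid, zone.x1, zone.y1)]
        self.zones.remove(zone)
        for half in halves:
            # The half containing the chosen point goes to the newcomer.
            half.owner = new_node if half.contains(x, y) else zone.owner
            self.zones.append(half)


can = Can2D("A")
can.join("B", 0.75, 0.25)   # A keeps the left half, B takes the right half
can.join("C", 0.10, 0.90)   # A's zone is split again; C takes the upper part
```

A data item whose key hashes to a point (x, y) would be stored at owner_of(x, y), so every point in the space always has exactly one responsible node.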
Unstructured peer-to-peer systems
Rely on randomized algorithms for constructing an overlay network
Partial view
List of neighbors maintained by each node
Nodes regularly exchange entries from their partial view. Each entry identifies another node in the network, and has an associated age that indicates how old the reference to that node is.
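One round of such an exchange can be sketched as follows. The merge policy (keep the youngest reference per node, then keep the youngest entries up to a fixed capacity) is an assumption chosen for illustration; actual protocols differ in how they select and discard entries.

```python
def merge_views(own_id, own_view, received, capacity):
    """Merge two partial views (dict: node id -> age), keeping the freshest entries."""
    merged = dict(own_view)
    for node, age in received.items():
        if node == own_id:
            continue                          # never keep a reference to oneself
        if node not in merged or age < merged[node]:
            merged[node] = age                # prefer the younger reference
    youngest = sorted(merged.items(), key=lambda entry: (entry[1], entry[0]))
    return dict(youngest[:capacity])


def exchange(id_a, view_a, id_b, view_b, capacity):
    """One gossip round between nodes a and b; returns their new partial views."""
    # Every existing reference grows one round older.
    aged_a = {node: age + 1 for node, age in view_a.items()}
    aged_b = {node: age + 1 for node, age in view_b.items()}
    # Each side sends its view plus a fresh (age 0) reference to itself.
    sent_by_a = {**aged_a, id_a: 0}
    sent_by_b = {**aged_b, id_b: 0}
    return (merge_views(id_a, aged_a, sent_by_b, capacity),
            merge_views(id_b, aged_b, sent_by_a, capacity))


new_a, new_b = exchange("a", {"b": 2, "c": 5}, "b", {"d": 1, "c": 0}, capacity=3)
```

Old references age out over repeated rounds, which is how departed nodes eventually disappear from the views.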
Topology management of overlay networks
1. The lowest layer is an unstructured peer-to-peer system that maintains an accurate random graph
2. A higher layer selects entries from the partial view to construct the desired topology
Ranking function
Orders nodes according to some criterion relative to a given node
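As an illustration, a ranking function based on distance in a circular identifier space lets the higher layer pick the closest nodes from a partial view, which tends toward a ring topology. The distance metric, function names, and sample identifiers are assumptions for this sketch.

```python
def ring_distance(a, b, space):
    """Shortest distance between identifiers a and b on a ring of the given size."""
    d = abs(a - b) % space
    return min(d, space - d)


def select_neighbors(node, partial_view, count, space):
    """Rank the entries of a partial view relative to node and keep the best ones."""
    ranked = sorted(
        partial_view,
        key=lambda other: (ring_distance(node, other, space), other),
    )
    return ranked[:count]


view = [3, 12, 60, 9, 40]                       # identifiers seen by the lower layer
closest = select_neighbors(10, view, count=2, space=64)
ordering = select_neighbors(10, view, count=5, space=64)
```

Swapping in a different ranking function, for example one based on semantic similarity of stored items, would steer the same mechanism toward a different topology.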
As the network grows, locating relevant data items can become problematic in unstructured peer-to-peer systems due to the lack of a deterministic way of routing a lookup request.
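To make the cost concrete, the sketch below uses flooding with a hop limit, one common search technique for unstructured overlays. Flooding is brought in here as an assumption, since the notes above do not name a specific search method; the graph and function names are hypothetical.

```python
from collections import deque


def flood_search(neighbors, start, has_item, ttl):
    """Flood a lookup up to ttl hops; return (node holding the item or None, messages sent)."""
    visited = {start}
    queue = deque([(start, 0)])
    messages = 0
    while queue:
        node, hops = queue.popleft()
        if has_item(node):
            return node, messages
        if hops == ttl:
            continue                      # hop limit reached, do not forward further
        for peer in neighbors[node]:
            messages += 1                 # forwarded even to peers that already saw it
            if peer not in visited:
                visited.add(peer)
                queue.append((peer, hops + 1))
    return None, messages


overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
near = flood_search(overlay, "a", lambda node: node == "e", ttl=2)
far = flood_search(overlay, "a", lambda node: node == "e", ttl=3)
```

With a hop limit of 2 the item is missed after 6 messages, and with a limit of 3 it is found after 9, which shows the trade-off between search reach and message cost even in a five-node overlay.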