Minimizing downtime by identifying potential failures and taking steps to avoid those failures and to reduce their effects
High availability
A system design protocol and associated implementation that ensures a certain degree of operational continuity during a given measurement period
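The "degree of operational continuity during a given measurement period" is usually expressed as an availability percentage. A minimal sketch of that arithmetic follows; the function names and the one-year (non-leap) measurement period are assumptions chosen for illustration, not part of the original notes.

```python
# Availability arithmetic: a sketch under stated assumptions.
# Assumption: the measurement period defaults to one non-leap year.
HOURS_PER_YEAR = 365 * 24


def availability_percent(uptime_hours, downtime_hours):
    """Return availability as a percentage of the total measured time."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("measurement period must be positive")
    return 100.0 * uptime_hours / total


def allowed_downtime_hours(target_percent, period_hours=HOURS_PER_YEAR):
    """Return how many hours of downtime a target availability permits."""
    return period_hours * (1 - target_percent / 100.0)


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% -> {hours:.2f} h of downtime per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is why higher targets generally call for the redundant components listed in these notes.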
Fault-tolerant components
Disks (use RAID and hot spares)
Power supplies (use redundant power supplies)
Network cards (use redundant network cards)
Power supply
A mechanical device that converts AC power into clean DC power and includes fans for cooling
Systems that cannot afford to be down should have redundant power supplies
Error Correcting Code (ECC) memory
Corrects a single failed bit in a 64-bit block; high-end servers use more expensive ECC memory, which has special circuitry for testing data accuracy
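To show how a single failed bit can be located and corrected, here is a small sketch using a Hamming(7,4) code. This is a deliberately scaled-down stand-in: real ECC modules commonly protect each 64-bit block with extra check bits in hardware, and the function names below are hypothetical.

```python
# Single-bit error correction with a Hamming(7,4) code.
# Assumption: a 4-bit toy example standing in for the 64-bit blocks
# that ECC memory protects; the principle is the same, the width is not.

PARITY_POSITIONS = (1, 2, 4)   # 1-indexed positions holding check bits
DATA_POSITIONS = (3, 5, 6, 7)  # 1-indexed positions holding data bits


def syndrome(code):
    """XOR together the 1-indexed positions of every bit that is set."""
    result = 0
    for position, bit in enumerate(code, start=1):
        if bit:
            result ^= position
    return result


def encode(data_bits):
    """Place 4 data bits and set check bits so the syndrome is zero."""
    code = [0] * 7
    for position, bit in zip(DATA_POSITIONS, data_bits):
        code[position - 1] = bit
    pending = syndrome(code)
    for position in PARITY_POSITIONS:
        if pending & position:
            code[position - 1] = 1
    return code


def correct(code):
    """Return a copy with a single flipped bit (if any) repaired."""
    fixed = list(code)
    bad_position = syndrome(fixed)  # 0 means no error was detected
    if bad_position:
        fixed[bad_position - 1] ^= 1
    return fixed


def decode(code):
    """Extract the 4 data bits after correcting a single-bit error."""
    fixed = correct(code)
    return [fixed[position - 1] for position in DATA_POSITIONS]


stored = encode([1, 0, 1, 1])
damaged = list(stored)
damaged[4] ^= 1  # simulate one failed bit in memory
print("recovered:", decode(damaged))
```

A nonzero syndrome equals the position of the flipped bit, so a single failure can be fixed without knowing the original data. Two simultaneous failures would be miscorrected by this toy code.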
NIC teaming
The process of grouping together two or more physical NICs into one single logical NIC, which can be used for network fault tolerance and increased bandwidth through load balancing
Computer cluster
A group of linked computers that work together as one computer; can provide fault tolerance and load balancing
Most popular forms of clusters
Failover clusters
Load-balancing clusters
Failover cluster
A set of independent computers that work together to increase the availability of services and applications; if one node fails, another node begins to provide services
Active-passive cluster
Both servers are configured to work as one, but only one is active at a time; the passive node becomes active if the active node goes down
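The active-passive behavior can be sketched as a tiny simulation. The class and node names below are hypothetical, and the sketch leaves out everything a real cluster service does to detect failures (heartbeats, quorum, shared storage); it only shows the role swap.

```python
# Active-passive failover: a minimal simulation, not a real cluster service.
# Assumption: exactly two nodes, with failures reported via mark_failed().


class ActivePassiveCluster:
    def __init__(self, active, passive):
        self.active = active
        self.passive = passive
        self.healthy = {active: True, passive: True}

    def mark_failed(self, node):
        """Record a failure; promote the passive node if the active one died."""
        self.healthy[node] = False
        if node == self.active and self.healthy[self.passive]:
            self.active, self.passive = self.passive, self.active

    def handle(self, request):
        """Serve a request from whichever node is currently active."""
        if not self.healthy[self.active]:
            raise RuntimeError("no healthy node available")
        return f"{self.active} served {request}"


cluster = ActivePassiveCluster("node-a", "node-b")
print(cluster.handle("request-1"))
cluster.mark_failed("node-a")  # the passive node takes over
print(cluster.handle("request-2"))
```

Clients keep calling the same cluster object throughout, which mirrors how both servers appear as one even though only one is doing the work.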
Active-active cluster
Designed to provide fault tolerance and load balancing
Create a failover cluster using Windows Server 2008
Need two compatible servers with identical hardware, running the same Windows Server 2008 version, in the same domain
Load balancing/network load balancing (NLB)
Multiple computers configured as one virtual server to share the workload
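The idea of several computers sharing the workload behind one virtual server can be illustrated with a round-robin dispatcher. Round-robin is a simple technique swapped in for illustration; it is not claimed to be how Windows NLB distributes traffic, and the host names are hypothetical.

```python
# Round-robin load balancing: an illustrative sketch only.
# Assumption: every host can serve every request and all hosts are healthy.
from collections import Counter
from itertools import cycle


def make_balancer(hosts):
    """Return a function that assigns each request to the next host in turn."""
    if not hosts:
        raise ValueError("at least one host is required")
    ring = cycle(hosts)

    def route(request):
        return next(ring)

    return route


route = make_balancer(["web-1", "web-2", "web-3"])
assignments = Counter(route(f"request-{n}") for n in range(9))
print(dict(assignments))
```

Callers only ever see the single `route` entry point, which plays the part of the virtual server while the work is spread evenly across the hosts.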
Uninterruptible Power Supply (UPS)
An electrical device with batteries to provide backup power during a power outage; ranging from small units to large data center units
Power generator
A backup electrical system that operates automatically within seconds of a power outage; may be required by building codes for critical systems