Reliability is the ability of a system to recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions such as misconfigurations or transient network issues.
There are four best practice areas for reliability in the cloud, each listed with the AWS tools that support it:
Foundations – IAM, Amazon VPC, AWS Trusted Advisor, AWS Shield
Change Management – AWS CloudTrail, AWS Config, Auto Scaling, Amazon CloudWatch
Failure Management – AWS CloudFormation, Amazon S3, AWS KMS, Amazon Glacier
Workload Architecture – AWS SDK, AWS Lambda
Key AWS service:
Amazon CloudWatch
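
One common way to mitigate the transient network issues mentioned in the definition above is to retry failed calls with exponential backoff and jitter, a behavior the AWS SDKs provide for their own requests. The sketch below illustrates the general technique in plain Python and is not the SDK's actual implementation. The names `TransientError` and `call_with_backoff`, and the default delay values, are assumptions chosen for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type standing in for a retryable failure."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying on TransientError.

    Waits a random time between 0 and a cap before each retry. The cap
    doubles on every attempt (exponential backoff) and never exceeds
    max_delay. The last failure is re-raised once attempts run out.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))  # "full jitter" spreads out retries


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_request():
        # Assumed stand-in for a network call: fails twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TransientError("temporary network issue")
        return "ok"

    print(call_with_backoff(flaky_request), "after", attempts["count"], "attempts")
```

The random jitter keeps many clients from retrying at the same instant after a shared outage, and the cap on the delay keeps the worst-case wait bounded.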