How to Determine Availability in AWS

April 17, 2022

What is Availability?

In AWS, availability is the percentage of time a service is available for use. In simple terms, availability tells you how reliable your system is. So how do we calculate availability in AWS? AWS publishes Service Level Agreements (SLAs) for its services. See What is an AWS Service Level agreement? for more information. The SLA tells us how reliable each AWS service we use is committed to be. Availability figures can also differ between a Multi Availability Zone (Multi-AZ) deployment and a deployment in a single AZ, and the SLAs cover this in detail. Typically, each component (EC2, RDS, etc.) within your VPC has an availability rating. Some services sit outside your VPC, such as Amazon Simple Storage Service (S3). S3 resides within an AWS Region, not within your VPC. See Is S3 in a VPC? or What are AWS Availability Zones? for additional details.

What Does Availability Actually Mean?

When you look at an AWS Service Level Agreement, you will notice that availability is measured as a percentage. As an example, at the time of this blog post, an EC2 instance in a single Availability Zone has an availability, or “instance up-time percentage”, of 99.5%. So it will be available 99.5% of the time you use it. Most people would agree that is pretty reliable, but what does that mean in terms of actual time? To find out, we apply the percentage to a full year of continuous use, which tells us how much calendar time a service can be unavailable. Below are rough estimates of how much downtime per year each percentage allows, in days, hours, and minutes.


Availability    Calendar Time Not Available (per Year)
99%             3 days 15 hours 39 minutes
99.9%           8 hours 45 minutes
99.95%          4 hours 22 minutes
99.99%          52 minutes
99.999%         5 minutes
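
The conversion behind these estimates can be sketched in a few lines of Python. This is only an illustration: the function name `yearly_downtime` and the choice of a 365.25-day year are assumptions, not something taken from an AWS SLA, so the results may differ from the table by a minute or so.

```python
from datetime import timedelta

# Assumption: an average year of 365.25 days (leap years included).
HOURS_PER_YEAR = 365.25 * 24


def yearly_downtime(availability_percent: float) -> timedelta:
    """Return the expected downtime per year for an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(hours=HOURS_PER_YEAR * unavailable_fraction)


# Print the downtime for each percentage listed in the table.
for pct in (99, 99.9, 99.95, 99.99, 99.999):
    print(f"{pct}% -> {yearly_downtime(pct)}")
```

The same function can be pointed at any SLA figure, for example `yearly_downtime(99.5)` for a single EC2 instance.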

How Do I Calculate the Availability of My Entire System?

This information is very useful for determining the availability of a service, but it only covers ONE component of your system. So how do we find out how available our entire system is? What we have done so far is the starting point. To determine the overall availability of your system within your VPC, you must first find the availability of each component. Then you can combine the percentages with a simple formula: multiply together the availabilities of the components your system depends on. The result is the availability of your entire system.

Let’s take a look at a simple example. Say we have a web application hosted on a single EC2 instance, backed by a Multi-AZ Oracle RDS instance. We established above that a single EC2 instance has an availability of 99.5%. At the time of this article, a Multi-AZ RDS instance has an availability of 99.95%. Because the application needs both components to be up, we get the overall percentage by multiplying the two availabilities together:

0.9995 * 0.995 = 0.9945, or approximately 99.5%.
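
As a rough sketch, the same multiplication can be written as a small Python helper. The name `serial_availability` is made up for this example, and the calculation assumes every component listed is required for the system to work.

```python
from math import prod


def serial_availability(*components: float) -> float:
    """Availability of a system that needs every listed component to be up."""
    return prod(components)


# Single EC2 instance (99.5%) and Multi-AZ RDS (99.95%), as fractions.
system = serial_availability(0.995, 0.9995)
print(f"Overall availability: {system:.2%}")
```

Adding more dependencies to the call can only keep the result the same or lower it, which is why every extra required component costs some availability.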

How Can You Increase Availability?

In terms of downtime, 99.5% equates to 1 day, 19 hours, and 49 minutes a year, or 50 minutes and 24 seconds a week. This isn’t bad, but if we are architecting applications using AWS best practices, we may want a more resilient design. By using one EC2 instance, we have a single point of failure. And as Werner Vogels put it, "Everything fails, all the time". To increase uptime and create a more fault-tolerant system, we could add a second EC2 instance in another Availability Zone to host our web application as well, along with an Elastic Load Balancer (ELB) with 99.99% availability to distribute traffic across the two instances. This would mean we no longer have a single point of failure in our system. Increased redundancy also means increased availability. To calculate the new availability of the EC2 components, we take 100% minus the product of the individual instance failure rates (100% - 99.5% = 0.5% for each EC2 instance). This assumes the two instances fail independently of each other.

100% - (0.5% * 0.5%) = 100% - 0.0025% = 99.9975%, or approximately 99.99%
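
The redundancy formula generalizes to any number of copies. Here is a minimal Python sketch, assuming the instances fail independently of each other and that one healthy instance is enough to serve traffic; `redundant_availability` is a hypothetical helper name.

```python
def redundant_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent replicas when one survivor is enough."""
    # The group is down only when every replica is down at the same time.
    failure_rate = 1 - single
    return 1 - failure_rate ** copies


# Two EC2 instances, each with 99.5% availability.
ec2_pair = redundant_availability(0.995, 2)
print(f"EC2 pair: {ec2_pair:.4%}")
```

In practice, failures in the same Region are not always fully independent, so treat the result as an upper estimate.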

So now we can calculate the total availability:

0.9999 (EC2) * 0.9999 (ELB) * 0.9995 (RDS) = approximately 99.93% availability, or about 6 hours and 8 minutes of downtime a year.
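
To compare the two designs side by side, the sketch below combines both formulas. The helper names and the 365.25-day year are assumptions for illustration. Because it keeps the unrounded 99.9975% for the EC2 pair, it comes out slightly higher than the rounded 99.93% above.

```python
from math import prod

# Assumption: an average year of 365.25 days (leap years included).
HOURS_PER_YEAR = 365.25 * 24


def redundant(single: float, copies: int) -> float:
    """Independent replicas where one healthy replica is enough."""
    return 1 - (1 - single) ** copies


def in_series(*parts: float) -> float:
    """Components that must all be up for the system to work."""
    return prod(parts)


def downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for an availability fraction."""
    return (1 - availability) * HOURS_PER_YEAR


# Original design: one EC2 instance plus Multi-AZ RDS.
before = in_series(0.995, 0.9995)
# Improved design: two EC2 instances behind an ELB, plus Multi-AZ RDS.
after = in_series(redundant(0.995, 2), 0.9999, 0.9995)

for label, value in (("before", before), ("after", after)):
    print(f"{label}: {value:.4%}, about {downtime_hours(value):.1f} hours down per year")
```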

How much availability you need depends on your system's requirements. Mission-critical or life-critical systems have different standards than a web application that performs a simple business function. Keep that in mind, and remember that availability and cost rise together: the more available your system is, the more it will cost to run.



Author Rob Lansing
2022 April 17th