How reliable is the Cloud?

Recently InfoWorld published a list of the 10 worst cloud outages of the last 3 years. The list includes all the big names: Amazon, Google, Microsoft, Rackspace and others. The focus of that post is to draw lessons from these failures. For an objective assessment, however, we need to ask these questions:

  • How reliable are these cloud services?
  • Are they more or less reliable than your on-premises applications?
  • Should reliability be measured in the same way for IaaS & SaaS?

(Here is the link to the article)

How reliable are these cloud services?
By my calculation, the reliability numbers for these services come out as follows.

  1. Amazon Web Services = 99.920%
  2. Sidekick = 99.400%
  3. Gmail > 99.999%
  4. Hotmail > 99.999%
  5. Intuit = 99.750%
  6. Microsoft’s BPOS = 99.958%
  7. Salesforce  = 99.996%
  8. Terremark = 99.971%
  9. PayPal = 99.983%
  10. Rackspace = 99.958%

How does that compare to what you have?

How do you calculate the reliability?

The answer is not as straightforward as you may think. For example, it is commonly believed that air travel is much safer than road travel, but that depends on how you measure safety. The Wikipedia article on air safety gives three different statistics.

  1. Deaths per billion passenger-journeys: Both Bus (4.3) and Car (40) come out much safer than Air (117)
  2. Deaths per billion passenger-hours: Though Air (30.8) is safer than Car (130), it is still worse than Bus (11.1)
  3. Deaths per billion passenger-kilometers: Here Air (0.05) is much safer than both Bus (0.4) and Car (3.1)
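
To make the point concrete, here is a small sketch that ranks the three modes of travel under each statistic. The figures are the ones quoted in the list above; the names `STATS`, `rank_safest_first` and `RANKINGS` are made up for this illustration.

```python
# Deaths per billion units, as quoted in the list above.
STATS = {
    "passenger-journeys": {"Bus": 4.3, "Car": 40, "Air": 117},
    "passenger-hours": {"Bus": 11.1, "Car": 130, "Air": 30.8},
    "passenger-kilometers": {"Bus": 0.4, "Car": 3.1, "Air": 0.05},
}


def rank_safest_first(rates):
    """Return the modes of travel ordered from fewest to most deaths."""
    return sorted(rates, key=rates.get)


# One ranking per statistic (hypothetical helper structure).
RANKINGS = {metric: rank_safest_first(rates) for metric, rates in STATS.items()}

for metric, order in RANKINGS.items():
    print(f"per billion {metric}: {' < '.join(order)}")
```

The same data gives three different orderings, and Air moves from last place to first depending on the denominator chosen.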

This is how I have calculated the reliability number.

I have assumed that these are the only failures these services have suffered in the last 1000 days. The reliability % is calculated by the following formula:

% Reliability = 1 – (down days / 1000) * fraction of services or users affected

For AWS:

Down days = 4

Fraction of services or users affected = 1 of the 5 availability zone = 0.2

AWS reliability % = 1 – (4 / 1000) * 0.2 = 0.9992 = 99.92%

For Gmail:

Down days = 4

Fraction of services or users affected = 150,000 out of 200 million users = 0.00075 or 0.075%

Gmail reliability % = 1 – (4 / 1000) * 0.00075 = 0.999997 = 99.9997%
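
The two worked examples above can be reproduced with a few lines of Python. This is only a sketch of the formula as stated in this post; the function name `reliability` and the constant `WINDOW_DAYS` are chosen here for illustration, and the 1000-day window is the assumption made earlier.

```python
WINDOW_DAYS = 1000  # assumed observation window, as stated in the post


def reliability(down_days, fraction_affected, window_days=WINDOW_DAYS):
    """Reliability = 1 - (down days / window) * fraction of services or users affected."""
    return 1 - (down_days / window_days) * fraction_affected


# AWS: 4 down days, 1 of 5 availability zones affected.
aws = reliability(4, 1 / 5)

# Gmail: 4 down days, 150,000 of 200 million users affected.
gmail = reliability(4, 150_000 / 200_000_000)

print(f"AWS:   {aws:.2%}")
print(f"Gmail: {gmail:.4%}")
```

You can plug in the down days and affected fraction of your own application to get a number that is comparable with the list above.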

Should reliability be measured in the same way for IaaS & SaaS?

As far as reliability is concerned, there is a fundamental difference between IaaS and SaaS.

If you are using IaaS you can take the following measures:

  • Have an alternate DR site
  • Have your own backup of the data

However, when you are using SaaS, neither of these approaches is feasible. Imagine setting up a backup mailing system to cater for Gmail going down! For that matter, can you imagine backing up your data which is stored in

So, the reliability standard for SaaS has to be much higher than for IaaS.

2 Responses to “How reliable is the Cloud?”
  1. eswarann says:

    You may as well ask how reliable is God?

