High Availability Definition

I love conversations about High Availability. In my classes and in my consulting engagements, I have often had discussions about how to define availability. It is a confusing topic. There are several definitions in use, but the main two are:

  1. The services must be available when users need to consume them. If the users are not online, it is OK for everything to be down for maintenance, and that time won’t count against the availability numbers. Announced maintenance windows are usually exempt as well, since users have been told that the systems will be down.
  2. The services must be available all the time.

I tend to lean towards definition number 1. However, I certainly understand that more and more businesses are now 24/7, and even those that are not customer-facing around the clock still have 24/7 needs because of their many automated processes. If you think about it, even backup times are production times. We can’t have systems down during the backup windows, or our backups will fail and we will be exposed to significant risk until we can get a current backup.
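To see how much the choice of definition matters, here is a minimal sketch that scores the same week both ways. The week, the hour counts, and the `availability` helper are all hypothetical numbers and names chosen for illustration, not measurements from any real system.

```python
HOURS_PER_WEEK = 7 * 24


def availability(scheduled_hours, downtime_hours):
    """Fraction of the scheduled time during which the service was up."""
    return (scheduled_hours - downtime_hours) / scheduled_hours


# Hypothetical week: 4 hours of announced maintenance, 1 hour of unplanned outage.
maintenance_hours = 4.0
unplanned_hours = 1.0

# Definition 1: announced maintenance is exempt, so it is removed from the
# scheduled time and only the unplanned outage counts as downtime.
def_1 = availability(HOURS_PER_WEEK - maintenance_hours, unplanned_hours)

# Definition 2: the service must be up all the time, so maintenance counts
# as downtime just like the unplanned outage does.
def_2 = availability(HOURS_PER_WEEK, maintenance_hours + unplanned_hours)

print(f"Definition 1: {def_1:.2%}")
print(f"Definition 2: {def_2:.2%}")
```

With these made-up numbers, definition 1 comes out near 99.39% and definition 2 near 97.02%, so the same week looks very different depending on which definition the SLA uses.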

Today, I posted on my Facebook account, and the post started the following thread. FYI, I removed a couple of comments, but I thought these were perfect:

Russ Kaufmann Based on there being around 30,000 commercial flights per day in the world, if airlines met the 99.999% standard, there would be 109 crashes per year.

Matthew Roche Only if you define “failure” as “crash.”

Russ Kaufmann I define it as “down time” during production (in the air) times.

Matthew Roche It seems to me that a more fair approach would be to consider significant (with this term being defined by the SLA) delays and cancellations as being “down time” in this context, because the service being provided is not available.

Russ Kaufmann Sorry, that is not how we measure availability. Either a system is available, or it isn’t. The services must be available during production hours.

Perhaps, we need to focus on there being approximately 5,000 commercial airliners in the air at any one time and run our numbers on them. That would significantly reduce the expected failures of 99.999% availability.

I have never talked to a C-Level officer and said, “Yeah, but the stock trading software was back up and running within the terms of the SLA, so you can’t say there was any unavailability.”

That won’t fly. 🙂

Matthew Roche So can I paraphrase you as saying that the only time a flight is unavailable is when the plane has crashed?

Russ Kaufmann If I am consuming a flight (on board), then yes. After all, it isn’t a failure where the service is unavailable.

OK, maybe that is too stringent. How about adding loss of control of the aircraft to the list? I would also add cancellations per your earlier statement as that would mean that the flight is unavailable.

Matthew Roche There we go. So “if airlines met the 99.999% standard, there would be 109 crashes, cancellations or comparable service interruptions per year.” Is that accurate? Is that what you’re trying to say?

Russ Kaufmann You took all of the fun out of it. LOL

Matthew Roche I took 99.999% of the fun out of it.
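For anyone who wants to check the arithmetic behind the thread, here is a rough sketch. It assumes, as the opening comment does, that every flight counts as one pass/fail event and that there are 30,000 flights per day. The constant and function names are made up for illustration.

```python
FLIGHTS_PER_DAY = 30_000   # figure quoted in the thread
DAYS_PER_YEAR = 365
FIVE_NINES = 0.99999       # 99.999% availability


def expected_failures(units, availability):
    """Units (flights, minutes, ...) expected to fall outside the availability target."""
    return units * (1 - availability)


# Treat each flight as a single event that is either available or not.
flights_per_year = FLIGHTS_PER_DAY * DAYS_PER_YEAR
failed_flights = expected_failures(flights_per_year, FIVE_NINES)

# The same target expressed as a yearly downtime budget for an always-on service.
minutes_per_year = DAYS_PER_YEAR * 24 * 60
downtime_minutes = expected_failures(minutes_per_year, FIVE_NINES)

print(f"Flights per year: {flights_per_year:,}")
print(f"Flights outside the target: {failed_flights:.1f}")
print(f"Downtime budget: {downtime_minutes:.2f} minutes per year")
```

This gives about 109.5 affected flights per year, which matches the 109 in the thread, and about 5.26 minutes of allowable downtime per year for a service that has to be up all the time.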

