It’s Inadequate Design That Lets Systems Fail, Not Whether They Are SaaS or Deployed in The Cloud

Posted on : 15-08-2009 | By : Paul Michaud | In : Cloud Computing, High Availability (HA), Software Design, Software as a Service

There have been many high-profile outages lately that have caught people’s attention. These failures are being used as an argument for why critical systems should remain internal and not be deployed as SaaS or in the Cloud. They include Google App Engine’s performance issues in early July, Rackspace’s loss of its Dallas data center due to a power failure, and the fire in Seattle that took Authorize.Net offline for 12 hours, to name but a few.

What amazes me is how many people point to these outages as proof that Cloud and/or SaaS is bad and that everything should be in house. It’s preposterous. The fact that these systems went down with a data center failure (or otherwise) is nothing more than evidence of inadequate system design where High Availability (HA) is concerned. The bottom line is that it takes planning, forethought and good design to make a system highly available, and most systems simply are not designed with that in mind.

The reasons for not making a system highly available are many and include the following:

  1. Naivete: People don’t believe it could happen to their system and thus choose not to put in the time, effort and cost of making a system highly available.
  2. Cost: The bottom line is that it costs a lot of money to make a system HA, and for a lot of firms, particularly smaller businesses or those just starting out, it is just not a viable option.
  3. Difficulty: It’s bloody hard to make a system HA. It’s one thing to ensure no data loss; it’s quite another to ensure little to no downtime.

For most of my career I have built systems for the world’s largest financial companies, including the world’s leading investment banks and stock exchanges. These firms take high availability very seriously as a rule, but even with their resources and decades of experience, systems still go down.

Consider the London Stock Exchange (whose system I did not design), which last year had a very public outage when it was down for most of a trading day. This was not a SaaS system or one deployed in a Cloud. It was an internal system run by a highly reputable company whose business is based on being reliable and never losing a trade. These exchanges, for the most part, have highly redundant systems and multiple backup data centers, design for High Availability and run failover tests regularly, yet they still experience downtime from time to time.
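To put rough numbers on why redundancy helps but never takes downtime to zero, here is a back-of-the-envelope sketch. Everything in it is an assumption made for illustration: the 99% per-site availability is a made-up figure, not data from any exchange or provider, and the function names are hypothetical. It also assumes that sites fail independently and that failover is instant and always works, which is optimistic.

```python
# Back-of-the-envelope availability arithmetic.
# All figures are illustrative assumptions, not measurements.

HOURS_PER_YEAR = 365 * 24  # 8760


def combined_availability(site_availability: float, sites: int) -> float:
    """Availability of `sites` redundant sites.

    Assumes failures are independent and that failover between sites
    is instant and never fails itself (both optimistic assumptions).
    The system is down only when every site is down at the same time.
    """
    return 1.0 - (1.0 - site_availability) ** sites


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for sites in (1, 2, 3):
        a = combined_availability(0.99, sites)
        print(f"{sites} site(s): availability {a:.6f}, "
              f"about {expected_downtime_hours(a):.2f} hours down per year")
```

Under these assumptions a single 99% site is down about 87.6 hours a year, and adding a second independent site brings that under an hour. In practice failures are often correlated (a shared software bug, a bad deployment, a failover that does not work as tested), which is one reason even well-funded systems do worse than the arithmetic suggests.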

The point is, failures happen, whether the system is run internally or in the cloud, and whether it’s a SaaS system or one of homegrown legacy design. The objective is to minimize those failures and the downtime associated with them.

That said, with today’s technologies, some careful planning and good design, it is possible to build systems that should almost never go down, even in the face of a 9/11-type event, but that’s a topic for another day.
