Thoughts on Windows Azure Leap Day Downtime

I’d be remiss not to mention the Windows Azure downtime on Leap Day. Because of my employment at Microsoft, I won’t speculate or say too much on the situation. I have said before that cloud computing does not completely alleviate the risks of downtime.

I would like to reiterate that there are always inherent risks in building and running software, and failure is to be expected, not avoided. The best-designed systems are set up for failure and can handle these cases with grace. This particular event with Windows Azure further highlights the need to design applications that sit on top of any infrastructure (traditional, cloud, or hybrid) in such a way that they can work when (not if) a major portion of the infrastructure fails.

Don’t be fooled into thinking that any cloud service provides a silver bullet for resiliency. Outsourcing your IT infrastructure to a cloud provider greatly improves your resiliency for the cost you have to pay; most of us cannot afford to build and maintain a fault-tolerant worldwide infrastructure. When a failure does occur, don’t overlook the economies of scale that benefit the application tenants most of the time when things are working properly.

Posted in Business Continuity, Cloud, Disaster Recovery, Downtime, Technology, Uptime

Through the Storm – Interview with Arterian IT Founder Jamison West

Jamison West

"Having a comprehensive plan that's bigger than just IT is key, but often IT can be the forcing function to get you started."

I recently had a chance to interview Jamison West of Arterian. Jamison, who founded the company that is now Arterian in 1995, envisions a future where every small to mid-sized company has an IT partner that becomes a vital part of its core operations team, keeping the company free from disaster and flourishing.

SoftwareDisastersBlog: How do you help your customers prevent and prepare for IT disasters?

Jamison West: We see with our customers that reliance on connectivity is higher than it’s ever been for businesses to execute and support their customers. People now expect email to work like instant messaging, sent and received as fast as they type it. We try to prevent IT issues by adding redundancy to make sure that if there are problems (natural disasters or bad weather like we had recently in Seattle), our customers are still up and running, at least for critical operations.


Learning from the Costa Concordia Shipwreck

Costa Concordia

Photo from csmonitor.com

On Friday, January 13th, the Costa Concordia met with disaster, running into rocks off Italy’s western coast and eventually rolling onto its side in the water. The toll on human life is tragic: several are dead (the number still growing), more are missing, and everyone involved went through a traumatic experience.

The saddest part of the story is that it appears it could have been prevented. And if not prevented, it could have been handled better.

As I’ve been following the story and reflecting on it, a few things have jumped out that I think we can learn from. 

Human Error


5 Disaster Preparedness Resolutions

happy new year 2012

Photo by Creativity103

You might think that new year’s resolutions are made to be broken.  Whether it’s to exercise more, chew fingernails less, or other clichés, they are hard to follow through on.  Witness the packed gym in January that becomes empty before March.

When it comes to keeping the systems that run your business humming along, the new year is a good time to pause and reflect on what you can do differently. You also have the energy to take action and make it a reality. But don’t let your well-intentioned resolution become lost in the shuffle of forgotten promises of self-improvement.

1. “I will learn from last year”


Thursday Link Day


5 Stages of Grief

Grief

Photo by Todd Huffman

“It’s not rocket science.” – Some Expert

“One test is worth a thousand expert opinions” – Wernher von Braun (Rocket Scientist)

I am a firm believer that reducing software & IT risks is not easy, but it is imperative.  It takes a strong will, the right attitude, and acceptance of responsibility for your web service or application.  You have to really understand the realities of how software works (or doesn’t) and the impact a disaster would have on your business & livelihood.  But accepting may be the hardest part.

It’s sort of like the 5 Stages of Grief.  Behold, I present to you my 5 Stages of Disaster Preparedness.

Denial


No Place Like Risks For the Holidays

Christmas Tree

Photo by dannynorodo

The biggest holiday risk on your mind might be electrocution by lawn decoration or having to suffer through an eggnog-induced story from your brother-in-law.

However, there are a few holiday-specific risks to consider for your online business or website. The best present is peace of mind, as you’re able to confidently ignore your business for much-deserved downtime.

1. “Low-Risk” Updates & Changes

 
