5 Stages of Grief


Photo by Todd Huffman

“It’s not rocket science.” – Some Expert

“One test is worth a thousand expert opinions” – Wernher von Braun (Rocket Scientist)

I am a firm believer that reducing software & IT risks is not easy, but it is imperative. It takes a strong will, the right attitude, and acceptance of responsibility for your web service or application. You have to really understand the realities of how software works (or doesn’t) and the impact a disaster would have on your business & livelihood. But acceptance may be the hardest part.

It’s sort of like the 5 Stages of Grief.  Behold, I present to you my 5 Stages of Disaster Preparedness.

Denial

“I don’t need to prepare for an IT disaster. My site is running fine now, customers are happy.  I have bigger things to worry about. This isn’t even on my radar.”

If this sounds like you, your IT systems may be at risk of downtime that could hurt your customers and your business. Start with the basics: take the first step and admit you have a problem.

Anger

“I thought modern software was supposed to just work! Didn’t they tell me cloud would solve all my problems?  Why should I have to waste money on preparation?”

It’s easy to get upset about the fragility of networks, IT infrastructure, backup technologies, etc. There are lots of new products & technologies out there that will help you reduce many of the resiliency and recoverability risks in your software. But the second law of thermodynamics applies to your software as much as to the rest of the universe: eventually your software and infrastructure will decay and expose you to the risk of failure.

It’s nothing to get angry about.

Bargaining

“Maybe I can get away with putting this off a little longer.  If I just add some backups that should be enough.  My team knows what to do if there’s an emergency, so I think we’re sort of prepared.”

It’s true that you can and should prioritize your mission-critical systems for resiliency and disaster preparation, and make a calculated decision on where to invest in uptime improvement. But be careful what you choose to ignore as you focus on top business priorities. Your online reputation is more important than ever (regardless of your Klout score), and your customers will notice if your website or application goes down. Your employees will also be much less productive if they can’t use email and business software.
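To make that “calculated decision” a little more concrete, here is a minimal sketch of one way to rank systems: multiply a rough estimate of outage hours per year by a rough cost per outage hour. The system names and every number below are made up for illustration; plug in your own estimates.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    expected_outage_hours_per_year: float  # hypothetical estimate
    cost_per_outage_hour: float            # hypothetical estimate

    @property
    def expected_annual_loss(self) -> float:
        # Simple expected-cost model: hours down times cost per hour.
        return self.expected_outage_hours_per_year * self.cost_per_outage_hour


def rank_by_expected_loss(systems):
    """Return systems sorted from highest to lowest expected annual loss."""
    return sorted(systems, key=lambda s: s.expected_annual_loss, reverse=True)


# Illustrative inventory only; none of these figures come from real data.
systems = [
    System("internal wiki", 20.0, 100.0),
    System("public website", 8.0, 5000.0),
    System("email", 12.0, 1500.0),
]

for s in rank_by_expected_loss(systems):
    print(f"{s.name}: {s.expected_annual_loss:,.0f} per year")
```

A model this crude ignores reputation damage and knock-on effects, so treat the ranking as a starting point for the conversation, not the answer.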

Depression

“There’s nothing I can do to make any of this better.  Disasters are going to happen, that’s life.  I can’t stop them.”

You can’t stop disasters, but you can lessen their impact on you.  You can learn how to stop preventable disasters, and you can learn to cope with the inevitability of software failure.  I can’t think of a tough downtime or failure situation I’ve been in that couldn’t have been improved by better planning, testing, and execution.

Acceptance

“If we’re smart, we can reduce our risks and improve our uptime.  There have to be some simple changes to make things better.”

“Disasters may come, but we’ll be ready.”

Identify your biggest risks, create a plan to address them, implement solutions, and (most importantly) test them. Prevention is not buying expensive tools that promise silver-bullet solutions; it’s good engineering, self-critical analysis, and expecting the worst.

Preparation is not a big fancy document that sits on a shelf; it’s having a team that’s been through it before.
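Testing is the step most people skip, so here is a small sketch of what a backup restore drill could look like: restore the backup into a scratch directory and check that it matches the original byte for byte. The file names are hypothetical, and the plain file copy is only a stand-in for whatever your real restore procedure is.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large backups don't need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches_source(source, backup):
    """Restore `backup` into a scratch directory and compare it to `source`."""
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / Path(source).name
        shutil.copy2(backup, restored)  # stand-in for a real restore step
        return sha256_of(restored) == sha256_of(source)


def run_drill():
    """Exercise the check against one good and one truncated backup."""
    with tempfile.TemporaryDirectory() as workdir:
        work = Path(workdir)
        source = work / "orders.db"        # hypothetical data file
        source.write_bytes(b"order data")

        good_backup = work / "good.bak"
        shutil.copy2(source, good_backup)

        bad_backup = work / "bad.bak"
        bad_backup.write_bytes(b"order da")  # simulates a truncated backup

        return (
            restore_matches_source(source, good_backup),
            restore_matches_source(source, bad_backup),
        )


if __name__ == "__main__":
    print(run_drill())
```

The point is less the checksum than the habit: a backup you have never restored is a hope, not a plan.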

Conclusion

Don’t wait any longer. Come to grips with how you’re going to prepare for downtime & disasters. Don’t get stuck in denial or angry at the problem. Take the first steps to reduce your risks and keep your customers happy.

What’s stopping you from investing in preparation?  Is it something you know you “should” do but you’ve found a good reason to put it off?  What helped you make the decision to invest in preparation?


About Kit Merker

Product Manager @ Google - working on Kubernetes / Google Container Engine.
This entry was posted in Business Continuity, Cloud, Disaster Recovery, Downtime, Technology, Uptime.
