Tag Archives: Software Design

I recently found this SlideShare deck on how to run a post-mortem: Post Mortems for Humans. I am a firm believer that if you want a reliable, problem-resistant system, you have to improve incrementally by asking questions and systematically eradicating … Continue reading

Solve Human Error Disclosures

It doesn’t matter how good your technology systems are if you trust people to follow certain steps to keep data secure, as a prison in England learned the hard way. The best part of this story is that they “were … Continue reading

Posted in Cloud, Security, Technology

What You Wish You Knew During a Crisis…

From my guest post at ContinuityInsights.com: During a crisis, there is almost by definition a shortage of accessible information. Because of the time pressure a disaster creates, anything considered noise gets filtered out and ignored. However, if you could create … Continue reading

Posted in Business Continuity, Cloud, Disaster Recovery, Downtime, Technology, Uptime

Thoughts on Windows Azure Leap Day Downtime

I’d be remiss not to mention the Windows Azure downtime on Leap Day. Because of my employment at Microsoft, I won’t speculate or say too much on the situation. I have said before that cloud computing does not completely alleviate … Continue reading

Posted in Business Continuity, Cloud, Disaster Recovery, Downtime, Technology, Uptime

Software Shouldn’t Cry Wolf

It’s simple: if you have false alarms, people will eventually ignore them. When you run a high-scale online service, it’s standard practice to have monitors and alerts watching the system so you can quickly find any problems in your system … Continue reading

Posted in Business Continuity, Disaster Recovery, Downtime, Technology, Uptime | 1 Comment

11 Rules of Awesome Software

“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” Antoine de Saint-Exupéry, French writer (1900 – 1944). In honor of 11/11/11, I’ve put together my 11 Rules for Awesome … Continue reading

Posted in Business Continuity, Disaster Recovery, Downtime, Technology, Uptime

Stop the Dominos from Falling

If we assume that “software will fail” is a fact, not a problem, then we need to look at how to cope with the failure. I’d like to discuss one coping strategy: isolating components. Some failures cause ripple … Continue reading

Posted in Business Continuity, Cloud, Disaster Recovery, Downtime, Technology, Uptime

Musings on Recovery Oriented Computing: Part 2

This is the second post related to Recovery Oriented Computing. Here’s where you can find the first one. I’m also throwing in another scalability/resiliency topic not from ROC. Design for Failure: “If a problem has no solution, it may not … Continue reading

Posted in Business Continuity, Cloud, Disaster Recovery, Downtime, Technology, Uptime | 5 Comments

Musings on Recovery Oriented Computing: Part 1

Photo by Tim Green aka atoach. “a safe structure will be the one whose weakest link is never overloaded by the greatest force to which the structure is subjected” (Petroski 1992). A couple of years ago, I had the pleasure of … Continue reading

Posted in Business Continuity, Disaster Recovery, Downtime, Technology, Uptime | 1 Comment

I’m in the cloud, am I still at risk?

I had a chat with a small software business owner the other day, and we were talking about preparing for disasters. He asked me, “Since I’ve moved to a public cloud service, I’m protected, right?” While public clouds significantly reduce … Continue reading

Posted in Business Continuity, Cloud, Disaster Recovery, Downtime, Technology, Uptime | 1 Comment