It’s not like you don’t have a thousand other tasks that need your attention, but, friends, please hear me out. I’m here to tell you that an incident response plan needs to be at the top of that list of tasks!
I like to think of our company’s Incident Response Plan as a living entity. It grows, evolves, and adapts to the constant change around it. And it is hands down the most effective and expedient way for your company to recover from an incident. Read about my own experience further on in this story.
I can tell you that a good incident response plan is quick and predictable. It swiftly turns detection into response, escalates to the right people quickly, makes communications clear, and keeps customers informed. It’s simple enough for people to follow under stress, but broad enough to work for the variety of incident types encountered. The primary goal of our incident response process has been rapid resolution, so speed and simplicity are paramount.
The inner workings of our incident response plan come from the National Institute of Standards and Technology incident response lifecycle (NIST SP 800-61), which breaks incident response down into four main phases:
Phase 1: Preparation - The Preparation phase is the work your company does to get ready for incident response, including establishing the right tools and resources and training the team. This phase also includes work done to prevent incidents from happening.
Phase 2: Detection and Analysis - Accurately detecting and assessing incidents is often the most difficult part of incident response. Sometimes, the best response is to wait and observe.
Phase 3: Containment, Eradication, and Recovery - This phase focuses on keeping the incident impact as small as possible and mitigating service disruptions.
Phase 4: Post-Incident Activity - Learning and improving after an incident is one of the most important parts of incident response, yet the most often ignored. This post-mortem or debrief phase of the incident is where response efforts are analyzed. The goals here are to limit the chances of the incident happening again and to identify ways of improving future incident response activity.
In addition to our incident response plan, we’ve also developed three other processes that prepare and guide us during incidents. These include our:
- Crisis Communications Plan – How we communicate with affected customers, vendors, investors, etc. during an incident: which communication tools to use, customer contact information, and so on.
- Disaster Recovery Plan – This is our bible for what to do in a serious incident or disaster.
- Tabletop Exercises – Pretty much the whole company can participate in these “what if” and “what would you do?” exercises. We often start with CISA templates and then customize them to our own needs.
If you’d like a template of the documents I’ve created, send me an email and I’d be happy to share. If I haven’t bored you yet about incident response plans, I’d love to tell you a story about a data center and a fire.
A Data Center and a Fire
Prior to co-founding CyberNINES in 2020, I was, and still am, with 5NINES, a data center and internet company. In one of my roles as Chief Security Officer, I have been responsible for developing, updating, and testing our incident response plan to meet threats like cyberattacks, security breaches, and server outages.
We learned over the years that our incident response plan was not just a document sitting somewhere on SharePoint telling us “what to do” during an incident; its real value came from what we learned during our postmortem reviews. From those reviews, we invested in improvements and practiced our incident response regularly with ongoing tabletop exercises.
That all came to our aid in July 2019, when an unprecedented “incident” occurred. At 7:40 am on the 19th, two fires broke out at our local electric utility’s power substations in downtown Madison. It was already hot and steamy that day, and the weather forecast was calling for record-breaking temps. To fight the fires safely, electrical service was shut down to the entire downtown area at 8:10 am, with no estimate of when power would be restored. Our office building was among the first downtown to go dark.
However, our data center kept running throughout the entire day thanks to our business continuity plan, our incident response plan, and our continuous improvement process. Those plans and processes had led to annual investments in uninterruptible power systems and two diesel generators. As you might imagine, we ran into several issues that day, but fortunately, we were able to “repair” or work around those issues quickly because we had good communication channels, delegated authority, and technical ability. We learned a lot from that incident and, as with all good incident response plans, we improved our business by implementing several changes to our environment.
Feel free to read more details on the fire and our incident response to it.