Business Resilience Planning: When 'Good Enough' Isn't Enough
WTF was heard throughout the room.
They had emergency plans, but when the fire struck, the local area lost power and access to the facility was restricted. “WTF” was heard across the company's MS Teams call for the emergency response. The plans had been clear: staff would remove all products ready to ship and use an alternate shipping location, like a UPS store; move supplies to a restaurant owned by a friend of the owner, who would let us use available cold storage when needed; and take other steps in the plan based on the scenario. “WTF, … why didn’t we think of this as a potential scenario?” the company’s owner yelled.
My client had drafted plans, including scenario-based plans, but in the heat of the moment it was evident that never exercising the plans and never asking ‘what if’ during the planning stage had left huge gaps.
Previously, I provided a high-level overview in “Business Blindspots: Facing the Disaster You Never Saw Coming” and “How Risk Management Can Fail: Identifying Critical Assets for Business Preparedness.” This article will explore the second step, the DO stage, of the PLAN-DO-CHECK-ACT (PDCA) lifecycle.
How to draft a plan
Previously, we discussed identifying your business's critical areas and prioritizing their recovery. Now we will discuss how to draft a plan to recover those services or minimize the impact on them. Technology recovery is fairly easy to define and can be very detailed. For this article, we will focus on the business continuity/resiliency plans for the business side of the recovery and not address the IT systems disaster recovery plans.
First, the business needs to decide what should be included in the recovery plan, what the priorities are (discussed in the previous article), and what other priorities, like employee health and safety, may exist. Then, it needs to identify a recovery team for the whole response or each business function, considering the potential of not having all personnel available. After that, document specific plans for the recovery or continuation of the business function when impacted.
I recommend considering scenarios while keeping them as high-level and adaptable as possible. For example, loss of access to a facility may result from multiple events, like flooding, fire, and civil unrest. Should the employees of a business function be responsible for addressing the cause of the impact, or just for keeping the function itself running? Probably just the function itself, since those responsible for the facility would be planning the response to a fire, flood or other event.
Once you have identified who is responsible for which business areas, the business unit or SMEs (Subject Matter Experts) should begin drafting the plan. They should start by identifying whether the work can be done elsewhere or must be stopped or delayed. For each of these options, they will need to identify what actions must be taken. Let's walk through an example of a high-level plan that could be triggered by multiple causes.
Alternate Work Location
COVID proved that there are multiple options for alternate work locations. Not all work can be performed at home; some may need an alternate office setting, or may be moved to another company office and performed there by a local team. Regardless, the SMEs will need to walk through what the plan would entail. What would they need to consider pre-event? Things like phones? Email access? Software: is it local PC-based or web-based? Client records? What changes may need to be made when an event occurs? The SMEs should document these items and seek help making whatever changes are needed to put these resilient plans in place.
For services that may be limited, SMEs need to identify those limitations, how they can be addressed, and any potential workarounds. All of this needs to be documented in the plan.
The last category would be the services that would stop because they could not be conducted offsite for various reasons identified. The SMEs need to document these services and seek leadership support to resolve the limitations if a solution is available. The plans for these services may include some partial workarounds; for example, the customer service department may take order requests that the sales team will follow up to confirm once the systems are available. These partial workarounds need to be documented.
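The three categories above (work that can move, work that continues with limits, and work that stops) can be captured in a simple plan register. What follows is only a minimal sketch in Python; the class names, fields, and sample entries are hypothetical illustrations, not a standard template.

```python
from dataclasses import dataclass, field
from enum import Enum


class Continuity(Enum):
    """How a business function behaves when its normal site is unavailable."""
    RELOCATE = "can be performed at an alternate location"
    LIMITED = "can continue with limitations or workarounds"
    STOPPED = "must stop until the site or systems return"


@dataclass
class FunctionPlan:
    # All field names are illustrative assumptions, not a prescribed format.
    name: str
    continuity: Continuity
    pre_event_needs: list[str] = field(default_factory=list)
    workarounds: list[str] = field(default_factory=list)

    def needs_leadership_review(self) -> bool:
        # Stopped services are raised to leadership to look for a solution.
        return self.continuity is Continuity.STOPPED


def group_by_continuity(plans: list[FunctionPlan]) -> dict[Continuity, list[str]]:
    """Group function names by category so gaps are easy to see."""
    grouped: dict[Continuity, list[str]] = {c: [] for c in Continuity}
    for plan in plans:
        grouped[plan.continuity].append(plan.name)
    return grouped


# Hypothetical entries based on the examples in this article.
plans = [
    FunctionPlan(
        "Customer service",
        Continuity.LIMITED,
        pre_event_needs=["phones", "email access"],
        workarounds=["take order requests; sales confirms once systems are available"],
    ),
    FunctionPlan("Accounting", Continuity.RELOCATE, pre_event_needs=["web-based software"]),
    FunctionPlan("Cold storage fulfilment", Continuity.STOPPED),
]

grouped = group_by_continuity(plans)
to_review = [p.name for p in plans if p.needs_leadership_review()]
```

Keeping the register this small is deliberate: the point is that every function lands in exactly one category and every partial workaround is written down, not that the tooling is sophisticated.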
Other Scenario Plans
In developing your plans, you want to address the critical scenarios. Above, we walked through some considerations for alternate work locations, but you can consider Natural Disasters, Cyber, Data Loss/Breach, Power Outage, Civil Unrest/Pandemic, supply chain, and vendor issues. These would probably cover most scenarios to consider, and some may overlap.
Business SMEs must consider the identified scenarios, which are usually called out in the policy, and how the plans may change under each. The options available will depend on your company's unique services and partnerships. Still, SMEs should identify what they can plan for, what they might be able to plan for if resources are provided, and what they cannot plan for.
Multiple plan templates are available online, as well as software and companies that can help in this process. Just remember that, in the end, the company owns the plans and is responsible for their effectiveness. So do your due diligence and try not to settle for ‘good enough.’
Incident Response
Well-defined reporting, escalation, and response plans and processes can save millions. In one case, an employee reported a potential security issue, but the help desk did not classify it accurately, which allowed the intrusion to continue and eventually led to a ransomware attack. The company was found to be negligent and, therefore, partially responsible for the loss, so its insurance covered only a tiny portion of it. This is just one real-life example of why planning to succeed in a response is vital.
Though this example is cyber-related, I have personally witnessed a company ignore a leak detector going off, which eventually resulted in the explosion of multiple homes. At another company, door and window sensors were triggered; guards investigated and found nothing, so the security center silenced the alarms, and equipment worth high six figures was stolen. Both losses could have been avoided with proper reporting, escalation and response plans.
Often, the most overlooked part of planning is how the business manages reporting, escalation and response to potential issues. Even large companies can fail to perform well in this area, even though they have dedicated departments to manage it. I have learned that consistency is the key to suitable identification, escalation, and response. How an employee reports a concern often depends upon where in the company they work, the issue, and what time zone or country they may be in. Depending on the path a report takes, the threat may be classified differently and follow a different escalation path, resulting in more confusion and potential for delay. Finally, what are the criteria for escalating to incident response, and who is responsible for managing the response?
A good incident response starts with how items are reported. It does not matter if it is through a phone call, email, or internal submission form; every item should be reported to the same team, documented, followed up on and correctly classified. It is always better to classify something at a higher risk initially and then adjust as more is learned than to misclassify something and risk a loss. I was told once by a past employer, “You will never be fired for calling the troops, but you will be for not making the call!” They were right; focusing more on a potential issue is better than waiting and seeing.
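The reporting principles above (one intake team for every channel, a documented record, and classifying high until more is known) can be sketched in a few lines of Python. This is an illustrative sketch only; the team name, severity levels, and function names are assumptions made for the example, not part of any particular ticketing product.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


INTAKE_TEAM = "central-intake"  # hypothetical name for the single receiving team


@dataclass
class Report:
    channel: str              # "phone", "email", "form", ...
    summary: str
    severity: Severity
    assigned_team: str = INTAKE_TEAM
    history: list = field(default_factory=list)  # documented reclassifications


def intake(channel: str, summary: str, suggested: Optional[Severity] = None) -> Report:
    """Every channel produces the same record, routed to the same team.

    When nobody has classified the item yet, start HIGH and adjust later.
    """
    severity = suggested if suggested is not None else Severity.HIGH
    return Report(channel=channel, summary=summary, severity=severity)


def reclassify(report: Report, new_severity: Severity, reason: str) -> None:
    """Change severity only with a recorded reason, so follow-up is traceable."""
    if not reason.strip():
        raise ValueError("a reason is required to change severity")
    report.history.append((report.severity, new_severity, reason))
    report.severity = new_severity


report = intake("phone", "Fuel leak detector keeps going off")
reclassify(report, Severity.MEDIUM, "Site visited; sensor reading confirmed, testing scheduled")
```

Requiring a reason for every change means a downgrade is a documented decision rather than a silenced alarm.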
The second thing to consider is how an issue is escalated. There needs to be an obvious escalation path; ideally, it should be the same every time, except for which teams are included. For example, would those homes have blown up if the site manager had contacted the help desk to report that a fuel leak detector kept going off? The help desk would have taken the report and sent it to the appropriate team for management. That team could not only have gone to the site to validate the alarm and perhaps performed some testing, but could also have gone through the fuel reports and perhaps caught the difference between the volume of fuel delivered over time, the volume sold, and what remained in the tank, ultimately identifying the root cause: a slow leak of fuel into the sewer system.
Unfortunately, the site manager and employees kept silencing the alarm every few hours for perhaps months, resulting in the explosion of multiple homes.
Escalating an issue is crucial, and follow-up is imperative. It is far better for a team to respond to a matter and eventually lower the priority once it is better understood than to fail to address the risk correctly.
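The fuel-report check mentioned in this example is simple arithmetic: what was in the tank, plus what was delivered, minus what was sold, should roughly equal what is in the tank now. Here is a rough Python sketch of that reconciliation. The figures and the 0.5% tolerance are made up for illustration; real thresholds would come from the site's own standards and any applicable regulations.

```python
def unexplained_loss(opening: float, delivered: float, sold: float, closing: float) -> float:
    """Volume that should be in the tank but is not.

    expected closing = opening + delivered - sold
    loss             = expected closing - measured closing
    """
    return opening + delivered - sold - closing


def should_escalate(periods: list[tuple[float, float, float, float]],
                    tolerance_fraction: float = 0.005) -> bool:
    """Escalate when cumulative loss exceeds a fraction of cumulative sales.

    Each period is (opening, delivered, sold, closing) in the same unit.
    The default tolerance is a hypothetical value for this example.
    """
    total_loss = sum(unexplained_loss(*p) for p in periods)
    total_sold = sum(p[2] for p in periods)
    return total_loss > tolerance_fraction * total_sold


# Illustrative readings for two periods, in gallons.
periods = [
    (10_000, 8_000, 7_900, 10_020),  # expected 10,100, measured 10,020
    (10_020, 8_000, 8_100, 9_830),   # expected 9,920, measured 9,830
]

losses = [unexplained_loss(*p) for p in periods]
escalate = should_escalate(periods)
```

Looking at cumulative loss across periods matters here because a slow leak can stay inside the tolerance in any single period while still adding up over months.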
Last is the incident response, which occurs once the matter is confirmed as an incident. An incident occurs when an issue's impact is confirmed, felt or is happening. The incident response team should have been identified in the plans and processes. Depending on the company's size and what it does, who responds may vary, but there are some consistencies regardless of size. FEMA’s National Incident Management System (NIMS) “guides all levels of government, nongovernmental organizations and the private sector to work together to prevent, protect against, mitigate, respond to and recover from incidents.” It is an excellent guide to apply to your company's overall incident response model, with considerations to scale.
Considering the FEMA model, and I will keep it very simple here, you will need an Incident Commander; an Incident Response Team, usually a set of identified business leaders who can make high-level decisions (including legal and HR); and an appropriate Tactical Response Group (TRG). The TRG should be appropriately staffed with experts in the area of concern. For example, if a facility with a data center is flooded, the TRG may consist of IT management, InfoSec, Cyber Security, and Facilities. This team will meet at a predefined cadence to discuss the situation, concerns, actions taken, and progress on the currently implemented plans. Simplistically, they are the acting chiefs, ensuring that each impacted area of the business has its concerns heard and addressed, and prioritizing the response activities based on the plans and the criticality of the impact (per the BIA discussed in previous articles).
The incident response procedures aim to address any potential issue adequately before it escalates to an impact. Most companies have issues reported daily, with most never becoming incidents. Still, those that can potentially become incidents need to be addressed early and with the right resources, resulting in reduced impact and loss.
Plans are like Google Maps, guiding you in the right direction.
I like to think of the plans I help develop as a map. I am in Arizona but want to go to New York. Today, we may open up Google Maps on our phone and enter our destination, and it will provide a route. As we drive, if there is a road closure or an accident, Maps may provide an alternate route to save time. If you need gas, search for gas stations on the route, and Maps will direct you to a chosen station. Maps will provide options for your route based on the data it has and what you provide to it.
The Google Maps example is much like your plan. You know you need to recover or maintain services during an impact, so you have to follow the plan with consideration for the actual impact. So, as much as the plans need to be detail-oriented, they must be adaptable based on the actual event.
Understand that your plan will not necessarily lay out the exact path to recovery. However, having a plan and a consistent management process can reduce the time it takes to take appropriate actions, leaving the company in control of the event. Having no plans, or managing incidents poorly, can cause delays or overlooked risks and result in reputational impacts, legal and financial losses, market share decline or potential closure.
My next article will address the Check portion or the third phase of the PDCA lifecycle. How can you ensure your plans are accurate and address the risk before it matters? It does not matter what size your company is; implementing this lifecycle will allow your risk management processes to mature, ensuring the company's sustainability.
James Knox is a resiliency expert with an innovative spirit who thrives when building meaningful solutions to various daily problems in the corporate world. He is an avid outdoorsman and loves extreme rock crawling, fishing, and hunting. As a survivalist, James has learned from necessity how to prepare for life’s bumps and thrive with practical and sensible solutions, supporting his family's self-sustaining lifestyle.