Whether it’s a failure, an outage, or a test, every resilience activity is important for proving or enacting recoverability. It’s no surprise that resilience has become increasingly stressful for the teams involved, particularly when they are tasked with maintaining operations and stability during a global pandemic. The pressure to have a plan for even highly improbable eventualities has never been higher.
With this in mind, it surprises me how many organizations I talk to still rely on “how they’ve always done it”: the spreadsheets, the docs, the stale recovery plans, and the knowledge residing in key team members’ heads.
Resilience matters more than ever, and yet some of the approaches are decades old. So here are three simple tips to bring your resilience strategy forward and reduce the time and complexity involved when you need to test or recover.
- Keep your plans up to date. At Cutover, we talk about the ‘resilience gap’ a lot. It’s the space between the last update to your recovery plan and the most recent change you made in production. In one plan, this isn’t a monumental problem, but when resilience gaps accumulate across tens or hundreds of plans, it becomes one. Don’t underestimate how long it takes to bring a plan up to date before you can use it. Our advice is to take the time to make regular updates to your most critical plans, at a minimum. Failing that, find a solution that enables you to keep parity between your plans and the assets they relate to.
- Use templates to speed up plan creation and ensure consistency. Speed and consistency are crucial for best-in-class resilience, which is one of the reasons why we added templating functionality to our all-new Resilience Workspace in Cutover. By enabling teams to either use expert-built templates or build their own for key assets, we enable our customers to rapidly build standardized plans. Here are some suggestions for templates you could create:
  - Recovery plans for key applications and critical infrastructure
  - Business continuity management
  - Incident management
  - Facility assessments
  - Cyber assessments
  - Supply chain assessments
  - Pre-invocation checklist
- Ensure your plans are centrally located. Critical plans need to be quickly accessible by the people who need them and not hidden deep in someone’s files, inbox, or (even worse) head. No matter what format you choose for this, the initial effort of gathering everything together and increasing its accessibility will be worth it in the long run.
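To make the ‘resilience gap’ from the first tip concrete, here is a minimal Python sketch that measures, for each plan, how far production has moved ahead of the plan’s last update, and lists the plans that have drifted beyond a tolerance. The plan names, dates, field names, and the 30-day tolerance are all made up for illustration; this is not how Cutover calculates anything, just one way to reason about the idea.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Plan:
    name: str                    # hypothetical plan name
    plan_updated: datetime       # last update to the recovery plan
    last_prod_change: datetime   # most recent change made in production


def resilience_gap(plan: Plan) -> timedelta:
    """Time by which production is ahead of the plan (zero if the plan is current)."""
    return max(plan.last_prod_change - plan.plan_updated, timedelta(0))


def stale_plans(plans, tolerance: timedelta):
    """Return (name, gap) pairs whose gap exceeds the tolerance, largest gap first."""
    gaps = [(p.name, resilience_gap(p)) for p in plans]
    return sorted((g for g in gaps if g[1] > tolerance),
                  key=lambda g: g[1], reverse=True)


# Illustrative data only: three imaginary plans and their timestamps.
plans = [
    Plan("payments-recovery", datetime(2021, 1, 10), datetime(2021, 3, 1)),
    Plan("email-recovery", datetime(2021, 2, 20), datetime(2021, 2, 15)),
    Plan("core-network-recovery", datetime(2020, 11, 5), datetime(2021, 2, 28)),
]

for name, gap in stale_plans(plans, tolerance=timedelta(days=30)):
    print(f"{name}: plan is {gap.days} days behind production")
```

Sorting by the largest gap first gives you a simple order in which to update plans, starting with the ones that have drifted furthest from production.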
If you fail to prepare, you better prepare to fail.
Resilience is complex, which means that the activities enabling you to deliver it need to be as simple as possible. While the advice above requires some initial time and resourcing, I promise you it will be worthwhile when you come to plan for a test, or in those precious moments when invoking a recovery.
Ready to test better and recover faster?
After seeing common frustrations around speed, readiness, time to test or invoke a plan, lack of time for evaluation, and the ‘everything is everywhere’ scramble, we knew that teams would immediately see the benefits of a dynamic, responsive, highly visual, integrated place for all things resilience. So we built one: our Resilience Workspace. Would you like to bring your resilience activities together in a dedicated workspace, built and supported by experts, with pre-configured templates, saved views, real-time visibility, and performance insights, all in a fast and accessible environment? Find out more about our Resilience Workspace here.
Interested in hearing from more of the Cutover team and seeing the workspace for yourself? I spoke to Mark Heywood, Resilience expert, and some of the Cutover team to explore the challenges and opportunities when aiming for best-in-class resilience. In our on-demand discussion, we explore how our global customers are evolving their resilience strategies to thrive in the current climate. You can watch the session recording here.
Want to read more about how to take the risk out of resilience? Read 5 ways to test better and recover faster.