Being prepared is more than just a scout motto. It is a must-have for any company that wants to survive a disaster, natural or man-made. Some events, such as hurricanes and snowstorms, come with warnings, giving companies time to batten down the hatches and communicate with customers and partners ahead of time. Others (a tornado, a fire, or even something as simple as a local work crew knocking out the main power line) come with little or no warning at all. You have to be prepared for anything.
Although natural disasters garner much media attention (and we have already seen several such events in 2015), these incidents cause only a small share of business continuity issues. According to the State of Global Disaster Recovery Preparedness report, only about 14 percent of incidents are weather-related. Most data center downtime is brought on by software/network failure (50 percent), human error (44 percent), and power failure (24 percent).
Regardless of the cause, any degree of downtime is detrimental to a business. The report reveals that the cost of losing critical applications to system outages can be as high as $5,000 a minute. Given this statistic, it’s easy to see how disasters can drive businesses to downsize or even close their doors.
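To put that figure in perspective, here is a minimal sketch of the arithmetic in Python. The $5,000-per-minute rate is the upper-bound figure quoted from the report; the function name and the outage durations are illustrative assumptions, not data from the study.

```python
# Upper-bound cost figure cited from the report (USD per minute of outage).
COST_PER_MINUTE_USD = 5_000


def outage_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Estimate the cost in USD of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("outage duration cannot be negative")
    return minutes * cost_per_minute


# Hypothetical outage durations, chosen only for illustration.
for label, minutes in [("30 minutes", 30), ("4 hours", 4 * 60), ("1 day", 24 * 60)]:
    print(f"{label}: ${outage_cost(minutes):,}")
```

At the upper-bound rate, even a half-hour outage runs to six figures, and a full day runs into the millions.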
A solid business continuity/disaster recovery (BC/DR) plan ensures that your organization survives these events. It clearly outlines how your company will mitigate risk, continue to bring in revenue, and avoid those costly technology outages. It also gives employees peace of mind that if an event were to occur, their livelihoods would be secure.
However, only a small percentage of companies have a well-thought-out and tested plan in place. The State of Global Disaster Recovery Preparedness study found that 60 percent do not have a documented DR plan, and 40 percent said that their current plan was ineffective when used to respond to an event. Companies can no longer play Russian roulette with BC/DR. It is time to get serious about avoiding costly outages and institute a plan that addresses the following four areas.
People: Protect the Lifeblood of the Company
Your organization’s data and technology infrastructure are valuable and must be protected, but people are the lifeblood of any company. Thus when a disaster occurs, you must have established safety protocols and procedures in place to protect those employees who are on-site at the office.
Implement a communication procedure for how employees will be contacted as soon as possible in the event of a disaster. This plan should have a current contact database that includes the names, home and cell phone numbers, and email addresses of all personnel, plus an emergency contact for each. Make sure to have employees review and update this information yearly. There should be redundant contact methods in case one mode of communication is not working. An old-fashioned phone tree or automatic notification system can easily be put in place to notify employees of office closures as well as provide information on how to access IT systems, local disaster resources, and payroll changes. Your company may have a loyal staff, but if payroll is delayed for an extended period, people may seek other job opportunities.
Appoint a key decision maker who will kick off and oversee the execution of the BC/DR plan. Designate a backup contact in case that individual is on vacation or otherwise unavailable. Your plan should identify the key operational personnel within the appropriate business departments, not just those within the IT business unit, and clearly outline their roles during the disaster. The IT team must give them the ability to work from another office, a shared workspace, or another remote location.
Processes: Keep the Business Running
So your data center is down and the company’s office is out of commission, but customers still need to be served and goods must be shipped. Bottom line: you need to enable the company to keep making money. What is the company’s risk mitigation plan?
When evaluating which systems and processes are crucial for operations, a key item to address is how much downtime your company can afford. For example, if your website is one of the first things the CEO demands be brought back online, it may be best to set aside the ecommerce section of the site at first and treat the site simply as a lead generation mechanism for potential customers. Another concern is the time it will take to bring data and processes back online. Nowadays with the cloud, it is easy to store things in Amazon, Google, or other public clouds, but it could take days to get that data back. Knowing the recovery time objective (RTO), the amount of time a system can be down without causing significant damage and costs, is critical.
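A rough way to see why cloud retrieval can take days is to estimate the transfer time and compare it with the RTO. The following Python sketch does that under loud assumptions: the data size, throughput, and RTO values are hypothetical, and the estimate assumes a sustained transfer rate with no provider-side retrieval delay, so a real restore will likely be slower.

```python
def restore_hours(data_gb, throughput_mbps):
    """Estimate hours needed to pull `data_gb` gigabytes at `throughput_mbps`.

    Uses decimal units: 1 GB = 8,000 megabits.
    """
    if throughput_mbps <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 8_000 / throughput_mbps
    return seconds / 3600


def meets_rto(data_gb, throughput_mbps, rto_hours):
    """Return True if the estimated restore time fits within the RTO."""
    return restore_hours(data_gb, throughput_mbps) <= rto_hours


# Hypothetical scenario: 10 TB of data, a 500 Mbps link, and a 4-hour RTO.
hours = restore_hours(10_000, 500)
print(f"Estimated restore time: {hours:.1f} hours")
print("Meets 4-hour RTO:", meets_rto(10_000, 500, rto_hours=4))
```

In this made-up scenario the transfer alone takes nearly two days, far beyond a four-hour RTO, which is the kind of gap worth discovering before an outage rather than during one.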
Additionally, key personnel should receive step-by-step guidelines on how to work remotely and access mission-critical data and applications. These employees must also practice their roles; this helps to uncover the interdependencies across various departments and allows the plan to be modified to best fulfill those needs.
Although technology is an essential element of your company’s infrastructure, the BC/DR plan must include other daily business operations from payroll and HR to customer service and order processing. You must factor in how these areas will continue to function if the staff is working remotely. And if the outage will be for any extended period, know how to communicate with customers, partners, and regulatory bodies, or even the media, about the restoration process or why the company may be slower to respond to inquiries.
Infrastructure: Make Technology Available
Let’s picture for a moment … the fast swipe of windshield wipers, ineffectual against a torrent of rain, a car shaking from gusts of wind, and you hunched over the steering wheel hoping that a hurricane hasn’t brought flood waters into the data center! This (admittedly dramatic) white-knuckled dash to collect a backup tape containing your company’s crucial data is nothing you want to experience in this or any lifetime. Yet it has happened to some because they didn’t have their organization’s data backed up to a secondary site. By replicating your data and infrastructure at another physical or virtual location, you can add a layer of redundancy to the overall IT framework. This ensures that mission-critical information and systems are easily retrievable, and business can get up and running quickly.
In years past, IT organizations backed up data onto physical tapes that had to be transported to an off-site storage location (sound familiar?). The problem is that backup took hours, was often incomplete, and required on-premises staff for switching and transporting the tapes. Securing this type of media was also fraught with issues. However, with the evolution to disk storage and now cloud platforms, companies have a fast, cost-efficient method of replicating data and applications and then accessing that data within minutes of an outage or disaster.
When using the cloud as part of your BC/DR plan, carefully read the fine print of the service provider’s service level agreements (SLAs) for information retrieval, as the industry standard has shrunk from days, to hours, to mere minutes. Make sure that the provider’s guarantee of 100 percent uptime is really 100 percent. Some SLAs only count downtime events that last longer than 30 consecutive minutes, meaning you could have 25-minute outages five times a day and the provider would still count that as 100 percent uptime. Then, along with the key personnel and executives, clearly decide which applications are business-critical so these are brought back online first. For instance, the customer service database or billing application will likely take precedence over an internal sales chatter site. Realistic RTOs and recovery point objectives (RPOs), the amount of data a company can afford to lose, must be set for each data set, system, server, and application to ensure that SLAs are met. When service providers deliver on the pre-determined SLA, RTO, and RPO requirements, you can eliminate risk, minimize revenue loss, and lessen damage to the company’s brand reputation.
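The SLA fine-print example can be worked through in a few lines of Python. This is only a sketch of the arithmetic: the function and parameter names are illustrative assumptions, and the 30-minute threshold and five 25-minute outages are the figures from the scenario described above, not terms from any particular provider's contract.

```python
MINUTES_PER_DAY = 24 * 60


def uptime_percent(outages_min, period_min=MINUTES_PER_DAY, counted_above=0):
    """Percent uptime over a period, counting only outages that last
    longer than `counted_above` minutes (0 counts every outage)."""
    counted = sum(m for m in outages_min if m > counted_above)
    return 100 * (period_min - counted) / period_min


# Five 25-minute outages in one day, as in the scenario above.
outages = [25] * 5

actual = uptime_percent(outages)                      # every outage counts
reported = uptime_percent(outages, counted_above=30)  # SLA ignores short outages

print(f"Actual uptime:   {actual:.2f}%")
print(f"Reported uptime: {reported:.2f}%")
```

Under these assumptions the service is actually down for 125 minutes of a 1,440-minute day, yet the SLA-style calculation still reports a perfect score.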
Testing: Get Recovery Assurance
However many hours you work on creating a BC/DR plan, it is all for naught if it is never tested. What may look good on paper may not work in the event of an actual disaster. With company growth, expansion of goods and services, and an evolving IT infrastructure, this plan cannot be set-it-and-forget-it. The BC/DR plan should be evaluated on a quarterly basis by all the parties involved and then a mock test run across the entire company.
It’s similar to how fire drills are conducted a few times each year in public schools. The drills prepare students and guide them through the process of what would happen if an event were to occur. The same holds true for the BC/DR plan. An employee who knows what to expect when a service outage or weather-related disaster occurs will be less anxious and can focus on his or her role in keeping the company operating. Schedule a mock disaster at your company at least once a year. This allows you to test out your phone tree, the procedure for employees remotely logging into IT infrastructure, and how executives can clearly communicate decisions around company operations.
On the IT side of the business, there is a second aspect to the testing of a BC/DR plan. You have to make sure that employees can access applications and data from home offices, but your team must also ensure that the data is properly backed up and available. Automated processes and technologies can assist in creating a solid IT recovery plan that helps make sure that everything is backed up correctly and can be recovered within the allotted SLA and RTO timeframes. Technologies available today automate governance and testing of various systems and provide a level of recovery assurance that is unattainable with manual processes.
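As a rough illustration of what an automated check might look like, the Python sketch below compares the age of the latest backup against the RPO and the duration of the latest test restore against the RTO. Everything here is hypothetical: the `RecoveryCheck` structure, the system names, and the timings are invented for the example and do not describe any specific product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryCheck:
    """Hypothetical record of one system's latest backup and test restore."""
    system: str
    last_backup: datetime    # when the most recent backup completed
    test_restore: timedelta  # how long the most recent test restore took
    rpo: timedelta           # maximum tolerable data loss
    rto: timedelta           # maximum tolerable downtime


def evaluate(check, now):
    """Return a list of problems found for one system (empty if none)."""
    problems = []
    if now - check.last_backup > check.rpo:
        problems.append("backup older than RPO")
    if check.test_restore > check.rto:
        problems.append("test restore exceeded RTO")
    return problems


# Invented example data, for illustration only.
now = datetime(2015, 6, 1, 12, 0)
billing = RecoveryCheck("billing", datetime(2015, 6, 1, 11, 30),
                        timedelta(minutes=20), timedelta(hours=1),
                        timedelta(minutes=30))
chat = RecoveryCheck("sales-chat", datetime(2015, 5, 30, 12, 0),
                     timedelta(hours=3), timedelta(hours=24),
                     timedelta(hours=8))

for check in (billing, chat):
    print(check.system, evaluate(check, now) or "OK")
```

Run on a schedule, a check along these lines would flag a stale backup or a slow restore long before a real disaster exposes it.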
People, Process, Infrastructure, and Testing: Ensure ‘Business’ Recovery
How confident are you that your organization can continue to operate, ship goods, provide professional services, and meet customer needs in the event of a disaster or outage? Only a well-documented, communicated, and battle-tested BC/DR plan that addresses all of these key areas will ensure not just the availability of your organization’s data, applications, and systems, but that your business will recover quickly.
Dave LeClair is the vice president of product marketing at Unitrends, a leading data protection and disaster recovery provider.