As organizations accelerate their move to the cloud, discussions with professionals in other industries show that many leaders now believe they are protected from downtime simply because they are in the cloud. For continuity professionals, that belief causes heartburn, because the cloud is just someone else’s data center. The foundation of disaster recovery plans and procedures does not go away because the company no longer manages the infrastructure itself.
Over the last year, many big cloud providers have seen outages that took down both their own internal applications and their customers’ applications. The cloud is not inherently more redundant than running your workloads in your own data center or colocation facility. What the cloud provides is agility and speed, because infrastructure no longer has to be bought, racked, stacked, and configured before the environment can grow. Many companies want to get back to their core tenets, which may mean downsizing their IT infrastructure footprint for financial reasons, especially considering the difference between capital expenditures and operating expenses.
The cloud can bring many otherwise unattainable features, such as geographically separate locations, so your data can be in two places and highly available. However, that applies only if the organization pays for that level of redundancy, which can be more expensive because of the cost of maintaining multiple instances. In addition, being highly available means corruption can have a significantly higher impact on your environment, because corrupted data is replicated to every copy. Disaster recovery and backup are still necessary, which is another cost many do not consider when thinking about how moving to the cloud will save money.
Moving to the cloud shifts the risk: instead of controlling your own data and destiny, you put them into the hands of someone who has many customers to satisfy. If a cloud vendor runs its own services on the same servers, its disaster recovery plan may require it to bring itself back online before it responds to its customers. Contracts and service level agreements are essential, but they can provide only so much protection.
Ensuring your critical applications and processes have the level of redundancy your organization requires is very important. Still, consideration must also be given to the upstream and downstream applications, techniques, and authentication methods. In the last year, there was an outage that affected the authentication of users into applications that relied on one cloud provider’s native authentication platform. The applications were up, but if users can’t authenticate to them, they are not up and accessible from end to end. All of those various microservices become much more important because they may not even be hosted with the same cloud provider, which complicates recovery.
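The end-to-end point above can be put into rough numbers. The following is a minimal sketch, assuming every dependency must be up for a user to reach the application and that failures are independent; the component names and availability figures are hypothetical and do not come from any provider.

```python
from math import prod
from typing import Mapping

# Hypothetical availability figures (fraction of time each component is up).
# Illustrative assumptions only, not measurements from any provider.
dependencies = {
    "application": 0.9995,
    "authentication": 0.999,
    "database": 0.9995,
}


def end_to_end_availability(components: Mapping[str, float]) -> float:
    """Availability of a chain in which every component must be up.

    Assumes failures are independent, so the availabilities multiply.
    """
    return prod(components.values())


def weakest_link(components: Mapping[str, float]) -> str:
    """Name of the component with the lowest availability."""
    return min(components, key=components.get)


overall = end_to_end_availability(dependencies)
print(f"End-to-end availability: {overall:.4%}")
print(f"Weakest dependency: {weakest_link(dependencies)}")
```

Under these assumptions the chain is always less available than its weakest component, which is why an authentication outage counts as an application outage from the user's point of view.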
From a practitioner standpoint, developing exercises to test these features and capabilities can be challenging, especially if multiple strategies and cloud providers are being used across an organization. Testing may become more fractured because the plans an application needs do not align: one component may have to be failed over to another region while another is recovered from a backup. Another common issue is that recovery point objectives (RPO) and backup retention windows change because the cloud services cannot meet what had been the organization’s standard. The organization has already done the assessments to determine which RPO and recovery time objectives (RTO) are acceptable, and the push to the cloud to reduce costs should not lead to more risk.
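One way to keep those objectives visible during a migration is to compare each component's agreed RPO and RTO with what the cloud service actually delivers. The sketch below is illustrative only: the `RecoveryProfile` class, the component names, and every duration are hypothetical, and it assumes the worst-case data loss equals the interval between recovery points.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryProfile:
    """Hypothetical record of one component's objectives and capabilities."""
    name: str
    rpo: timedelta                   # maximum tolerable data loss
    rto: timedelta                   # maximum tolerable downtime
    backup_interval: timedelta       # time between recovery points in the cloud service
    tested_recovery_time: timedelta  # duration measured in the last exercise


def find_gaps(profile: RecoveryProfile) -> list:
    """Return a description of every objective the component fails to meet."""
    issues = []
    # Assumption: worst-case data loss is one full backup interval.
    if profile.backup_interval > profile.rpo:
        issues.append(
            f"{profile.name}: backup interval {profile.backup_interval} exceeds RPO {profile.rpo}"
        )
    if profile.tested_recovery_time > profile.rto:
        issues.append(
            f"{profile.name}: tested recovery {profile.tested_recovery_time} exceeds RTO {profile.rto}"
        )
    return issues


# Illustrative values only.
orders_db = RecoveryProfile(
    name="orders-db",
    rpo=timedelta(hours=1),
    rto=timedelta(hours=4),
    backup_interval=timedelta(hours=24),
    tested_recovery_time=timedelta(hours=3),
)
web_frontend = RecoveryProfile(
    name="web-frontend",
    rpo=timedelta(hours=4),
    rto=timedelta(hours=1),
    backup_interval=timedelta(hours=1),
    tested_recovery_time=timedelta(minutes=30),
)

for profile in (orders_db, web_frontend):
    for issue in find_gaps(profile):
        print(issue)
```

A check like this does not replace an exercise, but it can flag where a cloud service's default backup schedule has quietly weakened an objective the organization already agreed on.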
We must stop treating the cloud as some mythical being that solves every organization’s problems, because it is just someone else’s data center. The company is ultimately responsible for ensuring it is protected. Customers do not care that a vendor went down when they can’t access the services they are trying to use. Cloud environments need to go through the same cost-benefit analysis and risk assessment that would be done before bringing infrastructure in-house. There are significant benefits to moving to the cloud, but companies can lose them all if precautions are not taken before signing a contract.
Cloud vendors should be held to the same standards that organizations hold themselves to: they should test their disaster recovery plans and have foundations in place such as diverse network carriers and generators for a power failure. Do they have good security controls to protect against ransomware attacks? What mitigating controls are they using? As organizations continue to move rapidly toward the cloud, continuity professionals have to update their plans, procedures, and playbooks for this reality. They must manage those exercises and events with the same care and precision as they would if the infrastructure were their own.
The cloud is not going anywhere, but we need to change how we think about it. Organizations’ services are now simply running in someone else’s castle, and that does not change the requirement that resiliency be at the forefront of everyone’s mind. It may not be cheaper to move to and run in the cloud. However, it can give you agility you may not have had previously. We have to consider the total cost of running in the cloud, from the potential refactoring of applications down to monitoring the environment for significant events. The decision to move may still be the same, but at least the costs are known upfront rather than it being a surprise that resilience comes with a high price tag.