Disasters come in many forms. There are the things you think of when you hear the word: earthquakes, floods, and other natural events. There are man-made disasters, like power outages and electrical fires. For IT professionals, there are malware and ransomware disasters happening every day.
Obviously, no company wants to deal with any of these scenarios. Unfortunately, not wanting to deal with a disaster doesn’t mean it won’t happen. With the number of threats seeming to grow by the day, having a comprehensive disaster recovery and backup solution in place has never been more important.
One solution that has been gaining steam recently is disaster recovery as a service (DRaaS). It can provide significant benefits when it comes to cost and performance compared to traditional solutions. On top of that, it is better equipped to make sure your company recovers quickly and completely in case of an outage. However, it still presents its share of challenges.
While the service model takes many aspects of the disaster recovery process off your plate, you still need to do your research ahead of implementing a solution to make sure you get the features and capabilities you need. For most organizations, that means three very broad buckets: returning to normal, data loss threshold, and performance.
This article will look at each of these buckets to give you a baseline of questions to ask, and features and strategies to consider, on your road to implementing a DRaaS solution.
Returning to normal operations
What’s the goal of any DR solution? Returning your organization to normal operations as quickly and smoothly as possible. A recovery time objective (RTO) of 0 is ideal, but hard to come by in the world of DRaaS due to cloud latency challenges and unpredictability. Achieving a reasonable RTO depends on how much data you have, how often you need to access it, how sensitive it is, and many other factors. If your data and applications are complex, there are a few things you have to consider right off the bat.
First, there’s the very real possibility that you may have to operate in the cloud for longer than you’d like if a major incident affects a data center. In these scenarios, latency and other issues can prevent “normal” from happening, so technologies like edge computing and a hybrid approach to your data can help with both performance and cost.
Whenever you do start getting back up and running in your “normal” environment, it can be quite an undertaking, especially if your data footprint is on the complex side. Even “fast-sync,” which synchronizes only the items that have changed since an outage, can feel like it was named ironically when it takes weeks. Factor this into your evaluation of different DRaaS solutions. If you have a complex environment and needs, for example, bulk shipments and other options might make sense so you can cut your restore time down and get back to “normal” quickly.
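To make the network-versus-shipment trade-off concrete, here is a rough back-of-the-envelope sketch. The function names, the 70 percent link efficiency, and the idea of comparing against a fixed shipment turnaround are all illustrative assumptions, not figures from any particular DRaaS provider; plug in your own numbers.

```python
def transfer_days(data_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate the days needed to move data_tb terabytes over a network link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable (protocol overhead, contention); 0.7 is illustrative.
    """
    if data_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("expected data_tb >= 0, link_gbps > 0, 0 < efficiency <= 1")
    bits = data_tb * 1e12 * 8                      # decimal terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400                        # seconds -> days


def prefer_bulk_shipment(data_tb: float, link_gbps: float,
                         shipment_days: float, efficiency: float = 0.7) -> bool:
    """Return True when shipping media is estimated to beat the network restore."""
    return transfer_days(data_tb, link_gbps, efficiency) > shipment_days


if __name__ == "__main__":
    # Hypothetical environment: 100 TB to restore over a 1 Gbps link,
    # with a provider that could deliver a bulk shipment in 5 days.
    days = transfer_days(100, 1.0)
    print(f"Estimated network restore: {days:.1f} days")
    print("Consider bulk shipment:", prefer_bulk_shipment(100, 1.0, shipment_days=5))
```

An estimate like this ignores change rate during the restore and any throttling on the provider side, so treat it as a starting point for questions to ask a vendor rather than a prediction.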
Consider: Recover without moving data
One of the terms that appears in DRaaS writing most frequently – including this piece – is “time.” It’s with good reason: all disaster recovery solutions are in a battle with time, and virtually all disaster recovery metrics have something to do with time. The less time you’re out, the less time to return to normal, the better.
Things like reading data from a disk appliance, transferring it across a network, and writing it to primary storage can take considerable time. All these steps can conspire to make recovery in place a critical feature. Basically, it saves transfer time by enabling access to protected data while that data is still on the backup storage solution. It can even let applications instantiate on a backup server.
Recovery without moving data has seemed elusive, but is a reality for enterprises today. Look for edge-based solutions that take the replication out of protection and focus on keeping a single, durable copy of data that’s available when and where it’s needed.
Data loss threshold (RPO)
As discussed in the previous section, getting up and running after an outage is the main reason DRaaS exists in the first place. The right DRaaS solution can make an outage just a blip on your company’s radar; the wrong one can mean a lengthy process before you get back to normal operations. Before evaluating what DRaaS providers can do, however, you must know what your company needs.
Drilling down from the overall “return to normal” discussion, a good place to start is with recovery point objective (RPO), which is basically your exposure and threshold to data loss. Go through your workloads and data and ask questions. What is realistically acceptable loss? How often do those large video files need to be backed up? Which applications are business critical, and which are less so?
Once you answer all these questions, and many more, you can start considering whether a provider’s offering is sufficient. If you need an RPO of five minutes – i.e. you can have no more than five minutes of potential data loss – make sure the DRaaS solution can effectively take an image of your system that often.
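One way to sanity-check a provider’s claim is to look at the actual recovery points it produces and measure the largest gap between them. The sketch below assumes you can export snapshot timestamps from the service; the function names are hypothetical and not part of any vendor’s API.

```python
from datetime import datetime, timedelta


def worst_case_gap(snapshots: list[datetime], now: datetime) -> timedelta:
    """Return the largest interval between consecutive recovery points.

    The gap between the most recent snapshot and `now` is included, because
    an outage at this moment would lose everything written since then.
    """
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    points = sorted(snapshots) + [now]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def meets_rpo(snapshots: list[datetime], now: datetime, rpo: timedelta) -> bool:
    """Return True when no gap between recovery points exceeds the RPO."""
    return worst_case_gap(snapshots, now) <= rpo


if __name__ == "__main__":
    # Illustrative data: snapshots at 12:00, 12:05, and 12:09, checked at 12:12.
    start = datetime(2024, 1, 1, 12, 0)
    taken = [start, start + timedelta(minutes=5), start + timedelta(minutes=9)]
    check_time = start + timedelta(minutes=12)
    print("Worst-case gap:", worst_case_gap(taken, check_time))
    print("Meets 5-minute RPO:", meets_rpo(taken, check_time, timedelta(minutes=5)))
```

Running a check like this over a few weeks of history, rather than a single day, gives a more honest picture of whether the advertised interval holds up under load.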
What if you need an RPO of 0, as many organizations do? When you’re copying data to multiple locations for protection, there is inevitably lag between when it’s written and when it’s replicated, so RPO simply can’t be 0. Solutions that don’t rely on replicating your data multiple times to protect it, and instead keep one copy of data that is accessible at all times, can get you there.
Consider: Rapid testing
A real benefit new models for DRaaS have over traditional backup and disaster recovery is the ability to do rapid testing. Think of the last time you conducted a real, thorough test of a secondary datacenter. Was it on a weekend? Did you have to bring your IT offsite? Did you have to plan and run a disaster recovery drill manually? As you answer “yes” to these questions, honestly ask another one: Did the pain of these tests make it so you tested less frequently than you should? (Don’t answer that.)
With DRaaS, a full test can involve nothing more than clicking a button that isolates your network and initiates recovery. This allows most companies to test regularly and thoroughly, so they are genuinely prepared when disaster strikes.
Identify performance requirements
You often hear that disaster recovery as a service is there to protect your company in the event that something bad happens. As we’ve discussed throughout this article, though, something eventually will happen to cause an outage. It’s inevitable. Even cloud providers experience outages every now and then. After all, the cloud is still infrastructure; it’s just maintained by a provider instead of your IT department.
This is one of the reasons multi-cloud approaches are often the safest choice. Using multiple providers means that if one goes down, you still have redundancy and aren’t locked into a single provider for recovery. It also creates natural competition for your business, which can bring cost savings and performance gains.
Consider: Maximize Performance for Each App
Not only does avoiding vendor lock-in let you get the best deal, if you do it right you can get some real performance gains, as well. Just like it makes sense from a continuity point of view to make use of different vendors in different locations, it makes business sense to do the same based on your apps. No two cloud providers are alike, and as you are all too aware, no two applications are alike, either.
Some apps have bursts in usage, while others are nice and steady; some apps don’t need to ping the network and database very much, while others are extremely data intensive. As anyone in IT knows, these differences just scratch the surface: each app has a unique usage profile and its own performance and data needs. For example, Google performs especially well with AI functions, and Microsoft applications, like SQL Server, tend to perform especially well on Azure.
The bottom line? All apps are different, all cloud providers are different. Take the time, do the research, and determine what cloud provider works best with each of your apps.
Nobody likes to think about disaster recovery and backup, because they force us to consider situations we hope will never happen. Having a comprehensive DR and backup plan and solution, however, can reduce the pain one of these unpleasant scenarios causes for your organization. Increasingly, DRaaS options are emerging that work for companies of all sizes, across industries. By considering some of the concepts discussed here, you can choose a DRaaS solution that gets you back to normal smoothly when an outage does happen.
Laz Vekiarides is the CTO and co-founder of ClearSky Data. Prior to ClearSky Data, he was an executive at EqualLogic and Dell. He is an expert in data storage, virtualization and networking technologies, and holds several storage technology patents.