Wednesday, 04 November 2015 06:00

5 Disaster Recovery Options: Balancing the Pros and Cons, Objectives, and Cost

Written by Craig Hurley

Disasters are unavoidable. From the storm of the century to the backhoe severing a power line at a local construction site, disasters come in many forms. However, even the most mundane of "disasters" can have a devastating effect on your business if it keeps you from interacting with customers or destroys data. Here are the pros and cons of the most common disaster recovery strategies to help you decide which is best suited to your company's needs.

Two Key Metrics

When evaluating your options, there are two key metrics you should know: RPO and RTO.

RPO (recovery point objective) refers to the maximum span of time, e.g., 30 minutes or 4 hours, for which it is tolerable to lose data should a disruptive event occur. RPO largely determines the frequency of data replication required.

Some businesses could survive losing as much as a few hours’ worth of data. Others would suffer irreparable damage if they lost just a few minutes. A short RPO, such as 15 minutes, indicates very little data loss is acceptable; a longer RPO, such as 4 hours, indicates the business can tolerate losing more of its most recent data.

RTO (recovery time objective) refers to the window of time between a disruptive event and a return to operational status. RTO largely determines the class of equipment and size of connection to your provider that is necessary to meet your recovery objectives.

Highly seasonal businesses, especially in high-volume industries like retail, may decide that even an hour of downtime is more than they can afford. Other businesses may be able to keep things going without system access, at least for the afternoon, or maybe longer.
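To make the two metrics concrete, here is a minimal Python sketch that measures an incident against a set of targets. The targets, the timestamps, and the function names are hypothetical and chosen only for illustration; they are not drawn from any real incident or product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_copy: datetime, event: datetime) -> timedelta:
    """Data written after the last good copy is lost when the event occurs."""
    return event - last_copy


def downtime(event: datetime, restored: datetime) -> timedelta:
    """Time between the disruptive event and the return to operation."""
    return restored - event


# Hypothetical targets and timeline, for illustration only.
rpo = timedelta(minutes=30)
rto = timedelta(hours=4)

last_copy = datetime(2015, 11, 4, 9, 45)  # last completed replication
event = datetime(2015, 11, 4, 10, 0)      # disruptive event
restored = datetime(2015, 11, 4, 15, 0)   # systems operational again

loss = data_loss_window(last_copy, event)
outage = downtime(event, restored)

print(f"data loss {loss} within RPO: {loss <= rpo}")
print(f"downtime {outage} within RTO: {outage <= rto}")
```

In this made-up timeline, 15 minutes of data are lost, which is inside the 30-minute RPO, but the 5-hour outage misses the 4-hour RTO.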

Not All Applications Carry the Same Requirements

Not only do businesses have different RPO and RTO targets, applications within a business will have different requirements as well. For instance, customer-facing applications usually demand a shorter RPO and a lower RTO because the loss of data and downtime can have a severe impact on the business. Administrative applications may be able to withstand more downtime or a higher level of data loss.

Setting the RTO too long or the RPO too high can expose the organization to unacceptable levels of risk. Conversely, setting RPO and RTO at levels that are too aggressive leads to over-investment and ties up capital that could be spent in more productive ways.
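One way to reason about this trade-off per application is sketched below in Python. The application names, the targets, and especially the `slack` factor used to flag possible over-investment are assumptions made up for this example; a real assessment would weigh actual costs.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Objectives:
    rpo: timedelta
    rto: timedelta


def assess(required: Objectives, provided: Objectives, slack: int = 10) -> str:
    """Compare what an application needs with what its DR setup delivers.

    `slack` is an arbitrary illustrative factor: a setup that beats both
    targets by more than this multiple is flagged as possible over-investment.
    """
    if provided.rpo > required.rpo or provided.rto > required.rto:
        return "at risk"
    if provided.rpo * slack < required.rpo and provided.rto * slack < required.rto:
        return "possibly over-provisioned"
    return "ok"


# Hypothetical applications: (what the business needs, what the setup delivers).
apps = {
    "storefront": (Objectives(timedelta(minutes=15), timedelta(hours=1)),
                   Objectives(timedelta(hours=4), timedelta(hours=8))),
    "payroll": (Objectives(timedelta(hours=24), timedelta(hours=48)),
                Objectives(timedelta(minutes=1), timedelta(minutes=5))),
    "email": (Objectives(timedelta(hours=4), timedelta(hours=8)),
              Objectives(timedelta(hours=1), timedelta(hours=4))),
}

results = {name: assess(need, have) for name, (need, have) in apps.items()}
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

Here the customer-facing storefront is at risk, the payroll system is protected far more tightly than it needs to be, and email sits in between.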

Option No. 1 Synchronous Replication

With synchronous replication, also known as active/active replication, real-time ("live") copies of your data are created and stored offsite. In the event of a disaster or system failure, key systems fail over to the back-up site, keeping downtime and data loss to an absolute minimum.

Your disaster recovery site should be geographically separate from your main data center. On 9/11, too many businesses located in the World Trade Center in New York discovered the problem with keeping their data center in the building next door.

Pros


  • Little to no downtime. Recovery time is typically a function of boot time/order and any requisite DNS/routing changes. (Best RTO)

  • Can be configured for automatic failover. No loss of time due to manual switching.

  • Little to no data loss. Data is always current as the back-up site is always active. (Best RPO)

  • Solutions often leverage checkpoints and journaling, allowing the customer to fail over to a specific point in time.

  • Allows for failback to production.

Cons


  • Requires extra equipment and software. (Highest cost)

  • Requires reliable (redundant) network connectivity between locations.

  • Increased complexity. Need to manage both sites and the connection between them.

  • Imposes a low-latency requirement. Often this is under 10 ms, which may limit options for geographical diversity.
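To see why a latency ceiling of roughly 10 ms limits geographic diversity, the rough Python estimate below converts a latency budget into a maximum fiber path. It assumes the budget is a round-trip figure and that light travels about 200 km per millisecond in optical fiber (roughly two-thirds of its speed in a vacuum). The function names and the `overhead_ms` allowance for switching and storage delay are hypothetical. Real routes are rarely straight lines, so practical distances are shorter than this upper bound.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in optical fiber


def max_fiber_path_km(round_trip_budget_ms: float, overhead_ms: float = 0.0) -> float:
    """Upper bound on the one-way fiber path for a given round-trip budget."""
    usable_ms = max(round_trip_budget_ms - overhead_ms, 0.0)
    return usable_ms / 2 * FIBER_KM_PER_MS


def round_trip_ms(fiber_path_km: float) -> float:
    """Propagation delay only, out and back, for a given fiber path length."""
    return 2 * fiber_path_km / FIBER_KM_PER_MS


print(max_fiber_path_km(10.0))                   # 1000.0
print(max_fiber_path_km(10.0, overhead_ms=2.0))  # 800.0
print(round_trip_ms(100.0))                      # 1.0
```

Under these assumptions, a 10 ms round trip allows at most about 1,000 km of fiber between sites, and less once equipment delay is subtracted.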

Option No. 2 Asynchronous Replication

Asynchronous replication, also known as active/passive replication, or a warm site, refers to a disaster recovery strategy where near-real-time copies of your data are created and stored offsite, either at a service provider’s data center or at a second company location. Both synchronous and asynchronous replication strategies require a robust architecture that includes reliable connectivity between the primary and secondary sites. The desired RPO and the amount of data being modified will affect connectivity requirements.

Pros


  • Data is replicated continually. However, unlike synchronous replication, activity at the primary site does not have to wait for each copy to reach the disaster recovery site before continuing.

  • Utilizing snapshots allows you to replicate data on a scheduled basis at the time interval you specify.

  • No data restore is needed. After a disruptive event, replica data becomes active. (RTO/RPO second only to synchronous option)

  • Can be configured for automatic failover.

  • Allows for failback to production.

Cons


  • Equipment costs are high to medium.

  • Potential for some data loss. Data is only current as of the last replication point.
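The difference between the two replication styles can be illustrated with a toy Python model. This is a simplified sketch, not how any particular replication product works: the `Site`, `SyncReplicator`, and `AsyncReplicator` classes are invented for illustration, and an in-memory list stands in for storage at each site.

```python
from collections import deque


class Site:
    """A toy data store standing in for a production or recovery site."""

    def __init__(self):
        self.records = []


class SyncReplicator:
    """Each write completes only after the recovery site also holds it."""

    def __init__(self, primary, replica):
        self.primary, self.replica = primary, replica

    def write(self, record):
        self.primary.records.append(record)
        self.replica.records.append(record)  # caller waits for the remote copy


class AsyncReplicator:
    """Writes complete locally; copies are shipped to the replica later."""

    def __init__(self, primary, replica):
        self.primary, self.replica = primary, replica
        self.pending = deque()

    def write(self, record):
        self.primary.records.append(record)
        self.pending.append(record)  # returns without waiting for the replica

    def ship(self):
        """Drain the queue, e.g. continuously or on a snapshot schedule."""
        while self.pending:
            self.replica.records.append(self.pending.popleft())


sync = SyncReplicator(Site(), Site())
sync.write("order-1")
sync.write("order-2")

lagging = AsyncReplicator(Site(), Site())
lagging.write("order-1")
lagging.ship()
lagging.write("order-2")  # not yet shipped when the primary fails

print(sync.replica.records)     # ['order-1', 'order-2']
print(lagging.replica.records)  # ['order-1']
print(list(lagging.pending))    # ['order-2'] would be lost on failover
```

Whatever is still in the pending queue when the primary site fails is the data loss that the RPO has to cover.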

Option No. 3 Cloud Backup/Full System Recovery

With this approach, your data is backed up via the Internet or a dedicated circuit to a secure location. Features can vary significantly.

Pros


  • Easy to implement.

  • Medium to low cost. This approach provides a relatively short RTO and low RPO yet minimizes your investment in redundant systems.

  • Highly flexible. Back-up solutions can be tailored to recovery plans.

  • May be able to restore physical systems to virtual machines.

  • Bare metal restore can be combined with backups to shorten recovery time.

  • Backups can be automated. With monitoring, there is no need to worry about whether your data will be there in the event of a disaster.

Cons


  • Features vary. It is important to ensure it supports the specific applications you use.

  • Medium time to restore production. This approach is slower than synchronous or asynchronous replication, yet can be considerably faster than restoring from a traditional cloud backup or local tape.

  • Restores are a manual process.

Option No. 4 File Backup

The most significant downsides to file backups are the time it takes to restore the backup and the limited full-system recovery options.

Pros


  • Low cost. You only pay for what you use.

  • Good option for a low volume of data or relatively static data.

  • Web interface. Generally easy to use, even for the non-technical user.

  • Uses existing Internet connection for backup.

  • Typically backup data is stored in geographically diverse locations. No need to wonder if your data is sitting in a "hot zone."

  • Can be automated so the user doesn't need to "remember to back up the data." This lends itself to better RPO than the local tape/disk backup.

Cons


  • The speed of the back-up process is variable, driven by the connection speed and user load at any given time.

  • Slower restore times. When leveraging the Internet for data transport, bandwidth and latency constraints apply.

  • Full system recovery, if possible at all, takes a long time.

  • Users may not be notified of back-up errors or failures and may be unaware that backups are not completing successfully.

  • Restores are a manual process.

  • Most services only back up data. To recover, the business may need to reload and reconfigure applications first, adding time and complexity to the process of restoring operations.

  • Some services have limits on the amount of data that can be stored and may charge for restores.

  • Not a good fit for organizations with large volumes of data or that need a more aggressive RTO/RPO.
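The bandwidth constraint on restores can be estimated with simple arithmetic, sketched here in Python. The data size, link speed, and `efficiency` factor (the share of nominal bandwidth actually available to the restore) are hypothetical inputs, and the estimate covers data transfer only, not the time to reload and reconfigure applications.

```python
def estimate_restore_hours(data_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to pull `data_gb` gigabytes over a `link_mbps` connection.

    Uses decimal units: 1 GB = 8,000 megabits. `efficiency` is an assumed
    fraction of the nominal link speed that the restore can actually use.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    megabits = data_gb * 8000
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical example: 450 GB restored over a 100 Mbps Internet connection.
ideal = estimate_restore_hours(450, 100)
shared = estimate_restore_hours(450, 100, efficiency=0.8)

print(f"ideal link: {ideal:.1f} hours")
print(f"80% of link available: {shared:.1f} hours")
```

Even in the ideal case, this example takes about 10 hours of transfer time, which shows why file backup suits smaller data volumes and less aggressive RTOs.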

Option No. 5 Local (Tape/Disk) Backup

It all started with tape backup. However, as newer technologies have emerged, the old standby has been somewhat supplanted by more reliable options.

And we say "somewhat" supplanted because a survey by CIOinsight.com revealed that 48 percent of medium-sized businesses still use tape as a primary back-up solution. Although there are significant differences between tape and disk back-up methodologies, we are listing them together because they both are carried out locally, generally without third-party involvement.

Pros


  • tape: low to low-middle cost

  • tape: cost effective for long-term storage/archiving

  • tape: well suited to be moved off site

  • disk: low-middle to middle cost

  • disk: can be cost effective for long-term storage/archiving when utilizing de-duplication technologies

  • disk: data protection capabilities can help eliminate data loss

Cons


  • tape: high rate of media failure

  • tape: introduces security risks during tape handling

  • tape: retrieving backup from offsite location or vendor can delay restores

  • disk: susceptible to local site issues (fire, flood, power loss)

  • disk: offsite capabilities typically require secondary hardware at a remote site or disk to tape export capabilities

  • for both media, restores are a manual process; full system recovery will take a long time

  • the number of simultaneous restores is limited

  • additional staffing costs may be required to manage backups

We like to think that a disaster will never befall our business. However, according to a 2014 vendor study, businesses with more than 250 employees lost $1.7 trillion to lost data and downtime in a single 12-month period. It's clear that business leaders need to do more than hope for the best. Having the right disaster recovery strategy for your organization will help you minimize the impact and get back to business as quickly as possible.

Craig Hurley is Cosentry's vice president of product management and is responsible for the entire life cycle of the company’s data center, hosting, cloud and managed services portfolio. Prior to joining Cosentry, he served as the director of product management for NTT Communications, a global IT infrastructure services provider. During his 10-year career at NTT, Hurley managed the entire U.S. data center services portfolio, most recently developing NTT's enterprise cloud and its recovery as a service portfolio of products. Hurley has more than 15 years of product management and IT experience including positions at Verio, Ameritech Cellular, and AT&T. A veteran of the U.S. Army, Hurley received his bachelor’s degree in political science from Indiana University and his master’s degree from DePaul University.