
Given the current climate, the last thing most of us want to think about is another disaster, but we can’t afford not to. Working from home is “flattening the curve,” but it cannot prevent the multitude of natural and man-made misfortunes that can strike down a network. Network downtime is extremely expensive even under normal circumstances. According to Gartner, one minute of downtime costs $5,600, or roughly $93.33 per second.

According to recent data, almost seven out of 10 knowledge workers are working from home right now, which magnifies the importance of network uptime. Companies are focused on keeping their employees on the payroll and serving customers, and budgets are stretched. Significant downtime can damage the business and could cost someone like a CTO their job.

To mitigate this, many savvy businesses are moving to the cloud for enhanced redundancy, reliability, and improved uptime. While this is a smart move, it’s important to recognize that not all clouds are created equal.

How Many 9s Do You Need?

There are two basic cloud architectures for disaster recovery: active-active, which can deliver >99.999% uptime, and active-passive, which struggles to offer 99.99% uptime. When I was in school, 99.99% was a fantastic grade. However, in terms of business uptime, the difference between four 9s of reliability and five 9s could be millions in revenue and untold losses in brand reputation and customer loyalty.
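To make the gap between four 9s and five 9s concrete, the following minimal sketch converts an availability percentage into minutes of downtime per year and prices it at the Gartner figure of $5,600 per minute quoted earlier. The function names and the choice of a 365-day year are assumptions made for illustration, and a flat per-minute cost is a simplification.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year (assumption)
COST_PER_MINUTE = 5_600           # USD per minute, the Gartner figure cited in the article


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def downtime_cost_per_year(availability_percent: float) -> float:
    """Price the yearly downtime at a flat per-minute cost."""
    return downtime_minutes_per_year(availability_percent) * COST_PER_MINUTE


for label, pct in [("four 9s", 99.99), ("five 9s", 99.999)]:
    minutes = downtime_minutes_per_year(pct)
    cost = downtime_cost_per_year(pct)
    print(f"{label} ({pct}%): {minutes:.2f} min/year, about ${cost:,.0f}/year")
```

Under these assumptions, four 9s allows roughly 52.6 minutes of downtime a year and five 9s roughly 5.3 minutes, a tenfold difference in exposure.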

In this article, we’ll explore the basic how-tos and the benefits of implementing an active-active cloud architecture. In short, we’ll look at an architecture that helps prevent disasters rather than recover from them. Keep in mind that this is one piece of the puzzle when it comes to deploying a complete disaster recovery plan. Nothing is fail-safe so no matter how robust your architecture is or how well you’ve planned for disasters, you still may have to recover from one. It’s a good practice to have your team run disaster recovery drills at least once a quarter.

When Disaster Strikes

We often take for granted that our data is available to us at all times, on an iPhone for example. Everything is available to us from the cloud, but somewhere a datacenter is hosting and mirroring that data. What happens when a natural disaster such as a flood or a tornado hits the datacenter housing that data? In these cases, you could lose all your information, including data, photos, videos, and emails.

Now think of this in a business sense. Would you be able to do your job without access to all the data and content on your laptop? It’s crucial to have at least two redundant datacenters in completely different locations so that, if disaster strikes, you can remain productive and get work done without any downtime. This can be achieved by implementing an active-active cloud architecture.

Active-Active vs. Active-Passive Architecture

Most clouds do have redundancy, so when one datacenter goes down (in a natural disaster, for example), you can still access the mirrored datacenter to stay productive. However, these connections are usually passive, meaning someone has to manually move all users to the alternate site, which causes more downtime for the customer. In an active-active architecture, both sites actively serve traffic, so if one site goes down, users are automatically moved to the other site without any downtime. This is a much more modern approach, where the network is self-healing and self-monitoring without the need for manual attention. An active-active architecture also suits mobile users, who can stay online and productive without experiencing the latency associated with accessing datacenters around the world. When customers travel, they are automatically connected to the nearest healthy datacenter for a seamless experience.

An active-active architecture also affects the recovery time objective (RTO), which is the targeted duration of time within which a defined service level must be restored after a disaster or disruption, as well as the recovery point objective (RPO), which is the maximum amount of data, measured in time, that may be lost in the event of an incident. In an active-active scenario, both metrics are effectively eliminated because both equal zero seconds, meaning no time or data is lost after an incident or disruption and end users are unaffected.

Steps to Building an Active-Active Cloud Architecture

To build an active-active cloud architecture, you must create at least two physically separated networks that mirror each other. Then, implement content and data synchronization between the sites:

1. Data Replication

In any organization, large amounts of data are being created constantly. Continuous access to this data is crucial for keeping the business up and running. Data replication ensures redundancy, so users can be automatically rerouted to the functioning network(s) if one site is down.
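The idea of replicating every write so that any surviving site can serve reads can be sketched as follows. This is a minimal in-memory illustration, not a production design: the `Site` and `ReplicatedStore` classes and the site names are hypothetical, writes are assumed to be synchronous, and the sketch ignores the resynchronization a real system needs when a failed site comes back.

```python
class Site:
    """A hypothetical datacenter holding an in-memory key-value store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True
        self.store: dict[str, str] = {}


class ReplicatedStore:
    """Writes go to every healthy site; reads come from the first healthy site."""

    def __init__(self, sites: list[Site]) -> None:
        self.sites = sites

    def healthy_sites(self) -> list[Site]:
        return [site for site in self.sites if site.healthy]

    def write(self, key: str, value: str) -> None:
        targets = self.healthy_sites()
        if not targets:
            raise RuntimeError("no healthy site available")
        for site in targets:  # synchronous replication to all healthy sites
            site.store[key] = value

    def read(self, key: str) -> str:
        for site in self.healthy_sites():
            if key in site.store:
                return site.store[key]
        raise KeyError(key)


east, west = Site("east"), Site("west")
db = ReplicatedStore([east, west])
db.write("order-1", "paid")
east.healthy = False        # simulate losing one site
print(db.read("order-1"))   # still served from the surviving site
```

Because the write reached both sites before the outage, the read succeeds without anyone manually moving users to the other site.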

2. Application Aware Architecture

In addition to data replication, the applications must be redesigned to be aware of the active-active cloud architecture so that they respond appropriately during any disruption. IT teams should carefully evaluate which aspects of the cloud architecture should be designed as active-active and which are more fault tolerant and appropriate for an active-passive architecture.

3. Geo Proximity

Geo proximity means a user is automatically connected to the nearest healthy datacenter, no matter where they are. If one site goes down, the user is automatically connected to the next nearest healthy site for a seamless experience, similar to how cell phone towers work. This ensures highly reliable, low-latency performance.
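A rough sketch of geo proximity selection is shown below: it picks the closest healthy site by great-circle distance and falls back to the next closest when that site is marked down. The `DataCenter` class, site names, and coordinates are hypothetical, and real deployments typically rely on DNS-based or anycast routing and measured latency rather than raw distance.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class DataCenter:
    name: str
    lat: float
    lon: float
    healthy: bool = True


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometers."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def nearest_healthy(user_lat: float, user_lon: float, centers: list[DataCenter]) -> DataCenter:
    """Return the closest datacenter that is currently healthy."""
    candidates = [c for c in centers if c.healthy]
    if not candidates:
        raise RuntimeError("no healthy datacenter available")
    return min(candidates, key=lambda c: haversine_km(user_lat, user_lon, c.lat, c.lon))


# Hypothetical sites near San Francisco, New York, and Dublin.
centers = [
    DataCenter("us-west", 37.77, -122.42),
    DataCenter("us-east", 40.71, -74.01),
    DataCenter("eu-west", 53.35, -6.26),
]

denver = (39.74, -104.99)
print(nearest_healthy(*denver, centers).name)  # us-west
centers[0].healthy = False                     # simulate an outage at the nearest site
print(nearest_healthy(*denver, centers).name)  # us-east
```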

4. Testing 1, 2, 3…

Always test your cloud architecture with a simulated outage to determine whether failover works as intended. Don’t wait for that first hurricane to hit before you know what you’re dealing with.
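A drill of this kind can be sketched as a loop that takes each site down in turn, checks that the service is still reachable, and then restores the site. The `run_outage_drill` function and the `service_is_reachable` probe are hypothetical stand-ins; in practice the probe would be a real health check against the live service.

```python
from typing import Callable

SiteHealth = dict[str, bool]  # site name -> is the site currently healthy?


def run_outage_drill(sites: SiteHealth, probe: Callable[[SiteHealth], bool]) -> dict[str, bool]:
    """Take each site down in turn and record whether the probe still passes."""
    results: dict[str, bool] = {}
    for name in sites:
        original = sites[name]
        sites[name] = False  # simulated outage of a single site
        try:
            results[name] = probe(sites)
        finally:
            sites[name] = original  # always restore the site after the test
    return results


def service_is_reachable(sites: SiteHealth) -> bool:
    """Hypothetical probe: the service is up if at least one site is healthy."""
    return any(sites.values())


sites = {"east": True, "west": True}
report = run_outage_drill(sites, service_is_reachable)
print(report)  # {'east': True, 'west': True}
```

A `False` entry in the report identifies a site whose loss would take the service down, which is exactly what the drill is meant to surface before a real disaster does.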

Why Implement an Active-Passive Architecture

Although an active-active architecture is the best option for preventing downtime and data loss, the reality is that many organizations default to active-passive for one reason: it’s easier. An active-active architecture is difficult to maintain and requires extensive back-end coding along with significant time, cost, and resources, all of which vary by architecture. The truth is that many organizations take the easy road and implement an active-passive approach because they believe it’s “good enough” to get by and serve their customers adequately.

However, if an organization’s real goal is to provide a seamless, always-on experience for the customer, it must take a hit in terms of time and resources to make this a reality. After all, the real success of a company comes down to the success of its customers, and in the end, every second of downtime matters.

The best way to achieve >99.999% availability is to deploy an active-active cloud architecture. But in reality, an active-active architecture is only as good as the quality of its software. Building two datacenters is the easy part; the best user experience really comes down to the software. This is the real secret sauce, and it will determine the long-term success of any solution. As businesses strive to find their new normal, keep employees engaged while working remotely, and retain customers, the last thing they need is unreliable technology, data loss, or downtime. Determine which areas of your network are mission-critical and begin moving them to an active-active architecture as soon as possible. As we’ve learned all too well, the world can change in an instant and you need to be prepared.

 


ABOUT THE AUTHOR

Curtis Peterson

Curtis Peterson is the senior vice president of operations for RingCentral. Peterson has 27 years of experience managing information technology and carrier-scale data and packet voice communication networks. At companies ranging in size from startups to Fortune 500 firms, Peterson has managed teams responsible for engineering, project management, operations, data security, network security, data center, carrier operations, and internet backbone design and operation. Peterson has been a pioneer in VoIP services in the business communications space and has been developing, launching, and operating Class 4 VoIP and customer-facing hosted PBX systems since 2002.
