As a technology professional, I like to tinker in home automation and smart devices. I have two WiFi networks at home, Cat6 cables to every room, multiple gigabit switches, and routers, all connected to an Internet service provider (ISP) gigabit fiber network. When my ISP had an unexpected network outage for two days, I was shocked to discover how much it impacted my daily life.

While I could use my laptop by tethering it via USB to my mobile phone, the rest of the house was disconnected from the world. Without an Internet connection, I had no access to Amazon Alexa, the control hub to most of my smart devices, meaning no cameras, no smart door locks, no smart garage door opener, no smart appliances, no smart lights, no smart switches, no streaming music service, and no streaming TV services. Even my smart sprinkler system stopped working!

For two whole days, I had no access to many of the Internet-related services at home. Because of the outage, I had a lot of free time to muse over the disaster recovery (DR) implications. What would happen if a business lost Internet service for two days?

In professional services consulting, we approach problems methodically. My favorite approach is the Six Sigma DMAIC, which stands for define, measure, analyze, improve, and control.

  1. Define the problem – My goal is to provide a level of redundancy so most of my home services can continue to function in case of another outage.
  2. Measure the current baseline for improvement – After a detailed inventory, I counted 97 networked devices on my home network including 62 Internet of Things (IoT) devices. While 35 of them would not work at all without Internet connectivity, the other 27 devices would continue to function if I had some type of local management over WiFi.
  3. Analyze the root cause – There are many factors that could cause Internet service interruption. My goal is to continue the service locally. I found I had three problems to address.
    1. Single ISP connection – The problem could be solved with a redundant ISP connection. I would need an edge router to connect to both ISPs and load balance the connections. In case of a single ISP outage, the traffic would automatically fail over to the working ISP port.
    2. Physical connectivity – What if there is a regional physical outage? My last outage was caused by road construction, when the crew accidentally severed the ISP's buried fiber line. An incident like this could impact the physical lines to both ISPs. A cellular-based ISP solution to back up the physical networks would be ideal. After a quick search on Amazon, I was able to find what I needed. There are many modems that can connect to the cellular network via 4G LTE and fall back to 3G if needed. The modem can connect to one of the Wide Area Network (WAN) ports on the edge router and provide seamless failover.
    3. Complete Internet outage – Finally, if Internet service is completely unavailable, I should still have the ability to operate critical services locally on my home network. I identified several of these critical services and wrote down the possible solutions. Three of the examples are as follows:
      1. Locally hosted security cameras
      2. Locally hosted media streaming service
      3. Manual bypass for local sprinkler control
  4. Improve – To address the list of the DR problems, I implemented the following improvements:
    1. In addition to my physical ISP, I added a cellular modem with my current mobile provider for redundancy.
    2. I installed an edge router with three WAN ports. When both the physical and cellular ISPs are available, traffic normally goes through the gigabit fiber connection on the primary port. If the physical connection is unavailable, traffic automatically fails over to the cellular ISP at 4G speed.
    3. I also added a media server to my network to deliver streaming content in case of a complete outage.
  5. Control – After the implementations, I continued to identify issues with the new solution and made additional improvements. The last problem is the low bandwidth issue associated with a cellular connection. I have five security cameras around my home. Each camera would require 2 Mbps upload speed to send video to the cloud. That equals a total of 10 Mbps already, and the typical 4G upload speed is only 5-6 Mbps.
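The bandwidth gap described in the Control step can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the figures quoted above (five cameras, 2 Mbps each, a typical 4G upload speed of 5-6 Mbps). The constant and function names are hypothetical, and the calculation ignores every other kind of traffic on the link.

```python
# Figures taken from the article; names are illustrative only.
CAMERA_COUNT = 5
UPLOAD_PER_CAMERA_MBPS = 2.0
CELLULAR_UPLOAD_RANGE_MBPS = (5.0, 6.0)  # typical 4G upload, low and high


def required_upload_mbps(cameras: int, per_camera_mbps: float) -> float:
    """Total upstream bandwidth needed if every camera streams to the cloud."""
    return cameras * per_camera_mbps


def cameras_supported(link_mbps: float, per_camera_mbps: float) -> int:
    """Number of cloud-streaming cameras a link can carry, ignoring other traffic."""
    return int(link_mbps // per_camera_mbps)


required = required_upload_mbps(CAMERA_COUNT, UPLOAD_PER_CAMERA_MBPS)
low, high = CELLULAR_UPLOAD_RANGE_MBPS

print(f"Required upload: {required} Mbps")
print(f"Shortfall on 4G: {required - high} to {required - low} Mbps")
print(
    "Cameras supported on 4G: "
    f"{cameras_supported(low, UPLOAD_PER_CAMERA_MBPS)} to "
    f"{cameras_supported(high, UPLOAD_PER_CAMERA_MBPS)}"
)
```

Under these assumptions the cellular link can carry only two or three of the five camera streams, which is why cloud upload cannot simply continue unchanged after a failover.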

Then I realized I had just designed a multi-access edge compute (MEC) environment for my home!

IoT Problem Statement

The number of IoT devices has been increasing exponentially in recent years, especially in the wide-area cellular connections category. According to Ericsson, the number of IoT devices is expected to reach 18.1 billion by 2022. Of those, 2.1 billion IoT devices will use cellular connections, as well as unlicensed low-power technologies. Security will become more critical due to real-time data sharing, some of which may be confidential.

Business Insider conducted a survey of planned IoT investment and found that IoT investment is accelerating in the enterprise, governmental, and consumer segments. With this increase, a secure and stable infrastructure is even more critical for managing IoT devices, which typically have very different network requirements depending on the use case. Some requirements include the following:

  • High availability – MEC can be used in a highly available, extremely fault-tolerant design. A MEC distributed cloud environment is very loosely coupled to minimize the impact of regional outages.
  • High data throughput – The current MEC implementations based on 3G and 4G are not sufficient for this requirement. However, with 5G just over the horizon, throughput would no longer be an issue. MEC with 5G implementation can provide gigabit data speed via radio signal without the limitation of physical cables.
  • Low latency – MEC architecture relies on content delivery network (CDN) caching to minimize latency.

MEC Overview

Multi-access edge compute, previously called mobile edge compute, is a natural solution to the convergence of IT and telecommunications networks. MEC is the application of cloud architecture principles to compute, storage, and networking infrastructure close to the user at the edge of a network. There are many advantages to MEC design compared to traditional networks, especially in a disaster recovery scenario, including the following:

  1. Integrates the cellular network for redundancy – Instead of relying solely on physical land lines, integrating the cellular ISP connection provides an additional level of redundancy. The cellular network is also faster to restore than land lines in a severe disaster scenario.
  2. Resiliency and fault isolation – If Internet connectivity is down, an edge cloud is still able to function. MEC-based applications use geographically distributed edge clouds for fault isolation. In the event of a widespread outage, separate edge cloud environments can still function independently.
  3. Localized content – Content caching in a distributed cloud MEC design improves performance to enhance the local user experience. In the event of a disaster, the localized content delivery system can continue to provide service.

Utilizing MEC to Solve IoT Challenges

1. Network Redundancy

In my home network example, I built redundancy by adding an LTE modem with a cellular ISP. This new connection is plugged into a multi-ISP router. If my physical ISP fails, all network traffic automatically fails over to the cellular network. Connections would briefly drop and re-establish, but the user impact would be minimal.
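The failover decision itself is simple to express. The sketch below is a hypothetical illustration in Python, not the logic of any particular router: it assumes each WAN link has a priority and a health flag produced by some probe (the probe is not implemented here), and it picks the highest-priority healthy link. The `WanLink` class, the link names, and `select_active_link` are all invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WanLink:
    name: str       # hypothetical label for the WAN port
    priority: int   # lower number = preferred link
    healthy: bool   # result of a health probe (assumed, not implemented here)


def select_active_link(links: List[WanLink]) -> Optional[WanLink]:
    """Return the preferred healthy link, or None if every link is down."""
    healthy = [link for link in links if link.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda link: link.priority)


links = [
    WanLink(name="fiber", priority=1, healthy=True),
    WanLink(name="cellular-lte", priority=2, healthy=True),
]
print(select_active_link(links).name)   # fiber

links[0].healthy = False                # simulate a severed fiber line
print(select_active_link(links).name)   # cellular-lte
```

When the function returns `None`, no WAN link is usable, which corresponds to the complete outage case where only locally hosted services keep working.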

Scaling this approach to a large enterprise environment is similar. Instead of installing only a physical ISP carrier network, cellular ISP is implemented for redundancy. Additional resiliency can be achieved by implementing a cellular network within the local private enterprise network.

Private MEC implementation is also on the rise. After the Sept. 11 attacks in New York City, the public network infrastructure was severely damaged. First responders and government agencies were not able to communicate using cellular phones. However, the private MEC operated by the utility company was still functional, and the utility company opened its network to some first responders to help with the communication issues.

Using my home network as an example for redundancy on a small scale, I have two local WiFi networks in addition to a wired network. I designated the first WiFi network to run all my critical infrastructure devices such as my security system, fire and smoke detectors, garage door opener, water leak sensor, and home automation systems. The second WiFi network is used for general network traffic with a separate guest network for family, friends, and neighbors. Either of the WiFi networks could run the devices of the other network in case of an emergency, and I could share my limited cellular connection with others to provide some local Internet coverage.

2. Resiliency and Fault Isolation

My network security cameras require significant network bandwidth to upload video data to the cloud. I replaced four of the cameras with a different system that has a local video recorder operating on my local network. The system uploads data to the cloud for backup every hour. In case of a physical network outage, a quality of service (QoS) rule stops the backup traffic. This enables other, more critical services to function over the backup cellular network while security monitoring of my home continues. In case of a complete network outage for both physical and cellular providers, the four cameras on my local network will still be able to record security footage. This is a good example of how MEC works to provide fault isolation.
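The QoS behavior described above can be sketched as a small policy function. This is an assumption-laden illustration in Python rather than real router configuration: the traffic class names, the link labels, and `allowed_traffic` are hypothetical. It models only which traffic classes may leave the home network for a given active link. Local recording is not affected, because it never crosses the WAN.

```python
from typing import List, Optional

# Hypothetical traffic classes that are not worth scarce cellular bandwidth.
BLOCKED_ON_CELLULAR = {"camera-cloud-backup"}


def allowed_traffic(active_link: Optional[str], traffic_classes: List[str]) -> List[str]:
    """Return the traffic classes permitted to use the current WAN link.

    active_link is "fiber", "cellular", or None for a complete outage.
    """
    if active_link is None:
        return []  # nothing can leave the local network
    if active_link == "cellular":
        return [t for t in traffic_classes if t not in BLOCKED_ON_CELLULAR]
    return list(traffic_classes)


classes = ["security-alerts", "home-automation", "camera-cloud-backup"]
print(allowed_traffic("fiber", classes))     # all three classes
print(allowed_traffic("cellular", classes))  # backup traffic is held back
print(allowed_traffic(None, classes))        # [] during a complete outage
```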

3. Localized Content

Putting Netflix content on my local network is the basic idea behind a CDN. A CDN, or content delivery network, is a geographically distributed network of proxy servers and their data centers. The goal is to provide high availability and high performance by distributing the service spatially relative to end users.

Enterprise CDN is similar in design. The origin server holding the content is connected to MEC edge servers distributed geographically closer to the end users. Content is delivered by edge servers locally to the end users. By caching and storing content closer to the edge of the network, we can reduce Internet traffic and provide a much better user experience in terms of performance and response time. There is a BC/DR aspect to this as well. A localized content server can provide service even if the origin server is not available.
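The BC/DR property of an edge cache can be shown with a toy model. The sketch below is a simplified illustration in Python under stated assumptions: `EdgeCache`, `OriginUnavailable`, and the simulated origin are hypothetical, the cache lives in memory, and there is no expiry or eviction, which a real CDN would need. It shows only that content fetched once can still be served after the origin becomes unreachable.

```python
from typing import Callable, Dict


class OriginUnavailable(Exception):
    """Raised by the (simulated) origin when it cannot be reached."""


class EdgeCache:
    """Toy edge server: serve from the local store, fetch from origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        if key in self._store:
            return self._store[key]      # cache hit: no trip to the origin
        content = self._fetch(key)       # cache miss: may raise OriginUnavailable
        self._store[key] = content
        return content


# Simulated origin whose availability can be toggled for the demonstration.
origin_state = {"up": True}


def fetch(key: str) -> bytes:
    if not origin_state["up"]:
        raise OriginUnavailable(key)
    return f"content for {key}".encode()


cache = EdgeCache(fetch)
cache.get("movie-1")                     # first request populates the edge cache

origin_state["up"] = False               # origin (or the Internet link) goes down
print(cache.get("movie-1"))              # still served from the edge
try:
    cache.get("movie-2")                 # never cached, so it cannot be served
except OriginUnavailable:
    print("movie-2 is unavailable until the origin returns")
```

The limitation is visible in the last call: an edge cache protects only content that was requested or preloaded before the outage.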

Summary

With the exponential growth of IoT, MEC is a strong fit for many IoT-specific challenges. A cellular-based ISP can supplement the existing physical network for redundancy. MEC architecture can provide additional resiliency and fault isolation capabilities. A MEC-based CDN can enhance the local user experience and continue to deliver services in case of a disaster. Finally, current MEC deployments are based on 3G and 4G technology. 5G will elevate the MEC use case to a new level, ensuring a future-ready solution.

ABOUT THE AUTHOR

James Zhang

James Zhang is a disaster recovery architect with Verizon Professional Services. He is a Certified Business Continuity Professional with a wide range of experience in infrastructure and solution architecture. Zhang graduated from the University of Illinois at Chicago with a degree in information decisions sciences. He also has an MBA degree from Lake Forest Graduate School of Management with a focus on organizational behavior and management analytics.
