Multi-Region Disaster Recovery Automation

Cold/Warm Standby in other region

Once you have defined the Disaster Recovery Plan covering all critical applications, and you have created and secured backups, the next step to improve resiliency and reduce the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) is to automate the continuous, asynchronous replication of those critical applications to another region, and to periodically check that the infrastructure in the secondary region works properly.
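As a rough illustration of the periodic check, the sketch below compares the age of the most recent replicated data against the RPO. The function name and the way the last replication timestamp is obtained are assumptions made for this example; they are not part of any AWS API.

```python
from datetime import datetime, timedelta, timezone


def replication_within_rpo(last_replicated_at, rpo, now=None):
    """Return True if the newest replicated data is no older than the RPO.

    last_replicated_at: timezone-aware datetime of the last successful replication
                        (hypothetical input; how you obtain it depends on your tooling).
    rpo:                timedelta with the agreed Recovery Point Objective.
    now:                optional timezone-aware datetime, mainly to make testing easier.
    """
    now = now or datetime.now(timezone.utc)
    replication_lag = now - last_replicated_at
    return replication_lag <= rpo


if __name__ == "__main__":
    checked_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    last_copy = checked_at - timedelta(minutes=5)
    # A 5-minute lag against a 15-minute RPO is within the objective.
    print(replication_within_rpo(last_copy, timedelta(minutes=15), checked_at))
```

A check like this could run on a schedule and raise an alert when it returns False, so that replication problems are noticed before an incident rather than during one.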

A common architectural pattern for workloads that must withstand the loss of a region, balancing cost, availability, and recovery from different kinds of incidents, is to combine High Availability across multiple Availability Zones with Disaster Recovery that maintains a continuously replicated copy of the data in a different region, together with a CloudFormation template that creates on demand the infrastructure required to work with that data (C).
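The on-demand part of this pattern can be sketched as a small function that creates a CloudFormation stack in the recovery region and waits for it to finish. This is a minimal sketch, not a definitive implementation: the function name, stack name, template URL, and parameters are hypothetical, and the client is assumed to be a boto3 CloudFormation client created for the recovery region (for example with boto3.client("cloudformation", region_name=...)).

```python
def launch_recovery_stack(cfn_client, stack_name, template_url, parameters):
    """Create the recovery infrastructure from a template and wait until it is ready.

    cfn_client:   a boto3 CloudFormation client for the recovery region (assumption).
    stack_name:   name for the new stack (hypothetical value chosen by the caller).
    template_url: location of the CloudFormation template (hypothetical value).
    parameters:   dict of template parameter names to values.
    """
    response = cfn_client.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
        Parameters=[
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in parameters.items()
        ],
        # Assumes the template may create named IAM resources; drop if it does not.
        Capabilities=["CAPABILITY_NAMED_IAM"],
    )
    # Block until CloudFormation reports that stack creation has completed.
    cfn_client.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return response["StackId"]
```

Passing the client in as an argument keeps the function easy to exercise in a test with a stand-in client, without touching a real account.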

Keep in mind that Disaster Recovery has an advantage over High Availability: it allows you to recover to a specific point in time (such as “before the compromise” or “before the ransomware started spreading”).
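To make the point-in-time idea concrete, the sketch below picks the most recent recovery point taken strictly before a known compromise time. The data layout (dictionaries with "id" and "timestamp" keys) and the function name are assumptions for illustration only.

```python
from datetime import datetime, timezone


def latest_clean_recovery_point(recovery_points, compromised_at):
    """Return the newest recovery point taken before the compromise, or None.

    recovery_points: iterable of dicts with "id" and "timestamp" keys
                     (hypothetical structure used only in this example).
    compromised_at:  timezone-aware datetime when the incident is believed to have started.
    """
    clean = [p for p in recovery_points if p["timestamp"] < compromised_at]
    return max(clean, key=lambda p: p["timestamp"], default=None)


if __name__ == "__main__":
    points = [
        {"id": "rp-1", "timestamp": datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)},
        {"id": "rp-2", "timestamp": datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)},
        {"id": "rp-3", "timestamp": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)},
    ]
    incident_start = datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)
    # rp-3 was taken after the incident began, so rp-2 is the newest clean point.
    print(latest_clean_recovery_point(points, incident_start)["id"])
```

A highly available replica would have faithfully copied the compromised state; keeping several recovery points is what makes it possible to go back to one taken before the incident.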

AWS Elastic Disaster Recovery

You can use AWS Elastic Disaster Recovery to set up, test, and operate disaster recovery scenarios.

AWS Elastic DR allows you to maintain copies in remote locations, which greatly reduces the effort required for the initial configuration, monitoring, and recovery process of the Disaster Recovery Plan.

The service keeps an up-to-date copy in the cloud or in a second region, so that the copy can be brought into operation in a short time.

Disaster Recovery - From On-Prem to the Cloud

A frequent use of AWS Elastic DR is copying data from on-premises to the cloud, since the cloud's pay-per-use model makes it a cost-efficient way to have a recovery site. The service facilitates this task and provides encryption in transit (TLS), and the services where it stores the information support encryption at rest.

The service supports multiple virtualization technologies, operating systems, hardware configurations, and applications.

Disaster Recovery in another region

Another common use is to copy data from the source region you operate in to a second region for Disaster Recovery, so that if the entire source region goes down, you can continue operating by activating the recovery site in the second region.

Risk Mitigation

  • Region outages are infrequent but have a significant impact; therefore, critical applications may require Multi-Region DR with automation to meet their SLA. Data destruction or ransomware can also have a significant impact without a service such as AWS Elastic DR that allows recovery to a point in time.

Guidance for assessments

  • Have you discussed the required SLAs for critical applications with the business?
  • Do you have Multi-Region Disaster Recovery?
  • Have you automated the recovery of critical applications?
  • Have you tested the switch to the DR region? (Fire drill)
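A fire drill like the one in the last question can itself be automated. The sketch below starts a drill for a set of source servers through the AWS Elastic Disaster Recovery API. It is a hedged example: the function name and server IDs are hypothetical, and the client is assumed to be a boto3 "drs" client created for the recovery region.

```python
def start_dr_drill(drs_client, source_server_ids):
    """Start a drill (non-disruptive test recovery) and return the job ID.

    drs_client:        a boto3 "drs" client for the recovery region (assumption).
    source_server_ids: list of Elastic Disaster Recovery source server IDs
                       (hypothetical values supplied by the caller).
    """
    if not source_server_ids:
        raise ValueError("at least one source server ID is required")
    response = drs_client.start_recovery(
        isDrill=True,  # launch drill instances rather than performing a real recovery
        sourceServers=[{"sourceServerID": sid} for sid in source_server_ids],
    )
    return response["job"]["jobID"]
```

Running such a drill on a schedule, and recording how long the instances take to become usable, gives a measured recovery time to compare against the RTO agreed with the business.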

If multi-region DR automation was not implemented due to resource constraints but is required by the SLA indicated by the business, an executive should sign a risk acceptance letter for this unmitigated risk.

Webinar: Disaster Recovery on AWS
