Disaster Recovery Plan

Architecting to withstand failures

It’s recommended to have a Business Continuity Plan that details which workloads must survive the loss of an entire region and which only need to withstand the loss of an Availability Zone (most applications fall into the second group). The techniques that apply depend on each workload’s Recovery Time Objective (RTO) and Recovery Point Objective (RPO).
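As a minimal sketch of how these two objectives can be recorded per workload and checked after a recovery drill, consider the following. The class name, function name, workload, and threshold values are all hypothetical and chosen only for illustration; they do not come from any AWS API.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets from the Business Continuity Plan for one workload (hypothetical type)."""
    rto: timedelta  # Recovery Time Objective: maximum tolerated downtime
    rpo: timedelta  # Recovery Point Objective: maximum tolerated data loss


def meets_objectives(objectives, measured_recovery_time, measured_data_loss):
    """Return True when a recovery drill stayed within both objectives."""
    return (measured_recovery_time <= objectives.rto
            and measured_data_loss <= objectives.rpo)


# Assumed example values: up to 4 hours of downtime, up to 15 minutes of data loss.
orders = RecoveryObjectives(rto=timedelta(hours=4), rpo=timedelta(minutes=15))

# A drill that recovered in 2 hours and lost 5 minutes of data.
drill_ok = meets_objectives(orders, timedelta(hours=2), timedelta(minutes=5))

# A drill that took 6 hours exceeds the RTO even though data loss was small.
drill_late = meets_objectives(orders, timedelta(hours=6), timedelta(minutes=5))
```

A tight RPO generally points towards continuous replication, while a looser one may be satisfied with periodic backups; the check above only records whether a given drill met the targets.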

When a workload must withstand the loss of a region, a common architectural pattern that balances cost, availability, and recovery from different kinds of incidents combines three elements: High Availability across multiple Availability Zones, Disaster Recovery that maintains a copy of the data through continuous replication to a different region, and a CloudFormation template that creates, on demand, the infrastructure required to work with that data.
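The on-demand part of this pattern could look roughly like the sketch below, which creates a stack from a template and waits for it to finish. It assumes a client object that behaves like the boto3 CloudFormation client (`create_stack` and the `stack_create_complete` waiter); the function name, stack name, template URL, and parameters are hypothetical placeholders.

```python
def activate_recovery_site(cfn_client, stack_name, template_url, parameters):
    """Create the recovery stack and block until CloudFormation reports completion.

    cfn_client is assumed to behave like boto3.client("cloudformation", ...)
    created for the recovery region. parameters is a plain dict of
    template parameter names to string values.
    """
    cfn_client.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
        # CloudFormation expects parameters as a list of key/value records.
        Parameters=[
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in parameters.items()
        ],
    )
    # Wait until stack creation completes; the waiter raises if creation fails.
    cfn_client.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return stack_name
```

With boto3 installed and credentials configured, the client would be created for the second region, for example `boto3.client("cloudformation", region_name=...)`, and passed in as the first argument. Passing the client in, rather than creating it inside the function, keeps the region choice explicit and lets the function be exercised in a drill with a stand-in client.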

Keep in mind that Disaster Recovery has an advantage over High Availability: it allows you to recover to a specific point in time (such as “before the compromise” or “before the ransomware started spreading”).
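To illustrate what recovering to a point “before the compromise” means in practice, the sketch below picks the newest recovery point that predates an incident. The function name and the timestamps are hypothetical; a real service would supply its own list of available recovery points.

```python
from datetime import datetime, timezone


def latest_point_before(recovery_points, incident_start):
    """Return the newest recovery point strictly older than the incident, or None."""
    candidates = [point for point in recovery_points if point < incident_start]
    return max(candidates, default=None)


# Assumed example: recovery points taken every four hours.
points = [
    datetime(2024, 5, 1, 8, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 1, 16, 0, tzinfo=timezone.utc),
]

# Suppose the ransomware is believed to have started spreading at 13:30 UTC.
incident = datetime(2024, 5, 1, 13, 30, tzinfo=timezone.utc)

# Selects the 12:00 point: the 16:00 copy already contains the compromise.
chosen = latest_point_before(points, incident)
```

A replica kept only for High Availability would mirror the compromised state, which is why the ability to choose an earlier point matters.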

You can use AWS Elastic Disaster Recovery to set up, test, and operate disaster recovery scenarios:

AWS Elastic Disaster Recovery

Disaster Recovery

Both services allow you to maintain copies in remote locations, which greatly reduces the effort required for the initial configuration, monitoring, and recovery process of the Disaster Recovery Plan.

The services can keep an up-to-date copy in the cloud or in a second region, so that the recovery site can be brought up in a short time.

Disaster Recovery - From On-Prem to the Cloud

A frequent use of AWS Elastic DR is copying data from on-premises environments to the cloud, since the cloud’s pay-per-use model makes it a cost-efficient way of having a recovery site. These services facilitate this task and provide encryption in transit (TLS), and the services where they store the information support encryption at rest.

Both services support multiple virtualization technologies, operating systems, hardware configurations, and applications. More details here:

Disaster Recovery in another region

Another common use is to copy data from the source region you use to a second region that acts as the Disaster Recovery site, so that if the entire source region goes down, you can continue operating by activating the recovery site in the second region.

Pricing

https://aws.amazon.com/disaster-recovery/pricing