How to protect application data running in AWS

It may sound obvious, but even cloud instances need backup. Let’s set aside for now the infrastructure-related outages that may impact your application. I believe that an administrator or developer mistake bringing down the application is a greater danger than a hardware failure in the AWS infrastructure.

Why should you protect application data running in AWS?

You may ask “why?”, since most AWS deployments are designed to be highly available. We use auto-scaling groups, data replication mechanisms and so on, so even if something fails at the infrastructure level, we are typically prepared for such a scenario.

The situation is significantly different, however, if an application bug or some other human mistake damages data. Neither auto-scaling groups nor data replication can help at this point: replication simply copies the damaged data, while what you really want is to go back to an earlier state.

This is where many would simply suggest snapshots. That is a partial solution, but we should all remember that “a snapshot is not a backup”. Unless these snapshots are transferred somewhere else, they can still be lost in case of a hardware failure.
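To illustrate what “transferring a snapshot somewhere” could look like in its simplest form, here is a minimal sketch that starts a copy of an EBS snapshot into another region. It assumes you pass in a boto3 EC2 client created for the destination region; the function name, region names and snapshot IDs are hypothetical and only serve the example. Keep in mind that such a copy still lives inside AWS, so it covers only part of the problem.

```python
from typing import Any, Dict


def copy_snapshot_to_region(dest_ec2_client: Any, source_region: str, snapshot_id: str) -> str:
    """Start copying an EBS snapshot into the region the client is bound to.

    Assumption: dest_ec2_client is a boto3 EC2 client created for the
    DESTINATION region, e.g. boto3.client("ec2", region_name="eu-west-1").
    """
    response: Dict[str, Any] = dest_ec2_client.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description=f"Copy of {snapshot_id} from {source_region}",
    )
    # The copy runs asynchronously; the returned ID refers to a snapshot
    # that may still be in the "pending" state.
    return response["SnapshotId"]
```

The client is passed in rather than created inside the function, so the same code can be exercised with a stub before it is pointed at a real account.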

Combining the ability to store snapshot-based backups remotely with a snapshot management mechanism for more frequent recovery points also seems to be a good strategy, but let me give you a few additional points that you may want to consider as far as DR planning is concerned.

Possible backup scenarios

Firstly, you may not want to use a crash-consistent snapshot mechanism as a backup, especially for databases or other applications that may not be suited to such a strategy. AWS Backup allows you to protect several existing services, including RDS databases. But what if you have the database installed locally in your instance? In that case you can use the generic application backup feature in vProtect, which allows you to back up virtually any application with its own mechanisms. This means that you’ll have a proper, application-consistent backup of your data. Generic means that you’re able to provide your own scripts, or just the application’s native backup command, and Storware vProtect will execute it and grab the data for you.
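To give a rough idea of what such a user-provided script might do, here is a minimal sketch of a wrapper around an application’s native backup command. It is not vProtect code and says nothing about how vProtect invokes scripts; the function name and file layout are assumptions made for illustration. The wrapper writes the command’s output to a temporary file and only promotes it to the final name if the command succeeded and produced data.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def run_native_backup(command: Sequence[str], output_file: Path) -> Path:
    """Run a backup command and store its stdout in output_file.

    Hypothetical helper: the dump is written to "<name>.partial" first,
    so a failed run never leaves a truncated file under the final name.
    """
    output_file.parent.mkdir(parents=True, exist_ok=True)
    partial = output_file.with_name(output_file.name + ".partial")
    try:
        with partial.open("wb") as out:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(list(command), stdout=out, check=True)
        if partial.stat().st_size == 0:
            raise RuntimeError("backup command produced no data")
        partial.replace(output_file)
    finally:
        # Remove the temporary file if it is still there (failed run).
        partial.unlink(missing_ok=True)
    return output_file
```

For a PostgreSQL database named appdb (a made-up name), the call could look like run_native_backup(["pg_dump", "--format=custom", "appdb"], Path("/backups/appdb.dump")); the resulting file is what a backup tool would then pick up.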

Now the other scenario: let’s assume that an EBS snapshot-based backup is OK. You may want to revert to a specific state, but quite often you just want to browse and restore individual files, not a whole EBS volume or even an EC2 instance. In that scenario you really want file-level restore capability, and vProtect can do that for you. One of the options is to use mountable backups and browse through them via the web UI or directly on the vProtect Node. However, you also have the option to expose drives over iSCSI, so that you can mount a volume directly on another instance without having to create a volume in AWS.

You can also implement a hybrid approach: periodically snapshot your instance volumes and export them from time to time. You just need to assign both backup and snapshot management policies to your VMs and specify when and how often they are supposed to be backed up or snapshotted, and how many snapshots or backups should be kept, or for how long. What is more, you can automate policy assignment based on tags attached to your AWS instances.
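The logic behind such policies can be sketched in a few lines. The example below is only an illustration of the idea, not how vProtect implements it: the RetentionPolicy type, the tag names and the rule that a recovery point is removed only when it is both outside the newest keep_last points and older than max_age are all assumptions of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RetentionPolicy:
    name: str
    keep_last: int       # always keep this many newest recovery points
    max_age: timedelta   # older points beyond keep_last may be removed


def policy_for_tags(
    tags: Dict[str, str],
    rules: Dict[Tuple[str, str], RetentionPolicy],
) -> Optional[RetentionPolicy]:
    """Return the first policy whose (tag key, tag value) rule matches."""
    for (key, value), policy in rules.items():
        if tags.get(key) == value:
            return policy
    return None


def expired_points(
    points: List[datetime], policy: RetentionPolicy, now: datetime
) -> List[datetime]:
    """Return recovery points that the policy allows to be removed."""
    newest_first = sorted(points, reverse=True)
    candidates = newest_first[policy.keep_last:]
    return [p for p in candidates if now - p > policy.max_age]
```

With a rule such as {("backup", "daily"): RetentionPolicy("daily", keep_last=2, max_age=timedelta(days=7))}, any instance tagged backup=daily would get that policy, and expired_points would tell you which of its snapshots or backups are due for cleanup.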

Multiple backup destinations

Let’s be honest: Amazon does a great job of providing a wide range of services that cover basically every area of IT you might need. While this is great for customers, it obviously locks them more and more into the AWS ecosystem. So what if you need your backups to reside in locations other than AWS? Well, vProtect allows you to specify multiple backup destinations, starting from file systems, through other cloud storage providers such as Azure or GCS, and ending with your instances of an enterprise backup provider such as IBM Spectrum Protect. This means that you can address several other scenarios: data protected in a single AZ, transferred to another AZ or region, to cloud storage, or maybe to your local DC. Destinations can be anywhere, provided that you have network connectivity between your cloud and the locally residing backup provider. This means that if anything goes wrong with AWS, you still have a local copy of your data.

The file system backup provider in vProtect can also use global deduplication. This means that if you have a dedicated EBS volume used as backup storage, deduplication will significantly reduce the cost of storing data. Imagine that you’ve created a low-cost 100 GB EBS volume and, with deduplication, you’re able to store 2 TB or more of backup data on it.
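To see how 2 TB could fit on a 100 GB volume, here is a back-of-the-envelope sketch. The model is deliberately simplistic and the inputs are assumptions for illustration only: 25 full backups of an 80 GB instance, with 1% of the data changing between backups, and no compression or metadata overhead taken into account. Real deduplication results depend heavily on the data.

```python
def logical_size_gb(full_size_gb: float, backups: int) -> float:
    """Total size of all backups as seen by the user (before deduplication)."""
    return full_size_gb * backups


def deduplicated_size_gb(full_size_gb: float, backups: int, change_rate: float) -> float:
    """Rough physical size: one full copy plus the changed part of each later backup."""
    if backups < 1:
        raise ValueError("at least one backup is required")
    return full_size_gb + (backups - 1) * full_size_gb * change_rate


logical = logical_size_gb(80, 25)               # 2000 GB, i.e. about 2 TB
physical = deduplicated_size_gb(80, 25, 0.01)   # roughly 99.2 GB
ratio = logical / physical                      # roughly 20:1
print(f"{logical:.0f} GB logical, {physical:.1f} GB stored, ratio {ratio:.1f}:1")
```

In other words, the example in the text corresponds to a deduplication ratio of about 20:1, which is plausible only when consecutive backups differ very little.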

Finally, having a 3rd party backup solution also gives you full control over job execution. Not only can you configure and invoke everything from the HTML-based responsive web UI, but you can also integrate it with your self-service or orchestration systems through its API.

text written by:

Marcin Kubacki, CSA at Storware