Attempting to protect the cloud with your data center is like hauling a ton of manure in a sports car. It’s possible, but a really bad idea. You’ll need to make several trips, you’ll waste a lot of money on extra gas, and when you’re done, your car won’t smell very good. Likewise, using your data center to protect cloud data is more expensive and less efficient than a cloud-based solution – especially during restores.
Data center backup systems were designed for protecting the data center, in a world where bandwidth was essentially unlimited. If you needed more bandwidth, you could always buy more. Even if the backup system taxed the current LAN beyond what it could handle, you could simply upgrade the LAN or even create a separate one for backup purposes. Some backup software products even included special features that let users send backup traffic across specific interfaces, so a backup could use its own network.
Perhaps even more important, the cost of network infrastructure is not typically charged back to the internal customers that use it. Unlike the cloud world, where every bit and byte creates additional cost, backup systems can use the LAN without incurring charges to their cost center.
Because of these assumptions made by every major data center backup solution, there was very little development aimed at minimizing bandwidth usage. The biggest development was to minimize bandwidth usage between a backed-up client and a deduplication appliance. Once again, the assumption was that an appliance would be available.
Egress charges are perhaps the most important reason why data center–based software is a really bad idea for backing up cloud data. While most cloud storage providers do not charge for sending data to the cloud, they do charge for taking data out of it; these fees are known as egress charges. Even if you had a bandwidth-efficient way to back up data from your cloud repository, such as a source-deduplication backup product, you would still pay extra to copy that data from the cloud repository to your data center devices.
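To make the egress arithmetic concrete, here is a minimal sketch of how the bill might be estimated. Everything in it is an assumption for illustration: the function names are hypothetical, the flat per-GB rate is a placeholder rather than any provider's actual price, and the backup schedule (one full copy plus daily incrementals) is a simplification.

```python
def transferred_gb(full_gb: float, daily_change_rate: float, days: int) -> float:
    """GB pulled out of the cloud: one full copy plus `days` daily incrementals.

    Assumes each incremental moves `daily_change_rate` of the full data set.
    """
    return full_gb + full_gb * daily_change_rate * days


def egress_cost(gb_out: float, price_per_gb: float) -> float:
    """Egress bill at a flat per-GB rate (real pricing is usually tiered)."""
    return gb_out * price_per_gb


if __name__ == "__main__":
    # Hypothetical inputs: 10 TB protected, 2% daily change, 30 days,
    # and a placeholder rate of 0.09 per GB. Substitute your own numbers.
    gb = transferred_gb(full_gb=10_000, daily_change_rate=0.02, days=30)
    print(f"GB leaving the cloud: {gb:,.0f}")
    print(f"Estimated egress bill: {egress_cost(gb, price_per_gb=0.09):,.2f}")
```

A backup that stays inside the cloud avoids this particular line item, and the same per-GB cost applies again to any cloud data pulled out to the data center for a restore.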
In addition, there are the economics of data center storage versus cloud storage. Whereas cloud storage is pay-only-for-what-you-use, data center storage requires you to buy and provision everything upfront, including over-provisioning significant amounts of capacity.
As previously mentioned, data center backup software was designed to back up data centers, so it simply isn’t expecting the kind of latency associated with a WAN connection to a public cloud provider. Such products were also not designed to reduce round-trips, because most data centers would expect a typical round-trip conversation to take no longer than 5 to 10 milliseconds. Such a conversation over a WAN can take 100 times longer! Therefore, it is incredibly important for cloud backup software to include techniques that minimize the number of round-trips. In addition, latency is less of an issue when staying within the cloud. For example, a backup service that runs inside Amazon Web Services (AWS) would experience much less latency when backing up AWS than a product that is running in a data center on the other side of the Internet.
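The effect of round-trips on a high-latency link can be sketched with simple arithmetic. The snippet below is an illustrative model only: it counts the time spent waiting on round-trips, ignores bandwidth entirely, and uses hypothetical function names and a made-up request count. The 10 ms LAN figure and the 100x WAN multiplier come from the paragraph above.

```python
import math


def round_trips(requests: int, batch_size: int = 1) -> int:
    """Number of network round-trips if `batch_size` requests share one trip."""
    return math.ceil(requests / batch_size)


def protocol_wait_seconds(requests: int, rtt_ms: float, batch_size: int = 1) -> float:
    """Total time spent waiting on round-trips, ignoring transfer time."""
    return round_trips(requests, batch_size) * rtt_ms / 1000.0


if __name__ == "__main__":
    LAN_RTT_MS = 10              # upper end of the 5 to 10 ms range above
    WAN_RTT_MS = 100 * LAN_RTT_MS  # "100 times longer" over a WAN
    REQUESTS = 100_000           # hypothetical number of client/server exchanges

    print("LAN, unbatched:", protocol_wait_seconds(REQUESTS, LAN_RTT_MS), "s")
    print("WAN, unbatched:", protocol_wait_seconds(REQUESTS, WAN_RTT_MS), "s")
    print("WAN, 100 per trip:", protocol_wait_seconds(REQUESTS, WAN_RTT_MS, 100), "s")
```

In this model, a chatty protocol that is tolerable on a LAN spends 100 times longer waiting over the WAN unless it batches its requests, which is why reducing round-trips matters so much for software that talks to the cloud.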
In addition to minimizing round-trips, backup software designed for the cloud uses source-side, application-aware deduplication. Because duplicate data is eliminated at the source, before it ever crosses the network, this type of deduplication is more efficient and cost-effective than target deduplication, and it can mean the difference between getting a job done well and not getting it done at all.
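The general idea behind source-side deduplication can be sketched in a few lines. This is a toy illustration under stated assumptions, not any vendor's implementation: it uses fixed-size chunks and SHA-256 fingerprints, an in-memory class stands in for the remote backup service, and the application-aware part is left out entirely. All names (`ChunkStore`, `backup`, `restore`) are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; tiny so the demo is readable, real systems use KB to MB


class ChunkStore:
    """Stand-in for the backup target: maps chunk fingerprint -> chunk bytes."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def missing(self, digests: list[str]) -> set[str]:
        """Fingerprints the target does not hold yet (one batched query)."""
        return {d for d in digests if d not in self.chunks}

    def put(self, digest: str, payload: bytes) -> None:
        self.chunks[digest] = payload


def backup(data: bytes, store: ChunkStore, size: int = CHUNK_SIZE) -> tuple[list[str], int]:
    """Fingerprint chunks at the source and send only those the target lacks.

    Returns the ordered fingerprint list (the "recipe" for a restore) and the
    number of payload bytes actually sent.
    """
    pieces = [data[i:i + size] for i in range(0, len(data), size)]
    digests = [hashlib.sha256(p).hexdigest() for p in pieces]
    by_digest = dict(zip(digests, pieces))

    sent = 0
    for digest in store.missing(digests):
        store.put(digest, by_digest[digest])
        sent += len(by_digest[digest])
    return digests, sent


def restore(recipe: list[str], store: ChunkStore) -> bytes:
    """Rebuild the original data from its ordered fingerprints."""
    return b"".join(store.chunks[d] for d in recipe)


if __name__ == "__main__":
    store = ChunkStore()
    _, first = backup(b"AAAABBBBCCCC", store)
    recipe, second = backup(b"AAAABBBBDDDD", store)  # only one chunk changed
    print("bytes sent, first backup:", first)
    print("bytes sent, second backup:", second)
    print("restore matches:", restore(recipe, store) == b"AAAABBBBDDDD")
```

The point of the sketch is where the comparison happens: only fingerprints cross the network for data the target already holds, so unchanged chunks cost almost nothing to protect. Target deduplication, by contrast, ships everything first and removes duplicates afterward.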
There are two reasons why a data center–based backup solution is less efficient than a cloud-based solution during restores. First, there is less latency between two applications running in the cloud than there is between one application running in the cloud and another running in a data center. Second, there is higher bandwidth between two cloud applications, which results in a much faster restore than when a cloud application is restored from a data center–based backup.
Data backed up to the cloud can be restored to any location—even if it’s just back into the cloud—without having to transfer the data across the Internet. Not only is this more efficient, it leaves many options open to you that simply aren’t possible if you use your data center to back up the cloud.
The most important reason why you should protect the cloud with the cloud is that doing so offers recovery options that are very difficult to achieve in the data center. For example, a cloud-based recovery solution could easily update Amazon Machine Images (AMIs) in the event of a disaster. Although it’s technically possible to use a data center–based backup solution to do the same thing, you would have to deal with the inefficiencies and extra costs of pulling the data out of the cloud and into your data center, only to then send that data back to the cloud in order to update your DR images. This goes back to my opening analogy: you can haul a ton of fertilizer with a sports car, but why in the world would you want to?
If you take on the role of backup provider for your cloud data, you are assuming all of the privacy and security risks that a Software as a Service (SaaS) provider could easily take care of for you. Backups are meant to be the last line of defense, but backing up the cloud to your data center puts your data on the front line of attack. This includes all of the data covered by the new General Data Protection Regulation (GDPR), which applies to any company that stores the data of EU citizens, even if you’re on a different continent.
When you’re hauling fertilizer, you should use a pickup truck; when you’re backing up the cloud, you should use the cloud. The Druva Cloud Platform will match whatever requirements your cloud environment has, whether you’re backing up an Infrastructure as a Service (IaaS) or Platform as a Service (PaaS) system that’s running in AWS or Azure, or backing up a SaaS service like Office 365, G Suite, or Salesforce.com. The Druva Cloud Platform also enables you to back up your mobile devices and data centers, all under an operating expenditures (OPEX) model. By contrast, attempting to meet those same requirements by designing and scaling a data center–based solution is quite difficult and requires significant capital expenditures (CAPEX).
Unlike other solutions that call themselves cloud backup applications, the Druva Cloud Platform is not just traditional backup software running inside a virtual machine (VM) that you manage in the cloud. The Druva solution is a true cloud-based service designed with the cloud in mind. You never create, manage, or pay for VMs or block storage associated with your account. Druva uses a cluster of nodes that dynamically scales up and down to meet our customers’ immediate needs. We use DynamoDB as our metadata database, and S3 and Glacier as storage for backups. Backing up a cloud application to our Cloud Platform is more efficient, costs less, and is easier to manage than the alternatives.