Future Proofing with Cloud Backup

February 21, 2019 W. Curtis Preston, Technical Evangelist

Data backup and recovery have been around almost as long as the data center. For the last 30 years, backing up data to a box has been the status quo, but does that logic still apply in the age of cloud and distributed databases? As enterprise data moves to the cloud, it only makes sense that cloud backup solutions should also be a top consideration to ensure alignment with the business.

In many cases, a direct-to-cloud data protection and management service is the easiest and least expensive way for a company to back up data to a central location, after which a variety of other data management services may be provided. (Data management can include data analysis, support for dev and test, legal hold and e-discovery, and more.) But companies like Druva are clearly going against the traditional methods of data center backup and recovery, and so the question becomes “why should I move to the cloud?” After all, if it doesn’t appear to be broken, why fix it?

Indeed, many seem to think that backing up to the cloud is not possible or financially feasible, but this blog post will look into the reasons why the cloud could be the best place to protect your data.

It’s where we’re going

Before getting into the details of how and why cloud data protection makes sense, it’s important to mention that most companies are either already using the cloud or have plans to do so very soon. Many companies have migrated most of their computing into the cloud. In fact, most modern startups begin in the cloud, scale up in the cloud, and never leave it. Consider Salesforce.com, the original and largest SaaS provider, which still runs most of its workloads on AWS.

The cloud is where you’re going for some or all of your IT infrastructure. When you start using the cloud – if you haven’t already begun doing so – you will most definitely be using the cloud for backup. So the question is not whether you should use the cloud for backups, but when and where you should do so.

Why shouldn’t you back up to the cloud?

Let’s get the objections out of the way first. A lot of people wonder how they can feasibly back up and restore their data center to and from the cloud, given how much data is in the data center. The good news is that backing up many smaller data centers to the cloud is easy. The challenge with cloud-based data protection is a single data center with a large amount of data, so let’s look at that.

The first challenge with a single large data center is the first backup to the cloud, often referred to as the seed. A data center with 1 PB of data and a 1 Gb/s network connection would take roughly three months to get its first backup to the cloud, even at full line rate and before any data reduction. (Notice that the size of the data and the amount of bandwidth are both necessary to determine what “large” is.) Druva solves this problem using AWS Snowball Edge devices provided free of charge to our customers. Customers decide how many Snowball Edge devices they need, AWS ships the devices preconfigured with Druva data management technology, and all a customer needs to do is plug them into their network and turn them on. The first backup goes to those devices, after which the customer ships the Snowball Edge devices back to AWS, which automatically uploads the data from them into S3. This entire process takes a few days instead of months. Seeding challenge solved.
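For readers who want to check the seeding math against their own environment, here is a minimal back-of-the-envelope sketch. It assumes decimal units (1 PB = 10^15 bytes, 1 Gb/s = 10^9 bits per second), a link dedicated entirely to backup, and no deduplication or compression; the function name and the `efficiency` parameter are illustrative and not part of any Druva or AWS tool.

```python
def transfer_days(data_bytes: float, link_bits_per_sec: float,
                  efficiency: float = 1.0) -> float:
    """Estimate days needed to send data_bytes over a network link.

    efficiency is the (assumed) fraction of the link's nominal rate
    that the backup actually achieves, between 0 and 1.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = (data_bytes * 8) / (link_bits_per_sec * efficiency)
    return seconds / 86_400  # seconds per day


PB = 10**15    # bytes, decimal petabyte
GBPS = 10**9   # bits per second

full_rate = transfer_days(1 * PB, 1 * GBPS)
print(f"{full_rate:.1f} days")  # 92.6 days
```

Halving either the data size or doubling the bandwidth halves the result, which is why "large" only has meaning when both numbers are known.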

The second challenge when using the cloud as the repository for your backup data is recovery: what kind of recovery time objective (RTO) can you support if your data is in the cloud? Druva customers have three recovery options to meet different RTOs. Most Druva customers find that a direct-from-cloud restore is fast enough to meet their RTO demands for a typical server. Customers with more aggressive RTOs deploy Druva CloudCache, a free agent loaded onto a local server of your choice that holds a cache of recent data to enable LAN-speed restores. Finally, customers concerned with recovering an entire data center can investigate our included cloud disaster recovery option, where an entire data center can be brought online in the cloud in a matter of minutes.

Why should you back up to the cloud?

To start, cloud-based data protection is more secure than backing up to your own data center. Druva runs its service on AWS, the most vetted cloud provider of them all. Beyond the inherent security of using such a trusted cloud provider, there are the principles of air gaps and defense in depth. While not technically an air gap, placing the primary protection copy of your data in the cloud does separate it in a number of ways from things that might attack your data center. Druva uses different communication protocols than a typical server, and your data is stored in a layered security system best described as a series of locks that one must pass through before actually accessing data. Druva also stores data and metadata in separate systems, unlike most backup systems. Contrast this with a backup system on a server in your data center, running the same operating system as the server you are protecting (e.g., Windows), and storing its data on a backup appliance that uses the same storage protocols your server uses (e.g., NFS/SMB). Backing up to the Druva Cloud Platform is much more secure than your typical backup system.

Backing up to the cloud via a source deduplication-based system – table stakes for proper cloud backup – is also more efficient than the alternatives. Typical backup software sends full backups and full-file incremental backups to a backup appliance that then deduplicates them. This uses more bandwidth and more resources than a well-written source deduplication system, where data is deduplicated before it is ever sent across the network. Instead of full backups, a source deduplication system sends only the new, unique blocks each time a backup runs. Backups to the Druva Cloud Platform happen in seconds and minutes, not hours as with most other backup products.
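To make the idea of source deduplication concrete, here is a toy sketch. It is not Druva's implementation: it assumes fixed-size blocks (real products often use variable-size chunking), SHA-256 as the block fingerprint, and an in-memory `BackupTarget` class standing in for the cloud store. All names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the demo is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupTarget:
    """Stand-in for the remote store: blocks keyed by SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def missing(self, digests):
        """Return the digests the target has not stored yet."""
        return {d for d in digests if d not in self.blocks}

    def store(self, digest, block):
        self.blocks[digest] = block


def backup(data: bytes, target: BackupTarget):
    """Send only blocks the target lacks; return (recipe, bytes_sent)."""
    blocks = split_blocks(data)
    digests = [hashlib.sha256(b).hexdigest() for b in blocks]
    needed = target.missing(digests)  # only fingerprints cross the wire here
    sent = 0
    for digest, block in zip(digests, blocks):
        if digest in needed:
            target.store(digest, block)
            needed.discard(digest)  # do not resend a block repeated in this backup
            sent += len(block)
    return digests, sent


def restore(recipe, target: BackupTarget) -> bytes:
    """Rebuild the original data from its ordered list of digests."""
    return b"".join(target.blocks[d] for d in recipe)


target = BackupTarget()
day1 = b"AAAABBBBCCCCDDDD"
day2 = b"AAAABBBBXXXXDDDD"  # one block changed since day 1

recipe1, sent1 = backup(day1, target)
recipe2, sent2 = backup(day2, target)
print(sent1, sent2)  # 16 4
```

The second backup sends only the one changed block, yet both days can still be restored in full from their recipes.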

Traditional backup systems are notoriously difficult to scale. Most operate on a scale-up basis, not scale-out. Depending on the architecture in question, you’re constantly tweaking the size of the main backup server, adding media servers, and most importantly adding backup appliances. Each backup appliance is a deduplication island, creating duplicates and wasted space across your environment when you back up something to more than one system. If tape is part of your equation – and it probably is – scaling is even more problematic. In addition to all these scaling problems, you also need to maintain the operating system and application, constantly upgrading and patching them for security, performance, and feature reasons.

It is true that there are some newer scale-out backup systems with integrated storage. These systems are easier to scale than traditional backup systems, but they are not maintenance-free. They must be consistently upgraded and patched, and they require customers to buy capacity before it is ever used, usually purchasing two or three years’ worth of capacity at a time. If you do need additional capacity later, you lose bargaining power and will have to pay whatever the vendor feels is appropriate.

But here’s the real problem with a typical backup system: how many hours a day are you performing backups? How many hours a day do those backup systems go unused, but still consume power and space? How much spare computing capacity is being wasted because you purchased a single-purpose backup appliance? It may be simpler and easier to use than the traditional systems of days gone by, but it also goes completely against the last 20 years of computing trends and the move toward more virtualization and cloud computing. Have you ever stopped to consider why you are buying a full-time resource to perform a part-time job?

Backing up to the cloud makes sense because it is perfectly matched to how the public cloud was meant to be used. Data protection and management consist of a series of tasks that sometimes need significant computing resources and other times need none at all. In the beginning, backup needs very little storage, but over time it can need quite a bit. Backup appliances try to solve both problems by selling enough compute capacity to meet peak needs and enough storage capacity to last multiple years, but that creates massive amounts of waste in the meantime.
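A rough way to put numbers on that waste is sketched below. Every input is a hypothetical placeholder (a six-hour nightly backup window, 100 TB today growing by 50 TB a year, 250 TB purchased up front for three years), and the model assumes linear growth; substitute your own figures.

```python
def appliance_utilization(backup_hours_per_day: float) -> float:
    """Fraction of each day a dedicated backup appliance is actually busy."""
    return backup_hours_per_day / 24


def unused_storage_fraction(start_tb: float, growth_tb_per_year: float,
                            purchased_tb: float, years: float) -> float:
    """Average share of pre-purchased capacity sitting empty over the period.

    Assumes usage grows linearly, so average usage is the midpoint
    between the starting and ending amounts.
    """
    average_used_tb = start_tb + growth_tb_per_year * years / 2
    return 1 - average_used_tb / purchased_tb


busy = appliance_utilization(6)                    # hypothetical 6-hour window
unused = unused_storage_fraction(100, 50, 250, 3)  # hypothetical sizing

print(f"compute busy {busy:.0%} of the day")       # compute busy 25% of the day
print(f"storage idle {unused:.0%} on average")     # storage idle 30% on average
```

With these placeholder inputs, the appliance's compute sits idle three quarters of the time and nearly a third of the purchased storage is empty on average over the three years.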

But a properly designed, cloud-native application uses computing resources only when necessary and consumes storage only as needs grow. No longer would you pay for compute and storage – and all the related power and cooling – that go unused most of the time. Backing up to the cloud is more secure, faster, easier, and cheaper than the alternatives. Finally, a cloud-native service like Druva also brings your data protection system in line with where the rest of IT is heading: the cloud.