Every time people introduce a new way of storing data, everybody is convinced that they don’t need to back it up. The latest “Are you sure we need to back it up?” entries are SaaS applications like Microsoft 365, GSuite, Salesforce, and Slack. But the question isn’t whether you should protect these new applications; it’s how.
Multiple users, one backup
Data backup isn’t just for backup administrators anymore. The backup team is still responsible for protecting the data environment and meeting regulatory requirements, but users, application owners, and legal teams now create and access backups themselves, rather than submitting tickets to the backup team.
With SaaS applications, multiple teams need to use backups to support their use cases:
Application administrator: If a Salesforce administrator accidentally deletes a large number of records, the company holds the administrator, not the backup team, responsible for recovery. Application administrators therefore need to be able to back up before making changes and recover data quickly by themselves.
Users: Microsoft OneDrive users may want to retrieve a file from 6 months ago. They expect to be able to browse and retrieve the file without calling a backup administrator.
Legal: The legal team may need to put a legal hold on some users’ emails and then extract them for further analysis. The legal team wants simple access that provides a chain of custody and does not involve the backup team.
Backup team: The backup team needs to make sure all SaaS applications are protected offsite to meet retention and data residency requirements. They are also the final line of defense if one of the other users needs help.
With modern cloud workloads, a data backup solution needs to enable every team to do their job. The legacy “call the backup team” model cannot meet the business requirements for one application, much less scale to handle multiple new cloud applications. SaaS application backup must work for all users.
It’s all about the application
While businesses shift their focus to applications, backup vendors continue to focus on throughput, deduplication rates, and appliance form factors. After a decade of application homogenization (e.g., Oracle, SQL Server, and Exchange), the pendulum is swinging back to a diverse set of applications, this time as SaaS.
Application protection in general is difficult. It begins with understanding the appropriate steps to protect the application (e.g., quiescing, truncating logs, or excluding datasets). Then the backup solution needs to identify the appropriate application stores. Backing up the data alone is insufficient, because you also have to protect the application metadata. Finally, recovering the applications requires deep application expertise. And the solution still has to enable all of the users to do their jobs. Every backup vendor builds agents because agents are the only way to protect applications.
Protecting SaaS applications is more difficult than backing up traditional applications. SaaS applications are far more complicated than traditional databases. For example, Microsoft Teams channel conversations are stored in group mailboxes on Exchange Online, while Microsoft Teams channel files are stored within the SharePoint Online team site! It is impossible to protect SaaS applications without understanding them, the customers’ configurations, and their use cases. With SaaS, the features expand even without an administrator upgrade, so data protection must stay current as well. And new SaaS applications to protect appear all the time: Microsoft 365 keeps expanding, alongside Salesforce, Slack, and more.
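To make the Teams example concrete, here is a minimal sketch of how a backup job might record which backing stores an application depends on. The two store locations come from the paragraph above; the dictionary layout and the function name `stores_to_protect` are illustrative assumptions, not any vendor’s actual data model.

```python
from typing import Dict, List

# Where each logical component of an application actually lives.
# The Teams entries restate the example in the text; the structure
# itself is a hypothetical illustration.
BACKING_STORES: Dict[str, Dict[str, str]] = {
    "teams": {
        "channel_conversations": "Exchange Online group mailbox",
        "channel_files": "SharePoint Online team site",
    },
}


def stores_to_protect(app: str) -> List[str]:
    """Return the distinct backing stores a backup job must cover for an app."""
    return sorted(set(BACKING_STORES[app].values()))


if __name__ == "__main__":
    # Backing up "Teams" really means backing up two different services.
    for store in stores_to_protect("teams"):
        print(store)
```

A backup that treated Teams as a single store would miss either the conversations or the files, which is why application knowledge has to be built in.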
After a decade of “We back up VMs,” SaaS applications are returning application-intelligent backup to the spotlight.
Data and metadata movement
Most of what backup companies learned about moving data for backup in the data center does not apply to SaaS backup. In the data center, IT owned the infrastructure (servers, network, and backup), so it owned authentication, security, and performance management. With SaaS applications, everything changes.
Most CIOs turn the backup team’s world upside down by asking, “If the application lives in the cloud, why would we keep the backup on-premises?” IT moves from controlling the entire infrastructure to controlling none of it.
Building cloud-native protection for cloud applications changes virtually every design decision of a backup application. Authentication becomes critical because you cannot install a “backup agent” in a SaaS vendor’s environment. Instead, the backup application needs to log in to the SaaS provider, get authorization to access the data, and call application-specific APIs. Since API access can be rate limited, the backup has to pace and minimize its API calls. Each step must be secured and encrypted (in flight and at rest) to protect the data from everybody: cloud provider, backup provider, and potential attackers.
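One common way to cope with rate-limited APIs is to retry throttled calls with exponential backoff. The sketch below shows that idea only; it does not cover authentication or encryption. `RateLimited`, `FakeSaaSClient`, and `call_with_backoff` are hypothetical names invented for this illustration, and a real provider SDK will signal throttling in its own way.

```python
import random
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Raised by the (hypothetical) client when the provider throttles a call."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the provider, if any


def call_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), waiting and retrying whenever it is throttled."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            if exc.retry_after is not None:
                delay = exc.retry_after  # honor the provider's hint
            else:
                # Exponential backoff with jitter so parallel workers spread out.
                delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")


class FakeSaaSClient:
    """Stand-in for a provider SDK: throttles the first two calls, then succeeds."""

    def __init__(self) -> None:
        self.calls = 0

    def list_items(self) -> List[str]:
        self.calls += 1
        if self.calls <= 2:
            raise RateLimited(retry_after=0.5)
        return ["item-1", "item-2"]


if __name__ == "__main__":
    waits: List[float] = []
    client = FakeSaaSClient()
    # Record the waits instead of actually sleeping, to keep the demo fast.
    print(call_with_backoff(client.list_items, sleep=waits.append), waits)
```

Passing `sleep` as a parameter keeps the retry logic testable without real delays; in production the default `time.sleep` would apply.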
SaaS applications change every part of the security infrastructure, so the backup architecture has to evolve to meet their data transfer needs.
SaaS applications are built on a new cloud storage architecture, so they need a new cloud backup storage architecture to support them.
SaaS application backup formats look almost nothing like traditional backups. Most people think of backups as creating “tar-like” data streams that clump together all of the data and metadata so they can be stored on tape. When given the choice between optimizing for granular recovery and optimizing for backup throughput, we chose throughput every time (e.g., multiplexing database files and VM-level backups). We have consistently sacrificed metadata protection performance to optimize data protection throughput.
SaaS application backup has a much higher ratio of metadata to data than other applications. When you’re protecting Salesforce records, Exchange Online emails, or GDrive files, there are a lot of small objects to protect and recover.
Modern SaaS applications require a metadata-centric architecture:
Data movement: Since the backup is extracting the data via APIs that may not be tuned for backup, there is no clumping of data together on the source. There will be millions of independent objects with their metadata.
Application-centric: Since the complex applications are tied together by metadata, the metadata cannot simply be pushed to the side and handed back to the application. The backup application needs to understand it.
User needs: The goal isn’t just to restore the entire application anymore. Users only want their data back. Legal teams want specific data back. Therefore, they need to search for their information by metadata and get the information back in a readily accessible format.
A metadata-centric architecture separates the data and metadata. A metadata-optimized store enables high-performance SaaS application backup and allows all users to find and access their data quickly. A data-optimized store enables high-performance short-term backup and recovery, with automated tiering for long-term retention.
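The separation of data and metadata can be sketched in a few lines. In this toy version, an in-memory list stands in for the metadata-optimized store and a dictionary stands in for the data-optimized store. The class and field names are hypothetical, the sample records are made up, and keying payloads by their SHA-256 hash is an illustrative choice rather than something the architecture requires.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ObjectMetadata:
    app: str           # e.g. "onedrive"
    owner: str
    name: str
    modified: str      # ISO 8601 date
    content_hash: str  # key into the data store


class BackupCatalog:
    """Toy catalog that keeps metadata and payloads in separate stores."""

    def __init__(self) -> None:
        self._index: List[ObjectMetadata] = []  # stand-in for the metadata store
        self._blobs: Dict[str, bytes] = {}      # stand-in for the data store

    def put(self, app: str, owner: str, name: str, modified: str,
            payload: bytes) -> ObjectMetadata:
        digest = hashlib.sha256(payload).hexdigest()
        self._blobs.setdefault(digest, payload)  # identical payloads stored once
        meta = ObjectMetadata(app, owner, name, modified, digest)
        self._index.append(meta)
        return meta

    def search(self, **criteria: str) -> List[ObjectMetadata]:
        """Find objects by metadata alone; no payload is read."""
        return [m for m in self._index
                if all(getattr(m, key) == value for key, value in criteria.items())]

    def restore(self, meta: ObjectMetadata) -> bytes:
        return self._blobs[meta.content_hash]

    def stored_blobs(self) -> int:
        return len(self._blobs)


if __name__ == "__main__":
    catalog = BackupCatalog()
    catalog.put("onedrive", "alice", "plan.docx", "2024-01-15", b"quarterly plan")
    catalog.put("onedrive", "bob", "plan-copy.docx", "2024-02-01", b"quarterly plan")
    for hit in catalog.search(owner="alice"):
        print(hit.name, catalog.restore(hit))
```

Because searches touch only the index, a user or legal team can locate a single file or email without scanning backup data, and the payload store is free to be tuned or tiered independently.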
SaaS applications change everything. SaaS vendors use microservices architectures, API-driven access, and cloud storage to meet the needs of different types of users in their environments. They barely resemble the legacy on-premises applications that customers are used to building and running themselves. Not surprisingly, traditional backup applications, which were architected for the data center, struggle to protect applications inside Microsoft 365, Salesforce, and GSuite.
For SaaS applications, you need a new data protection architecture. It needs a metadata-centric design to support the API-driven backups and the variety of user recoveries. It needs built-in authentication, encryption, and network optimization. It must be built with in-depth knowledge of the SaaS applications. Most importantly, it must support the requirements for multiple users in your environment.
For SaaS applications, you need SaaS data protection.