Encryption, Deduplication and Making Sense

Recently, I was in a meeting with the CIO of a leading Bay Area company when he interrupted my cloud security presentation and said, “Encryption, Global Deduplication and Making Sense. You can only choose two of them.” This statement is probably true for 99% of the vendors out there, and it did give me pause for a moment. But then I got a wicked smile on my face as I began to explain how Druva is different.

A global deduplication algorithm needs not just the hash of the new block but also information about the existing blocks in their original (non-encrypted) form. Unless the cloud stores the encryption keys, it is simply impossible to deduplicate the data. When other vendors claim deduplication in software or in the cloud, they most likely either store a common encryption key for all users in the cloud or simply fake deduplication.
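To make the mechanics concrete, here is a minimal Python sketch of hash-based block deduplication. The `BlockStore` class and its method names are hypothetical and not taken from any Druva product. It only illustrates why the store has to compare blocks in a common form: the same plaintext encrypted under two different per-user keys would produce two different hashes and never deduplicate.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: one copy per unique block (hypothetical)."""

    def __init__(self):
        self.blocks = {}      # SHA-256 hex digest -> block bytes
        self.ref_counts = {}  # SHA-256 hex digest -> number of references

    def put(self, block: bytes):
        """Store a block; return (digest, True) only if it was not seen before."""
        digest = hashlib.sha256(block).hexdigest()
        is_new = digest not in self.blocks
        if is_new:
            self.blocks[digest] = block
        self.ref_counts[digest] = self.ref_counts.get(digest, 0) + 1
        return digest, is_new


store = BlockStore()
_, first_is_new = store.put(b"quarterly-report block")   # uploaded by user A
_, second_is_new = store.put(b"quarterly-report block")  # same block, user B
print(first_is_new, second_is_new, len(store.blocks))
```

The second upload is recognized as a duplicate only because both blocks hash to the same digest, which is exactly what independent per-user encryption would prevent.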

At Druva, we took a different approach and developed a concept called “two-factor encryption,” which in simple terms works like a bank locker system. Both the user and the cloud hold their own part of the key, and only when the user authenticates can the cloud (in that very session) perform encryption and in-line yet global deduplication.

This is how it works:
For each user, the key is the user’s own password, and the cloud stores a unique per-user token that is itself encrypted with that password. So at no point does the cloud hold the full encryption key; it is locked out of the data. When the user authenticates, the password is used to decrypt the token, which in turn authenticates the user as well. The decrypted token (with some additional details) is then used as the encryption key and also to perform in-line global deduplication.
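The flow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Druva’s implementation: the function names, the PBKDF2 work factor, the stored check value and the `details` parameter are all hypothetical, and the XOR wrapping is only a stand-in for a real cipher such as AES, which the Python standard library does not include.

```python
import hashlib
import hmac
import secrets

PBKDF2_ROUNDS = 100_000  # assumed work factor, chosen only for this sketch


def _password_key(password: str, salt: bytes) -> bytes:
    """Stretch the user's password into a 32-byte key (the user's half)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                               PBKDF2_ROUNDS, dklen=32)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def enroll(password: str):
    """Create the record the cloud keeps; the raw token is never stored."""
    salt = secrets.token_bytes(16)
    token = secrets.token_bytes(32)  # the cloud's half, unique per user
    record = {
        "salt": salt,
        # Toy wrapping: XOR with the password-derived key stands in for a
        # real cipher in this sketch.
        "wrapped_token": _xor(token, _password_key(password, salt)),
        "token_check": hashlib.sha256(b"check:" + token).digest(),
    }
    return record, token


def authenticate(record, password: str):
    """Unwrap the token with the password; a valid token proves identity."""
    candidate = _xor(record["wrapped_token"],
                     _password_key(password, record["salt"]))
    check = hashlib.sha256(b"check:" + candidate).digest()
    if hmac.compare_digest(check, record["token_check"]):
        return candidate  # usable as key material for this session only
    return None


def data_key(token: bytes, details: bytes) -> bytes:
    """Mix the token with extra details (hypothetical) into a working key."""
    return hmac.new(token, details, hashlib.sha256).digest()


record, token = enroll("correct horse battery staple")
session_token = authenticate(record, "correct horse battery staple")
print(session_token == token, authenticate(record, "wrong password") is None)
```

The point of the design is visible in `record`: it contains nothing that yields the token without the password, so the stored data alone does not unlock anything.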

It works great for enterprises, because neither any single user nor the cloud provider stores the encryption key, and yet we are able to achieve secure backup, global deduplication and data retrieval.

This morning I saw an update in Salesforce for the same customer, so I think we managed to convince Mr. CIO and the security team in that meeting.

Enterprise Cloud Security: inSync Cloud Deployment Learnings

At Druva, we’re currently going through an ISAE 3402 Type-I/II audit. It caused me to step back and consider what the findings of this audit have taught us. It reinforces that safeguarding our customers’ data is critical.


Cloud security can be broken down into the following categories:

  • Network access and security
  • Authentication and access control
  • Data storage security
  • Cloud administrator access
  • Physical infrastructure security

Network Access

As for the outermost layer, it’s fairly straightforward. We applied three simple rules, which together make the network layer robust against intrusion attempts:

  • Strong (preferably 256-bit) SSL v3 network encryption
  • One-way firewall port forwarding
  • Limiting the IP addresses or PCs which have privileged access to the infrastructure
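As a small illustration of the third rule, here is a Python sketch of an IP allowlist check for privileged access. The `ADMIN_ALLOWLIST` contents and the function name are hypothetical, and the addresses are documentation-only ranges rather than anything from our deployment. In practice this kind of rule usually lives in the firewall or security group configuration rather than in application code.

```python
import ipaddress

# Hypothetical allowlist built from documentation-only ranges (RFC 5737).
ADMIN_ALLOWLIST = [
    ipaddress.ip_network("192.0.2.0/28"),      # e.g. an office subnet
    ipaddress.ip_network("198.51.100.17/32"),  # e.g. a single admin PC
]


def is_privileged_source(remote_ip: str) -> bool:
    """Return True only if remote_ip falls inside an allowlisted network."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # malformed input is never privileged
    return any(addr in network for network in ADMIN_ALLOWLIST)


print(is_privileged_source("192.0.2.5"), is_privileged_source("203.0.113.9"))
```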

Authentication and Access Control

With cloud security, the key is to control authorized access. This is one of the most critical steps in ensuring the security of your infrastructure. Druva deployed the following measures to prevent any unauthorized access:

  • Two-factor authentication
  • Strong password policies
  • SAML integration
  • Strong metadata encryption
  • Choosing a non-intuitive database schema
  • Data masking and scrambling
  • Audit trail on access or changes

Two-factor authentication for administrators and password control for users ensure the cloud is protected from identity theft. SAML integration further enables single sign-on and centralized authorization. Strong encryption, a non-intuitive schema and data scrambling help mitigate identity theft in case of an intrusion.
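To show what masking and scrambling can look like, here is a minimal Python sketch. Both helpers are hypothetical illustrations, not the routines we run in production: `mask_email` hides most of a value for display, while `scramble_id` replaces an identifier with a keyed hash so that stored metadata is of little use to an intruder who lacks the secret.

```python
import hashlib
import hmac


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a recognizable address, so reveal nothing
    return local[0] + "***@" + domain


def scramble_id(value: str, secret: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, truncated)."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


print(mask_email("alice@example.com"))
print(scramble_id("alice@example.com", b"hypothetical-secret"))
```

The scrambled value is stable for a given secret, so it can still serve as a lookup key without exposing the original identifier.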

Data Storage Security

The innermost part of the infrastructure is the data storage. At this stage, unauthorized access is the biggest risk. A good security policy will enable the following:

  • Two-factor encryption – A bank locker system to avoid unauthorized access by either party
  • Data splitting – Splitting the structured data across different files and servers
  • Bucketing and sandboxing data – Making sure the extent of data compromised can be contained

Druva was the first to develop and use two-factor encryption for securing stored data. The encryption works like a bank locker system, where both the user and the cloud hold part of the key. For the user, it’s the user’s own password, and for the cloud, it’s a token unique to every user.

Data splitting helps with both load balancing and the physical security of data. Anything that obscures where and how the data can be accessed directly is useful. And data sandboxing ensures that each enterprise customer’s data is isolated (physically, logically and through encryption) so that a security threat cannot spill over.
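A toy Python sketch of the data splitting idea follows. The function names and the round-robin placement are assumptions made for illustration; the `stores` list stands in for separate files or servers. The point is simply that no single store holds a complete record.

```python
def split_record(record_id, record, stores):
    """Spread the fields of one record across several stores, round-robin."""
    for index, (field, value) in enumerate(sorted(record.items())):
        target = stores[index % len(stores)]
        target.setdefault(record_id, {})[field] = value


def reassemble(record_id, stores):
    """Rebuild the full record; this needs access to every store."""
    merged = {}
    for store in stores:
        merged.update(store.get(record_id, {}))
    return merged


stores = [{}, {}, {}]  # stand-ins for three separate files or servers
user = {"name": "Alice", "email": "alice@example.com",
        "device": "laptop-17", "plan": "enterprise"}
split_record("user-1", user, stores)
print(reassemble("user-1", stores) == user)
```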

Cloud Administrative Access

We learned that security infrastructure is incomplete without a solid security policy. The rules around who owns policies and who implements them should be clearly defined. Druva applied the following processes:

  • Clear separation of roles: In other words, the security team, the engineering team, and the operations team should be clearly defined and independent of one another.
  • Multi-level authorization to gain access to cloud servers
  • Audit trails for access and control
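These three processes can be sketched together in Python. Everything here is a hypothetical illustration: the `REQUIRED_ROLES` policy, the function and class names, and in particular the hash chaining of audit entries, which is one common way to make a trail tamper-evident and is my assumption rather than a description of our systems.

```python
import hashlib
import json

REQUIRED_ROLES = {"security", "operations"}  # hypothetical approval policy


def is_access_approved(approvals):
    """approvals is a list of (approver, role) pairs for one access request."""
    role_of = {}
    for approver, role in approvals:
        if role_of.setdefault(approver, role) != role:
            return False  # one person acting in two roles breaks separation
    return REQUIRED_ROLES <= set(role_of.values())


def _entry_hash(body):
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only log; each entry commits to the hash of the previous one."""

    FIELDS = ("actor", "action", "allowed", "prev")

    def __init__(self):
        self.entries = []

    def record(self, actor, action, allowed):
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"actor": actor, "action": action,
                "allowed": allowed, "prev": prev}
        self.entries.append({**body, "hash": _entry_hash(body)})

    def verify(self):
        """Return False if any entry was altered, removed or reordered."""
        prev = "0" * 64
        for entry in self.entries:
            body = {key: entry[key] for key in self.FIELDS}
            if entry["prev"] != prev or entry["hash"] != _entry_hash(body):
                return False
            prev = entry["hash"]
        return True


trail = AuditTrail()
approved = is_access_approved([("ann", "security"), ("bob", "operations")])
trail.record("carol", "ssh to storage node", approved)
print(approved, trail.verify())
```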

Physical Infrastructure

And lastly, the physical security of servers is critical. For this, we trust our cloud partner, AWS, and regularly check their internal processes and audit reports to ensure physical security of servers.

Overall, security has been our cloud team’s area of focus, and we learn something new every day. Hopefully these recommendations will help you in your cloud strategy planning and implementation.

Six Key Enterprise Cloud Trends for 2012

1. Cloud Security

There was a steep rise in the number of security breaches in 2011. Part of it was due to an increase in detection and reporting, but attacks have definitely grown more sophisticated over the past few years. The sheer size of the public clouds often makes it easier for an attacker to find a way in and, consequently, harder to detect the intrusion.

Leasing and sharing infrastructure comes with drawbacks, and a security threat could very well arise from within the cloud. Outside the secure enterprise firewall, the basic building blocks of security (authentication, encryption, etc.) have to be thought through all over again. The fact that someone else may hold the keys and passwords to your data can greatly affect your service architecture.

With wider adoption and increasing awareness, cloud providers will have to rethink security. With better security initiatives, new audit standards (ISAE 3000) and accountability, we will eventually experience a more secure cloud.

2. Will the Real Cloud Please Stand Up?

In 2011, a number of vendors launched “cloud” or “virtual cloud” editions of their traditional on-premise applications. Some analysts called it cloud washing (from brainwashing), because these applications were not designed for the cloud and missed some critical security and availability features.

As enterprises understand clouds better, the “real cloud” vendors will be more visible in 2012. The parameters around quality of service and enterprise SLAs will distinguish the “real” cloud vendors from the rest.

3. Big “Structured” Data and Evolving Cloud Storage

The promise of “scaling linearly” has made enterprises ride the cloud story much faster than they had planned. As enterprises accumulate more and more data in the cloud, managing it brings new challenges. Zettabyte clouds are not unheard of, and petabyte clouds are more common than before. And the challenge to scale is more complex for structured data, which has additional requirements such as quick access and transactional commits.

While technologies like Hadoop evolve with better offerings, the commercially available solutions aren’t yet mature enough to solve real-world big-data challenges. While most of the large cloud players are still using something home-grown, this year will see a big rise in commercially viable offerings that will help newer cloud providers scale better.

Another interesting trend to watch will be the changing nature of cloud storage. Unlike compute, cloud storage doesn’t offer any classification by quality of service. A hierarchy within the cloud, with “less available” or “offline” options, would save cost and is something we will likely see in 2012.

4. Integration

Public clouds are still considered external to an enterprise. This year will see better integration of the cloud with enterprise resources like Active Directory for improved security, single sign-on and seamless access.

Integration between different cloud services like Salesforce.com will also be key for different enterprise teams to exchange data and collaborate.

5. Standardization and Interoperability

The fear of single-vendor lock-in drove the need for some highly successful open standards like Open Storage, NDMP and NFS. With high adoption of object-based storage (AWS S3) and new NoSQL databases (AWS DynamoDB, Google Bigtable), new cloud players will be forced to make their solutions API-compatible with the leading players. This will drive customer adoption and better cloud-friendly designs, without the fear of vendor lock-in.

6. Cloud Balancing

Amazon is a role model for service-oriented architectures. As different cloud options become viable, new cloud application architectures may not restrict themselves to a single cloud backend for their different service components.

Load balancing, fear of single-vendor lock-in and the need for high availability of cloud infrastructure will open doors for cloud balancing. Vendors will look for ways to build cloud services with the best available components from different vendors and to use interoperability to balance workloads across different cloud backends.