I’d like to share Druva’s recent experience with AWS DynamoDB. As you may be aware, inSync cloud runs on top of AWS infrastructure. The backup servers run on EC2 instances and the backup data is stored as S3 objects.
A critical piece of the puzzle is the database engine that stores the dedupe and CDP (continuous data protection) index. Per our analysis, every 100 KB of backup data results in 12 reads and an equal number of writes to this index. So for a backup speed of 100 MB/sec, we need a database engine that can sustain 12,000 reads plus 12,000 writes every second. Not only that, the database should be able to scale out linearly, because an average customer’s data is about 1.2 TB, which adds 100 GB of metadata.
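The arithmetic behind those numbers can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names are hypothetical, and it assumes decimal units (100 KB = 100,000 bytes, 100 MB = 100,000,000 bytes), which is what makes the figures come out round.

```python
def index_ops_per_second(backup_bytes_per_sec,
                         chunk_bytes=100_000,      # assumed: 100 KB, decimal units
                         reads_per_chunk=12,       # from the analysis above
                         writes_per_chunk=12):     # "equal number of writes"
    """Return (reads/sec, writes/sec) the index must sustain."""
    chunks_per_sec = backup_bytes_per_sec / chunk_bytes
    return chunks_per_sec * reads_per_chunk, chunks_per_sec * writes_per_chunk


def metadata_ratio(data_gb, metadata_gb):
    """Fraction of metadata generated per unit of backup data."""
    return metadata_gb / data_gb


if __name__ == "__main__":
    # 100 MB/sec of backup traffic
    reads, writes = index_ops_per_second(100_000_000)
    print(f"reads/sec: {reads:.0f}, writes/sec: {writes:.0f}")

    # 1.2 TB of customer data (1,200 GB) adding 100 GB of metadata
    print(f"metadata overhead: {metadata_ratio(1200, 100):.1%}")
```

In other words, the index load grows in direct proportion to backup throughput, and each customer adds roughly 8% of their data size in metadata, which is why linear scale-out matters.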
Long story short, DynamoDB fits the bill. Running on top of SSDs allows DynamoDB to provide throughput guarantees, and its distributed nature gives it the required scale-out. Apart from that, the built-in replication improves availability and durability. Most of all, since it’s a managed service from AWS, the Druva team can focus on building superior backup software rather than on maintaining a distributed database.
The development process was very smooth. inSync has a very different database access pattern, and it has uncovered bugs in other databases that we tried in the past. But we did not come across any bugs in DynamoDB in spite of extensive testing. Our scalability tests also showed near-linear performance (more on that in another blog). Overall, it’s been a great experience. Thank you, AWS, for a great offering.