In part 2 of our enterprise series, Curtis and Stephen talk about three challenges that enterprise customers have when trying to protect their data: one very large server, many large servers, and data distributed across many locations.
They discuss how one would solve each challenge with an on-premises system, and then explain how Druva solves these problems. The Druva Data Resiliency Cloud is designed to handle all three of these challenges, and this episode will help you to understand exactly why.
W. Curtis Preston: Hi and welcome to Druva’s No Hardware Required podcast. I’m your host, W. Curtis Preston, AKA Mr. Backup, and I have with me our CTO, Stephen Manley, from his video recording studio. How’s it going?
Stephen Manley: I’m feeling very enterprise today. This is an enterprising kind of day for me. So I’m excited.
W. Curtis Preston: So on our previous podcast, when we were recording a podcast about what it means to be enterprise scale, you mentioned three challenges that an enterprise typically has when it comes to backup and recovery. I’m going to play that clip now.
Stephen Manley: There’s a lot of different things that come into what enterprise means, enterprise scale in particular, right? Cause you get some people that say, I have these massive databases or these massive NAS servers or big VM farms, and so to them it’s, can you handle the biggest thing I’ve got? For other people it’s, I’ve got lots of these things, right? I have tens of thousands of VMs. Can you scale to that? For other people, it’s actually, I’ve got lots of locations, right? So enterprise scale actually means, can you handle all these different locations?
Stephen Manley: And then of course my favorite ones are the ones that have all of those.
W. Curtis Preston: There are three different things. So we’ll talk about the first challenge, which is a really big server. When we go to design a backup system to back up a really big server, how would we do that if we were designing an on-premises backup system?
W. Curtis Preston: It’s interesting because in my experience, customers always feel the tension between two things. The ideal world is you’d like to break it up into smaller chunks and back them up in parallel, so you’ve reduced the backup window. Maybe it even optimizes recovery. But that’s always balanced by: I only get a certain number of streams, and it’s more things to manage.
Stephen Manley: And do I really want to try to pull all these together? So you get that push and pull. And so a lot of times the answer is, you need to buy the biggest, baddest box that we’ve got and get the biggest network you can between them, and just unleash whatever you can unleash for as long as you can unleash it.
Stephen Manley: And every three years you need to buy the new, bigger, badder box. It always feels like you’re not really doing it the right way, but it seems to be the way we always did it.
W. Curtis Preston: Well, that’s how many of our competitors have continued to address that problem. We still have very large servers, and there’s really no way you can back up a really large server without multiple streams if you have something measured in hundreds of terabytes or petabytes, or, God forbid, an exabyte.
W. Curtis Preston: I have met customers who have exabyte-sized systems. There is no way you’re going to back that up without doing multiple streams. And then when you go to restore that, the challenge is bringing back all of those multiple streams: they may not have been placed on the storage in a way that optimizes bringing the data back, the way placement was optimized for putting the data there in the first place.
W. Curtis Preston: Would you agree?
Stephen Manley: A hundred percent agree. Again, cause we’re doing everything we can to optimize and reduce the storage costs, and to optimize those backups, cause they happen every day. And again, everybody’s on edge: can I get this done in the backup window? And you mentioned exabyte-sized systems.
Stephen Manley: I had customers that said, my backup window at this point is a month, and they were struggling to get a system backed up in a month. That doesn’t mean it was running 24/7, but you get the idea that they had lowered expectations, and even then they were barely able to meet those lowered expectations.
Stephen Manley: At some point you gotta look at it in a different way. You can’t keep just trying to brute force this stuff.
W. Curtis Preston: And so when I look at the ways that we have redesigned backup, the first is that we’re one of very few vendors that use source-side deduplication. What that means is that when you’re backing up a very large server, once you get that initial backup done (which we should talk about), you’re only backing up the changed blocks that are new and unique to that server each day. So you’re reducing the amount of data that has to be transported across the network. And that’s probably more important for our customers than it is for, say, an on-prem data protection box. But what it does is make it more feasible to back up across the internet.
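To make the source-side deduplication idea concrete, here is a minimal Python sketch. It is not Druva’s implementation: the fixed 4-byte chunk size, the `BackupTarget` class, and the `backup` function are all assumptions made for illustration, and real products typically use much larger, often variable-size chunks. The point it shows is that the source fingerprints its data first and sends only the chunks the target has never seen.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks so the example is easy to follow


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class BackupTarget:
    """Hypothetical remote store that keeps one copy of each unique chunk."""

    def __init__(self):
        self.store = {}  # fingerprint -> chunk bytes

    def missing(self, fingerprints):
        """Return the fingerprints the target does not hold yet."""
        return [fp for fp in fingerprints if fp not in self.store]

    def upload(self, fingerprint, chunk):
        self.store[fingerprint] = chunk


def backup(data: bytes, target: BackupTarget) -> int:
    """Back up data and return the number of bytes actually sent."""
    # Fingerprint on the source, before anything crosses the network.
    by_fp = {hashlib.sha256(c).hexdigest(): c for c in chunks(data)}
    sent = 0
    for fp in target.missing(by_fp):
        target.upload(fp, by_fp[fp])
        sent += len(by_fp[fp])
    return sent


target = BackupTarget()
day1 = b"AAAABBBBCCCCDDDD"
day2 = b"AAAABBBBXXXXDDDD"  # one chunk changed since day 1
first_sent = backup(day1, target)
second_sent = backup(day2, target)
print(first_sent, second_sent)  # 16 4
```

The first backup sends everything; the second sends only the one changed chunk, which is why the daily transfer can be small enough to go over the internet.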
W. Curtis Preston: But I’ll challenge you and say, doesn’t that create the problem that you mentioned before? A customer gets that first big backup done, and then they’re able to back up this really large system because we don’t need to back up much each day. And now they have this one big monolithic image. Well, it’s not a single monolithic image; as we described in a previous podcast, we actually store it as many millions of little pieces. How does that not then create a problem when we go to restore that single large image?
Stephen Manley: Let’s face it, restore of a large system has always been the untold darkness, whether it’s a large database or a large NAS server. Even if I have a high-powered deduplication appliance with a hundred-gig network tied to whatever I’m trying to restore to, there’s only so fast that the thing you’re writing to can even lay that new data down. And so a lot of times I think people look and go, the network’s going to be the bottleneck. A lot of times the bottleneck is the thing that’s just trying to recover all that data. And so we try to guide our customers through a couple of things.
Stephen Manley: The first is, look, let’s be honest: if you’ve got a large NAS system especially, you should be using snapshots and replication for your disaster recovery. You just should, because having that data in a near-ready state is just a win. For other types of systems, you should think about where you could restore fastest.
Stephen Manley: And that’s one thing I think we educate people on: I’ve got all your data in the cloud, and I’ve got enormous ability to stream that data back to your account in the cloud. So have you thought about, for that rapid restore, restoring into the cloud, and then at your leisure you can pull that back on premises, or maybe decide never to move back on premises again? So I think it’s really important to take a holistic view of it. But you’re right. If you only tuned for backup, when the restore comes, you’ll find yourself with a bottleneck on the network or with a bottleneck on the restoring system. So let’s think through an entire plan for how you’re actually going to pull that off.
W. Curtis Preston: That’s a really good point, because there’s only one way around any kind of bottleneck for a large system restore or a large disaster recovery, where maybe it’s not just one system but you’re restoring an entire data center. The only way around physics is to not use physics. And what do I mean by that? You mentioned the fact that we can restore in the cloud.
W. Curtis Preston: What’s really important to understand is that we can pre-restore in the cloud for systems that you are concerned about. You specify the systems that you want to do cloud-based DR for, and we actually pre-restore those systems and have them ready in your AWS account. You create your own AWS VPC and we restore your image in there.
W. Curtis Preston: And that’s why, regardless of the size of the system, we can support a 15 to 20 minute RTO. There are some instant restore features that can do this, but that doesn’t work for a really big monolithic system. You’re not going to do an instant restore
Stephen Manley: At scale.
W. Curtis Preston: We can do it at scale, regardless of the size of the image, regardless of how many images we’re talking about. We can restore all of them in about 15 to 20 minutes. Why? Because we already restored them before you asked us to. That does require prior planning. That does require you understanding your environment and understanding the systems that you want to do this for.
W. Curtis Preston: And then there’s a little bit of setup up front, but that is the best way to do a restore of a large system or a large data center. A step down from that, we also offer CloudCache at no extra cost. So if a customer just wants a quicker restore of certain systems, they can load our CloudCache offering onto an on-premises system and back up to that system, which then copies to the cloud. We even offer a solution for customers that didn’t do any of that prior thinking: the sneakernet version, where we can ship data to them on an AWS appliance. It’s nowhere near as fast as it should be because, again, it’s physics; you can’t get a system from A to B that quickly. But it’s the best option you have if you did no prior planning. The point is that we do have that option, even for customers that did zero planning, and they’ve got this really big system, and they go and do the math and they say, it’s going to take me two weeks to do a restore. Well, we’re like, we have something faster than two weeks.
W. Curtis Preston: And that’s using the AWS Snowball Edge to have the data sent back to you. So we have three very different options for that large restore: two of them require prior planning, and one requires none. So let’s talk about the second challenge that you mentioned, which is many systems that may be of various sizes.
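The “do the math” step for a large restore over the network is simple arithmetic, sketched below in Python. The 100 TB size, the 1 Gbps link, and the 70% usable-throughput figure are illustrative assumptions, not numbers from the conversation, and `transfer_days` is a hypothetical helper name.

```python
def transfer_days(size_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate days needed to move size_tb terabytes over a link.

    efficiency is an assumed fraction of the nominal link speed that a
    restore actually sustains (protocol overhead, contention, and so on).
    """
    bits = size_tb * 1e12 * 8                    # decimal terabytes -> bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits / usable_bits_per_second / 86_400  # seconds -> days


days = transfer_days(size_tb=100, link_gbps=1)
ten_gig_days = transfer_days(size_tb=100, link_gbps=10)
print(f"{days:.1f} days at 1 Gbps, {ten_gig_days:.1f} days at 10 Gbps")
# 13.2 days at 1 Gbps, 1.3 days at 10 Gbps
```

Under these assumptions, a 100 TB restore over a 1 Gbps link lands at roughly two weeks, which is the kind of result that makes shipping a physical appliance worth considering.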
W. Curtis Preston: I think everybody has one system that to them is very large. And then they have many other systems. What sort of challenges do we have with an on-prem data protection system when we’re dealing with those kinds of systems?
Stephen Manley: The biggest thing I always saw in the on-prem environment for those tons and tons of little things was really two things. One is the management, right? It can be difficult to keep track of all those. Are they all getting backed up? Is there anything unusual happening with any of those backups?
Stephen Manley: Because there’s just so many little things going on, it’s really hard to track and monitor. I think the second thing that we would also see people struggle with at times was the performance of those, because a lot of times with those smaller systems, you’d have a limited number of connections, a limited number of streams, whatever term you want to use. And while all of them are small, you just couldn’t run them all at the same time.
Stephen Manley: And so you would miss your backup window, not because you had a deluge of data, but just because it took so long to set up and go through all of that.
W. Curtis Preston: We’ll go back to something we said in that previous podcast, and that is that when you’re backing up many systems, if you have a backend that is divvied up, so you don’t have a single backup system that can handle all of your servers, you have this problem. And when I say backup system, I mean literally, at least in the on-prem world, a single host, because if it’s another host, it’s going to be another dedupe database. And that’s the way we do backups these days. So if you don’t have a single backup storage device that can handle your entire environment when you have these many, many systems, the problem that you refer to, where you’ve got them all waiting in line, is absolutely true.
W. Curtis Preston: And as for ways to deal with that: if you’ve got several systems to send backups to, you have two choices. You can say, I’ve got a thousand systems and I’ve got 10 servers, so I’m going to send 10 systems, or I’m sorry, my math is bad, I’m going to send a hundred systems to each of my 10 backup servers.
W. Curtis Preston: That’s the best way from a dedupe perspective, because then you won’t get duplicate data across your different dedupe systems. But it’s not the best way from a get-all-your-backups-done perspective, because the problem you mentioned is you’ll have a bunch of servers waiting in line because they happen to be on a box that is busy, whereas there may be other appliances that have plenty of resources, but they’re not available.
W. Curtis Preston: And so what you end up doing is the spray-and-pray method, where you say, I’ve got these 10 servers and these thousand backups, just divvy them up, and that makes them work better. But as we mentioned in the previous podcast, it makes your dedupe poorer, and you end up with increased costs. You mentioned, how do I get a view into all of this? This is why I’m so hard on some of these other systems where they end up with so many little systems. I want a single place to go to create my policies and look at all my clients. I don’t want to have to log into many backup servers. Having a large, enterprise-scale system, Druva gives me that: regardless of the number of systems that we’re backing up, we can go to a single place and look at all of our backups and restores without all the challenges that we talked about before.
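The waiting-in-line effect of pinning clients to specific backup servers can be illustrated with a small scheduling sketch in Python. Everything here is an assumption for illustration: three servers that each run one backup at a time, made-up job durations in hours, and the hypothetical function names `makespan_pinned` and `makespan_shared`. It models only completion time, not the dedupe penalty of spreading clients across separate dedupe databases.

```python
import heapq


def makespan_pinned(groups):
    """Each server works only through its own pinned clients, one at a time."""
    return max(sum(group) for group in groups)


def makespan_shared(durations, servers):
    """Each backup goes to whichever server frees up first."""
    free_at = [0] * servers          # time at which each server becomes idle
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)            # earliest available server
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Server 0 happens to be pinned to the slow clients; the others sit idle early.
groups = [[8, 8, 8, 8], [1, 1, 1, 1], [1, 1, 1, 1]]
pinned = makespan_pinned(groups)
shared = makespan_shared([d for g in groups for d in g], servers=len(groups))
print(pinned, shared)  # 32 16
```

With pinning, the whole run takes as long as the busiest box; letting any free server take the next job halves the finish time in this example, which is the appeal of spraying backups around despite the dedupe cost.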
Stephen Manley: The value of the dynamic nature of the cloud is I can also do all sorts of more interesting analysis of what’s going on in your backup patterns. Because again, I’m only consuming resources when I’m doing that calculation.
Stephen Manley: Whereas if I were trying to build that into my box, I’ve got to persistently allocate extra compute, extra storage, extra everything. And you’re going to look and say, I’m not sure that value add is enough, because I’m not willing to pay more for the backup. Whereas with Druva, it just basically comes bundled in.
W. Curtis Preston: Yeah, that’s a really good point. We’re not the only data protection vendor that has started adding features onto the backup system, but with most of our competitors that are based on a box, you need to buy extra compute in order to get that job done. And that job quite often, again going back to what we said in the previous podcast about its cyclical nature, isn’t happening all the time.
W. Curtis Preston: And so you pay for that extra compute, whether you’re using it or not. So that’s another challenge. So let’s talk about the third challenge, which is this idea of having many servers all over the place. This has always been a challenge for any large company. And it’s now become a challenge for many smaller companies because of the remote work world.
W. Curtis Preston: I think that the need to back up laptops is greater right now than it was a couple of years ago. And there’s no greater example of the third problem than to have thousands of laptops all over the place, and at this point, in many parts of the world, nowhere near a data center.
W. Curtis Preston: I work for Druva. I have for four years. I haven’t been to the corporate headquarters in a while, and yet all my data is still backed up. So let’s talk about the way that particular enterprise challenge would be met by a typical enterprise backup system.
W. Curtis Preston: How would you do that?
Stephen Manley: Just to double down and triple down, right, cause some people say, a laptop? I’m now looking at, say, Microsoft 365. You still have to back that up. Or, yeah, I’ve got that, but we’re also doing a lot of M&A, so I have remote offices I have to worry about.
Stephen Manley: Okay. That’s a good point. We’re looking at doing more things distributed at the edge. Okay. That’s gotta be protected. So even if you look and say, I’m not as worried about the endpoints, you’re going to have something that is wildly distributed. And right now, for a lot of vendors, the answer starts with: you need to install an appliance, a virtual or a physical appliance, in some location that’s got proximity.
Stephen Manley: And again, you look at the distribution of workers with the laptops. That’s tricky. You think about trying to install a virtual appliance inside of Microsoft 365. I’d be curious to see how that works, or on your edge devices.
Stephen Manley: So to some extent the answer is largely, what we’re going to need you to do is back that up over the network to our appliance, even though that appliance and its communication mechanism are built for data centers. So it’s not good at low-bandwidth connections. It’s not good at lossy connections. It’s not good at high-latency connections. So you’re just going to have to brute force it, or alternatively cross your fingers and hope nothing goes wrong.
W. Curtis Preston: Yeah, that’s a really good point, and the reason that they are that way is that the bulk of them do target deduplication. I’m pretty sure I mentioned earlier in this podcast that we do source-side deduplication, meaning that we deduplicate the data before it’s ever sent. Target deduplication is where you dedupe the data at the storage target.
W. Curtis Preston: If you do the dedupe at your appliance, that’s why you have to have an appliance, even a virtual appliance, at every location that you’re backing up, because otherwise there simply won’t be enough bandwidth to do the backup. And we are very different because we do source-side deduplication, meaning we don’t need a local appliance everywhere in order to do the backup. We talk a lot about how much we scale up, right? We spent the last podcast talking a lot about how well we can scale up, better than a box-based vendor, but I don’t think we talk enough about how well we can scale down, maybe because it doesn’t sound sexy enough. But this is the third challenge of many systems all over the place, whether it’s many remote offices, and there are many industries where that is the situation. I think about a realty company, right? They have all these remote offices. I think about retail. I think about food service. Each one of those sites has a computer on it that has some data that is important to that site, but it’s also important to the corporation. And if you use most of our competitors, you would then need to put an appliance next to your cash register. It doesn’t make any sense at all. But with us, you can just put an agent on that system, and we will do source-side deduplication and back that up without needing any local appliance. And so to me, that’s why I say that we scale down, meaning that we don’t care how small something is.
W. Curtis Preston: You just have to put an agent on it and magic will happen.
Stephen Manley: Yeah. I often think of it as scale down; some would say scale out, though obviously that means different things to different people. It’s such a critical thing, I think. And again, I think data is getting more and more distributed, and that trend’s going to continue, because I want to move data closer to, again, if I’m doing AI/ML, I want data closer to what’s making the decision.
Stephen Manley: I want data closer to my users. I want data closer to my employees. Pulling it all into the data center just doesn’t work anymore. Going back to physics, I just can’t get that back and forth across the network fast enough. So it’s gotta be distributed, which means your protection has to be the thing that centralizes it, because nothing else will.
Stephen Manley: And that means you need protection that can do that scaling down into those locations, but also protection that is just incredibly network efficient and also network resilient. And I think that’s the thing that sometimes we forget about: you need to be network resilient, because in a lot of these locations, you’re not going to get the kind of performance you need.
Stephen Manley: You’re not going to get the kind of networking stability that you need. And so you need a system that is designed for that. And if we go back to what Druva’s origins were, which were these endpoints, we bring that ethos to everything else that’s happening, as opposed to starting from, I assume I have a really big, fast, reliable pipe.
Stephen Manley: We assume we’ve got a really small unreliable, slow pipe and we make the best of it.
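One common way to design for a small, unreliable pipe is to send data in small chunks and retry only the chunk that failed, rather than restarting the whole transfer. The Python sketch below illustrates that general technique; it is not a description of Druva’s protocol. The `FlakyLink` class, its fixed failure schedule, the 4-byte chunk size, and the retry limit are all assumptions, and a real client would typically also wait and back off between retries.

```python
class FlakyLink:
    """Simulated unreliable connection that drops specific send attempts."""

    def __init__(self, fail_on):
        self.fail_on = set(fail_on)  # attempt numbers that will fail
        self.attempts = 0
        self.received = {}           # chunk index -> chunk bytes

    def send(self, index, chunk):
        self.attempts += 1
        if self.attempts in self.fail_on:
            raise ConnectionError(f"dropped on attempt {self.attempts}")
        self.received[index] = chunk


def resilient_upload(data, link, chunk_size=4, max_retries=3):
    """Send data chunk by chunk, retrying only the chunk that failed."""
    parts = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for index, chunk in enumerate(parts):
        for attempt in range(max_retries + 1):
            try:
                link.send(index, chunk)
                break                      # chunk delivered, move to the next
            except ConnectionError:
                if attempt == max_retries:
                    raise                  # give up after too many failures
    # Reassemble what the far side received, in order.
    return b"".join(link.received[i] for i in range(len(parts)))


link = FlakyLink(fail_on={2, 3, 6})
payload = b"small unreliable slow pipe"
result = resilient_upload(payload, link)
print(result == payload, link.attempts)  # True 10
```

Seven chunks needed ten attempts here; the three dropped sends cost three small resends instead of a restart of the whole payload.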
W. Curtis Preston: Right. I like that a lot, because some people would ding us: oh, they’re a laptop backup company. First off, we’re not. We were; that was our original product, but we’ve added multiple products to it along the way. And we designed them with that viewpoint, whereas a typical data center protection product was designed with a very different viewpoint.
W. Curtis Preston: And I would argue that our viewpoint is more appropriate. And I think that’s the argument that you were making. Our viewpoint is more apropos to today’s networking environment, as well as today’s compute environment, because some of it’s going to be in the cloud. Some of it’s going to be on somebody’s laptop.
W. Curtis Preston: Some of it’s going to be in a big data center. Some of it’s going to be in a lot of little data centers. And our way of looking at that, that we’re going to handle whatever network connection you throw at us, is able to meet the needs of all of those environments. Does that seem about right?
Stephen Manley: A hundred percent, a hundred percent.
W. Curtis Preston: Yeah. I think we’ve talked enough about this topic. Thanks again for chatting.
Stephen Manley: Love it, love it. It’s all about the scale, man. We are enterprise scale.
W. Curtis Preston: It’s all about the scale. And we want to, once again, thank our listeners. We’d be nothing without you. And don’t forget to subscribe so that you don’t miss an episode. And remember here at Druva there’s no hardware required.