
How is a Ransomware Recovery Different than a Disaster Recovery?

W. Curtis Preston, Chief Technology Evangelist

If you're thinking that you're prepared for a ransomware attack because you have a well-documented and well-tested disaster recovery plan, this episode may surprise you. The good news is you've done one of the many steps to get ready for a ransomware recovery. This episode focuses on the rest of the steps, which are much more about your cyber response than your DR response. How will you know what to recover? How do you stop the attack while it is in progress? How do you ensure your backups weren't also infected? We answer these and other questions in this episode.

[00:00:00] W. Curtis Preston: This week on No Hardware Required, we're talking about how a ransomware recovery is different than a disaster recovery. With me as always is my co-host Stephen Manley. Thanks for joining.

[00:00:14] W. Curtis Preston: Hi, and welcome to Druva's No Hardware Required podcast. I'm your host, W. Curtis Preston, a.k.a. Mr. Backup. And I have with me the guy who always knows which one of these is not like the other, Stephen Manley. How's it going, Stephen?

[00:00:31] Stephen Manley: It's going well. I mean, and usually the easiest answer is Curtis, you're not like the other, so,

[00:00:37] W. Curtis Preston: That is true. When I was thinking about what we were gonna talk about on the episode, the little thing from Sesame Street just kept coming up in my mind. You know, you remember that? Which one of these...

[00:00:50] Stephen Manley: Oh, of, of course, of course. And, and, yeah. And I, I know exactly where you're going with this. I've done all sorts of recoveries in my life, and I will tell you, ransomware recovery is not like any recovery you've ever done in your life.

[00:01:03] W. Curtis Preston: No, it's not. You know, once again, we find ourselves talking about ransomware because, well, it's where the pain is right now. It's a real concern. And the thing is that unless you've actually been hit by a full ransomware attack that took out a significant part of your infrastructure, you don't quite understand the difference.

We say it a lot, that there's a difference between ransomware recovery and disaster recovery. You know, we've done a lot of DRs over time, right? The quickest way I explain it to people is: typically, when you do a DR, the D is over, right?

The flood is gone. The waters have subsided. The hurricane has passed. The sinkhole has taken out your entire data center. I grew up in Florida, by the way, so we were all about sinkholes.

[00:02:09] Stephen Manley: So I was gonna say, these disasters, this was just called Wednesday for you?

[00:02:14] W. Curtis Preston: Absolutely. Or as we call it, Wednesday. Yeah. I actually was only a few miles from the infamous Winter Park sinkhole, which took out a couple of city blocks, including a Porsche dealership. So that's a disaster recovery. Something horrible happened.

[00:02:38] Stephen Manley: Yeah.

[00:02:39] W. Curtis Preston: Now you have to put everything back together.

How is a ransomware recovery different than that?

[00:02:46] Stephen Manley: Well, so there's a couple, right? And again, we're helping multiple customers a week at this point recover from ransomware attacks. So the first one is, you know, when we get the call, and sometimes we're the ones calling them saying, hey, we see something funny, either way that goes, it's still happening, right?

It's not the, "All right, well, we've removed the malware from our environment and we know it's clean now." So one is, it's still happening. And then the second piece is, and that means, we trust nothing. We can't trust that the production environment's okay. We can't trust that the backups haven't been compromised until we go through and look at things.

We have to assume that everything is broken, and so those two fundamental assumptions are so different than a usual disaster, which is, again, a past-tense event where you can then look and say, well, this is still running, and that's gone. You know, this is the, I don't know what's still happening and I don't know if any of this is okay.

[00:03:53] W. Curtis Preston: Yeah. I don't know, even about the servers that are running. I actually hadn't thought about that before. In a disaster, it's really easy to go, that's on fire, that's not. In a ransomware recovery, it's, well, the servers are all still running, right? Which ones are not doing what they're supposed to be doing?

Which is why the very typical, I don't know, did you ever watch Alias back in the...

[00:04:22] Stephen Manley: Oh yeah. Jennifer Garner and uh,

[00:04:25] W. Curtis Preston: Yeah. There's a scene in there where Marshall Flinkman, right, I love that character, goes running in and he's just flipping power switches in the data center.

In his case, they had detected some exfiltration of data, and his immediate response was to flip all the power switches. It's really funny. Well, that's now what's happening, right? The immediate, normal response is to shut things down. And there's a variety of ways to do that.

Do you want to talk about that?

[00:05:00] Stephen Manley: Well, so certainly in our experience, we've seen some customers, and you do get into that, you know, I don't wanna shut everything down. We did have a new customer that, in their old ransomware attack, basically powered everything down and found out that the ransomware had affected only two servers.

It took them four weeks to get their environment back up. So shutting everything down is a little dangerous, because they didn't necessarily know what everything was, where it was, how to bring it up. So that's dangerous. We have other customers who say, okay, well, what I'm gonna do is I'm gonna lock down any external and internal network access.

So we're gonna shut down access, so at least we keep it contained. But of course that often opens up questions inside the company of what's going on. Are we under attack? And you don't necessarily want that to leak to the public, you know, through one of your employees, before you know more.

And so we get others who try to just segment off a part of their network and say, well, this area is down, at the risk, of course, that the malware has spread further than where they're shutting down. And that is one of the challenges you face: how destructive is this going to be in terms of how long it will take me to recover? What's the right blast radius to cover so that I'm not shutting down more than I need to, but I'm also not leaving myself exposed to an ongoing attack?

And at this point I'd say there's no company I've worked with that has the right answer, right? Everyone's kind of making it up as they go.

[00:06:36] W. Curtis Preston: Well, the only right answer, I would say, and it's not among all those choices, is to basically have rules of engagement, right? You know, I'm former military. When you're in theater, you have rules of engagement. There are certain things you can do when you're being shot at.

There are certain things you can do when you're not being shot at, and you can escalate based on what's happening. And so I would think that the only right answer is to have this conversation now. Right? To say, okay, if we have this scenario, this is going to be our response. If we have one server that appears to have ransomware, we're gonna react like this. If we have a bunch of servers, we're gonna react like this.

And if your decision is, for example, flipping off breakers, that's a relatively easy one, I suppose. And by the way, that's assuming you have a physical data center.

[00:07:44] Stephen Manley: Yeah.

[00:07:44] W. Curtis Preston: Right. And by the way, I speak as a person who pushed the EPO button at the wrong time.

So I remember how easy it was to turn off power in the entire data center, which, by the way, extended an outage. Sorry, sorry, sorry. I'm still apologizing for that. But the other is, I like the network segmentation, right? Cutting off, like, we're gonna do that. So my point is, if that's your plan, make sure you know in advance how to do that automatically, right?

That's my preferred method. Again, there's no right or wrong answer, but that's my preferred method. The way ransomware works, generally, is that as soon as it infects a system, it wants to propagate itself. It wants to spread out in the environment. So let's stop. No more talking.

Nobody's allowed to talk. Right. And yeah, if I knew there was a server that had been infected, I would probably immediately shut that one off. But, you know, I don't know what I'd do outside of that. The point is to have that conversation now, plan for it. I live in San Diego. We have a bail bondsman here, and the commercial is, "Better to know me and not need me than to need me and not know me," right? So it's better to know what to do when a ransomware attack happens and not need it, although the chances of that are relatively small, than to not plan.

Again, I'll quote what happened at Rackspace: it took them two weeks to figure out what their plan was. That should have happened way beforehand. So that's the biggest thing, I think. That's the first thing, sort of stopping the further damage.

Um, what, what do you think's next after that?
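One way to picture the rules of engagement discussed above is as a lookup that is agreed on before any incident. The following is only a minimal sketch in Python: the thresholds, the actions, and the decision owners are all hypothetical placeholders, not recommendations made in this episode, and a real plan would be written by the organization's own security, legal, and operations teams.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    max_infected: int          # rule applies up to this many suspected hosts
    actions: tuple[str, ...]   # pre-agreed containment steps
    decision_owner: str        # who has authority to trigger them


# Hypothetical escalation ladder, ordered from smallest to largest scope.
RULES = (
    Rule(1, ("power off the suspected server",), "on-call engineer"),
    Rule(10, ("power off suspected servers",
              "isolate the affected network segment"), "security lead"),
    Rule(10**9, ("block all internal and external network traffic",), "CIO"),
)


def response_for(infected_count: int) -> Rule:
    """Return the pre-agreed response for the number of suspected hosts."""
    if infected_count < 1:
        raise ValueError("no suspected hosts, nothing to escalate")
    for rule in RULES:
        if infected_count <= rule.max_infected:
            return rule
    return RULES[-1]  # anything larger falls under the broadest rule
```

Calling `response_for(5)` would return the second rule in this made-up ladder. The value is less in the code than in having the table settled, and its owners named, before anyone needs it.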

[00:09:49] Stephen Manley: So there is one thing I want to add to what you said in terms of having a plan. One of the most key parts in that plan, by the way, is understanding roles and responsibilities of people, because there are too many times where we've seen analysis paralysis in terms of, well, should legal be involved? Should we bring in HR?

Should we, when does the board get notified? What general man? And so again, having rules of engagement about who gets involved and who's got authority, that becomes really critical as well. So you need your technical plan, but you also need sort of the people plan, because the technical plan will never be 100% accurate, right? There'll be something that happens differently than you expect, and you're gonna need the right people marshaled together to be able to make that call. Now, once you've got that set up, your second part here is you are gonna have to basically do an investigation and, you know, sort of understand how did the ransomware hit, where did it come through, how did it spread?

And the lifeblood for any sort of investigation like that is you're gonna need logs. And it's always fascinating to me the number of organizations that say, wow, you know, we've rotated those logs out, we rotate every week, et cetera, et cetera. And the first question we ask is, "So that server you're storing those logs on, or that NAS box, or wherever you're storing it, it looks like we back that up?"

"Oh yeah. Can we restore those logs?" You know, we can, let's get you that data back. And so I think very often backup teams don't think they have a role till it comes time to recover. But for a lot of backup teams, you hold that critical log information that your security analysts are going to need to be able to understand what's going on.

[00:11:39] W. Curtis Preston: I like that a lot. I think the hardest part is when you get to the actual recovery phase. We stopped the damage.

We're using the logs. We did some forensics. We figured out, all right, these are the VMs, the servers, the applications, whatever, that have been affected.

The next step is to do a recovery. The challenge then becomes, how do you recover? You know, the challenge then becomes dwell time, right? The ransomware has been in your environment for some period of time and you don't know how long it's been there. How do you restore? I know we have a great solution for the file system restore.

I think the biggest challenge for anybody is, how do you restore a VM and know that you're not just restoring another already infected server? What do people do there?

[00:12:42] Stephen Manley: Right. Well, I'll tell you the first thing that most people forget when it comes to recovery, because we do get that question of how fast am I gonna recover, what version am I gonna recover. The first thing you've gotta prove is that your backup system wasn't compromised. And most organizations don't think about that. Again, this is that pre-thought. So you've got security analysts coming in, investigators coming in.

What do you need to show them such that they say, okay, your backup system is clean? So that's one of the things that we've really encouraged customers to do, because, as we pointed out on previous podcasts, your backup servers, your backups themselves, are often a target.

And so, you know, how do I ahead of time sort of say, look, we've already proven X, Y, and Z. This system is resilient. And then when you need to do the final sign-off, we'll give you, you know, logs or data A, B, and C as well. And having that lined up means that now I can start to use my backups. Until then, your backup system is assumed compromised, the same as anything else, and you don't wanna wait a week to get the clean bill of health from, you know, whatever external forensics organization you're using.

[00:14:00] W. Curtis Preston: And I know that that's easier for our customers to answer than it is for those who have on-premises based backup systems.

[00:14:09] Stephen Manley: Right, and that's one of the things I always tell our customers: look, if it's an on-prem thing, you know, you've gotta show them the servers weren't compromised. You've gotta show 'em the storage wasn't compromised. You've gotta show 'em the software wasn't compromised.

With Druva, there's none of that, right? So yeah, again, let's get you a copy of the logs. We'll show you, you know, anomaly data. We'll do all that. But you're not having to do deep forensics on any hardware because there's no hardware to do forensics on. So that's really critical in terms of getting that turnaround time.

Now, once you've done that, then you want to be able to show your security team, look, here's the versions of backups I'm gonna recover. Again, that ability to show them, look, this is where we saw unusual activity. You know, we've quarantined these backups. This matches up with your timeline.

Great. We think these are clean backups. That becomes job one. Job two then is, you know, sort of recovering to some sort of sandbox area where you can test it, you can scan for the malware signatures, all those sorts of things. And if you're gonna want a clean environment, cloud's your best bet, because there's not a lot of people that have standby data centers

[00:15:23] W. Curtis Preston: Right, right.

[00:15:24] Stephen Manley: for a ransomware attack.

So being able to spin up a clean VPC in AWS, for example, where you can do all of that work is enormously powerful, because again, your security team's gonna say, yeah, clean environment, it's all net new. You know, we already assessed before any of this happened that this is the way we're gonna proceed, and we can run our malware scans against what's in that.
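The "job one" Stephen describes, lining up backup versions against the investigation timeline, can be sketched roughly as follows. This is an illustration only: the `Backup` record, its fields, and the `latest_clean_candidate` function are hypothetical names invented here and do not represent any Druva API. The result is only a candidate; as discussed above, it would still need to be restored into a sandbox and scanned before anyone trusts it.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Backup:
    backup_id: str
    taken_at: datetime
    quarantined: bool = False   # flagged during the investigation


def latest_clean_candidate(
    backups: Iterable[Backup],
    first_suspicious_activity: datetime,
) -> Optional[Backup]:
    """Pick the newest non-quarantined backup taken before the earliest
    suspicious activity on the security team's timeline."""
    candidates = [
        b for b in backups
        if not b.quarantined and b.taken_at < first_suspicious_activity
    ]
    # default=None covers the case where dwell time exceeds backup retention
    return max(candidates, key=lambda b: b.taken_at, default=None)
```

If the function returns `None`, no retained backup predates the attacker's known dwell time, which is itself something worth discovering in a tabletop exercise rather than mid-incident.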

[00:15:48] W. Curtis Preston: Yeah, I like that. And of course there's the fact that you can create that clean environment programmatically, right? Hardware as code, right? It's not so easy to do that with a bunch of boxes.

[00:16:04] Stephen Manley: Yeah, I mean, you're either buying a bunch of new boxes, or again, you've gotta get each of those boxes certified that they're not infected, which is gonna take time and...

[00:16:15] W. Curtis Preston: Yeah, it's funny. So, you know, the point of this particular episode was to talk about the difference between a ransomware recovery and a disaster recovery. And I was just looking at the timer. We didn't get to the part that most people think of, the restore part, until 17 minutes into the discussion.

And so again, the only thing I can say is have this conversation now. Right? What will we do? This is an important meeting to have, an important planning session to have, and then to go through the process, to go through a tabletop exercise of what will we do and how will we do it. And then get to the point, and again, I know it's the whole, to a hammer, everything looks like a nail.

I'd say at least, you know, outsource the recovery part to Druva so that at least the part that should be easy is actually easy.

[00:17:20] Stephen Manley: Yeah.

[00:17:22] W. Curtis Preston: Right? The rest will still be difficult. We can make it less difficult because we can take parts off the table, like you said, that at least you know your backups aren't compromised, which is something that you can't say if you're running an on-premises backup infrastructure.

But yeah, this is a conversation. It's time.

[00:17:42] Stephen Manley: Yeah. And the key thing I think for any customer, what I tell them is, you know, so often in other recoveries we always talked about speeds and feeds, which was always a little bit illusory anyway, because usually the bottleneck in your recovery is the system you're recovering to, trying to write all that data out, versus the system you're pulling from. But whatever, right?

So even tapes usually could stream data out much faster than servers could restore it. But even more so in this, that clock for when you're needing to pump a bunch of data back, you know, like we said with Rackspace, it might be a couple of weeks before you even start that. So the key thing is, how do you reduce that window?

Because the windows that involve people interacting with each other, and overworked security people trying to sort of give you the blessing on things, those are the things that are gonna take forever. The actual recovery of data is, again, we're 20 minutes in now, the part that's actually gonna be easiest for you to take care of.

So it's all about the planning ahead of time, getting everybody lined up, getting the testing in place, getting the pre-approvals in place, so that when the bad thing happens and you need to recover, everybody says, yep, we already, you know, it's just X, Y, and Z. Let's get that. That's when you're gonna be the hero in the organization.

It's not 'cause you ran 12% faster on your throughput. It's that you took two weeks off the recovery time to begin with.

[00:19:14] W. Curtis Preston: Right. Absolutely. All right, well, thanks for having the chat again.

[00:19:19] Stephen Manley: Hey, anytime I can make, uh, make you worry even more about ransomware, I feel like I've done my job.

[00:19:28] W. Curtis Preston: And for those of you out there who are worried and listening, thanks for that. Also, be sure to subscribe so that you don't miss an episode. And remember, here at Druva, there's no hardware required.