Should You Scan for Malware Before or After a Backup?

W. Curtis Preston, Chief Technology Evangelist

There are currently two ways to scan for malware in your backups: as you're backing up (before) or as you're restoring (after). Which one is the correct way? Mr. Backup and Stephen Manley, Druva's CTO, discuss this topic, using the decades of experience they both have. They think scanning on restore is the right way, and they explain just why that is. You also learn something new and interesting about Stephen... Who would have guessed?

[00:00:00] W. Curtis Preston: This week on No Hardware Required, we're talking about when you should scan for malware in your backup. With me as always is my co-host Stephen Manley. Thanks for joining.

[00:00:11] W. Curtis Preston: Hi and welcome to Druva's No Hardware Required podcast. I'm your host, W. Curtis Preston, a.k.a. Mr. Backup, and I have with me my scanning specialist. At least that's what he is gonna be for the next 15 to 20 minutes. Stephen Manley, how's it going?

[00:00:26] Stephen Manley: Oh, it's great. You know, this is really good because this takes me back to my Latin days when we had to do scansion and elision and understand exactly... That's what we're talking about, right?

[00:00:36] W. Curtis Preston: You know, there's no good language like a dead language.

[00:00:40] Stephen Manley: I'll tell you, it doesn't change on you once you learn it. There's no new slang to pick up.

[00:00:47] W. Curtis Preston: No new slang. So you actually took Latin in college?

[00:00:51] Stephen Manley: High school. I took four years of high school Latin. Most people don't know this because I don't tell them, but I actually went to the National Junior Classic, uh, Latin competition in San Diego that year. And I will tell you that, uh, some of the coolest people you're ever gonna meet. I got third place in, you know, oratory. I did one of Cicero's orations in Latin. I mean, I was a beast. I was such a nerd in high school. Yeah.

[00:01:27] W. Curtis Preston: And this has changed how?

[00:01:29] Stephen Manley: Well, you know, now I'm just not in high school.

[00:01:34] W. Curtis Preston: Yeah, exactly. Um, well, we're gonna talk about something pretty nerdy today. We're gonna talk about virus scanning and, you know, in the backup world we started, I think, I don't remember... You know, I've been around backups a minute, as you could say. And,

[00:01:59] Stephen Manley: Is back. We used to speak Latin in those days.

[00:02:01] W. Curtis Preston: Yeah, exactly, exactly.

For the record, I didn't take Latin in college. I took Greek, uh, which, you

[00:02:10] Stephen Manley: hadn't been invented yet.

[00:02:11] W. Curtis Preston: similarly, uh, worthless. Anyway, man, we've lost anyone who was like, "Fine, I'll listen to this podcast." Um, but my point is that back in the day, we never thought about scanning the backup. Right.

You scanned the devices. Um, and then, you know, I suppose maybe after a restore you might scan the device again, but no one thought of this idea of scanning the backup. I don't have anything against it, but it's just a relatively new concept. And so, um, does that match your experience as well?

[00:02:50] Stephen Manley: Going a little less far back on the wayback machine: in the early 2010s, which is, wow, now a decade ago, at Data Domain we did actually experiment and file a couple of patents on searching for, basically, malware signatures inside a dedupe array. Um, we never shipped it because it turns out that there were literally, at the time, no customers who wanted it, but

[00:03:18] W. Curtis Preston: Other

[00:03:18] Stephen Manley: advanced development project. It was super cool. Yeah.

[00:03:21] W. Curtis Preston: Interesting, interesting. Yeah. So it's a good idea, right? You know, more scanning is better than no scanning. Right. Um, there is a question though, in the market right now, depending on which vendor you talk to, and that is whether you should scan before you back up, you know, before you actually store it on the storage device, or if you should scan it on restore.

And this is one of those things where, well, if you don't know anything about the technology, I would think that the default might lean toward, well, of course you wanna scan it before, right? Why would you want to back up infected data? If it's got malware in it, why would you want to back it up?

That sounds like a good idea. The more I thought about it, um, the more problems I came up with, with scanning before. What do you think about that?

[00:04:30] Stephen Manley: Yeah, I mean, I think the heart of it came down to, and again, having at one point had a box that had this capability but never shipping it, the thing we constantly got from customers was twofold. Maybe three, but two main ones. Number one was, okay, so, um, you're scanning for stuff.

Uh, I need you to be scanning for things I don't know about rather than scanning for things I do, because if I did know about it, my scanners on the way into my environment would've caught it. So appreciate it, but that's not gonna work. And then the second one was, you know, so you scanned it and it was good.

Then I get a whole bunch of new virus signatures I wanna scan for. You just end up in this never-ending loop of scanning. And so the customers went, this feels like what you're trying to do is justify that you have a box that's idle 12 hours a day and you're trying to convince me you're doing something with it.

And we looked at them and we said, "No, not at all."

[00:05:40] W. Curtis Preston: Yeah, I think that, that, that first thing that you said, I think that's the first obvious one, is it's using the same signature set that you had that day when you're, um, backing up, right? When you're, when you're running your regular, uh, the production virus scan and perhaps you've got a virus scan even on the firewall or something, you've got it on the email, right?

There's virus scans in a lot of places. But they tend to be often acting on the same information set, right? The same, um... would IOCs be the right term to use there?

[00:06:16] Stephen Manley: I think IOCs is fine. Again, what we did in the DD is we turned those into a hash, 'cause then we just scan the dedupe table for hashes. But yeah, it's the same idea.
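
[Editor's sketch] The hash-lookup idea Stephen mentions can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not Data Domain's implementation: the `DedupeStore` class, the fixed 4 KB chunk size, and the use of SHA-256 are all hypothetical choices made for illustration. It shows why the lookup is cheap (known-bad hashes are intersected with the chunk index rather than rereading data) and also why it only finds content whose chunk hash is already known to be bad.

```python
import hashlib

CHUNK_SIZE = 4096  # assumption: fixed-size chunks; real dedupe systems often chunk differently


class DedupeStore:
    """Hypothetical, toy deduplicating store keyed by chunk hash."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}       # chunk hash -> chunk bytes (stored once)
        self.recipes: dict[str, list[str]] = {}  # file name -> ordered chunk hashes

    def backup(self, name: str, data: bytes) -> None:
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # duplicate chunks are not stored again
            recipe.append(digest)
        self.recipes[name] = recipe

    def find_infected(self, bad_hashes: set[str]) -> list[str]:
        # Intersect the known-bad hashes with the chunk index; no data is reread.
        hits = bad_hashes.intersection(self.chunks)
        return sorted(
            name for name, recipe in self.recipes.items()
            if hits.intersection(recipe)
        )


store = DedupeStore()
store.backup("clean.txt", b"quarterly numbers")
store.backup("evil.bin", b"pretend malware payload")

# Assumption: the IOC has already been converted to a chunk hash.
bad = {hashlib.sha256(b"pretend malware payload").hexdigest()}
infected = store.find_infected(bad)
```

The limitation raised in the conversation applies directly: if the bad hash is not in the set on the day of the scan, the lookup finds nothing.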

[00:06:24] W. Curtis Preston: Right. Which of course stands for indicator of compromise, um, which is basically, I don't know, a virus signature. That would be a similar phrase. So you have that same database. When you're scanning tonight, you have basically the same information you had when you got the virus in the first place.

So if you got it today, how would you catch it tonight? What you really want is to sort of go forward in time. I need you to go forward, uh, a week from now and scan the backups from a week ago, right? Um, 'cause that's what I was saying. It sounds like a good idea to scan on the way in, but you're not gonna catch everything, right?

And so, um, you would continually be scanning all the time. And I guess the question that I would have at that point is, to what end? Right? Why do you need to scan data that is not currently being used by anything? Uh, and one possible answer to that question is if that data being inside your backup system can do it damage. I don't know if that's the case anywhere. I'm gonna be very honest here: this definitely falls into the FUD category. I don't know if this is true of any competitors. Um, but that is a possible reason why you would want to go and scan it. Um, and again, I'm gonna maybe back up my statement.

I'm not suggesting this is true of competitors. I'm just saying that would be one reason you'd want to do it, right? If that's the case. I don't think that's the case with anybody. Um, but I know it's not the case with us, right. That

[00:08:30] Stephen Manley: Um,

[00:08:31] W. Curtis Preston: you could have the most infected thing sitting in the backup and it's, it's inert, right? It's stored in a place that it can't be executed.

[00:08:41] Stephen Manley: I, I think the only one you might worry about there, and again, this is, this is something that, that struck me, is, is yeah, I think if it's a pure backup environment, we know that that data tends to be deduped. It tends to be, you know, tarred up or, or some format, and so it's gonna be really unlikely that it executes.

The one exception to that is there are some systems that say, "Hey, I store your backups, but I can also serve as a NAS server for you," and that may be something that they do have to worry about, is

[00:09:10] W. Curtis Preston: Okay.


[00:09:11] Stephen Manley: files really are stored in an accessible format by an active system. But again, to your point, I think the biggest thing is, you know, like those customers told me years ago: of all the places I want to scan for malware, backup should be like the last thing. I wanna find it at every stage before that.

But to your point, you know, there may be cases where if the backup storage system is shared, I might wanna be scanning that storage system just to be sure that, you know, something that is in one area can't infect and, uh, destroy a different area.

[00:09:50] W. Curtis Preston: Yeah, exactly. The alternative to scanning before, right, or scanning on backup, is to scan on restore. Um, and this one would seem as good as a scan can get, given that you can hold onto the data for as long as you'd like. And then when you go to finally restore, we can scan it using the most recent, um, you know, IOC database, hash table, whatever term you want to use, that's available at that time, which by design will be at a future time from when you backed it up, right?

It might be a day, it could be weeks or months. Um, and also it's one scan, right? It's not continual scanning throughout time. That's a waste of time and money and effort and resources and all of that. Um, it's one scan that's done at the time when you're placing the information on the server.

Does that seem about right?
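
[Editor's sketch] The timing argument Curtis makes here can be shown with a small example. This is only an illustration under assumptions: the signature feed, its publication dates, the file names, and whole-file SHA-256 matching are all invented for the sketch and do not describe any vendor's scanner. The same backup is scanned twice, once with the signatures known on the backup day and once with those known on the restore day.

```python
import hashlib
from datetime import date


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical signature feed: file hash -> date the signature was published.
SIGNATURES = {
    sha256(b"old-worm"): date(2023, 1, 1),
    sha256(b"new-ransomware"): date(2023, 3, 10),
}


def known_signatures(as_of: date) -> set[str]:
    """Signatures a scanner could have had on the given day."""
    return {h for h, published in SIGNATURES.items() if published <= as_of}


def scan(files: dict[str, bytes], as_of: date) -> list[str]:
    """Return names of files matching any signature known as of that day."""
    known = known_signatures(as_of)
    return sorted(name for name, data in files.items() if sha256(data) in known)


backup = {
    "report.doc": b"quarterly numbers",
    "invoice.exe": b"new-ransomware",  # infected before its signature existed
}

backup_day = date(2023, 3, 1)    # assumption: signature published nine days later
restore_day = date(2023, 3, 20)

found_at_backup = scan(backup, backup_day)    # scanner cannot know this malware yet
found_at_restore = scan(backup, restore_day)  # one scan, with the newer signature set
```

Scanning at backup time would have to be repeated every time the feed changes to reach the same result that a single restore-time scan gets.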

[00:11:02] Stephen Manley: And to your point, right? By the time you're doing it on the restore, you have the signatures that you're looking for. So it's not this hopeless, "Well, maybe I'll find something." It's like, you know what you're looking for. You've done your incident response to the point where you're looking for something.

It does seem a lot more productive, certainly, or at least more practical in terms of, "Oh, this is why I'm doing this," as opposed to the other way, which just feels like, eh, it's something I could do.

[00:11:30] W. Curtis Preston: Right. And you brought up a really good point. So in our case, if what we're talking about is a ransomware attack, you are going to have, you know, we have another episode where we talk about what is an incident response plan. In that response plan, one of the things that's going to be determined is what it is you are actually infected with. Get the IOC of that malware. And then in our case, you can directly give it to us. You're like, "This is the malware that got us. Please look for this when you're scanning." So you can directly target the specific piece of malware that you don't wanna show up, rather than just blindly scan for all of them. You can do both, but you can directly target that piece that you're trying to stop. Is that right?
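
[Editor's sketch] A restore that targets a specific IOC from incident response, optionally alongside a general signature set, might look roughly like this. Everything here is hypothetical: `restore_with_scan`, its parameters, and whole-file SHA-256 matching are illustrative assumptions, not Druva's API or behavior.

```python
import hashlib


def restore_with_scan(
    backup: dict[str, bytes],
    targeted_iocs: set[str],
    general_iocs: frozenset[str] = frozenset(),
) -> tuple[dict[str, bytes], dict[str, str]]:
    """Scan each file once at restore time.

    Returns (restored files, quarantined file name -> reason).
    Targeted IOCs come from incident response; general IOCs are optional.
    """
    restored: dict[str, bytes] = {}
    quarantined: dict[str, str] = {}
    for name, data in backup.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in targeted_iocs:
            quarantined[name] = "targeted"   # the malware identified in the incident
        elif digest in general_iocs:
            quarantined[name] = "general"    # anything else in the broader set
        else:
            restored[name] = data            # only clean files go back to production
    return restored, quarantined


backup = {
    "payroll.xlsx": b"salary data",
    "updater.exe": b"ransomware-that-got-us",
    "toolbar.dll": b"older-adware",
}
targeted = {hashlib.sha256(b"ransomware-that-got-us").hexdigest()}
general = frozenset({hashlib.sha256(b"older-adware").hexdigest()})

restored, quarantined = restore_with_scan(backup, targeted, general)
```

Checking the targeted set first reflects the priority described in the conversation: the malware that actually struck is the one thing that must not return to production.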

[00:12:23] Stephen Manley: Right. That's exactly right. And again, it really comes down to, anytime you do something like this, especially in the security space, right, you can always do activity. And I think it's the difference between, you know, looking busy and being productive. So you can do a lot of stuff that looks busy, but what you really want to focus on, 'cause there's so much to cover in security, is what is productive.

It is guaranteed productive that before you restore data and put it back into production, you need to look for the malware signatures of what struck you, because that's just an obvious right thing to do. Some of the other steps, it may help you look busy, but you gotta really ask yourself, what's the upside I'm getting out of it?

Is this delivering the value that, that, that justifies the effort I'm putting in?

[00:13:16] W. Curtis Preston: So you're suggesting that scan before is security theater?

[00:13:21] Stephen Manley: Uh, it's, it's a little bit of take your shoes off, uh, you know, and,

[00:13:25] W. Curtis Preston: exactly where I went.

[00:13:29] Stephen Manley: like, is this really solving anything? Well, no. It makes it look like we're doing something and we're doing it every day, and we get a little report that says, guess what? I still didn't find anything. All right. We must be. Well, you know, the absence of finding, uh, malware is not the same as not having malware.

[00:13:48] W. Curtis Preston: That's a good point. All right, well, um, I think we've, uh, talked about this about as much as we could talk about it, so, uh, thanks for, uh, helping me answer that question.

[00:13:59] Stephen Manley: Oh, my pleasure. And, and, and again, for, for everybody out there, you get limited time, limited energy. Put it where it counts.

[00:14:07] W. Curtis Preston: Absolutely, and thanks for listening today. And remember here at Druva, there's no hardware required for.