Dive Deeper Into Your Data with Full Text Search Indexing

February 09, 2016 Katherine Cho

As your business grows, so does the amount of data you collect and manage. So, how can you gain a deeper understanding of this wealth of information and use it to your organization’s advantage? Dave Packer from Druva’s product team offers a secure and efficient solution through our Full Text Search Indexing capabilities, a fundamental feature built into our elastic cloud architecture.

Watch the video below to learn more:

How would Full Text Search Indexing capabilities be beneficial to your business? Share your experiences and/or thoughts with us in the comments below.

Video Transcript

Hi, I’m Dave Packer with Druva’s product team, and I’m here today to talk about our Full Text Search Indexing capabilities, which are a fundamental part of our elastic cloud architecture. Full Text Search Indexing provides a way for organizations to get a deeper understanding of their data, for both analytics and governance.

Most organizations have collected a large volume of information, and being able to gain insight into that data and understand it better is a fundamental need for businesses, rather than just letting that data sit by the side. Why not get better leverage out of it and understand what’s inside that data and how you might be able to use it for other purposes, especially when we start talking about compliance requirements, legal reasons, or just overall data analytics and finding information?

One of the big challenges with indexing data has been that most companies do this in an on-premise environment. We’re trying to predict how much compute or how much drive space is required to actually build out the index when we’re looking across hundreds of repositories and sources of information. It can be very hard for organizations to predict that and scale out their own infrastructure accordingly to address the challenge at hand.

As a result, organizations scale back the data set, or they use the indexer in a way that winds up taking weeks or months to index all the data. The way that we’ve addressed this is unique: we provide a very elastic search technology, but one that’s also secured in a way that enables businesses to protect their data to the full requirements of their own internal security policy.

One way that we make this effective today is by separating the metadata layer from the actual data itself. We split those two pieces out, and the metadata layer of our system keeps track of what the files are, where they’re located, and all the other information about a file. Then we store the data separately. That creates a much more efficient way to manage data, and it also creates what we call a time index file system, which allows us to store and retrieve data in real time.
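The split between a metadata layer and a data layer can be illustrated with a minimal sketch. This is not Druva's implementation; the class names `ContentStore` and `MetadataStore`, the hash-keyed blob layout, and the integer timestamps are all assumptions chosen for illustration. The metadata side records which version of a path existed at which time, and the data side only holds bytes.

```python
import hashlib
from bisect import bisect_right
from collections import defaultdict


class ContentStore:
    """Hypothetical data layer: holds file bytes only, keyed by content hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class MetadataStore:
    """Hypothetical metadata layer: per path, a time-ordered list of versions."""

    def __init__(self):
        self._versions = defaultdict(list)  # path -> [(timestamp, blob_key), ...]

    def record(self, path, timestamp, blob_key):
        # Assumes timestamps arrive in increasing order for each path.
        self._versions[path].append((timestamp, blob_key))

    def lookup(self, path, at_time):
        """Return the blob key of the newest version at or before at_time."""
        versions = self._versions.get(path, [])
        timestamps = [t for t, _ in versions]
        i = bisect_right(timestamps, at_time)
        return versions[i - 1][1] if i else None


content, meta = ContentStore(), MetadataStore()
meta.record("/docs/plan.txt", 100, content.put(b"draft"))
meta.record("/docs/plan.txt", 200, content.put(b"final"))

# Ask the metadata layer what the file looked like at time 150,
# then fetch the bytes from the data layer.
restored = content.get(meta.lookup("/docs/plan.txt", 150))
print(restored)  # b'draft'
```

Because lookups go through the metadata first, questions such as "what did this file contain at a given time" never require scanning the stored bytes.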

And the advantage there is that if you’re doing Full Text Search Indexing, as we’re collecting the information throughout the organization, we can index it as we collect it. So files that change and new files that come in can be indexed at the ready or on demand. And that’s one of the benefits of utilizing the cloud: we can distribute compute and scale as necessary to take in those petabytes of data and index them, so that you can achieve the goal of understanding what data is stored within them.
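Indexing files as they are collected, rather than in a separate batch pass, can be sketched with a small in-memory inverted index. This is an illustrative sketch only; the class `IncrementalIndex`, its `upsert` method, and the simple lowercase tokenizer are assumptions, not a description of the product's indexer. The point is that a changed file updates only the terms that differ.

```python
import re
from collections import defaultdict


class IncrementalIndex:
    """Hypothetical inverted index that is updated one document at a time."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of documents containing it
        self._doc_terms = {}               # doc id -> terms currently indexed for it

    @staticmethod
    def _tokenize(text):
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def upsert(self, doc_id, text):
        """Index a new document, or re-index a changed one, touching only the diff."""
        new_terms = self._tokenize(text)
        old_terms = self._doc_terms.get(doc_id, set())
        for term in old_terms - new_terms:
            self._postings[term].discard(doc_id)
        for term in new_terms - old_terms:
            self._postings[term].add(doc_id)
        self._doc_terms[doc_id] = new_terms

    def search(self, query):
        """Return ids of documents that contain every term in the query."""
        terms = self._tokenize(query)
        if not terms:
            return set()
        return set.intersection(*(self._postings.get(t, set()) for t in terms))


index = IncrementalIndex()
index.upsert("a.txt", "Quarterly revenue report")
index.upsert("b.txt", "Revenue forecast")
before = index.search("revenue")

# The file changes; re-indexing it replaces its old terms.
index.upsert("a.txt", "Quarterly headcount report")
after = index.search("revenue")
print(sorted(before), sorted(after))  # ['a.txt', 'b.txt'] ['b.txt']
```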

Now when we look at it from the standpoint of the security model, what we’ve done is something fairly unique. One of the big questions we get when we engage with customers is, obviously, "That’s great that you do Full Text Search Indexing, but how do you secure the data?" We have a unique approach in the way that we manage encrypted data to begin with, which is through our two-factor encryption model, and I encourage you to listen to our other video that talks a bit about the way that we manage encryption. Through this model we’re protecting the client’s data so that only the client has access to that information. Druva can’t get to the key, and we can’t be compelled to produce your data. It’s stored in a way that’s highly secure. The benefit is that that same mechanism is what we use to connect to the virtual private cloud that contains the search indexer, and what it allows for then is secure communication between the storage area and the indexer.

And so you can’t actually access the indexer from outside the environment. There’s no direct access point; the storage system and the indexer work in tandem. It’s a secure link: the key has to come through the two-factor encryption model, has to be borne into the session, produced in the session, and then acknowledged by the indexer before the indexer will do anything or can access any of its data.

The advantage of that is that the index itself is stored and encrypted using that same key mechanism. So the benefit at the end of the day is that not only do we, as a company, have no access to the data that’s stored inside your index, but we also have no access to this virtual private cloud instance that is built and specifically segmented for each customer. That’s an important aspect of how we’ve secured the data in the cloud as well as the index. You know that that index is yours, and that it is protected in a way that makes it impenetrable by anybody on the outside.

Now another key element of what we do and something that comes up with our global customers is a question about regional requirements. So, you might be a company that has multiple geographies covering maybe the UK, Germany, you might have some site maybe in Singapore, or other places in the world. And each one of those regions has its own kind of data privacy requirements and some of those might place restrictions on what you can index, or what type of information you can look at.

So, what we’ve done is build out this indexing capability per region. The admin, depending on how they’re configured through our delegated admin capabilities, gets a segmented view of that data. An admin who has the ability to view everything gets a federated view across all those storage nodes as if they were one, rather than six or seven separate storage areas around the world.
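The per-region indexes with delegated, segmented views can be sketched as a search that fans out only to the regions an admin is allowed to see. This is a simplified illustration; `RegionalIndex`, `federated_search`, the region codes, and the substring matching are all hypothetical and stand in for whatever the real system does.

```python
from dataclasses import dataclass, field


@dataclass
class RegionalIndex:
    """Hypothetical per-region index; here just doc id -> text for that region."""
    region: str
    documents: dict = field(default_factory=dict)

    def search(self, term):
        term = term.lower()
        return [doc_id for doc_id, text in self.documents.items()
                if term in text.lower()]


def federated_search(indexes, allowed_regions, term):
    """Query every region the admin may see and merge the hits into one list."""
    hits = []
    for index in indexes:
        if index.region in allowed_regions:
            hits.extend((index.region, doc_id) for doc_id in index.search(term))
    return sorted(hits)


indexes = [
    RegionalIndex("uk", {"uk-1": "Supplier contract draft"}),
    RegionalIndex("de", {"de-1": "Employee contract", "de-2": "Travel policy"}),
    RegionalIndex("sg", {"sg-1": "Contract renewal notice"}),
]

# A delegated admin scoped to Germany sees only that segment.
german_view = federated_search(indexes, {"de"}, "contract")

# An admin allowed to view everything gets one merged result set.
global_view = federated_search(indexes, {"uk", "de", "sg"}, "contract")
print(german_view)
print(global_view)
```

The data stays in its regional index in both cases; only the set of regions consulted differs between the two admins.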

As a company, it’s advantageous because I can go in and find information quickly, locate it no matter where it is in the world, and figure out what it is, what I need to know more about, and how it pertains to any particular compliance or legal issue that I’m researching, so that I can figure out what the next steps are.

Now, the real advantage of this environment of course is the elasticity itself. As I mentioned earlier, one of the big challenges with traditional on-premise systems is that growing the index, which can at times be thirty to forty percent of the original data set, is a very complex proposition, especially if you’re dealing with petabytes of data. You might have three petabytes of data and a petabyte-and-a-half of index. Well, that’s not only a lot of compute, it’s a lot of storage space. And because of the way that indexes work, it’s also a lot of memory, and being able to manage an infrastructure to support that is very difficult.
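The sizing figures above can be worked through with a few lines of arithmetic. The function name `index_size_pb` is hypothetical, and the calculation simply takes the quoted ratios at face value; it is a back-of-the-envelope sketch, not a capacity planning tool.

```python
def index_size_pb(data_pb, ratio):
    """Estimated index size in petabytes for a given index-to-data ratio."""
    return data_pb * ratio


data_pb = 3.0

# Thirty to forty percent of a three-petabyte data set.
low = index_size_pb(data_pb, 0.30)
high = index_size_pb(data_pb, 0.40)

# Ratio implied by the example of a petabyte-and-a-half of index.
implied_ratio = 1.5 / data_pb

print(f"30-40% of {data_pb:.0f} PB: {low:.1f} to {high:.1f} PB of index")
print(f"1.5 PB of index on {data_pb:.0f} PB of data is a ratio of {implied_ratio:.0%}")
```

At thirty to forty percent, three petabytes of data implies roughly 0.9 to 1.2 petabytes of index; the petabyte-and-a-half example corresponds to a ratio of one half, so it sits somewhat above that range.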

With the cloud, we basically have taken that off your plate, so you don’t even have to think about it. You can set it. You can start indexing. You can start looking through your data, searching it, using compliance tools. For example, using our Proactive Compliance technology, you can identify data in an automated fashion and then be told there are files out there that contain PHI, PII, or PCI type data. So it’s very advantageous, and it provides a lot of flexibility to you.
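Automated identification of sensitive data can be sketched with pattern matching over file text. This is not how Proactive Compliance works internally, which the talk does not describe; the function `classify`, the two regular expressions, and the choice of a US Social Security number as the PII example are assumptions. Card numbers are confirmed with the Luhn checksum to cut down on false positives, and PHI detection is left out because it needs more context than a simple pattern.

```python
import re

# Hypothetical patterns: a US Social Security number and a 13-16 digit card number.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(number):
    """Luhn checksum over the digits in the string; separators are ignored."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text):
    """Return the set of sensitive-data labels found in the text."""
    labels = set()
    if SSN_RE.search(text):
        labels.add("PII")
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(text)):
        labels.add("PCI")
    return labels


files = {
    "hr.txt": "Employee SSN: 123-45-6789",
    "billing.txt": "Card on file: 4111 1111 1111 1111",
    "notes.txt": "Lunch is at noon.",
}
flagged = {name: classify(text) for name, text in files.items() if classify(text)}
print(flagged)
```

The card number used here is a widely published test number, not a real account.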

As an organization, what this does is it opens up the ability to index to the masses, right? So, to large volumes of data, the things that we want to know more about or understand in more depth. So, that’s a basic overview of our Full Text Search Indexing technology, how we secure it, and how it can scale to meet your needs. If you have any more questions, feel free to learn more by going to druva.com or reach out to us via e-mail. I’m more than happy to respond and give you any more information you might need to understand more deeply what we’re doing here, so that it might benefit your organization. Thank you very much for listening today.
