News/Trends, Innovation Series

Emerging trends in artificial intelligence and machine learning – Part 2

March 13, 2020 Preethi Srinivasan, Director of Innovation

In Part 1 of this blog series, I walked you through the emerging focus areas in AI that can be meaningfully leveraged across a broad range of industries. In Part 2, let’s dive deeper into the technological advancements in AI that are emerging as real-world applications. We’ll also cover some of the risks that arise with these advancements and the efforts needed to address them.

Generative adversarial networks (GANs)

Generative adversarial networks (GANs) are generative models that can create plausible art, text, images, audio, GIFs, music, and video. GANs achieve plausibility by training two models simultaneously: a generator and a discriminator. The generator tries to fool the discriminator by producing plausible output, while the discriminator tries to learn the difference between the generated data and the true data. You can learn more from Google’s self-study guide for machine learning practitioners on GANs. GANs are one technique for implementing generative AI.
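
The generator/discriminator loop described above can be sketched in a few lines. The following is a toy illustration only, not a production GAN: it assumes one-dimensional "real" data drawn from a Gaussian with mean 4.0, an affine generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), and the commonly used non-saturating generator objective. All names and constants are hypothetical.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=500, lr=0.01, batch=16, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]  # assumed "true" data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        gw = gc = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            gw += (1.0 - d) * x
            gc += 1.0 - d
        for x in fake:
            d = sigmoid(w * x + c)
            gw -= d * x
            gc -= d
        w += lr * gw / batch
        c += lr * gc / batch

        # Generator step: gradient ascent on log D(G(z)), i.e. try to fool D.
        ga = gb = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            ga += (1.0 - d) * w * z
            gb += (1.0 - d) * w
        a += lr * ga / batch
        b += lr * gb / batch
    return (a, b), (w, c)


generator, discriminator = train_toy_gan()
```

The two updates alternate inside one loop, which is what "simultaneously training" means in practice. A model this small may oscillate rather than converge, so treat it as a way to see the mechanics rather than as a recipe.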

Amazon Web Services (AWS) identifies generative AI as one of the biggest recent advancements in artificial intelligence technology because of its ability to create something new. Generative AI opens the door to an entire world of possibilities for human and computer creativity, with practical applications emerging across industries, from turning sketches into images for accelerated product development to improving computer-aided design of complex objects. Doubling down on AI democratization, AWS also introduced DeepComposer to help developers with almost no ML background get started with GANs.

Without responsible AI, technology as creative as GANs can be used destructively, for example to produce deepfakes that spread disinformation. Even worse, as this MIT Technology Review article points out, the biggest threat of deepfakes is not the deepfake itself but the point at which people stop believing that real things are real. Companies like Google and Facebook are releasing datasets to accelerate the development of deepfake detection tools. Expect more improvements in responsible AI that help detect and fight deepfakes.

Bidirectional Encoder Representations from Transformers (BERT)

Creating a general language representation model from millions or even billions of training samples is called natural language processing (NLP) pre-training. These generic models can then be fine-tuned to build more specific NLP capabilities such as language inference, translation, and natural language question answering. Google open sourced a technique called BERT (Bidirectional Encoder Representations from Transformers), which is revolutionizing pre-training so that data scientists can build state-of-the-art NLP systems on top of it. But what makes BERT revolutionary?

BERT is the first unsupervised, deeply bidirectional contextual representation. That means that, without explicit labeling, the model learns from the full context of a word, derived from the words to both its left and its right. This significantly improves context-based NLP tasks such as language inference, translation, question answering, and understanding search queries. BERT also enables pre-trained models for a variety of languages, extending the benefits of AI across many of them.
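
To make "without explicit labeling" concrete, here is a minimal sketch of how masked-word training examples can be built from plain text, so that the hidden words themselves act as the labels. It is a simplification under stated assumptions: the 15% masking rate follows the BERT paper, but BERT's additional token-replacement rules and its tokenizer are omitted, and the function names are hypothetical.

```python
import random

MASK = "[MASK]"


def make_masked_example(tokens, mask_prob=0.15, seed=0):
    """Hide random tokens; return the masked sequence and {position: original token}."""
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            targets[i] = tok  # the "label" comes from the text itself
    return masked, targets


def context_for(masked, position, window=2):
    """Left and right neighbors a bidirectional model can use to predict one position."""
    left = masked[max(0, position - window):position]
    right = masked[position + 1:position + 1 + window]
    return left, right


tokens = "the bank of the river".split()
masked = ["the", MASK, "of", "the", "river"]
left, right = context_for(masked, 1)
# left is ["the"], right is ["of", "the"]: "river" on the right side is the kind
# of cue that helps disambiguate "bank", which a left-to-right model never sees.
```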

Innovative techniques like BERT learn deeply from context. If bias is embedded in that context, pre-trained models will learn it too. Fine-grained performance monitoring and improvement (slice-based learning), fairness indicators and benchmarks, testing for bias, and diverse training datasets are some of the ways to fight bias.
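
As a small illustration of fine-grained performance monitoring, the sketch below computes accuracy separately for each slice of an evaluation set, so that a weak subgroup is not hidden behind a good overall number. The record layout (`group`, `label`, `prediction`) is a hypothetical example, not a standard format.

```python
from collections import defaultdict


def accuracy_by_slice(records, slice_key):
    """Return {slice value: accuracy} for records carrying 'label' and 'prediction'."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        s = r[slice_key]
        totals[s] += 1
        correct[s] += int(r["label"] == r["prediction"])
    return {s: correct[s] / totals[s] for s in totals}


# Hypothetical evaluation records.
records = [
    {"group": "a", "label": 1, "prediction": 1},
    {"group": "a", "label": 0, "prediction": 0},
    {"group": "b", "label": 1, "prediction": 0},
    {"group": "b", "label": 1, "prediction": 1},
]
per_slice = accuracy_by_slice(records, "group")
```

Here the overall accuracy is 75%, but slice "b" sits at 50%, which is the kind of gap that slice-level monitoring is meant to surface.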

Weak supervision

One of the big challenges in machine learning is the lack of the large labeled datasets required to train models. Weak supervision is a framework for generating datasets with a variety of labeling functions, such as heuristic rules, regular expressions, classifiers, crowdsourcing, and labeling by subject-matter experts (SMEs) or Mechanical Turk workers, amplifying weak signals to create a large labeled training dataset. Another approach is a specific type of transfer learning, in which pre-trained models built on a large dataset are fine-tuned for the task (see pre-training in the section above). Here, transfer learning becomes a type of weak supervision. Snorkel is a system created by Stanford researchers that uses heuristic labeling functions written by domain experts to create and manage datasets without manual labeling. Snorkel can also be used to slice data for targeted improvement of critical subsets, which can be leveraged to fight bias.
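
The labeling-function idea can be sketched without any library. The example below is a plain-Python illustration, not Snorkel's API: the three heuristics, the spam/ham task, and the majority-vote combiner are all assumptions made for the example. Snorkel itself combines votes with a learned label model rather than a plain majority vote.

```python
import re
from collections import Counter

ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text):
    # Heuristic rule: messages with links are suspicious.
    return SPAM if re.search(r"https?://", text) else ABSTAIN


def lf_money_words(text):
    # Regular-expression rule on typical spam vocabulary.
    return SPAM if re.search(r"\b(free|winner|prize)\b", text, re.I) else ABSTAIN


def lf_short_greeting(text):
    # Short messages that open with a greeting are probably legitimate.
    is_greeting = re.match(r"(hi|hello|thanks)\b", text, re.I)
    return HAM if is_greeting and len(text.split()) <= 4 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_money_words, lf_short_greeting]


def weak_label(text):
    """Combine noisy votes by majority; abstain when nothing fires or on a tie."""
    votes = [v for v in (lf(text) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


unlabeled = [
    "You are a WINNER, claim your free prize at http://example.com",
    "Hello there",
    "Meeting moved to 3pm",
]
weak_labels = [weak_label(t) for t in unlabeled]
```

Examples on which every function abstains stay unlabeled. In practice those are either dropped or covered by writing more labeling functions.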

The training data needs to represent real-world production data without compromising privacy and security. At Druva we generate synthetic data at scale that is highly influenced by production data and carries its “genes,” but not a single byte is taken as-is from production. We do this by identifying the data distribution patterns in production, including the edge cases, and modeling them during test data generation. You can learn more about Druva’s methods for synthesizing data in this talk at the O’Reilly Strata Data & AI Conference.

GANs can be used for data augmentation, generating artificial training data that increases the number of training samples and helps build better-performing ML models. Snuba, a nascent research project from Stanford, is similar to Snorkel but generates the heuristics automatically. The quest to gather, generate, synthesize, augment, transform, and re-use data to feed data-hungry ML models will continue to grow.

Deep reinforcement learning (DRL)

Reinforcement learning is a machine learning paradigm in which the model learns through rewards and punishments, with the goal of maximizing cumulative reward. It mimics human learning, which balances exploratory learning (experimenting with the unknown) against exploitative learning (leveraging current knowledge). Though the concept of reinforcement learning has existed for decades, its applications in NLP and computer vision are growing dramatically.
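
The balance between exploration and exploitation shows up even in the simplest reinforcement learning setting, the multi-armed bandit. The sketch below uses an epsilon-greedy rule, one common strategy among several; the reward distributions and parameter values are assumptions chosen for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Play a bandit with Gaussian rewards; return estimates, pull counts, total reward."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    values = [0.0] * n_arms  # running estimate of each arm's mean reward
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try an arm at random
        else:
            arm = max(range(n_arms), key=lambda i: values[i])  # exploit: best estimate so far
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
        total_reward += reward
    return values, counts, total_reward


values, counts, total_reward = epsilon_greedy_bandit([0.1, 0.5, 2.0])
```

With `epsilon=0.1`, roughly one action in ten is exploratory. Raising it gathers more information at the cost of reward earned in the short term.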

Deep learning is loosely modeled on how humans learn from raw sensory information such as vision and audio. Deep reinforcement learning (DRL) combines deep learning and reinforcement learning and can deliver human-level performance in tasks that require cognition and motor skills. DRL will continue to have an enormous impact by matching, and going beyond, human ability in areas such as robotics and fully autonomous vehicles.

AI/ML systems such as facial recognition software and autonomous vehicles are built on technologies such as deep learning, reinforcement learning, and neural networks that can be described as a ‘black box.’ It is hard to explain how these systems arrive at a specific decision, yet those decisions can be a matter of life or death, so transparency and explainability are essential for these systems to be reliable.

Explainable AI (XAI) is an emerging effort to build interpretable and explainable machine learning models. For instance, Google’s What-If Tool is a visual interface designed to probe models, understand which features dominate a model, and dive deep into individual data points to analyze them. Explainable AI could soon become a governance requirement.
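
One simple, model-agnostic way to ask which features dominate a model is permutation importance: shuffle one feature at a time and measure how much the error grows. The sketch below is a generic illustration of that technique, not what the What-If Tool does internally; the `model` function and the data are hypothetical stand-ins for a real black box.

```python
import random


def model(row):
    # Hypothetical black box: leans heavily on feature 0, lightly on 1, ignores 2.
    return 3.0 * row[0] + 0.5 * row[1]


def mse(predict, rows, targets):
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, seed=0):
    """Increase in mean squared error when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mse(predict, shuffled, targets) - baseline)
    return importances


data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
targets = [model(r) for r in rows]
importances = permutation_importance(model, rows, targets)
```

A feature the model ignores gets an importance of zero, because shuffling it leaves every prediction unchanged. A larger value means the model depends more on that feature.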

Conclusion

Generative adversarial networks can unleash creativity. BERT-based NLP systems can deeply understand linguistic meaning. Data-hungry AI/ML systems can be fed using weak supervision. Robots and autonomous vehicles can achieve human-level performance using deep reinforcement learning. These technical advancements in AI will help organizations build creative and cognitive machine learning systems that deeply understand context. The need for governance and regulation will continue to increase with these advancements. As a broad range of organizations innovate using AI/ML, the responsibility lies with each organization to build secure, responsible, explainable, ethical, and fair AI/ML systems.

If you missed Part 1 of this blog series, learn more about why building responsible AI/ML is a critical challenge that needs to be prioritized when developing AI/ML systems.
