In Part 1 of this blog series, I walked you through the emerging focus areas in AI that can be meaningfully leveraged across a broad range of industries. In Part 2, let’s dive deeper into the technological advancements in AI that are emerging as real-world applications. We’ll also cover some of the risks that arise with these advancements and the efforts needed to address them.
Generative adversarial networks (GANs)
Generative adversarial networks (GANs) are generative models that can create plausible art, text, images, audio, GIFs, music, videos, and more. GANs achieve plausibility by simultaneously training two models: a generator and a discriminator. The generator tries to fool the discriminator by producing plausible output, and the discriminator tries to learn the difference between the generated data and the true data. You can learn more from Google’s self-study guide for machine learning practitioners on GANs. GANs are one technique for implementing generative AI.
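To make the generator/discriminator interplay concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not how production GANs are built: the generator is a one-dimensional linear function G(z) = a*z + b, the discriminator is a logistic function D(x) = sigmoid(w*x + c), and the class name TinyGAN, the learning rate, and the target distribution (a Gaussian centered at 4) are all made up for illustration. The generator update uses the common "non-saturating" variant of the loss. Real GANs replace both functions with deep neural networks.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


class TinyGAN:
    """Toy 1-D GAN (illustrative only): G(z) = a*z + b, D(x) = sigmoid(w*x + c)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.a, self.b = 1.0, 0.0  # generator parameters
        self.w, self.c = 0.0, 0.0  # discriminator parameters

    def discriminate(self, x):
        # Estimated probability that x came from the real data
        return sigmoid(self.w * x + self.c)

    def train_step(self, real_batch, lr=0.05):
        n = len(real_batch)
        z = [self.rng.gauss(0.0, 1.0) for _ in range(n)]
        fake = [self.a * zi + self.b for zi in z]

        # Discriminator: gradient ascent on log D(real) + log(1 - D(fake))
        gw = gc = 0.0
        for x in real_batch:
            d = self.discriminate(x)
            gw += (1.0 - d) * x
            gc += 1.0 - d
        for x in fake:
            d = self.discriminate(x)
            gw -= d * x
            gc -= d
        self.w += lr * gw / n
        self.c += lr * gc / n

        # Generator: gradient ascent on log D(G(z)), i.e. try to fool D
        ga = gb = 0.0
        for zi in z:
            d = self.discriminate(self.a * zi + self.b)
            ga += (1.0 - d) * self.w * zi
            gb += (1.0 - d) * self.w
        self.a += lr * ga / n
        self.b += lr * gb / n


data_rng = random.Random(1)
gan = TinyGAN(seed=0)
for _ in range(500):
    real = [data_rng.gauss(4.0, 1.0) for _ in range(64)]
    gan.train_step(real)
# b typically moves from 0 toward the real mean (4.0); a toy like this may oscillate
print(gan.a, gan.b)
```

The two updates pull in opposite directions on the same quantity, D(G(z)), which is the adversarial part: the discriminator lowers it, the generator raises it.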
Amazon Web Services (AWS) identifies generative AI as one of the biggest recent advancements in artificial intelligence technology because of its ability to create something new. Generative AI opens the door to an entire world of possibilities for human and computer creativity, with practical applications emerging across industries, from turning sketches into images for accelerated product development to improving computer-aided design of complex objects. Doubling down on AI democratization, AWS also introduced DeepComposer to help developers with almost no ML background get started with GANs.
Without responsible AI, technology as creative as GANs can be used destructively, for example to create deepfakes that spread disinformation. Even worse, as this MIT Technology Review article points out, the biggest threat of deepfakes is not the deepfake itself, but the point at which people stop believing that real things are real. Companies like Google and Facebook are releasing datasets to accelerate the development of deepfake detection tools. Expect to see more improvements in responsible AI that help detect and fight deepfakes.
Bidirectional Encoder Representations from Transformers (BERT)
Creating a general language representation model using millions or even billions of training data samples is called Natural Language Processing (NLP) pre-training. These generic models can be fine-tuned to build more specific NLP capabilities such as language inference, translation, and natural language question answering. Google open-sourced a technique called BERT (Bidirectional Encoder Representations from Transformers), which is revolutionizing pre-training so that data scientists can build state-of-the-art NLP systems by piggybacking on BERT. But what makes BERT revolutionary?
BERT is the first unsupervised, deeply bidirectional contextual representation. That means the model learns, without explicit labels, from the full context of a word: the words to both its left and its right. This can significantly improve context-based NLP tasks such as language inference, translation, question answering, and understanding search queries. BERT also enables pre-trained models for a variety of languages, bringing the benefits of AI to many more languages.
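A small sketch can show what "context from the left and the right" means in practice. BERT is pre-trained by hiding a word and asking the model to predict it from everything else in the sentence, whereas a left-to-right model only sees the words that come before. The function names below are hypothetical and the code does no learning; it only contrasts the input each kind of model would see.

```python
MASK = "[MASK]"


def left_to_right_view(tokens, i):
    """What a unidirectional model sees when predicting tokens[i]."""
    return tokens[:i]


def bidirectional_view(tokens, i):
    """What a masked, bidirectional model sees when predicting tokens[i]."""
    return tokens[:i] + [MASK] + tokens[i + 1:]


tokens = "I accessed the bank account".split()
i = tokens.index("bank")

print(left_to_right_view(tokens, i))  # ['I', 'accessed', 'the']
print(bidirectional_view(tokens, i))  # ['I', 'accessed', 'the', '[MASK]', 'account']
```

Only the second view includes "account", the word that tells a reader this "bank" is a financial institution rather than a riverbank.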
Innovative techniques like BERT learn deeply from the context. If bias is embedded in the context, pre-trained models will learn that too. Fine-grained performance monitoring and improvement (slice-based learning), fairness indicators and benchmarks, testing for bias, and diverse training datasets are some of the ways to fight bias.
Weak supervision
One of the big challenges in machine learning is the lack of the large sets of labeled data required for training models. Weak supervision is a framework for generating datasets by using various labeling sources, such as heuristic rules, regular expressions, classifiers, crowdsourcing, and labeling by SMEs or Mechanical Turk workers, and amplifying these weak signals to create a large labeled training dataset. Another approach is a specific type of transfer learning, where pre-trained models built using a large dataset are fine-tuned for the task (see the discussion of pre-training in the BERT section above). Here, transfer learning becomes a type of weak supervision. Snorkel is a system created by Stanford researchers that uses heuristic labeling functions written by domain experts to create and manage training datasets without manual labeling. Snorkel can also be used to slice data for targeted improvement of critical subsets, which can be leveraged for fighting bias.
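The idea of labeling functions can be sketched in a few lines of plain Python. This is not Snorkel's API: the spam/ham task, the three heuristics, and the function names are all hypothetical, and the votes are combined with a simple majority vote, which is a deliberate simplification (Snorkel instead learns how much to trust each labeling function).

```python
import re
from collections import Counter

ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text):
    # Heuristic: messages with URLs are often spam
    return SPAM if re.search(r"https?://", text) else ABSTAIN


def lf_money_words(text):
    # Heuristic: promotional keywords suggest spam
    return SPAM if re.search(r"\b(free|winner|prize)\b", text, re.I) else ABSTAIN


def lf_short_reply(text):
    # Heuristic: very short messages are usually ordinary replies
    return HAM if len(text.split()) <= 4 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_money_words, lf_short_reply]


def weak_label(text):
    """Combine noisy votes by majority; abstain on ties or when nothing fires."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    top = Counter(votes).most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return ABSTAIN
    return top[0][0]


print(weak_label("You are a winner! Claim at http://example.com"))  # 1 (SPAM)
print(weak_label("ok thanks"))                                      # 0 (HAM)
print(weak_label("free prize"))                                     # -1 (tie, so ABSTAIN)
```

Each heuristic is wrong some of the time and silent most of the time. The value comes from running many of them over a large unlabeled corpus and keeping the examples where they agree.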
The training data needs to represent real-world production data without compromising privacy and security. At Druva, we generate synthetic data at scale that is highly influenced by production data and has the “genes” of production data, but not a single byte is taken as-is from production. This is done by identifying data distribution patterns, including the edge cases, and then generating test data that models the production data. You can learn more about Druva’s methods for synthesizing data in this talk at the O’Reilly Strata Data & AI Conference.
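As a rough illustration of the general idea (learn a distribution, then sample from it rather than copying records), consider the sketch below. It is not Druva's method, which this post does not describe in detail. The histogram approach, the function names, and the sample values are all assumptions chosen to keep the example short.

```python
import random
from collections import Counter


def fit_histogram(values, bucket_width):
    """Count how many observed values fall into each fixed-width bucket."""
    return Counter(int(v // bucket_width) for v in values)


def sample_synthetic(histogram, bucket_width, n, seed=0):
    """Draw n new values that follow the bucket frequencies of the observed data."""
    rng = random.Random(seed)
    buckets = list(histogram.keys())
    weights = [histogram[b] for b in buckets]
    chosen = rng.choices(buckets, weights=weights, k=n)
    # Pick a random position inside each chosen bucket instead of reusing a value
    return [(b + rng.random()) * bucket_width for b in chosen]


# Hypothetical observations (say, file sizes in MB) with one edge case at 120
observed = [1.0, 2.0, 2.5, 3.0, 120.0]
histogram = fit_histogram(observed, bucket_width=10)
synthetic = sample_synthetic(histogram, bucket_width=10, n=1000)
print(len(synthetic), min(synthetic), max(synthetic))
```

Because the edge-case bucket keeps its share of the weight, rare values still appear in the synthetic set, while the generator only ever sees bucket counts, not the original records.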
GANs can be used for data augmentation to generate artificial training data, thus increasing the number of training samples available to build better-performing ML models. Snuba, a nascent research project from Stanford, is similar to Snorkel but generates the heuristics automatically. The quest to gather, generate, synthesize, augment, transform, and re-use data to feed data-hungry ML models will continue to grow.
Deep reinforcement learning (DRL)
Reinforcement learning is a machine learning paradigm in which the model learns through rewards and punishments, with the goal of maximizing cumulative reward. Reinforcement learning mimics human learning, which is a balance between exploratory learning (experimenting with the unknown) and exploitative learning (leveraging current knowledge). Though the concept of reinforcement learning has existed for decades, its applications are growing dramatically in NLP and computer vision.
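The balance between exploration and exploitation can be seen in one of the simplest reinforcement learning setups, a multi-armed bandit with an epsilon-greedy policy. The sketch below is illustrative only: the reward probabilities, the epsilon value, and the function name are assumptions, and there are no states or deep networks involved.

```python
import random


def run_bandit(reward_probs, steps=10000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: explore a random arm with probability epsilon,
    otherwise exploit the arm with the best estimated reward so far."""
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms    # how often each arm was pulled
    values = [0.0] * n_arms  # running average reward per arm
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1 if rng.random() < reward_probs[arm] else 0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
        total_reward += reward
    return counts, values, total_reward


counts, values, total_reward = run_bandit([0.1, 0.5, 0.9])
print(counts)        # most pulls should go to the last (best) arm
print(total_reward)
```

With epsilon set to 0 the agent can lock onto the first arm that ever pays out. With epsilon set to 1 it never uses what it has learned. The small exploration rate is what lets it find the best arm and then mostly stay there.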
Deep learning is loosely modeled on how humans learn from raw sensory information such as vision and audio. Deep reinforcement learning (DRL) combines deep learning and reinforcement learning, and can reach human-level performance on tasks that require cognition and motor skills. DRL will continue to have enormous impact by matching, and going beyond, human ability in areas such as robotics and fully autonomous vehicles.
AI/ML systems such as facial recognition software and autonomous vehicles are based on technologies such as deep learning, reinforcement learning, and neural networks that can be described as a ‘black box.’ It is hard to explain how these systems arrive at a specific decision. Because these decisions can be a matter of life or death, transparency and explainability are essential for these AI/ML systems to be reliable.
Explainable AI (XAI) is an emerging effort to build interpretable and explainable machine learning models. For instance, Google’s What-If Tool is a visual interface designed to probe models, understand which features dominate the model, dive deep into data points, and then analyze those data points. Explainable AI could soon become a governance requirement.
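One simple way to ask "which features dominate the model" is permutation importance: shuffle one feature at a time and measure how much the model's error grows. The sketch below is not the What-If Tool's interface. The model function is a hypothetical stand-in for a black box, and the data is randomly generated for illustration.

```python
import random


def model(row):
    # Hypothetical "black box": leans heavily on feature 0, barely on feature 1
    return 3.0 * row[0] + 0.1 * row[1]


def mean_squared_error(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, seed=0):
    """Error increase when each feature column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = mean_squared_error(rows, targets)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild the rows with only column j scrambled
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mean_squared_error(shuffled, targets) - baseline)
    return importances


data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
targets = [model(r) for r in rows]
importances = permutation_importance(rows, targets)
print(importances)  # the first value should be far larger than the second
```

The technique treats the model as opaque: it needs only predictions, not access to internals, which is why it applies to the kinds of black-box systems described above.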
Generative adversarial networks can unleash creativity. BERT-based NLP systems can deeply understand linguistic meaning. Data-hungry AI/ML systems can be fed using weak supervision. Robots and autonomous vehicles can achieve human-level performance in tasks using deep reinforcement learning. These technical advancements in AI will help organizations build creative and cognitive machine learning systems that can deeply understand context. The need for governance and regulation will continue to increase with these advancements. As a broad range of organizations innovate using AI/ML, the responsibility lies with each organization to build secure, responsible, explainable, ethical, and fair AI/ML systems.
If you missed Part 1 of this blog series, learn more about why building responsible AI/ML is a critical challenge that needs to be prioritized when developing AI/ML systems.