Other briefs in this series:
- Key Concepts in AI Safety: An Overview
- Key Concepts in AI Safety: Robustness and Adversarial Examples
- Key Concepts in AI Safety: Interpretability in Machine Learning
- Key Concepts in AI Safety: Specification in Machine Learning
Introduction
The last decade of progress in machine learning research has given rise to systems that are surprisingly capable but also notoriously unreliable. The chatbot ChatGPT, developed by OpenAI, provides a good illustration of this tension. Users interacting with the system after its release in November 2022 quickly found that while it could adeptly find bugs in programming code and author Seinfeld scenes, it could also be confounded by simple tasks. For example, one dialogue showed the bot claiming that the fastest marine mammal was the peregrine falcon, then changing its mind to the sailfish, then back to the falcon, despite the obvious fact that neither of these choices is a mammal. This kind of uneven performance is characteristic of deep learning systems (the type of AI system that has seen the most progress in recent years) and presents a significant challenge to their deployment in real-world contexts.
An intuitive way to handle this problem is to build machine learning systems that “know what they don’t know”, that is, systems that can recognize and account for situations where they are more likely to make mistakes. For instance, a chatbot could display a confidence score next to its answers, or an autonomous vehicle could sound an alarm when it finds itself in a scenario it cannot handle. That way, the system could be useful in situations where it performs well, and harmless in situations where it does not. This could be especially valuable for AI systems that are used in a wide range of settings, such as large language models (the technology that powers chatbots like ChatGPT), since these systems are very likely to encounter scenarios that diverge from the conditions they were trained and tested under.
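To make this idea concrete, the sketch below shows one simple pattern for a system that “knows what it doesn’t know”: it attaches a confidence score to its best answer and abstains when that score falls below a threshold. The toy classifier, its hard-coded scores, and the 0.70 cutoff are illustrative assumptions, not a description of how ChatGPT or any deployed system works.

```python
import math

# Illustrative sketch only: a toy "classifier" with hard-coded scores standing in
# for a trained model. The labels, scores, and threshold are assumptions made
# for illustration, not the behavior of any real system.
def toy_classifier(question: str) -> dict:
    # Hypothetical raw scores (logits) for three candidate answers,
    # returned regardless of the question asked.
    logits = {"sailfish": 2.0, "peregrine falcon": 1.5, "orca": 1.8}
    total = sum(math.exp(v) for v in logits.values())
    return {label: math.exp(v) / total for label, v in logits.items()}

CONFIDENCE_THRESHOLD = 0.70  # assumed cutoff; choosing it well is itself hard

def answer_with_confidence(question: str) -> str:
    scores = toy_classifier(question)
    best_label, best_score = max(scores.items(), key=lambda kv: kv[1])
    if best_score >= CONFIDENCE_THRESHOLD:
        # Confident enough: answer and display the confidence score.
        return f"{best_label} (confidence: {best_score:.0%})"
    # Not confident enough: abstain instead of guessing.
    return f"I'm not sure (top guess '{best_label}' at only {best_score:.0%})"

print(answer_with_confidence("What is the fastest marine mammal?"))
```

The thresholding logic here is trivial; the hard part, as discussed below, is producing confidence scores that genuinely reflect how often the system is right.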
Unfortunately, designing machine learning systems that can recognize their limits is more challenging than it may appear at first glance. In fact, enabling machine learning systems to “know what they don’t know”, known in technical circles as “uncertainty quantification”, is an open and widely studied research problem within machine learning. This brief gives an introduction to how uncertainty quantification works, why it is difficult, and what the prospects are for the future.