
Why Improving AI Reliability Metrics May Not Lead to Reliability

Romeo Valentin

Helen Toner

August 8, 2023

How can we measure the reliability of machine learning systems? And do these measures really help us predict real-world performance? A recent study by the Stanford Intelligent Systems Laboratory, supported by CSET funding, provides new evidence that models may perform well on certain reliability metrics while still being unreliable in other ways. This blog post summarizes the study’s results, which suggest that policymakers and regulators should not think of “reliability” or “robustness” as a single, easy-to-measure property of an AI system. Instead, AI reliability requirements will need to consider which facets of reliability matter most for any given use case, and how those facets can be evaluated.
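To illustrate the kind of gap the study points to, here is a minimal, hypothetical sketch (not drawn from the study itself) using synthetic predictions in Python: two evaluation settings produce the same accuracy, yet the model’s confidence estimates are far less trustworthy under distribution shift, as measured by a simple expected calibration error. All numbers and names below are illustrative assumptions, not results from the paper.

```python
# Hypothetical illustration only: a model can score identically on one
# reliability metric (accuracy) while degrading badly on another facet
# (calibration), here measured with a simple expected calibration error (ECE).
import numpy as np

rng = np.random.default_rng(0)

def expected_calibration_error(confidence, correct, n_bins=10):
    """Bin-size-weighted average of |empirical accuracy - mean confidence| per bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            ece += in_bin.mean() * abs(correct[in_bin].mean() - confidence[in_bin].mean())
    return ece

n = 100_000

# In-distribution: the model's confidences roughly match how often it is right.
conf_id = rng.uniform(0.7, 1.0, n)
correct_id = rng.random(n) < conf_id

# Under (simulated) distribution shift: accuracy is unchanged,
# but the model becomes systematically overconfident.
conf_shift = rng.uniform(0.95, 1.0, n)
correct_shift = rng.random(n) < 0.85

for name, conf, correct in [("in-distribution", conf_id, correct_id),
                            ("shifted", conf_shift, correct_shift)]:
    print(f"{name:>15}: accuracy = {correct.mean():.2f}, "
          f"ECE = {expected_calibration_error(conf, correct):.2f}")
```

A reliability requirement keyed only to the first metric would report no change at all, while a user relying on the model’s confidence scores would see a system that has become much less trustworthy.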

Read arXiv Preprint

Related Content

This paper is the second installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. The first paper in the series, “Key Concepts in AI Safety: An Overview,” described three categories of AI safety issues: problems of robustness, assurance, and specification. This paper introduces adversarial examples, a major challenge to robustness in modern machine learning systems.

When the technology and policy communities use terms associated with trustworthy AI, could they be talking past one another? This paper examines the use of trustworthy AI keywords and the potential for an “Inigo Montoya problem” in trustworthy AI, inspired by "The Princess Bride" movie quote: “You keep using that word. I do not think it means what you think it means.”


Hacking AI

December 2020

Machine learning systems’ vulnerabilities are pervasive, and hackers and adversaries can easily exploit them; managing the resulting risks is too large a task for the technology community to handle alone. In this primer, Andrew Lohn writes that policymakers must understand the threats well enough to assess the dangers that the United States, its military and intelligence services, and its civilians face when they use machine learning.

This paper is the first installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. In it, the authors introduce three categories of AI safety issues: problems of robustness, assurance, and specification. Other papers in the series elaborate on these and other key concepts.