Key Concepts in AI Safety: An Overview

Tim G. J. Rudner

Helen Toner

March 2021

This paper is the first installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. In it, the authors introduce three categories of AI safety issues: problems of robustness, assurance, and specification. Other papers in this series elaborate on these and further key concepts.
