CSET’s Must Read Research: A Primer

Tessa Baker
| December 18, 2023

This guide provides a rundown of CSET’s research since 2019 for first-time visitors and long-term fans alike. Quickly get up to speed on our “must-read” research and learn how we organize our work.

This blog post by CSET’s Executive Director Dewey Murdick explores two metaphorical lenses for governing the frontier of AI. The "Space Exploration Approach" likens AI models to spacecraft venturing into unexplored territories, requiring detailed planning and regular updates. The "Snake-Filled Garden Approach" views AI as a garden with both harmless and dangerous 'snakes,' necessitating rigorous testing and risk assessment. In the post, Dewey examines these metaphors and how they can inform an AI governance strategy that balances innovation with safety, all while emphasizing the importance of ongoing learning and adaptability.

Skating to Where the Puck Is Going

Helen Toner Jessica Ji John Bansemer Lucy Lim
| October 2023

AI capabilities are evolving quickly and pose novel—and likely significant—risks. In these rapidly changing conditions, how can policymakers effectively anticipate and manage risks from the most advanced and capable AI systems at the frontier of the field? This Roundtable Report summarizes some of the key themes and conclusions of a July 2023 workshop on this topic jointly hosted by CSET and Google DeepMind.

Why Improving AI Reliability Metrics May Not Lead to Reliability

Romeo Valentin Helen Toner
| August 8, 2023

How can we measure the reliability of machine learning systems? And do these measures really help us predict real-world performance? A recent study by the Stanford Intelligent Systems Laboratory, supported by CSET funding, provides new evidence that models may perform well on certain reliability metrics while still being unreliable in other ways. This blog post summarizes the study’s results, which suggest that policymakers and regulators should not think of “reliability” or “robustness” as a single, easy-to-measure property of an AI system. Instead, AI reliability requirements will need to consider which facets of reliability matter most for any given use case, and how those facets can be evaluated.

CSET's Catherine Aiken testified before the National Artificial Intelligence Advisory Committee on measuring progress in U.S. AI research and development.