The advent of more powerful AI systems such as large language models (LLMs) with more general-purpose capabilities has raised expectations that they will have significant societal impacts and create new governance challenges for policymakers. The rapid pace of development adds to the difficulty of managing these challenges. Policymakers will have to grapple with a new generation of AI-related risks, including the potential for AI to be used for malicious purposes, to disrupt or disable critical infrastructure, and to create new and unforeseen threats associated with the emergent capabilities of advanced AI.
In July 2023, the Center for Security and Emerging Technology (CSET) at Georgetown University and Google DeepMind hosted a virtual roundtable, as part of a series of roundtables organized by DeepMind to gather different views and perspectives on AI developments. This roundtable sought to assess the current trajectory of AI development and discuss measures that industry and governments should consider to guide these technologies in a positive and beneficial direction. This Roundtable Report summarizes some of the key themes and conclusions of the roundtable discussion and aims to help policymakers “skate to where the puck is going to be,” in the words of ice hockey great Wayne Gretzky. These themes do not necessarily reflect the organizational views of either Google DeepMind or CSET.
The rise of LLMs has demonstrated that AI is becoming more general-purpose. Current systems are already capable of performing a wide range of distinct tasks, including translating text, writing and editing prose, solving math problems, writing software, and much more. However, there was broad consensus among roundtable participants that these systems are only one iteration of what are likely to be even more capable systems within the next few years. AI developers are actively working to make these systems more powerful, which in turn heightens safety and security concerns. Five ways in which existing AI systems are currently being augmented are multimodality, tool use, deeper reasoning and planning, larger and more capable memory, and increased interaction between AI systems.
In anticipation of new types of risks that current and upcoming models may pose, AI companies have begun undertaking “model evaluations” of their most advanced general-purpose AI models. Such evaluations attempt to identify dangerous capabilities, including:

- autonomous replication (a model’s ability to acquire resources, create copies of itself, and adapt to novel challenges);
- dangerous knowledge about sensitive subjects such as chemical, biological, radiological, or nuclear weapon production;
- the capacity to carry out offensive cyber operations;
- the ability to manipulate, persuade, or deceive human observers;
- advanced cognitive capabilities such as long-term planning and error correction; and
- understanding of their own development, testing, and deployment (sometimes called situational awareness).

These capability evaluations are a productive first step, but they should not be seen as a comprehensive approach to managing risk. At best, they can provide an early indication of some potential risks, which could trigger mandatory reporting requirements and additional safety measures.
The risks of these new systems are evident, but responsibility for managing that risk is less so. Several participants noted that the status quo may place too much of the responsibility for managing risk on industry. Instead, policymakers should work toward achieving a balance between government and industry. Under this approach, governments must have sufficient technical expertise to support independent auditing functions, including the ability to describe and execute evaluations. At the same time, governments should encourage and incentivize industry to advance the science of evaluations and report their results. Third-party testing and auditing organizations can provide important additional capacity, but they do not negate the need for governments to increase their own oversight capabilities.
To manage these risks, potential policy levers can be grouped into three categories:
- creating visibility and understanding,
- defining best practices, and
- incentivizing and enforcing certain behaviors.
Governments should encourage—or perhaps require—private-sector actors to test and evaluate their systems and to report dangerous capabilities and real-world threat intelligence to oversight bodies in order to increase regulators’ visibility into the state of play. Policy interventions could also help standardize the testing and evaluation process within the AI pipeline. Options to incentivize or enforce these behaviors could include leveraging government procurement requirements, establishing industry certifications for frontier AI systems, and incentivizing increased transparency of discovered vulnerabilities and reporting of evaluation results by reducing liability for companies that disclose responsibly.