The advent of more powerful AI systems such as large language models (LLMs) with more general-purpose capabilities has raised expectations that they will have significant societal impacts and create new governance challenges for policymakers. The rapid pace of development adds to the difficulty of managing these challenges. Policymakers will have to grapple with a new generation of AI-related risks, including the potential for AI to be used for malicious purposes, to disrupt or disable critical infrastructure, and to create new and unforeseen threats associated with the emergent capabilities of advanced AI.
In July 2023, the Center for Security and Emerging Technology (CSET) at Georgetown University and Google DeepMind hosted a virtual roundtable, as part of a series of roundtables organized by DeepMind to gather different views and perspectives on AI developments. This roundtable sought to assess the current trajectory of AI development and discuss measures that industry and governments should consider to guide these technologies in a positive and beneficial direction. This Roundtable Report summarizes some of the key themes and conclusions of the roundtable discussion and aims to help policymakers “skate to where the puck is going to be,” in the words of ice hockey great Wayne Gretzky. These themes do not necessarily reflect the organizational views of either Google DeepMind or CSET.
The rise of LLMs has demonstrated that AI is becoming more general-purpose. Current systems are already capable of performing a wide range of distinct tasks, including translating text, writing and editing prose, solving math problems, writing software, and much more. However, there was broad consensus among roundtable participants that these systems are only one iteration of what are likely to be even more capable systems within the next few years. AI developers are actively working to make these systems more powerful, which in turn heightens safety and security concerns. Five ways in which existing AI systems are currently being augmented are multimodality, tool use, deeper reasoning and planning, larger and more capable memory, and increased interaction between AI systems.
In anticipation of new types of risks that current and upcoming models may pose, AI companies have begun undertaking “model evaluations” of their most advanced general-purpose AI models. Such evaluations attempt to identify dangerous capabilities, including:

- autonomous replication (a model’s ability to acquire resources, create copies of itself, and adapt to novel challenges);
- dangerous knowledge about sensitive subjects such as chemical, biological, radiological, or nuclear weapon production;
- the capacity to carry out offensive cyber operations;
- the ability to manipulate, persuade, or deceive human observers;
- advanced cognitive capabilities such as long-term planning and error correction; and
- understanding of their own development, testing, and deployment (sometimes called situational awareness).

These capability evaluations are a productive first step, but they should not be seen as a comprehensive approach to managing risk. At best, they can provide an early indication of some potential risks, which could trigger mandatory reporting requirements and additional safety measures.
The risks of these new systems are evident, but responsibility for managing that risk is less so. Several participants noted that the status quo may place too much of the responsibility for managing risk on industry. Instead, policymakers should work toward achieving a balance between government and industry. Under this approach, governments must have sufficient technical expertise to support independent auditing functions, including the ability to describe and execute evaluations. At the same time, governments should encourage and incentivize industry to advance the science of evaluations and report their results. Third-party testing and auditing organizations can provide important additional capacity, but they do not negate the need for governments to increase their own oversight capabilities.
To manage these risks, potential policy levers can be grouped into three categories:
- creating visibility and understanding,
- defining best practices, and
- incentivizing and enforcing certain behaviors.
Governments should encourage—or perhaps require—private-sector actors to test and evaluate their systems and to report dangerous capabilities and real-world threat intelligence to oversight bodies in order to increase regulators’ visibility into the state of play. Policy interventions could also help standardize the testing and evaluation process within the AI pipeline. Options to incentivize or enforce these behaviors could include leveraging government procurement requirements, establishing industry certifications for frontier AI systems, and incentivizing increased transparency of discovered vulnerabilities and reporting of evaluation results by reducing liability for companies that disclose responsibly.