CSET

Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

Margarita Konaev

Andrew Lohn

August 1, 2023

Two CSET researchers are coauthors of a new multi-organization report on the safety of AI systems, led by OpenAI and the Berkeley Risk and Security Lab. The report, published on arXiv, identifies six confidence-building measures (CBMs) that AI labs could adopt to reduce hostility, prevent conflict escalation, and improve trust between parties with respect to foundation AI models.

Read Full Proceedings on arXiv

Related Content

Among great powers, AI has become a new focus of competition due to its potential to transform the character of conflict and disrupt the military balance. This policy brief considers alternative paths toward AI safety and security.

As modern machine learning systems become more widely used, the potential costs of malfunctions grow. This policy brief describes how trends we already see today—both in newly deployed artificial intelligence systems and in older technologies—show how damaging the AI accidents of the future could be. It describes a wide range of hypothetical but realistic scenarios to illustrate the risks of AI accidents and offers concrete policy suggestions to reduce these risks.

Analysis

AI Verification

February 2021

The rapid integration of artificial intelligence into military systems raises critical questions of ethics, design, and safety. While many states and organizations have called for some form of “AI arms control,” few have discussed the technical details of verifying countries’ compliance with such regulations. This brief offers a starting point, defining the goals of “AI verification” and proposing several mechanisms to support arms inspections and continuous verification.

Analysis

Agile Alliances

February 2020

The United States must collaborate with its allies and partners to shape the trajectory of artificial intelligence, promoting liberal democratic values and protecting against efforts to wield AI for authoritarian ends.