Making AI (more) Safe, Secure, and Transparent: Context and Research from CSET

Tessa Baker

July 21, 2023

On July 21, the White House announced voluntary commitments from seven AI firms to ensure safe, secure, and transparent AI. CSET’s research provides important context for this discussion.

Related Content

This paper is the first installment in a series on “AI safety,” an area of machine learning research that aims to identify causes of unintended behavior in machine learning systems and develop tools to ensure these systems work safely and reliably. In it, the authors introduce three categories of AI safety issues: problems of robustness, assurance, and specification. Other papers in this series elaborate on these and other key concepts.

Data Brief

Who Cares About Trust?

July 2023

Artificial intelligence-enabled systems are transforming society and driving an intense focus on what policy and technical communities can do to ensure that those systems are trustworthy and used responsibly. This analysis draws on prior work about the use of trustworthy AI terms to identify 18 clusters of research papers that contribute to the development of trustworthy AI. In identifying these clusters, the analysis also reveals that some concepts, like "explainability," are forming distinct research areas, whereas other concepts, like "reliability," appear to be accepted as metrics and broadly applied.

When the technology and policy communities use terms associated with trustworthy AI, could they be talking past one another? This paper examines the use of trustworthy AI keywords and the potential for an “Inigo Montoya problem” in trustworthy AI, inspired by "The Princess Bride" movie quote: “You keep using that word. I do not think it means what you think it means.”

As modern machine learning systems become more widely used, the potential costs of malfunctions grow. This policy brief describes how trends we already see today—both in newly deployed artificial intelligence systems and in older technologies—show how damaging the AI accidents of the future could be. It describes a wide range of hypothetical but realistic scenarios to illustrate the risks of AI accidents and offers concrete policy suggestions to reduce these risks.

Recent discussions of AI have focused on safety, reliability, and other risks. Lost in this debate is the real need to secure AI against malicious actors. This blog post applies lessons from traditional cybersecurity to emerging AI-model risks.

Analysis

One Size Does Not Fit All

February 2023

Artificial intelligence spans such a diverse range of systems that no single one-size-fits-all assessment approach can adequately cover it. AI systems vary widely in their functionality, capabilities, and outputs, and they are built using different tools, data modalities, and resources, which further complicates their assessment. A collection of approaches and processes is therefore needed to cover the full range of AI products, tools, services, and resources.

Analysis

Trusted Partners

February 2021

As the U.S. military integrates artificial intelligence into its systems and missions, there are outstanding questions about the role of trust in human-machine teams. This report examines the drivers and effects of such trust, assesses the risks from too much or too little trust in intelligent technologies, reviews efforts to build trustworthy AI systems, and offers future directions for research on trust relevant to the U.S. military.

With the rapid integration of AI into our daily lives, we must all learn when and whether to trust the technology, understand its capabilities and limitations, and adapt as these systems — and our functional relationships with them — evolve.