Cybersecurity of AI Systems

Red-teaming is a popular evaluation methodology for AI systems, but it remains severely lacking in theoretical grounding and technical best practices. This blog post introduces the concept of threat modeling for AI red-teaming and explores the ways that software tools can support or hinder red teams. To conduct effective evaluations, red-team designers should ensure that their tools fit both their threat model and their testers.

AI Control: How to Make Use of Misbehaving AI Agents

Kendrea Beers and Cody Rushing | October 1, 2025

As AI agents become more autonomous and capable, organizations need new approaches to deploy them safely at scale. This explainer introduces the rapidly growing field of AI control, which offers practical techniques for organizations to get useful outputs from AI agents even when those agents attempt to misbehave.

Harmonizing AI Guidance: Distilling Voluntary Standards and Best Practices into a Unified Framework

Kyle Crichton, Abhiram Reddy, Jessica Ji, Ali Crawford, Mia Hoffmann, Colin Shea-Blymyer, and John Bansemer | September 2025

Organizations looking to adopt artificial intelligence (AI) systems face the challenge of deciphering a myriad of voluntary standards and best practices, a task requiring time, resources, and expertise that many cannot afford. To address this problem, this report distills over 7,000 recommended practices from 52 reports into a single harmonized framework. Integrating new AI guidance with existing safety and security practices, this work provides a road map for organizations navigating the complex landscape of AI guidance.

CSET’s Jessica Ji shared her expert analysis in an interview published by Science News. The interview discusses the U.S. government’s new action plan to integrate artificial intelligence into federal operations and highlights the significant privacy, cybersecurity, and civil liberties risks of using AI tools on consolidated sensitive data, such as health, financial, and personal records.

Frontier AI capabilities show no sign of slowing down to let governance catch up, yet the national security challenges they pose must be addressed in the near term. This blog post outlines a governance approach that complements existing commitments by AI companies, arguing that the government should take targeted actions toward AI preparedness: sharing national security expertise, promoting transparency into frontier AI development, and facilitating the development of best practices.

Despite recent upheaval in the AI policy landscape, AI evaluations, including AI red-teaming, will remain fundamental to understanding and governing the use of AI systems and their impact on society. This blog post draws on a December 2024 CSET workshop on AI testing to outline the challenges of improving red-teaming and offer recommendations for addressing them.

How to Assess the Likelihood of Malicious Use of Advanced AI Systems

Josh A. Goldstein and Girish Sastry | March 2025

As new advanced AI systems roll out, there is widespread disagreement about malicious use risks. Are bad actors likely to misuse these tools for harm? This report presents a simple framework to guide the questions researchers ask—and the tools they use—to evaluate the likelihood of malicious use.

Cybersecurity Risks of AI-Generated Code

Jessica Ji, Jenny Jun, Maggie Wu, and Rebecca Gelles | November 2024

Artificial intelligence models have become increasingly adept at generating computer code. They are powerful and promising tools for software development across many industries, but they can also pose direct and indirect cybersecurity risks. This report identifies three broad categories of risk associated with AI code generation models and discusses their policy and cybersecurity implications.

Securing Critical Infrastructure in the Age of AI

Kyle Crichton, Jessica Ji, Kyle Miller, John Bansemer, Zachary Arnold, David Batz, Minwoo Choi, Marisa Decillis, Patricia Eke, Daniel M. Gerstein, Alex Leblang, Monty McGee, Greg Rattray, Luke Richards, and Alana Scott | October 2024

As critical infrastructure operators and providers seek to harness the benefits of new artificial intelligence capabilities, they must also manage associated risks from both AI-enabled cyber threats and potential vulnerabilities in deployed AI systems. In June 2024, CSET led a workshop to assess these issues. This report synthesizes our findings, drawing on lessons from cybersecurity and insights from critical infrastructure sectors to identify challenges and potential risk mitigations associated with AI adoption.

Revisiting AI Red-Teaming

Jessica Ji and Colin Shea-Blymyer | September 26, 2024

This year, CSET researchers returned to the DEF CON cybersecurity conference to explore how understandings of AI red-teaming practices have evolved among cybersecurity practitioners and AI experts. This blog post, a companion to "How I Won DEF CON’s Generative AI Red-Teaming Challenge", summarizes our takeaways and concludes with a list of outstanding research questions regarding AI red-teaming, some of which CSET hopes to address in future work.