Analysis

Through the Chat Window and Into the Real World: Preparing for AI Agents

Helen Toner, John Bansemer, Kyle Crichton, Matthew Burtell, Thomas Woodside, Anat Lior, Andrew Lohn, Ashwin Acharya, Beba Cibralic, Chris Painter, Cullen O’Keefe, Iason Gabriel, Kathleen Fisher, Ketan Ramakrishnan, Krystal Jackson, Noam Kolt, Rebecca Crootof, and Samrat Chatterjee

October 2024

Computer scientists have long sought to build systems that can actively and autonomously carry out complicated goals in the real world—commonly referred to as artificial intelligence "agents." Recently, significant progress in large language models has fueled new optimism about the prospect of building sophisticated AI agents. This CSET-led workshop report synthesizes findings from a May 2024 workshop on this topic, including what constitutes an AI agent, how the technology is improving, what risks agents exacerbate, and intervention points that could help.

Related Content

Researchers, companies, and policymakers have dedicated increasing attention to evaluating large language models (LLMs). This explainer covers why researchers are interested in evaluations, as well as some common evaluations and associated challenges. While evaluations can…

Large language models (LLMs), the technology that powers generative artificial intelligence (AI) products like ChatGPT or Google Gemini, are often thought of as chatbots that predict the next word. But that isn't the full story…

Analysis

Skating to Where the Puck Is Going

October 2023

AI capabilities are evolving quickly and pose novel, and likely significant, risks. In these rapidly changing conditions, how can policymakers effectively anticipate and manage risks from the most advanced and capable AI systems at the frontier of…

Analysis

Autonomous Cyber Defense

June 2023

The current AI-for-cybersecurity paradigm focuses on detection using automated tools, but it has largely neglected holistic autonomous cyber defense systems, ones that can act without human tasking. That is poised to change as tools…