In The News

AI Models Will Sabotage And Blackmail Humans To Survive In New Tests. Should We Be Worried?

HuffPost

June 5, 2025

CSET’s Helen Toner shared her expert insights in an article published by HuffPost. The article discusses concerning findings from recent tests showing that advanced AI models, including OpenAI’s o3 and Anthropic’s Claude Opus 4, can exhibit deceptive, self-preserving behaviors when faced with shutdown or replacement.


Toner, CSET’s Director of Strategy and Foundational Research Grants, highlighted the growing risks associated with these behaviors, stating, “What we’re starting to see is that things like self preservation and deception are useful enough to the models that they’re going to learn them, even if we didn’t mean to teach them.”

To read the article, visit HuffPost.

Is It Too Late to Slow China’s AI Development?

May 2025

CSET’s Helen Toner shared her expert insights in an article published by Foreign Policy. The article explores the impact of renewed U.S. export restrictions on Nvidia and the broader implications for U.S.-China competition in artificial intelligence…

Trump’s Crackdown on Foreign Student Visas Could Derail Critical AI Research

May 2025

CSET’s Helen Toner shared her expert insights in an article published by WIRED. The article discusses the U.S. government’s plans to aggressively revoke visas for Chinese students, particularly those in sensitive research fields or with…

These Startups Are Building Advanced AI Models Without Data Centers

April 2025

CSET’s Helen Toner shared her expert insights in an article published by WIRED. The article explores the development of a new large language model, Collective-1, built using a distributed training approach that leverages globally dispersed GPUs…

Reports

AI for Military Decision-Making

March 2025

Artificial intelligence is reshaping military decision-making. This concise overview explores how AI-enabled systems can enhance situational awareness and accelerate critical operational decisions, even in high-pressure, dynamic environments. Yet it also highlights the essential need for clear…

Center for Security and Emerging Technology

