Analysis

Truth, Lies, and Automation

How Language Models Could Change Disinformation

Ben Buchanan

Andrew Lohn

Micah Musser

Katerina Sedova

May 2021

Growing popular and industry interest in high-performing natural language generation models has led to concerns that such models could be used to generate automated disinformation at scale. This report examines the capabilities of GPT-3, a cutting-edge AI system that writes text, to analyze its potential misuse for disinformation. A model like GPT-3 may be able to help disinformation actors substantially reduce the work necessary to write disinformation while expanding its reach and potentially also its effectiveness.

Download Full Report

For millennia, disinformation campaigns have been fundamentally human endeavors. Their perpetrators mix truth and lies in potent combinations that aim to sow discord, create doubt, and provoke destructive action. The most famous disinformation campaign of the twenty-first century, the Russian effort to interfere in the 2016 U.S. presidential election, relied on hundreds of people working together to widen preexisting fissures in American society.

Since its inception, writing has also been a fundamentally human endeavor. No more. In 2020, the company OpenAI unveiled GPT-3, a powerful artificial intelligence system that generates text based on a prompt from human operators. The system, which uses a vast neural network, a powerful machine learning algorithm, and upwards of a trillion words of human writing for guidance, is remarkable. Among other achievements, it has drafted an op-ed that was commissioned by The Guardian, written news stories that a majority of readers thought were written by humans, and devised new internet memes.

In light of this breakthrough, we consider a simple but important question: can automation generate content for disinformation campaigns? If GPT-3 can write seemingly credible news stories, perhaps it can write compelling fake news stories; if it can draft op-eds, perhaps it can draft misleading tweets.

To address this question, we first introduce the notion of a human-machine team, showing how GPT-3’s power derives in part from the human-crafted prompt to which it responds. We were granted free access to GPT-3, a system that is not publicly available for use, to study its capacity to produce disinformation as part of a human-machine team. We show that, while GPT-3 is often quite capable on its own, it reaches new heights of capability when paired with an adept operator and editor. As a result, we conclude that although GPT-3 will not replace all humans in disinformation operations, it is a tool that can help them to create moderate- to high-quality messages at a scale much greater than what has come before.

In reaching this conclusion, we evaluated GPT-3’s performance on six tasks that are common in many modern disinformation campaigns. Table 1 describes those tasks and GPT-3’s performance on each.

Table 1. Summary evaluations of GPT-3 performance on six disinformation-related tasks.

Task	Description	Performance
Narrative Reiteration	Generating varied short messages that advance a particular theme, such as climate change denial.	GPT-3 excels with little human involvement.
Narrative Elaboration	Developing a medium-length story that fits within a desired worldview when given only a short prompt, such as a headline.	GPT-3 performs well, and technical fine-tuning leads to consistent performance.
Narrative Manipulation	Rewriting news articles from a new perspective, shifting the tone, worldview, and conclusion to match an intended theme.	GPT-3 performs reasonably well with little human intervention or oversight, though our study was small.
Narrative Seeding	Devising new narratives that could form the basis of conspiracy theories, such as QAnon.	GPT-3 easily mimics the writing style of QAnon and could likely do the same for other conspiracy theories; it is unclear how potential followers would respond.
Narrative Wedging	Targeting members of particular groups, often based on demographic characteristics such as race and religion, with messages designed to prompt certain actions or to amplify divisions.	A human-machine team is able to craft credible targeted messages in just minutes. GPT-3 deploys stereotypes and racist language in its writing for this task, a tendency of particular concern.
Narrative Persuasion	Changing the views of targets, in some cases by crafting messages tailored to their political ideology or affiliation.	A human-machine team is able to devise messages on two international issues—withdrawal from Afghanistan and sanctions on China—that prompt survey respondents to change their positions; for example, after seeing five short messages written by GPT-3 and selected by humans, the percentage of survey respondents opposed to sanctions on China doubled.
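
A claim that a percentage "doubled" describes a relative change, which is not the same as the change in percentage points. The short sketch below illustrates the distinction. The before and after figures are hypothetical placeholders chosen only for illustration; they are not the survey's results, and the function names are ours.

```python
def percentage_point_change(before_pct: float, after_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


def relative_change(before_pct: float, after_pct: float) -> float:
    """Ratio of the new percentage to the old one (2.0 means 'doubled')."""
    if before_pct == 0:
        raise ValueError("relative change is undefined when the baseline is 0")
    return after_pct / before_pct


# Hypothetical shares of respondents opposed, before and after seeing messages.
before, after = 20.0, 40.0

points = percentage_point_change(before, after)
ratio = relative_change(before, after)

print(f"Change: {points:.1f} percentage points, {ratio:.1f}x the baseline")
```

With these placeholder values, the share doubles (a ratio of 2.0) while rising by 20 percentage points; the same doubling from a smaller baseline would move far fewer respondents.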

Across these and other assessments, GPT-3 proved itself to be both powerful and limited. When properly prompted, the machine is a versatile and effective writer that nonetheless is constrained by the data on which it was trained. Its writing is imperfect, but its drawbacks—such as a lack of focus in narrative and a tendency to adopt extreme views—are less significant when creating content for disinformation campaigns.

Should adversaries choose to pursue automation in their disinformation campaigns, we believe that deploying an algorithm like the one in GPT-3 is well within the capacity of foreign governments, especially tech-savvy ones such as China and Russia. It will be harder, but almost certainly possible, for these governments to harness the required computational power to train and run such a system, should they desire to do so.

Mitigating the dangers of automation in disinformation is challenging. Since GPT-3’s writing blends in so well with human writing, the best way to thwart adversary use of systems like GPT-3 in disinformation campaigns is to focus on the infrastructure used to propagate the campaign’s messages, such as fake accounts on social media, rather than on determining the authorship of the text itself.
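
One way to picture an infrastructure-focused mitigation is a check that ignores message text entirely and looks only at account metadata. The following is a minimal sketch under stated assumptions, not a description of any platform's actual method: the `Account` fields, the `signup_network` identifier, and the cluster-size threshold are all hypothetical, and a real system would need far richer signals and careful handling of false positives.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Account:
    handle: str
    created: date
    signup_network: str  # hypothetical coarse identifier of the signup origin


def flag_account_clusters(accounts, min_cluster_size=3):
    """Return groups of accounts that share a creation date and signup network.

    The message text is never inspected; only account metadata is used.
    """
    clusters = defaultdict(list)
    for account in accounts:
        clusters[(account.created, account.signup_network)].append(account.handle)
    # Keep only groups large enough to look like a batch registration.
    return {
        key: sorted(handles)
        for key, handles in clusters.items()
        if len(handles) >= min_cluster_size
    }


# Hypothetical sample data for illustration only.
accounts = [
    Account("user_a", date(2021, 5, 1), "net-1"),
    Account("user_b", date(2021, 5, 1), "net-1"),
    Account("user_c", date(2021, 5, 1), "net-1"),
    Account("user_d", date(2021, 5, 2), "net-1"),
    Account("user_e", date(2021, 5, 1), "net-2"),
]

flagged = flag_account_clusters(accounts)
print(flagged)
```

The design choice here mirrors the argument above: because the check never asks whether a human or a machine wrote a message, it is unaffected by how well machine-written text blends in with human writing.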

Such mitigations are worth considering because our study shows there is a real prospect of automated tools generating content for disinformation campaigns. In particular, our results are best viewed as a low-end estimate of what systems like GPT-3 can offer. Adversaries who are unconstrained by ethical concerns and buoyed by greater resources and technical capabilities will likely be able to use systems like GPT-3 more fully than we have, though it is hard to know whether they will choose to do so. With the right infrastructure, they will likely be able to harness the scalability that such automated systems offer, generating many messages and flooding the information landscape with the machine’s most dangerous creations.

Our study shows the plausibility—but not inevitability—of such a future, in which automated messages of division and deception cascade across the internet. While more developments are yet to come, one fact is already apparent: humans now have able help in mixing truth and lies in the service of disinformation.

Truth, Lies, and Automation: How Language Models Could Change Disinformation

Authors

Ben Buchanan Andrew Lohn Micah Musser Katerina Sedova

Originally Published

May 2021

Citation

Ben Buchanan, Andrew Lohn, Micah Musser, and Katerina Sedova, "Truth, Lies, and Automation: How Language Models Could Change Disinformation" (Center for Security and Emerging Technology, May 2021). https://doi.org/10.51593/2021CA003

Analysis

AI and the Future of Disinformation Campaigns

December 2021

Artificial intelligence offers enormous promise to advance progress and powerful capabilities to disrupt it. This policy brief is the second installment of a series that examines how advances in AI could be exploited to enhance… Read More

Analysis

AI and the Future of Disinformation Campaigns

December 2021

Artificial intelligence offers enormous promise to advance progress, and powerful capabilities to disrupt it. This policy brief is the first installment of a series that examines how advances in AI could be exploited to enhance… Read More

Testimony

Andrew Lohn’s Testimony Before the House Homeland Security Subcommittee on Cybersecurity, Infrastructure Protection, and Innovation

June 2022

CSET Senior Fellow Andrew Lohn testified before the House of Representatives Homeland Security Subcommittee on Cybersecurity, Infrastructure Protection, and Innovation at a hearing on "Securing the Future: Harnessing the Potential of Emerging Technologies While Mitigating… Read More

Analysis

Forecasting Potential Misuses of Language Models for Disinformation Campaigns—and How to Reduce Risk

January 2023

Machine learning advances have powered the development of new and more powerful generative language models. These systems are increasingly able to write text at near human levels. In a new report, authors at CSET, OpenAI,… Read More

Analysis

A National Security Research Agenda for Cybersecurity and Artificial Intelligence

May 2020

Machine learning advances are transforming cyber strategy and operations. This necessitates studying national security issues at the intersection of AI and cybersecurity, including offensive and defensive cyber operations, the cybersecurity of AI systems, and the… Read More
