Data

CSET’s unique data-driven approach is enabled by our data team. The team includes data scientists, data research analysts, software engineers, survey and translation specialists, and more. We maintain CSET’s vast data holdings, which include nearly 60 analysis-ready datasets, offering unprecedented coverage of the emerging technology ecosystem. The team develops and deploys the latest methods in data science and machine learning to clean, link, classify, and otherwise enhance data for analytic use, and supports the curation and annotation of original datasets, from surveys to scraped online information. The resulting research and tools are presented in CSET Data Briefs and Data Snapshots, in public repositories, and at academic conferences and in publications.



You can also check out our work on CSET’s Emerging Technology Observatory. ETO provides free, high-quality data resources that draw on CSET’s data and analytic capabilities to turn data into actionable insights. Currently, ETO hosts 10 public tools and 8 open datasets maintained by CSET’s data team. Subscribe to receive ETO analysis and updates.

Recent Publications

In the second installment of our blog series analyzing 147 AI-related laws enacted by Congress between January 2020 and March 2025 from AGORA, we explore the governance strategies, risk-related concepts, and harms addressed in the legislation. In the first blog, we showed that the majority of these AI-related legislative documents were drawn...

Read More

Data Snapshot

The NIH’s Impact on Research and Innovation

Katherine Quinn, Steph Batalis, and Rebecca Gelles
| August 7, 2025

Data Snapshots are informative descriptions and quick analyses that dig into CSET’s unique data resources. This three-part series introduces CSET’s patent clusters, which connect related patents through citations and text similarity.

Read More

Data Visualization

Exploring AI legislation in Congress with AGORA: Origin and Application Domains

Mina Narayanan and Sonali Subbu Rathinam
| July 23, 2025

In this two-part analysis, we use data from the Emerging Technology Observatory's AGORA to explore AI-related legislation that was enacted by Congress between January 2020 and March 2025. This first blog explores the origin and application domains of the AI-related legislation we reviewed. The second blog examines the governance strategies, risk-related concepts,...

Read More

Recent Blog Articles

On July 31, 2025, the Trump administration released “Winning the Race: America’s AI Action Plan.” CSET has broken down the Action Plan, focusing on specific government deliverables. Our Provision and Timeline tracker breaks down which agencies are responsible for implementing recommendations and the types of actions...

Read More

CSET’s Must Read Research: A Primer

Tessa Baker
| December 18, 2023

This guide provides a run-down of CSET’s research since 2019 for first-time visitors and long-term fans alike. Quickly get up to speed on our “must-read” research and learn about how we organize our work.

Read More

Our People

Catherine Aiken

Director of Data Science and Research

Adrian Thinnyun

Data Research Analyst

Ben Murphy

Translation Manager

Brian Love

Senior Software Engineer

Daniel Chou

Data Scientist

Jacob Feldgoise

Senior Data Research Analyst

Katherine Quinn

Data Scientist

Rebecca Gelles

ML Engineer

Ronnie Kinoshita

Deputy Director of Data Science & Research

Shruti Agarwal

Software Engineer

Sonali Subbu Rathinam

Data Research Analyst

Related News

On July 31, 2025, the Trump administration released “Winning the Race: America’s AI Action Plan.” CSET has broken down the Action Plan, focusing on specific government deliverables. Our Provision and Timeline tracker breaks down which agencies are responsible for implementing recommendations and the types of actions they should take.
In The News

Mapping the AI Governance Landscape

October 15, 2025
🔔 AI-related governance documents are rapidly proliferating, but what risks, mitigations, and other concepts do these documents actually cover? MIT AI Risk Initiative researchers Simon Mylius, Peter Slattery, Yan Zhu, Alexander Saeri, Jess Graham, Michael Noetel, and Neil Thompson teamed up with CSET’s Mina Narayanan and Adrian Thinnyun to pilot an approach to map over 950 AI governance documents to several extensible taxonomies. These taxonomies cover AI risks and actors, industry sectors targeted, and other AI-related concepts, complementing AGORA’s thematic taxonomy of risk factors, harms, governance strategies, incentives for compliance, and application areas.
On October 30, 2023, the Biden administration released its long-awaited Executive Order on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. CSET has broken down the EO, focusing on specific government deliverables. Our EO Provision and Timeline tracker lists which agencies are responsible for actioning EO provisions and their deadlines.
As technology competition intensifies between the United States and China, governments and policy researchers are looking for metrics to assess each country’s relative strengths and weaknesses. One measure of technology innovation increasingly used by the policy community is research output. Drawing on CSET’s experiences over the last four years, this post shares our best practices for using research output to study national technological competition and inform public policy.
CSET has received a lot of questions about LLMs and their implications. But questions and discussions tend to miss some basics about LLMs and how they work. In this blog post, we ask CSET’s NLP Engineer, James Dunham, to help us explain LLMs in plain English.
Making sense of the often overwhelming world of emerging tech with data-driven tools and resources.