Executive Summary
While top-level principles regarding trustworthy, ethical, and responsible artificial intelligence (AI) and machine learning (ML) are critical to the formation of international norms, so too is the detailed work of the academic and research communities in establishing precise framings, techniques, and tools that will help create or assess trustworthy AI. Policymakers therefore have an obvious interest in understanding and assessing where the technical community may be making progress that can be harnessed, and where they would do well to support or otherwise incentivize more activity.
Understanding where progress is being made in developing trustworthy AI is complicated. First, the field of AI/ML is advancing rapidly, with new tools and techniques emerging in quick succession. Second, trustworthy AI is a nascent, multifaceted concept that is hard to bound. And third, policymakers and technical researchers may be talking past each other, at least in the published literature, by using the same key terms to describe trustworthy AI while ascribing different meanings to them.
This paper aims to assist technology policymakers interested in trustworthy AI by examining how trustworthy AI keywords are used in AI research publications and whether that use aligns with the meanings policymakers ascribe to the same terms. Drawing on the National Institute of Standards and Technology's AI Risk Management Framework (NIST AI RMF), a set of terms related to trustworthy AI is defined, and 2.3 million AI-related research publications from 2010 to 2021 are analyzed, with the following findings (a minimal sketch of the keyword-matching approach appears after the list):
- Roughly 14 percent of AI papers published between 2010 and 2021 include at least one of 13 trustworthy AI keywords (322,209 keyword papers). Over the past five years, the number of publications using these terms has grown faster than AI research overall.
- A review of the titles and abstracts of the most cited papers with a trustworthy AI keyword in 2021 reveals that researchers are using most of the keywords in ways that align with the intent of the NIST AI RMF. However, tracking trends in trustworthy AI research through keywords can be misleading, because not all papers that use a trustworthy AI keyword actually discuss that subject, and some keywords are used in other contexts more often than others. For example:
- The keywords reliability and robustness are the most frequently mentioned trustworthy AI terms in publications, and most of the titles and abstracts reviewed for this study use them in ways that align with NIST's AI RMF. These terms may appear frequently in part because they are standard evaluation criteria used widely in AI research.
- In the case of reliability, however, a significant minority of papers use the term in research on how AI could improve the reliability of a non-AI system.
- Like reliability, the terms safety, security, and resilience are frequently used with varying meanings. While most of the titles and abstracts reviewed for this study use these terms in ways that align with NIST's AI RMF definitions, a significant minority use them in research on how AI could improve the safety, security, and/or resilience of a non-AI system.
- While the keyword bias is frequently used in policy conversations in the context of mitigating or avoiding the harmful effects of discrimination, in AI publications it has two main uses: a technical one, describing a meaningful component of an algorithm (for example, the bias term in a regression or neural network model), and another describing unfair discrimination. NIST's definition accounts for both, though it focuses on mitigating harmful bias in the sense of unfair discrimination. Researchers are evenly split between these two uses of the word bias.
- Many publications that use the terms explainability, interpretability, transparency, and accountability reference how to develop AI models and systems that an end user can trust, specifically in the context of the Explainable AI (XAI) research area. This is notable because, while trustworthy AI is not currently considered a research area in its own right, XAI has developed into one. Although the terms explainability and interpretability can be confusing to non-experts, they appear to be distinct from each other and core to XAI research.
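As a concrete illustration of the keyword-matching approach referenced above, the sketch below flags publications whose title or abstract mentions a trustworthy AI keyword. It is a minimal, hypothetical reconstruction rather than the paper's actual pipeline: the keyword list (the three terms beyond the ten named in the findings are assumptions), the word-boundary matching rule, and the toy corpus are all illustrative.

```python
import re

# Illustrative keyword list: the findings above name ten terms; the last
# three here (fairness, privacy, trustworthiness) are assumptions standing
# in for the paper's full set of 13 NIST AI RMF-derived keywords.
KEYWORDS = [
    "reliability", "robustness", "safety", "security", "resilience",
    "bias", "explainability", "interpretability", "transparency",
    "accountability", "fairness", "privacy", "trustworthiness",
]

# One word-boundary pattern per keyword, so "bias" does not match "biased".
PATTERNS = {kw: re.compile(rf"\b{kw}\b", re.IGNORECASE) for kw in KEYWORDS}


def flag_keywords(title, abstract):
    """Return the set of trustworthy AI keywords appearing in a paper's
    title or abstract."""
    text = f"{title} {abstract}"
    return {kw for kw, pattern in PATTERNS.items() if pattern.search(text)}


# Toy corpus standing in for the 2.3 million AI-related publications.
papers = [
    {"title": "Certified robustness of deep networks",
     "abstract": "We analyze robustness and reliability under perturbation."},
    {"title": "Deep learning for power-grid monitoring",
     "abstract": "Using ML to improve the reliability of a non-AI system."},
    {"title": "A survey of graph neural networks",
     "abstract": "Architectures and benchmarks for graph-structured data."},
]

flagged = [p for p in papers if flag_keywords(p["title"], p["abstract"])]
print(f"{len(flagged)}/{len(papers)} papers mention at least one keyword")
```

Note that the second toy paper is flagged even though it concerns using AI to improve a non-AI system's reliability, which illustrates the caution raised in the findings: a keyword match alone can overstate the amount of research actually about trustworthy AI.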