Putting Explainable AI to the Test: A Critical Look at AI Evaluation Approaches

Mina Narayanan, Christian Schoeberl, and Tim G. J. Rudner

February 2025

Explainability and interpretability are often cited as key characteristics of trustworthy AI systems, but it is unclear how they are evaluated in practice. This report examines how researchers evaluate their explainability and interpretability claims in the context of AI-enabled recommendation systems and offers considerations for policymakers seeking to support AI evaluations.
