Executive Summary
Policymakers frequently invoke explainability and interpretability as key principles that responsible and safe AI systems should uphold. However, it is unclear how evaluations of explainability and interpretability methods are conducted in practice. To examine these evaluations, we conducted a literature review of studies on the explainability and interpretability of recommendation systems, a type of AI system that commonly provides explanations to users. Specifically, we analyzed how researchers (1) describe explainability and interpretability and (2) evaluate their explainability and interpretability claims in the context of AI-enabled recommendation systems. We focused on evaluation approaches in the research literature because data on AI developers’ evaluation approaches is not always publicly available, and researchers’ approaches can guide the types of evaluations that AI developers adopt.
We find that researchers describe explainability and interpretability inconsistently across papers and do not clearly differentiate explainability from interpretability. We also identify five evaluation approaches that researchers adopt: case studies, comparative evaluations, parameter tuning, surveys, and operational evaluations. Across these approaches, we observe that research papers strongly favor evaluations of system correctness over evaluations of system effectiveness. These two kinds of evaluation serve important but distinct purposes: evaluations of system correctness test whether explainable systems are built according to researcher specifications, while evaluations of system effectiveness test whether explainable systems operate as intended in the real world. If researchers understand and measure explainability, or other facets of AI safety, in inconsistent ways, policies for implementing or evaluating safe AI systems may not be effective. Although further inquiry is needed to determine whether these results translate to other research areas and the extent to which research practices influence developers, these trends suggest that policymakers would do well to invest in standards for AI safety evaluations and in building a workforce that can assess the efficacy of these evaluations in different contexts.