Introduction
The choice of method for surfacing AI/ML-relevant publications matters for bibliometric analysis. Based on our evaluations, we recommend the arXiv classifier for identifying AI/ML-relevant publications in English, because of its performance and because it can be updated over time as new expert labels become available. By comparison, other methods exhibit lower recall on analytically important AI/ML conference publications and make far more errors on the STEM preprints available on arXiv. While these other methods have analytic utility, the arXiv classifier performs best for identifying AI research at the publication level.
We have not directly evaluated the performance of Chinese-language keyword search on Chinese-language publications, but our English-language keyword search results suggest that keyword search output requires careful manual review. For cross-language analysis, we recommend applying the arXiv classifier to English-language text and keyword search to Chinese-language text, favoring performance over methodological consistency.
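The cross-language recommendation amounts to routing each publication to a method by language. The following is a minimal sketch of that routing, not the implementation used in this analysis. The function names, the language codes, and the keyword list are hypothetical, and the English-language classifier is passed in as a callable because its interface is not described here.

```python
from typing import Callable, Iterable

# Hypothetical placeholder keywords; the actual Chinese-language search
# terms are not given in this section.
ZH_KEYWORDS = ("机器学习", "深度学习", "神经网络")


def keyword_match(text: str, keywords: Iterable[str] = ZH_KEYWORDS) -> bool:
    """Return True if any keyword occurs in the text (plain substring match)."""
    return any(kw in text for kw in keywords)


def is_ai_relevant(
    text: str,
    language: str,
    english_classifier: Callable[[str], bool],
) -> bool:
    """Route a publication to a method by language.

    Assumes each record already carries a language code such as "en" or "zh".
    """
    if language == "en":
        # English text goes to the classifier (supplied by the caller).
        return english_classifier(text)
    if language == "zh":
        # Chinese text goes to keyword search.
        return keyword_match(text)
    raise ValueError(f"no method configured for language: {language!r}")
```

Under this sketch, the two branches produce labels of different reliability, so keyword-matched records would still be queued for the manual review described above rather than treated as equivalent to classifier output.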