A Free, Open Resource for the Global AI Community
The White House is tapping the expertise of researchers from Georgetown’s Center for Security and Emerging Technology to determine how data and open research can be used to address the COVID-19 pandemic. CSET has partnered with leading research groups to prepare and distribute the COVID-19 Open Research Dataset (CORD-19), a resource of more than 57,000 articles in JSON format about COVID-19 and the coronavirus family of viruses for use by the global machine learning community. The dataset represents the most extensive machine-readable coronavirus literature collection available for data and text mining to date.
“With this step, we’ve made available full-text, machine-readable resources to help speed response to this global crisis. The worldwide machine learning community now has the opportunity to apply recent advances in natural language processing to find answers to important questions about this infectious disease.”
Dewey Murdick, CSET Director of Data Science
CORD-19 contains 45,000 full-text articles with a wealth of information about the novel coronavirus (SARS-CoV-2), the associated illness COVID-19, and related viruses. The collection will be updated as new research is published in peer-reviewed publications and archival services like bioRxiv, medRxiv, and others.
At the request of the White House Office of Science and Technology Policy, CSET leads this effort in partnership with the Allen Institute for AI, Chan Zuckerberg Initiative, Microsoft Research and the National Library of Medicine of the National Institutes of Health.
Now, preliminary answers to questions about COVID-19 — including estimations of reproduction rate, incubation period, and key risk factors — are emerging. Kaggle has summarized early findings extracted from the CORD-19 papers by machine learning algorithms.
Media Coverage
- White House, Call to Action to the Tech Community on New Machine Readable COVID-19 Dataset
- Wired, Researchers Will Deploy AI to Better Understand Coronavirus
- Wired, AI Can Help Scientists Find a Covid-19 Vaccine
- Defense One, How to Counter China’s Coronavirus Disinformation Campaign
- Forbes, A Call To Action To AI Experts: Join The Fight Against The Coronavirus
- Forbes, Our Smartphone Data Can Predict How Coronavirus Will Spread
- Federal News Network, Using data in the fight against coronavirus
- Fedscoop, White House aims to answer WHO’s coronavirus questions using natural language processing
- Geekwire, AI2 and Microsoft join the White House’s push to enlist AI for the war on coronavirus
- Geekwire, Software tools for mining COVID-19 research studies go viral among scientists
- Geekwire, How AI is helping scientists in the fight against COVID-19, from robots to predicting the future
- Nextgov, Government Partnership Offers Cash Prizes for AI Tools That Support Coronavirus Research
- The Next Web, How AI helps scientists find reliable coronavirus research
- TechCrunch, With launch of COVID-19 data hub, the White House issues a ‘call to action’ for AI researchers
- Governing, AI to Interrogate Deep Archive to Find Insights on COVID-19
- Analytics India, Top Hackathons Dedicated To Fight COVID-19
- Psychology Today, White House Calls for AI Experts to Help COVID-19 Research
- Tech Republic, Verizon Media builds search engine to help researchers find COVID-19 documents
- The South African, CORD-19: Database of scientific articles launched to help AI fight COVID-19
- Import AI, AnimeGAN; why Bengali is hard for OCR systems; help with COVID by mining the CORD-19 dataset