COVID-19 Data Sets to Keep You Informed on the Crisis

Five COVID-19 data sets for analysts, journalists and researchers to use during the coronavirus pandemic.

The COVID-19 pandemic has dominated almost every newsroom recently, and it has had a huge impact on communities around the world. But as much as the coronavirus has disrupted our daily lives, we also live in a world where information is readily available. It is a matter of using that information to support our efforts in handling this pandemic and to make informed decisions.

Important COVID-19 Data Sets

The research community has responded to the COVID-19 pandemic by generating important data sets that benefit not only the medical community but also data scientists. These data sets can help to accelerate treatment research and help policymakers anticipate future events.

Here are five of the best COVID-19 data sets to draw information from.

1. Global Coronavirus (COVID-19) Data (Johns Hopkins)

Produced by Johns Hopkins University in Baltimore, USA, the COVID-19 Dashboard is one of the most accessed coronavirus data sets in the world.

This data set is provided by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) and is drawn from various sources, including the World Health Organization and the National Health Commission of the People’s Republic of China. It has existed since January 2020, and the JHU CSSE also maintains a data repository on GitHub, allowing developers, researchers and journalists to access and use the raw data.

Access this data set by clicking this link: Johns Hopkins
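
For readers who want to work with the raw data from the GitHub repository, here is a minimal sketch in Python. It assumes the global time-series file has one row per province or country, four location columns (Province/State, Country/Region, Lat, Long) and one column of cumulative counts per date. The URL and the helper name totals_by_country are assumptions made for illustration and are not taken from the article, and the sample rows are only there so the sketch runs without a network connection.

```python
import csv
import io
from collections import defaultdict

# Assumed location of the raw file in the JHU CSSE GitHub repository.
CONFIRMED_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_time_series/"
    "time_series_covid19_confirmed_global.csv"
)

# Columns that describe the location; every other column is assumed to be a date.
LOCATION_COLUMNS = {"Province/State", "Country/Region", "Lat", "Long"}


def totals_by_country(csv_text):
    """Sum the cumulative counts of all provinces into one series per country."""
    totals = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        country = row["Country/Region"]
        for column, value in row.items():
            if column in LOCATION_COLUMNS or not value:
                continue
            totals[country][column] = totals[country].get(column, 0) + int(value)
    return dict(totals)


# Illustrative rows in the assumed layout (not a substitute for the real file).
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
Hubei,China,30.97,112.27,444,444
Beijing,China,40.18,116.41,14,22
,Italy,41.87,12.56,0,0
"""

sample_totals = totals_by_country(SAMPLE)
for country, series in sample_totals.items():
    print(country, series)
```

To run this against the live data, the text of the file at CONFIRMED_URL could be downloaded (for example with urllib.request from the standard library) and passed to totals_by_country in place of SAMPLE.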

2. CORD-19

You’ll need experience in accessing open data sets using a tool like R.

Several research groups have partnered with the Allen Institute for AI to deliver a data set called the COVID-19 Open Research Dataset (CORD-19). This data set contains information from more than 40,000 scholarly articles about the pandemic and the family of coronaviruses, for use by researchers around the world. It also supports a Kaggle challenge that calls on data scientists to develop data mining tools to help the medical community. An explorer tool has been launched to help navigate the CORD-19 data set.

Access this data set by clicking this link: CORD-19 Data Set.
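
As a rough idea of what working with CORD-19 can look like, the sketch below runs a simple keyword search over article metadata in Python. It assumes the download includes a metadata CSV with title, abstract and journal columns. The function name search_metadata and the sample rows are made up for illustration, so check the column names against the file you actually download.

```python
import csv
import io


def search_metadata(csv_text, keyword):
    """Return (title, journal) pairs whose title or abstract mentions the keyword."""
    needle = keyword.lower()
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Missing or empty fields are treated as empty strings.
        text = f"{row.get('title') or ''} {row.get('abstract') or ''}".lower()
        if needle in text:
            matches.append((row.get("title") or "", row.get("journal") or ""))
    return matches


# Made-up rows in the assumed layout, used only so the sketch is runnable.
SAMPLE = """title,abstract,journal
Example paper on incubation periods,Placeholder abstract about case reports.,Journal A
Example paper on airway management,Placeholder abstract reviewing clinical practice.,Journal B
"""

hits = search_metadata(SAMPLE, "incubation")
print(hits)
```

A plain substring match like this is only a starting point. The Kaggle challenge mentioned above is aimed at more capable text mining tools.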

3. nCoV-2019

Again, one for the nerds amongst us. To access and make use of the nCoV-2019 data set, you’ll need a degree of statistical computing experience.

The nCoV-2019 data set collects information from various national health reports, along with selected online reports, and provides real-time case information. The data is geo-coded and, where available, includes additional information such as travel records and symptoms. A GitHub repository is available for this data set. Data sets like these help with public health decision making.

Access this data set by clicking this link: nCoV-2019 Data Set.

4. WHO COVID-19

Literature and data files relating to the COVID-19 pandemic, published by the WHO and accessible through a search engine or as an export.

The World Health Organization (WHO) maintains a data set that is updated daily with information from various journals and scientific articles. Users can download the data set from the WHO or search the database by journal or keyword. The WHO also lists its resources with a direct link to each.

Access this data set by clicking this link: WHO COVID-19.

5. COVID-19 Tweet IDs

The COVID-19 Tweet IDs data set offers a collection of millions of tweets from across the world that are associated with the coronavirus pandemic. The data set was established in January 2020 and follows various Twitter accounts and specific keywords in different languages. The collected tweets help provide a breakdown of the information shared on Twitter, including tips on how to handle the pandemic.

These data sets can be a significant benefit not only to the medical research community but also to policymakers worldwide. How data scientists use these data sets, and what tools will emerge as a result, remains to be seen, but they give us one extra tool in the fight against the coronavirus.

Access this data set by clicking this link: COVID-19 Tweet ID Data.
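
Because the data set holds tweet IDs rather than the tweets themselves, the IDs usually have to be looked up through the Twitter API before the text can be analysed. The Python sketch below covers only the preparation step. It assumes the IDs arrive as plain text with one ID per line and that the lookup endpoint accepts at most 100 IDs per request. Both are assumptions to verify against the data set and the current API documentation, and the helper names and sample IDs are made up for illustration.

```python
def read_tweet_ids(text):
    """Parse one tweet ID per line, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 250 placeholder IDs standing in for the contents of one ID file.
SAMPLE = "\n".join(str(1000 + i) for i in range(250))

tweet_ids = read_tweet_ids(SAMPLE)
id_batches = list(batches(tweet_ids))
print(len(tweet_ids), "IDs in", len(id_batches), "batches")
```

Each batch could then be passed to whichever Twitter client library you use for the actual lookup.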

Molzana continue to trade and operate as normally as possible during the COVID-19 pandemic, and we are truly grateful to be able to continue helping our clients during this time. We hope that you and your business are fortunate enough to be in a similar position. Please stay safe.

