The COVID-19 pandemic has dominated the news in recent months and has had a huge impact on communities around the world. But as much as the coronavirus has disrupted our daily lives, we also live in a world where information is readily available. It is a matter of using that information to support our efforts in handling the pandemic and to make informed decisions.
Important COVID-19 Data Sets
The research community has responded to the COVID-19 pandemic by generating important data sets that benefit not only the medical community but also data scientists. These data sets can help to accelerate treatment research and inform policymakers' predictions of future events.
Here are five of the best COVID-19 data sets to draw information from.
This data set is provided by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) and is drawn from various sources, including the World Health Organization and the National Health Commission of the People’s Republic of China. The data set has existed since January 2020, and the JHU CSSE also maintains a data repository on GitHub, allowing developers, researchers and journalists to access and utilise the raw data.
Access this data set by clicking this link: Johns Hopkins
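For readers who want to work with raw case data of this kind, here is a minimal Python sketch of turning cumulative counts into daily new cases. It assumes a wide CSV layout with one row per region and one column per date, similar to the time-series files in the GitHub repository. The sample rows, place names and function names below are made up for illustration and are not real case counts.

```python
import csv
import io

# Made-up sample in an ASSUMED wide layout: one row per region, one column per date.
# The place names and numbers are illustrative only, not real case counts.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Exampleland,0.0,0.0,1,3,6
North,Sampleria,1.0,1.0,0,2,2
South,Sampleria,2.0,2.0,5,5,9
"""

# Columns that describe the region rather than hold a count for a date.
META_COLUMNS = {"Province/State", "Country/Region", "Lat", "Long"}


def cumulative_by_country(csv_text):
    """Sum the cumulative counts of all provinces belonging to each country."""
    reader = csv.DictReader(io.StringIO(csv_text))
    date_columns = [name for name in reader.fieldnames if name not in META_COLUMNS]
    totals = {}
    for row in reader:
        series = totals.setdefault(row["Country/Region"], [0] * len(date_columns))
        for i, date in enumerate(date_columns):
            series[i] += int(row[date])
    return date_columns, totals


def daily_new(cumulative):
    """Convert a cumulative series into day-over-day increases.

    The first value is kept as-is because there is no earlier day to compare with.
    """
    if not cumulative:
        return []
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


dates, totals = cumulative_by_country(SAMPLE_CSV)
for country, series in totals.items():
    print(country, dict(zip(dates, daily_new(series))))
```

To use a real file, the sample string could be replaced with the contents of a downloaded CSV, after checking that its column names match the ones assumed here.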
Several research groups have partnered with the Allen Institute for AI to deliver the COVID-19 Open Research Dataset (CORD-19). This data set contains more than 40,000 scholarly articles about the pandemic and the wider family of coronaviruses, for use by researchers around the world. It also supports a Kaggle challenge that calls on data scientists to develop data mining tools to help the medical community. An explorer tool has been launched to help navigate the CORD-19 data set.
Access this data set by clicking this link: CORD-19 Data Set.
The nCoV-2019 data set collects information from various national health reports, along with selected online reports, and provides real-time case information. The data is geo-coded and includes additional information such as travel records and symptoms, where available. A GitHub repository is also available for this data set. Data sets like these help with public health decision making.
Access this data set by clicking this link: nCoV-2019 Data Set.
The World Health Organization (WHO) maintains a data set that is updated daily with information drawn from various journals and scientific articles. Users can download the data set from the WHO and search the database by journal or keyword. The WHO also lists its resources with a direct link to each.
Access this data set by clicking this link: WHO COVID-19.
The COVID-19 Tweet IDs data set offers a collection of millions of tweets from across the world that are associated with the coronavirus pandemic. This data set was established in January 2020 and follows various Twitter accounts with specific keywords in different languages. It also collects tweets to help provide a breakdown of the information shared on Twitter, including tips on how to handle the pandemic.
Access this data set by clicking this link: COVID-19 Tweet ID Data.
These data sets can be of significant benefit not only to the medical research community but also to policymakers worldwide. How data scientists use these data sets, and what tools emerge as a result, remains to be seen, but they give us one extra tool in the fight against the coronavirus.
Molzana continue to trade and operate as normally as possible during the COVID-19 pandemic, and we are truly grateful to be able to continue helping our clients during this time. We hope that you and your business are fortunate enough to be in a similar position. Please stay safe.