Global COVID-19 Research and Modeling: A Historical Record
To answer big questions such as ‘how have global scientists responded to tackling COVID-19?’ and ‘how has COVID-19 been quantified?’, our team explored 809k valid English-language publications affiliated with 194 countries and 2.3M authors across 27 subject areas. The publications cover the entire World Health Organisation-recognised active period of the COVID-19 pandemic, from January 2020 to April 2023. We also conducted a series of studies on COVID-19 modeling over 3.5 years, from 2021 to 2024.
This book provides answers to fundamental and challenging questions regarding the global response to COVID-19. It creates a historical record of COVID-19 research conducted over the four years of the pandemic, with a focus on how researchers have responded to, quantified, and modeled COVID-19 problems. Since mid-2021, we have diligently monitored and analyzed global scientific efforts in tackling COVID-19. Our comprehensive global endeavor involves collecting, processing, and analyzing the COVID-19 related scientific literature published in English since January 2020. This provides insights into how scientists across disciplines and in almost every country and region have fought against COVID-19. Additionally, we explore the quantification of COVID-19 problems and impacts through mathematics, AI, machine learning, data science, epidemiology, and domain knowledge. The book reports findings on publication quantities, impacts, collaborations, and correlations with the economy and infections at the global, regional, and country levels. These results represent the first and only holistic and systematic studies aimed at scientifically understanding, quantifying, and containing the pandemic. We hope this comprehensive analysis will contribute to better preparedness, response, and management of future emergencies and inspire further research in infectious diseases. The book serves as a historical archive for the whole scientific community, providing unprecedented resources for research policy and funding management authorities, researchers, and policy makers involved in infectious disease management, public health, and emergency resilience.
- The US led the way in overall contributions, accounting for 14.81% of total publications. The European Union was close behind at 14.66%, while China contributed 6.02%. G20 countries and regions contributed 47.54%. OECD countries and regions contributed 42.38%.
- In quantifying and modeling COVID-19, regression, machine learning and deep learning, simulation, multivariate statistics, artificial intelligence, and statistical modeling are the most frequently applied techniques. Medical scientists most frequently applied regression, multivariate statistics, simulation, statistical modeling, and machine learning. Computer scientists preferred machine learning, simulation, deep learning, artificial intelligence, and regression. Social scientists favored regression, structural equation models, machine learning, simulation, and artificial intelligence.
- Mental health — including anything related to sentiment, anxiety or depression — was the most frequently expressed concern in research, rather than virology or medical treatments.
- The top five research keywords explored were mental health, pandemic, vaccination, second waves and lockdowns.
- The US, China, the United Kingdom, Italy and India ranked top five globally in publication quantity and cumulative impact, while the top five in research productivity were the Netherlands, Switzerland, the UK, France and the US.
- Among G20 countries, Argentina, South Korea and Australia were less productive in COVID-19 research than expected.
- China, the US, the UK, Canada and Italy are the top-5 most collaborative countries. Despite well-publicised political tensions between the US and China during this period, scientists from the two countries maintained stronger collaborative relationships than those of any other pair of countries.
- Very classic, conventional methodologies, in particular, basic regression, machine learning models and multivariate statistics, were overwhelmingly applied in medical science, social science and computer science.
- Many researchers rushed to publish their results but used basic analytical techniques to process newly available data, and some of the analysis was very naive.
- Of the G20 and OECD countries, the US, China, the UK, Italy and India are the top 5 in terms of cumulative Composite Impact integrating H5-index, impact factor, CiteScore, SNIP and SJR.
- Regarding the paper-averaged mean Composite Impact of all publications, Iceland, France, the Netherlands, Denmark, and Switzerland are the top-5 countries. The US and China share a similar paper-averaged mean CI. While India ranks No. 5 in terms of publication number, India ranks 18th with respect to its median paper-averaged CI.
- The countries and regions with the highest correlation coefficients between the total publication number and GDP per capita, and between the collective CI and GDP per capita, are India, China, Pakistan, the US and Brazil.
- Of the G20 countries and regions, India, China, the US, Brazil and Turkey are the top-5 in terms of their correlation between publication number and GDP per capita.
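The correlation findings above can be illustrated with a short sketch. It assumes a plain Pearson correlation coefficient computed per country over paired observations; the report's exact correlation measure and data layout are not stated here, so the `pearson` helper and the sample values below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up values for one hypothetical country (not data from the report).
publications = [120, 340, 410, 390]
gdp_per_capita = [1900.0, 2250.0, 2400.0, 2350.0]

r = pearson(publications, gdp_per_capita)
print(f"correlation between publications and GDP per capita: {r:.3f}")
```

Ranking countries by `r` would then give a top-5 list of the kind reported above, under the stated assumptions.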
Some highlights of the findings:
- A summary of important findings and implications of this global scientific response to studying COVID-19: Chapter 3: Highlights of the Findings
- MQ Lighthouse: Big data confirms mental health was studied more than the virus during the pandemic.
More comprehensive findings can be found in
Global Scientific Responses
The following two comprehensive and systematic literature review reports summarize how global scientists have responded to the COVID-19 pandemic:
- Longbing Cao and Wenfeng Hou. How Have Global Scientists Responded to Tackling COVID-19? Full technical report, pp. 1-125, University of Technology Sydney, 2022. Access the report at medRxiv and its associated COVID-19 global scientist response dataset and results at Kaggle. The report and data provide comprehensive literature analyses and results about the global scientific response to the COVID-19 pandemic.
- Longbing Cao and Qing Liu. COVID-19 Modeling: A Review, pp. 1-103, 2021. Accessible at medRxiv or SSRN. This paper provides a comprehensive review of COVID-19 modeling, including epidemiological modeling, AI, data science, machine learning and deep learning, statistical and mathematical modeling, and simulation methods.
Also refer to for more information about the relevant research on COVID-19.
Introduction
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the new severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It developed into a pandemic within a short period.[1][2]
During the pandemic, multiple variants of the virus that causes COVID-19 have emerged. In the United Kingdom (UK), a new variant with an unusually large number of mutations is affecting everyday life. It spreads more easily and quickly than other variants. It was first detected in September 2020 and is now highly prevalent in London and southeast England. It has also been detected in numerous countries around the world, including the United States and Canada.
In South Africa, another variant has emerged independently of the variant detected in the UK. This variant, originally detected in early October 2020, shares some mutations with the variant detected in the UK. This variant seems to spread more easily and quickly than other variants. Now, cases caused by this variant have been detected outside of South Africa. Currently, there is no evidence that it causes more severe illness or increases the risk of death.
Another variant recently emerged in Nigeria. There is no evidence that this variant causes more severe illness or increases the spread of COVID-19 in Nigeria.[3]
Symptoms
Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. 80% of patients have mild to moderate symptoms. Older people and those with underlying medical problems such as cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness.[4] COVID-19 can affect different people in different ways:
- Mild symptoms or none at all
Some people become infected but have only very mild symptoms or none at all.
- Common symptoms
The most common symptoms of COVID-19 are fever, dry cough, fatigue, persistent pain or pressure in the chest, and shortness of breath.
Other symptoms that are less common and may affect some patients include: runny nose, loss of taste or smell, nasal congestion, conjunctivitis (also known as red eyes), sore throat, headache, muscle or joint pain, different types of skin rash, nausea or vomiting, diarrhea, chills or dizziness, irritability, confusion, reduced consciousness (sometimes associated with seizures), anxiety, depression, or sleep disorders.
Children tend to have abdominal symptoms and skin changes or rashes, and sometimes vomiting and diarrhea.
- Severe symptoms
More severe and rare neurological complications include strokes, brain inflammation, delirium, and nerve damage.
1. Symptoms of the first wave:[10][11][12][13]
- A high temperature
- A new, continuous cough
- A loss of, or change to, the patient's sense of smell or taste
2. Symptoms of the second wave:
Compared with the first wave, the key characteristics of the second wave include:[7][8]
- More patients were children or pregnant and post-partum women
- Renal and gastrointestinal symptoms presented more frequently
- Patients were more often treated with non-invasive mechanical ventilation and corticoids, and less often with invasive mechanical ventilation, conventional oxygen therapy and anticoagulants
3. Differences between COVID-19 and flu:
Differences | COVID-19 | Flu
---|---|---
Virus family | SARS-CoV-2 | Any of several different types and strains of influenza viruses
Epidemiology | Spreads more easily, is more serious and more contagious, and takes longer for symptoms to appear (two days to two weeks[9][11]) | Takes less time for symptoms to appear (one to four days[9])
Treatment | Currently available only in intravenous form | Oral antiviral medications
Vaccine | Partly available, in development | Available

Similarities | COVID-19 and flu
---|---
Symptoms | Fever, cough, body aches, and sometimes vomiting and diarrhea
Spread | Both transmit the virus to other people nearby
Treatment | Both are treated by addressing symptoms
Prevention | Mask-wearing; hand washing; staying home

Policies
To help slow the spread of COVID-19, many countries and regions have issued advice to the public. OxCGRT organizes the current official measures into four categories[5]:
1. Containment and closure
- School closing
- Workplace closing
- Cancel public events
- Restrictions on gathering size
- Close public transport
- Stay at home requirements
- Restrictions on internal movement
- Restrictions on international movement
2. Economic response
- Income support
- Debt/contract relief for households
- Fiscal measures
- Giving international support
3. Health systems
- Public information campaign
- Testing policy
- Contact tracing
- Emergency investment in healthcare
- Investment in Covid-19 vaccines
- Facial coverings
4. Miscellaneous
- Other responses
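OxCGRT records policy indicators such as those in the categories above on ordinal scales and combines them into response indices (see the OxCGRT description under Datasets). The sketch below shows one simplified way such a combination could work: each ordinal value is rescaled to 0-100 and the results are averaged. This is an illustrative assumption, not the official OxCGRT formula; the function names, indicator keys, and maximum values are hypothetical.

```python
def sub_index(value, max_value):
    """Rescale one ordinal indicator value to the range 0-100."""
    if max_value <= 0 or not 0 <= value <= max_value:
        raise ValueError("value must lie between 0 and a positive max_value")
    return 100.0 * value / max_value


def composite_index(indicators):
    """Average the rescaled sub-indices.

    `indicators` maps an indicator name to a (value, max_value) pair.
    """
    if not indicators:
        raise ValueError("at least one indicator is required")
    scores = [sub_index(value, max_value) for value, max_value in indicators.values()]
    return sum(scores) / len(scores)


# Hypothetical ordinal codes for one country on one day (assumed scales).
example_day = {
    "school_closing": (3, 3),
    "workplace_closing": (2, 3),
    "cancel_public_events": (1, 2),
    "stay_at_home_requirements": (0, 3),
}

print(round(composite_index(example_day), 2))
```

Averaging rescaled sub-indices keeps indicators with different ordinal ranges on a comparable footing, which is the main reason for rescaling before combining.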
Datasets
With the global spread of COVID-19, datasets covering COVID-19 cases, the policies implemented against COVID-19, the economic measures taken in response to its impact, and related topics have been made public worldwide. The following four types of datasets may be helpful for COVID-19 modeling. Note: Some datasets overlap with each other.
- Cases dataset
This dataset consists of two parts: daily cases and deaths (COVID Intel Database) and testing data (Our World in Data).
- COVID Intel Database:
Consists of the number of confirmed COVID-19 cases and deaths worldwide. The data is updated daily.
The data includes the name of a country, the WHO region, cases (cumulative total), cases (cumulative total per 1 million population), cases (newly reported in last 7 days), cases (newly reported in last 24 hours), deaths (cumulative total), deaths (cumulative total per 1 million population), deaths (newly reported in last 7 days), deaths (newly reported in last 24 hours), and transmission classification. Data types are divided into numeric and textual data.
The dataset also covers eleven other aspects, such as hospitalizations, vaccinations, and mortality risks.
- Our World in Data:
Focuses on the COVID-19 tests each country has performed. Data types are divided into numeric and textual data.
The dataset for the 2019 Novel Coronavirus Visual Dashboard is operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), with support from the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL). This dataset mainly consists of confirmed COVID-19 cases and deaths.
The data sources comprise aggregated data sources, US data sources at the state (Admin1) or county/city (Admin2) level, and non-US data sources at the country/region (Admin0) or state/province (Admin1) level.
- Policies dataset
The Oxford COVID-19 Government Response Tracker (OxCGRT) is a program established by the Blavatnik School of Government at the University of Oxford, UK. Data on seventeen different indicators of government response are recorded. These include containment, economic, and health system policies. Non-quantitative indicators are converted to ordinal scales based on a series of criteria. These are combined into a number of response indices on a quantitative scale.
The Oxford team collects information on common policy responses, scores the stringency of such measures, and aggregates these into a Stringency Index.
The data is collected from publicly available information by a cross-disciplinary Oxford University team of academics and students from every part of the world, led by the Blavatnik School of Government.
The team collects publicly available information on a number of indicators of government response. The first seven indicators (S1-S7), covering policies such as school closures and travel bans, are recorded on an ordinal scale; the remainder are financial indicators such as fiscal or monetary measures.
This dataset contains key characteristics about the data described in the Data Descriptor Response2covid19, a dataset of government responses to COVID-19 all around the world.
- The dataset is of interest to epidemiologists who wish to link government measures worldwide to developments in the number of cases.
- The dataset is also of interest to social scientists interested in the impact of other factors, e.g., democracy or institutions, on the rigidity and the timing of the measures taken.
- The coding of the economic measures is also useful for relating economic interventions to economic outcomes such as the gross domestic product or national financial market indices.
- Mobility dataset
The Apple Mobility Data is anonymised and aggregated. It was harvested from users of the mobile phone application Apple Maps and made publicly available by Apple. All data are presented relative to a baseline established on Jan 13, 2020, with days defined as midnight to midnight PST. There are generally marked differences by day of week (e.g. weekend effects), and the data are likely to be affected by seasonal mobility changes. These data are generally made available with a one-to-two-day delay and are used in accordance with Apple’s Terms and Conditions.
Google has provided anonymously aggregated mobility data from users of the mobile phone application Google Maps. Data are informed by visits and length of stay at locations and are available from users that share their location history with Google. All data are presented relative to a baseline, which is the median value, for the day of the week, for the five-week period between Jan 3 and Feb 6, 2020. Because of this, the data are less influenced by day-of-week biases but are influenced by normal differences in seasonal usage. These data are generally made available with approximately a one-week delay and are used in accordance with Google’s Terms and Conditions.
- Research dataset
- Biomedical data
- Case statistics data
- Competition data
- Research article data
- Other data
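The Google mobility baseline described above (the median value, for each day of the week, over the five-week period from Jan 3 to Feb 6, 2020) can be sketched as follows. The input layout (a mapping from date to a raw activity value), the function names, and the percent-change output are assumptions made for illustration, and the sample values are synthetic rather than real mobility data.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median


def weekday_baseline(daily, start, end):
    """Median value per weekday (0 = Monday) over the dates in [start, end]."""
    buckets = defaultdict(list)
    for day, value in daily.items():
        if start <= day <= end:
            buckets[day.weekday()].append(value)
    return {weekday: median(values) for weekday, values in buckets.items()}


def percent_change(daily, baseline):
    """Percent change of each day relative to the baseline for its weekday."""
    return {
        day: 100.0 * (value - baseline[day.weekday()]) / baseline[day.weekday()]
        for day, value in daily.items()
        if day.weekday() in baseline
    }


# Synthetic activity values: 100 on weekdays and 150 on weekends in the baseline window.
start, end = date(2020, 1, 3), date(2020, 2, 6)
daily = {}
day = start
while day <= end:
    daily[day] = 150.0 if day.weekday() >= 5 else 100.0
    day += timedelta(days=1)
daily[date(2020, 3, 16)] = 60.0  # a later Monday with reduced activity

baseline = weekday_baseline(daily, start, end)
change = percent_change(daily, baseline)
print(change[date(2020, 3, 16)])  # -40.0
```

Comparing each day with the baseline for the same weekday is what removes the weekend effect, whereas a single-day baseline such as the Jan 13, 2020 reference used for the Apple data leaves day-of-week differences in the series.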