About this dashboard

Johns Hopkins University maintains a database of world-wide confirmed COVID-19 cases (GitHub, Dashboard). This dashboard lets you explore the data further by visualising data regarding the total confirmed cases, confirmed deaths and confirmed recoveries. These numbers can be seen nominally as well as relative to the country's population. You can compare these data across countries and you can fit exponential growth models to individual countries to see how the disease has spread so far.

Authors: This dashboard was created by Nikolaj Theodor Thams, Martin Emil Jakobsen and Phillip Bredahl Mogensen, all PhD students in Statistics at the University of Copenhagen. We would be happy to receive your feedback.

COVID19 data: Johns Hopkins University (GitHub, Dashboard).

SSI data: Danish State Serum Institute (reports).

Update frequency: The Johns Hopkins GitHub database is currently updated once daily, shortly after which we feed this dashboard with the updated data. The SSI reports are published daily between 12.00 and 13.00 Central European Time.

Validity of data: The COVID-19 situation is developing rapidly, and we are very grateful for the work done by Johns Hopkins. While the data may contain reporting errors, these are often corrected quickly.

Population data: https://www.worldometers.info/world-population/population-by-country/

Terms of use: Please be aware of the terms of use for the Johns Hopkins data. If you wish to use our dashboard, you must comply with these terms. Further, we take no responsibility for the correctness of the displayed data, the processing or the modelling.








Model description:

Here we fit an exponential growth model, regressing the cumulative number of confirmed COVID-19 cases on the time passed since the first confirmed case. For many countries this seems to be a reasonable model, although it is far from rich enough to describe all of them. In China, for example, the disease spread has largely plateaued and therefore cannot be described properly by an exponential growth model. However, in order to show an output for every country, we simply present the best fit within the model class, no matter how good that fit is.

Details:

We fit models of the form: \[ \text{Total cases}_i = (1+r)^{t_i-l} + \text{noise}_i. \] The parameter \(r\) is the infection rate (i.e. the multiplicative daily increase in the number of infected people) and \(l\) is the lag-phase duration (i.e. the time until the curve 'explodes'). The covariate \(t_i\) is the number of days since the first confirmed case.
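As a rough illustration of how the two parameters can be recovered from data, the sketch below fits the curve by ordinary least squares on the logarithm of the case counts. This is an assumption made for simplicity: it treats the noise as multiplicative, whereas the model above has additive noise, so it is not necessarily the procedure used by the dashboard. The function name `fit_exponential` and the data are hypothetical.

```python
import math

def fit_exponential(days, total_cases):
    """Estimate r and l in total_cases ~ (1 + r) ** (t - l).

    Taking logs gives log(cases) = a + b * t with b = log(1 + r) and
    a = -l * log(1 + r), so a straight-line fit recovers both parameters.
    Assumption: log-linear least squares, not a nonlinear fit with additive noise.
    """
    # Keep only positive counts, since log(0) is undefined.
    pairs = [(t, math.log(y)) for t, y in zip(days, total_cases) if y > 0]
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_z = sum(z for _, z in pairs) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pairs)
    sxz = sum((t - mean_t) * (z - mean_z) for t, z in pairs)
    slope = sxz / sxx                      # estimate of log(1 + r)
    intercept = mean_z - slope * mean_t    # estimate of -l * log(1 + r)
    r = math.exp(slope) - 1
    l = -intercept / slope
    return r, l

# Synthetic, noise-free data generated from the model itself (not real case counts).
true_r, true_l = 0.25, 10.0
days = list(range(12, 40))
cases = [(1 + true_r) ** (t - true_l) for t in days]

r_hat, l_hat = fit_exponential(days, cases)
print(f"r = {r_hat:.3f}, l = {l_hat:.2f}")
```

On real data the two fitting approaches can give noticeably different estimates, because the log transform gives more weight to the early, small counts.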

How to interpret the models:

There are two parameters of interest in the models that we fit here: the rate of infection (\(r\)) and the lag-phase duration (\(l\)). How to interpret them can be illustrated with a small example:

Suppose that Sweden has a rate of infection \(r = 0.33\) and a lag-phase duration \(l = 19.6\); these are the estimated values for the first \(45\) days after the first confirmed case in Sweden. We loosely interpret the lag phase as the time it takes until the disease spread really kicks in. That is, up until the \(20^{\text{th}}\) day, the outbreak was relatively contained. After this point, we start seeing exponential growth in the number of confirmed cases. In particular, after this point we estimate that the number of people infected by the virus increases by \(33\%\) per day; this is exactly the infection rate. That is, if there are \(100\) people infected today, we expect that there will be \(100\cdot 1.33 = 133\) people infected tomorrow.
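The arithmetic of the Sweden example can be checked with a few lines of Python. This is only a sketch using the two estimates quoted above. The function names are hypothetical, and the doubling time is a derived quantity that the dashboard itself does not report.

```python
import math

r, l = 0.33, 19.6  # Sweden estimates quoted in the text above

def fitted_total_cases(t):
    """Fitted curve (1 + r) ** (t - l), with t in days since the first confirmed case."""
    return (1 + r) ** (t - l)

def next_day(infected_today):
    """Expected count one day later under a daily growth rate r."""
    return infected_today * (1 + r)

# Days needed for the count to double at a constant daily rate r.
doubling_time = math.log(2) / math.log(1 + r)

print(round(next_day(100)))     # 133
print(round(doubling_time, 2))  # 2.43
print(fitted_total_cases(l))    # 1.0, the curve passes through 1 at t = l
```

A daily rate of 33% therefore corresponds to the count doubling roughly every two and a half days.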

Limitations:

While these models appear to describe the disease spread accurately in many countries at the time of writing (March 16, 2020), there are obvious limitations to this model class. The most obvious is that we in no way account for the fact that people will start to recover at some point, which reduces the number of current infections. That is, we do not model the current number of infections. Instead, we model the cumulative number of infections, i.e. how many infections there have been in total throughout the lifetime of the outbreak. The second limitation is that, after a while, we will start seeing a plateau in the number of infections. The worst-case scenario is that everyone becomes infected, which would cause the curve to flatten; this is not captured by our models. A more likely scenario is that the disease spread becomes contained through both active and passive interventions, e.g. government interventions and increased awareness. This would, again, cause the curve to flatten. These models, then, should only be thought of as a description of how the initial outbreak (up to any interventions) looked for a specific country.
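To make the plateau limitation concrete, the sketch below compares the exponential curve with a saturating (logistic) curve that levels off at a fixed ceiling. The logistic curve is not part of the dashboard's model class. It is used here only as an assumed illustration, and the ceiling of 10 million is a hypothetical number.

```python
def exponential_total(t, r, l):
    """Cumulative cases under the exponential model (1 + r) ** (t - l)."""
    return (1 + r) ** (t - l)

def logistic_total(t, r, l, capacity):
    """Hypothetical saturating curve: tracks the exponential early on,
    then flattens as it approaches `capacity`."""
    growth = exponential_total(t, r, l)
    return capacity * growth / (capacity + growth)

r, l = 0.33, 19.6          # Sweden estimates from the example above
capacity = 10_000_000      # hypothetical ceiling, e.g. a population size

for t in (30, 60, 90, 120):
    exp_cases = exponential_total(t, r, l)
    log_cases = logistic_total(t, r, l, capacity)
    print(f"day {t:3d}: exponential {exp_cases:.3g}, saturating {log_cases:.3g}")
```

Early in the outbreak the two curves are practically identical, which is why the exponential fit can look convincing. Later the exponential curve exceeds any possible ceiling, while the saturating curve does not.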


              
Default case fatality rate: South Korea
Solid curves indicate confirmed numbers. Shaded regions are estimated number of infected, measured from illness onset.



About this analysis: In this analysis, we try to estimate the cumulative number of individuals who have actually been infected, as opposed to the reported number of infected individuals, which may be orders of magnitude lower.

While a substantial share of infections goes unreported, the death figures are more reliable. The number of deaths today gives an indication of the number of infected individuals 2-3 weeks in the past. Fundamentally, we model the number of infected individuals using data on the South Korean death rate, which is believed to be more accurate than many others due to the large number of tests performed in South Korea.

Furthermore, our approach incorporates varying demographics across countries and varying mortality across age groups. Technical details of our approach can be found in this white paper (which is still a work in progress).
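The basic idea of back-calculating infections from deaths can be sketched as follows. This is a deliberately crude illustration and not the method of the white paper: it assumes a single fixed lag between infection and death and a fatality rate that is a population-weighted average over age groups. All names and numbers below are hypothetical. In particular, the age shares, fatality rates and death counts are made up and are not South Korean or Johns Hopkins figures.

```python
def weighted_fatality_rate(age_shares, fatality_by_age):
    """Average the age-specific fatality rates, weighted by a country's age shares."""
    return sum(age_shares[group] * fatality_by_age[group] for group in age_shares)

def estimate_infected(cumulative_deaths, fatality_rate, lag_days):
    """Map each day's cumulative deaths to an estimated cumulative number of
    infected individuals `lag_days` earlier (deaths divided by fatality rate)."""
    return {
        day - lag_days: deaths / fatality_rate
        for day, deaths in cumulative_deaths.items()
    }

# Hypothetical inputs, for illustration only.
age_shares = {"0-59": 0.75, "60+": 0.25}        # share of population per age group
fatality_by_age = {"0-59": 0.002, "60+": 0.04}  # fatality rate per age group
cumulative_deaths = {30: 23, 31: 46}            # day -> cumulative deaths
lag_days = 17                                   # assumed, within the 2-3 weeks mentioned above

rate = weighted_fatality_rate(age_shares, fatality_by_age)
infected = estimate_infected(cumulative_deaths, rate, lag_days)
for day, estimate in sorted(infected.items()):
    print(f"day {day}: about {estimate:.0f} infected")
```

Because the estimate for a given day depends on deaths reported `lag_days` later, estimates for the most recent days are not available under this simple scheme.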

Hackathon: This methodology was developed as part of the #WirVsVirus hackathon (March 20th-22nd, 2020). See also the devpost page related to our solution and this 2-minute pitch describing our solution (both in German).

Data source: Johns Hopkins University (infection data) and UN data (demographic data).

Contributors: Rune Christiansen, Phillip Mogensen, Jonas Peters, Niklas Pfister, Nikolaj Thams.