Created on 12 Mar 2020; Modified on 12 Mar 2020; Translation: Italian

Coronavirus Covid-19 outbreak, some official data sources


This article does not claim to be exhaustive and is published without any guarantee that it will remain correct over time. Use it carefully, and check the links before relying on them. The linked data are the responsibility of the agencies that publish them.


On the subject of the coronavirus, it is important to stick to reliable data when reasoning about what is happening.

For this reason, I list here some of the addresses available on the web that I consider primary sources, so that the data on the spread of the epidemic can be read firsthand.

First comes the list of addresses in question. Then I conclude with some personal considerations.

Some available web addresses

UN World Health Organization: Situation Reports.

This is a list of PDF files in English. They are fine for reading in person, less so for extracting data automatically. The history goes back to 21 Jan 2020.

China China's National Health Commission: Daily Briefing

This is a list of HTML pages in English. These too are fine for reading, but not well suited to automatic extraction of data for analysis (even though mining HTML is simpler than mining a PDF). The history goes back to 25 Jan 2020.

EU European Centre for Disease Prevention and Control: Download today’s data on the geographic distribution of COVID-19 cases worldwide

An HTML page with a link to an Excel file containing the daily counts of infections and deaths for various countries of the world. It is well suited to extracting data for analysis, provided you remember to sum the daily values to obtain the cumulative figures. The data start on 31 Dec 2019.
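The summing step mentioned above can be sketched as follows. This is only a minimal illustration, assuming the Excel file has already been exported to rows with the hypothetical keys `date` (ISO format), `country`, `cases` and `deaths`; the real column names in the ECDC file may differ, and the figures below are made up.

```python
from collections import defaultdict
from itertools import accumulate


def cumulative_by_country(rows):
    """Turn daily counts into running totals, one series per country.

    `rows` is an iterable of dicts with the hypothetical keys
    'date' (ISO string), 'country', 'cases' and 'deaths'.
    """
    per_country = defaultdict(list)
    for row in rows:
        per_country[row["country"]].append(row)

    result = {}
    for country, items in per_country.items():
        # ISO dates sort chronologically when compared as strings.
        items.sort(key=lambda r: r["date"])
        dates = [r["date"] for r in items]
        cum_cases = list(accumulate(int(r["cases"]) for r in items))
        cum_deaths = list(accumulate(int(r["deaths"]) for r in items))
        result[country] = list(zip(dates, cum_cases, cum_deaths))
    return result


# Made-up figures for illustration only; these are not real ECDC data.
sample_rows = [
    {"date": "2020-01-02", "country": "A", "cases": 3, "deaths": 1},
    {"date": "2020-01-01", "country": "A", "cases": 1, "deaths": 0},
    {"date": "2020-01-03", "country": "A", "cases": 5, "deaths": 0},
    {"date": "2020-01-01", "country": "B", "cases": 2, "deaths": 0},
    {"date": "2020-01-02", "country": "B", "cases": 0, "deaths": 1},
]

for country, series in cumulative_by_country(sample_rows).items():
    print(country, series)
```

Sorting by date before accumulating matters, because the running total is only meaningful if the daily rows are taken in chronological order.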

UK Government Services: Number of coronavirus (COVID-19) cases and risk in the UK

This is an HTML page in English. Structured in several sections, it presents overall infection data together with risk warnings, symptoms, and so on. Data extraction is relatively simple, since the figures are in one dedicated section. There is no history of the data.

USA Centers for Disease Control and Prevention (CDC): Cases in US

This is an elaborate HTML page in English. It presents maps and data of various kinds, with different types of aggregation, but it reports only the overall figures. It is good for reading. Data extraction is relatively simple because the data are grouped in a single HTML table. Curiously, this table has its time axis laid out horizontally, which makes manual consultation awkward. One would think it was done on purpose ... The history goes back to 11 Jan 2020.
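A table with the dates running across the columns is easier to work with once it is turned into one row per date. The following is a rough sketch of that reshaping, assuming the HTML table has already been saved as CSV text with a label column (here hypothetically named `measure`) followed by one column per date; the layout and the numbers are invented for illustration and do not reproduce the CDC page.

```python
import csv
import io


def wide_to_long(csv_text, label_column):
    """Reshape a table with dates as columns into (date, label, value) rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    long_rows = []
    for row in reader:
        label = row[label_column]
        for column, value in row.items():
            if column == label_column:
                continue  # skip the label itself, keep only the date columns
            long_rows.append((column, label, int(value)))
    long_rows.sort()  # ISO dates sort chronologically as strings
    return long_rows


# Invented sample with a hypothetical layout; not real CDC figures.
sample = "measure,2020-01-11,2020-01-12,2020-01-13\ncases,1,2,4\n"

for record in wide_to_long(sample, "measure"):
    print(record)
```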

USA Johns Hopkins, Center for Systems Science and Engineering: Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE

Where the US government does not reach, American universities patch things up nicely. Johns Hopkins publishes a map visualizing the spread of the virus. This map has become the reference for all the main news channels. In addition, the data behind the map are published on GitHub, at the address linked at the top of this section.

At the same GitHub address, Johns Hopkins lists the links to the sources from which it extracts the data used to populate its database. I recommend it to anyone interested in extracting data for analysis.

Italy Here the situation is more complex and there are at least two official sources:


I usually criticize the ruling political class, regardless of its color.

But in this case I must say that I appreciate the transparency policy adopted by the Italian Council of Ministers. In particular, I approve of the approach implemented by Civil Protection, which:

  • communicates the infection data daily, detailing even the number of hospitalized patients, a figure that, for example, is not available for the highly advanced USA;
  • presents the information in two different ways:
    • as graphics, for quick reference (although I am critical of the graph of the national trend over time: I would have preferred a different scale);
    • in tabular form, as CSV files that are easy to download and to analyze with statistical tools;
    • with the latter aggregated at three different granularities: national, regional and provincial.

This is true transparent information: open and complete.

My compliments also to China. Although in a form optimized for human consultation, it nevertheless exposes a lot of data. I will not comment on the controversies I have read about the method of data collection, which initially could be interpreted as aimed at keeping the numbers down. It is possible that these rumors have some foundation of truth, also in light of the purges among the officials of Hubei province over the failures in the initial management of the infection.

The European Union (EU) is also doing very well. Only the essential data are present: infections and deaths. But the whole history is there, from the official beginning of the epidemic in China. And, more importantly, data are reported for all countries.

I am disappointed by the USA. The CDC page is attractive and elaborate, but poor in published data. In my opinion, Trump's policy, and not only towards the coronavirus, is ruining this great nation. And I am not alone in this opinion. For example, it is interesting to read this article: The Trump Administration’s Misinformation Machine, published by Scientific American.

Finally, I must give special mention to the American university Johns Hopkins. They did a remarkable job: they were the first to collect the data and display them on a map. The Italian Civil Protection website is derived from theirs.

Enjoy. ldfa