Media reports and government data repeatedly present ‘numbers of Covid-19 cases’. However, the figure presented is not the number of all cases, but some fraction of it. It is typically not even the government's best guess of case numbers: it is the smaller number, ‘confirmed cases’.
‘Confirmed cases’ is presented as effectively being the total ‘cases’ of Covid-19. Since only a small percentage of people are tested, there are clearly far more actual cases than confirmed cases, which means the statistics presented are misleading.
- Government experts who do calculate estimates of actual cases typically state actual cases to be in the range of 8 to 16 times higher than confirmed cases.
- ‘Confirmed cases’ can rise at a time when the number of people infected is falling, and can fall when the number of new infections is rising.
- Use of ‘Confirmed cases’ can most certainly mask the real severity of a Covid-19 outbreak.
This post explores the problems further, looks at why actual cases are so different from ‘confirmed cases’, and looks at the challenges of, and solutions for, having more accurate and useful data.
(Still an early version: updates and more links to sources to follow.)
- Problems Include:
- Why ‘cases’ is so different from ‘confirmed cases’
- How big is the difference?
- The wild variation of differences
- Alternatives – fixing the data
- Deaths are bad, but ‘confirmed cases’ need not be!
- sample data
No country tests everybody. Admittedly, there are countries or areas where there is no reason to even suspect people are infected.
However, at this time, no country even tests everybody it thinks may be infected, as there is a limit to test kits.
Some countries come far closer to testing everyone with a reason to be tested, and may catch perhaps even 50% of all cases, yet others test fewer than 10% of those with a reason to be tested. When people who should be tested are not, the numbers get distorted.
Consider the UK, where Prince Charles recently became a confirmed case, yet despite the confirmation, questions were raised about whether he was eligible to be tested. This demonstrates that infected people may not even be eligible to be tested, and as a result, many people are missing from the stats.
In New York, the governor is clearly aware of how a surge in case numbers can reflect more testing rather than more cases:
“Mr. Cuomo has said the virus has spread so widely that the increase in the number of confirmed cases reflected New York’s added testing capacity more than anything else.” (NY Times)
Increase testing, and the headline will be ‘a surge in cases’, even if the extra testing is part of better controls that are seeing actual cases fall.
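To make the effect concrete, here is a minimal sketch with entirely made-up numbers. It assumes confirmed cases are simply actual new infections multiplied by the share of infections that testing finds; the names `new_infections` and `detection_rate` and all the figures are hypothetical, chosen only to show the direction of the effect.

```python
# Hypothetical illustration: none of these figures are real data.
# new_infections: actual new infections per week (falling).
# detection_rate: assumed share of infections that are tested and confirmed
# (rising as testing capacity is added).
new_infections = [10_000, 8_000, 6_000, 4_000]
detection_rate = [0.02, 0.05, 0.10, 0.20]


def confirmed_cases(infections, rates):
    """Confirmed cases are only the infections that testing actually finds."""
    return [round(i * r) for i, r in zip(infections, rates)]


confirmed = confirmed_cases(new_infections, detection_rate)

for week, (actual, found) in enumerate(zip(new_infections, confirmed), start=1):
    print(f"week {week}: actual {actual:>6}, confirmed {found:>4}")
```

In this sketch actual infections fall every week while confirmed cases rise every week, purely because a larger share of infections is being found.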
There are more reasons ‘confirmed cases’ differ from ‘people infected’, with the result that in some countries (like South Korea) there may be as few as 3 infected people per confirmed case, while in other countries there could easily be 1,000 infected people per confirmed case. The variation is huge.
Consider this question: how serious is the problem in each of Iran, Germany, Spain and the UK?
Checking situation report 65 from the WHO for March 25th, Iran has recorded 24,811 cases, Germany 31,554 cases, Spain 34,811 cases and the UK 8,086 cases.
Do we believe for a moment this represents the situation in each of these countries?
‘Cases’ would suggest (using Germany as a reference):
- Iran had around 4/5 as many infected people as Germany
- The UK had just over 1/4 as many infected people as Germany
- Spain had around 10% more infected people than Germany
But when we compare deaths, we see a different picture. Germany has the lowest number of deaths, around 1/3 of the number recorded in the UK, where less testing means fewer cases have been confirmed, despite the outbreak clearly being more severe in the UK than in Germany. Spain, with around 5x the cases of the UK, clearly has more than 5x the deaths, suggesting even less testing in Spain than in the UK. Iran, with around 60% of the number of cases of Spain, has 70% of the deaths of Spain, suggesting even less testing.
Evaluating severity by ‘cases’ (but using ‘confirmed cases’) makes the country doing the most testing appear to have the most severe outbreak, when generally more testing means better management of the outbreak and, all else being equal, fewer actual cases.
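A small sketch of how this ranking can flip, using two hypothetical countries. The infection counts and the shares confirmed by testing are invented for illustration, and `confirmed_count` is just a helper name for this example.

```python
def confirmed_count(actual_infections, share_confirmed):
    """Confirmed cases, given actual infections and the share that testing finds."""
    return round(actual_infections * share_confirmed)


# Hypothetical countries: every number here is an assumption for illustration.
# name: (actual infections, share of infections confirmed by testing)
countries = {
    "A (tests widely)": (50_000, 0.40),
    "B (tests little)": (200_000, 0.05),
}

confirmed_by_country = {
    name: confirmed_count(actual, share)
    for name, (actual, share) in countries.items()
}

for name, (actual, _) in countries.items():
    print(f"{name}: confirmed {confirmed_by_country[name]}, actual {actual}")
```

Under these assumptions, country A reports twice as many confirmed cases as country B, even though B has four times as many infected people.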
Exaggerated Mortality Rates
Simply put, mortality is calculated as the number of deaths divided by the number infected. The number of deaths counts actual people who die, yet the number of confirmed cases clearly excludes many people who were cases but did not die, so true mortality rates could be as low as one tenth of those calculated as deaths divided by confirmed cases. Because the
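The arithmetic can be sketched as follows. The death count, the confirmed case count and the undercount factor of 10 are all hypothetical, and `fatality_rate` is a helper name made up for this example.

```python
def fatality_rate(deaths, cases):
    """Deaths divided by cases, returned as a fraction."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


# Hypothetical figures, not real data.
deaths = 500
confirmed = 10_000
undercount = 10  # assumed number of actual cases per confirmed case

naive = fatality_rate(deaths, confirmed)
adjusted = fatality_rate(deaths, confirmed * undercount)

print(f"deaths / confirmed cases:        {naive:.1%}")
print(f"deaths / estimated actual cases: {adjusted:.1%}")
```

With these numbers the rate based on confirmed cases is 5%, while the rate based on the assumed actual cases is 0.5%: the same deaths, spread over ten times as many infections.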
Given that ‘more tests will yield more cases’, and some governments may feel that ‘government approval ratings will be impacted by high case numbers’, there is a disincentive to test in many countries. The WHO advises: “Test, test, test”.
Every additional confirmed case becomes a managed and controlled case, which leads to fewer actual cases. But governments fearing the polls may be reluctant to test.
Driving Misguided Policy
Using ‘confirmed cases’ as ‘cases’ leads to seriously misleading information, can drive bad government policy, and is seriously damaging to efforts to stop people dying and reduce economic disaster.
Severity of an Outbreak Can be Masked
Consider Indonesia. There has been very little testing, so reported confirmed cases are very low. But all logic suggests that the only reason reported cases are low is this lack of testing.
Consider Japan. After some initial apparent success controlling the outbreak, testing rules became tight with government authorisation required for testing. Is it a coincidence that case numbers remained low until the decision on the Olympics?
Encouraging Distorted Testing
Every country has rules that limit testing. With official case figures determined by tests, this can result in countries testing where they want to report cases, to enable the promotion of policies. Typically, policies such as the closure of borders, or decisions on the acceptance of asylum seekers, can be supported by testing heavily where supporting data is wanted.
No Actual Case Numbers are Calculated or Released.
It is possible that governments do have a handle on the real numbers but are choosing not to release that data. Basically, either governments do know the data and are keeping us in the dark, or they do not know, perhaps because they do not wish to know. I am not sure which option is the more frightening.
Either way, with ‘confirmed cases’ being accepted as a substitute, governments have no pressure to release actual case estimates.
Why ‘cases’ is so different from ‘confirmed cases’
One reason is that when locations are overwhelmed, they instruct everyone to quarantine, and tests are reserved for determining how to treat critically ill patients in hospital.
The highest numbers of cases occur in locations where health systems are overwhelmed. At one end of the spectrum, countries like South Korea made testing free and provided drive-in test locations, without even a requirement of symptoms.
At the other end of the spectrum, countries that develop a situation such as Italy's are so overrun they can only test patients who arrive in hospital. In Italy, if you do not need to be in hospital at this time (mid-March 2020), then the action to take is not changed by any test: be in quarantine regardless. This means a very high proportion of confirmed cases in Italy are very severe cases of Covid-19, as these are the only cases tested. Actual cases become a far bigger multiple of ‘confirmed cases’ than in data from areas where medical systems are not overrun.
How big is the difference?
From 10x to 1000x. This data is still to be added.
- UC Berkeley states that US cases are underreported by a factor of 9.
Deaths are bad, but ‘confirmed cases’ need not be!
Consider the early stages of the outbreak in Iran. Insufficient testing meant a relatively small number of confirmed cases. A multiple of deaths may be a more accurate way to estimate cases.
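As a rough sketch of the ‘multiple of deaths’ idea: if some fraction of infected people eventually die, then dividing deaths by that fraction gives an implied number of infections. The death count and the fatality rates below are assumptions for illustration only, not estimates for Iran or anywhere else, and `cases_from_deaths` is a hypothetical helper.

```python
def cases_from_deaths(deaths, assumed_fatality_rate):
    """Rough number of infections implied by a death count.

    assumed_fatality_rate is the assumed fraction of infected people who die.
    """
    if not 0 < assumed_fatality_rate <= 1:
        raise ValueError("assumed_fatality_rate must be in (0, 1]")
    return round(deaths / assumed_fatality_rate)


# Hypothetical inputs: 200 recorded deaths and a range of assumed rates.
deaths = 200
for rate in (0.005, 0.01, 0.02):
    implied = cases_from_deaths(deaths, rate)
    print(f"assumed fatality rate {rate:.1%}: about {implied} infections")
```

The answer depends heavily on the assumed rate, so a range is more honest than a single figure. Deaths also lag behind infections, so an estimate like this describes the number of people infected some time ago rather than today.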
The best solution is random tests and statistical approaches. I believe this was done in Iceland, with the results indicating 20x the level found by quite intensive targeted testing. I plan to add more on this.
Limits on testing kits and/or the resources to administer tests mean countries can rarely test every suspect case. But why not record the number of suspect cases and test a small sample of them? Then use statistics!
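A minimal sketch of what ‘use statistics’ could look like, assuming the tested suspects are a genuinely random sample of all recorded suspects. It uses a simple normal approximation for a rough 95% interval; the function name `estimate_infected` and all the numbers are hypothetical.

```python
import math


def estimate_infected(suspects, sample_size, sample_positives, z=1.96):
    """Estimate how many suspects are infected from a random sample.

    Returns (estimate, low, high), where low and high are a rough 95%
    interval from the normal approximation to a proportion.
    """
    if sample_size <= 0 or not 0 <= sample_positives <= sample_size:
        raise ValueError("need 0 <= sample_positives <= sample_size, sample_size > 0")
    p = sample_positives / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    low = max(0.0, p - margin)
    high = min(1.0, p + margin)
    return round(p * suspects), round(low * suspects), round(high * suspects)


# Hypothetical example: 20,000 recorded suspect cases, 500 of them tested
# at random, 100 of those tests positive.
estimate, low, high = estimate_infected(20_000, 500, 100)
print(f"estimated infected suspects: {estimate} (roughly {low} to {high})")
```

Here 500 tests give an estimate for all 20,000 suspects, with an interval that narrows as the sample grows. The approximation is poor for very small samples or when almost none or almost all of the sample tests positive.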
Goal: Best Possible Case Numbers.
The UK, Norway, Iceland and Sweden have all had estimates in one form or another calculated by government medical teams. Why stop at one estimate? Why not keep collecting data and continue to improve the estimates as more data becomes available?
For worldwide estimates, a body such as the WHO could look at global data and how testing is conducted in various countries to produce projections for countries without their own figures. In fact, national governments should have the motivation to do this anyway, in order to have the best data possible.
Every day we get reports like: “The number of coronavirus cases worldwide has now risen to XXX”, or “the number of new cases in our country today is YYY”. But these reports represent gross underreporting and distortion of what is really happening.
I have seen reports qualifying data as ‘known cases’, which is a step forward, but we need to continue.
A defence presented is “since we do not have actual cases, ‘confirmed cases’ is the best we have”. However, given the problems, no excuse justifies a lack of clarity on the difference between ‘confirmed cases’ and ‘actual case numbers’. Either the difference needs to be made clear, or preferably, we need to move to at least statistically calculated ‘best possible actual case numbers’, as suggested above in ‘alternatives’.