America was unprepared for the scale of the pandemic, which overwhelmed many counties and filled some hospitals to capacity. A new paper in PNAS suggests that there may have been a mathematical method, of sorts, to the craziness of those early days of COVID.
The study tests a model that closely matches patterns of reported case counts and deaths, county by county, across the United States between April 2020 and June 2021. The model suggests that unprecedented COVID spikes could, even now, overwhelm local jurisdictions.
“Our best guess, based on the data, is that the number of cases and deaths per county has infinite variance, meaning a county could be hit with a huge number of cases or deaths. We cannot reasonably predict that any county will have the resources to deal with extremely large and rare events, so it is crucial that counties, as well as states and even countries, make plans, in advance, to share resources.”
Joel Cohen, Rockefeller University
Predicting 99% of a pandemic
Ecologists might have guessed that the spread of COVID cases and deaths would at least roughly conform to Taylor’s law, a formula that relates a population’s mean to its variance (a measure of the dispersion around the mean). From fluctuating crop yields to the frequency of tornado outbreaks to the proliferation of cancer cells, Taylor’s law forms the backbone of many statistical models that experts use to describe thousands of species, including humans.
But when Cohen started researching whether Taylor’s law could also describe the grim COVID statistics provided by The New York Times, he was surprised.
Ninety-nine percent of county case and death counts between April 2020 and June 2021 conformed to a “lognormal” version of Taylor’s law, which predicts that the variance of cases or deaths in each location will be proportional to the squared mean of cases or deaths. For example, if the average number of cases per county is 50 in Arizona and 100 in California, this version of Taylor’s law would predict that the variance of the number of cases in California would be four times greater than the variance of the number of cases in Arizona. Similarly, if the average number of cases per county in those two states were 50 and 150, respectively, the variance would be nine times greater in California.
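The arithmetic in that example can be checked with a short sketch. It assumes only the relationship stated above, variance proportional to the mean raised to an exponent of 2; the function name `variance_ratio` is hypothetical and is not taken from the study.

```python
def variance_ratio(mean_a: float, mean_b: float, exponent: float = 2.0) -> float:
    """Predicted ratio var_b / var_a when variance = constant * mean ** exponent.

    The proportionality constant cancels in the ratio, so only the two
    means and the exponent are needed.
    """
    return (mean_b / mean_a) ** exponent


# Illustrative county averages from the example in the text.
print(variance_ratio(50, 100))  # 4.0
print(variance_ratio(50, 150))  # 9.0
```

Doubling the mean quadruples the predicted variance, and tripling it gives a ninefold increase, because the exponent is 2.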
However, the top 1 percent of case and death counts did not fit the lognormal distribution. Instead, these high numbers followed a Pareto distribution, a pattern more often observed in economics than in biology, in which extremely high values occur rarely but consistently (think of the distribution of income or wealth). What made this particular Pareto distribution notable was that it also had infinite variance, implying that the dispersion would increase beyond any finite limit as the number of observed cases or deaths grew. The challenge was to understand why even the top 1% of the counts still complied with Taylor’s law with the same exponent as the bottom 99%.
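To make “infinite variance” concrete, here is a minimal sketch using the textbook Pareto distribution, whose variance is finite only when its tail index is greater than 2. The tail index values below (1.5 and 3.0) are illustrative assumptions, not figures estimated in the study, and `pareto_variance` is a hypothetical helper name.

```python
import math
import random
import statistics


def pareto_variance(alpha: float, x_min: float = 1.0) -> float:
    """Theoretical variance of a Pareto (Type I) distribution.

    alpha is the tail index and x_min the smallest possible value.
    For alpha <= 2 the variance is infinite (or undefined), so we
    return math.inf in that case.
    """
    if alpha <= 2:
        return math.inf
    return (x_min ** 2 * alpha) / ((alpha - 1) ** 2 * (alpha - 2))


print(pareto_variance(3.0))  # finite: 0.75
print(pareto_variance(1.5))  # inf

# With alpha = 1.5 (an assumed value), the sample variance does not settle
# down as the sample grows; rare, enormous draws keep dominating it.
rng = random.Random(0)
for n in (1_000, 100_000):
    sample = [rng.paretovariate(1.5) for _ in range(n)]
    print(n, statistics.pvariance(sample))
```

The simulated values depend on the random seed, but the qualitative point holds: with a heavy enough tail, collecting more data does not tame the variance.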
“It was a headache,” recalls Cohen. “And I sat on this puzzle, taking it out every once in a while, torturing it a bit and putting it away. Until one day I called in the heavy artillery.”
The remaining 1%
Cohen sent his computer simulations and unproven conjectures to Richard A. Davis of Columbia University and Gennady Samorodnitsky of Cornell University, asking for their input. A few months later, the two sent him some theorems: the missing proof that Taylor’s law would hold even for the top 1% of counts, which follow a Pareto distribution, with the same exponent as the 99% that are lognormally distributed. “These theorems helped prove that Taylor’s law accurately describes all data,” Cohen said. “The pandemic produced an ordered pattern of number of cases by county and deaths by county. The unexpected part of this ordering was that, in the most extreme cases, there was no limit to how bad things got.”
Infinite Variance, Almost Infinite Problems
Why the pandemic follows this hybrid (lognormal-Pareto) version of Taylor’s law so closely is unclear. One possibility is that Taylor’s law – which describes the variance of many ecological systems, including infectious diseases like measles and Chagas disease – simply captures the nature of infection. If a patient infects two people (with some probability) and each of those two patients infects two other people (with some probability), we would expect cases to increase exponentially (with some probability), and occasional random events could result in infinite variance.
Cohen hopes the study will sound alarm bells for policymakers. Infinite variance in cases and deaths per county means there is a highly unlikely but possible scenario in which a spike in COVID makes every individual in that county sick, or worse. Although the advent of vaccines makes such a scenario increasingly unlikely, regions in the United States and abroad with low vaccination rates still face the possibility of spikes they cannot handle.
The calculations, Cohen says, suggest that COVID cases and deaths could far exceed the capacity of local jurisdictions to cope. “Governments had better be ready to call on their friends,” he says.
Cohen, J., et al. (2022) US COVID-19 cases and deaths follow Taylor’s law for heavy-tailed distributions with infinite variance. PNAS. doi.org/10.1073/pnas.2209234119.