Anti-clustering in the national SARS-CoV-2 daily infection counts
The noise in daily infection counts of an epidemic should be super-Poissonian due to intrinsic epidemiological and administrative clustering. Here, we use this clustering to classify the official national SARS-CoV-2 daily infection counts and check for infection counts that are unusual by being anti-clustered. For each 'country' i, we adopt a one-parameter model of ϕ_i' infections per cluster, dividing any daily count n_i into n_i/ϕ_i' 'clusters'. We assume that n_i/ϕ_i' on a given day j is drawn from a Poisson distribution whose mean is robustly estimated from the four neighbouring days, and calculate the inferred Poisson probability P_ij' of the observation. If the model is correct, the P_ij' values should be uniformly distributed. We find the value ϕ_i that minimises the Kolmogorov–Smirnov distance from a uniform distribution. We investigate the (ϕ_i, N_i) distribution, where N_i is the total infection count. We consider consecutive count sequences above a threshold of 50 daily infections. We find that most of the daily infection count sequences are inconsistent with a Poissonian model. All are consistent with the ϕ_i model. Clustering increases with total infection count for the full sequences: ϕ_i ∼ √(N_i). The 28-, 14- and 7-day least noisy sequences for several countries are best modelled as sub-Poissonian, suggesting a distinct epidemiological family. The 28-day sequences of DZ, BY, TR, AE have strongly sub-Poissonian preferred models, with ϕ_i^28 < 0.5; and FI, SA, RU, AL, IR have ϕ_i^28 < 3.0. Independent verification may be warranted for those countries with unusually low clustering.
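The fitting procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the median is assumed as the robust estimator over the four neighbouring days, n_i/ϕ_i' is rounded to an integer number of clusters, a mid-p cumulative probability stands in for P_ij', and ϕ_i is found by a simple grid search. All function names and the synthetic data generator are hypothetical.

```python
import math
import random
from statistics import median


def poisson_mid_p(k, mu):
    """Mid-p cumulative probability P(X < k) + 0.5 * P(X = k), X ~ Poisson(mu)."""
    def log_pmf(j):
        return j * math.log(mu) - mu - math.lgamma(j + 1)
    below = sum(math.exp(log_pmf(j)) for j in range(k))
    return min(1.0, below + 0.5 * math.exp(log_pmf(k)))


def cluster_probabilities(counts, phi):
    """Probability of each day's cluster count, given its four neighbouring days."""
    probs = []
    for j in range(2, len(counts) - 2):
        neighbours = [counts[j - 2], counts[j - 1], counts[j + 1], counts[j + 2]]
        mu = median(neighbours) / phi      # assumption: median as robust mean
        if mu <= 0:
            continue                       # skip days with no usable estimate
        k = round(counts[j] / phi)         # assumption: integer cluster count
        probs.append(poisson_mid_p(k, mu))
    return probs


def ks_distance_uniform(probs):
    """Kolmogorov-Smirnov distance between the sample and Uniform(0, 1)."""
    xs = sorted(probs)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def best_phi(counts, grid):
    """Grid value of phi whose probabilities are closest to uniform."""
    return min(grid, key=lambda phi: ks_distance_uniform(
        cluster_probabilities(counts, phi)))


def simulate_counts(days, clusters_per_day, phi_true, seed=0):
    """Synthetic test data: phi_true infections per Poisson-distributed cluster."""
    rng = random.Random(seed)
    threshold = math.exp(-clusters_per_day)
    counts = []
    for _ in range(days):
        k, p = 0, 1.0
        while True:                        # Knuth's Poisson sampler
            k += 1
            p *= rng.random()
            if p <= threshold:
                break
        counts.append(phi_true * (k - 1))
    return counts


counts = simulate_counts(days=200, clusters_per_day=30, phi_true=10)
print(best_phi(counts, [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]))
```

In this sketch, a preferred ϕ well above 1 indicates clustered (super-Poissonian) counts, while a value below 1 corresponds to a sequence that is smoother than Poisson noise would allow. Because the neighbour-based mean is itself noisy, the recovered value on synthetic data need not equal the input exactly.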