COVID-19: Strategies for Allocation of Test Kits

04/03/2020 ∙ by Arpita Biswas, et al. ∙ Microsoft ∙ Indian Institute of Science

With the increasing spread of COVID-19, it is important to systematically test more and more people. The current strategy for test-kit allocation is mostly rule-based, focusing on individuals having (a) symptoms for COVID-19, (b) travel history or (c) contact history with confirmed COVID-19 patients. Such testing strategy may miss out on detecting asymptomatic individuals who got infected via community spread. Thus, it is important to allocate a separate budget of test-kits per day targeted towards preventing community spread and detecting new cases early on. In this report, we consider the problem of allocating test-kits and discuss some solution approaches. We believe that these approaches will be useful to contain community spread and detect new cases early on. Additionally, these approaches would help in collecting unbiased data which can then be used to improve the accuracy of machine learning models trained to predict COVID-19 infections.




1 Background

South Korea, a country of 50 million people, has set an example of successfully flattening the curve of new COVID-19 infections by conducting over 400,000 tests [13] (Figure 2). This was achieved by setting up drive-through testing, allowing at least 10,000 people to be tested per day. South Korea’s foreign minister Kang Kyung-wha, in an interview with BBC News [2], said that “Testing is central because that leads to early detection, minimizes further spread, and quickly treats those found with the virus”. Several countries are suffering from severe community spread because of delays in testing [12], two prime examples being the United States and Italy. In the United States, with a population of 330 million, the number of confirmed cases is more than 230,000 with over 10,000 deaths, and these numbers are growing exponentially (Figure 3), whereas South Korea has 9,976 confirmed cases and 169 deaths (as of April 2, 2020). Thus, early testing and repeated testing at regular intervals are two of the key strategies to ensure a low fatality rate. However, for countries with a large population (more than 100 million), it is difficult to adopt exhaustive testing schemes because of the limited number of available test-kits and facilities. Testing many people with mild or no symptoms would occupy the limited testing resources, which could otherwise be used for high-risk patients. At the same time, it is important to test individuals with mild or no symptoms to detect asymptomatic cases [10], which calls for a method that systematically tests individuals for COVID-19.
Such systematic testing methods can be designed with various intents, namely, (a) early detection among health workers and other essential-service workers, (b) containment as well as prevention of community spread, (c) timely care for those who are at potential risk, and (d) collecting data for improving infection-risk assessment models and learning counterfactual scenarios, which is essential for determining medical demand, economic impact, and policy.

2 Current Testing Policy and Gaps

As per the new guidelines by the Indian Council of Medical Research (ICMR) [8], the current testing strategy mandates testing of all symptomatic contacts of confirmed cases, all symptomatic health workers, and all patients with fever, cough, and shortness of breath. For asymptomatic individuals with recent travel history, ICMR provides strict guidelines to be home-quarantined and get tested when symptomatic. All the asymptomatic direct contacts of confirmed cases are asked to get tested once between 5 and 14 days. Although the current testing method is targeted towards containing the COVID-19 infection, there are some gaps:

  1. The known cases are mostly self-reported. Thus, the testing happens only among the symptomatic cases and misses asymptomatic cases that do not report their contact with confirmed cases.

  2. Asymptomatic patients who get infected via community spread are missed.

  3. The data collected using the current testing strategy is heavily biased towards patients with symptoms of infections. This makes the data inappropriate to be used directly for training an infection-prediction model.

In particular, asymptomatic cases are the most important factor for community spread. Since these cases are hard to detect, the current testing policies may not be adequate to contain the spread of the virus [14]. Thus, it is important to allocate a separate budget of test-kits per day, targeted towards (a) detecting new cases early on and (b) collecting unbiased data. We formally define the test allocation problem in Section 3. Next, in Section 4, we propose some strategies for allocating a fixed number of test-kits across the population each day.

3 Test Allocation Problem

We assume that, on each day $i$, a budget of $K_i$ is declared for containing community spread, where $K_i$ represents the number of test-kits available on the next day, i.e., day $i+1$. We consider the problem of selecting a set of individuals on day $i$ and recommending that they take a test (to detect whether they are infected).

3.1 Problem Statement

We assume that there is a symptom-checker app, where individuals can fill in their personal information (such as age and gender), along with medical history (such as diabetes or cardiovascular diseases) and travel information (countries traveled to within the last 14 days). Individuals may use the symptom-checker app at any time of the day, and the symptom checker is required to decide whether to recommend testing to an individual. Note that this data could be filled in by the individuals on their own or by field-level health workers (ASHAs [1]), or even obtained from legacy health survey datasets. The symptom checker coordinates with the local testing facilities to estimate the number of test-kits available (for additional recommendations) on the next day (footnote 1).

Footnote 1: The coordination between the symptom-checker app and local testing centers would allow the symptom checker to allocate a dedicated time-slot to each individual for taking the test. This would prevent large gatherings at the test centers. In this paper, we do not tackle the problem of allocating fixed time-slots to individuals. However, this remains an interesting direction for future research.

There can be two modes of recommendation:

  1. Offline mode: Recommendation can be delayed till the end of the day, after all the users have registered with the symptom checker on a particular day $i$.

  2. Online mode: Recommendation is made as soon as the individual enters his/her information in the app.

The selected individuals are then supposed to undergo a test (footnote 2). It would typically take one more day to observe the test result (infected or not).

Footnote 2: Ensuring that individuals take the test, with the right incentives, mandates, and ethical safeguards, is yet another direction for further study.

Goal: Given a budget of $K_i$ test-kits, design a strategy to select at most $K_i$ individuals each day such that the observations (infected or not) for those individuals help in improving the predictive power of an infection-prediction model. This goal is aligned with the need to observe a sufficient number of individuals having distinct feature vectors. Thus, such a strategy is implicitly targeted towards collecting unbiased data as well as detecting new cases early on (even among asymptomatic cases), which in turn would help in containing community spread.

3.2 Notation

Let $X^{all}$ represent the entire population of the country (irrespective of whether they have filled in their details in the app). Let $X$ denote the set of all individuals who fill in their details in the symptom-checker app, possibly on different days. Clearly, $X^{all}$ is a superset of $X$. Let each individual be denoted by a vector of features $x$. These features capture demographic and health-related information for each individual. If an individual takes the test, then the label $y$ denotes the test result, i.e., whether $x$ is infected or not infected. Thus, $X$ can be represented as a matrix with rows representing individuals and columns representing features as well as the label $y$.

Let $X_i$ be the set of individuals who fill in their details in the symptom-checker app on day $i$, and consider the subset of $X_i$ who are selected for testing. We assume that the number of selected individuals is smaller than the size of $X_i$ (for most practical scenarios). Note that the labels for all selected individuals will eventually be available. Now consider the set of all the individuals who took the test. For all of them, the corresponding labels are observed (footnote 3). On the other hand, the labels will be unavailable for all individuals who were not tested, which is the case for most individuals in $X$ and $X^{all}$.

Footnote 3: In certain cases, an individual might be asked to take a test on multiple days for accuracy. That is still counted as a single test, and the associated label is assumed to be positive if any of the tests are positive.

In this work, we focus on the offline mode and propose some approaches in Section 4. We briefly discuss the online mode and other important considerations in Section 5.

4 Solution Approaches: Offline Mode

In the offline mode, the symptom checker is allowed to delay the recommendation of testing until the end of the day. This mode of operation would require the users to register before a specific cut-off time each day to be eligible for testing on the next day. Let the set of all eligible users on day $i$ be denoted as $X_i$. Within minutes of this deadline, the selected individuals would then be notified about getting the test done on the next day, i.e., day $i+1$.

In this section, we propose four approaches, namely (a) randomization across four disjoint buckets, (b) stratification based on features, (c) budgeted delayed contextual bandits, and (d) utility-based active learning. These solutions are aimed at selecting $K_i$ individuals among the population who fill in their details in the symptom checker on day $i$. The first two approaches ((a) and (b)) are easy to implement and can be quickly adopted for allocating test-kits, while the next two approaches ((c) and (d)) are more sophisticated and yield better results in the long term.

4.1 Randomization across four disjoint buckets

A naive approach is to assign each individual in the population to one of the following four mutually disjoint groups:

  1. symptomatic cases with risky contact or travel history (i.e., contact with confirmed cases or international travel),

  2. asymptomatic cases with risky contact or travel history,

  3. symptomatic cases without any risky travel or contact history, and

  4. asymptomatic cases without any risky travel or contact history.

Split the test-kits among the four buckets: allocate a budget to each group such that the four group budgets add up to the total daily budget. The budget of a group depends on how much importance should be given to that group while allocating test-kits. Then, randomly sample individuals from each group according to the budget allocated to that group.

Note that testing should remain mandatory for all those in the critical group (symptomatic cases with either recent travel history or contact history). Since there is an overlap with the mandatory cases, the budget can be adjusted accordingly.
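
The bucket-based allocation described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the dictionary keys (`id`, `symptomatic`, `risky_history`), the function names, and the per-group budgets used in the usage note are all assumptions made for the example.

```python
import random


def bucket_of(person):
    """Map a person to one of the four disjoint groups (1..4) listed above."""
    if person["risky_history"]:
        return 1 if person["symptomatic"] else 2
    return 3 if person["symptomatic"] else 4


def allocate_by_bucket(people, budgets, seed=None):
    """Randomly sample up to budgets[g] people from each group g.

    people  : list of dicts with hypothetical keys 'id', 'symptomatic', 'risky_history'
    budgets : dict {group: number of test-kits}; values should add up to the daily budget
    """
    rng = random.Random(seed)
    groups = {1: [], 2: [], 3: [], 4: []}
    for person in people:
        groups[bucket_of(person)].append(person["id"])
    selected = {}
    for g, members in groups.items():
        k = min(budgets.get(g, 0), len(members))  # cannot pick more people than the group has
        selected[g] = rng.sample(members, k)      # uniform sampling within the group
    return selected
```

For instance, `allocate_by_bucket(people, {1: 5, 2: 5, 3: 3, 4: 2})` would split a hypothetical daily budget of 15 kits across the four groups; if group 1 is already covered by mandatory testing, its budget could simply be set to zero.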

4.2 Stratification based on features

In this approach, we wish to select both the under-represented and over-represented groups to the same extent as they occur in the larger population $X^{all}$, especially when the features are not distributed evenly across the set $X_i$ (as observed by the symptom checker).

Let us assume that each person in the population has a feature vector which includes age as one of the features. A naive random test allocation strategy is one in which we sample uniformly from the population who have filled out the symptom checker (i.e., $X_i$). In this allocation strategy, the distribution across the sampled features would mimic the distribution across $X_i$. For example, if a large majority of the rows of $X_i$ contain individuals less than 60 years old, then the random allocation strategy is likely to select mostly individuals who are less than 60 years old and might miss out on the individuals above 60 years. We wish to avoid this scenario and ensure that there is adequate representation of the population in the chosen sample.

To sample across all features according to the distribution of features in $X^{all}$, we adopt the stratification approach. In this approach, we divide the population space (both $X_i$ and $X^{all}$) into groups (i.e., disjoint subsets) based on the values of some selected features. If we know the joint distribution of the population over this set of features in $X_i$ and $X^{all}$, we can use them to adjust the weight of each individual according to the group he/she falls in. This means that, if the overall population $X^{all}$ is uniformly distributed across all groups and an individual $x$ falls in a group with fewer samples in $X_i$, the weight, or probability of selecting that individual, will be higher than for a group with many samples in $X_i$.

Formally, we define the overall population as $X^{all}$, where each individual is represented by a vector $x$. Let $F$ be a selected set of features based on which the population is grouped. Let the joint distribution of the features in $F$ over the population $X^{all}$ be denoted as $P^{all}(v)$, where $v$ is a combination of values that the features in $F$ can take. Similarly, let $P_i(v)$ be the joint distribution for the population $X_i$ (the set of individuals who have reached out to the symptom checker on day $i$). We can use these values to assign a weight [15] to an individual $x$ whose features in $F$ take the value $v_x$; this weight is proportional to $P^{all}(v_x) / P_i(v_x)$.

Moreover, we also increase the weights of individuals who are at a higher risk. To find the utility values, let us assume that there is an external function $U(\cdot)$ that outputs the utility $U(x)$ associated with testing an individual $x$; see Section 5.3 for more discussion on $U(\cdot)$.

The weight on each individual $x$ can be computed as

$$w(x) = U(x) \cdot \frac{P^{all}(v_x)}{P_i(v_x)}.$$

This weight would then be used to perform a weighted random sampling of $K_i$ individuals from the pool of individuals in $X_i$.

Example: let us consider two features, namely, age and gender, for each individual in the population (footnote 4). We can denote the joint distribution of age and gender over the entire population by $P^{all}(\text{gender}, \text{age})$, which can be obtained using the information provided in the National Health Profile (see Table 1 for the state-wise distribution of age and gender or Table 2 for the overall distribution of age and gender) and the Census. We wish to select $K_i$ samples from $X_i$ such that the selected set of samples is representative of the population (either for a given state or for the overall country).

Footnote 4: We can do a similar stratification across geographical regions (e.g., states/districts), past medical conditions, and occupations.

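Now, we divide the population into groups by (gender, age) category, for example (Male, <20), (Male, 20-40), (Female, <20), (Female, 20-40), etc. For each group, the share of symptom-checker users who fall within that bucket represents the density of that bucket among the users. When selecting among the users, we weight the samples with the inverse of these values. Similarly, we multiply by the corresponding share of the overall population, so that the selected sample follows the overall distribution.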

Putting all these together, we provide a simple pseudo-code below.

    # Compute the fraction of X_i who belong to a particular (Gender, Age) category pair
    for each gender in [Male, Female]:
        for each age in [<20, 20-40, 40-60, 60-80, >80]:
            joint_users[gender, age] = count_of_samples_in(X_i, gender, age) / number_of_rows(X_i)

    # Let the joint probabilities be estimated for Gender and Age categories in X^all
    # using information provided in National Health Profile (using Table 1 or 2):
    for each gender in [Male, Female]:
        for each age in [<20, 20-40, 40-60, 60-80, >80]:
            joint_overall[gender, age] = proportion(X^all, gender, age)

    # Create a 1-D list of unique IDs
    ids = extract_unique_ids(X_i)

    # Assign weights to each individual depending on their Gender and Age categories
    weighted_probabilities = list(size=length(ids))  # Initialize a list of weights
    for j in 1 to length(ids):
        gender = X_i[j, "Gender"]
        age = X_i[j, "Age"]
        risk = U(X_i[j,])  # Assume function U(x) outputs risk score for each individual x
        weighted_probabilities[j] = risk * joint_overall[gender, age] / joint_users[gender, age]

    # Choose K_i items from the list of ids using weighted random sampling
    select = choice(list_of_candidates = ids,
                    number_of_items_to_pick = K_i,
                    adjusted_probability = weighted_probabilities,
                    sample_with_replacement = False)

Notes for implementation:

  • the joint probabilities joint_users[gender, age] may be zero for certain (gender, age) pairs, in which case computing the corresponding weights (weighted_probabilities[j]) would throw a division-by-zero error. This can be adjusted by assigning 0 to weighted_probabilities[j] whenever joint_users[gender, age] equals 0, or by using back-off methods for smoothing [3].

  • the function numpy.random.choice from the Python package NumPy generates a weighted random sample of a fixed size from a given array. The probabilities passed to it must sum to 1, so the weights need to be normalized first.
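
As a runnable counterpart to the pseudo-code, the following sketch uses only the Python standard library (instead of numpy.random.choice) so that it stays self-contained. The record keys (`id`, `gender`, `age_group`), the function names, and the sequential draw-and-remove sampling scheme are assumptions made for illustration; the `utility` argument stands in for the external function U(x).

```python
import random
from collections import Counter


def weighted_sample_without_replacement(ids, weights, k, rng):
    """Draw up to k distinct ids; each draw is proportional to the remaining weights."""
    pool = [(i, w) for i, w in zip(ids, weights) if w > 0]  # zero-weight items are never drawn
    chosen = []
    while pool and len(chosen) < k:
        pos = rng.choices(range(len(pool)), weights=[w for _, w in pool], k=1)[0]
        chosen.append(pool.pop(pos)[0])
    return chosen


def stratified_select(users, joint_overall, utility, k, seed=None):
    """Select up to k user ids with weight = utility * overall share / user share.

    users         : list of dicts with hypothetical keys 'id', 'gender', 'age_group'
    joint_overall : dict {(gender, age_group): share in the overall population}
    utility       : callable playing the role of U(x)
    """
    rng = random.Random(seed)
    counts = Counter((u["gender"], u["age_group"]) for u in users)
    n = len(users)
    weights = []
    for u in users:
        cell = (u["gender"], u["age_group"])
        share_users = counts[cell] / n                # joint_users in the pseudo-code
        share_overall = joint_overall.get(cell, 0.0)  # joint_overall in the pseudo-code
        weights.append(utility(u) * share_overall / share_users)
    return weighted_sample_without_replacement([u["id"] for u in users], weights, k, rng)
```

Because `share_users` is computed only for cells that actually occur among the users, the division never hits zero in this sketch; cells missing from `joint_overall` simply receive weight 0 and are skipped.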

4.3 Budgeted delayed contextual bandits

In this approach, we combine the classifier learning with “informative” data point selection. The algorithm’s goal is to iteratively refine the classifier so that the cost of missed detections or false positives is similar to the best you could have done in hindsight.

In a contextual bandits setting, the decision-maker observes a context, makes a decision by choosing one action from a number of alternative actions and later observes a reward associated with that decision. The goal is to maximize the average reward. In our case, the context is information about the user: their age, geolocation, prior health conditions, symptoms, travel history, contact history, etc. The actions are: to test a person or not. The reward or cost is: a) cost of missing detection (i.e. not testing a person when she/he is actually infected) and b) cost of false alarm (i.e. testing a person when she/he is not infected).

In general, this algorithm constructs a confidence ball around every point and its prediction (i.e., whether the person should be tested or not). The final decision per point is sampled from a probability distribution over both outcomes: positive and negative. A tutorial and an implementation of this algorithm are available through Vowpal Wabbit [17], an open-source library that implements online and offline training algorithms for contextual bandits. Offline training and evaluation algorithms are described in the paper titled “Doubly Robust Policy Evaluation and Learning” [4] by Dudik et al.

For implementation, we can bootstrap the algorithm using the stratification approach on the dataset obtained during an initial period of days. After this, we model the problem as a contextual bandits problem from the following day onward.

Figure 1: Illustrating budgeted delayed contextual bandits approach

The main components of the problem are:

  • Context: information about an individual, such as age, gender, geolocation, prior health conditions, present symptoms, travel history, and contact history, represented as a feature vector $x$.

  • Actions: to test a person or not. This is a two-armed bandit setting.

  • Feedback: whether a tested person is positive, i.e., the label $y$.

  • Reward/Cost: the reward/cost of testing a person when he/she is positive and when he/she is not positive. The cost/reward of not testing can be fixed to the same constant for both outcomes.

  • Policy: takes a context and returns an action with some probability. The policy may be updated from day to day.

  • Budget: the number of available test-kits, say $K_i$ on day $i$.

  • Decision to optimize: learn a policy that selects $K_i$ individuals to maximize the expected reward.

Now, let $X_i$ represent the set of new data points collected on day $i$, for which we need to decide the subset that should go for testing. Recall that the size of this subset should be bounded by the budget $K_i$.

Also, consider the set of points for which we had asked for testing earlier and for which the results are now available (we assume that the feedback, i.e., whether or not the selected individuals are positive, is delayed by a day). Each such point is a pair $(x, y)$, where $y$ is the test result of the individual with feature vector (context) $x$. The algorithm works as follows:

  1. Update: We use the available test results and the cost function to train (or re-train) a classifier and update its scoring function using standard contextual bandit update methods. The scoring function gives a score for testing an individual $x$; a higher score implies one of two scenarios: (a) the individual is more likely to be positive, or (b) the algorithm is uncertain and is trying to build confidence around $x$.

  2. Prediction: We apply the updated scoring function to each individual $x \in X_i$ to get their scores. These risk probabilities and the associated costs act as input to a contextual bandit algorithm [17], which outputs the probability of recommending testing for each individual.

  3. Selection: The next task is to select a total of $K_i$ individuals on day $i$. We select them using weighted random sampling with the recommendation probabilities as weights (as described in the pseudo-code of Section 4.2).
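
The update, prediction, and selection loop above can be illustrated with a deliberately simplified sketch. It is not the Vowpal Wabbit algorithm: the class name, the logistic scoring function trained by gradient steps, and the epsilon-floor used for exploration are all stand-in assumptions chosen only to show how delayed feedback, scoring, and budgeted sampling fit together.

```python
import math
import random


class DelayedBanditSketch:
    """Toy stand-in for a budgeted contextual bandit with delayed feedback."""

    def __init__(self, dim, lr=0.1, epsilon=0.1, seed=None):
        self.w = [0.0] * dim     # weights of a logistic scoring function
        self.lr = lr             # step size for the update
        self.epsilon = epsilon   # exploration floor on recommendation probabilities
        self.rng = random.Random(seed)

    def score(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, feedback):
        """Step 1: feedback is an iterable of (x, y) pairs whose results just arrived."""
        for x, y in feedback:
            err = self.score(x) - y
            self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]

    def recommend_probabilities(self, contexts):
        """Step 2: mix the score with a floor so everyone keeps some chance of testing."""
        return [(1 - self.epsilon) * self.score(x) + self.epsilon for x in contexts]

    def select(self, contexts, budget):
        """Step 3: weighted sampling without replacement of at most `budget` indices."""
        probs = self.recommend_probabilities(contexts)
        pool = list(range(len(contexts)))
        chosen = []
        while pool and len(chosen) < budget:
            pos = self.rng.choices(range(len(pool)), weights=[probs[j] for j in pool], k=1)[0]
            chosen.append(pool.pop(pos))
        return chosen
```

On each day one would call `update` with the results that arrived from the previous day's tests and then `select` on the day's new contexts; a production system would replace the scoring and exploration parts with a proper contextual bandit learner.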

4.4 Utility-based active learning

In this approach, the goal is to find a set of the most “informative” data points (people) so that getting them tested (i.e., labeling them) would lead to a significant increase in the performance of the risk-assessment model, i.e., the classifier that predicts whether a person is likely to be positive and hence should be tested.

There are two types of solutions to this problem:

  • Uncertainty based exploration: Determine which users’ infection status is most uncertain according to the current classifier’s scoring function and ask for their labels (see, for example, the paper by Tong and Koller, 2001 [16]). However, this method will suffer in our case because the initial classifier is highly biased.

  • Disagreement based exploration: In this approach, if any two classifiers in the hypothesis class disagree on whether a person is likely to be infected or not, then that person should be tested. However, this algorithm is somewhat challenging to implement. One version of the algorithm is implemented here [9].
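
To make the uncertainty-based variant concrete, here is a minimal sketch that ranks individuals by how close the classifier's predicted infection probability is to 0.5. This is a simple margin-style stand-in, not the method of [16] or the implementation in [9]; the function name and the dictionary input format are assumptions for illustration.

```python
def most_uncertain(predicted, k):
    """Return the k person ids whose predicted infection probability is closest to 0.5.

    predicted : dict {person_id: estimated probability of infection}
    """
    ranked = sorted(predicted, key=lambda pid: abs(predicted[pid] - 0.5))
    return ranked[:k]
```

With predictions `{"a": 0.05, "b": 0.5, "c": 0.9, "d": 0.6}` and `k=2`, the sketch selects `"b"` and `"d"`, the two individuals the classifier is least sure about.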

5 Discussions

5.1 Solution Approach: Online Mode

The challenge in the online mode is that individuals arrive online one-by-one, and one needs to take an immediate decision on whether to recommend them for testing. To do that, we need to design a rate-limiting mechanism [7]. At a high level, a rate-limiter limits the number of actions that can be performed within a particular time window. To apply a rate-limiting mechanism, let us divide a day into six time-slots of 4 hours each and limit recommendations to only a fraction $\alpha_s$ of the budget $K_i$ in a particular time-slot $s$. The value of $\alpha_s$ can be estimated by computing the proportion of individuals who registered in slot $s$ on the previous day, i.e.,

$$\alpha_s = \frac{|X_{i-1}^{s}|}{|X_{i-1}|},$$

where $X_{i-1}^{s}$ denotes the set of individuals who registered during slot $s$ on day $i-1$.

For an individual $x$ who arrives during time-slot $s$ on day $i$, we compute the score (probability or utility of recommendation) using one of the four approaches defined in the offline mode. If fewer than $\alpha_s K_i$ individuals arrived in slot $s$ on the previous day, then recommend testing to $x$. If not, then compare $x$’s score with that of the $(\alpha_s K_i)$-th highest scorer of the previous day among those who arrived in slot $s$. If $x$ scores higher than that scorer, then recommend testing.
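
The slot-based rule can be sketched as below. The function names, the list-based inputs, and the rounding of quotas down to integers are assumptions for illustration; note that, as in the text, the rule is a score threshold derived from the previous day and does not by itself cap the number of recommendations made today.

```python
def slot_quota(prev_day_slot_counts, budget):
    """Split the budget across slots in proportion to yesterday's registrations."""
    total = sum(prev_day_slot_counts)
    if total == 0:  # no history: fall back to an even split (an assumption)
        return [budget // len(prev_day_slot_counts)] * len(prev_day_slot_counts)
    return [int(budget * c / total) for c in prev_day_slot_counts]


def recommend_online(score, slot, prev_day_scores, quotas):
    """Decide immediately whether to recommend a test to a new arrival.

    score           : today's score for the arriving individual
    slot            : index of the current time-slot
    prev_day_scores : list (one entry per slot) of yesterday's scores in that slot
    quotas          : per-slot quotas, e.g. from slot_quota
    """
    q = quotas[slot]
    if q <= 0:
        return False
    yesterday = sorted(prev_day_scores[slot], reverse=True)
    if len(yesterday) < q:
        return True                  # yesterday had fewer arrivals than the quota
    return score > yesterday[q - 1]  # must beat yesterday's q-th highest score
```

For example, with yesterday's slot counts `[10, 30, 60, 0, 0, 0]` and a budget of 20, the quotas are `[2, 6, 12, 0, 0, 0]`, so an arrival in the first slot is recommended only if it outscores yesterday's second-highest scorer in that slot.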

Apart from rate-limiting mechanisms, there exist other sophisticated techniques that can be applied to facilitate the online mode of operation.

5.2 Coordination Between Symptom Checker and Testing Facilities

The test-allocation problem that we consider assumes that the symptom checker app coordinates with local testing centers to estimate the number of available test-kits on the next day. Such coordination is also important for allowing the symptom checker to allocate a dedicated time-slot to each individual for taking the test. This would prevent large gatherings at the test centers. In this paper, we do not tackle the problem of allocating fixed time-slots to individuals. However, this remains an interesting direction for future research.

5.3 Finding the Associated Utility

In Section 4.2, we assume an external function $U(\cdot)$ that outputs the utility $U(x)$ associated with an individual $x$. This utility function has to be chosen based on the health policy directives [8]. Note that this utility value can be computed using all the available information (e.g., location, symptoms, occupation) associated with an individual, including features that might not have been used for stratification. When the budget $K_i$ itself forms a big fraction of the overall test-kits, it is important to address the care and containment aspects along with data collection requirements. Let $p(y \mid x)$ denote the current risk assessment model. Then setting $U(x)$ to the probability of infection, or a monotonic function thereof, is a good choice since it ensures that the high-risk cases are assigned more test-kits while allowing a non-zero allocation for the low-risk ones. When the high-risk patients are already guaranteed tests and the goal for the budget $K_i$ is to improve the infection assessment model for a binary decision, then it would be appropriate to choose $U(x)$ to be the entropy of the conditional distribution $p(y \mid x)$ or a similar uncertainty measure. If the intent is to estimate a well-calibrated infection distribution, then assuming a uniform $U(x)$ for all $x$ would be suitable.
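
The three utility choices discussed above can be written down in a few lines. The function name and the mode labels (`"risk"`, `"uncertainty"`, `"uniform"`) are hypothetical, and the input is assumed to be the infection probability produced by whatever risk assessment model is currently in use.

```python
import math


def utility(p_infected, mode):
    """Utility of testing one individual, given the model's infection probability."""
    if mode == "risk":
        # Care and containment: higher-risk individuals get more weight.
        return p_infected
    if mode == "uncertainty":
        # Model improvement: binary entropy (in bits) of the predicted label.
        if p_infected <= 0.0 or p_infected >= 1.0:
            return 0.0
        q = 1.0 - p_infected
        return -(p_infected * math.log2(p_infected) + q * math.log2(q))
    if mode == "uniform":
        # Calibrated estimation: every individual counts equally.
        return 1.0
    raise ValueError(f"unknown mode: {mode}")
```

Any of these can be plugged in as the utility function in the stratified sampling weights of Section 4.2; the entropy variant peaks at a predicted probability of 0.5 and vanishes when the model is certain.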

5.4 Better Utilization of Test-Kits

In a country with a very large population (such as India, with 1,376 million people), it might be extremely difficult to manufacture extra test-kits in bulk. Thus, it is important to study how to optimize the use of these few extra test-kits. A recent study [5] shows that it is possible to use fewer test-kits for testing more people in a situation where most of the test results turn out to be negative (the number of positive cases in South Korea is only a small fraction of the total tests performed). The article [5] mentions that researchers at the German Red Cross Blood Donor Service developed a procedure called mini-pool, where they mix samples in a pool and test the pool. If the test for the pool is negative, then all the individuals in it are declared negative. If the pool tests positive, then each sample is tested individually. More sophisticated techniques can be built on this binary search method.
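
The saving from the two-stage mini-pool procedure can be estimated with a short calculation. The sketch below assumes that infections are independent across individuals and that tests are error-free, which are simplifications not stated in [5]; the function names and the upper limit on the pool size are likewise illustrative.

```python
def expected_tests_per_person(prevalence, pool_size):
    """Expected number of tests per person under two-stage pooling.

    One pooled test is shared by pool_size people; if the pool is positive
    (probability 1 - (1 - prevalence) ** pool_size), everyone is retested individually.
    """
    if pool_size == 1:
        return 1.0  # no pooling: one test per person
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return 1.0 / pool_size + p_pool_positive


def best_pool_size(prevalence, max_pool_size=32):
    """Pool size (1..max_pool_size) that minimizes the expected tests per person."""
    return min(range(1, max_pool_size + 1),
               key=lambda n: expected_tests_per_person(prevalence, n))
```

Under these assumptions, an illustrative prevalence of 2% gives a best pool size of 8 with roughly 0.27 tests per person, i.e., the same number of kits would cover between three and four times as many people.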

Though this utilization problem is orthogonal to the problem we consider in this work, such solutions may help in choosing the budget $K_i$, where $K_i$ can be much larger than the number of available test-kits, and would thus help to test a larger set of individuals.


The authors thank Mohit Kumar and Arun Iyer for extremely useful discussions.


6 Appendix

Figure 2: The cumulative number of confirmed cases and deaths in South Korea. We clearly see the flattening of confirmed cases in South Korea. Data source: [6].
Figure 3: The cumulative number of confirmed cases and deaths in the United States (US). We observe an exponential increase of confirmed cases in the US. Data source: [6].
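
| State | Sex ratio (females per 1,000 males) | Population above 60, total (%) | Population above 60, male (%) | Population above 60, female (%) |
|---|---|---|---|---|
| India | 943 | 8.2 | 7.9 | 8.4 |
| Andhra Pradesh | - | 9.5 | 9.8 | 9.3 |
| Arunachal Pradesh | 938 | - | - | - |
| Assam | 958 | 6.6 | 6.9 | 6.2 |
| Bihar | 918 | 6.4 | 6.4 | 6.3 |
| Chhattisgarh | 991 | 7.1 | 6.6 | 7.5 |
| Goa | 973 | | | |
| Gujarat | 919 | 8.4 | 7.6 | 9.3 |
| Haryana | 879 | 7.4 | 6.9 | 8 |
| Himachal Pradesh | 972 | 11.1 | 10.8 | 11.4 |
| Jammu & Kashmir | 889 | 9.3 | 9.2 | 9.3 |
| Jharkhand | 949 | 6.6 | 6.3 | 6.9 |
| Karnataka | 973 | 8.2 | 7.7 | 8.7 |
| Kerala | 1,084 | 13 | 12.3 | 13.7 |
| Madhya Pradesh | 931 | 7.2 | 6.9 | 7.5 |
| Maharashtra | 929 | 9.1 | 8.7 | 9.5 |
| Manipur | 985 | - | - | - |
| Meghalaya | 989 | - | - | - |
| Mizoram | 976 | - | - | - |
| Nagaland | 931 | - | - | - |
| Odisha | 979 | 9.9 | 10.1 | 9.8 |
| Punjab | 895 | 10.2 | 9.7 | 10.7 |
| Rajasthan | 928 | 7.3 | 6.5 | 8.2 |
| Sikkim | 890 | - | - | - |
| Tamil Nadu | 996 | 10.7 | 10.6 | 10.9 |
| Telangana | - | 8.2 | 8.3 | 8.2 |
| Tripura | 960 | - | - | - |
| Uttar Pradesh | 912 | 6.9 | 6.6 | 7.2 |
| Uttarakhand | 963 | 8.6 | 8 | 9.3 |
| West Bengal | 950 | 8.7 | 8.7 | 8.8 |
| Andaman & Nicobar Islands | 876 | - | - | - |
| Chandigarh | 818 | - | - | - |
| Dadra & Nagar Haveli | 774 | - | - | - |
| Daman & Diu | 618 | - | - | - |
| NCT of Delhi | 868 | 6.6 | 6.4 | 6.9 |
| Lakshadweep | 947 | - | - | - |
| Puducherry | 1,037 | - | - | - |

Table 1: Sex ratios and persons above the age of 60, state-wise [11]

| Age group | Total (%) | Male (%) | Female (%) |
|---|---|---|---|
| 0-4 | 8.3 | 8.5 | 8.1 |
| 5-9 | 8.8 | 8.9 | 8.6 |
| 10-14 | 9.4 | 9.6 | 9.2 |
| 15-19 | 10.2 | 10.5 | 10 |
| 20-24 | 10.6 | 10.4 | 10.8 |
| 25-29 | 9.8 | 9.7 | 10 |
| 30-34 | 8.3 | 8.3 | 8.2 |
| 35-39 | 7.2 | 7.1 | 7.2 |
| 40-44 | 6.2 | 6.2 | 6.2 |
| 45-49 | 5.3 | 5.3 | 5.3 |
| 50-54 | 4.3 | 4.3 | 4.3 |
| 55-59 | 3.5 | 3.4 | 3.6 |
| 60-64 | 3 | 3 | 3.1 |
| 65-69 | 2.1 | 2.1 | 2.2 |
| 70-74 | 1.4 | 1.3 | 1.5 |
| 75-79 | 0.9 | 0.8 | 0.9 |
| 80-84 | 0.5 | 0.4 | 0.5 |
| 85+ | 0.3 | 0.2 | 0.3 |

Table 2: National age distribution [11]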