Adaptive Group Testing with Mismatched Models

by Mingzhou Fan, et al.

Accurate detection of infected individuals is one of the critical steps in stopping any pandemic. When the underlying infection rate of the disease is low, testing people in groups, instead of testing each individual in the population, can be more efficient. In this work, we consider noisy adaptive group testing design with specific test sensitivity and specificity, which selects the optimal group given previous test results based on a pre-selected utility function. As in prior studies on group testing, we model this problem as a sequential Bayesian Optimal Experimental Design (BOED) to adaptively design the groups for each test. We analyze the required number of group tests when using the updated posterior on the infection status and the corresponding Mutual Information (MI) as our utility function for selecting new groups. More importantly, we study how a potential bias in the assumed noise of group tests, relative to the ground truth, may affect the group testing sample complexity.


1 Introduction

Originally proposed for blood testing during World War II [Doffman1943GT], group testing has been a powerful tool for detecting infected individuals in a large population, for example by polymerase chain reaction (PCR) tests for COVID-19 [Gollier2020COVID]. By mixing the test samples (e.g. saliva or blood) of the individuals in a group, the tester can determine whether the group contains any infected individual, provided that no testing errors are made.

Current studies on group testing can be roughly divided into two categories – non-adaptive [Han2017GT, Liu2011GT, Du1993GT, Aita2021GT, Sobel1975GT] and adaptive methods [Baldassini2013GT, Han2017GT, Google2020GT, Sakata2021GT, Bai2019GT, McMahan2017GT, HughesOliver1994GT, Scarlett2019GT] – based on whether the group to test is decided before the tests or adaptively given test results during the whole sequential procedure.

Non-adaptive group testing is solved in a two-stage fashion: designing testing groups and recovering the infection status based on the testing results. The very first paper on group testing [Doffman1943GT] took a non-adaptive approach and derived the optimal group selection in a noiseless setup. Recovery for large populations is considered in [Sobel1975GT]. A noisy setup was considered and the optimality of group testing was proved in [Liu2011GT]. A review of non-adaptive group testing and its applications can be found in [Du1993GT].

Adaptive group testing updates the Bayesian models of infection status based on the previous testing results and designs a new group to test at each iteration. Works on adaptive group testing have examined its empirical performance in different setups. Bayesian regression was used in [McMahan2017GT] to model the group testing data with adaptive group tests based on the updated posterior. Model inference by lattice-based classification models [Tatsuoka2021GT] or sum-observation [Han2017GT] has been explored. In [Sakata2021GT], Loopy Belief Propagation (LBP) [Murphy1999LBP] and other approximation strategies [Bai2019GT] were adopted for scalable inference. All the aforementioned works design group tests based on an entropy-based utility function. Recently, other utility functions, including mutual information (MI) and expected area under the receiver operating characteristic curve (AUC), have been explored with a sequential Monte Carlo (SMC) method in [Google2020GT]. The capacity of noiseless group testing was given in [Baldassini2013GT]. To the best of our knowledge, theoretical analysis of noisy adaptive group testing is limited. In [Bai2019GT], the sample complexity when using the entropy utility function was derived. However, computation of the sample complexity requires the ground-truth probability, which is typically unknown in practice.

In this paper, we consider adaptive group testing in the presence of uncertainty. We follow the model formulation in [Google2020GT] and set a Bernoulli prior for the infection status. Group testing design is based on a mutual information utility function of the current posterior, iteratively updated given previous results. Based on a stopping criterion of conditional entropy, we derive a lower bound on the required number of group tests. More importantly, we further analyze the sample complexity when the testing models are possibly mismatched. We prove that when the model parameters are mismatched with the ground truth, the lower bound increases, as expected, by a multiplicative constant, because the groups are selected with the biased utility function. Such a theoretical analysis has not been discussed in the existing literature. We further confirm our analyses by simulation results.

2 Problem Setup

Given a population of individuals whose infection statuses are unknown and modeled by Bernoulli random variables ’s, if the -th individual is infected, and otherwise . Denote the random vector representing our understanding of the population infection state by where with the infection probability for the -th individual. We are interested in designing group tests adaptively to discover the unknown infection state: . We assume that the Bernoulli random variables ’s are independent, i.e. for any , we can denote . Here, can be identical or vary across individuals.

Throughout the paper, we use capital letters to denote random variables and bold font to denote vectors.

2.1 Group Testing

For time and cost efficiency, instead of testing each individual, we test pooled samples mixed from the selected individuals (saliva samples, for example) to discover the infection status. For the -th group test result , if the -th sample tests positive, indicating that the combined sample contains the sample(s) from infected individual(s); and otherwise all the individuals drawn in the -th group test are not infected. Group testing design chooses a subset of individuals from the population as a group, denoted as a vector , and tests the mixed samples from the individuals with the corresponding .

2.2 Group Testing Parameters

Existing testing assays have limitations and it is possible to have testing errors. As in [Google2020GT], we assume that the group testing has the following sensitivity () and specificity ():


where . Here, and are referred to as model parameters for adaptive group testing design in the following part of the paper.
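As a minimal sketch of the noisy test model, the snippet below simulates a single pooled test. It assumes the usual definitions: sensitivity is the probability of a positive result when the group contains at least one infected individual, and specificity is the probability of a negative result when it contains none. The function name `noisy_group_test` and its argument names are illustrative and not taken from the paper.

```python
import random


def noisy_group_test(infected, group, sensitivity, specificity, rng=random):
    """Simulate one pooled test under the assumed sensitivity/specificity model.

    infected: list of 0/1 true infection statuses.
    group:    list of 0/1 membership flags (1 = sample included in the pool).
    Returns 1 for a positive result and 0 for a negative result.
    """
    contains_infected = any(x and g for x, g in zip(infected, group))
    if contains_infected:
        # True positive with probability `sensitivity`.
        return 1 if rng.random() < sensitivity else 0
    # False positive with probability 1 - specificity.
    return 1 if rng.random() >= specificity else 0
```

With sensitivity and specificity both equal to 1, the function reduces to the noiseless test, which returns 1 exactly when the pool contains an infected sample.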

2.3 Inference

Assume that we have designed a batch of testing groups at stage , where and is the batch size. Given their corresponding test results , we can compute:


based on (1) and (2), where

and (5)

Here, .

We can further infer the posterior of the population infection status by Bayes’s rule:


For simplicity, from now on, we write the posterior of an event given the previous test results by:


3 Adaptive Group Testing

For adaptive group testing, we actively select a batch of groups to update the posterior , and design a utility function to guide the group selection in the next iteration. More specifically, the task at each stage is to find a batch of groups such that


3.1 Mutual Information Utility

One natural choice of the utility function is the Mutual Information:


Denote as the binary entropy and denote as . Write as . According to (6) the Mutual Information can be written as


where , , and


represents the probability of having infected patient(s) in the chosen group.
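For intuition, the probability that a chosen group contains at least one infected individual has a simple product form when the infection statuses are independent, as they are under the prior. The sketch below computes it from per-individual infection probabilities. After tests have been observed the posterior is generally no longer independent, so applying this to posterior marginals is only an approximation; the exact value would sum the joint posterior over all states that intersect the group. The function name is illustrative.

```python
from math import prod


def prob_group_has_infected(marginals, group):
    """P(at least one infected member in the group), assuming independence.

    marginals: per-individual infection probabilities.
    group:     0/1 membership flags of the same length.
    """
    # The group is clean only if every selected member is healthy.
    return 1.0 - prod(1.0 - p for p, g in zip(marginals, group) if g)
```

For example, with ten individuals each infected with probability 0.1, a group of two gives 1 - 0.9^2 = 0.19.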

Here, we consider the simplified setups with all the sensitivity and specificity being constant with respect to group selection, i.e. , and . The Mutual Information (10) can be written as:


Further assume that we test one group at each stage. We have:


where . Note that we have replaced by and by . Write , (13) can be written as:


where is a concave function of so it would be maximized if its derivative is zero, leading to the closed-form optimal point of for (13):


Note that is fixed when the group testing sensitivity and specificity parameters are given. As it is concave, it is easy to find that optimizes .
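The following sketch illustrates the single-test case numerically. It assumes the standard decomposition I(X; Y) = H(Y) - H(Y | X), where the result Y depends on X only through whether the group contains an infected individual, which happens with probability `q`. The closed form in `optimal_q` is obtained by setting the derivative of this expression to zero; it is our own derivation from that standard form and may be written differently from the paper's expression. All function names are illustrative.

```python
from math import log2


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    return 0.0 if p <= 0.0 or p >= 1.0 else -p * log2(p) - (1 - p) * log2(1 - p)


def info_gain(q, sens, spec):
    """Mutual information (bits) of one pooled test.

    q is the probability that the group contains an infected individual.
    """
    r = q * sens + (1 - q) * (1 - spec)  # probability of a positive result
    return h2(r) - q * h2(sens) - (1 - q) * h2(spec)


def optimal_q(sens, spec):
    """Stationary point of info_gain in q (requires sens + spec != 1)."""
    slope = sens + spec - 1
    # Zero derivative: slope * log2((1 - r) / r) = h2(sens) - h2(spec).
    r_star = 1.0 / (1.0 + 2.0 ** ((h2(sens) - h2(spec)) / slope))
    q_star = (r_star - (1 - spec)) / slope
    return min(1.0, max(0.0, q_star))  # clip to a valid probability
```

When sensitivity and specificity are equal, `optimal_q` returns 0.5, matching the intuition that the most informative test is the one whose pool is equally likely to be clean or contaminated.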

The problem defined in (9) becomes


and can be informally viewed as finding to make as close as possible to .
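The informal view above can be sketched as an exhaustive search over all non-empty groups, which is feasible only for small populations. The sketch assumes independent per-individual infection probabilities (exact under the prior, an approximation under the posterior) and a given target probability `q_target`; both names are illustrative.

```python
from itertools import product
from math import prod


def best_group(marginals, q_target):
    """Return the 0/1 group vector whose probability of containing an
    infected individual is closest to q_target (independence assumed)."""
    best, best_gap = None, float("inf")
    for group in product((0, 1), repeat=len(marginals)):
        if not any(group):
            continue  # skip the empty group
        q = 1.0 - prod(1.0 - p for p, g in zip(marginals, group) if g)
        gap = abs(q - q_target)
        if gap < best_gap:
            best, best_gap = group, gap
    return best
```

With ten individuals at infection probability 0.1 and a target of 0.5, the search selects a group of seven, since 1 - 0.9^7 is closer to 0.5 than 1 - 0.9^6. Because the achievable probabilities form a discrete set, the target is generally not hit exactly.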

3.2 Stopping Criteria

In previous works, either the budget [Google2020GT] or the maximum probability of infection status, i.e. [Bai2019GT, Sakata2021GT], have been used as stopping criteria. The former cannot help us analyze the asymptotic performance, while the latter can be tricky to analyze if using estimation methods like LBP. So here we use Conditional Entropy (CE) as the stopping criterion:


where and is the entropy of the prior and is fixed once the prior is given.

With the definition of mutual information and (6), we have


So we have


If we only search for one group to query each stage, by substituting (13) and (14) into (19),


The stopping criterion becomes


By doing this, we are able to give the stopping criterion informational meaning and it is easy to compute once each is given by group selection.

3.3 Sample Complexity

It is worth noting that, by the nature of deriving in (11), we are not guaranteed to reach every time during adaptive group testing. In other words, there are always gaps between the achieved by group test design and the optimal . Therefore, we cannot treat the information gain at each iteration as a constant. Besides, it can be difficult to analyze how closely can approach as adapts over the iterations.

Here, we approximate as normal distributions . An example histogram of the selected in the experiments is illustrated in Figure 1.

Figure 1: The histogram of selected in the 5th to 15th iterations of the experiment. The red line is the PDF of . We can see that the distribution of can be reasonably fitted by a Gaussian distribution. The group selection of the first few iterations would be highly influenced by the prior, so we do not include them here.

Thus, we can transform the stopping criterion (21) into


Note that




where .

Theorem 1.

If , we have


where , , , and .


When ,


where and .





Considering as independently sampled, we also have that are independent, so that we have




By Chebyshev’s inequality, if , or ,


We now give the condition for with the following proposition.

Proposition 1.

If , we have


This proposition follows from Equation 24 and Theorem 1. ∎

Based on this theorem, we have shown that the probability of meeting the stopping criterion is in the rate of when . takes the form , which is related to the prior distribution. With an i.i.d. Bernoulli prior, becomes , which is proportional to the number of patients. In general, the variance is small, which leads to so that can be close to very quickly as soon as .
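The qualitative behavior described above can be checked with a small Monte Carlo sketch. It assumes that the group probability achieved at each test is drawn independently from a normal distribution (clipped to [0, 1]) and that the per-test gain follows the standard single-test mutual information; it estimates the probability that the accumulated gain reaches a required number of bits within `T` tests. This is an illustration of the Gaussian approximation, not the bound of Theorem 1, and all names and numbers are illustrative.

```python
import random
from math import log2


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    return 0.0 if p <= 0.0 or p >= 1.0 else -p * log2(p) - (1 - p) * log2(1 - p)


def info_gain(q, sens, spec):
    """Mutual information (bits) of one pooled test with hit probability q."""
    r = q * sens + (1 - q) * (1 - spec)
    return h2(r) - q * h2(sens) - (1 - q) * h2(spec)


def prob_stop_by(T, q_mean, q_std, sens, spec, needed_bits, runs=2000, seed=0):
    """Estimate P(sum of T per-test gains >= needed_bits) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        total = 0.0
        for _ in range(T):
            # Achieved group probability, clipped to a valid range.
            q = min(1.0, max(0.0, rng.gauss(q_mean, q_std)))
            total += info_gain(q, sens, spec)
        hits += total >= needed_bits
    return hits / runs
```

When the standard deviation is small, the estimated probability jumps from near 0 to near 1 around `needed_bits` divided by the mean per-test gain, which is consistent with the sharp transition described in the text.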

4 Mismatched Model

Now suppose we have biased assumptions on the group test model parameters. Specifically, the true sensitivity and specificity are and , but we do not know them in practice and set them as and (mismatched) for adaptive group testing.

In each iteration, we would optimize the ‘mismatched’ utility function:


and select the group such that , where is the mismatched posterior updated with the mismatched parameters. With the same setup in Section 3, the selection at each iteration is .

The mismatched selection target of would be

Figure 2: MI utility with the change of and when we set ground truth , . The z axis is . The orange plane represents the optimal Information Gain.

The actual information gain, however, should be calculated with the true parameters,


Notice that here the true posterior needs to be updated with true parameters and have the ‘true’ understanding on the infection status.
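To illustrate the effect on a single test, the sketch below compares the information gained at the group probability that is optimal under the assumed parameters with the gain at the group probability that is optimal under the true parameters, both evaluated with the true parameters. It is an idealization: it assumes the target probability can be hit exactly and ignores the drift of the mismatched posterior. The gain expression is the standard single-test mutual information and the closed-form optimum is our own derivation from it; all names are illustrative.

```python
from math import log2


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    return 0.0 if p <= 0.0 or p >= 1.0 else -p * log2(p) - (1 - p) * log2(1 - p)


def info_gain(q, sens, spec):
    """Mutual information (bits) of one pooled test with hit probability q."""
    r = q * sens + (1 - q) * (1 - spec)
    return h2(r) - q * h2(sens) - (1 - q) * h2(spec)


def optimal_q(sens, spec):
    """Maximizer of info_gain in q (requires sens + spec != 1)."""
    slope = sens + spec - 1
    r_star = 1.0 / (1.0 + 2.0 ** ((h2(sens) - h2(spec)) / slope))
    return min(1.0, max(0.0, (r_star - (1 - spec)) / slope))


def mismatch_loss(true_sens, true_spec, assumed_sens, assumed_spec):
    """Per-test information (bits) lost by targeting the assumed-model optimum."""
    q_true = optimal_q(true_sens, true_spec)
    q_assumed = optimal_q(assumed_sens, assumed_spec)
    best = info_gain(q_true, true_sens, true_spec)
    achieved = info_gain(q_assumed, true_sens, true_spec)
    return best - achieved
```

The loss is zero when the assumed parameters equal the true ones and positive whenever the two optimal targets differ, because the gain is strictly concave in the group probability.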

Figure 3: The histograms (two panels) of selected in the 3rd to 7th iterations of the experiment over 1000 runs.

Similar to what we derived in the previous section, we consider . Histograms from an example are illustrated in the two panels of Figure 3. When the model is mismatched, the variance would be much larger.

Similar to Proposition 1, we have:

Proposition 2.

If , we have


for mismatched models, where


, .


The proof can be easily adapted from the proof of Proposition 1, simply by replacing with . ∎

Here represents the change in sample complexity caused by the bias from to . We want to point out that holds, so the probability can still be close to 1 once . The main influence of a biased model is the difference between and . Also, if , so we observe that the performance is similar when in the experiments.

5 Experimental Results

We perform simulations to confirm the derived bounds of the required number of group testing iterations in this section.

5.1 Experimental Settings

We investigate how the group testing performance changes with different model parameter settings. We have simulated results for 24 combinations of mismatched (biased) parameters together with the ground-truth (unbiased) group testing parameters, and . To make exhaustive search for the best group design feasible, we have simulated 1000 runs with a population of ten individuals, one of whom is infected. The prior is set as an independent Bernoulli for each individual, and the probability of each individual being infected is , . We perform adaptive group testing as described in the previous sections for each simulation run and, for performance evaluation, average the conditional entropy and the Area Under the ROC Curve (AUC) [Bradley1997AUC] over the 1000 runs at each iteration.
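A single simulation run of this kind can be sketched as follows. The sketch keeps an exact posterior over all 2^n infection states (encoded as bit masks), selects each group by exhaustive search to maximize the single-test mutual information under the assumed parameters, draws the test result from the true parameters, and updates the posterior with the assumed likelihood. It assumes all parameters lie strictly between 0 and 1 and that individual 0 is the only infected one; it records the entropy of the (possibly mismatched) posterior after each test, which is a simplification of the conditional entropy analyzed above. All names are illustrative.

```python
import random
from math import log2


def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    return 0.0 if p <= 0.0 or p >= 1.0 else -p * log2(p) - (1 - p) * log2(1 - p)


def simulate(n, prior, sens, spec, assumed_sens, assumed_spec, n_tests, seed=0):
    """One adaptive run; returns the posterior entropy (bits) after each test."""
    rng = random.Random(seed)
    truth = 1  # bit mask of the true state: only individual 0 is infected
    # Independent Bernoulli prior over all states; bit i = status of individual i.
    post = [1.0] * (2 ** n)
    for s in range(2 ** n):
        for i in range(n):
            post[s] *= prior if (s >> i) & 1 else 1 - prior
    entropies = []
    for _ in range(n_tests):
        def gain(g):
            # Posterior probability that group g contains an infected individual.
            q = sum(p for s, p in enumerate(post) if s & g)
            r = q * assumed_sens + (1 - q) * (1 - assumed_spec)
            return h2(r) - q * h2(assumed_sens) - (1 - q) * h2(assumed_spec)

        group = max(range(1, 2 ** n), key=gain)  # exhaustive search
        # The outcome follows the true test model.
        p_pos = sens if truth & group else 1 - spec
        y = 1 if rng.random() < p_pos else 0
        # Bayes update with the assumed likelihood.
        for s in range(2 ** n):
            p1 = assumed_sens if s & group else 1 - assumed_spec
            post[s] *= p1 if y else 1 - p1
        z = sum(post)
        post = [p / z for p in post]
        entropies.append(-sum(p * log2(p) for p in post if p > 0))
    return entropies
```

Averaging the returned curves over many seeds, once with matched and once with mismatched assumed parameters, gives a comparison in the spirit of the experiments reported below. The search costs on the order of 4^n operations per test, so it is practical only for small populations such as the ten individuals used here.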

5.2 Conditional Entropy

In this set of experiments, the ground-truth entropy for performed simulations is:

We have plotted the average Conditional Entropy defined in (20) over iterations in Figure 4(a). The dashed curve is the performance based on the ground-truth model parameters, which outperforms the ones based on the utility function with mismatched models, as expected.

Figure 4: Simulation results. (a) Average CE.

Then we show the average of the required group test number in Figures 4(b), 4(c), and 4(d) when , respectively. The three horizontal lines in Figure 4(a) show the value of when . We can see that when , we have similar required test numbers.

5.3 AUC

Here, we compute the area-under-curve (AUC) based on the marginal likelihood as the criterion to evaluate the performance of our updated posterior given corresponding group test results in each iteration.

We compute the marginal likelihood for each of the individuals, indexed by , as

where is the infection status of the -th individual, is a group that only contains -th individual, i.e. one-hot coding of the -th individual.
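As a small illustration, when the joint posterior over infection states is available explicitly, the marginal for each individual is the total posterior mass of the states in which that individual is infected, which is the same as the probability that the one-hot group contains an infected individual. The sketch below assumes states are encoded as integers whose bit i holds the status of individual i; this encoding and the function name are illustrative.

```python
def marginals(post, n):
    """Per-individual infection probabilities from a joint posterior.

    post[s] is the posterior probability of state s, where bit i of the
    integer s is the infection status of individual i.
    """
    return [sum(p for s, p in enumerate(post) if (s >> i) & 1) for i in range(n)]
```

For two individuals with posterior masses 0.4, 0.3, 0.2, and 0.1 on states 0 to 3, the marginals are 0.4 for individual 0 (states 1 and 3) and 0.3 for individual 1 (states 2 and 3).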

Figure 5: Average AUC after different numbers of tests in different setups. (a) AUC after 4 tests. (b) AUC after 8 tests.

The AUC metric of a classifier is defined as


The AUC of the infection marginal likelihood can be written as


where is the indicator function: if is true, and otherwise.
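A pairwise form of the AUC can be sketched as below: it is the fraction of (infected, non-infected) pairs in which the infected individual receives the higher score. Counting ties as one half is a common convention and is an assumption here; the function name is illustrative.

```python
def auc(scores, labels):
    """Pairwise AUC of scores against 0/1 labels (ties count as 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one infected and one non-infected label")
    # Indicator that the infected individual is ranked above the non-infected one.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

Here `scores` would be the per-individual infection marginals after a given number of tests and `labels` the true infection statuses.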

Although AUC is not directly optimized, the AUC values for each setup at the given iterations, illustrated in Figures 5(a) and 5(b), show that conditional entropy is related, though not strictly monotonically, to the accuracy of infection detection.

6 Conclusions

In this paper, we proved that the probability of meeting the stopping criterion based on conditional entropy is in the rate of . More importantly, we have shown that a mismatch in the group testing model would lead to a multiplicative constant (), determined by the difference between and , in the required number of group tests. Our simulation study shows that adaptive group testing based on the mutual information utility can be efficient in infection detection. Adaptive design with the correct group testing model outperforms the ones with mismatched models. The performance measured by AUC has been shown to be related to the conditional entropy, though not strictly monotonically.