Sequential Community Mode Estimation

11/16/2021
by Shubham Anand Jain, et al.

We consider a population, partitioned into a set of communities, and study the problem of identifying the largest community within the population via sequential, random sampling of individuals. There are multiple sampling domains, referred to as boxes, which also partition the population. Each box may consist of individuals of different communities, and each community may in turn be spread across multiple boxes. The learning agent can, at any time, sample (with replacement) a random individual from any chosen box; when this is done, the agent learns the community the sampled individual belongs to, and also whether or not this individual has been sampled before. The goal of the agent is to minimize the probability of misidentifying the largest community in a fixed-budget setting, by optimizing both the sampling strategy and the decision rule. We propose and analyse novel algorithms for this problem, and also establish information-theoretic lower bounds on the probability of error under any algorithm. In several cases of interest, the exponential decay rates of the probability of error under our algorithms are shown to be optimal up to constant factors. The proposed algorithms are further validated via simulations on real-world datasets.
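The sampling model described above can be illustrated with a minimal Python sketch. It is not one of the paper's algorithms: the round-robin allocation across boxes and the "count distinct individuals" decision rule are simple baselines assumed here for illustration, and the names `sample_box` and `estimate_mode` are hypothetical. Each box is represented as a list of community labels, with an individual identified by its index in that list.

```python
import random
from collections import Counter


def sample_box(box, seen, rng):
    """Draw one individual uniformly at random, with replacement, from `box`.

    Returns (community, was_seen_before), mirroring the feedback the
    agent receives in the abstract's model.
    """
    idx = rng.randrange(len(box))
    was_seen = idx in seen
    seen.add(idx)
    return box[idx], was_seen


def estimate_mode(boxes, budget, seed=0):
    """Baseline (assumed, not from the paper): spend `budget` samples
    round-robin over the boxes and return the community with the most
    distinct individuals observed, or None if nothing was sampled."""
    rng = random.Random(seed)
    seen = [set() for _ in boxes]   # indices already sampled, per box
    distinct = Counter()            # distinct individuals per community
    for t in range(budget):
        b = t % len(boxes)          # round-robin choice of box
        community, was_seen = sample_box(boxes[b], seen[b], rng)
        if not was_seen:            # count each individual only once
            distinct[community] += 1
    return distinct.most_common(1)[0][0] if distinct else None


# Toy population: community "a" has 55 members spread over two boxes,
# "b" has 30, and "c" has 15.
boxes = [
    ["a"] * 50 + ["b"] * 10,
    ["a"] * 5 + ["b"] * 20 + ["c"] * 15,
]
guess = estimate_mode(boxes, budget=2000)
```

Round-robin allocation ignores differences in box size and composition, which is exactly the kind of inefficiency an optimized sampling strategy would aim to avoid; the sketch only makes the observation model concrete.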


research
12/04/2020

Group testing for overlapping communities

In this paper, we propose algorithms that leverage a known community str...
research
07/02/2020

Improved bounds for noisy group testing with constant tests per item

The group testing problem is concerned with identifying a small set of i...
research
06/29/2020

Non-Convex Exact Community Recovery in Stochastic Block Model

Learning community structures in graphs that are randomly generated by s...
research
06/19/2020

Partitioned Sampling of Public Opinions Based on Their Social Dynamics

Public opinion polling is usually done by random sampling from the entir...
research
11/07/2018

How Many Subpopulations is Too Many? Exponential Lower Bounds for Inferring Population Histories

Reconstruction of population histories is a central problem in populatio...
research
07/09/2021

Preserving Diversity when Partitioning: A Geometric Approach

Diversity plays a crucial role in multiple contexts such as team formati...
research
01/09/2022

Selecting the Best Optimizing System

We formulate selecting the best optimizing system (SBOS) problems and pr...
