Estimating linear functionals of a sparse family of Poisson means
Assume that we observe a sample of size n composed of p-dimensional signals, each signal having independent entries drawn from a scaled Poisson distribution with an unknown intensity. We are interested in estimating the sum of the n unknown intensity vectors, under the assumption that most of them coincide with a given 'background' signal. The number s of p-dimensional signals different from the background signal plays the role of sparsity and the goal is to leverage this sparsity assumption in order to improve the quality of estimation as compared to the naive estimator that computes the sum of the observed signals. We first introduce the group hard thresholding estimator and analyze its mean squared error measured by the squared Euclidean norm. We establish a nonasymptotic upper bound showing that the risk is at most of the order of σ^2(sp + s^2sqrt(p)) log^3/2(np). We then establish lower bounds on the minimax risk over a properly defined class of collections of s-sparse signals. These lower bounds match with the upper bound, up to logarithmic terms, when the dimension p is fixed or of larger order than s^2. In the case where the dimension p increases but remains of smaller order than s^2, our results show a gap between the lower and the upper bounds, which can be up to order sqrt(p).
READ FULL TEXT