Optimal subsampling designs

04/06/2023
by   Henrik Imberg, et al.

Subsampling is commonly used to overcome computational and economic bottlenecks in the analysis of finite populations and massive datasets. Existing methods are often limited in scope and use optimality criteria (e.g., A-optimality) with well-known deficiencies, such as lack of invariance to the measurement scale of the data and to the parameterisation of the model. A unified theory of optimal subsampling design is still lacking. We present a theory of optimal design for general data subsampling problems, including finite population inference, parametric density estimation, and regression modelling. Our theory encompasses and generalises most existing methods in the field of optimal subdata selection based on unequal probability sampling and inverse probability weighting. We derive optimality conditions for a general class of optimality criteria, and present corresponding algorithms for finding optimal sampling schemes under Poisson and multinomial sampling designs. We present a novel class of transformation- and parameterisation-invariant linear optimality criteria which enjoy the best of both worlds: the computational tractability of A-optimality and invariance properties similar to D-optimality. The methodology is illustrated on an application in the traffic safety domain. In our experiments, the proposed invariant linear optimality criteria achieve 92-99% D-efficiency with 90-95% [...] A-optimality criterion has only 46 [...] examples.
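To make the setting concrete, the following minimal sketch illustrates unequal probability Poisson sampling combined with inverse probability weighting, the two ingredients the abstract builds on. It is not the authors' method: the inclusion probabilities here are simply taken proportional to a hypothetical auxiliary size measure rather than derived from any optimality criterion, and all function names are assumptions made for illustration.

```python
import random


def size_proportional_probs(sizes, expected_n):
    """Inclusion probabilities proportional to a positive size measure.

    Probabilities are capped at 1, so the expected sample size can fall
    below expected_n if any unit hits the cap.
    """
    total = sum(sizes)
    return [min(1.0, expected_n * s / total) for s in sizes]


def poisson_sample(probs, rng):
    """Poisson sampling: include unit i independently with probability probs[i]."""
    return [i for i, p in enumerate(probs) if rng.random() < p]


def ipw_total(values, probs, sample):
    """Inverse probability weighted (Horvitz-Thompson) estimate of the population total."""
    return sum(values[i] / probs[i] for i in sample)


def demo(replicates=500, seed=0):
    """Compare the average IPW estimate over repeated subsamples with the true total."""
    rng = random.Random(seed)
    values = [rng.uniform(1.0, 10.0) for _ in range(1000)]
    # Hypothetical auxiliary variable: a noisy, strictly positive proxy for the values.
    sizes = [v + rng.uniform(0.0, 2.0) for v in values]
    probs = size_proportional_probs(sizes, expected_n=100)
    estimates = [
        ipw_total(values, probs, poisson_sample(probs, rng))
        for _ in range(replicates)
    ]
    return sum(estimates) / len(estimates), sum(values)
```

Calling `demo()` returns the average estimate and the true population total; because the weighted estimator is unbiased under Poisson sampling with positive inclusion probabilities, the two should be close. An optimal design in the sense of the abstract would replace `size_proportional_probs` with probabilities chosen to minimise a chosen optimality criterion.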

research
08/16/2018

Optimal Designs for Poisson Count Data with Gamma Block Effects

The Poisson-Gamma model is a generalization of the Poisson model, which ...
research
12/20/2022

Active sampling: A machine-learning-assisted framework for finite population inference with optimal subsamples

Data subsampling has become widely recognized as a tool to overcome comp...
research
08/02/2018

Removal of the points that do not support an E-optimal experimental design

We propose a method of removal of design points that cannot support any ...
research
12/21/2020

Optimality of multi-way designs

In this paper we study optimality aspects of a certain type of designs i...
research
06/10/2019

Bayesian experimental design using regularized determinantal point processes

In experimental design, we are given n vectors in d dimensions, and our ...
research
03/15/2012

Adaptive experimental design for one-qubit state estimation with finite data based on a statistical update criterion

We consider 1-qubit mixed quantum state estimation by adaptively updatin...
research
08/10/2022

Optimal response surface designs in the presence of model contamination

Complete reliance on the fitted model in response surface experiments is...
