
Reverse iterative volume sampling for linear regression
We study the following basic machine learning task: Given a fixed set of...

Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration
This paper studies regret minimization with randomized value functions i...

Approximate Positively Correlated Distributions and Approximation Algorithms for D-optimal Design
Experimental design is a classical problem in statistics and has also fo...

LowCon: A design-based subsampling approach in a misspecified linear model
We consider a measurement constrained supervised learning problem, that ...

Tail bounds for volume sampled linear regression
The n × d design matrix in a linear regression problem is given, but the...

Learning Linear Dynamical Systems with Semi-Parametric Least Squares
We analyze a simple prefiltered variation of the least squares estimator...

A model-free approach to linear least squares regression with exact probabilities
In a regression setting with observation vector y ∈ R^n and given finite...
Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression
In experimental design, we are given a large collection of vectors, each with a hidden response value that we assume derives from an underlying linear model, and we wish to pick a small subset of the vectors such that querying the corresponding responses will lead to a good estimator of the model. A classical approach in statistics is to assume the responses are linear, plus zero-mean i.i.d. Gaussian noise, in which case the goal is to provide an unbiased estimator with smallest mean squared error (A-optimal design). A related approach, more common in computer science, is to assume the responses are arbitrary but fixed, in which case the goal is to estimate the least squares solution using few responses, as quickly as possible, for worst-case inputs. Despite many attempts, characterizing the relationship between these two approaches has proven elusive. We address this by proposing a framework for experimental design where the responses are produced by an arbitrary unknown distribution. We show that there is an efficient randomized experimental design procedure that achieves strong variance bounds for an unbiased estimator using few responses in this general model. Nearly tight bounds for the classical A-optimality criterion, as well as improved bounds for worst-case responses, emerge as special cases of this result. In the process, we develop a new algorithm for a joint sampling distribution called volume sampling, and we propose a new i.i.d. importance sampling method: inverse score sampling. A key novelty of our analysis is in developing new expected error bounds for worst-case regression by controlling the tail behavior of i.i.d. sampling via the jointness of volume sampling. Our result motivates a new minimax-optimality criterion for experimental design which can be viewed as an extension of both A-optimal design and sampling for worst-case regression.
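To make the notion of an unbiased estimator from a joint sampling distribution concrete, here is a minimal sketch of the simplest form of volume sampling, under assumptions that go beyond the abstract: subsets S of exactly d rows are drawn with probability proportional to det(X_S)^2, and the estimator solves the square system X_S w = y_S. The sketch enumerates all subsets by brute force, which is feasible only for tiny inputs, and it is not the efficient algorithm or the rescaled variant the abstract refers to. The function name `volume_sampling_expectation` and the toy data are hypothetical.

```python
import itertools

import numpy as np


def volume_sampling_expectation(X, y):
    """Exact expectation of the subset estimator under size-d volume sampling.

    Assumption (not from the abstract): P(S) is proportional to det(X_S)^2
    over all subsets S of d rows, and the estimator is X_S^{-1} y_S.
    Brute-force enumeration, so only suitable for very small n and d.
    """
    n, d = X.shape
    normalizer = 0.0
    expected_w = np.zeros(d)
    for subset in itertools.combinations(range(n), d):
        idx = list(subset)
        X_S = X[idx]
        weight = np.linalg.det(X_S) ** 2  # unnormalized probability of S
        if weight < 1e-12:
            continue  # (near-)singular subsets have (near-)zero probability
        w_S = np.linalg.solve(X_S, y[idx])  # estimator built from d responses
        expected_w += weight * w_S
        normalizer += weight
    return expected_w / normalizer, normalizer


rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
y = rng.standard_normal(8)  # arbitrary responses; no linear model is assumed

w_expected, normalizer = volume_sampling_expectation(X, y)
w_star, *_ = np.linalg.lstsq(X, y, rcond=None)  # full least squares solution
print(np.allclose(w_expected, w_star))
```

In this toy setting the averaged estimator coincides with the full least squares solution even though each individual estimator sees only d responses, and the normalizer equals det(X^T X) by the Cauchy-Binet formula.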