Semi-supervised Active Regression
Labelled data often comes at a high cost, as it may require recruiting human labellers or running costly experiments. At the same time, in many practical scenarios, one already has access to a partially labelled, potentially biased dataset that can help with the learning task at hand. Motivated by such settings, we formally initiate a study of semi-supervised active learning through the frame of linear regression. In this setting, the learner has access to a dataset X ∈ ℝ^((n_1+n_2) × d) composed of n_1 unlabelled examples that an algorithm can actively query, and n_2 examples labelled a priori. Concretely, denoting the true labels by Y ∈ ℝ^(n_1+n_2), the learner's objective is to find β̂ ∈ ℝ^d such that ‖X β̂ − Y‖_2^2 ≤ (1 + ϵ) min_{β ∈ ℝ^d} ‖X β − Y‖_2^2 while making as few additional label queries as possible. In order to bound the label queries, we introduce an instance-dependent parameter called the reduced rank, denoted by R_X, and propose an efficient algorithm with query complexity O(R_X/ϵ). This result directly implies improved upper bounds for two important special cases: (i) active ridge regression, and (ii) active kernel ridge regression, where the reduced rank equates to the statistical dimension, sd_λ, and the effective dimension, d_λ, of the problem respectively, with λ ≥ 0 denoting the regularization parameter. For active ridge regression we also prove a matching lower bound of Ω(sd_λ/ϵ) on the query complexity of any algorithm. This subsumes prior work that only considered the unregularized case, i.e., λ = 0.
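The statistical dimension sd_λ that governs the ridge-regression query complexity has a standard closed form in terms of the singular values σ_i of X: sd_λ(X) = Σ_i σ_i² / (σ_i² + λ). A minimal sketch of computing it (the function name and random data are illustrative, not part of the paper):

```python
import numpy as np

def statistical_dimension(X, lam):
    # sd_lambda(X) = sum_i sigma_i^2 / (sigma_i^2 + lambda),
    # where sigma_i are the singular values of X.
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s**2 / (s**2 + lam)))

X = np.random.default_rng(0).standard_normal((50, 5))
print(statistical_dimension(X, 0.0))   # equals rank(X) = 5 when lambda = 0
print(statistical_dimension(X, 10.0))  # strictly smaller for lambda > 0
```

At λ = 0 the statistical dimension reduces to the rank of X, recovering the unregularized setting the abstract says is subsumed; increasing λ shrinks sd_λ, and with it the number of label queries the stated O(sd_λ/ϵ) bound requires.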