
-
Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
We study episodic reinforcement learning under unknown adversarial corru...
-
Task-Optimal Exploration in Linear Dynamical Systems
Exploration in unknown environments is a fundamental problem in reinforc...
-
Leveraging Post Hoc Context for Faster Learning in Bandit Settings with Applications in Robot-Assisted Feeding
Autonomous robot-assisted feeding requires the ability to acquire a wide...
-
Experimental Design for Regret Minimization in Linear Bandits
In this paper we propose a novel experimental design-based algorithm to ...
-
Learning to Actively Learn: A Robust Approach
This work proposes a procedure for designing algorithms for specific ada...
-
A New Perspective on Pool-Based Active Classification and False-Discovery Control
In many scientific settings there is a need for adaptive experimental de...
-
An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits
This paper proposes near-optimal algorithms for the pure-exploration lin...
-
Estimating the number and effect sizes of non-null hypotheses
We study the problem of estimating the distribution of effect sizes (the...
-
Active Learning for Identification of Linear Dynamical Systems
We propose an algorithm to actively estimate the parameters of a linear ...
-
Mosaic: A Sample-Based Database System for Open World Query Processing
Data scientists have relied on samples to analyze populations of interes...
-
Sequential Experimental Design for Transductive Linear Bandits
In this paper we introduce the transductive linear bandit problem: given...
-
The True Sample Complexity of Identifying Good Arms
We consider two multi-armed bandit problems with n arms: (i) given an ϵ ...
-
Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs
This paper establishes that optimistic algorithms attain gap-dependent a...
-
SysML: The New Frontier of Machine Learning Systems
Machine learning (ML) techniques are enjoying rapidly increasing adoptio...
-
Exploiting Reuse in Pipeline-Aware Hyperparameter Tuning
Hyperparameter tuning of multi-stage pipelines introduces a significant ...
-
Pure-Exploration for Infinite-Armed Bandits with General Arm Reservoirs
This paper considers a multi-armed bandit game where the number of arms ...
-
Massively Parallel Hyperparameter Tuning
Modern learning models are characterized by large hyperparameter spaces....
-
A Bandit Approach to Multiple Testing with False Discovery Control
We propose an adaptive sampling approach for multiple testing which aims...
-
Adaptive Sampling for Convex Regression
In this paper, we introduce the first principled adaptive-sampling proce...
-
A framework for Multi-A(rmed)/B(andit) testing with online FDR control
We propose an alternative framework to existing setups for controlling f...
-
The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime
We propose a novel technique for analyzing adaptive sampling called the ...
-
Finite Sample Prediction and Recovery Bounds for Ordinal Embedding
The goal of ordinal embedding is to represent items as points in a low-d...
-
Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
Performance of machine learning algorithms depends critically on identif...
-
Best-of-K Bandits
This paper studies the Best-of-K Bandit game: At each time the player ch...
-
Non-stochastic Best Arm Identification and Hyperparameter Optimization
Motivated by the task of hyperparameter optimization, we introduce the n...
-
Sparse Dueling Bandits
The dueling bandit problem is a variation of the classical multi-armed b...
-
lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits
The paper proposes a novel upper confidence bound (UCB) procedure for id...
-
On Finding the Largest Mean Among Many
Sampling from distributions to find the one with the largest mean arises...