
Learning the Linear Quadratic Regulator from Nonlinear Observations
We introduce a new problem setting for continuous control called the LQR...
Private Reinforcement Learning with PAC and Regret Guarantees
Motivated by high-stakes decision-making domains like personalized medic...
Contrastive learning, multi-view redundancy, and linear models
Self-supervised learning is an empirically successful approach to unsupe...
Sample-Efficient Reinforcement Learning of Undercomplete POMDPs
Partial observability is a common challenge in many reinforcement learni...
Information Theoretic Regret Bounds for Online Nonlinear Control
This work studies the problem of sequential control in an unknown, nonli...
Open Problem: Model Selection for Contextual Bandits
In statistical learning, algorithms for model selection allow the learne...
Provably adaptive reinforcement learning in metric spaces
We study reinforcement learning in continuous state and action spaces en...
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
In order to deal with the curse of dimensionality in reinforcement learn...
Efficient Contextual Bandits with Continuous Actions
We create a computationally tractable algorithm for contextual bandits w...
Contrastive estimation reveals topic posterior information to linear models
Contrastive learning is an approach to representation learning that util...
Corrupted Multidimensional Binary Search: Learning in the Presence of Irrational Agents
Standard game-theoretic formulations for settings like contextual pricin...
Adaptive Estimator Selection for Off-Policy Evaluation
We develop a generic data-driven method for estimator selection in off-p...
Reward-Free Exploration for Reinforcement Learning
Exploration is widely regarded as one of the most challenging aspects of...
Algebraic and Analytic Approaches for Parameter Learning in Mixture Models
We present two different approaches for parameter learning in several mi...
Scalable Hierarchical Clustering with Tree Grafting
We introduce Grinch, a new algorithm for large-scale, non-greedy hierarc...
Optimism in Reinforcement Learning with Generalized Linear Function Approximation
We design a new provably efficient algorithm for episodic reinforcement ...
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
We present an algorithm, HOMER, for exploration and reinforcement learni...
Sample Complexity of Learning Mixtures of Sparse Linear Regressions
In the problem of learning mixtures of linear regressions, the goal is t...
Robust Dynamic Assortment Optimization in the Presence of Outlier Customers
We consider the dynamic assortment optimization problem under the multin...
Doubly robust off-policy evaluation with shrinkage
We design a new family of estimators for off-policy evaluation in contex...
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
We design a new algorithm for batch active learning with deep neural net...
Model selection for contextual bandits
We introduce the problem of model selection for contextual bandits, wher...
Trace Reconstruction: Generalized and Parameterized
In the beautifully simple-to-state problem of trace reconstruction, the ...
Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting
We study contextual bandit learning with an abstract policy class and co...
Provably efficient RL with Rich Observations via Latent State Decoding
We study the exploration problem in episodic MDPs with rich observations...
Model-Based Reinforcement Learning in Contextual Decision Processes
We study the sample complexity of model-based reinforcement learning in ...
Contextual bandits with surrogate losses: Margin bounds and efficient algorithms
We introduce a new family of margin-based regret guarantees for adversar...
Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming
We design a new myopic strategy for a wide class of sequential design of...
Semiparametric Contextual Bandits
This paper studies semiparametric contextual bandits, a generalization o...
On Polynomial Time PAC Reinforcement Learning with Rich Observations
We study the computational tractability of provably sample-efficient (PA...
Disagreement-based combinatorial pure exploration: Efficient algorithms and an analysis with localization
We design new algorithms for the combinatorial pure exploration problem ...
Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning
Knowledge bases (KB), both automatically and manually constructed, are o...
Asynchronous Parallel Bayesian Optimisation via Thompson Sampling
We design and analyse variations of the classical Thompson sampling (TS)...
An Online Hierarchical Algorithm for Extreme Clustering
Many modern clustering methods scale well to a large number of data item...
Active Learning for Cost-Sensitive Classification
We design an active learning algorithm for cost-sensitive multiclass cla...
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
This paper studies systematic exploration for reinforcement learning wit...
Off-policy evaluation for slate recommendation
This paper studies the evaluation of policies that recommend an ordered ...
Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains
High-dimensional observations and complex real-world dynamics present ma...
PAC Reinforcement Learning with Rich Observations
We propose and study a new model for reinforcement learning with rich ob...
Minimax Structured Normal Means Inference
We provide a unified treatment of a broad class of noisy structure recov...
Extreme Compressive Sampling for Covariance Estimation
This paper studies the problem of estimating the covariance of a collect...
Contextual Semibandits via Supervised Learning Oracles
We study an online decision making problem where on each round a learner...
Learning to Search Better Than Your Teacher
Methods for learning to search for structured prediction typically imita...
Influence Functions for Machine Learning: Nonparametric Estimators for Entropies, Divergences and Mutual Informations
We propose and analyze estimators for statistical functionals of one or ...
On Estimating L_2^2 Divergence
We give a comprehensive theoretical characterization of a nonparametric ...
On the Power of Adaptivity in Matrix Completion and Approximation
We consider the related tasks of matrix completion and matrix approximat...
Subspace Learning from Extremely Compressed Measurements
We consider learning the principal subspace of a large set of vectors fr...
Nonparametric Estimation of Renyi Divergence and Friends
We consider nonparametric estimation of L_2, Renyi-α and Tsallis-α diver...
Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic
The detection of anomalous activity in graphs is a statistical problem t...
Recovering Graph-Structured Activations using Adaptive Compressive Measurements
We study the localization of a cluster of activated vertices in a graph,...
Akshay Krishnamurthy
Assistant Professor in the College of Information and Computer Sciences at the University of Massachusetts, Amherst