
Accelerated Message Passing for Entropy-Regularized MAP Inference
Maximum a posteriori (MAP) inference in discrete-valued Markov random fi...
On Optimism in Model-Based Reinforcement Learning
The principle of optimism in the face of uncertainty is prevalent throug...
Stochastic Bandits with Linear Constraints
We study a constrained contextual linear bandit setting, where the goal ...
Regret Balancing for Bandit and RL Model Selection
We consider model selection in stochastic bandit and reinforcement learn...
Learning the Truth From Only One Side of the Story
Learning under one-sided feedback (i.e., where examples arrive in an onl...
Stochastic Flows and Geometric Optimization on the Orthogonal Group
We present a new class of stochastic, geometrically-driven optimization ...
Robustness Guarantees for Mode Estimation with an Application to Bandits
Mode estimation is a classical problem in statistics with a wide range o...
Model Selection in Contextual Stochastic Bandit Problems
We study model selection in stochastic bandit problems. Our approach rel...
On Thompson Sampling with Langevin Algorithms
Thompson sampling is a methodology for multi-armed bandit problems that ...
Ready Policy One: World Building Through Active Learning
Model-Based Reinforcement Learning (MBRL) offers a promising direction f...
Effective Diversity in Population-Based Reinforcement Learning
Maintaining a population of solutions has been shown to increase explora...
ES-MAML: Simple Hessian-Free Meta Learning
We introduce ES-MAML, a new framework for solving the model agnostic met...
Wasserstein Fair Classification
We propose an approach to fair classification that enforces independence...
Reinforcement Learning with Chromatic Networks
We present a new algorithm for finding compact neural networks encoding ...
Approximate Sherali-Adams Relaxations for MAP Inference via Entropy Regularization
Maximum a posteriori (MAP) inference is a fundamental computational para...
Wasserstein Reinforcement Learning
We propose behavior-driven optimization via Wasserstein distances (WDs) ...
Structured Monte Carlo Sampling for Nonisotropic Distributions via Determinantal Point Processes
We propose a new class of structured methods for Monte Carlo (MC) sampli...
Adaptive Sample-Efficient Blackbox Optimization via ES-active Subspaces
We present a new algorithm ASEBO for conducting optimization of high-dim...
When random search is not enough: Sample-Efficient and Noise-Robust Blackbox Optimization of RL Policies
Interest in derivative-free optimization (DFO) and "evolutionary strateg...
Gen-Oja: A Simple and Efficient Algorithm for Streaming Generalized Eigenvector Computation
In this paper, we study the problems of principal Generalized Eigenvecto...
Online learning with kernel losses
We present a generalization of the adversarial linear bandits framework,...
A note on reinforcement learning with Wasserstein distance regularisation, with applications to multipolicy learning
In this note we describe an application of Wasserstein distance to Reinf...
