
Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
Reinforcement learning (RL) is empirically successful in complex nonline...

On the Theory of Reinforcement Learning with Once-per-Episode Feedback
We study a theory of reinforcement learning (RL) in which the learner re...

Parallelizing Contextual Linear Bandits
Standard approaches to decision-making under uncertainty focus on sequen...

Near Optimal Policy Optimization via REPS
Since its introduction a decade ago, relative entropy policy search (REP...

Unlocking Pixels for Reinforcement Learning via Implicit Attention
There has recently been significant interest in training reinforcement l...

Deep Reinforcement Learning with Dynamic Optimism
In recent years, deep off-policy actor-critic algorithms have become a d...

ES-ENAS: Combining Evolution Strategies with Neural Architecture Search at No Extra Cost for Reinforcement Learning
We introduce ES-ENAS, a simple neural architecture search (NAS) algorith...

Fairness with Continuous Optimal Transport
Whilst optimal transport (OT) is increasingly being recognized as a powe...

Regret Bound Balancing and Elimination for Model Selection in Bandits and RL
We propose a simple model selection approach for algorithms in stochasti...

Online Model Selection for Reinforcement Learning with Function Approximation
Deep reinforcement learning has achieved impressive successes yet often ...

Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Over the last decade, a single algorithm has changed many facets of our ...

Accelerated Message Passing for Entropy-Regularized MAP Inference
Maximum a posteriori (MAP) inference in discrete-valued Markov random fi...

On Optimism in Model-Based Reinforcement Learning
The principle of optimism in the face of uncertainty is prevalent throug...

Stochastic Bandits with Linear Constraints
We study a constrained contextual linear bandit setting, where the goal ...

Regret Balancing for Bandit and RL Model Selection
We consider model selection in stochastic bandit and reinforcement learn...

Learning the Truth From Only One Side of the Story
Learning under one-sided feedback (i.e., where examples arrive in an onl...

Stochastic Flows and Geometric Optimization on the Orthogonal Group
We present a new class of stochastic, geometrically-driven optimization ...

Robustness Guarantees for Mode Estimation with an Application to Bandits
Mode estimation is a classical problem in statistics with a wide range o...

Model Selection in Contextual Stochastic Bandit Problems
We study model selection in stochastic bandit problems. Our approach rel...

On Thompson Sampling with Langevin Algorithms
Thompson sampling is a methodology for multi-armed bandit problems that ...

Ready Policy One: World Building Through Active Learning
Model-Based Reinforcement Learning (MBRL) offers a promising direction f...

Effective Diversity in Population-Based Reinforcement Learning
Maintaining a population of solutions has been shown to increase explora...

ES-MAML: Simple Hessian-Free Meta Learning
We introduce ES-MAML, a new framework for solving the model agnostic met...

Wasserstein Fair Classification
We propose an approach to fair classification that enforces independence...

Reinforcement Learning with Chromatic Networks
We present a new algorithm for finding compact neural networks encoding ...

Approximate Sherali-Adams Relaxations for MAP Inference via Entropy Regularization
Maximum a posteriori (MAP) inference is a fundamental computational para...

Wasserstein Reinforcement Learning
We propose behavior-driven optimization via Wasserstein distances (WDs) ...

Structured Monte Carlo Sampling for Nonisotropic Distributions via Determinantal Point Processes
We propose a new class of structured methods for Monte Carlo (MC) sampli...

Adaptive Sample-Efficient Blackbox Optimization via ES-active Subspaces
We present a new algorithm ASEBO for conducting optimization of high-dim...

When random search is not enough: Sample-Efficient and Noise-Robust Blackbox Optimization of RL Policies
Interest in derivative-free optimization (DFO) and "evolutionary strateg...

Gen-Oja: A Simple and Efficient Algorithm for Streaming Generalized Eigenvector Computation
In this paper, we study the problems of principal Generalized Eigenvecto...

Online learning with kernel losses
We present a generalization of the adversarial linear bandits framework,...

A note on reinforcement learning with Wasserstein distance regularisation, with applications to multi-policy learning
In this note we describe an application of Wasserstein distance to Reinf...
Aldo Pacchiano