
-
Unlocking Pixels for Reinforcement Learning via Implicit Attention
There has recently been significant interest in training reinforcement l...
-
ES-ENAS: Combining Evolution Strategies with Neural Architecture Search at No Extra Cost for Reinforcement Learning
We introduce ES-ENAS, a simple neural architecture search (NAS) algorith...
-
Monte-Carlo Tree Search as Regularized Policy Optimization
The combination of Monte-Carlo tree search (MCTS) with deep reinforcemen...
-
Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies
Off-policy learning algorithms have been known to be sensitive to the ch...
-
Hindsight Expectation Maximization for Goal-conditioned Reinforcement Learning
We propose a graphical model framework for goal-conditioned RL, with an ...
-
Self-Imitation Learning via Generalized Lower Bound Q-learning
Self-imitation learning motivated by lower-bound Q-learning is a novel a...
-
Taylor Expansion Policy Optimization
In this work, we investigate the application of Taylor expansions in rei...
-
Discrete Action On-Policy Learning with Action-Value Critic
Reinforcement learning (RL) in discrete action space is ubiquitous in re...
-
ES-MAML: Simple Hessian-Free Meta Learning
We introduce ES-MAML, a new framework for solving the model agnostic met...
-
Reinforcement Learning with Chromatic Networks
We present a new algorithm for finding compact neural networks encoding ...
-
Reinforcement Learning for Integer Programming: Learning to Cut
Integer programming (IP) is a general optimization framework widely appl...
-
Wasserstein Reinforcement Learning
We propose behavior-driven optimization via Wasserstein distances (WDs) ...
-
Variance Reduction for Evolution Strategies via Structured Control Variates
Evolution Strategies (ES) are a powerful class of blackbox optimization ...
-
Structured Monte Carlo Sampling for Nonisotropic Distributions via Determinantal Point Processes
We propose a new class of structured methods for Monte Carlo (MC) sampli...
-
Augment-Reinforce-Merge Policy Gradient for Binary Stochastic Policy
Due to the high variance of policy gradients, on-policy optimization alg...
-
Orthogonal Estimation of Wasserstein Distances
Wasserstein distances are increasingly used in a wide variety of applica...
-
Adaptive Sample-Efficient Blackbox Optimization via ES-active Subspaces
We present a new algorithm ASEBO for conducting optimization of high-dim...
-
Discretizing Continuous Action Space for On-Policy Optimization
In this work, we show that discretizing action space for continuous cont...
-
Boosting Trust Region Policy Optimization by Normalizing Flows Policy
We propose to improve trust region policy search with normalizing flows ...
-
Implicit Policy for Reinforcement Learning
We introduce Implicit Policy, a general class of expressive policies tha...
-
Exploration by Distributional Reinforcement Learning
We propose a framework based on distributional reinforcement learning an...
-
Variational Deep Q Network
We propose a framework that directly tackles the probability distributio...