
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
Most reinforcement learning (RL) algorithms assume online access to the ...

Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning
Reinforcement learning provides a general framework for learning robotic...

A Divergence Minimization Perspective on Imitation Learning Methods
In many settings, it is desirable to learn decision-making and control p...

Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
Hierarchical reinforcement learning has demonstrated significant success...

Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real
Manipulation and locomotion are closely related problems that are often ...

Dynamics-Aware Unsupervised Discovery of Skills
Conventionally, model-based reinforcement learning (MBRL) aims to learn ...

Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Most deep reinforcement learning (RL) systems are not able to learn effe...

Language as an Abstraction for Hierarchical Deep Reinforcement Learning
Solving complex, temporally-extended tasks is a longstanding problem in...

Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives
Deep latent variable models have become a popular model choice due to th...

Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
We study the problem of representation learning in goal-conditioned hier...

The Mirage of Action-Dependent Baselines in Reinforcement Learning
Policy gradient methods are a widely used class of model-free reinforcem...

Temporal Difference Models: Model-Free Deep RL for Model-Based Control
Model-free reinforcement learning (RL) is a powerful, general tool for l...

Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning
Deep reinforcement learning algorithms can learn complex behavioral skil...

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
Off-policy model-free deep reinforcement learning methods using previous...

Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control
This paper proposes a general method for improving the structure and qua...

Categorical Reparameterization with Gumbel-Softmax
Categorical variables are a natural choice for representing discrete str...

Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
Reinforcement learning holds the promise of enabling autonomous robots t...

Continuous Deep Q-Learning with Model-based Acceleration
Model-free reinforcement learning has been successfully applied to a ran...

MuProp: Unbiased Backpropagation for Stochastic Neural Networks
Deep neural networks are powerful parametric models that can be trained ...

Neural Adaptive Sequential Monte Carlo
Sequential Monte Carlo (SMC), or particle filtering, is a popular class ...

Towards Deep Neural Network Architectures Robust to Adversarial Examples
Recent work has shown deep neural networks (DNNs) to be highly susceptib...
Shixiang Gu
Research Intern at Google, Ph.D. candidate and Research Assistant at University of Cambridge