
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
Most reinforcement learning (RL) algorithms assume online access to the ...
Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning
Reinforcement learning provides a general framework for learning robotic...
A Divergence Minimization Perspective on Imitation Learning Methods
In many settings, it is desirable to learn decision-making and control p...
Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
Hierarchical reinforcement learning has demonstrated significant success...
Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real
Manipulation and locomotion are closely related problems that are often ...
Dynamics-Aware Unsupervised Discovery of Skills
Conventionally, model-based reinforcement learning (MBRL) aims to learn ...
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Most deep reinforcement learning (RL) systems are not able to learn effe...
Language as an Abstraction for Hierarchical Deep Reinforcement Learning
Solving complex, temporally-extended tasks is a longstanding problem in...
Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives
Deep latent variable models have become a popular model choice due to th...
Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
We study the problem of representation learning in goal-conditioned hier...
The Mirage of Action-Dependent Baselines in Reinforcement Learning
Policy gradient methods are a widely used class of model-free reinforcem...
Temporal Difference Models: Model-Free Deep RL for Model-Based Control
Model-free reinforcement learning (RL) is a powerful, general tool for l...
Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning
Deep reinforcement learning algorithms can learn complex behavioral skil...
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
Off-policy model-free deep reinforcement learning methods using previous...
Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control
This paper proposes a general method for improving the structure and qua...
Categorical Reparameterization with Gumbel-Softmax
Categorical variables are a natural choice for representing discrete str...
Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
Reinforcement learning holds the promise of enabling autonomous robots t...
Continuous Deep Q-Learning with Model-based Acceleration
Model-free reinforcement learning has been successfully applied to a ran...
MuProp: Unbiased Backpropagation for Stochastic Neural Networks
Deep neural networks are powerful parametric models that can be trained ...
Neural Adaptive Sequential Monte Carlo
Sequential Monte Carlo (SMC), or particle filtering, is a popular class ...
Towards Deep Neural Network Architectures Robust to Adversarial Examples
Recent work has shown deep neural networks (DNNs) to be highly susceptib...
Shixiang Gu
Research Intern at Google, Ph.D. candidate and Research Assistant at University of Cambridge