George Tucker

research

∙ 12/21/2022

Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios

Imitation learning (IL) is a simple and powerful way to use high-quality...

0 Yiren Lu, et al. ∙

research

∙ 11/28/2022

Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes

The potential of offline reinforcement learning (RL) is that high-capaci...

0 Aviral Kumar, et al. ∙

research

∙ 11/03/2022

Oracle Inequalities for Model Selection in Offline Reinforcement Learning

In offline reinforcement learning (RL), a learner leverages prior logged...

3 Jonathan N. Lee, et al. ∙

research

∙ 12/23/2021

Model Selection in Batch Policy Optimization

We study the problem of model selection in batch policy optimization: gi...

2 Jonathan N. Lee, et al. ∙

research

∙ 12/09/2021

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization

Despite overparameterization, deep networks trained via supervised learn...

0 Aviral Kumar, et al. ∙

research

∙ 06/15/2021

Coupled Gradient Estimators for Discrete Latent Variables

Training models with discrete latent variables is challenging due to the...

0 Zhe Dong, et al. ∙

research

∙ 04/28/2021

Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization

Standard dynamics models for continuous control make use of feedforward ...

5 Michael R. Zhang, et al. ∙

research

∙ 03/30/2021

Benchmarks for Deep Off-Policy Evaluation

Off-policy evaluation (OPE) holds the promise of being able to leverage ...

13 Justin Fu, et al. ∙

research

∙ 12/12/2020

Offline Policy Selection under Uncertainty

The presence of uncertainty in policy evaluation significantly complicat...

0 Mengjiao Yang, et al. ∙

research

∙ 06/24/2020

RL Unplugged: Benchmarks for Offline Reinforcement Learning

Offline methods for reinforcement learning have the potential to help br...

10 Caglar Gulcehre, et al. ∙

research

∙ 06/18/2020

DisARM: An Antithetic Gradient Estimator for Binary Latent Variables

Training models with discrete latent variables is challenging due to the...

0 Zhe Dong, et al. ∙

research

∙ 06/08/2020

Conservative Q-Learning for Offline Reinforcement Learning

Effectively leveraging large, previously collected datasets in reinforce...

0 Aviral Kumar, et al. ∙

research

∙ 05/04/2020

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

In this tutorial article, we aim to provide the reader with the conceptu...

29 Sergey Levine, et al. ∙

research

∙ 04/15/2020

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

The offline reinforcement learning (RL) problem, also referred to as bat...

41 Justin Fu, et al. ∙

research

∙ 04/15/2020

Datasets for Data-Driven Reinforcement Learning

The offline reinforcement learning (RL) problem, also referred to as bat...

6 Justin Fu, et al. ∙

research

∙ 12/09/2019

Meta-Learning without Memorization

The ability to learn new concepts with small amounts of data is a critic...

19 Mingzhang Yin, et al. ∙

research

∙ 11/26/2019

Behavior Regularized Offline Reinforcement Learning

In reinforcement learning (RL) research, it is common to assume access t...

28 Yifan Wu, et al. ∙

research

∙ 11/06/2019

Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse

Posterior collapse in Variational Autoencoders (VAEs) arises when the va...

33 James Lucas, et al. ∙

research

∙ 10/31/2019

Energy-Inspired Models: Learning with Sampler-Induced Distributions

Energy-based models (EBMs) are powerful probabilistic models, but suffer...

18 Dieterich Lawson, et al. ∙

research

∙ 06/16/2019

Reinforcement Learning Driven Heuristic Optimization

Heuristic algorithms such as simulated annealing, Concorde, and METIS ar...

8 Qingpeng Cai, et al. ∙

research

∙ 06/03/2019

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

Off-policy reinforcement learning aims to leverage experience collected ...

7 Aviral Kumar, et al. ∙

research

∙ 05/16/2019

On Variational Bounds of Mutual Information

Estimating and optimizing Mutual Information (MI) is core to many proble...

8 Ben Poole, et al. ∙

research

∙ 03/01/2019

Model-Based Reinforcement Learning for Atari

Model-free reinforcement learning (RL) can be used to learn effective po...

20 Łukasz Kaiser, et al. ∙

research

∙ 12/26/2018

Learning to Walk via Deep Reinforcement Learning

Deep reinforcement learning suggests the promise of fully automated lear...

30 Tuomas Haarnoja, et al. ∙

research

∙ 12/13/2018

Soft Actor-Critic Algorithms and Applications

Model-free deep reinforcement learning (RL) algorithms have been success...

12 Tuomas Haarnoja, et al. ∙

research

∙ 10/10/2018

The Laplacian in RL: Learning Representations with Efficient Approximations

The smallest eigenvectors of the graph Laplacian are well-known to provi...

18 Yifan Wu, et al. ∙

research

∙ 10/09/2018

Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives

Deep latent variable models have become a popular model choice due to th...

18 George Tucker, et al. ∙

research

∙ 07/04/2018

Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

Integrating model-free and model-based approaches in reinforcement learn...

6 Jacob Buckman, et al. ∙

research

∙ 06/26/2018

Guided evolutionary strategies: escaping the curse of dimensionality in random search

Many applications in machine learning require optimizing a function whos...

2 Niru Maheswaranathan, et al. ∙

research

∙ 03/06/2018

Smoothed Action Value Functions for Learning Gaussian Policies

State-action value functions (i.e., Q-values) are ubiquitous in reinforc...

0 Ofir Nachum, et al. ∙

research

∙ 02/27/2018

The Mirage of Action-Dependent Baselines in Reinforcement Learning

Policy gradient methods are a widely used class of model-free reinforcem...

0 George Tucker, et al. ∙

research

∙ 02/26/2018

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

Recent advances in deep reinforcement learning have made significant str...

0 Carlos Riquelme, et al. ∙

research

∙ 06/16/2017

An online sequence-to-sequence model for noisy speech recognition

Generative models have long been the dominant approach for speech recogn...

0 Chung-Cheng Chiu, et al. ∙

research

∙ 05/25/2017

Filtering Variational Objectives

When used as a surrogate objective for maximum likelihood estimation in ...

0 Chris J. Maddison, et al. ∙

research

∙ 05/16/2017

Learning Hard Alignments with Variational Inference

There has recently been significant interest in hard attention models fo...

0 Dieterich Lawson, et al. ∙

research

∙ 05/05/2017

Max-Pooling Loss Training of Long Short-Term Memory Networks for Small-Footprint Keyword Spotting

We propose a max-pooling based loss function for training Long Short-Ter...

0 Ming Sun, et al. ∙

research

∙ 03/21/2017

REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models

Learning in models with discrete latent variables is challenging due to ...

0 George Tucker, et al. ∙

research

∙ 01/23/2017

Regularizing Neural Networks by Penalizing Confident Output Distributions

We systematically explore regularizing neural networks by penalizing low...

0 Gabriel Pereyra, et al. ∙

research

∙ 11/18/2016

Compacting Neural Network Classifiers via Dropout Training

We introduce dropout compaction, a novel method for training feed-forwar...

0 Yotaro Kubo, et al. ∙

