Ofir Nachum

research

∙ 09/18/2023

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

In this work, we present a scalable reinforcement learning method for tr...

0 Yevgen Chebotar, et al. ∙

research

∙ 06/26/2023

Supervised Pretraining Can Learn In-Context Reinforcement Learning

Large transformer models trained on diverse datasets have shown a remark...

0 Jonathan N. Lee, et al. ∙

research

∙ 05/26/2023

Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation

In recent years, domains such as natural language processing and image r...

0 David Brandfonbrener, et al. ∙

research

∙ 05/24/2023

Barkour: Benchmarking Animal-level Agility with Quadruped Robots

Animals have evolved various agile locomotion strategies, such as sprint...

2 Ken Caluwaerts, et al. ∙

research

∙ 05/19/2023

Multimodal Web Navigation with Instruction-Finetuned Foundation Models

The progress of autonomous web navigation has been hindered by the depen...

0 Hiroki Furuta, et al. ∙

research

∙ 03/07/2023

Foundation Models for Decision Making: Problems, Methods, and Opportunities

Foundation models pretrained on diverse data at scale have demonstrated ...

0 Sherry Yang, et al. ∙

research

∙ 01/31/2023

Learning Universal Policies via Text-Guided Video Generation

A goal of artificial intelligence is to construct an agent that can solv...

7 Yilun Du, et al. ∙

research

∙ 12/13/2022

RT-1: Robotics Transformer for Real-World Control at Scale

By transferring knowledge from large, diverse, task-agnostic datasets, m...

0 Anthony Brohan, et al. ∙

research

∙ 11/23/2022

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

Using massive datasets to train large-scale models has emerged as a domi...

0 David Venuto, et al. ∙

research

∙ 11/03/2022

Contrastive Value Learning: Implicit Models for Simple Offline RL

Model-based reinforcement learning (RL) methods are appealing in the off...

0 Bogdan Mazoure, et al. ∙

research

∙ 11/03/2022

Oracle Inequalities for Model Selection in Offline Reinforcement Learning

In offline reinforcement learning (RL), a learner leverages prior logged...

3 Jonathan N. Lee, et al. ∙

research

∙ 10/24/2022

Dichotomy of Control: Separating What You Can Control from What You Cannot

Future- or return-conditioned supervised learning is an emerging paradig...

0 Mengjiao Yang, et al. ∙

research

∙ 10/08/2022

Understanding HTML with Large Language Models

Large language models (LLMs) have shown exceptional performance on a var...

0 Izzeddin Gür, et al. ∙

research

∙ 07/27/2022

PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive Information Representations

Evolution Strategy (ES) algorithms have shown promising results in train...

0 Kuang-Huei Lee, et al. ∙

research

∙ 06/24/2022

Joint Representation Training in Sequential Tasks with Shared Structure

Classical theory in reinforcement learning (RL) predominantly focuses on...

0 Aldo Pacchiano, et al. ∙

research

∙ 05/31/2022

A Mixture-of-Expert Approach to RL-based Dialogue Management

Despite recent advancements in language models (LMs), their application ...

0 Yinlam Chow, et al. ∙

research

∙ 05/30/2022

Multi-Game Decision Transformers

A longstanding goal of the field of AI is a strategy for compiling diver...

4 Kuang-Huei Lee, et al. ∙

research

∙ 05/27/2022

Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters

Motivated by the success of ensembles for uncertainty estimation in supe...

0 Seyed Kamyar Seyed Ghasemipour, et al. ∙

research

∙ 05/22/2022

Chain of Thought Imitation with Procedure Cloning

Imitation learning aims to extract high-performance policies from logged...

0 Mengjiao Yang, et al. ∙

research

∙ 01/28/2022

Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error

In this work, we study the use of the Bellman equation as a surrogate ob...

0 Scott Fujimoto, et al. ∙

research

∙ 12/23/2021

Model Selection in Batch Policy Optimization

We study the problem of model selection in batch policy optimization: gi...

2 Jonathan N. Lee, et al. ∙

research

∙ 11/29/2021

Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions

Reinforcement learning (RL) agents are widely used for solving complex s...

0 Bogdan Mazoure, et al. ∙

research

∙ 10/27/2021

TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

The aim in imitation learning is to learn effective policies by utilizin...

0 Mengjiao Yang, et al. ∙

research

∙ 08/04/2021

Policy Gradients Incorporating the Future

Reasoning about the future – understanding how decisions in the present ...

0 David Venuto, et al. ∙

research

∙ 05/26/2021

Provable Representation Learning for Imitation with Contrastive Fourier Features

In imitation learning, it is common to learn a behavior policy to match ...

0 Ofir Nachum, et al. ∙

research

∙ 04/28/2021

Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization

Standard dynamics models for continuous control make use of feedforward ...

5 Michael R. Zhang, et al. ∙

research

∙ 03/30/2021

Benchmarks for Deep Off-Policy Evaluation

Off-policy evaluation (OPE) holds the promise of being able to leverage ...

13 Justin Fu, et al. ∙

research

∙ 03/23/2021

Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning

Progress in deep reinforcement learning (RL) research is largely enabled...

13 Hiroki Furuta, et al. ∙

research

∙ 03/17/2021

Near Optimal Policy Optimization via REPS

Since its introduction a decade ago, relative entropy policy search (REP...

0 Aldo Pacchiano, et al. ∙

research

∙ 03/14/2021

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Many modern approaches to offline Reinforcement Learning (RL) utilize be...

0 Ilya Kostrikov, et al. ∙

research

∙ 02/11/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

The recent success of supervised learning methods on ever larger offline...

10 Mengjiao Yang, et al. ∙

research

∙ 12/12/2020

Offline Policy Selection under Uncertainty

The presence of uncertainty in policy evaluation significantly complicat...

0 Mengjiao Yang, et al. ∙

research

∙ 10/26/2020

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

Reinforcement learning (RL) has achieved impressive performance in a var...

0 Anurag Ajay, et al. ∙

research

∙ 10/22/2020

CoinDICE: Off-Policy Confidence Interval Estimation

We study high-confidence behavior-agnostic off-policy evaluation in rein...

0 Bo Dai, et al. ∙

research

∙ 07/27/2020

Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation

In reinforcement learning, it is typical to use the empirically observed...

0 Ilya Kostrikov, et al. ∙

research

∙ 07/07/2020

Off-Policy Evaluation via the Regularized Lagrangian

The recently proposed distribution correction estimation (DICE) family o...

4 Mengjiao Yang, et al. ∙

research

∙ 06/24/2020

RL Unplugged: Benchmarks for Offline Reinforcement Learning

Offline methods for reinforcement learning have the potential to help br...

10 Caglar Gulcehre, et al. ∙

research

∙ 06/05/2020

Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization

Most reinforcement learning (RL) algorithms assume online access to the ...

10 Tatsuya Matsushima, et al. ∙

research

∙ 04/15/2020

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

The offline reinforcement learning (RL) problem, also referred to as bat...

41 Justin Fu, et al. ∙

research

∙ 04/15/2020

Datasets for Data-Driven Reinforcement Learning

The offline reinforcement learning (RL) problem, also referred to as bat...

6 Justin Fu, et al. ∙

research

∙ 02/08/2020

BRPO: Batch Residual Policy Optimization

In batch reinforcement learning (RL), one often constrains a learned pol...

11 Sungryull Sohn, et al. ∙

research

∙ 01/07/2020

Reinforcement Learning via Fenchel-Rockafellar Duality

We review basic concepts of convex duality, focusing on the very general...

0 Ofir Nachum, et al. ∙

research

∙ 12/10/2019

Imitation Learning via Off-Policy Distribution Matching

When performing imitation learning from expert demonstrations, distribut...

0 Ilya Kostrikov, et al. ∙

research

∙ 12/04/2019

AlgaeDICE: Policy Gradient from Arbitrary Experience

In many real-world applications of reinforcement learning (RL), interact...

0 Ofir Nachum, et al. ∙

research

∙ 11/26/2019

Behavior Regularized Offline Reinforcement Learning

In reinforcement learning (RL) research, it is common to assume access t...

28 Yifan Wu, et al. ∙

research

∙ 10/04/2019

Group-based Fair Learning Leads to Counter-intuitive Predictions

A number of machine learning (ML) methods have been proposed recently to...

0 Ofir Nachum, et al. ∙

research

∙ 09/23/2019

Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?

Hierarchical reinforcement learning has demonstrated significant success...

14 Ofir Nachum, et al. ∙

research

∙ 08/13/2019

Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real

Manipulation and locomotion are closely related problems that are often ...

7 Ofir Nachum, et al. ∙

research

∙ 06/10/2019

DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

In many real-world reinforcement learning applications, access to the en...

0 Ofir Nachum, et al. ∙

research

∙ 06/06/2019

DeepMDP: Learning Continuous Latent Space Models for Representation Learning

Many reinforcement learning (RL) tasks provide the agent with high-dimen...

4 Carles Gelada, et al. ∙

