Our goal is for robots to follow natural language instructions like "put...
We present a novel observation about the behavior of offline reinforceme...
We propose Heuristic Blending (HUBL), a simple performance-improving tec...
We study a new paradigm for sequential decision making, called offline P...
A rich representation is key to general robotic manipulation, but existi...
We propose a novel model-based offline Reinforcement Learning (RL) frame...
Real-world reinforcement learning (RL) is often severely limited since t...
We propose a new model-based offline RL framework, called Adversarial Mo...
Simulated humanoids are an appealing research domain due to their physic...
We develop a reinforcement learning (RL) framework for applications that...
We study lifelong reinforcement learning (RL) in a regret minimization s...
We propose Adversarially Trained Actor Critic (ATAC), a new model-free a...
Many sequential decision problems involve finding a policy that maximize...
The use of pessimism, when reasoning about datasets lacking exhaustive e...
We provide a framework for accelerating reinforcement learning (RL) algo...
Policy optimization methods are popular reinforcement learning algorithm...
We consider the problem of learning motion policies for acceleration-bas...
Generating robot motion for multiple tasks in dynamic environments is ch...
Online policy optimization (OPO) views policy optimization for sequentia...
Despite its promise, reinforcement learning's real-world adoption has be...
Predicting calibrated confidence scores for multi-class deep networks is...
Online learning is a powerful tool for analyzing iterative algorithms. H...
We present a reduction from reinforcement learning (RL) to no-regret onl...
RMPflow is a recently proposed policy-fusion framework based on differen...
Policy gradient methods have demonstrated success in reinforcement learn...
Robotic systems often need to consider multiple tasks concurrently. This...
Model predictive control (MPC) is a powerful technique for solving dynam...
We study the dynamic regret of a new class of online learning problems, ...
We develop a novel policy synthesis algorithm, RMPflow, based on geometr...
Bilevel optimization has been recently revisited for designing and analy...
We present a predictor-corrector framework, called PicCoLO, that can tra...
Gaussian processes (GPs) provide a powerful non-parametric framework for...
Sample efficiency is critical in solving real-world reinforcement learni...
Imitation learning (IL) consists of a set of tools that leverage expert ...
Value aggregation is a general framework for solving imitation learning ...
Large-scale Gaussian process inference has long faced practical challeng...