Reinforcement learning from human feedback (RLHF) can improve the qualit...
StarCraft II is one of the most challenging simulated reinforcement lear...
This paper describes π2vec, a method for representing behaviors of black...
We show that the popular reinforcement learning (RL) strategy of estimat...
Standard dynamics models for continuous control make use of feedforward ...
Off-policy evaluation (OPE) holds the promise of being able to leverage ...
Offline reinforcement learning (RL purely from logged data) is an import...
Offline methods for reinforcement learning have the potential to help br...
Deep reinforcement learning has led to many recent and groundbreaking ad...
Gating mechanisms are widely used in neural network models, where they a...
This paper introduces R2D3, an agent that makes efficient use of demonst...
Humans are experts at high-fidelity imitation -- closely mimicking a dem...
Deep reinforcement learning methods traditionally struggle with tasks wh...
Convolutional autoregressive models have recently demonstrated state-of-...
Video object detection is challenging because objects that are easily de...
We consider the task of dimensional emotion recognition on video data us...
Despite being the appearance-based classifier of choice in recent years,...
Convolutional neural networks perform well on object recognition because...