Nando de Freitas

research

∙ 08/17/2023

Reinforced Self-Training (ReST) for Language Modeling

Reinforcement learning from human feedback (RLHF) can improve the qualit...

0 Caglar Gulcehre, et al. ∙

research

∙ 08/07/2023

AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning

StarCraft II is one of the most challenging simulated reinforcement lear...

0 Michael Mathieu, et al. ∙

research

∙ 05/05/2023

Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning

Standard approaches to sequential decision-making exploit an agent's abi...

6 Patrick Emedom-Nnamdi, et al. ∙

research

∙ 03/13/2023

Vision-Language Models as Success Detectors

Detecting successful behaviour is crucial for training intelligent agent...

7 Yuqing Du, et al. ∙

research

∙ 10/10/2022

Multi-step Planning for Automated Hyperparameter Optimization with OptFormer

As machine learning permeates more industries and models become more exp...

10 Lucio M. Dery, et al. ∙

research

∙ 05/26/2022

Towards Learning Universal Hyperparameter Optimizers with Transformers

Meta-learning hyperparameter optimization (HPO) algorithms from prior ex...

9 Yutian Chen, et al. ∙

research

∙ 05/12/2022

A Generalist Agent

Inspired by progress in large-scale language modeling, we apply a simila...

12 Scott Reed, et al. ∙

research

∙ 02/08/2022

Competition-Level Code Generation with AlphaCode

Programming is a powerful and ubiquitous problem-solving tool. Developin...

0 Yujia Li, et al. ∙

research

∙ 10/20/2021

Shaking the foundations: delusions in sequence models for interaction and control

The recent phenomenal success of language models has reinvigorated machi...

68 Pedro A. Ortega, et al. ∙

research

∙ 06/18/2021

Active Offline Policy Selection

This paper addresses the problem of policy selection in domains with abu...

12 Ksenia Konyushkova, et al. ∙

research

∙ 05/21/2021

On Instrumental Variable Regression for Deep Offline Policy Evaluation

We show that the popular reinforcement learning (RL) strategy of estimat...

26 Yutian Chen, et al. ∙

research

∙ 03/17/2021

Regularized Behavior Value Estimation

Offline reinforcement learning restricts the learning process to rely on...

8 Caglar Gulcehre, et al. ∙

research

∙ 12/12/2020

Semi-supervised reward learning for offline reinforcement learning

In offline reinforcement learning (RL) agents are trained using a logged...

12 Ksenia Konyushkova, et al. ∙

research

∙ 11/27/2020

Offline Learning from Demonstrations and Unlabeled Experience

Behavior cloning (BC) is often practical for robot learning because it a...

6 Konrad Zolna, et al. ∙

research

∙ 11/06/2020

Large-scale multilingual audio visual dubbing

We describe a system for large-scale audiovisual translation and dubbing...

3 Yi Yang, et al. ∙

research

∙ 10/14/2020

Learning Deep Features in Instrumental Variable Regression

Instrumental variable (IV) regression is a standard strategy for learnin...

21 Liyuan Xu, et al. ∙

research

∙ 07/27/2020

Learning Compositional Neural Programs for Continuous Control

We propose a novel solution to challenging sparse-reward, continuous con...

38 Thomas Pierrot, et al. ∙

research

∙ 07/17/2020

Hyperparameter Selection for Offline Reinforcement Learning

Offline reinforcement learning (RL purely from logged data) is an import...

30 Tom Le Paine, et al. ∙

research

∙ 06/26/2020

Critic Regularized Regression

Offline reinforcement learning (RL), also known as batch RL, offers the ...

32 Ziyu Wang, et al. ∙

research

∙ 06/24/2020

RL Unplugged: Benchmarks for Offline Reinforcement Learning

Offline methods for reinforcement learning have the potential to help br...

10 Caglar Gulcehre, et al. ∙

research

∙ 06/01/2020

Acme: A Research Framework for Distributed Reinforcement Learning

Deep reinforcement learning has led to many recent-and groundbreaking-ad...

22 Matt Hoffman, et al. ∙

research

∙ 10/02/2019

Task-Relevant Adversarial Imitation Learning

We show that a critical problem in adversarial imitation from high-dimen...

35 Konrad Zolna, et al. ∙

research

∙ 09/26/2019

A Framework for Data-Driven Robotics

We present a framework for data-driven robotics that makes use of a larg...

0 Serkan Cabi, et al. ∙

research

∙ 09/12/2019

Modular Meta-Learning with Shrinkage

Most gradient-based approaches to meta-learning do not explicitly accoun...

6 Yutian Chen, et al. ∙

research

∙ 09/03/2019

Making Efficient Use of Demonstrations to Solve Hard Exploration Problems

This paper introduces R2D3, an agent that makes efficient use of demonst...

10 Tom Le Paine, et al. ∙

research

∙ 05/30/2019

Learning Compositional Neural Programs with Recursive Tree Search and Planning

We propose a novel reinforcement learning algorithm, AlphaNPI, that inco...

7 Thomas Pierrot, et al. ∙

research

∙ 05/08/2019

Meta-learning of Sequential Strategies

In this report we review memory-based meta-learning as a tool for buildi...

16 Pedro A. Ortega, et al. ∙

research

∙ 12/17/2018

Bayesian Optimization in AlphaGo

During the development of AlphaGo, its many hyper-parameters were tuned ...

129 Yutian Chen, et al. ∙

research

∙ 10/19/2018

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

We propose a unified mechanism for achieving coordination and communicat...

0 Natasha Jaques, et al. ∙

research

∙ 10/19/2018

Intrinsic Social Motivation via Causal Influence in Multi-Agent RL

We derive a new intrinsic social motivation for multi-agent reinforcemen...

0 Natasha Jaques, et al. ∙

research

∙ 10/11/2018

One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL

Humans are experts at high-fidelity imitation -- closely mimicking a dem...

4 Tom Le Paine, et al. ∙

research

∙ 09/27/2018

Sample Efficient Adaptive Text-to-Speech

We present a meta-learning approach for adaptive text-to-speech (TTS) wi...

2 Yutian Chen, et al. ∙

research

∙ 07/13/2018

Large-Scale Visual Speech Recognition

This work presents a scalable solution to open-vocabulary visual speech ...

68 Brendan Shillingford, et al. ∙

research

∙ 05/29/2018

Playing hard exploration games by watching YouTube

Deep reinforcement learning methods traditionally struggle with tasks wh...

2 Yusuf Aytar, et al. ∙

research

∙ 05/24/2018

Hyperbolic Attention Networks

We introduce hyperbolic attention networks to endow neural networks with...

0 Caglar Gulcehre, et al. ∙

research

∙ 04/17/2018

Learning Awareness Models

We consider the setting of an agent with a fixed body interacting with a...

0 Brandon Amos, et al. ∙

research

∙ 04/06/2018

Compositional Obverter Communication Learning From Raw Visual Input

One of the distinguishing aspects of human language is its compositional...

0 Edward Choi, et al. ∙

research

∙ 02/26/2018

Reinforcement and Imitation Learning for Diverse Visuomotor Skills

We propose a model-free deep reinforcement learning method that leverage...

0 Yuke Zhu, et al. ∙

research

∙ 11/07/2017

Cortical microcircuits as gated-recurrent neural networks

Cortical circuits exhibit intricate recurrent architectures that are rem...

0 Rui Ponte Costa, et al. ∙

research

∙ 10/27/2017

Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions

Deep autoregressive models have shown state-of-the-art performance in de...

0 Scott Reed, et al. ∙

research

∙ 07/11/2017

The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously

This paper introduces the Intentional Unintentional (IU) agent. This age...

0 Serkan Cabi, et al. ∙

research

∙ 06/20/2017

Programmable Agents

We build deep RL agents that execute declarative programs expressed in f...

0 Misha Denil, et al. ∙

research

∙ 03/14/2017

Learned Optimizers that Scale and Generalize

Learning to learn has emerged as an important direction for achieving ar...

0 Olga Wichrowska, et al. ∙

research

∙ 03/10/2017

Parallel Multiscale Autoregressive Density Estimation

PixelCNN achieves state-of-the-art results in density estimation for nat...

0 Scott Reed, et al. ∙

research

∙ 11/11/2016

Learning to Learn without Gradient Descent by Gradient Descent

We learn recurrent neural network optimizers trained on simple synthetic...

0 Yutian Chen, et al. ∙

research

∙ 11/06/2016

Learning to Perform Physics Experiments via Deep Reinforcement Learning

When encountering novel objects, humans are able to infer a wide range o...

0 Misha Denil, et al. ∙

research

∙ 11/05/2016

LipNet: End-to-End Sentence-level Lipreading

Lipreading is the task of decoding text from the movement of a speaker's...

0 Yannis Assael, et al. ∙

research

∙ 06/14/2016

Learning to learn by gradient descent by gradient descent

The move from hand-designed features to learned features in machine lear...

0 Marcin Andrychowicz, et al. ∙

research

∙ 02/08/2016

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks

We propose deep distributed recurrent Q-networks (DDRQN), which enable t...

0 Jakob N. Foerster, et al. ∙

research

∙ 11/19/2015

Neural Programmer-Interpreters

We propose the neural programmer-interpreter (NPI): a recurrent and comp...

0 Scott Reed, et al. ∙

