Satinder Singh

research

∙ 08/17/2023

Diversifying AI: Towards Creative Chess with AlphaZero

In recent years, Artificial Intelligence (AI) systems have surpassed hum...

0 Tom Zahavy, et al. ∙

research

∙ 03/07/2023

Structured State Space Models for In-Context Reinforcement Learning

Structured state space sequence (S4) models have recently achieved state...

0 Chris Lu, et al. ∙

research

∙ 02/28/2023

Hierarchical Reinforcement Learning in Complex 3D Environments

Hierarchical Reinforcement Learning (HRL) agents have the potential to d...

0 Bernardo Ávila Pires, et al. ∙

research

∙ 01/28/2023

Composing Task Knowledge with Modular Successor Feature Approximators

Recently, the Successor Features and Generalized Policy Improvement (SF ...

0 Wilka Carvalho, et al. ∙

research

∙ 12/30/2022

POMRL: No-Regret Learning-to-Plan with Increasing Horizons

We study the problem of planning under model uncertainty in an online me...

0 Khimya Khetarpal, et al. ∙

research

∙ 10/30/2022

Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the...

0 Dilip Arumugam, et al. ∙

research

∙ 10/25/2022

In-context Reinforcement Learning with Algorithm Distillation

We propose Algorithm Distillation (AD), a method for distilling reinforc...

1 Michael (Misha) Laskin, et al. ∙

research

∙ 10/19/2022

Palm up: Playing in the Latent Manifold for Unsupervised Pretraining

Large and diverse datasets have been the cornerstones of many impressive...

0 Hao Liu, et al. ∙

research

∙ 06/30/2022

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

We introduce DeepNash, an autonomous agent capable of learning to play t...

6 Julien Perolat, et al. ∙

research

∙ 02/08/2022

GrASP: Gradient-Based Affordance Selection for Planning

Planning with a learned model is arguably a key component of intelligenc...

0 Vivek Veeriah, et al. ∙

research

∙ 11/01/2021

On the Expressivity of Markov Reward

Reward is the driving force for reinforcement-learning agents. This pape...

15 David Abel, et al. ∙

research

∙ 09/09/2021

Bootstrapped Meta-Learning

Meta-learning empowers artificial intelligence to increase its efficienc...

23 Sebastian Flennerhag, et al. ∙

research

∙ 06/01/2021

Reward is enough for convex MDPs

Maximising a cumulative reward function that is Markov and stationary, i...

0 Tom Zahavy, et al. ∙

research

∙ 02/25/2021

Reinforcement Learning of Implicit and Explicit Control Flow in Instructions

Learning to flexibly follow task instructions in dynamic environments po...

21 Ethan A. Brooks, et al. ∙

research

∙ 02/12/2021

Discovery of Options via Meta-Learned Subgoals

Temporal abstractions in the form of options have been shown to help rei...

5 Vivek Veeriah, et al. ∙

research

∙ 02/09/2021

Pairwise Weights for Temporal Credit Assignment

How much credit (or blame) should an action taken in a state get for a f...

0 Zeyu Zheng, et al. ∙

research

∙ 02/09/2021

Learning State Representations from Random Deep Action-conditional Predictions

In this work, we study auxiliary prediction tasks defined by temporal-di...

0 Zeyu Zheng, et al. ∙

research

∙ 12/14/2020

Efficient Querying for Cooperative Probabilistic Commitments

Multiagent systems can use commitments as the core of a general coordina...

0 Qi Zhang, et al. ∙

research

∙ 10/28/2020

Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments

First-person object-interaction tasks in high-fidelity, 3D, simulated en...

10 Wilka Carvalho, et al. ∙

research

∙ 07/17/2020

Discovering Reinforcement Learning Algorithms

Reinforcement learning (RL) algorithms update an agent's parameters acco...

72 Junhyuk Oh, et al. ∙

research

∙ 07/16/2020

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Deep reinforcement learning includes a broad family of algorithms that p...

9 Zhongwen Xu, et al. ∙

research

∙ 06/08/2020

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Recent advances in deep reinforcement learning (RL) have led to consider...

0 Thomas Anthony, et al. ∙

research

∙ 02/28/2020

Self-Tuning Deep Reinforcement Learning

Reinforcement learning (RL) algorithms often require expensive manual or...

20 Tom Zahavy, et al. ∙

research

∙ 12/15/2019

How Should an Agent Practice?

We present a method for learning intrinsic reward functions to drive the...

17 Janarthanan Rajendran, et al. ∙

research

∙ 12/11/2019

What Can Learned Intrinsic Rewards Capture?

Reinforcement learning agents can include different components, such as ...

25 Zeyu Zheng, et al. ∙

research

∙ 12/05/2019

Hindsight Credit Assignment

We consider the problem of efficient credit assignment in reinforcement ...

0 Anna Harutyunyan, et al. ∙

research

∙ 11/25/2019

Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem

Order dispatching and driver repositioning (also known as fleet manageme...

0 John Holler, et al. ∙

research

∙ 10/31/2019

Object-oriented state editing for HRL

We introduce agents that use object-oriented reasoning to consider alter...

0 Victor Bapst, et al. ∙

research

∙ 10/23/2019

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

Reinforcement learning (RL) methods have been shown to be capable of lea...

20 Aditya Modi, et al. ∙

research

∙ 09/10/2019

Discovery of Useful Questions as Auxiliary Tasks

Arguably, intelligent agents ought to be able to discover their own ques...

7 Vivek Veeriah, et al. ∙

research

∙ 09/04/2019

No Press Diplomacy: Modeling Multi-Agent Gameplay

Diplomacy is a seven-player non-stochastic, non-cooperative game, where ...

4 Philip Paquette, et al. ∙

research

∙ 08/09/2019

Behaviour Suite for Reinforcement Learning

This paper introduces the Behaviour Suite for Reinforcement Learning, or...

2 Ian Osband, et al. ∙

research

∙ 01/24/2019

Learning Independently-Obtainable Reward Functions

We present a novel method for learning a set of disentangled reward func...

0 Christopher Grimm, et al. ∙

research

∙ 12/03/2018

Generative Adversarial Self-Imitation Learning

This paper explores a simple regularizer for reinforcement learning by p...

0 Yijie Guo, et al. ∙

research

∙ 08/24/2018

Learning End-to-End Goal-Oriented Dialog with Multiple Answers

In a dialog, there can be multiple valid next utterances at any point. T...

0 Janarthanan Rajendran, et al. ∙

research

∙ 06/22/2018

Many-Goals Reinforcement Learning

All-goals updating exploits the off-policy nature of Q-learning to updat...

0 Vivek Veeriah, et al. ∙

research

∙ 06/14/2018

Self-Imitation Learning

This paper proposes Self-Imitation Learning (SIL), a simple off-policy a...

0 Junhyuk Oh, et al. ∙

research

∙ 04/22/2018

Named Entities troubling your Neural Methods? Build NE-Table: A neural approach for handling Named Entities

Many natural language processing tasks require dealing with Named Entiti...

0 Janarthanan Rajendran, et al. ∙

research

∙ 04/17/2018

On Learning Intrinsic Rewards for Policy Gradient Methods

In many sequential decision making tasks, it is challenging to design re...

0 Zeyu Zheng, et al. ∙

research

∙ 03/08/2018

The Advantage of Doubling: A Deep Reinforcement Learning Approach to Studying the Double Team in the NBA

During the 2017 NBA playoffs, Celtics coach Brad Stevens was faced with ...

0 Jiaxuan Wang, et al. ∙

research

∙ 11/15/2017

Markov Decision Processes with Continuous Side Information

We consider a reinforcement learning (RL) setting in which the agent int...

0 Aditya Modi, et al. ∙

research

∙ 07/11/2017

Value Prediction Network

This paper proposes a novel deep reinforcement learning (RL) architectur...

0 Junhyuk Oh, et al. ∙

research

∙ 06/15/2017

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning

As a step towards developing zero-shot task generalization capabilities ...

0 Junhyuk Oh, et al. ∙

research

∙ 05/15/2017

Repeated Inverse Reinforcement Learning

We introduce a novel repeated Inverse Reinforcement Learning problem: th...

0 Kareem Amin, et al. ∙

research

∙ 05/30/2016

Control of Memory, Active Perception, and Action in Minecraft

In this paper, we introduce a new set of reinforcement learning (RL) tas...

0 Junhyuk Oh, et al. ∙

research

∙ 01/25/2016

Towards Resolving Unidentifiability in Inverse Reinforcement Learning

We consider a setting for Inverse Reinforcement Learning (IRL) where the...

0 Kareem Amin, et al. ∙

research

∙ 07/31/2015

Action-Conditional Video Prediction using Deep Networks in Atari Games

Motivated by vision-based reinforcement learning (RL) problems, in parti...

0 Junhyuk Oh, et al. ∙

research

∙ 01/16/2014

Learning to Make Predictions In Partially Observable Environments Without a Generative Model

When faced with the problem of learning a model of a high-dimensional en...

0 Erik Talvitie, et al. ∙

research

∙ 01/23/2013

Approximate Planning for Factored POMDPs using Belief State Simplification

We are interested in the problem of planning for factored POMDPs. Buildi...

0 David A. McAllester, et al. ∙

research

∙ 01/23/2013

On the Complexity of Policy Iteration

Decision-making problems in uncertain or stochastic domains are often fo...

0 Yishay Mansour, et al. ∙
