Laurent Orseau

research

∙ 09/19/2023

Language Modeling Is Compression

It has long been established that predictive models can be transformed i...

0 Grégoire Delétang, et al. ∙

research

∙ 07/31/2023

Line Search for Convex Minimization

Golden-section search and bisection search are the two main principled a...

0 Laurent Orseau, et al. ∙

research

∙ 05/26/2023

Levin Tree Search with Context Models

Levin Tree Search (LTS) is a search algorithm that makes use of a policy...

0 Laurent Orseau, et al. ∙

research

∙ 02/06/2023

Memory-Based Meta-Learning on Non-Stationary Distributions

Memory-based meta-learning is a technique for approximating Bayes-optima...

0 Tim Genewein, et al. ∙

research

∙ 12/29/2021

Isotuning With Applications To Scale-Free Online Learning

We extend and combine several tools of the literature to design fast, ad...

5 Laurent Orseau, et al. ∙

research

∙ 12/20/2021

Proving Theorems using Incremental Learning and Hindsight Experience Replay

Traditional automated theorem provers for first-order logic depend on sp...

0 Eser Aygün, et al. ∙

research

∙ 03/21/2021

Policy-Guided Heuristic Search with Guarantees

The use of a policy and a heuristic function for guiding search can be q...

0 Laurent Orseau, et al. ∙

research

∙ 03/05/2021

Training a First-Order Theorem Prover from Synthetic Data

A major challenge in applying machine learning to automated theorem prov...

0 Vlad Firoiu, et al. ∙

research

∙ 10/15/2020

Avoiding Side Effects By Considering Future Tasks

Designing reward functions is difficult: the designer has to specify wha...

22 Victoria Krakovna, et al. ∙

research

∙ 06/22/2020

Logarithmic Pruning is All You Need

The Lottery Ticket Hypothesis is a conjecture that every large neural ne...

11 Laurent Orseau, et al. ∙

research

∙ 06/19/2020

Learning to Prove from Synthetic Theorems

A major challenge in applying machine learning to automated theorem prov...

0 Eser Aygün, et al. ∙

research

∙ 04/28/2020

Pitfalls of learning a reward function online

In some agent designs like inverse reinforcement learning an agent needs...

6 Stuart Armstrong, et al. ∙

research

∙ 07/30/2019

Iterative Budgeted Exponential Search

We tackle two long-standing problems related to re-expansions in heurist...

0 Malte Helmert, et al. ∙

research

∙ 06/07/2019

Zooming Cautiously: Linear-Memory Heuristic Search With Node Expansion Guarantees

We introduce and analyze two parameter-free linear-memory tree search al...

0 Laurent Orseau, et al. ∙

research

∙ 01/11/2019

An investigation of model-free planning

The field of reinforcement learning (RL) is facing increasingly challeng...

10 Arthur Guez, et al. ∙

research

∙ 01/08/2019

Soft-Bayes: Prod for Mixtures of Experts with Log-Loss

We consider prediction with expert advice under the log-loss with the go...

12 Laurent Orseau, et al. ∙

research

∙ 11/27/2018

Single-Agent Policy Tree Search With Guarantees

We introduce two novel tree search algorithms that use a policy to guide...

0 Laurent Orseau, et al. ∙

research

∙ 06/04/2018

Measuring and avoiding side effects using relative reachability

How can we design reinforcement learning agents that avoid causing unnec...

2 Victoria Krakovna, et al. ∙

research

∙ 05/31/2018

Agents and Devices: A Relative Definition of Agency

According to Dennett, the same system may be described using a `physical...

0 Laurent Orseau, et al. ∙

research

∙ 11/27/2017

AI Safety Gridworlds

We present a suite of reinforcement learning environments illustrating v...

0 Jan Leike, et al. ∙

research

∙ 05/23/2017

Reinforcement Learning with a Corrupted Reward Channel

No real-world reward function is perfect. Sensory errors and software bu...

0 Tom Everitt, et al. ∙

research

∙ 02/25/2016

Thompson Sampling is Asymptotically Optimal in General Environments

We discuss a variant of Thompson sampling for nonparametric reinforcemen...

0 Jan Leike, et al. ∙

