
Soft-Bayes: Prod for Mixtures of Experts with Log-Loss
We consider prediction with expert advice under the log-loss with the go...

Logarithmic Pruning is All You Need
The Lottery Ticket Hypothesis is a conjecture that every large neural ne...

An investigation of model-free planning
The field of reinforcement learning (RL) is facing increasingly challeng...

Pitfalls of learning a reward function online
In some agent designs like inverse reinforcement learning an agent needs...

Measuring and avoiding side effects using relative reachability
How can we design reinforcement learning agents that avoid causing unnec...

Reinforcement Learning with a Corrupted Reward Channel
No real-world reward function is perfect. Sensory errors and software bu...

Thompson Sampling is Asymptotically Optimal in General Environments
We discuss a variant of Thompson sampling for nonparametric reinforcemen...

AI Safety Gridworlds
We present a suite of reinforcement learning environments illustrating v...

Agents and Devices: A Relative Definition of Agency
According to Dennett, the same system may be described using a `physical...

Single-Agent Policy Tree Search With Guarantees
We introduce two novel tree search algorithms that use a policy to guide...

Zooming Cautiously: Linear-Memory Heuristic Search With Node Expansion Guarantees
We introduce and analyze two parameter-free linear-memory tree search al...

Iterative Budgeted Exponential Search
We tackle two long-standing problems related to re-expansions in heurist...

Learning to Prove from Synthetic Theorems
A major challenge in applying machine learning to automated theorem prov...
Laurent Orseau