Marcus Hutter

research

∙ 09/19/2023

Language Modeling Is Compression

It has long been established that predictive models can be transformed i...

0 Grégoire Delétang, et al. ∙

research

∙ 07/31/2023

Line Search for Convex Minimization

Golden-section search and bisection search are the two main principled a...

0 Laurent Orseau, et al. ∙

research

∙ 06/09/2023

Combining a Meta-Policy and Monte-Carlo Planning for Scalable Type-Based Reasoning in Partially Observable Environments

The design of autonomous agents that can interact effectively with other...

0 Jonathon Schwartz, et al. ∙

research

∙ 05/26/2023

Levin Tree Search with Context Models

Levin Tree Search (LTS) is a search algorithm that makes use of a policy...

0 Laurent Orseau, et al. ∙

research

∙ 02/19/2023

Evaluating Representations with Readout Model Switching

Although much of the success of Deep Learning builds on learning good re...

0 Yazhe Li, et al. ∙

research

∙ 02/13/2023

Universal Agent Mixtures and the Geometry of Intelligence

Inspired by recent progress in multi-agent Reinforcement Learning (RL), ...

0 Samuel Allen Alexander, et al. ∙

research

∙ 02/06/2023

Memory-Based Meta-Learning on Non-Stationary Distributions

Memory-based meta-learning is a technique for approximating Bayes-optima...

0 Tim Genewein, et al. ∙

research

∙ 02/06/2023

U-Clip: On-Average Unbiased Stochastic Gradient Clipping

U-Clip is a simple amendment to gradient clipping that can be applied to...

0 Bryn Elesedy, et al. ∙

research

∙ 12/23/2022

Generalization Bounds for Transfer Learning with Pretrained Classifiers

We study the ability of foundation models to learn representations for c...

4 Tomer Galanti, et al. ∙

research

∙ 10/22/2022

Testing Independence of Exchangeable Random Variables

Given well-shuffled data, can we determine whether the data items are st...

0 Marcus Hutter, et al. ∙

research

∙ 10/14/2022

Sequential Learning Of Neural Networks for Prequential MDL

Minimum Description Length (MDL) provides a framework and an objective f...

0 Jörg Bornschein, et al. ∙

research

∙ 10/05/2022

Atari-5: Distilling the Arcade Learning Environment down to Five Games

The Arcade Learning Environment (ALE) has become an essential benchmark ...

0 Matthew Aitchison, et al. ∙

research

∙ 09/30/2022

Beyond Bayes-optimality: meta-learning what you know you don't know

Meta-training agents with memory has been shown to culminate in Bayes-op...

8 Jordi Grau-Moya, et al. ∙

research

∙ 07/19/2022

Formal Algorithms for Transformers

This document aims to be a self-contained, mathematically precise overvi...

18 Mary Phuong, et al. ∙

research

∙ 07/05/2022

Neural Networks and the Chomsky Hierarchy

Reliable generalization lies at the heart of safe ML and AI. However, un...

3 Grégoire Delétang, et al. ∙

research

∙ 06/02/2022

Uniqueness and Complexity of Inverse MDP Models

What is the action sequence aa'a" that was likely responsible for reachi...

1 Marcus Hutter, et al. ∙

research

∙ 12/30/2021

On the Role of Neural Collapse in Transfer Learning

We study the ability of foundation models to learn representations for c...

9 Tomer Galanti, et al. ∙

research

∙ 12/29/2021

Isotuning With Applications To Scale-Free Online Learning

We extend and combine several tools of the literature to design fast, ad...

5 Laurent Orseau, et al. ∙

research

∙ 12/26/2021

Reducing Planning Complexity of General Reinforcement Learning with Non-Markovian Abstractions

The field of General Reinforcement Learning (GRL) formulates the problem...

11 Sultan J. Majeed, et al. ∙

research

∙ 10/20/2021

Shaking the foundations: delusions in sequence models for interaction and control

The recent phenomenal success of language models has reinvigorated machi...

68 Pedro A. Ortega, et al. ∙

research

∙ 10/06/2021

Reward-Punishment Symmetric Universal Intelligence

Can an agent's intelligence level be negative? We extend the Legg-Hutter...

4 Samuel Allen Alexander, et al. ∙

research

∙ 09/30/2021

Reinforcement Learning with Information-Theoretic Actuation

Reinforcement Learning formalises an embodied agent's interaction with t...

16 Elliot Catt, et al. ∙

research

∙ 05/13/2021

Intelligence and Unambitiousness Using Algorithmic Information Theory

Algorithmic Information Theory has inspired intractable constructions of...

9 Michael K. Cohen, et al. ∙

research

∙ 02/17/2021

Fully General Online Imitation Learning

In imitation learning, imitators and demonstrators are policies for pick...

15 Michael K. Cohen, et al. ∙

research

∙ 02/08/2021

Learning Curve Theory

Recently a number of empirical "universal" scaling law papers have been ...

7 Marcus Hutter, et al. ∙

research

∙ 12/18/2020

Exact Reduction of Huge Action Spaces in General Reinforcement Learning

The reinforcement learning (RL) framework formalizes the notion of learn...

12 Sultan Javed Majeed, et al. ∙

research

∙ 11/18/2020

Counterfactual Credit Assignment in Model-Free Reinforcement Learning

Credit assignment in reinforcement learning is the problem of measuring ...

8 Thomas Mesnard, et al. ∙

research

∙ 10/23/2020

A Combinatorial Perspective on Transfer Learning

Human intelligence is characterized not only by the capacity to learn co...

98 Jianan Wang, et al. ∙

research

∙ 07/30/2020

On Representing (Anti)Symmetric Functions

Permutation-invariant, -equivariant, and -covariant functions and anti-s...

13 Marcus Hutter, et al. ∙

research

∙ 06/22/2020

Logarithmic Pruning is All You Need

The Lottery Ticket Hypothesis is a conjecture that every large neural ne...

11 Laurent Orseau, et al. ∙

research

∙ 06/15/2020

Pessimism About Unknown Unknowns Inspires Conservatism

If we could define the set of all bad outcomes, we could hard-code an ag...

8 Michael K. Cohen, et al. ∙

research

∙ 06/05/2020

Curiosity Killed the Cat and the Asymptotically Optimal Agent

Reinforcement learners are agents that learn to pick actions that lead t...

6 Michael K. Cohen, et al. ∙

research

∙ 02/21/2020

Online Learning in Contextual Bandits using Gated Linear Networks

We introduce a new and completely online contextual bandit algorithm cal...

1 Eren Sezener, et al. ∙

research

∙ 09/30/2019

Gated Linear Networks

This paper presents a family of backpropagation-free neural architecture...

38 Joel Veness, et al. ∙

research

∙ 08/13/2019

Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective

Can an arbitrarily intelligent reinforcement learning agent be kept unde...

3 Tom Everitt, et al. ∙

research

∙ 07/11/2019

Fairness without Regret

A popular approach of achieving fairness in optimization problems is by ...

7 Marcus Hutter, et al. ∙

research

∙ 05/29/2019

Asymptotically Unambitious Artificial General Intelligence

General intelligence, the ability to solve arbitrary solvable problems, ...

6 Michael K. Cohen, et al. ∙

research

∙ 05/28/2019

Conditions on Features for Temporal Difference-Like Methods to Converge

The convergence of many reinforcement learning (RL) algorithms with line...

4 Marcus Hutter, et al. ∙

research

∙ 03/04/2019

Strong Asymptotic Optimality in General Environments

Reinforcement Learning agents are expected to eventually perform well. T...

8 Michael K. Cohen, et al. ∙

research

∙ 11/09/2018

Performance Guarantees for Homomorphisms Beyond Markov Decision Processes

Most real-world problems have huge state and/or action spaces. Therefore...

2 Sultan Javed Majeed, et al. ∙

research

∙ 05/03/2018

AGI Safety Literature Review

The development of Artificial General Intelligence (AGI) promises to be ...

0 Tom Everitt, et al. ∙

research

∙ 08/13/2017

A Game-Theoretic Analysis of the Off-Switch Game

The off-switch game is a game theoretic model of a highly intelligent ro...

0 Tobias Wängberg, et al. ∙

research

∙ 06/25/2017

Count-Based Exploration in Feature Space for Reinforcement Learning

We introduce a new count-based optimistic exploration algorithm for Rein...

0 Jarryd Martin, et al. ∙

research

∙ 05/30/2017

Universal Reinforcement Learning Algorithms: Survey and Experiments

Many state-of-the-art reinforcement learning (RL) algorithms typically a...

0 John Aslanides, et al. ∙

research

∙ 05/23/2017

Reinforcement Learning with a Corrupted Reward Channel

No real-world reward function is perfect. Sensory errors and software bu...

0 Tom Everitt, et al. ∙

research

∙ 04/12/2016

Loss Bounds and Time Complexity for Speed Priors

This paper establishes for the first time the predictive performance of ...

0 Daniel Filan, et al. ∙

research

∙ 02/25/2016

Thompson Sampling is Asymptotically Optimal in General Environments

We discuss a variant of Thompson sampling for nonparametric reinforcemen...

0 Jan Leike, et al. ∙

research

∙ 10/19/2015

On the Computability of AIXI

How could we solve the machine learning and the artificial intelligence ...

0 Jan Leike, et al. ∙

research

∙ 10/16/2015

Bad Universal Priors and Notions of Optimality

A big open question of algorithmic information theory is the choice of t...

0 Jan Leike, et al. ∙

research

∙ 09/09/2015

A Topological Approach to Meta-heuristics: Analytical Results on the BFS vs. DFS Algorithm Selection Problem

Search is a central problem in artificial intelligence, and BFS and DFS ...

0 Tom Everitt, et al. ∙


