Intelligence and Unambitiousness Using Algorithmic Information Theory

05/13/2021
by Michael K. Cohen, et al.

Algorithmic Information Theory has inspired intractable constructions of general intelligence (AGI), and undiscovered tractable approximations are likely feasible. Reinforcement Learning (RL), the dominant paradigm by which an agent might learn to solve arbitrary solvable problems, gives an agent a dangerous incentive: to gain arbitrary "power" in order to intervene in the provision of its own reward. We review the arguments that generally intelligent algorithmic-information-theoretic reinforcement learners such as Hutter's (2005) AIXI would seek arbitrary power, including over us. Then, using an information-theoretic exploration schedule and a setup inspired by causal influence theory, we present a variant of AIXI which learns not to seek arbitrary power; we call it "unambitious". We show that our agent learns to accrue reward at least as well as a human mentor, while relying on that mentor with diminishing probability. Given a formal assumption that we probe empirically, we also show that the agent's world-model eventually incorporates the following true fact: intervening in the "outside world" will have no effect on reward acquisition; hence, the agent has no incentive to shape the outside world.
