Is Power-Seeking AI an Existential Risk?

06/16/2022
by Joseph Carlsmith

This report examines what I see as the core argument for concern about existential risk from misaligned artificial intelligence. I proceed in two stages. First, I lay out a backdrop picture that informs such concern. On this picture, intelligent agency is an extremely powerful force, and creating agents much more intelligent than us is playing with fire – especially given that if their objectives are problematic, such agents would plausibly have instrumental incentives to seek power over humans. Second, I formulate and evaluate a more specific six-premise argument that creating agents of this kind will lead to existential catastrophe by 2070. On this argument, by 2070: (1) it will become possible and financially feasible to build relevantly powerful and agentic AI systems; (2) there will be strong incentives to do so; (3) it will be much harder to build aligned (and relevantly powerful/agentic) AI systems than to build misaligned (and relevantly powerful/agentic) AI systems that are still superficially attractive to deploy; (4) some such misaligned systems will seek power over humans in high-impact ways; (5) this problem will scale to the full disempowerment of humanity; and (6) such disempowerment will constitute an existential catastrophe. I assign rough subjective credences to the premises in this argument, and I end up with an overall estimate of roughly 5% that an existential catastrophe of this kind will occur by 2070. (May 2022 update: since making this report public in April 2021, my estimate here has gone up, and is now at >10%.)
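The headline ~5% figure comes from multiplying together the credences assigned to the six premises, each taken conditional on the premises before it. A minimal sketch of that calculation is below; the numbers are illustrative placeholders chosen only to land near ~5%, not necessarily the exact credences given in the full report.

```python
# Sketch of the "multiply the conditional premise credences" structure.
# Credence values are illustrative placeholders, not quoted from the report.
premise_credences = {
    "1. Powerful, agentic AI is possible and financially feasible by 2070": 0.65,
    "2. There are strong incentives to build such systems": 0.80,
    "3. Aligned systems are much harder to build than deployable misaligned ones": 0.40,
    "4. Some misaligned systems seek power over humans in high-impact ways": 0.65,
    "5. The problem scales to the full disempowerment of humanity": 0.40,
    "6. Such disempowerment constitutes an existential catastrophe": 0.95,
}

p_catastrophe = 1.0
for premise, credence in premise_credences.items():
    # Each credence is read as conditional on all earlier premises holding,
    # so the product gives the unconditional probability of the conclusion.
    p_catastrophe *= credence

print(f"Overall estimate: {p_catastrophe:.1%}")  # ~5.1% with these placeholders
```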

