The AGI Containment Problem

04/02/2016
by James Babcock, et al.

There is considerable uncertainty about what properties, capabilities, and motivations future AGIs will have. In some plausible scenarios, AGIs may pose security risks arising from accidents and defects. To mitigate these risks, prudent early AGI research teams will perform significant testing on their creations before use. Unfortunately, if an AGI has human-level or greater intelligence, testing itself may not be safe; some natural AGI goal systems create emergent incentives for AGIs to tamper with their test environments, make copies of themselves on the internet, or convince developers and operators to do dangerous things. In this paper, we survey the AGI containment problem: the question of how to build a container in which tests can be conducted safely and reliably, even on AGIs with unknown motivations and capabilities that could be dangerous. We identify requirements for AGI containers, available mechanisms, and weaknesses that need to be addressed.

research
03/20/2018

Challenges and Characteristics of Intelligent Autonomy for Internet of Battle Things in Highly Adversarial Environments

Numerous, artificially intelligent, networked things will populate the b...
research
05/24/2023

Model evaluation for extreme risks

Current approaches to building general-purpose AI systems tend to produc...
research
02/26/2019

Intelligent Autonomous Things on the Battlefield

Numerous, artificially intelligent, networked things will populate the b...
research
11/08/2018

Security Risk Assessment in Internet of Things Systems

Information security risk assessment methods have served us well over th...
research
06/21/2023

An Overview of Catastrophic AI Risks

Rapid advancements in artificial intelligence (AI) have sparked growing ...
research
09/03/2023

A Survey on What Developers Think About Testing

Software is infamous for its poor quality and frequent occurrence of bug...
research
07/08/2023

Typology of Risks of Generative Text-to-Image Models

This paper investigates the direct risks and harms associated with moder...
