Risks from Learned Optimization in Advanced Machine Learning Systems

06/05/2019
by Evan Hubinger, et al.

We analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer, a situation we refer to as mesa-optimization, a neologism we introduce in this paper. We believe that the possibility of mesa-optimization raises two important questions for the safety and transparency of advanced machine learning systems. First, under what circumstances will learned models be optimizers, including when they should not be? Second, when a learned model is an optimizer, what will its objective be, how will it differ from the loss function it was trained under, and how can it be aligned? In this paper, we provide an in-depth analysis of these two primary questions and give an overview of topics for future research.

