Reward Reports for Reinforcement Learning

04/22/2022
by   Thomas Krendl Gilbert, et al.

The desire to build good systems in the face of complex societal effects requires a dynamic approach towards equity and access. Recent approaches to machine learning (ML) documentation have demonstrated the promise of discursive frameworks for deliberation about these complexities. However, these developments have been grounded in a static ML paradigm, leaving the role of feedback and post-deployment performance unexamined. Meanwhile, recent work in reinforcement learning design has shown that the effects of optimization objectives on the resulting system behavior can be wide-ranging and unpredictable. In this paper, we sketch a framework for documenting deployed learning systems, which we call Reward Reports. Taking inspiration from various contributions to the technical literature on reinforcement learning, we outline Reward Reports as living documents that track updates to the design choices and assumptions behind what a particular automated system is optimizing for. They are intended to track dynamic phenomena arising from system deployment, rather than merely static properties of models or data. After presenting the elements of a Reward Report, we provide three examples: DeepMind's MuZero, MovieLens, and a hypothetical deployment of a Project Flow traffic control policy.

research
02/11/2022

Choices, Risks, and Reward Reports: Charting Public Policy for Reinforcement Learning Systems

In the long term, reinforcement learning (RL) is considered by many AI t...
research
07/13/2022

From Design to Deployment of Zero-touch Deep Reinforcement Learning WLANs

Machine learning (ML) is increasingly used to automate networking tasks,...
research
06/03/2021

Hyperbolically-Discounted Reinforcement Learning on Reward-Punishment Framework

This paper proposes a new reinforcement learning with hyperbolic discoun...
research
09/10/2020

Importance Weighted Policy Learning and Adaption

The ability to exploit prior experience to solve novel problems rapidly ...
research
10/08/2018

Toward Understanding the Impact of Staleness in Distributed Machine Learning

Many distributed machine learning (ML) systems adopt the non-synchronous...
research
10/18/2022

Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity

Reinforcement learning provides an automated framework for learning beha...
