ConQUR: Mitigating Delusional Bias in Deep Q-learning

02/27/2020
by Andy Su, et al.

Delusional bias is a fundamental source of error in approximate Q-learning. To date, the only techniques that explicitly address delusion require comprehensive search using tabular value estimates. In this paper, we develop efficient methods to mitigate delusional bias by training Q-approximators with labels that are "consistent" with the underlying greedy policy class. We introduce a simple penalization scheme that encourages Q-labels used across training batches to remain (jointly) consistent with the expressible policy class. We also propose a search framework that allows multiple Q-approximators to be generated and tracked, thus mitigating the effect of premature (implicit) policy commitments. Experimental results demonstrate that these methods can improve the performance of Q-learning in a variety of Atari games, sometimes dramatically.
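
To make the idea of a consistency penalty concrete, here is a minimal sketch of one plausible form it could take. The abstract does not spell out the penalty, so everything below is an assumption made for illustration: the hinge form, the function names `consistency_penalty` and `penalized_loss`, and the weight `lam` are all hypothetical. The sketch assumes that each earlier Q-label implicitly committed to some action being greedy at a state, and it penalizes the current approximator whenever another action's Q-value exceeds that of the committed action.

```python
import numpy as np


def consistency_penalty(q_values, committed_actions):
    """Hinge-style penalty (illustrative assumption, not the paper's exact definition).

    q_values:          array of shape (n_states, n_actions), current Q(s, a).
    committed_actions: array of shape (n_states,), the action that earlier
                       labels assumed to be greedy at each state.
    Returns the summed amount by which any action's Q-value exceeds the
    Q-value of the committed action.
    """
    q_values = np.asarray(q_values, dtype=float)
    committed_actions = np.asarray(committed_actions, dtype=int)
    rows = np.arange(q_values.shape[0])
    q_committed = q_values[rows, committed_actions]
    # Positive part of Q(s, a) - Q(s, a_committed); zero for the committed action itself.
    violations = np.maximum(q_values - q_committed[:, None], 0.0)
    return float(violations.sum())


def penalized_loss(td_loss, q_values, committed_actions, lam=0.5):
    """Regression loss plus a weighted penalty; `lam` is a hypothetical trade-off weight."""
    return td_loss + lam * consistency_penalty(q_values, committed_actions)
```

Under this sketch the penalty is zero exactly when every committed action is still greedy under the current Q-values, so the added term only pushes on parameters where the approximator contradicts the earlier labels.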
