Accounting for the Sequential Nature of States to Learn Features for Reinforcement Learning

05/12/2022
by Nathan Michlo, et al.

In this work, we investigate the properties of data that cause popular representation learning approaches to fail. In particular, we find that in environments where states do not significantly overlap, variational autoencoders (VAEs) fail to learn useful features. We demonstrate this failure in a simple gridworld domain, and then provide a solution in the form of metric learning. However, metric learning requires supervision in the form of a distance function, which is absent in reinforcement learning. To overcome this, we leverage the sequential nature of states in a replay buffer to approximate a distance metric and provide a weak supervision signal, under the assumption that temporally close states are also semantically similar. We modify a VAE with a triplet loss and demonstrate that this approach learns useful features for downstream tasks, without additional supervision, in environments where standard VAEs fail.
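
The weak supervision signal described above can be illustrated with a small sketch. It is not the authors' implementation: the helper names (`sample_temporal_triplet`, `triplet_loss`), the `max_pos_offset` window, and the margin value are all assumptions made for illustration. The idea is to treat states a few steps apart in a trajectory as "similar" (positive) and states farther apart as "dissimilar" (negative), then penalise latent codes that violate that ordering.

```python
import math
import random


def sample_temporal_triplet(num_states, rng, max_pos_offset=2):
    """Pick (anchor, positive, negative) indices from one stored trajectory.

    Assumption (not from the paper): the positive lies within
    `max_pos_offset` steps of the anchor, the negative strictly farther away.
    """
    if num_states < max_pos_offset + 3:
        raise ValueError("trajectory too short to sample a triplet")
    while True:
        a = rng.randrange(num_states)
        pos = [i for i in range(num_states) if 0 < abs(i - a) <= max_pos_offset]
        neg = [i for i in range(num_states) if abs(i - a) > max_pos_offset]
        if pos and neg:  # retry if the anchor has no valid negative
            return a, rng.choice(pos), rng.choice(neg)


def euclidean(u, v):
    """Euclidean distance between two latent vectors given as lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet_loss(z_a, z_p, z_n, margin=1.0):
    """Hinge-style triplet loss on latent codes (e.g. VAE posterior means)."""
    return max(0.0, euclidean(z_a, z_p) - euclidean(z_a, z_n) + margin)


# Usage: indices come from temporal order; latents would come from the encoder.
rng = random.Random(0)
a, p, n = sample_temporal_triplet(10, rng)
loss = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
```

In a full model, a term like this would presumably be added to the usual VAE objective (reconstruction plus KL) with some weighting; the abstract does not specify that weighting, so it is left out here.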


research
02/13/2018

TVAE: Triplet-Based Variational Autoencoder using Metric Learning

Deep metric learning has been demonstrated to be highly effective in lea...
research
06/14/2021

Self-Supervised Metric Learning in Multi-View Data: A Downstream Task Perspective

Self-supervised metric learning has been a successful approach for learn...
research
01/13/2021

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

Reinforcement learning methods trained on few environments rarely learn ...
research
04/28/2022

Mixup-based Deep Metric Learning Approaches for Incomplete Supervision

Deep learning architectures have achieved promising results in different...
research
10/22/2019

An Empirical Study on Learning Fairness Metrics for COMPAS Data with Human Supervision

The notion of individual fairness requires that similar people receive s...
research
07/11/2020

ECML: An Ensemble Cascade Metric Learning Mechanism towards Face Verification

Face verification can be regarded as a 2-class fine-grained visual recog...
