Identification and Off-Policy Learning of Multiple Objectives Using Adaptive Clustering

In this work, we present a methodology that enables an agent to make efficient use of its exploratory actions by autonomously identifying possible objectives in its environment and learning them in parallel. Objectives are identified using an online, unsupervised adaptive clustering algorithm. The identified objectives are then learned, at least partially, in parallel using Q-learning. Using a simulated agent and environment, we show that the converged or partially converged value function weights resulting from off-policy learning can be used to accumulate knowledge about multiple objectives without any additional exploration. We argue that the proposed approach could be useful in scenarios where the objectives are initially unknown, or in real-world scenarios where exploration is typically time- and energy-intensive. We also briefly discuss the implications and possible extensions of this work.
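The idea of learning several objectives from a single stream of exploratory actions can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical six-state corridor, uses tabular Q-learning in place of value function weights, and takes the objectives as a fixed tuple of goal states (`goal_states`) rather than discovering them by adaptive clustering. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def learn_objectives_in_parallel(n_states=6, goal_states=(0, 5), n_steps=5000,
                                 alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning for several objectives from one random walk.

    Assumptions (not from the source): a 1-D corridor with actions
    0 = left and 1 = right, and one objective per goal state. In the
    described method the objectives would come from clustering instead.
    """
    rng = np.random.default_rng(seed)
    n_actions = 2
    # One Q-table per objective, all updated from the same experience.
    q = np.zeros((len(goal_states), n_states, n_actions))
    state = n_states // 2
    for _ in range(n_steps):
        # Purely exploratory behaviour policy: uniform random actions.
        action = int(rng.integers(n_actions))
        step = -1 if action == 0 else 1
        next_state = int(np.clip(state + step, 0, n_states - 1))
        # Off-policy update of every objective with the same transition.
        for k, goal in enumerate(goal_states):
            reached = next_state == goal
            reward = 1.0 if reached else 0.0
            bootstrap = 0.0 if reached else gamma * q[k, next_state].max()
            q[k, state, action] += alpha * (reward + bootstrap - q[k, state, action])
        state = next_state
    return q

if __name__ == "__main__":
    q = learn_objectives_in_parallel()
    for k in range(q.shape[0]):
        # Greedy action per state for objective k (0 = left, 1 = right).
        print("objective", k, "greedy actions:", q[k].argmax(axis=1))
```

Because Q-learning is off-policy, each table is updated toward its own greedy target even though the behaviour is random, so one exploratory run yields a usable value estimate for every objective in this toy setting.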

research
09/10/2019

Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning

Prior access to domain knowledge could significantly improve the perform...
research
04/21/2017

Multi-Objective Deep Q-Learning with Subsumption Architecture

In this work we present a method for using Deep Q-Networks (DQNs) in mul...
research
11/18/2018

Self-Organizing Maps for Storage and Transfer of Knowledge in Reinforcement Learning

The idea of reusing or transferring information from previously learned ...
research
02/05/2019

Learning to Learn in Simulation

Deep learning often requires the manual collection and annotation of a t...
research
10/03/2019

Using Logical Specifications of Objectives in Multi-Objective Reinforcement Learning

In the multi-objective reinforcement learning (MORL) paradigm, the relat...
research
09/06/2022

Cross apprenticeship learning framework: Properties and solution approaches

Apprenticeship learning is a framework in which an agent learns a policy...
research
06/25/2023

A Framework for dynamically meeting performance objectives on a service mesh

We present a framework for achieving end-to-end management objectives fo...
