Towards neoRL networks; the emergence of purposive graphs

02/25/2022
by Per R. Leikanger, et al.

The neoRL framework for purposive AI implements latent learning via emulated cognitive maps, with general value functions (GVFs) expressing operant desires toward separate states. The agent's expectancy of reward, expressed as learned projections in the considered space, allows the neoRL agent to extract purposive behavior from the learned map in accordance with the reward hypothesis. We explore this analogy further, considering neoRL modules as nodes in a network with desire as input and state-action Q-values as output; action sets with Euclidean significance then imply an interpretation of state-action vectors as Euclidean projections of desire. Autonomous desire emitted by neoRL nodes within the agent allows for deeper neoRL behavioral graphs. Experiments confirm the effect of neoRL networks governed by autonomous desire, verifying the four principles of purposive networks. A neoRL agent governed by purposive networks can navigate Euclidean spaces in real time while learning, exemplifying how modern AI can still profit from inspiration drawn from early psychology.
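To make the node abstraction concrete, here is a minimal sketch of a neoRL node in the sense described above: desire as input, state-action Q-values as output, with desire-weighted Q-vectors summed across nodes before greedy action selection. Everything in it (the class NeoRLNode, the project and act helpers, the tabular GVF updated by one-step Q-learning) is a hypothetical illustration under these assumptions, not the paper's implementation.

```python
# A minimal sketch of a neoRL node, assuming tabular GVFs updated by
# one-step Q-learning; names and signatures are hypothetical
# illustrations of the idea above, not the paper's implementation.
import numpy as np


class NeoRLNode:
    """One node of a purposive graph: desire in, state-action Q-values out."""

    def __init__(self, n_states: int, n_actions: int,
                 alpha: float = 0.1, gamma: float = 0.95) -> None:
        self.q = np.zeros((n_states, n_actions))  # learned expectancy per (s, a)
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def update(self, s: int, a: int, cumulant: float, s_next: int) -> None:
        """One-step update; the node's cumulant plays the role of reward,
        expressing an operant desire toward a separate state."""
        target = cumulant + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (target - self.q[s, a])

    def project(self, s: int, desire: float) -> np.ndarray:
        """Scale the node's state-action vector by the incoming desire,
        read here as a Euclidean projection of that desire."""
        return desire * self.q[s]


def act(nodes, desires, s: int) -> int:
    """Sum desire-weighted Q-vectors across nodes and act greedily."""
    total = sum(node.project(s, d) for node, d in zip(nodes, desires))
    return int(np.argmax(total))
```

In this reading, a deeper behavioral graph would arise when one node's output serves as the desire signal fed to another node, so that autonomous desire propagates through the network before an action is selected.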


