A Neuromorphic Architecture for Reinforcement Learning from Real-Valued Observations

07/06/2023
by Sergio F. Chevtchenko, et al.

Reinforcement Learning (RL) provides a powerful framework for decision-making in complex environments. However, implementing RL in hardware-efficient and bio-inspired ways remains a challenge. This paper presents a novel Spiking Neural Network (SNN) architecture for solving RL problems with real-valued observations. Building on prior work, the proposed model incorporates multi-layered event-based clustering and adds Temporal Difference (TD)-error modulation and eligibility traces. An ablation study confirms the significant impact of these components on the proposed model's performance. A tabular actor-critic algorithm with eligibility traces and a state-of-the-art Proximal Policy Optimization (PPO) algorithm are used as benchmarks. Our network consistently outperforms the tabular approach and successfully discovers stable control policies on classic RL environments: mountain car, cart-pole, and acrobot. The proposed model offers an appealing trade-off in terms of computational and hardware implementation requirements. The model requires neither an external memory buffer nor a global error-gradient computation, and synaptic updates occur online, driven by local learning rules and a broadcast TD-error signal. Thus, this work contributes to the development of more hardware-efficient RL solutions.
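
To make the update scheme in the abstract concrete, the following is a minimal sketch of a tabular actor-critic with eligibility traces, the kind of baseline the abstract mentions. It is not the authors' spiking network or their benchmark code. The toy chain environment, the function names (`softmax`, `train_chain`), and all hyperparameter values are assumptions chosen for illustration. What it shows is the general pattern: each table entry keeps a local trace, and a single scalar TD error is broadcast to every entry, so no replay buffer or backpropagated gradient is involved.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_chain(n_states=5, episodes=300, gamma=0.95, lam=0.8,
                alpha_v=0.1, alpha_pi=0.1, max_steps=200, seed=0):
    """Tabular actor-critic with accumulating eligibility traces.

    Hypothetical toy task (not from the paper): states 0..n_states-1,
    actions 0 (left) and 1 (right), reward -1 per step, and the episode
    ends when the rightmost state is reached.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    v = [0.0] * n_states                            # critic: state values
    prefs = [[0.0, 0.0] for _ in range(n_states)]   # actor: action preferences

    for _ in range(episodes):
        e_v = [0.0] * n_states                      # critic traces, reset per episode
        e_pi = [[0.0, 0.0] for _ in range(n_states)]  # actor traces
        s = 0
        for _ in range(max_steps):
            probs = softmax(prefs[s])
            a = 1 if rng.random() < probs[1] else 0
            s_next = min(goal, s + 1) if a == 1 else max(0, s - 1)
            done = s_next == goal
            r = -1.0

            # TD error: one scalar shared by every update below.
            target = r if done else r + gamma * v[s_next]
            delta = target - v[s]

            # Decay all traces, then mark the entries that were just used.
            for i in range(n_states):
                e_v[i] *= gamma * lam
                for b in range(2):
                    e_pi[i][b] *= gamma * lam
            e_v[s] += 1.0
            for b in range(2):
                # Gradient of log softmax with respect to preference b.
                e_pi[s][b] += (1.0 if b == a else 0.0) - probs[b]

            # Online update: local trace times the broadcast TD error.
            for i in range(n_states):
                v[i] += alpha_v * delta * e_v[i]
                for b in range(2):
                    prefs[i][b] += alpha_pi * delta * e_pi[i][b]

            s = s_next
            if done:
                break
    return v, prefs
```

Calling `v, prefs = train_chain()` returns the learned value table and action preferences; `softmax(prefs[s])` then gives the action probabilities in state `s`. A tabular method like this needs discrete states, which is why real-valued observations such as those in mountain car or cart-pole must first be discretized or otherwise encoded.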

research
01/20/2020

Memristor Hardware-Friendly Reinforcement Learning

Recently, significant progress has been made in solving sophisticated pr...
research
09/09/2023

Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective

Reinforcement learning (RL) is a powerful tool for solving complex decis...
research
03/05/2021

A Dual-Memory Architecture for Reinforcement Learning on Neuromorphic Platforms

Reinforcement learning (RL) is a foundation of learning in biological sy...
research
10/10/2020

Trust the Model When It Is Confident: Masked Model-based Actor-Critic

It is a popular belief that model-based Reinforcement Learning (RL) is m...
research
10/01/2022

Integrating Conventional Headway Control with Reinforcement Learning to Avoid Bus Bunching

Bus bunching is a natural-occurring phenomenon that undermines the effic...
research
11/12/2022

CACTO: Continuous Actor-Critic with Trajectory Optimization – Towards global optimality

This paper presents a novel algorithm for the continuous control of dyna...
