Associative Memory Based Experience Replay for Deep Reinforcement Learning

07/16/2022
by   Mengyuan Li, et al.

Experience replay is an essential component of deep reinforcement learning (DRL): it stores experiences and serves them back to the agent for learning in real time. Recently, prioritized experience replay (PER) has proven powerful and is widely deployed in DRL agents. However, implementing PER on traditional CPU or GPU architectures incurs significant latency overhead because of its frequent and irregular memory accesses. This paper proposes a hardware-software co-design approach: an associative memory (AM) based PER, called AMPER, with an AM-friendly priority sampling operation. AMPER replaces the widely used, time-consuming tree-traversal-based priority sampling in PER while preserving learning performance. Further, we design an in-memory computing hardware architecture based on AM that supports AMPER by leveraging parallel in-memory search operations. When running on the proposed hardware, AMPER shows comparable learning performance while achieving a 55x to 270x latency improvement over state-of-the-art PER running on a GPU.
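
For readers unfamiliar with the tree-traversal-based priority sampling that the abstract says AMPER replaces, the following is a minimal sketch of a conventional sum-tree sampler as commonly used in PER implementations. It is not the paper's AMPER method or code. The class name `SumTree`, its method names, and the power-of-two capacity are assumptions made for illustration only.

```python
import random


class SumTree:
    """Illustrative binary sum tree over priorities (not the AMPER design).

    Assumes `capacity` is a power of two. Node 1 is the root and the
    leaves occupy indices capacity .. 2 * capacity - 1.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.nodes = [0.0] * (2 * capacity)

    def update(self, index, priority):
        """Set the priority of one experience and refresh its ancestors."""
        i = index + self.capacity
        self.nodes[i] = priority
        i //= 2
        while i >= 1:
            self.nodes[i] = self.nodes[2 * i] + self.nodes[2 * i + 1]
            i //= 2

    def total(self):
        """Sum of all priorities, stored at the root."""
        return self.nodes[1]

    def find(self, value):
        """Return the leaf index whose cumulative-priority interval holds `value`."""
        i = 1
        while i < self.capacity:
            left = 2 * i
            if value < self.nodes[left]:
                i = left              # descend into the left subtree
            else:
                value -= self.nodes[left]
                i = left + 1          # skip the left mass, descend right
        return i - self.capacity

    def sample(self, rng=random):
        """Draw one index with probability proportional to its priority."""
        return self.find(rng.random() * self.total())


tree = SumTree(4)
for idx, p in enumerate([1.0, 2.0, 3.0, 4.0]):
    tree.update(idx, p)
batch = [tree.sample() for _ in range(8)]
```

Each call to `find` walks one node per tree level, and each step depends on the value read in the previous one. This is one plausible source of the frequent, irregular memory accesses the abstract attributes to PER on CPU and GPU architectures.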

research
01/06/2021

Deep Reinforcement Learning with Quantum-inspired Experience Replay

In this paper, a novel training paradigm inspired by quantum computation...
research
01/09/2018

Deep In-GPU Experience Replay

Experience replay allows a reinforcement learning agent to train on samp...
research
02/04/2020

Bootstrapping a DQN Replay Memory with Synthetic Experiences

An important component of many Deep Reinforcement Learning algorithms is...
research
12/07/2021

PTR-PPO: Proximal Policy Optimization with Prioritized Trajectory Replay

On-policy deep reinforcement learning algorithms have low data utilizati...
research
08/29/2023

R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous Robotics

Autonomous robotic systems, like autonomous vehicles and robotic search ...
research
10/26/2021

A DPDK-Based Acceleration Method for Experience Sampling of Distributed Reinforcement Learning

A computing cluster that interconnects multiple compute nodes is used to...
