AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization

by Kailash Gogineni, et al.

Multi-Agent Experience Replay (MER) is a key component of off-policy reinforcement learning (RL) algorithms. By remembering and reusing past experiences, experience replay significantly improves the stability and learning efficiency of RL algorithms. In many scenarios, multiple agents interact in a shared environment during online training under the centralized training and decentralized execution (CTDE) paradigm. Current multi-agent reinforcement learning (MARL) algorithms perform experience replay with uniform sampling or priority-weighted sampling to improve sample efficiency in the sampling phase. However, moving each agent's transition data history through the processor memory hierarchy is a performance limiter. Moreover, because the agents' transitions are continuously renewed every iteration, the finite cache capacity leads to increased cache misses. To this end, we propose AccMER, which repeatedly reuses transitions (experiences) for a window of n steps, instead of sampling new transitions at each step, to improve cache locality and minimize transition data movement. Specifically, our optimization uses priority weights to select transitions so that only high-priority transitions are reused frequently, thereby improving cache performance. Our experimental results on the Predator-Prey environment demonstrate the effectiveness of reusing essential transitions based on priority weights: we observe an end-to-end training time reduction of 25.4% (for 32 agents) compared to existing prioritized MER algorithms, without notable degradation in the mean reward.
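The core idea in the abstract can be sketched in a few lines of Python. This is a minimal, hypothetical illustration (the class and method names are not from the paper): transitions are drawn with priority-proportional probability, as in standard prioritized replay, but the sampled minibatch is then reused for a window of n training steps rather than resampled every step, keeping the same transitions hot in cache.

```python
import random

class ReuseWindowSampler:
    """Hypothetical sketch of cache locality-aware prioritized replay:
    sample a high-priority minibatch once, then reuse it for
    `reuse_window` training steps instead of drawing fresh transitions
    at every step."""

    def __init__(self, buffer, priorities, batch_size=32, reuse_window=3):
        self.buffer = buffer            # list of transitions
        self.priorities = priorities    # one non-negative weight per transition
        self.batch_size = batch_size
        self.reuse_window = reuse_window
        self._cached_batch = None
        self._steps_since_sample = 0

    def _sample_prioritized(self):
        # Priority-proportional sampling, as in standard prioritized replay.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.buffer)),
                              weights=probs, k=self.batch_size)
        return [self.buffer[i] for i in idxs]

    def next_batch(self):
        # Resample only every `reuse_window` steps; otherwise return the
        # cached batch, so the same transition data stays cache-resident.
        if (self._cached_batch is None
                or self._steps_since_sample >= self.reuse_window):
            self._cached_batch = self._sample_prioritized()
            self._steps_since_sample = 0
        self._steps_since_sample += 1
        return self._cached_batch
```

In a real MARL training loop the buffer would hold per-agent transition histories and the priorities would come from TD errors; the sketch only shows the reuse-window mechanism that trades sampling freshness for data locality.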


MAC-PO: Multi-Agent Experience Replay via Collective Priority Optimization

Experience replay is crucial for off-policy reinforcement learning (RL) ...

Discriminative Experience Replay for Efficient Multi-agent Reinforcement Learning

In cooperative multi-agent tasks, parameter sharing among agents is a co...

Towards Efficient Multi-Agent Learning Systems

Multi-Agent Reinforcement Learning (MARL) is an increasingly important r...

Virtual Replay Cache

Return caching is a recent strategy that enables efficient minibatch tra...

Offline Prioritized Experience Replay

Offline reinforcement learning (RL) is challenged by the distributional ...

Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning

Prioritized experience replay (PER) samples important transitions, rathe...

Revisiting Prioritized Experience Replay: A Value Perspective

Experience replay enables off-policy reinforcement learning (RL) agents ...
