Parallel bandit architecture based on laser chaos for reinforcement learning

05/19/2022
by Takashi Urushibara, et al.

Accelerating artificial intelligence with photonics is an active field of study that aims to exploit the unique properties of photons. Reinforcement learning is an important branch of machine learning, and photonic decision-making principles have been demonstrated for multi-armed bandit problems. However, reinforcement learning can involve a massive number of states, unlike the previously demonstrated bandit problems, which have only one state. Q-learning is a well-known approach in reinforcement learning that can deal with many states. The architecture of Q-learning, however, is poorly suited to photonic implementations because it separates the update rule from action selection. In this study, we organize a new architecture for multi-state reinforcement learning as a parallel array of bandit problems in order to benefit from photonic decision-makers; we call it the parallel bandit architecture for reinforcement learning, or PBRL for short. Taking a cart-pole balancing problem as an example, we demonstrate that PBRL adapts to the environment in fewer time steps than Q-learning. Furthermore, PBRL adapts faster when operated with a chaotic laser time series than with uniformly distributed pseudorandom numbers; the autocorrelation inherent in the laser chaos has a positive effect. We also find that the variety of states the system visits during the learning phase exhibits completely different properties in PBRL and in Q-learning. The insights obtained in this study also benefit existing computing platforms, not just photonic realizations, because the PBRL algorithm and correlated random sequences can accelerate performance there as well.
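The idea of treating multi-state reinforcement learning as a parallel array of bandit problems can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the PBRL update rule, the reward definition, or the laser-chaos decision mechanism. The sketch therefore assumes an ordinary epsilon-greedy bandit as a stand-in for a photonic decision-maker, a pseudorandom generator in place of a chaotic laser time series, and a hypothetical two-state toy task instead of cart-pole. The names `EpsilonGreedyBandit`, `ParallelBanditAgent`, and `demo` are illustrative only.

```python
import random


class EpsilonGreedyBandit:
    """One bandit learner (assumed stand-in for a photonic decision-maker)."""

    def __init__(self, n_actions, epsilon, rng):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = rng
        self.counts = [0] * n_actions    # pulls per action
        self.values = [0.0] * n_actions  # running mean reward per action

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.values.index(max(self.values))

    def update(self, action, reward):
        # Incremental mean update; selection and update live in one unit.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


class ParallelBanditAgent:
    """Keeps one independent bandit per (discretized) state."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.bandits = {}

    def _bandit(self, state):
        # Create the bandit for a state the first time that state is seen.
        if state not in self.bandits:
            self.bandits[state] = EpsilonGreedyBandit(
                self.n_actions, self.epsilon, self.rng
            )
        return self.bandits[state]

    def act(self, state):
        return self._bandit(state).select()

    def learn(self, state, action, reward):
        self._bandit(state).update(action, reward)


def demo(steps=1000):
    """Hypothetical toy task: state 0 rewards action 1, state 1 rewards action 0."""
    agent = ParallelBanditAgent(n_actions=2, epsilon=0.1, seed=0)
    for t in range(steps):
        state = t % 2
        action = agent.act(state)
        reward = 1.0 if action == 1 - state else 0.0
        agent.learn(state, action, reward)
    return agent


if __name__ == "__main__":
    trained = demo()
    for state, bandit in sorted(trained.bandits.items()):
        print(state, bandit.values)
```

Each state owns its estimates and its decision step, so the per-state learners could in principle run side by side. A tabular Q-learning agent would instead share one update rule that bootstraps from the next state's values and would select actions in a separate step.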


research
03/26/2018

Scalable photonic reinforcement learning by time-division multiplexing of laser chaos

Reinforcement learning involves decision making in dynamic and uncertain...
research
05/12/2022

Controlling chaotic itinerancy in laser dynamics for reinforcement learning

Photonic artificial intelligence has attracted considerable interest in ...
research
03/30/2022

Theory of Acceleration of Decision Making by Correlated Times Sequences

Photonic accelerators have been intensively studied to provide enhanced ...
research
04/14/2017

Ultrafast photonic reinforcement learning based on laser chaos

Reinforcement learning involves decision making in dynamic and uncertain...
research
12/20/2022

Bandit approach to conflict-free multi-agent Q-learning in view of photonic implementation

Recently, extensive studies on photonic reinforcement learning to accele...
research
09/15/2021

Estimation of Warfarin Dosage with Reinforcement Learning

In this paper, it has attempted to use Reinforcement learning to model t...
research
09/16/2023

gym-saturation: Gymnasium environments for saturation provers (System description)

This work describes a new version of a previously published Python packa...
