Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control

10/25/2020
by Hamid Radmard Rahmani, et al.

In many reinforcement learning (RL) problems, an action taken by the agent needs some time to reach its maximum effect on the environment, so the agent receives the reward corresponding to that action only after a delay, called the action-effect delay. Such delays degrade the performance of the learning algorithm and increase its computational cost, because the agent values immediate rewards more than the future rewards that are more closely related to the action taken. This paper addresses the issue by introducing a practical enhanced Q-learning method in which, at the beginning of the learning phase, the agent takes a single action and builds a function that reflects the environment's response to that action, called the reflexive γ-function. During the training phase, the agent uses the reflexive γ-function to update the Q-values. We apply the method to a structural control problem in which the goal of the agent is to reduce the vibrations of a building subjected to earthquake excitations with a specified delay. Seismic control is considered a complex task in structural engineering because of the stochastic and unpredictable nature of earthquakes and the complex behavior of the structure. Three scenarios are presented to study the effects of zero, medium, and long action-effect delays, and the performance of the enhanced method is compared with that of the standard Q-learning method. Both RL methods use a neural network to estimate the state-action value function that is used to control the structure. The results show that the enhanced method significantly outperforms the original method in all cases and also improves the stability of the algorithm in dealing with action-effect delays.
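
The abstract does not state the update rule, so the following is only a sketch of one possible reading of the idea, not the authors' implementation. It assumes that the response to a single action is recorded as a finite sequence, that the sequence is normalized into filter weights, and that a weighted sum of the subsequent rewards replaces the immediate reward in an otherwise standard Q-learning step. The function names (`build_response_filter`, `filtered_reward`, `q_update`), the normalization, and the tabular Q-array are all assumptions made for illustration; the paper itself uses a neural network to estimate the Q-values.

```python
import numpy as np


def build_response_filter(response):
    """Normalize a recorded response to a single action into filter weights.

    response[k] is the magnitude of the environment's reaction k steps after
    the action. How the response is measured is an assumption of this sketch.
    """
    w = np.abs(np.asarray(response, dtype=float))
    total = w.sum()
    if total == 0.0:
        raise ValueError("response must contain a nonzero entry")
    return w / total


def filtered_reward(rewards, t, weights):
    """Weighted sum of the rewards seen at steps t, t+1, ...

    The window is truncated at the end of the episode.
    """
    window = np.asarray(rewards[t:t + len(weights)], dtype=float)
    return float(np.dot(weights[:len(window)], window))


def q_update(q, s, a, target_reward, s_next, alpha=0.1, discount=0.9):
    """Tabular Q-learning step that uses the filtered reward (assumed form)."""
    td_target = target_reward + discount * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])
    return q


# Hypothetical usage: the effect of an action peaks two steps after it is taken.
weights = build_response_filter([0.0, 0.5, 1.0, 0.5])
episode_rewards = [1.0, 2.0, 3.0, 4.0, 5.0]
q_table = np.zeros((3, 2))
r0 = filtered_reward(episode_rewards, 0, weights)
q_table = q_update(q_table, s=0, a=1, target_reward=r0, s_next=1)
```

With these weights, the reward credited to the action at step 0 is dominated by the reward observed two steps later, where the recorded response peaks, rather than by the immediate reward.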

