Delayed Q-update: A novel credit assignment technique for deriving an optimal operation policy for the Grid-Connected Microgrid

06/30/2020
by   Hyungjun Park, et al.
0

A microgrid is an innovative system that integrates distributed energy resources to supply electricity demand within electrical boundaries. This study proposes an approach for deriving a desirable microgrid operation policy that enables sophisticated controls in the microgrid system using the proposed novel credit assignment technique, delayed-Q update. The technique employs novel features such as the ability to tackle and resolve the delayed effective property of the microgrid, which prevents learning agents from deriving a well-fitted policy under sophisticated controls. The proposed technique tracks the history of the charging period and retroactively assigns an adjusted value to the ESS charging control. The operation policy derived using the proposed approach is well-fitted for the real effects of ESS operation because of the process of the technique. Therefore, it supports the search for a near-optimal operation policy under a sophisticatedly controlled microgrid environment. To validate our technique, we simulate the operation policy under a real-world grid-connected microgrid system and demonstrate the convergence to a near-optimal policy by comparing performance measures of our policy with benchmark policy and optimal policy.

READ FULL TEXT

page 1

page 7

page 8

research
04/17/2017

O^2TD: (Near)-Optimal Off-Policy TD Learning

Temporal difference learning and Residual Gradient methods are the most ...
research
12/12/2019

Sublinear Optimal Policy Value Estimation in Contextual Bandits

We study the problem of estimating the expected reward of the optimal po...
research
03/17/2021

Near Optimal Policy Optimization via REPS

Since its introduction a decade ago, relative entropy policy search (REP...
research
07/09/2022

Optimal policies for Bayesian olfactory search in turbulent flows

In many practical scenarios, a flying insect must search for the source ...
research
12/04/2019

A Maximin Optimal Online Power Control Policy for Energy Harvesting Communications

A general theory of online power control for discrete-time battery limit...
research
04/14/2022

Reinforcement Learning Policy Recommendation for Interbank Network Stability

In this paper we analyze the effect of a policy recommendation on the pe...

Please sign up or login with your details

Forgot password? Click here to reset