Safe Reinforcement Learning for Strategic Bidding of Virtual Power Plants in Day-Ahead Markets

07/11/2023
by Ognjen Stanojev, et al.

This paper presents a novel safe reinforcement learning algorithm for strategic bidding of Virtual Power Plants (VPPs) in day-ahead electricity markets. The proposed algorithm uses the Deep Deterministic Policy Gradient (DDPG) method to learn competitive bidding policies without requiring an accurate market model. To account for the complex internal physical constraints of VPPs, we introduce two enhancements to the DDPG method. First, we derive a projection-based safety shield that restricts the agent's actions to the feasible space defined by the non-linear power flow equations and the operating constraints of distributed energy resources. Second, we add a penalty for shield activation to the reward function, which incentivizes the agent to learn a safer policy. A case study based on the IEEE 13-bus network demonstrates the effectiveness of the proposed approach in enabling the agent to learn a highly competitive, safe strategic policy.
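
The two enhancements described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feasible set here is a simple box (per-component lower and upper limits) standing in for the non-linear power flow and device constraints, and the penalty is assumed to be proportional to the size of the shield's correction. The names `project_to_box`, `shielded_step`, `profit_fn`, and `penalty_weight` are hypothetical.

```python
import numpy as np


def project_to_box(action, lower, upper):
    """Euclidean projection of `action` onto the box {a : lower <= a <= upper}.

    Assumption: a box is used as a stand-in for the feasible set; the paper's
    shield projects onto a set defined by power flow and device constraints.
    """
    return np.clip(action, lower, upper)


def shielded_step(raw_action, lower, upper, profit_fn, penalty_weight=1.0):
    """Apply the safety shield, then compute a penalized reward.

    Returns the safe action, the shaped reward, and whether the shield
    had to modify the agent's proposed action.
    """
    raw_action = np.asarray(raw_action, dtype=float)
    safe_action = project_to_box(raw_action, lower, upper)

    # Size of the correction made by the shield (zero if already feasible).
    correction = float(np.linalg.norm(safe_action - raw_action))

    # Assumed penalty form: proportional to the correction distance.
    reward = profit_fn(safe_action) - penalty_weight * correction
    return safe_action, reward, correction > 0.0
```

In this sketch, an action that is already feasible passes through unchanged and incurs no penalty, so the agent is only discouraged from proposing actions that the shield would have to correct.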

research
12/20/2021

Safe multi-agent deep reinforcement learning for joint bidding and maintenance scheduling of generation units

This paper proposes a safe reinforcement learning algorithm for generati...
research
02/18/2021

Strategic bidding in freight transport using deep reinforcement learning

This paper presents a multi-agent reinforcement learning algorithm to re...
research
06/13/2023

Multi-market Energy Optimization with Renewables via Reinforcement Learning

This paper introduces a deep reinforcement learning (RL) framework for o...
research
01/19/2023

Deep Reinforcement Learning for Power Trading

The Dutch power market includes a day-ahead market and an auction-like i...
research
07/19/2021

Constrained Policy Gradient Method for Safe and Fast Reinforcement Learning: a Neural Tangent Kernel Based Approach

This paper presents a constrained policy gradient algorithm. We introduc...
research
10/08/2017

Recurrent Network-based Deterministic Policy Gradient for Solving Bipedal Walking Challenge on Rugged Terrains

This paper presents the learning algorithm based on the Recurrent Networ...
research
10/12/2018

Detecting Strategic Manipulation in Distributed Optimisation of Electric Vehicle Aggregators

Given the rapid rise of electric vehicles (EVs) worldwide, and the ambit...
