Online Residential Demand Response via Contextual Multi-Armed Bandits
Residential loads have great potential to improve the efficiency and reliability of power system operation through demand response (DR) programs. This paper studies strategies for selecting the right customers for residential DR from the perspective of load service entities (LSEs). A main challenge in implementing residential DR is that customer responses to incentives are uncertain and unknown, and are influenced by various personal and environmental factors. To address this challenge, this paper employs the contextual multi-armed bandit (CMAB) method to model the optimal customer selection problem under uncertainty. Based on the Thompson sampling framework, an online learning and decision-making algorithm is proposed to learn customer behaviors and select appropriate customers for load reduction. The algorithm takes contextual information into account and is applicable to complicated DR settings. Numerical simulations demonstrate the efficiency and learning effectiveness of the proposed algorithm.
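To make the idea of Thompson-sampling-based customer selection concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simplified setting that the abstract does not specify: each customer either responds to a DR request or does not (a Bernoulli outcome), the context is a single discrete label such as "hot" or "mild", and the LSE picks a fixed number k of customers per event. The class name `BetaThompsonSelector`, the function `simulate`, and all parameters are hypothetical names introduced for illustration.

```python
import random


class BetaThompsonSelector:
    """Thompson sampling with one Beta posterior per (customer, context) pair.

    Assumption: responses are Bernoulli and contexts are discrete labels.
    """

    def __init__(self, n_customers, contexts, seed=0):
        self.rng = random.Random(seed)
        self.n_customers = n_customers
        # [alpha, beta] starts at [1, 1], i.e. a uniform Beta(1, 1) prior.
        self.params = {
            (i, c): [1.0, 1.0] for i in range(n_customers) for c in contexts
        }

    def select(self, context, k):
        """Sample a response probability per customer; return the top-k indices."""
        samples = []
        for i in range(self.n_customers):
            alpha, beta = self.params[(i, context)]
            samples.append((self.rng.betavariate(alpha, beta), i))
        samples.sort(reverse=True)
        return [i for _, i in samples[:k]]

    def update(self, customer, context, responded):
        """Bayesian update: a response increments alpha, a non-response beta."""
        ab = self.params[(customer, context)]
        if responded:
            ab[0] += 1.0
        else:
            ab[1] += 1.0


def simulate(true_prob, contexts, rounds, k, seed=0):
    """Run the selector against hypothetical true response probabilities.

    true_prob[i][c] is the (unknown to the learner) chance that customer i
    responds under context c.
    """
    env = random.Random(seed + 1)  # separate stream for the environment
    selector = BetaThompsonSelector(len(true_prob), contexts, seed)
    for _ in range(rounds):
        context = env.choice(contexts)
        for i in selector.select(context, k):
            responded = env.random() < true_prob[i][context]
            selector.update(i, context, responded)
    return selector
```

Sampling from the posterior, rather than ranking customers by their posterior mean, is what drives exploration here: customers with few observations have wide posteriors and are occasionally selected, while customers with many observed responses are selected consistently. A continuous or multi-dimensional context would need a parametric response model in place of the per-label table used in this sketch.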