Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem

09/16/2019
by   Dattaraj Rao, et al.

Traditional Reinforcement Learning (RL) problems depend on an exhaustive simulation environment that models the real-world physics of the problem, and the RL agent is trained by observing this environment. In this paper, we present a novel approach to creating an environment by modeling the reward function on empirical rules extracted from human domain knowledge of the system under study. Using this empirical reward function, we build an environment and train the agent. We first create an environment that emulates the effect of setting cabin temperature through a thermostat. In RL problems this is typically done by building an exhaustive model of the system from a detailed thermodynamic study. Instead, we propose an empirical approach that models the reward function on human domain knowledge. We document some rules of thumb that humans usually apply when setting a thermostat temperature and model them in our reward function. This modeling of empirical human domain rules into a reward function for RL is the unique aspect of this paper. Because this is a continuous action space problem, we use the deep deterministic policy gradient (DDPG) method to maximize the reward function. We create a policy network that predicts the optimal temperature setpoint given the external temperature and humidity.
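To make the idea of a rule-based reward concrete, the following is a minimal sketch of how such an environment might look. It is not the authors' implementation: the rules in `comfort_target`, the temperature and humidity thresholds, the sampling ranges, and the names `empirical_reward` and `ThermostatEnv` are all hypothetical, chosen only to illustrate how rules of thumb can stand in for a thermodynamic model.

```python
import random


def comfort_target(outside_temp_c, humidity_pct):
    """Hypothetical rules of thumb for a preferred setpoint in degrees Celsius."""
    target = 22.0  # assumed baseline comfort temperature
    if outside_temp_c > 30.0:
        target += 1.0  # assumed rule: on hot days a slightly warmer cabin is fine
    elif outside_temp_c < 10.0:
        target -= 1.0  # assumed rule: on cold days a slightly cooler cabin is fine
    if humidity_pct > 60.0:
        target -= 1.0  # assumed rule: humid air feels warmer, so cool a bit more
    return target


def empirical_reward(setpoint_c, outside_temp_c, humidity_pct):
    """Reward is highest (zero) at the rule-based target and falls off linearly."""
    return -abs(setpoint_c - comfort_target(outside_temp_c, humidity_pct))


class ThermostatEnv:
    """Toy environment: state is (outside temperature, humidity), action is a setpoint."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.state = None

    def reset(self):
        # Sample weather conditions from assumed ranges; no physics is simulated.
        self.state = (self._rng.uniform(-5.0, 40.0), self._rng.uniform(20.0, 90.0))
        return self.state

    def step(self, setpoint_c):
        outside_temp_c, humidity_pct = self.state
        reward = empirical_reward(setpoint_c, outside_temp_c, humidity_pct)
        next_state = self.reset()  # each step draws fresh, independent conditions
        return next_state, reward
```

A continuous-action agent such as DDPG would call `reset` and `step` in a loop and learn a policy mapping `(outside_temp_c, humidity_pct)` to a setpoint that maximizes this reward. The design point is that the reward encodes human preference directly, so no heat-transfer model is required.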


