A Quadratic Actor Network for Model-Free Reinforcement Learning

03/11/2021 ∙ by Matthias Weissenbacher, et al.

In this work we discuss the incorporation of quadratic neurons into policy networks in the context of model-free actor-critic reinforcement learning. Quadratic neurons admit an explicit quadratic function approximation, in contrast to conventional approaches where the non-linearity is induced solely by the activation functions. We perform empirical experiments on several MuJoCo continuous control tasks and find that adding quadratic neurons to MLP policy networks outperforms the MLP baseline while using fewer parameters. The top returned reward is increased by 5.8% on average, with roughly 21% better sample efficiency. Moreover, the quadratic policy network maintains its advantage under added action and observation noise.
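The abstract does not spell out the exact parameterisation of a quadratic neuron. A common form augments the usual affine map so that each output unit also computes a quadratic form of its input, x^T W_i x + w_i^T x + b_i. Below is a minimal PyTorch sketch under that assumption; the class name `QuadraticLinear` and the initialisation scale are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn


class QuadraticLinear(nn.Module):
    """Hypothetical quadratic layer: output unit i computes x^T W_i x + w_i^T x + b_i.

    The exact parameterisation in the paper may differ; this sketch only
    illustrates the general idea of a quadratic neuron.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # One (in_features x in_features) quadratic form per output unit,
        # initialised small so the layer starts close to a plain linear map.
        self.quad = nn.Parameter(
            0.01 * torch.randn(out_features, in_features, in_features)
        )
        self.affine = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Batched evaluation of x^T W_i x for every output unit i.
        quad_term = torch.einsum("bi,oij,bj->bo", x, self.quad, x)
        return quad_term + self.affine(x)


# Quick shape check on a dummy batch of 10-dimensional observations.
layer = QuadraticLinear(10, 64)
print(layer(torch.randn(32, 10)).shape)  # torch.Size([32, 64])
```

Because the quadratic term is explicit, such a layer can represent second-order interactions between observation dimensions directly, rather than relying on stacked activations to approximate them.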



Code Repositories

Quadratic_MLPs_in_RL

TD3 and SAC algorithms with a Quadratic MLP (Q-MLP) as the actor policy network
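As a rough illustration of how a Q-MLP actor might be wired into TD3 or SAC, a quadratic layer could replace the first linear layer of a standard actor network along the following lines. The class name `QMLPActor`, the layer sizes, and the reuse of the `QuadraticLinear` sketch above are assumptions for illustration, not the repository's actual code.

```python
import torch
import torch.nn as nn


class QMLPActor(nn.Module):
    """Illustrative actor: a quadratic first layer followed by a small MLP.

    Hypothetical architecture; consult the repository for the actual
    Q-MLP actor used with TD3 and SAC.
    """

    def __init__(self, obs_dim: int, act_dim: int,
                 hidden: int = 256, max_action: float = 1.0):
        super().__init__()
        self.max_action = max_action
        self.net = nn.Sequential(
            QuadraticLinear(obs_dim, hidden),  # quadratic neurons on the raw observation
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Deterministic TD3-style action in [-max_action, max_action];
        # a SAC actor would instead output the mean and log-std of a
        # squashed Gaussian policy.
        return self.max_action * torch.tanh(self.net(obs))
```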

