Multi-fidelity reinforcement learning framework for shape optimization

by Sahil Bhola, et al.
University of Michigan

Deep reinforcement learning (DRL) is a promising outer-loop intelligence paradigm that can deploy problem-solving strategies for complex tasks. Consequently, DRL has been used in several scientific applications, particularly where classical optimization or control methods are limited. One key limitation of conventional DRL methods is their episode-hungry nature, which becomes a bottleneck for tasks that involve costly evaluations of a numerical forward model. In this article, we address this limitation by introducing a controlled transfer learning framework that leverages a multi-fidelity simulation setting. Our strategy is deployed for an airfoil shape optimization problem at high Reynolds numbers, where the framework learns an optimal policy for generating efficient airfoil shapes by gathering knowledge from multi-fidelity environments and reduces computational costs by over 30%. Furthermore, our formulation promotes policy exploration and generalization to new environments, thereby preventing over-fitting to data from a single fidelity. Our results demonstrate the framework's applicability to other scientific DRL scenarios where multi-fidelity environments can be used for policy learning.


