Reinforcement Learning for Load-balanced Parallel Particle Tracing

09/13/2021
by   Jiayi Xu, et al.

We explore an online reinforcement learning (RL) paradigm for optimizing parallel particle tracing performance in distributed-memory systems. Our method combines three novel components to optimize data-parallel particle tracing dynamically: (1) a workload donation model, (2) a high-order workload estimation model, and (3) a communication cost model. First, we design an RL-based workload donation model that monitors the workload of each process and creates RL agents that donate particles and data blocks from high-workload processes to low-workload processes to minimize execution time. The agents learn the donation strategy on the fly from reward and cost functions, which account for the change in the processes' workload and the data transfer cost of every donation action. Second, we propose an online workload estimation model that helps the RL model estimate the workload distribution of processes in future computations. Third, we design a communication cost model that considers both block and particle data exchange costs, helping the agents make effective decisions with minimal communication cost. We demonstrate that our algorithm adapts to different flow behaviors in large-scale fluid dynamics, ocean, and weather simulation data. In evaluations on up to 16,384 processors, our algorithm improves parallel particle tracing performance in terms of parallel efficiency, load balance, and I/O and communication costs.


research
07/02/2021

An Efficient Particle Tracking Algorithm for Large-Scale Parallel Pseudo-Spectral Simulations of Turbulence

Particle tracking in large-scale numerical simulations of turbulent flow...
research
08/10/2023

A Comparison of Classical and Deep Reinforcement Learning Methods for HVAC Control

Reinforcement learning (RL) is a promising approach for optimizing HVAC ...
research
06/14/2023

A reinforcement learning strategy for p-adaptation in high order solvers

Reinforcement learning (RL) has emerged as a promising approach to autom...
research
01/06/2023

A Framework for Large Scale Particle Filters Validated with Data Assimilation for Weather Simulation

Particle filters are a group of algorithms to solve inverse problems thr...
research
08/16/2022

Performance Assessment of Diffusive Load Balancing for Distributed Particle Advection

Particle advection is the approach for extraction of integral curves fro...
research
09/14/2022

Analysis of Reinforcement Learning for determining task replication in workflows

Executing workflows on volunteer computing resources where individual ta...
research
05/18/2023

Maximal workload, minimal workload, maximal workload difference: optimizing all criteria at once

In a simple model of assigning workers to tasks, every solution that min...
