Cloud Services Enable Efficient AI-Guided Simulation Workflows across Heterogeneous Resources

03/15/2023
by Logan Ward, et al.

Applications that fuse machine learning and simulation can benefit from using multiple computing resources: for example, simulation codes can run on highly parallel supercomputers while AI training and inference tasks run on specialized accelerators. Here, we present our experiences deploying two AI-guided simulation workflows across such heterogeneous systems. A unique aspect of our approach is our use of cloud-hosted management services to handle challenging aspects of cross-resource authentication and authorization, function-as-a-service (FaaS) function invocation, and data transfer. We show that these methods can achieve performance parity with systems that rely on direct connections between resources. We achieve parity by integrating the FaaS system and data transfer capabilities with a system that passes data by reference among managers and workers, and with a user-configurable steering algorithm that hides data transfer latencies. We anticipate that this ease of use can enable routine use of heterogeneous resources in computational science.
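
To make the pass-by-reference idea concrete, the following is a minimal sketch and not the authors' implementation. The names `ObjectStore`, `Reference`, `prefetch`, and `simulate` are hypothetical, and an in-memory dictionary stands in for the real data transfer service. It illustrates how a task can receive a small handle instead of the payload, and how a steering policy could begin resolving that handle early so that transfer time overlaps with other work.

```python
from concurrent.futures import ThreadPoolExecutor
import uuid


class ObjectStore:
    """Hypothetical in-memory stand-in for a transfer-backed data store."""

    def __init__(self):
        self._data = {}

    def put(self, obj):
        # Store the payload and hand back a short key that identifies it.
        key = uuid.uuid4().hex
        self._data[key] = obj
        return key

    def get(self, key):
        return self._data[key]


class Reference:
    """Small handle passed to tasks in place of the payload itself."""

    def __init__(self, store, key):
        self.store = store
        self.key = key
        self._future = None

    def prefetch(self, pool):
        # Start resolving early so the fetch overlaps with other work.
        if self._future is None:
            self._future = pool.submit(self.store.get, self.key)

    def resolve(self):
        # Use the prefetched result if one was requested, else fetch now.
        if self._future is not None:
            return self._future.result()
        return self.store.get(self.key)


def simulate(ref):
    """Worker-side task: resolve the reference only when data is needed."""
    values = ref.resolve()
    return sum(values)


store = ObjectStore()
ref = Reference(store, store.put(list(range(1000))))
with ThreadPoolExecutor(max_workers=1) as pool:
    ref.prefetch(pool)     # a steering policy could issue this ahead of time
    total = simulate(ref)  # the task itself receives only the reference
```

In this sketch the task message carries only a key, so its size does not depend on the size of the payload. Whether and when to prefetch is the kind of decision a user-configurable steering algorithm would make.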


