Data Transfer and Network Services management for Domain Science Workflows

03/15/2022
by Tom Lehman, et al.

This paper describes a vision and work in progress to elevate network resources and data transfer management to the same level as compute and storage in the context of services access, scheduling, life cycle management, and orchestration. While domain science workflows often include active compute resource allocation and management, data transfers and the associated network resource coordination are not handled in a similar manner. As a result, data transfers can introduce a degree of uncertainty into workflow operations, and the associated lack of network information prevents both the workflow operations and the network use from being optimized. The net result is that domain science workflow processes are forced to view the network as an opaque infrastructure into which they inject data and hope that it emerges at the destination with an acceptable Quality of Experience. There is little ability for applications to interact with the network to exchange information, negotiate performance parameters, discover expected performance metrics, or receive status and troubleshooting information in real time. The primary motivation for this work is to develop mechanisms that allow an application workflow to obtain information about network services, capabilities, and options to a degree similar to what is possible for compute resources. The initial focus is on the Open Science Grid (OSG)/Compact Muon Solenoid (CMS) Large Hadron Collider (LHC) workflows with Rucio/FTS/XRootD-based data transfers and their interoperation with the ESnet SENSE (Software-Defined Network for End-to-end Networked Science at the Exascale) system.
