Design and Evaluation of a Simple Data Interface for Efficient Data Transfer Across Diverse Storage

09/07/2020 ∙ by Zhengchun Liu, et al. ∙ 0

Modern science and engineering computing environments often feature storage systems of different types, from parallel file systems in high-performance computing centers to object stores operated by cloud providers. To enable easy, reliable, secure, and performant data exchange among these different systems, we propose Connector, a pluggable data access architecture for diverse, distributed storage. By abstracting low-level storage system details, this abstraction permits a managed data transfer service (Globus in our case) to interact with a large and easily extended set of storage systems. Equally important, it supports third-party transfers: that is, direct data transfers from source to destination that are initiated by a third-party client but do not engage that third party in the data path. The abstraction also enables management of transfers for performance optimization, error handling, and end-to-end integrity. We present the Connector design, describe implementations for different storage services, evaluate tradeoffs inherent in managed transfers, motivate recommended deployment options, and propose a performance model-based method that allows for easy characterization of performance in different contexts without exhaustive benchmarking.



There are no comments yet.


page 4

page 15

page 16

page 17

page 18

page 19

page 20

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.