Analyzing scientific data sharing patterns for in-network data caching

05/03/2021
by Elizabeth Copps, et al.

The volume of data moving through a network increases with new scientific experiments and simulations. Network bandwidth requirements increase proportionally if the data is to be delivered within a given time frame. We observe that a significant portion of popular datasets is transferred multiple times, both to different users and to the same user, for various reasons. In-network caching of shared data has been shown to reduce redundant data transfers and consequently save network traffic volume. In addition, overall application performance is expected to improve with in-network caching because access to locally cached data has lower latency. This paper shows how much data was shared over the study period, how much network traffic volume was saved as a result, and how much temporary in-network caching improved scientific application performance. It also analyzes data access patterns in applications and the impact of caching nodes on the regional data repository. From the results, we observed that network bandwidth demand was reduced by nearly a factor of 3 over the study period.
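As a rough illustration of how a traffic reduction factor of this kind can be computed, the sketch below replays a list of file accesses through an idealized cache and compares the bytes requested with the bytes that would still have to cross the network. The abstract does not define the metric, so the formula used here (total bytes requested divided by bytes fetched on cache misses) is an assumption, as are the unlimited cache capacity, the names `CacheStats` and `replay`, and the toy trace. None of it is taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class CacheStats:
    """Byte counters for an idealized cache (assumed: unlimited capacity, no eviction)."""
    hit_bytes: int = 0    # bytes served from the cache (shared data)
    miss_bytes: int = 0   # bytes fetched from the remote origin
    cached: set = field(default_factory=set)

    def access(self, name: str, size: int) -> None:
        """Record one file access of `size` bytes."""
        if name in self.cached:
            self.hit_bytes += size
        else:
            self.miss_bytes += size
            self.cached.add(name)

    @property
    def reduction_factor(self) -> float:
        """Assumed metric: total bytes requested / bytes fetched from the origin."""
        total = self.hit_bytes + self.miss_bytes
        return total / self.miss_bytes if self.miss_bytes else float("nan")


def replay(accesses):
    """Replay (file name, size in bytes) pairs in order and return the counters."""
    stats = CacheStats()
    for name, size in accesses:
        stats.access(name, size)
    return stats


# Hypothetical trace: file "a" is requested three times, "b" twice, "c" once.
trace = [("a", 100), ("b", 50), ("a", 100), ("a", 100), ("b", 50), ("c", 50)]
stats = replay(trace)
print(stats.hit_bytes, stats.miss_bytes, stats.reduction_factor)
```

In this toy trace, 450 bytes are requested but only 200 bytes are fetched from the origin, so the assumed metric gives a factor of 2.25. A real deployment would also need to account for finite cache size and eviction, which this sketch ignores.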


