Large Scale Parallelization Using File-Based Communications

by Chansup Byun, et al.

In this paper, we present a new file-based communication architecture that uses the local filesystem for large scale parallelization. This approach eliminates the filesystem overload and resource contention that arise when the central filesystem is used for large parallel jobs. It incurs additional overhead from inter-node message file transfers when the sending and receiving processes are on different nodes. Even with this overhead, however, the approach benefits overall cluster operation and improves message communication performance for large scale parallel jobs. For example, in a 2048-process parallel job, MPI_Bcast() achieved about 34 times better performance when using the local filesystem. Furthermore, since the security of message file transfers is handled entirely by the secure copy protocol (scp) and filesystem permissions, no security measures or ports are required beyond those typically required on an HPC system.
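
The idea of passing messages as files on a node-local filesystem can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`message_path`, `send`, `recv`, `scp_command`), the file naming scheme, the use of `pickle`, and the polling receive are all assumptions made for illustration. The sketch only covers the same-node case; for the inter-node case it merely builds an `scp` command line and does not run it.

```python
import os
import pickle
import tempfile
import time


def message_path(comm_dir, src, dest, tag):
    """Hypothetical naming scheme: one file per (source, destination, tag)."""
    return os.path.join(comm_dir, f"msg_{src}_to_{dest}_tag{tag}.pkl")


def send(comm_dir, src, dest, tag, payload):
    """Write the payload to a temporary file, then rename it into place.

    os.replace is atomic within a single filesystem, so a receiver polling
    the final name never sees a partially written message file.
    """
    final = message_path(comm_dir, src, dest, tag)
    fd, tmp = tempfile.mkstemp(dir=comm_dir, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(payload, f)
    os.replace(tmp, final)
    return final


def recv(comm_dir, src, dest, tag, timeout=5.0, poll=0.01):
    """Poll for the message file, load its payload, and delete the file."""
    path = message_path(comm_dir, src, dest, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message at {path}")
        time.sleep(poll)
    with open(path, "rb") as f:
        payload = pickle.load(f)
    os.remove(path)
    return payload


def scp_command(local_file, remote_host, remote_dir):
    """Build (but do not run) an scp command for an inter-node transfer.

    The host name and remote directory are placeholders for this sketch.
    """
    return ["scp", "-q", local_file, f"{remote_host}:{remote_dir}/"]
```

In this sketch, a process on the same node as the sender would call `recv` on the shared local directory, while a sender targeting another node would hand the list from `scp_command` to something like `subprocess.run`. That extra copy step corresponds to the inter-node overhead the abstract mentions.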

