Checkpointing with cp: the POSIX Shared Memory System

02/25/2021
by Lehman H. Garrison, et al.

We present the checkpointing scheme of Abacus, an N-body simulation code that allocates all persistent state in POSIX shared memory, or ramdisk. Checkpointing becomes as simple as copying files from ramdisk to external storage. The main simulation executable is invoked once per time step, memory mapping the input state, computing the output state directly into ramdisk, and unmapping the input state. The main executable remains unaware of the concept of checkpointing, with the top-level driver code launching a file-system copy between executable invocations when a checkpoint is needed. Since the only information flow is through files on ramdisk, the checkpoint must be correct so long as the simulation is correct. However, we find that with multiple GB of state, there is a significant overhead to unmapping the shared memory. This can be partially mitigated with multithreading, but ultimately, we do not recommend shared memory for use with a large state.
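
The control flow described above can be illustrated with a minimal sketch. This is not the Abacus implementation: the function names (`run_step`, `drive`), the file naming scheme, and the toy byte-increment update are all hypothetical, and a plain function call stands in for the separate executable invocation that Abacus performs at each time step. On Linux, `state_dir` would typically sit under a ramdisk mount such as `/dev/shm`; that choice is left to the caller so the sketch runs anywhere.

```python
import mmap
import os
import shutil


def run_step(state_dir, step):
    """Hypothetical stand-in for one invocation of the simulation executable.

    Maps the input state file, writes the output state file into the same
    directory, unmaps both, and deletes the input. It knows nothing about
    checkpointing.
    """
    in_path = os.path.join(state_dir, f"state_{step:04d}.bin")
    out_path = os.path.join(state_dir, f"state_{step + 1:04d}.bin")
    size = os.path.getsize(in_path)  # assumed non-zero; mmap rejects empty files

    with open(in_path, "rb") as fin, open(out_path, "w+b") as fout:
        fout.truncate(size)  # reserve space for the output state
        src = mmap.mmap(fin.fileno(), size, access=mmap.ACCESS_READ)
        dst = mmap.mmap(fout.fileno(), size, access=mmap.ACCESS_WRITE)
        try:
            # Toy "physics": increment every byte, wrapping at 256.
            dst[:] = bytes((b + 1) % 256 for b in src[:])
            dst.flush()
        finally:
            # Unmapping is the step the abstract reports as costly for
            # multi-GB state.
            dst.close()
            src.close()

    os.remove(in_path)  # the input state is no longer needed


def drive(state_dir, checkpoint_dir, n_steps, checkpoint_every):
    """Hypothetical top-level driver: run steps, copy state out when due."""
    for step in range(n_steps):
        run_step(state_dir, step)
        done = step + 1
        if done % checkpoint_every == 0:
            # The checkpoint is an ordinary file-system copy made between
            # step invocations.
            dest = os.path.join(checkpoint_dir, f"step_{done:04d}")
            shutil.copytree(state_dir, dest)
```

To try it, create `state_dir` containing a non-empty `state_0000.bin` and call, for example, `drive(state_dir, checkpoint_dir, n_steps=4, checkpoint_every=2)`. Because the step function communicates only through files in `state_dir`, restarting from a checkpoint in this sketch amounts to copying a checkpoint directory back and resuming the loop at the matching step.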


