CXLMemSim: A pure software simulated CXL.mem for performance characterization

03/10/2023
by Yiwei Yang, et al.

The emerging CXL.mem standard provides a new type of byte-addressable remote memory with a variety of memory types and hierarchies. With CXL.mem, multiple layers of memory (e.g., local DRAM and CXL-attached remote memory at different locations) are exposed to operating systems and user applications, bringing new challenges and research opportunities. Unfortunately, since CXL.mem devices are not commercially available, it is difficult for researchers to conduct systems research that uses CXL.mem. In this paper, we present our ongoing work, CXLMemSim, a fast and lightweight CXL.mem simulator for performance characterization. CXLMemSim uses a performance model driven by performance monitoring events, which are supported by most commodity processors. Specifically, CXLMemSim attaches to an existing, unmodified program and divides its execution into multiple epochs; once an epoch finishes, CXLMemSim collects the performance monitoring events and calculates the simulated execution time of that epoch from them. In this way, CXLMemSim avoids the performance overhead of a full-system simulator (e.g., gem5) and allows the memory hierarchy and latency to be easily adjusted, enabling research such as memory scheduling for complex applications. Our preliminary evaluation shows that CXLMemSim slows down the execution of the attached program by 4.41x on average for real-world applications.
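The epoch-based idea in the abstract can be illustrated with a small sketch. This is not CXLMemSim's actual performance model, which the abstract does not spell out. It assumes a deliberately simple rule: the simulated time of an epoch is its measured native time plus an extra latency charged for each last-level cache miss that would have been served by CXL-attached memory. The names `EpochSample`, `cxl_fraction`, and `extra_latency_ns` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EpochSample:
    """Counters gathered at the end of one epoch (hypothetical layout)."""
    wall_time_ns: float  # measured native duration of the epoch
    llc_misses: int      # last-level cache misses observed in the epoch


def simulated_epoch_time_ns(sample, cxl_fraction, extra_latency_ns):
    """Assumed model: native time plus a penalty per miss served by CXL memory.

    cxl_fraction is the assumed share of misses that go to CXL-attached memory;
    extra_latency_ns is the assumed added latency of such an access over DRAM.
    """
    remote_misses = sample.llc_misses * cxl_fraction
    return sample.wall_time_ns + remote_misses * extra_latency_ns


def simulate(samples, cxl_fraction, extra_latency_ns):
    """Sum the simulated time over all epochs of a run."""
    return sum(
        simulated_epoch_time_ns(s, cxl_fraction, extra_latency_ns)
        for s in samples
    )


# Two made-up epochs of 1 ms each with different miss counts.
epochs = [EpochSample(1_000_000.0, 2_000), EpochSample(1_000_000.0, 500)]
total = simulate(epochs, cxl_fraction=0.5, extra_latency_ns=100.0)
slowdown = total / sum(e.wall_time_ns for e in epochs)
```

Because the latency and the CXL share are plain parameters in this sketch, changing the simulated memory configuration only requires recomputing from the recorded counters, which mirrors the adjustability the abstract describes.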
