3PO: Programmed Far-Memory Prefetching for Oblivious Applications

Using memory located on remote machines, or far memory, as a swap space is a promising approach to meet the increasing memory demands of modern datacenter applications. Operating systems have long relied on prefetchers to mask the increased latency of fetching pages from swap space to main memory. Unfortunately, with traditional prefetching heuristics, performance still degrades when applications use far memory. In this paper, we propose a new prefetching technique for far-memory applications. We focus on memory-intensive, oblivious applications, whose memory access patterns are independent of their inputs, such as matrix multiplication. For this class of applications, we observe that pages can be prefetched perfectly without relying on heuristics. However, prefetching perfectly without requiring significant application modifications is challenging. We describe the design and implementation of 3PO, a system that provides pre-planned prefetching for general oblivious applications. We demonstrate that 3PO can accelerate applications, e.g., running them 30-150% faster than with Linux's prefetcher with 20% local memory. We also use 3PO to understand the fundamental software overheads of prefetching in a paging-based system, and the minimum performance penalty that they impose when we run applications under constrained local memory.
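To make the notion of an oblivious application concrete, the following is a minimal sketch and not 3PO's implementation. It records the page-level access trace of a naive matrix multiplication, shows that the trace is identical for different inputs, and replays that trace as a prefetch plan in a toy model of local memory. The names `matmul_page_trace` and `count_faults`, the constants `PAGE_SIZE` and `ELEM_SIZE`, the LRU eviction policy, and the assumption that every prefetch completes before the page is needed are all illustrative choices.

```python
import random
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes per page (assumed for illustration)
ELEM_SIZE = 8     # bytes per matrix element (assumed for illustration)


def matmul_page_trace(a, b, n):
    """Multiply two n x n row-major matrices and record the pages touched.

    The matrices A, B and C are laid out back to back in one hypothetical
    address space. Consecutive touches of the same page are collapsed.
    """
    base_a, base_b, base_c = 0, n * n * ELEM_SIZE, 2 * n * n * ELEM_SIZE
    c = [0.0] * (n * n)
    trace = []

    def touch(base, idx):
        page = (base + idx * ELEM_SIZE) // PAGE_SIZE
        if not trace or trace[-1] != page:
            trace.append(page)

    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                touch(base_a, i * n + k)
                touch(base_b, k * n + j)
                acc += a[i * n + k] * b[k * n + j]
            touch(base_c, i * n + j)
            c[i * n + j] = acc
    return c, trace


def count_faults(trace, local_pages, plan=None, lookahead=2):
    """Count accesses that miss in an LRU local memory of `local_pages` pages.

    If `plan` is given, the next `lookahead` pages of the plan are loaded
    after each access. Prefetches are assumed to finish before use, which
    a real system cannot always guarantee.
    """
    local = OrderedDict()

    def load(page):
        local[page] = True
        local.move_to_end(page)          # mark as most recently used
        while len(local) > local_pages:
            local.popitem(last=False)    # evict least recently used

    faults = 0
    for t, page in enumerate(trace):
        if page not in local:
            faults += 1
        load(page)
        if plan is not None:
            for nxt in plan[t + 1:t + 1 + lookahead]:
                load(nxt)
    return faults


if __name__ == "__main__":
    n = 32
    rng = random.Random(0)
    inputs = [[rng.random() for _ in range(n * n)] for _ in range(4)]
    _, plan = matmul_page_trace(inputs[0], inputs[1], n)   # planning run
    _, trace = matmul_page_trace(inputs[2], inputs[3], n)  # different inputs
    print("traces identical:", plan == trace)
    print("faults without prefetching:", count_faults(trace, 4))
    print("faults with planned prefetching:", count_faults(trace, 4, plan=plan))
```

Because the access pattern does not depend on the matrix contents, a trace recorded once can serve as an exact schedule for later runs. In this toy model only the very first access misses, whereas a real system still pays the software costs of issuing and handling prefetches.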


