GCS: Generalized Cache Coherence For Efficient Synchronization

01/06/2023
by   Yanpeng Yu, et al.
We explore the design of scalable synchronization primitives for disaggregated shared memory. Porting existing synchronization primitives to disaggregated shared memory results in poor scalability with the number of application threads, because these primitives are layered atop cache-coherence substrates, which engenders redundant inter-core communication. The substantially higher cache-coherence latency (on the order of μs) and substantially lower bandwidth of state-of-the-art disaggregated shared memory designs amplify the impact of this redundant communication and preclude scalability. In this work, we argue for co-designing the cache-coherence and synchronization layers to achieve better performance scaling of multi-threaded applications on disaggregated memory. This is driven by our observation that synchronization primitives are essentially a generalization of cache-coherence protocols in time and space. We present GCS as an implementation of this co-design. GCS employs wait queues and arbitrarily-sized cache lines directly at the cache-coherence protocol layer for temporal and spatial generalization, respectively. We evaluate GCS against the layered approach to synchronization primitives, represented by the pthread implementation of the reader-writer lock, and show that GCS improves in-memory key-value store performance at scale by 1-2 orders of magnitude.
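
The idea of wait queues and arbitrarily-sized cache lines at the coherence layer can be illustrated with a toy model. The sketch below is not the GCS protocol or its implementation; the names `Region`, `ToyDirectory`, `acquire`, and `release` are hypothetical, and the shared ("S") and modified ("M") states are borrowed from textbook MSI coherence. It only shows how a directory entry that covers a region of any size and holds a FIFO wait queue ends up behaving like a reader-writer lock.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Region:
    """One coherence unit of arbitrary size (hypothetical, for illustration)."""
    start: int
    size: int                                      # spatial generalization: any size
    readers: set = field(default_factory=set)      # holders in Shared state
    writer: object = None                          # holder in Modified state
    waiters: deque = field(default_factory=deque)  # temporal generalization: FIFO queue


class ToyDirectory:
    """Toy directory that queues conflicting requests instead of retrying them."""

    def __init__(self):
        self.regions = {}

    def register(self, start, size):
        self.regions[start] = Region(start, size)

    def acquire(self, start, tid, mode):
        """Request 'S' or 'M' access. Returns True if granted now, False if queued."""
        r = self.regions[start]
        # Grant only if compatible and nobody is already waiting (keeps FIFO order).
        if not r.waiters and self._compatible(r, mode):
            self._grant(r, tid, mode)
            return True
        r.waiters.append((tid, mode))
        return False

    def release(self, start, tid):
        """Drop tid's access and return the thread ids granted as a result."""
        r = self.regions[start]
        if r.writer == tid:
            r.writer = None
        else:
            r.readers.discard(tid)
        granted = []
        # Hand the region to queued requesters in arrival order while compatible.
        while r.waiters and self._compatible(r, r.waiters[0][1]):
            next_tid, next_mode = r.waiters.popleft()
            self._grant(r, next_tid, next_mode)
            granted.append(next_tid)
        return granted

    @staticmethod
    def _compatible(r, mode):
        if mode == "S":
            return r.writer is None
        return r.writer is None and not r.readers

    @staticmethod
    def _grant(r, tid, mode):
        if mode == "S":
            r.readers.add(tid)
        else:
            r.writer = tid
```

In this toy model, two readers of a 4096-byte region are granted immediately, a writer arriving next is queued, and it receives the region only when the last reader releases. A real protocol would also have to move data and handle failures, which this sketch ignores.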

research
01/19/2021

SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures

Near-Data-Processing (NDP) architectures present a promising way to alle...
research
10/16/2022

RevaMp3D: Architecting the Processor Core and Cache Hierarchy for Systems with Monolithically-Integrated Logic and Memory

Recent nano-technological advances enable the Monolithic 3D (M3D) integr...
research
06/23/2017

Predictable Cache Coherence for Multi-Core Real-Time Systems

This work addresses the challenge of allowing simultaneous and predictab...
research
07/22/2020

Analytical Modeling the Multi-Core Shared Cache Behavior with Considerations of Data-Sharing and Coherence

To mitigate the ever worsening "Power wall" and "Memory wall" problems, ...
research
09/30/2022

Hardware Trojan Threats to Cache Coherence in Modern 2.5D Chiplet Systems

As industry moves toward chiplet-based designs, the insertion of hardwar...
research
10/12/2018

Compact NUMA-Aware Locks

Modern multi-socket architectures exhibit non-uniform memory access (NUM...
research
10/03/2018

BRAVO – Biased Locking for Reader-Writer Locks

Designers of modern reader-writer locks confront a difficult trade-off r...
