Gradient TRIX

01/12/2023
by Christoph Lenzen, et al.

Gradient clock synchronization (GCS) algorithms minimize the worst-case clock offset between the nodes in a distributed network of diameter D and size n. They achieve optimal offsets of Θ(log D) locally, i.e., between adjacent nodes, as shown by Lenzen et al., and Θ(D) globally, as shown by Biaz and Welch. As demonstrated in the work of Bund et al., this is a highly promising approach for improved clocking schemes for large-scale synchronous Systems-on-Chip (SoC). Unfortunately, in large systems, faults hinder their practical use. The state-of-the-art fault-tolerant GCS algorithm, as presented by Bund et al., has a drawback that is fatal in this setting: it relies on node and edge replication. For f=1, this translates to at least 16-fold edge replication and high-degree nodes, far from the optimum of 2f+1=3 for tolerating up to f faulty neighbors. In this work, we present a self-stabilizing GCS algorithm for a grid-like directed graph with optimal node in- and out-degrees of 3 that tolerates 1 faulty in-neighbor. If nodes fail with independent probability p ∈ o(n^(-1/2)), it achieves an asymptotically optimal local skew of Θ(log D) with probability 1-o(1); this holds under general worst-case assumptions on link delay and clock speed variations, provided they change slowly relative to the speed of the system. This failure probability is the largest possible that ensures, with probability 1-o(1), that at most one in-neighbor fails for each node. As modern hardware is clocked at gigahertz speeds and the algorithm can simultaneously sustain a constant number of arbitrary changes due to faults in each clock cycle, this results in sufficient robustness to dramatically increase the size of reliable synchronously clocked SoCs.
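
The threshold p ∈ o(n^(-1/2)) can be made plausible with a back-of-the-envelope union bound. The sketch below is an illustration only, not the paper's proof. It assumes that every node has exactly 3 in-neighbors and that nodes fail independently with probability p. The function names are hypothetical. A node is at risk when 2 or more of its in-neighbors fail, which happens with probability 3p^2(1-p) + p^3, roughly 3p^2. Summing over all n nodes gives a bound of about 3np^2, which tends to 0 exactly when p shrinks faster than n^(-1/2).

```python
from math import comb


def prob_two_or_more_faulty(p: float, in_degree: int = 3) -> float:
    """Probability that at least two of `in_degree` in-neighbors fail.

    Assumes independent failures with probability p (illustrative model).
    Summing the binomial terms directly avoids cancellation for small p.
    """
    return sum(
        comb(in_degree, k) * p**k * (1 - p) ** (in_degree - k)
        for k in range(2, in_degree + 1)
    )


def union_bound(n: int, p: float, in_degree: int = 3) -> float:
    """Upper bound on the probability that some node among n has
    two or more faulty in-neighbors (capped at 1)."""
    return min(1.0, n * prob_two_or_more_faulty(p, in_degree))


if __name__ == "__main__":
    n = 10**6
    # p well below n^(-1/2) = 1e-3: the bound is small.
    print(union_bound(n, 1e-4))
    # p equal to n^(-1/2): the bound is no longer informative.
    print(union_bound(n, 1e-3))
```

With n = 10^6, a failure probability of 10^-4 keeps the bound near 0.03, while 10^-3 (that is, n^(-1/2)) already pushes it to the trivial value 1. This matches the abstract's claim that the tolerated failure probability cannot be raised further.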

research
02/21/2019

Fault Tolerant Gradient Clock Synchronization

Synchronizing clocks in distributed systems is well-understood, both in ...
research
10/03/2020

TRIX: Low-Skew Pulse Propagation for Fault-Tolerant Hardware

The vast majority of hardware architectures use a carefully timed refere...
research
08/29/2023

PALS: Distributed Gradient Clocking on Chip

Consider an arbitrary network of communicating modules on a chip, each r...
research
03/11/2020

PALS: Plesiochronous and Locally Synchronous Systems

Consider an arbitrary network of communicating modules on a chip, each r...
research
09/10/2018

Resilience Bounds of Sensing-Based Network Clock Synchronization

Recent studies exploited external periodic synchronous signals to synchr...
research
03/29/2020

Optimal Good-case Latency for Byzantine Broadcast and State Machine Replication

This paper investigates Byzantine broadcast (BB) protocols with optimal ...
research
03/24/2023

On the Susceptibility of QDI Circuits to Transient Faults

By design, quasi delay-insensitive (QDI) circuits exhibit higher resilie...
