TRIX: Low-Skew Pulse Propagation for Fault-Tolerant Hardware

10/03/2020
by Christoph Lenzen, et al.

The vast majority of hardware architectures use a carefully timed reference signal to clock their computational logic. However, standard distribution solutions are not fault-tolerant. In this work, we present a simple grid structure as a more reliable clock propagation method and study it by means of simulation experiments. Fault-tolerance is achieved by forwarding clock pulses on arrival of the second of three incoming signals from the previous layer. A key question is how well neighboring grid nodes are synchronized, even without faults. Analyzing the clock skew under typical-case conditions is highly challenging. Because the forwarding mechanism involves taking the median, standard probabilistic tools fail, even when modeling link delays just by unbiased coin flips. Our statistical approach provides substantial evidence that this system performs surprisingly well. Specifically, in an "infinitely wide" grid of height H, the delay at a pre-selected node exhibits a standard deviation of O(H^(1/4)) (≈ 2.7 link delay uncertainties for H = 2000) and skew between adjacent nodes of o(log log H) (≈ 0.77 link delay uncertainties for H = 2000). We conclude that the proposed system is a very promising clock distribution method. This leads to the open problem of a stochastic explanation of the tight concentration of delays and skews. More generally, we believe that understanding our very simple abstraction of the system is of mathematical interest in its own right.
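The forwarding rule described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' experimental setup: it assumes that node i in a layer listens to nodes i-1, i, and i+1 of the previous layer, that each link delay is an independent coin flip in {0, 1}, and that a finite ring (wrap-around) stands in for the "infinitely wide" grid. The names `simulate_trix` and `adjacent_skews` are hypothetical.

```python
import random
import statistics


def simulate_trix(height, width, rng):
    """Propagate one pulse through `height` layers of a wrap-around grid.

    Assumed topology: node i receives from nodes i-1, i, i+1 of the
    previous layer. Assumed delay model: each link adds 0 or 1 with
    equal probability (an unbiased coin flip).
    """
    times = [0.0] * width  # layer 0: all nodes fire simultaneously
    for _ in range(height):
        next_times = []
        for i in range(width):
            arrivals = sorted(
                times[(i + d) % width] + rng.randint(0, 1)
                for d in (-1, 0, 1)
            )
            # Forward on the second of three arrivals, i.e. the median.
            next_times.append(arrivals[1])
        times = next_times
    return times


def adjacent_skews(times):
    """Absolute time difference between each node and its right neighbor."""
    n = len(times)
    return [abs(times[i] - times[(i + 1) % n]) for i in range(n)]


rng = random.Random(0)  # fixed seed so runs are repeatable
final = simulate_trix(height=200, width=64, rng=rng)
print("delay std across top layer:", statistics.pstdev(final))
print("mean adjacent skew:", statistics.mean(adjacent_skews(final)))
```

Because the ring is narrow relative to its height, wrap-around correlations will affect the numbers, so the printed values should not be compared directly with the figures quoted for H = 2000. Waiting for the second arrival is what gives the fault tolerance: a single faulty predecessor that fires too early or never fires cannot move the median outside the range spanned by the two correct arrivals.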


