Direct Telemetry Access

02/04/2022
by Jonatan Langlet, et al.

The emergence of programmable switches allows operators to collect a vast amount of fine-grained telemetry data in real time. However, consolidating the telemetry reports at centralized collectors to gain a network-wide view poses an immense challenge. The received data has to be transported from the switches, parsed, manipulated, and inserted into queryable data structures. As the network scales, this requires excessive CPU processing. RDMA is a transport protocol that bypasses the CPU and allows extremely high data transfer rates. Yet, RDMA is not designed for telemetry collection: it requires a stateful connection, supports only a small number of concurrent writers, and has limited writing primitives, which restricts its applicability to data aggregation. We introduce Direct Telemetry Access (DTA), a solution that allows fast and efficient telemetry collection, aggregation, and indexing. Our system establishes RDMA connections only from collectors' ToR switches, called translators, that process DTA reports from all other switches. DTA features novel and expressive reporting primitives such as Key-Write, Append, Sketch-Merge, and Key-Increment that allow integration of telemetry systems such as INT and others. The translators then aggregate, batch, and write the reports to collectors' memory in queryable form.
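
To make the division of labor between translators and collector memory concrete, the following is a minimal toy sketch in Python. It is not the DTA implementation. The abstract names the four primitives but does not define them, so the semantics here are assumptions inferred from the names alone: Key-Write overwrites a hashed slot, Key-Increment adds to a hashed counter, Append pushes onto a list, and Sketch-Merge adds a reported counter array element-wise. The class names, the CRC32 slot hashing, and the batch size are all hypothetical.

```python
import zlib
from collections import defaultdict


class ToyCollectorMemory:
    """Toy stand-in for a collector's memory regions (hypothetical layout)."""

    def __init__(self, slots=1024, sketch_width=64):
        self.slots = slots
        self.kv = [None] * slots          # assumed Key-Write region
        self.counters = [0] * slots       # assumed Key-Increment region
        self.lists = defaultdict(list)    # assumed Append region, one list per id
        self.sketch = [0] * sketch_width  # assumed Sketch-Merge region

    def _slot(self, key: bytes) -> int:
        # Assumption: keys map to slots by a hash; CRC32 is only a placeholder.
        return zlib.crc32(key) % self.slots

    def key_write(self, key: bytes, value):
        # Overwrite the slot for this key (collisions simply overwrite here).
        self.kv[self._slot(key)] = value

    def key_increment(self, key: bytes, delta: int):
        # Add delta to the counter slot for this key.
        self.counters[self._slot(key)] += delta

    def append(self, list_id: int, item):
        # Push one item onto the list identified by list_id.
        self.lists[list_id].append(item)

    def sketch_merge(self, remote):
        # Element-wise sum of a reported sketch into the stored one.
        if len(remote) != len(self.sketch):
            raise ValueError("sketch widths differ")
        self.sketch = [a + b for a, b in zip(self.sketch, remote)]


class ToyTranslator:
    """Buffers reports from switches and applies them to memory in batches."""

    def __init__(self, memory: ToyCollectorMemory, batch_size=4):
        self.memory = memory
        self.batch_size = batch_size
        self.pending = []

    def report(self, op: str, *args):
        # Queue one report; flush once a full batch has accumulated.
        self.pending.append((op, args))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Apply every queued report to memory, then empty the queue.
        for op, args in self.pending:
            getattr(self.memory, op)(*args)
        self.pending.clear()
```

In this sketch only the translator touches collector memory, mirroring the abstract's point that RDMA connections are established only from the translators; a switch would call something like `translator.report("key_increment", b"flow-a", 1)` and never write to memory directly.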
