Zero-CPU Collection with Direct Telemetry Access

10/11/2021
by Jonatan Langlet, et al.

Programmable switches are driving a massive increase in fine-grained measurements. This puts significant pressure on telemetry collectors that have to process reports from many switches. Past research acknowledged this problem by either improving collectors' stack performance or by limiting the amount of data sent from switches. In this paper, we take a different and radical approach: switches are responsible for directly inserting queryable telemetry data into the collectors' memory, bypassing their CPU, and thereby improving their collection scalability. We propose to use a method we call direct telemetry access, where switches jointly write telemetry reports directly into the same collector's memory region, without coordination. Our solution, DART, is probabilistic, trading memory redundancy and query success probability for CPU resources at collectors. We prototype DART using commodity hardware such as P4 switches and RDMA NICs and show that we get high query success rates with a reasonable memory overhead. For example, we can collect INT path tracing information on a fat tree topology without a collector's CPU involvement while achieving 99.9% query success probability and using just 300 bytes per flow.
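The idea of uncoordinated, redundant writes into a shared memory region can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the DART implementation: the abstract does not specify the slot layout, the hash functions, or the query procedure, so the 8-byte slot format, the SHA-256-based hashing, the number of copies, and the names `CollectorMemory`, `write`, and `query` are all hypothetical. A bytearray stands in for the collector's RDMA-registered memory, and `write` stands in for a switch issuing one-sided writes.

```python
import hashlib
import struct
from collections import Counter

# Assumed slot layout (not from the paper): 32-bit key checksum + 32-bit value.
SLOT_FORMAT = "<II"
SLOT_SIZE = struct.calcsize(SLOT_FORMAT)  # 8 bytes


def _hash(key: bytes, salt: int) -> int:
    """Salted hash; a stand-in for whatever hash units a switch would provide."""
    digest = hashlib.sha256(bytes([salt]) + key).digest()
    return int.from_bytes(digest[:8], "little")


class CollectorMemory:
    """Hypothetical model of a collector memory region shared by all writers."""

    def __init__(self, num_slots: int, copies: int = 2):
        self.num_slots = num_slots
        self.copies = copies  # redundancy: slots written per report
        self.region = bytearray(num_slots * SLOT_SIZE)

    def _checksum(self, key: bytes) -> int:
        return _hash(key, 255) & 0xFFFFFFFF

    def _offsets(self, key: bytes) -> list:
        # Each copy lands in a slot chosen by an independent salted hash.
        return [(_hash(key, i) % self.num_slots) * SLOT_SIZE
                for i in range(self.copies)]

    def write(self, key: bytes, value: int) -> None:
        """Blind overwrite of every copy: no read, no lock, no coordination."""
        record = struct.pack(SLOT_FORMAT, self._checksum(key), value)
        for off in self._offsets(key):
            self.region[off:off + SLOT_SIZE] = record

    def query(self, key: bytes):
        """Return the value held by surviving copies, or None if all were overwritten."""
        want = self._checksum(key)
        votes = Counter()
        for off in self._offsets(key):
            checksum, value = struct.unpack_from(SLOT_FORMAT, self.region, off)
            if checksum == want:
                votes[value] += 1
        return votes.most_common(1)[0][0] if votes else None


mem = CollectorMemory(num_slots=1024, copies=2)
mem.write(b"flow-1", 42)
print(mem.query(b"flow-1"))
```

In this toy model a query fails only when every copy of a report has been overwritten by reports for other keys, which is how memory redundancy and query success probability trade off against each other: more slots or more copies per key cost memory but make total loss less likely.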


