Reliable and Distributed Network Monitoring via In-band Network Telemetry

12/30/2022
by Goksel Simsek, et al.

Traditional network monitoring solutions usually lack scalability because they are centralized: a single controller collects heartbeats from all network components. As a solution, the In-Band Network Telemetry (INT) framework was recently proposed to collect network telemetry information in a more autonomous and distributed manner by employing programmable switches. However, INT raises two further challenges: (i) finding suitable INT paths that optimize control overhead and information freshness, and (ii) ensuring reliable delivery of control information over multi-hop INT paths. In this work, we propose a monitoring scheme, reliable Graph Partitioned INT (GPINT), which extends our previous work by integrating shared queue ring (SQR) as a reliability feature. SQR protects telemetry collection against failures caused by network congestion and link degradation, which could otherwise lead to a loss of network visibility. We implement our proposal in P4, a recent data plane programming language, and compare it with the traditional Simple Network Management Protocol (SNMP) and with another state-of-the-art study that employs Euler's method for INT path generation. Our analysis first shows the importance of a data recovery mechanism against packet losses under different network conditions. Our emulation results then indicate that GPINT with the reliability extension considerably outperforms its counterpart in telemetry collection latency and monitoring overhead, even under heavy packet loss.
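
As a rough illustration of why multi-hop INT paths need a recovery mechanism, the sketch below estimates the chance that a telemetry packet survives an entire path. It is not taken from the paper: it assumes every link drops packets independently with the same probability, and the function names, the per-hop retry model, and the sample values are all hypothetical. In particular, the retry model is a generic stand-in for hop-by-hop recovery and does not model SQR.

```python
def delivery_probability(per_link_loss: float, hops: int) -> float:
    """Chance that a packet crosses `hops` links with no recovery.

    Assumption (not from the paper): losses on each link are independent
    and share the same probability `per_link_loss`.
    """
    if not 0.0 <= per_link_loss <= 1.0:
        raise ValueError("per_link_loss must be in [0, 1]")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return (1.0 - per_link_loss) ** hops


def delivery_probability_with_retries(
    per_link_loss: float, hops: int, attempts: int
) -> float:
    """Same path, but each hop may send the packet up to `attempts` times.

    Hypothetical hop-by-hop recovery model: a hop fails only if every
    one of its attempts is lost.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    if not 0.0 <= per_link_loss <= 1.0:
        raise ValueError("per_link_loss must be in [0, 1]")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    per_hop_success = 1.0 - per_link_loss ** attempts
    return per_hop_success ** hops


if __name__ == "__main__":
    # Illustrative values only: 5% loss per link, paths of growing length.
    for hops in (1, 5, 10, 20):
        plain = delivery_probability(0.05, hops)
        recovered = delivery_probability_with_retries(0.05, hops, attempts=2)
        print(f"hops={hops:2d}  no recovery={plain:.3f}  one retry={recovered:.3f}")
```

Under these assumptions the delivery probability without recovery decays geometrically with path length, which is why longer INT paths make loss recovery more important.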

research
12/17/2020

Reliability Aware Multiple Path Installation in Software Defined Networking

Being a state-of-the-art network, Software Defined Networking (SDN) deco...
research
08/21/2018

FastReact: In-Network Control and Caching for Industrial Control Networks using Programmable Data Planes

Providing network reliability as well as low and predictable latency is ...
research
09/26/2019

Programmable Event Detection for In-Band Network Telemetry

In-Band Network Telemetry (INT) is a novel framework for collecting tele...
research
09/21/2020

NetReduce: RDMA-Compatible In-Network Reduction for Distributed DNN Training Acceleration

We present NetReduce, a novel RDMA-compatible in-network reduction archi...
research
10/09/2020

P4-CoDel: Experiences on Programmable Data Plane Hardware

Fixed buffer sizing in computer networks, especially the Internet, is a ...
research
07/19/2022

P4TE: PISA Switch Based Traffic Engineering in Fat-Tree Data Center Networks

This work presents P4TE, an in-band traffic monitoring, load-aware packe...
research
08/16/2022

FlEC: Enhancing QUIC with application-tailored reliability mechanisms

Packet losses are common events in today's networks. They usually result...
