Enabling the Reflex Plane with the nanoPU

12/13/2022
by Stephen Ibanez, et al.

Many recent papers have demonstrated fast in-network computation using programmable switches, running many orders of magnitude faster than CPUs. The main limitation of writing software for switches is the constrained programming model and limited state. In this paper we explore whether a new type of CPU, called the nanoPU, offers a useful middle ground, with a familiar C/C++ programming model, potentially many terabits/second of packet processing on a single chip, and an RPC response time of less than 1 μs. To evaluate the nanoPU, we prototype and benchmark three common network services on it: packet classification, network telemetry report processing, and consensus protocols. Each service is evaluated using cycle-accurate simulations on FPGAs in AWS. We found that packets are classified 2× faster, and INT reports are processed more than an order of magnitude faster, than with state-of-the-art approaches. Our production-quality Raft consensus protocol, running on the nanoPU, writes to a 3-way replicated key-value store (MICA) in 3 μs, twice as fast as the state of the art, with a 99% tail latency of only 3.26 μs. To understand how these services can be combined, we study the design and performance of a network reflex plane, designed to process telemetry data, make fast control decisions, and update consistent, replicated state within a few microseconds.


