hXDP: Efficient Software Packet Processing on FPGA NICs

by Marco Spaziani Brunella, et al.

FPGA accelerators on the NIC enable the offloading of expensive packet processing tasks from the CPU. However, FPGAs have limited resources that may need to be shared among diverse applications, and programming them is difficult. We present a solution to run Linux's eXpress Data Path (XDP) programs written in eBPF on FPGAs, using only a fraction of the available hardware resources while matching the performance of high-end CPUs. The iterative execution model of eBPF is not a good fit for FPGA accelerators. Nonetheless, we show that many of the instructions of an eBPF program can be compressed, parallelized, or completely removed when targeting a purpose-built FPGA executor, thereby significantly improving performance. We leverage this to design hXDP, which includes (i) an optimizing compiler that parallelizes and translates eBPF bytecode to an extended eBPF instruction-set architecture defined by us; (ii) a soft-CPU to execute such instructions on the FPGA; and (iii) an FPGA-based infrastructure to provide XDP's maps and helper functions as defined within the Linux kernel. We implement hXDP on an FPGA NIC and evaluate it running real-world unmodified eBPF programs. Our implementation is clocked at 156.25 MHz, uses about 15% of the FPGA resources, and can run dynamically loaded programs. Despite these modest requirements, it achieves the packet processing throughput of a high-end CPU core and provides a 10x lower packet forwarding latency.
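To make the idea of parallelizing a sequential instruction stream more concrete, the following is a minimal illustrative sketch in Python. It is not the hXDP compiler's algorithm: the `Insn` type, the `bundle` function, the register names, and the default lane count are all hypothetical and chosen only for illustration. The sketch greedily packs consecutive instructions into one row as long as none of them reads or writes a register written earlier in the same row, so each row could in principle be issued in a single step by a multi-lane executor.

```python
from typing import List, NamedTuple, Tuple


class Insn(NamedTuple):
    """Hypothetical three-address instruction: one destination, zero or more sources."""
    text: str
    dst: str
    srcs: Tuple[str, ...] = ()


def bundle(program: List[Insn], lanes: int = 4) -> List[List[Insn]]:
    """Pack instructions, in program order, into rows of at most `lanes` entries.

    A new row is started when the current row is full or when the next
    instruction reads or writes a register already written in the current row.
    The lane count is an assumption made for this sketch.
    """
    rows: List[List[Insn]] = []
    current: List[Insn] = []
    written = set()  # registers written by instructions in the current row
    for insn in program:
        conflict = insn.dst in written or any(s in written for s in insn.srcs)
        if current and (conflict or len(current) == lanes):
            rows.append(current)
            current, written = [], set()
        current.append(insn)
        written.add(insn.dst)
    if current:
        rows.append(current)
    return rows


program = [
    Insn("r1 = 0", "r1"),
    Insn("r2 = 14", "r2"),
    Insn("r3 = r1 + r2", "r3", ("r1", "r2")),
    Insn("r4 = 7", "r4"),
]

for row in bundle(program):
    print(" | ".join(insn.text for insn in row))
```

In this toy program the first two instructions are independent and share a row, while the third depends on both and therefore starts a new row. A real compiler would also reorder instructions and handle memory and control-flow dependencies, which this sketch deliberately ignores.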




