
Domain-Specific Computational Storage for Serverless Computing

by   Rohan Mahapatra, et al.
University of California, San Diego
The University of Kansas

While (1) serverless computing is emerging as a popular form of cloud execution, datacenters are going through major changes: (2) storage disaggregation at the system-infrastructure level and (3) integration of domain-specific accelerators at the hardware level. Each of these three trends individually provides significant benefits; however, when combined, the benefits diminish. Specifically, the paper makes the key observation that for serverless functions, the overhead of accessing disaggregated persistent storage overshadows the gains from accelerators. Therefore, to benefit from all these trends in conjunction, we propose Domain-Specific Computational Storage for Serverless (DSCS-Serverless). This idea contributes a serverless model that leverages a programmable accelerator within computational storage to reap the benefits of acceleration and storage disaggregation simultaneously. Our results with eight applications show that integrating a comparatively small accelerator within the storage (DSCS-Serverless) that fits within its power constraints (15 Watts) significantly outperforms a traditional disaggregated system that utilizes an NVIDIA RTX 2080 Ti GPU (250 Watts). Further, the work highlights that disaggregation, the serverless model, and the limited power budget for computation in storage require a different design than the conventional practice of integrating microprocessors or FPGAs. This insight stands in contrast with current computational storage designs, which have yet to address the challenges associated with these shifts in datacenters. Compared with two such conventional designs that use either a quad-core ARM A57 or a Xilinx FPGA, DSCS-Serverless provides 3.7x and 1.7x end-to-end application speedup, 4.3x and 1.9x energy reduction, and 3.2x and 2.3x higher cost efficiency, respectively.
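The abstract's key observation, that data movement to disaggregated storage can overshadow accelerator gains, can be illustrated with a back-of-envelope latency model. All parameter values below (data size, link bandwidth, compute times, and the `end_to_end_latency` helper itself) are hypothetical assumptions for illustration, not measurements from the paper:

```python
# Hypothetical sketch: why near-storage compute can beat a faster remote
# accelerator once data-transfer time is counted. Numbers are illustrative
# assumptions, not results from the paper.

def end_to_end_latency(data_gb: float, link_gbps: float, compute_s: float) -> float:
    """Total latency = time to move the input over the network + compute time."""
    transfer_s = data_gb * 8 / link_gbps  # convert GB to Gb, divide by link rate
    return transfer_s + compute_s

# Disaggregated setup: ship 4 GB from storage to a remote GPU over a
# 10 Gbps link, then compute in 0.5 s on the fast accelerator.
remote = end_to_end_latency(data_gb=4, link_gbps=10, compute_s=0.5)

# Computational storage: data never leaves the drive (no transfer), but the
# small in-storage accelerator is assumed to take 4x longer to compute.
in_storage = end_to_end_latency(data_gb=0, link_gbps=10, compute_s=2.0)

print(f"remote accelerator: {remote:.2f} s, in-storage: {in_storage:.2f} s")
```

Under these assumed parameters the slower in-storage accelerator still finishes first (2.0 s vs. 3.7 s), because the network transfer dominates the remote path; this is the trade-off the paper's 15 W design exploits.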



