Accelerating Deep Learning Classification with Error-controlled Approximate-key Caching

12/13/2021
by Alessandro Finamore, et al.

While Deep Learning (DL) technologies are a promising tool for solving networking problems that map to classification tasks, their computational complexity is still too high with respect to real-time traffic-measurement requirements. To reduce the DL inference cost, we propose a novel caching paradigm, which we name approximate-key caching, that returns approximate results for lookups of selected inputs based on cached DL inference results. While approximate cache hits alleviate the DL inference workload and increase system throughput, they also introduce an approximation error. We therefore couple approximate-key caching with a principled error-correction algorithm, which we name auto-refresh. We analytically model the performance of our caching system for classic LRU and ideal caches, perform a trace-driven evaluation of the expected performance, and compare the benefits of our approach with state-of-the-art similarity caching, demonstrating the practical interest of our proposal.
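The idea can be illustrated with a minimal sketch: an LRU cache whose keys are a coarsened ("approximate") version of the input, so that nearby inputs collide on the same entry and are served from the cache instead of triggering full DL inference, with a periodic auto-refresh re-running the model on hits to bound the approximation error. The quantization-based key scheme, the class and parameter names, and the refresh policy below are illustrative assumptions, not the paper's actual algorithm.

```python
from collections import OrderedDict

def approx_key(features, step=0.5):
    # Hypothetical key-approximation scheme: quantize the feature
    # vector so that nearby inputs map to the same cache key.
    return tuple(round(x / step) for x in features)

class ApproxKeyCache:
    def __init__(self, model, capacity=1024, refresh_every=100):
        self.model = model                  # expensive DL classifier: features -> label
        self.capacity = capacity            # LRU capacity
        self.refresh_every = refresh_every  # auto-refresh period, in hits per entry
        self.cache = OrderedDict()          # approx key -> (label, hit_count)

    def classify(self, features):
        key = approx_key(features)
        if key in self.cache:
            label, hits = self.cache.pop(key)
            hits += 1
            # Auto-refresh (sketch): periodically re-run the model on a
            # hit so a wrong or stale cached label gets corrected.
            if hits % self.refresh_every == 0:
                label = self.model(features)
            self.cache[key] = (label, hits)  # reinsert at MRU position
            return label
        label = self.model(features)         # cache miss: full inference
        self.cache[key] = (label, 0)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the LRU entry
        return label
```

For example, with a toy classifier `lambda f: int(sum(f) > 1.0)`, the inputs `(0.9, 0.9)` and `(0.91, 0.89)` quantize to the same key, so the second lookup is an approximate hit served without invoking the model.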


Related research:

- 12/13/2018, Distributed Deep Learning at the Edge: A Novel Proactive and Cooperative Caching Framework for Mobile Edge Networks
  This letter proposes two novel proactive cooperative caching approaches ...

- 09/18/2022, Improving the Performance of DNN-based Software Services using Automated Layer Caching
  Deep Neural Networks (DNNs) have become an essential component in many a...

- 02/07/2020, Accelerating Deep Learning Inference via Freezing
  Over the last few years, Deep Neural Networks (DNNs) have become ubiquit...

- 09/21/2023, Performance Model for Similarity Caching
  Similarity caching allows requests for an item to be served by a similar...

- 03/09/2020, Lightweight Inter-transaction Caching with Precise Clocks and Dynamic Self-invalidation
  Distributed, transactional storage systems scale by sharding data across...

- 10/24/2020, Satisfying Increasing Performance Requirements with Caching at the Application Level
  Application-level caching is a form of caching that has been increasingl...

- 06/11/2020, Is deep learning necessary for simple classification tasks?
  Automated machine learning (AutoML) and deep learning (DL) are two cutti...
