Cloud-based or On-device: An Empirical Study of Mobile Deep Inference

07/14/2017
by Tian Guo, et al.

Modern mobile applications benefit significantly from advances in deep learning, e.g., real-time image recognition and conversational systems. Given a trained deep learning model, an application performs a series of matrix operations on the input data in order to infer likely output values. Because of their computational complexity and size, these trained models are often hosted in the cloud, so mobile apps must send input data over the network to use them. While cloud-based deep inference can provide reasonable response times, it restricts the usage scenarios, e.g., the mobile app must have network access. With mobile-specific deep learning optimizations, on-device inference is now possible. However, because mobile hardware, such as the GPU and memory, is far more limited than its desktop counterpart, it is important to understand the feasibility of this new on-device inference architecture. In this paper, we empirically evaluate the inference performance of three Convolutional Neural Networks (CNNs) using a benchmark Android application we developed. Our measurements and analysis suggest that on-device inference can incur up to two orders of magnitude longer response time and higher energy consumption than cloud-based inference, and that model loading and probability computation are the two performance bottlenecks for on-device deep inference.
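To make the "series of matrix operations" concrete, here is a minimal NumPy sketch of CNN-style inference: one convolution, a ReLU, a dense layer, and a softmax that produces class probabilities. This is a generic illustration, not code from the paper or its benchmark app; all weights and shapes are made up for the example.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive valid-mode 2D convolution (cross-correlation,
    as implemented by most deep learning frameworks)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(scores - np.max(scores))  # shift for numerical stability
    return e / e.sum()

# Toy inference pass with made-up "trained" weights.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))     # stand-in for the input data
kernel = rng.standard_normal((3, 3))    # stand-in for learned conv weights
dense_w = rng.standard_normal((36, 3))  # 6x6 feature map -> 3 classes

features = np.maximum(conv2d(image, kernel), 0.0)  # conv + ReLU
probs = softmax(features.reshape(-1) @ dense_w)    # dense + softmax
```

Real CNNs such as those benchmarked in the paper stack many such layers, which is why both loading the weights and computing the final probabilities become costly on constrained mobile hardware.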
