Towards Efficient Deep Inference for Mobile Applications

07/14/2017
by Tian Guo, et al.

Mobile applications are benefiting significantly from advances in deep learning, for example by offering new features. Given a trained deep learning model, an application typically performs a series of matrix operations on the input data to infer possible output values. Because of the computational complexity and growing size of these models, they are usually hosted in the cloud, and mobile apps that use them must send input data over the network. While cloud-based deep learning can provide reasonable response times for mobile apps, it also restricts the usage scenarios, e.g., the apps must have network access. With mobile-specific deep learning optimizations, it is now possible to perform inference on the device. However, because mobile hardware, e.g., GPU and memory size, can be very different from and more limited than its desktop counterparts, it is important to understand the feasibility of this new device-based deep learning inference architecture. In this paper, we empirically evaluate the inference efficiency of three Convolutional Neural Networks using a benchmark Android application we developed. Based on our application-driven analysis, we identify several performance bottlenecks for mobile applications powered by on-device deep learning inference.

