Mobile-Cloud Inference for Collaborative Intelligence

06/24/2023
by Mateen Ulhaq, et al.

As AI applications for mobile devices become more prevalent, there is an increasing need for faster execution and lower energy consumption in deep learning model inference. Historically, the models deployed on mobile devices have been smaller and simpler than large state-of-the-art research models, which can run only in the cloud. However, cloud-only inference has drawbacks, including increased network bandwidth consumption and higher latency. It also requires the input data (images, audio) to be fully transferred to the cloud, raising concerns about potential privacy breaches. An alternative approach is shared mobile-cloud inference: partial inference is performed on the mobile device to reduce the dimensionality of the input data and produce a compact feature tensor, a latent-space representation of the input signal. The feature tensor is then transmitted to the server for the remainder of the inference. This strategy can reduce inference latency, energy consumption, and network bandwidth usage, and it also offers privacy protection, because the original signal never leaves the mobile device. Further performance gains can be achieved by compressing the feature tensor before transmission.
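To make the split-inference pipeline concrete, below is a minimal sketch in PyTorch. It assumes a ResNet-50 backbone split after layer2, with uniform 8-bit quantization followed by zlib entropy coding as the feature compression step; the split point and codec here are illustrative choices for the sketch, not the exact method of the paper.

```python
# Sketch of shared mobile-cloud inference: the mobile device computes the
# front of the network, compresses the intermediate feature tensor, and the
# server decompresses it and completes the inference.
import zlib

import numpy as np
import torch
from torchvision.models import resnet50

model = resnet50(weights=None).eval()

# --- Mobile side: partial inference up to the (assumed) split point ---
def mobile_forward(x: torch.Tensor) -> torch.Tensor:
    x = model.conv1(x)
    x = model.bn1(x)
    x = model.relu(x)
    x = model.maxpool(x)
    x = model.layer1(x)
    x = model.layer2(x)  # compact latent-space feature tensor
    return x

def compress(feat: torch.Tensor):
    """Uniform 8-bit quantization followed by lossless entropy coding."""
    f = feat.detach().numpy()
    lo, hi = float(f.min()), float(f.max())
    q = np.round((f - lo) / (hi - lo + 1e-8) * 255).astype(np.uint8)
    return zlib.compress(q.tobytes()), lo, hi, f.shape

# --- Server side: decompress and finish the inference ---
def server_forward(payload: bytes, lo: float, hi: float, shape) -> torch.Tensor:
    q = np.frombuffer(zlib.decompress(payload), dtype=np.uint8).reshape(shape)
    feat = torch.from_numpy(q.astype(np.float32) / 255 * (hi - lo) + lo)
    x = model.layer3(feat)
    x = model.layer4(x)
    x = model.avgpool(x)
    return model.fc(torch.flatten(x, 1))

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)  # stand-in for a camera frame
    payload, lo, hi, shape = compress(mobile_forward(image))
    logits = server_forward(payload, lo, hi, shape)
    print(f"transmitted {len(payload)} bytes; top class = {logits.argmax().item()}")
```

The split point governs the trade-off: an earlier split does less work on the device but transmits a larger feature tensor, while a deeper split shrinks the payload at the cost of more on-device computation.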
