ShadowTutor: Distributed Partial Distillation for Mobile Video DNN Inference

by Jae-Won Chung, et al.

Following the recent success of deep neural networks (DNNs) on video computer vision tasks, performing DNN inference on videos that originate from mobile devices has gained practical significance. Previous approaches therefore offload DNN inference computations for images to cloud servers to work around the resource constraints of mobile devices. For video data, however, communicating information about every frame consumes excessive network bandwidth and leaves the entire system susceptible to adverse network conditions such as congestion. In this work, we exploit the temporal coherence between nearby frames of a video stream to mitigate network pressure. We propose ShadowTutor, a distributed video DNN inference framework that reduces the number of network transmissions through intermittent knowledge distillation to a student model. Moreover, we update only a subset of the student's parameters, which we call partial distillation, to reduce the data size of each network transmission. Specifically, the server runs a large and general teacher model, while the mobile device runs only an extremely small but specialized student model. On sparsely selected key frames, the server partially trains the student model by targeting the teacher's response and sends the updated part to the mobile device. We investigate the effectiveness of ShadowTutor with HD video semantic segmentation. Evaluations show that network data transfer is reduced by 95% on average. Moreover, the throughput of the system improves by over three times and is robust to changes in network bandwidth.
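The key-frame loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the scalar "models", the parameter names (`backbone`, `head_w`, `head_b`), the helper functions, and the fixed key-frame stride are all assumptions made for illustration. It only shows the shape of the idea: the server fits a small trainable subset of the student to the teacher's output on a key frame, and only that subset is sent to the client.

```python
# Toy sketch of intermittent partial distillation.
# All names and the fixed stride are hypothetical, not taken from the paper.

def teacher(pixel):
    # Stand-in for the large, general server-side model.
    return 3.0 * pixel + 0.5

def student(params, pixel):
    # "backbone" stays frozen; only "head_w" and "head_b" are ever updated.
    hidden = params["backbone"] * pixel
    return params["head_w"] * hidden + params["head_b"]

def mse(params, frame):
    # Mean squared error between student and teacher outputs on one frame.
    return sum((student(params, p) - teacher(p)) ** 2 for p in frame) / len(frame)

def partial_distill(params, frame, steps=200, lr=0.05):
    """Server side: fit only the head to the teacher's output on a key frame.

    Returns just the updated subset, which is all that would be transmitted.
    """
    backbone = params["backbone"]
    head_w, head_b = params["head_w"], params["head_b"]
    n = len(frame)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for p in frame:
            hidden = backbone * p
            err = head_w * hidden + head_b - teacher(p)
            grad_w += 2 * err * hidden / n
            grad_b += 2 * err / n
        head_w -= lr * grad_w
        head_b -= lr * grad_b
    return {"head_w": head_w, "head_b": head_b}

def run_stream(frames, params, stride=4):
    """Client side: infer on every frame, refresh the head on key frames only."""
    params = dict(params)  # work on a copy of the client's student
    key_frames = 0
    outputs = []
    for i, frame in enumerate(frames):
        if i % stride == 0:  # assumed fixed stride for key-frame selection
            params.update(partial_distill(params, frame))
            key_frames += 1
        outputs.append([student(params, p) for p in frame])
    return params, key_frames, outputs
```

In this sketch, a stream of 10 frames with `stride=4` contacts the server on frames 0, 4, and 8 only, and each update carries two of the student's three parameters. A real system would use neural networks for both models, and its key-frame selection need not follow a fixed stride.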


Optical Flow Distillation: Towards Efficient and Stable Video Style Transfer

Video style transfer techniques inspire many exciting applications on mo...

Delta Distillation for Efficient Video Processing

This paper aims to accelerate video stream processing, such as object de...

I Have Seen Enough: A Teacher Student Network for Video Classification Using Fewer Frames

Over the past few years, various tasks involving videos such as classifi...

Online Model Distillation for Efficient Video Inference

High-quality computer vision models typically address the problem of und...

Toward Extremely Lightweight Distracted Driver Recognition With Distillation-Based Neural Architecture Search and Knowledge Transfer

The number of traffic accidents has been continuously increasing in rece...

AccMPEG: Optimizing Video Encoding for Video Analytics

With more videos being recorded by edge sensors (cameras) and analyzed b...

DNN-Driven Compressive Offloading for Edge-Assisted Semantic Video Segmentation

Deep learning has shown impressive performance in semantic segmentation,...
