Sequential Dynamic Decision Making with Deep Neural Nets on a Test-Time Budget

by Henghui Zhu, et al.

Deep neural network (DNN) based approaches hold significant potential for reinforcement learning (RL) and have already shown remarkable gains over state-of-the-art methods in a number of applications. The effectiveness of DNN methods can be attributed to leveraging the abundance of supervised data to learn value functions, Q-functions, and policy function approximations without the need for feature engineering. Nevertheless, deploying DNN-based predictors with very deep architectures can be problematic at test time due to computational and other resource constraints in a number of applications. We propose a novel approach for reducing average latency by learning a computationally efficient gating function that recognizes states in a sequential decision process for which the policy prescriptions of a shallow network suffice and the deeper layers of the DNN offer little marginal utility. The overall system is adaptive in that it dynamically switches control actions based on state estimates in order to reduce average latency without sacrificing terminal performance. We experiment with a number of alternative loss functions for training gating functions and shallow policies, and show that in a number of applications a speed-up of up to almost 5X can be obtained with little loss in performance.
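The abstract describes routing each state either to a cheap shallow policy or to the full deep policy via a learned gating function. The following is a minimal toy sketch of that test-time routing step, not the authors' implementation: the network shapes, the linear sigmoid gate, and the confidence threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a one-layer "shallow" policy, a two-layer "deep"
# policy, and a linear gate that decides which one to run for each state.
STATE_DIM, N_ACTIONS = 4, 3

W_shallow = rng.normal(size=(STATE_DIM, N_ACTIONS))   # shallow policy weights
W1_deep = rng.normal(size=(STATE_DIM, 16))            # deep policy, layer 1
W2_deep = rng.normal(size=(16, N_ACTIONS))            # deep policy, layer 2
w_gate = rng.normal(size=STATE_DIM)                   # gating function weights

def shallow_policy(s):
    # Single linear layer: cheap to evaluate at test time.
    return int(np.argmax(s @ W_shallow))

def deep_policy(s):
    # Extra hidden layer: higher latency, potentially better actions.
    return int(np.argmax(np.tanh(s @ W1_deep) @ W2_deep))

def gated_action(s, threshold=0.5):
    """Run the shallow policy when the gate is confident it suffices."""
    gate_prob = 1.0 / (1.0 + np.exp(-(s @ w_gate)))  # sigmoid confidence
    if gate_prob > threshold:
        return shallow_policy(s), "shallow"
    return deep_policy(s), "deep"

# Route a batch of random states and report the fraction handled cheaply.
states = rng.normal(size=(100, STATE_DIM))
routes = [gated_action(s)[1] for s in states]
print(f"shallow fraction: {routes.count('shallow') / len(routes):.2f}")
```

Raising the threshold trades latency for fidelity: with a high threshold the deep policy handles nearly every state, while a low threshold approaches the full speed-up of the shallow network.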


