Sacrificing Accuracy for Reduced Computation: Cascaded Inference Based on Softmax Confidence

05/28/2018 ∙ by Konstantin Berestizshevsky, et al.

We study the tradeoff between computational effort and accuracy in a cascade of deep neural networks. During inference, early termination in the cascade is controlled by confidence levels derived directly from the softmax outputs of intermediate classifiers. The advantage of early termination is that classification is performed using less computation, thus adjusting the computational effort to the complexity of the input. Moreover, dynamic modification of confidence thresholds allows one to trade accuracy for computational effort without retraining. Basing early termination on softmax classifier outputs is justified by experimentation that demonstrates an almost linear relation between confidence levels of intermediate classifiers and accuracy. Our experimentation with architectures based on ResNet obtained the following results. (i) A speedup of 1.5 that sacrifices 1.4% accuracy with respect to the CIFAR-10 test set. (ii) A speedup of 1.19 that sacrifices 0.7% accuracy with respect to the CIFAR-100 test set. (iii) A speedup of 2.16 that sacrifices 1.4% …
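The early-termination rule described in the abstract can be sketched as follows: each intermediate classifier in the cascade produces logits, and inference stops at the first stage whose maximum softmax probability exceeds that stage's confidence threshold. This is a minimal illustrative sketch, not the authors' implementation; the function names (`softmax`, `cascaded_predict`) and the per-stage thresholds are assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logit vector.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def cascaded_predict(classifiers, x, thresholds):
    """Run the classifiers in cascade order on input x.

    Terminate at the first stage whose top softmax probability
    meets or exceeds that stage's confidence threshold; otherwise
    fall through to the final classifier's prediction.
    Returns (predicted_class, stage_index).
    """
    for stage, (clf, tau) in enumerate(zip(classifiers, thresholds)):
        probs = softmax(clf(x))
        if probs.max() >= tau:
            return int(np.argmax(probs)), stage  # early exit
    # No stage was confident enough: use the last stage's output.
    return int(np.argmax(probs)), stage
```

Raising the thresholds pushes more inputs deeper into the cascade (more computation, higher accuracy); lowering them makes more inputs exit early, which is the retraining-free accuracy/computation trade the abstract describes.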



