
Does Interference Exist When Training a Once-For-All Network?

by Jordan Shipard, et al.

The Once-For-All (OFA) method offers an excellent pathway to deploy a trained neural network model onto multiple target platforms by utilising a supernet-subnet architecture. Once trained, a subnet (both architecture and trained weights) can be derived from the supernet and deployed directly to the target platform with little to no retraining or fine-tuning. To train the subnet population, OFA uses a novel training method called Progressive Shrinking (PS), which is designed to limit the negative impact of interference during training. It is believed that higher interference during training results in lower subnet population accuracies. In this work we take a second look at this interference effect. Surprisingly, we find that interference mitigation strategies do not have a large impact on overall subnet population performance. Instead, we find the subnet architecture selection bias during training to be a more important factor. To show this, we propose a simple-yet-effective method called Random Subnet Sampling (RSS), which makes no attempt to mitigate the interference effect. Despite this, RSS produces a better-performing subnet population than PS on four small-to-medium-sized datasets, suggesting that the interference effect does not play a pivotal role on these datasets. Due to its simplicity, RSS provides a 1.9× reduction in training time compared to PS. A 6.1× reduction can also be achieved, with a reasonable drop in performance, when the number of RSS training epochs is reduced. Code available at
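The core contrast the abstract draws is between Progressive Shrinking, which activates elastic dimensions (kernel size, depth, width) in stages, and Random Subnet Sampling, which draws a subnet uniformly from the full search space at every step. The sketch below illustrates that sampling idea only; the search-space dimensions, value choices, and function names are illustrative assumptions, not taken from the paper's released code.

```python
import random

# Hypothetical OFA-style elastic search space (dimension names and value
# choices are assumptions for illustration, not the paper's exact space).
SEARCH_SPACE = {
    "kernel_size": [3, 5, 7],
    "expand_ratio": [3, 4, 6],
    "depth": [2, 3, 4],
}
NUM_UNITS = 5  # number of elastic units in the supernet (assumed)


def sample_random_subnet(rng=random):
    """Sample one subnet configuration uniformly at random.

    This is the heart of RSS: unlike Progressive Shrinking, every
    subnet in the space is eligible from the very first epoch, with
    no staged activation of elastic dimensions.
    """
    return [
        {dim: rng.choice(choices) for dim, choices in SEARCH_SPACE.items()}
        for _ in range(NUM_UNITS)
    ]


def rss_epoch(train_step, batches, rng=random):
    """One RSS training epoch over a supernet.

    `train_step(config, batch)` is a placeholder for a forward/backward
    pass restricted to the sampled subnet's weights.
    """
    for batch in batches:
        config = sample_random_subnet(rng)
        train_step(config, batch)
```

Because each step touches only one randomly chosen subnet, the per-epoch cost is close to training a single network, which is consistent with the training-time reductions the abstract reports.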
