Soft Conditional Computation

04/10/2019
by Brandon Yang, et al.

Conditional computation aims to increase the size and accuracy of a network at a small increase in inference cost. Previous hard-routing models explicitly route the input to a subset of experts. We propose soft conditional computation, which, in contrast, utilizes all experts while still permitting efficient inference through parameter routing. Concretely, for a given convolutional layer, we wish to compute a linear combination of n experts α_1 · (W_1 * x) + ... + α_n · (W_n * x), where α_1, ..., α_n are functions of the input learned through gradient descent. A straightforward evaluation requires n convolutions. We propose an equivalent form of the above computation, (α_1 W_1 + ... + α_n W_n) * x, which requires only a single convolution. We demonstrate the efficacy of our method, named CondConv, by scaling up the MobileNetV1, MobileNetV2, and ResNet-50 model architectures to achieve higher accuracy while retaining efficient inference. On the ImageNet classification dataset, CondConv improves the top-1 validation accuracy of the MobileNetV1(0.5x) model from 63.8% to 71.6%, while only increasing inference cost by 27%. On COCO object detection, CondConv improves the minival mAP of a MobileNetV1(1.0x) SSD model from 20.3 to 22.4 with just a 4% increase in inference cost.
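To make the parameter-routing idea concrete, here is a minimal sketch of a conditionally parameterized convolution in PyTorch. It is an illustration under assumptions rather than the authors' released implementation: the routing function (global average pooling followed by a sigmoid-activated linear layer) follows the scheme the paper describes, but the expert initialization, default hyperparameters, and the grouped-convolution trick for batching per-example kernels are implementation choices made here for clarity.

```python
import torch
import torch.nn.functional as F


class CondConv2d(torch.nn.Module):
    """Sketch of soft conditional computation for a conv layer.

    Instead of evaluating n convolutions alpha_i * (W_i * x), the n expert
    kernels are mixed first, (alpha_1 W_1 + ... + alpha_n W_n) * x, so each
    example needs only a single convolution.
    """

    def __init__(self, in_ch, out_ch, kernel_size=3, n_experts=4):
        super().__init__()
        # n expert kernels, each of shape (out_ch, in_ch, k, k);
        # the 0.01 scale is an assumed initialization, not from the paper
        self.experts = torch.nn.Parameter(
            0.01 * torch.randn(n_experts, out_ch, in_ch, kernel_size, kernel_size)
        )
        # Routing function: global average pool -> linear -> sigmoid,
        # producing one mixing weight per expert, per example
        self.router = torch.nn.Linear(in_ch, n_experts)
        self.padding = kernel_size // 2
        self.out_ch = out_ch

    def forward(self, x):
        b, c, h, w = x.shape
        # alpha: (b, n) input-dependent weights, learned by gradient descent
        alpha = torch.sigmoid(self.router(x.mean(dim=(2, 3))))
        # Combine experts into one kernel per example: (b, out_ch, in_ch, k, k)
        kernels = torch.einsum('bn,noikl->boikl', alpha, self.experts)
        # Batch the b per-example convolutions as one grouped convolution
        x = x.reshape(1, b * c, h, w)
        kernels = kernels.reshape(b * self.out_ch, c, *kernels.shape[-2:])
        out = F.conv2d(x, kernels, padding=self.padding, groups=b)
        return out.reshape(b, self.out_ch, *out.shape[-2:])


# Smoke test: a batch of 8 images, 16 -> 32 channels
layer = CondConv2d(in_ch=16, out_ch=32, kernel_size=3, n_experts=4)
y = layer(torch.randn(8, 16, 24, 24))  # -> (8, 32, 24, 24)
```

Folding the batch into the channel dimension and setting groups=b lets a single conv2d call apply a different combined kernel to each example, which is what makes the mixed-kernel form cost roughly one convolution instead of n.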

Related Research

- Soft Merging of Experts with Adaptive Routing (06/06/2023): Sparsely activated neural networks with conditional computation learn to...
- Deep Mixture of Experts via Shallow Embedding (06/05/2018): Larger networks generally have greater representational power at the cos...
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference (10/05/2021): Transformer-based pre-trained language models can achieve superior perfo...
- Towards More Effective and Economic Sparsely-Activated Model (10/14/2021): The sparsely-activated models have achieved great success in natural lan...
- From Sparse to Soft Mixtures of Experts (08/02/2023): Sparse mixture of expert architectures (MoEs) scale model capacity witho...
- PokeBNN: A Binary Pursuit of Lightweight Accuracy (11/30/2021): Top-1 ImageNet optimization promotes enormous networks that may be impra...
- SkipNet: Learning Dynamic Routing in Convolutional Networks (11/26/2017): Increasing depth and complexity in convolutional neural networks has ena...
