Efficient Incorporation of Multiple Latency Targets in the Once-For-All Network

12/12/2020
by Vidhur Kumar, et al.

Neural Architecture Search (NAS) has proven to be an effective method for automating architecture engineering. Recent work in the field searches for architectures under multiple objectives, such as accuracy and latency, so that they can be deployed efficiently on different target hardware. Once-for-All (OFA) is one such method: it decouples training from search and can find high-performance networks for different latency constraints. However, its search phase is inefficient at incorporating multiple latency targets. In this paper, we introduce two strategies (Top-down and Bottom-up) that use warm starting and randomized network pruning to incorporate multiple latency targets efficiently in the OFA network. We evaluate these strategies against the current OFA implementation and demonstrate that they offer significant running-time gains without sacrificing the accuracy of the subnetworks found for each latency target. We further demonstrate that these gains generalize to every design space used by the OFA network.

