GPUNet: Searching the Deployable Convolution Neural Networks for GPUs

04/26/2022
by   Linnan Wang, et al.

Customizing Convolutional Neural Networks (CNNs) for production use has been a challenging task for DL practitioners. This paper aims to expedite model customization with a model hub of models optimized through Neural Architecture Search (NAS) and tiered by their inference latency. To achieve this goal, we build a distributed NAS system that searches a novel search space built from the prominent factors that affect latency and accuracy. Since we target GPUs, we name the NAS-optimized models GPUNet, which establishes a new SOTA Pareto frontier in inference latency and accuracy. Within 1 ms, GPUNet is 2x faster than EfficientNet-X and FBNetV3 with even better accuracy. We also validate GPUNet on detection tasks, where it consistently outperforms EfficientNet-X and FBNetV3 on COCO detection in both latency and accuracy. These results indicate that our NAS system is effective and general enough to handle different design tasks. With this NAS system, we expand GPUNet to cover a wide range of latency targets so that DL practitioners can deploy our models directly in different scenarios.
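
To make the idea of a latency/accuracy Pareto frontier and latency tiers concrete, here is a minimal Python sketch. It is not the paper's NAS system: the `Candidate` type, the function names, and every number below are hypothetical and chosen purely for illustration. It only shows how a set of measured candidates could be reduced to its non-dominated members and then queried by latency budget.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    name: str          # hypothetical model identifier
    latency_ms: float  # measured inference latency (lower is better)
    accuracy: float    # validation accuracy (higher is better)


def pareto_frontier(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    # Sort by latency ascending; on ties, put the most accurate first.
    ordered = sorted(candidates, key=lambda c: (c.latency_ms, -c.accuracy))
    frontier: List[Candidate] = []
    best_accuracy = float("-inf")
    for c in ordered:
        # A slower model stays only if it beats every faster one on accuracy.
        if c.accuracy > best_accuracy:
            frontier.append(c)
            best_accuracy = c.accuracy
    return frontier


def best_under_budget(frontier: List[Candidate],
                      budget_ms: float) -> Optional[Candidate]:
    """Pick the most accurate frontier model within a latency budget."""
    eligible = [c for c in frontier if c.latency_ms <= budget_ms]
    return max(eligible, key=lambda c: c.accuracy) if eligible else None


# Made-up measurements, for illustration only (not results from the paper).
candidates = [
    Candidate("A", 0.6, 0.78),
    Candidate("B", 0.9, 0.80),
    Candidate("C", 1.0, 0.79),  # dominated by B: slower and less accurate
    Candidate("D", 1.5, 0.82),
    Candidate("E", 0.6, 0.75),  # dominated by A: same latency, less accurate
]

frontier = pareto_frontier(candidates)
print([c.name for c in frontier])
print(best_under_budget(frontier, budget_ms=1.0))
```

A hub tiered by latency can then be read as repeated calls to `best_under_budget` with different budgets, one per deployment scenario.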

