NAT: Neural Architecture Transformer for Accurate and Compact Architectures

10/31/2019
by   Yong Guo, et al.
0

Designing effective architectures is one of the key factors behind the success of deep neural networks. Existing deep architectures are either manually designed or automatically searched by some Neural Architecture Search (NAS) methods. However, even a well-searched architecture may still contain many non-significant or redundant modules or operations (e.g., convolution or pooling), which may not only incur substantial memory consumption and computation cost but also deteriorate the performance. Thus, it is necessary to optimize the operations inside an architecture to improve the performance without introducing extra computation cost. Unfortunately, such a constrained optimization problem is NP-hard. To make the problem feasible, we cast the optimization problem into a Markov decision process (MDP) and seek to learn a Neural Architecture Transformer (NAT) to replace the redundant operations with the more computationally efficient ones (e.g., skip connection or directly removing the connection). Based on MDP, we learn NAT by exploiting reinforcement learning to obtain the optimization policies w.r.t. different architectures. To verify the effectiveness of the proposed strategies, we apply NAT on both hand-crafted architectures and NAS based architectures. Extensive experiments on two benchmark datasets, i.e., CIFAR-10 and ImageNet, demonstrate that the transformed architecture by NAT significantly outperforms both its original form and those architectures optimized by existing methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/20/2021

Towards Accurate and Compact Architectures via Neural Architecture Transformer

Designing effective architectures is one of the key factors behind the s...
research
06/15/2020

Differentiable Neural Architecture Transformation for Reproducible Architecture Improvement

Recently, Neural Architecture Search (NAS) methods are introduced and sh...
research
11/16/2019

Optimized CNN for PolSAR Image Classification via Differentiable Neural Architecture Search

Convolutional neural networks (CNNs) realize the automation of feature e...
research
11/16/2019

Automatic Design of CNNs via Differentiable Neural Architecture Search for PolSAR Image Classification

Convolutional neural networks (CNNs) have shown good performance in pola...
research
07/03/2022

Architecture Augmentation for Performance Predictor Based on Graph Isomorphism

Neural Architecture Search (NAS) can automatically design architectures ...
research
08/18/2021

Analyze and Design Network Architectures by Recursion Formulas

The effectiveness of shortcut/skip-connection has been widely verified, ...
research
05/22/2017

On-the-fly Operation Batching in Dynamic Computation Graphs

Dynamic neural network toolkits such as PyTorch, DyNet, and Chainer offe...

Please sign up or login with your details

Forgot password? Click here to reset