
Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks

by Lemeng Wu, et al.

We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks' parameters and architectures. Our method works in a steepest descent fashion, which iteratively finds the best network within a functional neighborhood of the original network that includes a diverse set of candidate network structures. By using Taylor approximation, the optimal network structure in the neighborhood can be found with a greedy selection procedure. We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures that avoid catastrophic forgetting in continual learning. Empirically, firefly descent achieves promising results on both neural architecture search and continual learning. In particular, on a challenging continual image classification task, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
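To make the growing procedure concrete, here is a minimal sketch of the core idea described above: candidate new neurons are attached with zero output weight, their effect on the loss is scored with a first-order Taylor approximation, and the highest-scoring candidates are greedily kept. This is an illustrative NumPy toy (the data, network shapes, and helper names `predict`, `score_candidates`, and `grow` are assumptions for this example), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (illustrative only)
X = rng.normal(size=(200, 4))
y = np.sin(X @ rng.normal(size=4))

# Existing one-hidden-layer ReLU network with 3 hidden units
W = rng.normal(size=(4, 3)) * 0.5   # input -> hidden weights
v = rng.normal(size=3) * 0.5        # hidden -> output weights

def predict(X, W, v):
    return np.maximum(X @ W, 0.0) @ v

def score_candidates(X, y, W, v, n_candidates=50):
    """First-order (Taylor) scores for candidate new neurons.

    Each candidate is a random hidden unit added with output weight 0.
    For a small output weight eps, the loss change is approximately
    eps * dL/deps, so |dL/deps| at eps = 0 ranks the candidates.
    """
    resid = predict(X, W, v) - y             # dL/dy for 0.5 * MSE loss
    Wc = rng.normal(size=(4, n_candidates))  # candidate input weights
    H = np.maximum(X @ Wc, 0.0)              # candidate activations
    grads = H.T @ resid / len(X)             # dL/deps per candidate
    return Wc, np.abs(grads)

def grow(X, y, W, v, k=2):
    """Greedily add the k candidates with the largest Taylor scores."""
    Wc, scores = score_candidates(X, y, W, v)
    best = np.argsort(scores)[-k:]
    W_new = np.hstack([W, Wc[:, best]])
    v_new = np.concatenate([v, np.zeros(k)])  # new units start inactive
    return W_new, v_new

W2, v2 = grow(X, y, W, v, k=2)
print(W2.shape, v2.shape)  # network grew from 3 to 5 hidden units
```

Because the new output weights start at zero, growing never changes the current function; subsequent gradient training then activates whichever added units help most, which is what lets the method grow both wider and (with analogous candidates) deeper.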


Related research:

Efficient Architecture Search for Continual Learning

Continual learning with neural networks is an important learning framewo...

Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting

Addressing catastrophic forgetting is one of the key challenges in conti...

Learn to Bind and Grow Neural Structures

Task-incremental learning involves the challenging problem of learning n...

Neural Architecture Search of Deep Priors: Towards Continual Learning without Catastrophic Interference

In this paper we analyze the classification performance of neural networ...

Growing Representation Learning

Machine learning continues to grow in popularity due to its ability to l...

Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting

We propose signed splitting steepest descent (S3D), which progressively ...