Local Learning with Neuron Groups

01/18/2023
by Adeetya Patel, et al.

Traditional deep network training methods optimize a monolithic objective function jointly over all components, which limits how much of the training computation can be parallelized. Local learning is an approach to model parallelism that departs from the standard end-to-end learning setup and uses local objective functions to permit parallel learning among the components of a deep network. Recent works have demonstrated that variants of local learning can train modern deep networks efficiently. However, the amount of computation these approaches can distribute is typically limited by the number of layers in the network. In this work we study how local learning can be applied at the level of splitting layers or modules into sub-components, adding a notion of width-wise modularity to the depth-wise modularity already associated with local learning. We investigate local-learning penalties that permit such models to be trained efficiently. Our experiments on the CIFAR-10, CIFAR-100, and Imagenet32 datasets demonstrate that introducing width-level modularity can yield computational advantages over existing local-learning methods and opens new opportunities for improved model-parallel distributed training. Code is available at: https://github.com/adeetyapatel12/GN-DGL.
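To make the idea concrete, the sketch below shows one way to combine depth-wise local learning with width-wise neuron groups in PyTorch. It is a minimal illustration, not the authors' GN-DGL implementation (that is in the repository above): each block splits its output channels across independent groups, every group is trained by its own auxiliary classifier and local loss, and detaching the block output keeps gradients from flowing end-to-end. The class and attribute names (GroupedLocalBlock, aux_heads, etc.) are assumptions made for this example, not names from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedLocalBlock(nn.Module):
    # One depth-wise block whose output channels are split into `num_groups`
    # neuron groups; each group has its own conv path and auxiliary classifier.
    def __init__(self, in_ch, out_ch, num_groups, num_classes):
        super().__init__()
        assert out_ch % num_groups == 0
        g_out = out_ch // num_groups
        self.convs = nn.ModuleList(
            [nn.Conv2d(in_ch, g_out, 3, padding=1) for _ in range(num_groups)])
        self.aux_heads = nn.ModuleList(
            [nn.Linear(g_out, num_classes) for _ in range(num_groups)])

    def forward(self, x, targets):
        outs, local_loss = [], 0.0
        for conv, head in zip(self.convs, self.aux_heads):
            h = F.relu(conv(x))
            logits = head(h.mean(dim=(2, 3)))   # global average pool -> group's aux head
            local_loss = local_loss + F.cross_entropy(logits, targets)
            outs.append(h)
        # Concatenate the group outputs and detach, so no gradient flows back
        # to earlier blocks: each block (and each group within it) learns locally.
        return torch.cat(outs, dim=1).detach(), local_loss

# Toy training step: two blocks updated purely from their own local objectives.
blocks = nn.ModuleList([
    GroupedLocalBlock(3, 16, num_groups=2, num_classes=10),
    GroupedLocalBlock(16, 32, num_groups=4, num_classes=10),
])
opt = torch.optim.SGD(blocks.parameters(), lr=0.1)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
h = x
for block in blocks:
    h, loss = block(h, y)
    opt.zero_grad()
    loss.backward()   # gradients stay within this block thanks to the detach()
    opt.step()

Because only detached activations are passed between blocks, the per-block (and, with a finer split, per-group) updates could in principle run concurrently on different devices, which is the kind of model-parallel distributed training the abstract refers to.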
