Truly Sparse Neural Networks at Scale

02/02/2021
by Selima Curci, et al.

Recently, sparse training methods have started to be established as a de facto approach for training and inference efficiency in artificial neural networks. Yet, this efficiency holds only in theory: in practice, sparsity is typically simulated with a binary mask over dense weights, because mainstream deep learning software and hardware are optimized for dense matrix operations. In this paper, we take an orthogonal approach and show that we can train truly sparse neural networks to harvest their full potential. To achieve this goal, we introduce three novel contributions, specifically designed for sparse neural networks: (1) a parallel training algorithm and its corresponding truly sparse implementation built from scratch, (2) an activation function with non-trainable parameters that favours gradient flow, and (3) an importance metric for hidden neurons that eliminates redundancies. Altogether, we are able to train the largest neural network ever trained in terms of representational power, reaching the size of a bat brain. The results show that our approach achieves state-of-the-art performance while opening the path towards an environmentally friendly artificial intelligence era.
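To make the distinction between simulated and truly sparse training concrete, here is a minimal, illustrative sketch (not the authors' implementation) contrasting a mask-based layer with one stored in a compressed sparse format via SciPy. The layer sizes, the density value, and the function names `simulated_sparse_forward` and `truly_sparse_forward` are assumptions made for illustration only.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

n_in, n_out, density = 1000, 500, 0.01  # assumed layer sizes and sparsity level

# --- Mask-based ("simulated") sparsity: a dense weight matrix multiplied by a
# binary mask. Storage and compute remain O(n_in * n_out) even though most
# entries are zero.
dense_weights = rng.standard_normal((n_in, n_out))
mask = rng.random((n_in, n_out)) < density

def simulated_sparse_forward(x):
    # Dense matmul over a mostly-zero matrix: no memory or FLOP savings.
    return x @ (dense_weights * mask)

# --- Truly sparse layer: only the nonzero weights are stored (CSR format),
# and the multiplication touches only those stored entries.
sparse_weights = sparse.random(n_out, n_in, density=density,
                               format="csr", random_state=0)

def truly_sparse_forward(x):
    # Sparse-times-dense matmul: cost scales with the number of nonzeros.
    return (sparse_weights @ x.T).T

x = rng.standard_normal((32, n_in))          # a batch of 32 inputs
out_masked = simulated_sparse_forward(x)     # shape (32, n_out)
out_sparse = truly_sparse_forward(x)         # shape (32, n_out)
print(out_masked.shape, out_sparse.shape)
```

At 1 percent density, the CSR layer stores roughly 5,000 weights instead of 500,000, which is the kind of saving a mask-based simulation cannot deliver.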


Related research

12/27/2021 - Two Sparsities Are Better Than One: Unlocking the Performance Benefits of Sparse-Sparse Networks
In principle, sparse neural networks should be significantly more effici...

09/22/2021 - Naming Schema for a Human Brain-Scale Neural Network
Deep neural networks have become increasingly large and sparse, allowing...

11/25/2019 - Rigging the Lottery: Making All Tickets Winners
Sparse neural networks have been shown to be more parameter and compute ...

03/02/2021 - Sparse Training Theory for Scalable and Efficient Agents
A fundamental task for artificial intelligence is learning. Deep Neural ...

03/05/2021 - Artificial Neural Networks generated by Low Discrepancy Sequences
Artificial neural networks can be represented by paths. Generated as ran...

05/10/2021 - A Bregman Learning Framework for Sparse Neural Networks
We propose a learning framework based on stochastic Bregman iterations t...

02/09/2023 - SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks
We provide a new efficient version of the backpropagation algorithm, spe...
