Large Scale Artificial Neural Network Training Using Multi-GPUs

11/13/2015
by Linnan Wang et al.

This paper describes a method for accelerating large-scale Artificial Neural Network (ANN) training on multiple GPUs by reducing the forward and backward passes to matrix multiplication. We propose an out-of-core multi-GPU matrix multiplication algorithm and integrate it with ANN training. Experiments demonstrate that our matrix multiplication algorithm achieves linear speedup across multiple inhomogeneous GPUs. The full paper for this project can be found at [1].
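As a rough illustration of the two ideas in the abstract, and not the paper's actual implementation, the sketch below expresses a fully-connected layer's forward and backward passes as matrix multiplications and shows a block-partitioned product of the kind an out-of-core multi-GPU scheme could distribute. It runs on the CPU with NumPy, and the function names (forward_layer, backward_layer, blocked_matmul) are hypothetical.

```python
import numpy as np

def forward_layer(X, W):
    # Forward pass of a fully-connected layer over a mini-batch:
    # one GEMM followed by an elementwise nonlinearity.
    Z = X @ W                      # (batch, out) pre-activation
    A = np.maximum(Z, 0.0)         # ReLU activation (example choice)
    return Z, A

def backward_layer(X, W, Z, dA):
    # Backward pass is again dominated by GEMMs.
    dZ = dA * (Z > 0)              # gradient through the ReLU
    dW = X.T @ dZ                  # gradient w.r.t. the weights
    dX = dZ @ W.T                  # gradient w.r.t. the layer input
    return dW, dX

def blocked_matmul(A, B, block=1024):
    # Out-of-core style block decomposition of C = A @ B.
    # Each tile product is an independent GEMM that a scheduler could
    # dispatch to whichever GPU is free; here the tiles run serially
    # on the CPU purely to show the decomposition.
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n), dtype=np.result_type(A, B))
    for i in range(0, m, block):
        for j in range(0, n, block):
            for p in range(0, k, block):
                C[i:i+block, j:j+block] += (
                    A[i:i+block, p:p+block] @ B[p:p+block, j:j+block]
                )
    return C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((512, 300))     # mini-batch of inputs
    W = rng.standard_normal((300, 200))     # layer weights
    Z, A = forward_layer(X, W)
    dW, dX = backward_layer(X, W, Z, np.ones_like(A))
    C = blocked_matmul(X, W, block=128)     # tile-by-tile product
    assert np.allclose(C, X @ W)            # matches the monolithic GEMM
```

Because the output tiles are independent of one another, a scheduler can hand different tiles to GPUs of different speeds, which is consistent with the linear speedup on inhomogeneous GPUs reported in the abstract.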

