Large Scale Artificial Neural Network Training Using Multi-GPUs

11/13/2015
by Linnan Wang, et al.

This paper describes a method for accelerating large-scale Artificial Neural Network (ANN) training on multiple GPUs by reducing the forward and backward passes to matrix multiplication. We propose an out-of-core multi-GPU matrix multiplication algorithm and integrate it with ANN training. Experiments demonstrate that our matrix multiplication algorithm achieves linear speedup across multiple inhomogeneous GPUs. The full paper for this project can be found at [1].
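To make the reduction concrete, here is a minimal sketch (not the authors' code) of the two ideas in the abstract: the forward and backward passes of a one-hidden-layer ANN expressed as matrix multiplications, and a large product tiled into row blocks so that each block can be processed on one device, in the spirit of an out-of-core multi-GPU GEMM. All dimensions, the helper name tiled_matmul, and the round-robin tile-to-GPU assignment are illustrative assumptions; NumPy slices stand in for per-GPU buffers.

    import numpy as np

    rng = np.random.default_rng(0)

    # (1) One hidden layer: every heavy step in training is a GEMM.
    batch, in_dim, hidden, out_dim = 64, 784, 256, 10
    X = rng.standard_normal((batch, in_dim))
    Y = rng.standard_normal((batch, out_dim))
    W1 = 0.01 * rng.standard_normal((in_dim, hidden))
    W2 = 0.01 * rng.standard_normal((hidden, out_dim))

    H = np.maximum(X @ W1, 0.0)      # forward pass: GEMM + ReLU
    P = H @ W2                       # forward pass: GEMM
    dP = (P - Y) / batch             # gradient of a squared-error loss
    dW2 = H.T @ dP                   # backward pass: GEMM
    dH = (dP @ W2.T) * (H > 0.0)     # backward pass: GEMM + ReLU mask
    dW1 = X.T @ dH                   # backward pass: GEMM

    # (2) Tile a large product into row blocks so each block can live
    # on one device; plain slices stand in for per-GPU buffers here.
    def tiled_matmul(A, B, n_devices=2, block_rows=32):
        C = np.empty((A.shape[0], B.shape[1]))
        for i, start in enumerate(range(0, A.shape[0], block_rows)):
            gpu = i % n_devices  # round-robin tile placement (illustrative)
            stop = min(start + block_rows, A.shape[0])
            C[start:stop] = A[start:stop] @ B  # one on-device GEMM per tile
        return C

    assert np.allclose(tiled_matmul(X, W1), X @ W1)

In an actual multi-GPU system, each row-block product would be a GEMM launched on its own GPU, with tiles streamed between host and device memory; that streaming is what "out-of-core" refers to. The sketch above only mimics the partitioning, and a real scheduler for inhomogeneous GPUs would presumably balance tile counts by device speed rather than round-robin.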

Related research

03/24/2021 · Accelerating Sparse Approximate Matrix Multiplication on GPUs
Although the matrix multiplication plays a vital role in computational l...

01/18/2017 · On the Performance of Network Parallel Training in Artificial Neural Networks
Artificial Neural Networks (ANNs) have received increasing attention in ...

03/10/2018 · Towards a Multi-array Architecture for Accelerating Large-scale Matrix Multiplication on FPGAs
Large-scale floating-point matrix multiplication is a fundamental kernel...

12/14/2021 · TCUDB: Accelerating Database with Tensor Processors
The emergence of novel hardware accelerators has powered the tremendous ...

11/24/2018 · Accelerating Reduction and Scan Using Tensor Core Units
Driven by deep learning, there has been a surge of specialized processor...

10/01/2020 · BCNN: A Binary CNN with All Matrix Ops Quantized to 1 Bit Precision
This paper describes a CNN where all CNN style 2D convolution operations...

03/13/2019 · GNA: new framework for statistical data analysis
We report on the status of GNA, a new framework for fitting large-scale...