PARTIME: Scalable and Parallel Processing Over Time with Deep Neural Networks

10/17/2022
by Enrico Meloni, et al.

In this paper, we present PARTIME, a software library written in Python and based on PyTorch, designed specifically to speed up neural networks whenever data is continuously streamed over time, for both learning and inference. Existing libraries are designed to exploit data-level parallelism, assuming that samples are batched, a condition that is not naturally met in applications based on streamed data. In contrast, PARTIME starts processing each data sample at the time it becomes available from the stream. PARTIME wraps the code implementing a feed-forward multi-layer network and distributes the layer-wise processing among multiple devices, such as Graphics Processing Units (GPUs). Thanks to its pipeline-based computational scheme, PARTIME allows the devices to perform computations in parallel, which at inference time yields scaling that is theoretically linear in the number of devices. During learning, PARTIME can exploit the non-i.i.d. nature of the streamed data, whose samples evolve smoothly over time, for efficient gradient computations. Experiments empirically compare PARTIME with classic non-parallel neural computations in online learning, distributing operations over up to 8 NVIDIA GPUs and showing significant speedups that are almost linear in the number of devices, while mitigating the impact of the data transfer overhead.
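To make the pipeline-over-time idea concrete, the following is a minimal sketch in plain PyTorch; the class StreamPipeline and all of its details are illustrative assumptions for exposition, not the actual PARTIME API, and only the inference-time forward pass is sketched (the learning-stage gradient scheme is the paper's own contribution). Each stage is placed on its own GPU and, at every stream step, consumes the activation its predecessor produced at the previous step, so all devices compute concurrently on different samples of the stream.

# Illustrative sketch only: StreamPipeline and everything below are
# assumptions for exposition, NOT the actual PARTIME API.
import torch
import torch.nn as nn


class StreamPipeline:
    """Layer-wise pipelining over a data stream: stage i lives on GPU i and,
    at step t, processes the activation that stage i-1 produced at step t-1,
    so all devices compute concurrently on different stream samples."""

    def __init__(self, stages):
        assert torch.cuda.device_count() >= len(stages), "one GPU per stage"
        self.devices = [torch.device(f"cuda:{i}") for i in range(len(stages))]
        self.stages = [s.to(d) for s, d in zip(stages, self.devices)]
        self.buffers = [None] * len(stages)  # input waiting at each stage

    @torch.no_grad()
    def step(self, x):
        """Feed one freshly streamed sample; return the output of the last
        stage (delayed by len(stages) steps), or None while filling up."""
        out = None
        # Run stages back-to-front so each consumes its predecessor's
        # previous-step output before it is overwritten.
        for i in reversed(range(len(self.stages))):
            if self.buffers[i] is None:
                continue
            y = self.stages[i](self.buffers[i])
            if i + 1 < len(self.stages):
                # Device-to-device copy; kernels on distinct GPUs overlap,
                # which is the source of the near-linear speedup.
                self.buffers[i + 1] = y.to(self.devices[i + 1], non_blocking=True)
            else:
                out = y
        self.buffers[0] = x.to(self.devices[0], non_blocking=True)
        return out


# Usage: two stages on two GPUs; outputs lag the stream by two steps.
stages = [nn.Sequential(nn.Linear(64, 128), nn.ReLU()),
          nn.Sequential(nn.Linear(128, 10))]
pipe = StreamPipeline(stages)
for t in range(10):
    y = pipe.step(torch.randn(1, 64))  # None for the first two steps

The back-to-front update order lets each buffer be consumed before it is overwritten; the output inevitably lags the input by one step per stage, and the smooth temporal evolution of streamed samples mentioned in the abstract is what keeps such delayed computations meaningful during learning.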
