Array Languages Make Neural Networks Fast

by Artjoms Sinkarovs, et al.

Modern machine learning frameworks are complex: they are typically organised in multiple layers, each of which is written in a different language, and they depend on a number of external libraries. At their core, however, they mainly consist of tensor operations. As array-oriented languages provide perfect abstractions to implement tensor operations, we consider a minimalistic machine learning framework that is shallowly embedded in an array-oriented language, and we study its productivity and performance. We do this by implementing a state-of-the-art Convolutional Neural Network (CNN) and comparing it against implementations in TensorFlow and PyTorch, two state-of-the-art industrial-strength frameworks. It turns out that our implementation is 2 and 3 times faster, even after fine-tuning TensorFlow and PyTorch to our hardware, a 64-core GPU-accelerated machine. The size of all three CNN specifications is the same, about 150 lines of code. Our mini framework is 150 lines of highly reusable, hardware-agnostic code that does not depend on external libraries. The compiler of the host array language automatically generates parallel code for a chosen architecture. The key to this balance between performance and portability lies in the design of the array language: in particular, the ability to express rank-polymorphic operations concisely while still being able to optimise across them. This design builds on very few assumptions, and it is readily transferable to other contexts, offering a clean approach to high-performance machine learning.
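To illustrate what "rank-polymorphic" means here, the following is a minimal sketch in plain Python, not the authors' code and not written in the array language the abstract refers to. It assumes arrays are represented as nested lists and defines a "valid" convolution (strictly, a sliding-window weighted sum without kernel flipping, as is usual in CNNs) that works unchanged for inputs of any rank. The helper names `shape`, `at`, `build`, and `conv` are hypothetical and chosen only for this example.

```python
from itertools import product


def shape(a):
    """Shape of a rectangular nested list; a bare number has shape ()."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        a = a[0]
    return tuple(dims)


def at(a, index):
    """Select the element of nested list `a` at the index vector `index`."""
    for i in index:
        a = a[i]
    return a


def build(shp, f, prefix=()):
    """Build a nested list of shape `shp` whose element at index iv is f(iv)."""
    if len(prefix) == len(shp):
        return f(prefix)
    return [build(shp, f, prefix + (i,)) for i in range(shp[len(prefix)])]


def conv(inp, ker):
    """Sliding-window weighted sum for any rank (inp and ker must share a rank)."""
    s_inp, s_ker = shape(inp), shape(ker)
    # One output element per position where the kernel fits entirely inside inp.
    out_shape = tuple(n - k + 1 for n, k in zip(s_inp, s_ker))

    def cell(iv):
        # Sum over every index vector jv of the kernel, whatever its rank.
        return sum(
            at(inp, tuple(i + j for i, j in zip(iv, jv))) * at(ker, jv)
            for jv in product(*(range(k) for k in s_ker))
        )

    return build(out_shape, cell)


# The same definition serves rank 1 and rank 2 without modification.
print(conv([1, 2, 3, 4], [1, 1]))                                 # [3, 5, 7]
print(conv([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]]))  # [[6, 8], [12, 14]]
```

Nothing in `conv` mentions the number of dimensions: the rank is carried entirely by the length of the index vectors. This sketch is interpreted and unoptimised; the abstract's point is that an array-language compiler can take a specification in this style and generate parallel code from it.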




