SOL: Reducing the Maintenance Overhead for Integrating Hardware Support into AI Frameworks

05/19/2022
by Nicolas Weber, et al.

The increasing interest in Artificial Intelligence (AI) has raised the need for highly optimized and sophisticated AI frameworks. Starting with the Lua-based Torch, many frameworks have emerged over time, such as Theano, Caffe, Chainer, CNTK, MXNet, PyTorch, DL4J, or TensorFlow. All of these provide a high-level scripting API that allows users to easily design neural networks and run them on various kinds of hardware. What the user usually does not see is the high effort put into these frameworks to deliver peak execution performance. While mainstream CPUs and GPUs have the "luxury" of a widespread user base in the open source community, vendors of less mainstream CPUs, GPUs or accelerators need to invest considerable effort to get their hardware supported by these frameworks. This includes not only the development of highly efficient compute libraries such as CUDNN, OneDNN or VEDNN, but also support for an ever growing number of simpler compute operations such as summation and multiplication. Each of these frameworks nowadays supports several hundred unique operations, with tensors of various sizes, shapes and data types, which results in thousands of compute kernels required for each device type, and the number of operations keeps increasing. That is why NEC Laboratories Europe started developing the SOL AI Optimization project years ago, to deliver optimal performance to users while keeping the maintenance burden minimal.
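To illustrate why the kernel count explodes for hardware vendors, the following minimal sketch (not taken from SOL or any specific framework; all names are hypothetical) shows a generic dispatch table keyed by operation, device, and data type. Every new operation, device, or dtype multiplies the number of kernel implementations that must be written and maintained.

```python
# Hypothetical illustration of per-(op, device, dtype) kernel registration.
import numpy as np

KERNELS = {}  # maps (op_name, device, dtype) -> kernel implementation

def register_kernel(op_name, device, dtype):
    """Decorator registering a kernel for one (op, device, dtype) combination."""
    def wrapper(fn):
        KERNELS[(op_name, device, dtype)] = fn
        return fn
    return wrapper

@register_kernel("add", "cpu", np.float32)
def add_cpu_f32(a, b):
    return a + b  # stand-in for a hand-tuned CPU kernel

@register_kernel("add", "cpu", np.float64)
def add_cpu_f64(a, b):
    return a + b  # the same op must be re-implemented for every dtype and device

def dispatch(op_name, device, a, b):
    """Look up the kernel for the requested combination; fail if unsupported."""
    kernel = KERNELS.get((op_name, device, a.dtype.type))
    if kernel is None:
        raise NotImplementedError(f"no {op_name} kernel for {device}/{a.dtype}")
    return kernel(a, b)

# With hundreds of ops, several devices, and a handful of dtypes, this table
# grows into the thousands of kernels mentioned above.
x = np.ones(4, dtype=np.float32)
print(dispatch("add", "cpu", x, x))
```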

