Auto-Vectorizing TensorFlow Graphs: Jacobians, Auto-Batching And Beyond

03/08/2019
by Ashish Agarwal et al.

We propose a static loop vectorization optimization on top of the high-level dataflow IR used by frameworks like TensorFlow. A new statically vectorized parallel-for abstraction is provided on top of TensorFlow and used for applications ranging from auto-batching and per-example gradients to Jacobian computation, optimized map functions, and input-pipeline optimization. We report significant speedups compared both to loop-based implementations and to the run-time batching adopted by the DyNet framework.
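
In current TensorFlow releases, this statically vectorized parallel-for is exposed as tf.vectorized_map (and it backs tf.GradientTape.jacobian). The sketch below shows the per-example-gradients use case under that assumption; the model, shapes, and loss are illustrative placeholders, not taken from the paper.

import tensorflow as tf

# Illustrative toy setup; the model, shapes, and loss are placeholders.
model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
inputs = tf.random.normal([128, 20])                      # batch of 128 examples
labels = tf.random.uniform([128], maxval=10, dtype=tf.int32)

def per_example_gradient(arg):
    # Loop body written for a single (un-batched) example.
    x, y = arg
    with tf.GradientTape() as tape:
        logits = model(tf.expand_dims(x, 0))              # re-add a batch dim of 1
        loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=tf.expand_dims(y, 0), logits=logits)
    return tape.gradient(loss, model.trainable_variables)

# tf.vectorized_map statically vectorizes the loop body across the batch,
# producing one fused graph instead of 128 sequential tape passes.
per_example_grads = tf.vectorized_map(per_example_gradient, (inputs, labels))

The same machinery accelerates Jacobian computation: tf.GradientTape.jacobian vectorizes over the output elements with parallel-for by default (controlled by its experimental_use_pfor argument), and tf.vectorized_map itself falls back to a sequential tf.while_loop for ops it cannot vectorize (fallback_to_while_loop=True).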


Related research

08/29/2019 · TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
This work introduces TapirXLA, a replacement for TensorFlow's XLA compil...

02/27/2019 · TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning
TensorFlow Eager is a multi-stage, Python-embedded domain-specific langu...

10/18/2018 · Private Machine Learning in TensorFlow using Secure Computation
We present a framework for experimenting with secure multi-party computa...

09/08/2017 · TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow
We introduce TensorFlow Agents, an efficient infrastructure paradigm for...

10/06/2018 · Characterizing Deep-Learning I/O Workloads in TensorFlow
The performance of Deep-Learning (DL) computing frameworks relies on the p...

11/05/2017 · HPX Smart Executors
The performance of many parallel applications depends on loop-level para...

05/22/2018 · RPC Considered Harmful: Fast Distributed Deep Learning on RDMA
Deep learning emerges as an important new resource-intensive workload an...
