On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML

01/02/2018
by   Matthias Boehm, et al.

Many large-scale machine learning (ML) systems allow specifying custom ML algorithms by means of linear algebra programs, and then automatically generate efficient execution plans. In this context, optimization opportunities for fused operators, in terms of fused chains of basic operators, are ubiquitous. These opportunities include (1) fewer materialized intermediates, (2) fewer scans of input data, and (3) the exploitation of sparsity across chains of operators. Automatic operator fusion eliminates the need for hand-written fused operators and significantly improves performance for complex or previously unseen chains of operations. However, existing fusion heuristics struggle to find good fusion plans for complex DAGs or hybrid plans of local and distributed operations. In this paper, we introduce an optimization framework for systematically reasoning about fusion plans that considers materialization points in DAGs, sparsity exploitation, different fusion template types, as well as local and distributed operations. In detail, we contribute algorithms for (1) candidate exploration of valid fusion plans, (2) cost-based candidate selection, and (3) code generation of local and distributed operations over dense, sparse, and compressed data. Our experiments in SystemML show end-to-end performance improvements with optimized fusion plans of up to 21x compared to hand-written fused operators, with negligible optimization and code generation overhead.
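
The three opportunities listed in the abstract can be illustrated with a small sketch. The example below is not SystemML's generated code; it is a hypothetical, pure-Python illustration of the expression sum(X * Y * Z) over flat vectors, and the function names and the dictionary-based sparse representation are assumptions made only for this illustration.

```python
def unfused_sum_product(x, y, z):
    """Evaluate sum(x * y * z) one basic operator at a time."""
    # Each basic operator materializes its result and rescans its inputs.
    tmp1 = [a * b for a, b in zip(x, y)]      # intermediate 1
    tmp2 = [t * c for t, c in zip(tmp1, z)]   # intermediate 2
    return sum(tmp2)                          # final aggregation pass


def fused_sum_product(x, y, z):
    """Evaluate sum(x * y * z) in one pass without intermediates."""
    acc = 0.0
    for a, b, c in zip(x, y, z):
        acc += a * b * c
    return acc


def fused_sparse_sum_product(x_nonzeros, y, z):
    """Fused variant that exploits sparsity of x.

    x_nonzeros maps index -> non-zero value (an assumed toy format).
    Since 0 * y[i] * z[i] == 0, cells where x is zero are skipped entirely.
    """
    acc = 0.0
    for i, a in x_nonzeros.items():
        acc += a * y[i] * z[i]
    return acc


if __name__ == "__main__":
    x = [1.0, 0.0, 2.0, 0.0]
    y = [3.0, 4.0, 5.0, 6.0]
    z = [2.0, 2.0, 2.0, 2.0]
    x_sparse = {i: v for i, v in enumerate(x) if v != 0.0}

    print(unfused_sum_product(x, y, z))
    print(fused_sum_product(x, y, z))
    print(fused_sparse_sum_product(x_sparse, y, z))
```

All three functions compute the same value; they differ only in how many intermediates are allocated, how often the inputs are scanned, and how many cells are touched. Deciding which chains of operators in a larger DAG to fuse this way is the plan-selection problem that the paper addresses.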

