
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
MXNet is a multi-language machine learning (ML) library to ease the deve...

On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML
Many large-scale machine learning (ML) systems allow specifying custom M...

A Tensor Compiler for Unified Machine Learning Prediction Serving
Machine Learning (ML) adoption in the enterprise requires simpler and mo...

GPTPU: Accelerating Applications using Edge Tensor Processing Units
Neural network (NN) accelerators have been integrated into a wide spectr...

MoA Interpretation of the Iterative Conjugate Gradient Method with Psi Reduction: A Tutorial to Teach the Mathematically Literate in Linear and Tensor Algebra, Part I
It is often difficult to learn new mathematics semantically and syntacti...

SPORES: Sum-Product Optimization via Relational Equality Saturation for Large-Scale Linear Algebra
Machine learning algorithms are commonly specified in linear algebra (LA...

High-Performance Distributed ML at Scale through Parameter Server Consistency Models
As Machine Learning (ML) applications increase in data size and model co...

Tensor Relational Algebra for Machine Learning System Design
Machine learning (ML) systems must support a wide variety of tensor operations. Yet such systems were largely developed without asking: what are the foundational abstractions necessary for building machine learning systems? We believe that proper computational and implementation abstractions will allow for the construction of self-configuring, declarative ML systems, especially when the goal is to execute tensor operations in a distributed environment or partitioned across multiple AI accelerators (ASICs). To this end, we first introduce a tensor relational algebra (TRA), which is expressive enough to encode any tensor operation that can be written in Einstein notation. We then consider how TRA expressions can be rewritten into an implementation algebra (IA) that enables effective implementation in a distributed environment, and how expressions in the IA can be optimized. Our empirical study shows that the optimized implementations produced by the IA can match or even outperform carefully engineered HPC and ML systems for large-scale tensor manipulation and ML workflows on distributed clusters.
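To give a flavor of the relational view of tensor computation described above, here is a minimal sketch (not the paper's TRA API; all names are illustrative) in which a blocked matrix is represented as a relation of (key, block) pairs, and matrix multiplication becomes a relational join on the shared contraction index followed by a sum aggregation over per-block products:

```python
import numpy as np
from collections import defaultdict

def to_blocks(M, bs):
    """Represent matrix M as a relation {(bi, bj): block} of bs-by-bs tiles."""
    n, m = M.shape
    return {(i // bs, j // bs): M[i:i+bs, j:j+bs]
            for i in range(0, n, bs) for j in range(0, m, bs)}

def block_matmul(A_rel, B_rel):
    """Join A blocks keyed (i, k) with B blocks keyed (k, j) on the shared
    index k, then aggregate (sum) the block products grouped by (i, j)."""
    out = defaultdict(lambda: 0)
    for (i, k), a in A_rel.items():
        for (k2, j), b in B_rel.items():
            if k == k2:                          # relational join condition
                out[(i, j)] = out[(i, j)] + a @ b  # per-block multiply + sum
    return dict(out)

A = np.arange(16.0).reshape(4, 4)
B = np.arange(16.0).reshape(4, 4)
C_rel = block_matmul(to_blocks(A, 2), to_blocks(B, 2))

# Reassemble the result relation into a dense matrix and check it.
C = np.block([[C_rel[(0, 0)], C_rel[(0, 1)]],
              [C_rel[(1, 0)], C_rel[(1, 1)]]])
assert np.allclose(C, A @ B)
```

The appeal of this framing is that the join and aggregation are ordinary relational operators, so standard distributed query-processing machinery (partitioning, shuffle, operator reordering) can be reused to plan and optimize tensor computations.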