Programming Parallel Dense Matrix Factorizations with Look-Ahead and OpenMP

04/19/2018
by   Sandra Catalán, et al.

We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multithreaded version of BLAS. This approach also differs from the more sophisticated runtime-assisted implementations, which decompose the operation into tasks and identify dependencies via directives and runtime support. Instead, our strategy attains high performance by explicitly embedding a static look-ahead technique into the DMF code, in order to overcome the performance bottleneck of the panel factorization, and by realizing the trailing update via a cache-aware multithreaded implementation of the BLAS. Although the parallel algorithms are specified at a high level of abstraction, the actual implementation can be easily derived from them, paving the road to a high-performance implementation of a considerable fraction of LAPACK functionality on any multicore platform with an OpenMP-like runtime.
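The idea of a static look-ahead can be illustrated with a minimal C sketch. This is not the authors' code, only one possible reading of the technique under simplifying assumptions: it uses an LU factorization without pivoting (so the input is assumed to be, for example, diagonally dominant), it replaces the BLAS trailing update with plain loops, and the names `lu_lookahead`, `panel_factor`, and `update_cols` are hypothetical. In each iteration, one OpenMP section updates and factors the next panel while the other section applies the current panel to the rest of the trailing matrix.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major index of element (i, j) in an n x n matrix. */
static size_t idx(int i, int j, int n) { return (size_t)j * (size_t)n + (size_t)i; }

static int min_int(int x, int y) { return x < y ? x : y; }

/* Unblocked LU (no pivoting) of the panel formed by rows k..n-1 and
 * columns k..k+w-1. L (unit lower) and U overwrite the panel in place. */
static void panel_factor(double *a, int n, int k, int w)
{
    for (int j = k; j < k + w; ++j) {
        for (int i = j + 1; i < n; ++i)
            a[idx(i, j, n)] /= a[idx(j, j, n)];
        for (int c = j + 1; c < k + w; ++c)
            for (int i = j + 1; i < n; ++i)
                a[idx(i, c, n)] -= a[idx(i, j, n)] * a[idx(j, c, n)];
    }
}

/* Apply the already factored panel (columns k..k+w-1) to columns c0..c1-1:
 * a forward substitution with the unit lower triangle followed by the
 * rank-w update of the rows below the panel's diagonal block. */
static void update_cols(double *a, int n, int k, int w, int c0, int c1)
{
    for (int c = c0; c < c1; ++c)
        for (int j = k; j < k + w; ++j)
            for (int i = j + 1; i < n; ++i)
                a[idx(i, c, n)] -= a[idx(i, j, n)] * a[idx(j, c, n)];
}

/* Blocked right-looking LU with a look-ahead of depth one and block size b.
 * Assumption: no pivoting is needed for the given matrix. */
void lu_lookahead(double *a, int n, int b)
{
    assert(a != NULL && n > 0 && b > 0);

    /* The first panel is factored up front; later panels are factored
     * one iteration ahead of the trailing update that consumes them. */
    panel_factor(a, n, 0, min_int(b, n));

    for (int k = 0; k < n; k += b) {
        int w = min_int(b, n - k);
        int next = k + w;              /* first column of the next panel */
        if (next >= n)
            break;
        int wn = min_int(b, n - next); /* width of the next panel */

        /* The two sections write disjoint column ranges and only read
         * the current panel, so they can run concurrently. */
        #pragma omp parallel sections
        {
            #pragma omp section
            {
                update_cols(a, n, k, w, next, next + wn);
                panel_factor(a, n, next, wn);
            }
            #pragma omp section
            {
                update_cols(a, n, k, w, next + wn, n);
            }
        }
    }
}
```

If the file is compiled without OpenMP support, the pragmas are ignored and the two branches run one after the other with the same result. A production code would presumably distribute the threads unevenly between the two branches and call an optimized multithreaded BLAS for the trailing update, which this sketch does not attempt.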

