A Task-based Multi-shift QR/QZ Algorithm with Aggressive Early Deflation

07/07/2020
by Mirko Myllykoski, et al.
The QR algorithm is one of the three phases in the process of computing the eigenvalues and the eigenvectors of a dense nonsymmetric matrix. This paper describes a task-based QR algorithm for reducing an upper Hessenberg matrix to real Schur form. The task-based algorithm also supports generalized eigenvalue problems (QZ algorithm), but this paper focuses mainly on the standard case. The task-based algorithm inherits previous algorithmic improvements, such as tightly-coupled multi-shifts and Aggressive Early Deflation (AED), and also incorporates several new ideas that significantly improve the performance. These include the elimination of several synchronization points, the dynamic merging of previously separate computational steps, the shortening and the prioritization of the critical path, and the introduction of experimental GPU support. The task-based implementation is demonstrated to be significantly faster than multi-threaded LAPACK and ScaLAPACK in both single-node and multi-node configurations on two different machines based on Intel and AMD CPUs. The implementation is built on top of the StarPU runtime system and is part of the open-source StarNEig library.
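
To make the phase described above concrete, the following is a minimal sketch of a textbook, unshifted QR iteration that drives a small upper Hessenberg matrix toward Schur form. It is not the paper's task-based multi-shift algorithm, it has no shifts, no AED and no parallelism, and it does not use StarNEig or StarPU. The function names `qr_step` and `real_schur_unshifted` and the 3x3 test matrix are assumptions chosen for illustration. The matrix has real, well-separated eigenvalues, so the limit is upper triangular; complex conjugate pairs would leave 2x2 diagonal blocks that this simple stopping test does not handle.

```python
import math


def qr_step(H):
    """Return R @ Q, where H = Q @ R and H is upper Hessenberg."""
    n = len(H)
    A = [row[:] for row in H]  # work on a copy
    rotations = []
    # Zero each subdiagonal entry with a Givens rotation: A becomes R.
    for k in range(n - 1):
        a, b = A[k][k], A[k + 1][k]
        r = math.hypot(a, b)
        c, s = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
        rotations.append((c, s))
        for j in range(k, n):
            x, y = A[k][j], A[k + 1][j]
            A[k][j] = c * x + s * y
            A[k + 1][j] = -s * x + c * y
        A[k + 1][k] = 0.0  # remove rounding residue below the diagonal
    # Apply the transposed rotations from the right: A becomes R @ Q,
    # which is again upper Hessenberg and similar to H.
    for k, (c, s) in enumerate(rotations):
        for i in range(k + 2):
            x, y = A[i][k], A[i][k + 1]
            A[i][k] = c * x + s * y
            A[i][k + 1] = -s * x + c * y
    return A


def real_schur_unshifted(H, tol=1e-12, max_iter=1000):
    """Iterate qr_step until every subdiagonal entry is below tol."""
    A = [row[:] for row in H]
    for _ in range(max_iter):
        if max(abs(A[i + 1][i]) for i in range(len(A) - 1)) < tol:
            break
        A = qr_step(A)
    return A


# Hypothetical nonsymmetric upper Hessenberg input with real eigenvalues.
H = [[4.0, 1.0, 2.0],
     [1.0, 3.0, 1.0],
     [0.0, 0.5, 1.0]]
T = real_schur_unshifted(H)
eigenvalues = [T[i][i] for i in range(len(T))]
```

Each step is a similarity transformation, so the eigenvalues of `H` appear on the diagonal of `T` once the subdiagonal has vanished. Practical implementations add shifts and deflation precisely because this unshifted form converges only linearly.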


research
02/12/2020

Task-based, GPU-accelerated and Robust Library for Solving Dense Nonsymmetric Eigenvalue Problems

In this paper, we present the StarNEig library for solving dense nonsymm...
research
05/13/2019

Introduction to StarNEig – A Task-based Library for Solving Nonsymmetric Eigenvalue Problems

In this paper, we present the StarNEig library for solving dense non-sym...
research
02/03/2016

An SSD-based eigensolver for spectral analysis on billion-node graphs

Many eigensolvers such as ARPACK and Anasazi have been developed to comp...
research
04/20/2023

Optimizing High-Performance Linpack for Exascale Accelerated Architectures

We detail the performance optimizations made in rocHPL, AMD's open-sourc...
research
04/03/2020

Interpolation of Dense and Sparse Rational Functions and other Improvements in

We present the main improvements and new features in version of the ope...
research
12/27/2021

Design and Experimental Evaluation of Algorithms for Optimizing the Throughput of Dispersed Computing

With growing deployment of Internet of Things (IoT) and machine learning...
research
12/18/2018

MatRox: A Model-Based Algorithm with an Efficient Storage Format for Parallel HSS-Structured Matrix Approximations

We present MatRox, a novel model-based algorithm and implementation of H...
