Aydin Buluc

research

∙ 05/08/2023

CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers

This paper introduces the batch-parallel Compressed Packed Memory Array ...

0 Brian Wheatman, et al. ∙

research

∙ 04/17/2023

Space Efficient Sequence Alignment for SRAM-Based Computing: X-Drop on the Graphcore IPU

Dedicated accelerator hardware has become essential for processing AI-ba...

0 Luk Burchard, et al. ∙

research

∙ 03/03/2023

Extreme-scale many-against-many protein similarity search

Similarity search is one of the most fundamental computations that are r...

0 Oguz Selvitopi, et al. ∙

research

∙ 01/29/2023

Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition

We present a data structure to randomly sample rows from the Khatri-Rao ...

0 Vivek Bharadwaj, et al. ∙

research

∙ 10/11/2022

Distributed-Memory Randomized Algorithms for Sparse Tensor CP Decomposition

Low-rank Candecomp / PARAFAC (CP) Decomposition is a powerful tool for t...

0 Vivek Bharadwaj, et al. ∙

research

∙ 09/07/2022

Large Scale Enrichment and Statistical Cyber Characterization of Network Traffic

Modern network sensors continuously produce enormous quantities of raw d...

0 Ivan Kawaminami, et al. ∙

research

∙ 07/10/2022

Distributed-Memory Parallel Contig Generation for De Novo Long-Read Genome Assembly

De novo genome assembly, i.e., rebuilding the sequence of an unknown gen...

0 Giulia Guidi, et al. ∙

research

∙ 03/25/2022

GraphBLAS on the Edge: Anonymized High Performance Streaming of Network Traffic

Long range detection is a cornerstone of defense in many operating domai...

0 Michael Jones, et al. ∙

research

∙ 03/19/2022

Temporal Correlation of Internet Observatories and Outposts

The Internet has become a critical component of modern civilization requ...

0 Jeremy Kepner, et al. ∙

research

∙ 03/15/2022

Distributed-Memory Sparse Kernels for Machine Learning

Sampled Dense Times Dense Matrix Multiplication (SDDMM) and Sparse Times...

10 Vivek Bharadwaj, et al. ∙

research

∙ 12/19/2021

Parallel Algorithms for Adding a Collection of Sparse Matrices

We develop a family of parallel algorithms for the SpKAdd operation that...

0 Md Taufique Hussain, et al. ∙

research

∙ 11/30/2021

Atos: A Task-Parallel GPU Dynamic Scheduling Framework for Dynamic Irregular Computations

We present Atos, a task-parallel GPU dynamic scheduling framework that i...

0 Yuxin Chen, et al. ∙

research

∙ 11/18/2021

Parallel Algorithms for Masked Sparse Matrix-Matrix Products

Computing the product of two sparse matrices (SpGEMM) is a fundamental o...

0 Srđan Milaković, et al. ∙

research

∙ 08/15/2021

Spatial Temporal Analysis of 40,000,000,000,000 Internet Darkspace Packets

The Internet has never been more important to our society, and understan...

0 Jeremy Kepner, et al. ∙

research

∙ 06/28/2021

Combinatorial BLAS 2.0: Scaling combinatorial algorithms on distributed-memory systems

Combinatorial algorithms such as those that arise in graph analysis, mod...

0 Ariful Azad, et al. ∙

research

∙ 04/19/2021

Randomized Algorithms for Scientific Computing (RASC)

Randomized algorithms have propelled advances in artificial intelligence...

24 Aydin Buluc, et al. ∙

research

∙ 11/02/2020

10 Years Later: Cloud Computing is Closing the Performance Gap

Can cloud computing infrastructures provide HPC-competitive performance ...

0 Giulia Guidi, et al. ∙

research

∙ 10/30/2020

PersGNN: Applying Topological Data Analysis and Geometric Deep Learning to Structure-Based Protein Function Prediction

Understanding protein structure-function relationships is a key challeng...

13 Nicolas Swenson, et al. ∙

research

∙ 10/20/2020

Parallel String Graph Construction and Transitive Reduction for De Novo Genome Assembly

One of the most computationally intensive tasks in computational biology...

0 Giulia Guidi, et al. ∙

research

∙ 10/16/2020

Communication-Avoiding and Memory-Constrained Sparse Matrix-Matrix Multiplication at Extreme Scale

Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in ...

0 Md Taufique Hussain, et al. ∙

research

∙ 09/30/2020

Distributed Many-to-Many Protein Sequence Alignment using Sparse Matrices

Identifying similar protein sequences is a core step in many computation...

0 Oguz Selvitopi, et al. ∙

research

∙ 05/07/2020

Reducing Communication in Graph Neural Network Training

Graph Neural Networks (GNNs) are powerful and flexible neural networks t...

13 Alok Tripathy, et al. ∙

research

∙ 02/24/2020

Optimizing High Performance Markov Clustering for Pre-Exascale Architectures

HipMCL is a high-performance distributed memory implementation of the po...

0 Oguz Selvitopi, et al. ∙

research

∙ 02/12/2020

LOGAN: High-Performance GPU-Based X-Drop Long-Read Alignment

Pairwise sequence alignment is one of the most computationally intensive...

0 Alberto Zeni, et al. ∙

research

∙ 01/27/2020

diBELLA: Distributed Long Read to Long Read Alignment

We present a parallel algorithm and scalable implementation for genome a...

0 Marquita Ellis, et al. ∙

research

∙ 01/20/2020

The Parallelism Motifs of Genomic Data Analysis

Genomic data sets are growing dramatically as the cost of sequencing con...

0 Katherine Yelick, et al. ∙

research

∙ 10/14/2019

A High-Throughput Solver for Marginalized Graph Kernels on GPU

We present the design of a solver for the efficient and high-throughput ...

0 Yu-Hang Tang, et al. ∙

research

∙ 10/04/2019

RDMA vs. RPC for Implementing Distributed Data Structures

Distributed data structures are key to implementing scalable application...

0 Benjamin Brock, et al. ∙

research

∙ 08/04/2019

GraphBLAST: A High-Performance Linear Algebra-based Graph Framework on the GPU

High-performance implementations of graph algorithms are challenging to ...

0 Carl Yang, et al. ∙

research

∙ 10/30/2018

BCL: A Cross-Platform Distributed Container Library

One-sided communication is a useful paradigm for irregular parallel appl...

0 Benjamin Brock, et al. ∙

research

∙ 09/19/2018

Extreme Scale De Novo Metagenome Assembly

Metagenome assembly is the process of transforming a set of short, overl...

0 Evangelos Georganas, et al. ∙

research

∙ 04/10/2018

Implementing Push-Pull Efficiently in GraphBLAS

We factor Beamer's push-pull, also known as direction-optimized breadth-...

0 Carl Yang, et al. ∙

research

∙ 04/05/2018

High-performance sparse matrix-matrix products on Intel KNL and multicore architectures

Sparse matrix-matrix multiplication (SpGEMM) is a computational primitiv...

0 Yusuke Nagasaka, et al. ∙

research

∙ 03/22/2018

Design Principles for Sparse Matrix Multiplication on the GPU

We implement two novel algorithms for sparse-matrix dense-matrix multipl...

0 Carl Yang, et al. ∙

research

∙ 01/30/2018

A distributed-memory approximation algorithm for maximum weight perfect bipartite matching

We design and implement an efficient parallel approximation algorithm fo...

0 Ariful Azad, et al. ∙

research

∙ 12/12/2017

Integrated Model, Batch and Domain Parallelism in Training Neural Networks

We propose a new integrated method of exploiting model, batch and domain...

0 Amir Gholami, et al. ∙

research

∙ 12/12/2017

Integrated Model and Data Parallelism in Training Neural Networks

We propose a new integrated method of exploiting both model and data par...

0 Amir Gholami, et al. ∙

research

∙ 10/30/2017

Communication-Avoiding Optimization Methods for Massive-Scale Graphical Model Structure Learning

Undirected graphical models compactly represent the structure of large, ...

0 Penporn Koanantakool, et al. ∙

research

∙ 10/26/2016

The Reverse Cuthill-McKee Algorithm in Distributed-Memory

Ordering vertices of a graph is key to minimize fill-in and data structu...

0 Ariful Azad, et al. ∙

research

∙ 06/18/2016

Mathematical Foundations of the GraphBLAS

The GraphBLAS standard (GraphBlas.org) is being developed to bring the p...

0 Jeremy Kepner, et al. ∙

