
-
PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses
With the increasing adoption of graph neural networks (GNNs) in the mach...
-
Safer Illinois and RokWall: Privacy Preserving University Health Apps for COVID-19
COVID-19 has fundamentally disrupted the way we live. Government bodies,...
-
TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes
MPI derived datatypes are an abstraction that simplifies handling of non...
-
When Machine Learning Meets Quantum Computers: A Case Study
Along with the development of AI democratization, the machine learning a...
-
Interpretable Visual Reasoning via Induced Symbolic Space
We study the problem of concept induction in visual reasoning, i.e., ide...
-
Efficient Neural Network Implementation with Quadratic Neuron
Previous works proved that the combination of the linear neuron network ...
-
Effective Algorithm-Accelerator Co-design for AI Solutions on Edge Devices
High quality AI solutions require joint optimization of AI algorithms, s...
-
Exploring Semantic Capacity of Terms
We introduce and study semantic capacity of terms. For example, the sema...
-
DNNExplorer: A Framework for Modeling and Exploring a Novel Paradigm of FPGA-based DNN Accelerator
Existing FPGA-based DNN accelerators typically fall into two design para...
-
Tearing Down the Memory Wall
We present a vision for the Erudite architecture that redefines the comp...
-
Practical Detection of Trojan Neural Networks: Data-Limited and Data-Free Cases
When the training data are maliciously tampered, the predictions of the ...
-
At-Scale Sparse Deep Neural Network Inference with Efficient GPU Implementation
This paper presents GPU performance optimization and scaling results for...
-
ICA-UNet: ICA Inspired Statistical UNet for Real-time 3D Cardiac Cine MRI Segmentation
Real-time cine magnetic resonance imaging (MRI) plays an increasingly im...
-
Can Quantum Computers Learn Like Classical Computers? A Co-Design Framework for Machine Learning and Quantum Circuits
Despite the pursuit of quantum supremacy in various applications, the po...
-
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case
Although graph neural networks (GNNs) have made great progress recently ...
-
EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal In GPUs
Modern analytics and recommendation systems are increasingly based on gr...
-
A Multi-Perspective Architecture for Semantic Code Search
The ability to match pieces of code to their corresponding natural langu...
-
EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions
High quality AI solutions require joint optimization of AI algorithms an...
-
Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation
Learning segmentation from synthetic data and adapting to real data can ...
-
Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
We consider the problem of unsupervised domain adaptation for semantic s...
-
Multi-Cycle-Consistent Adversarial Networks for CT Image Denoising
CT image denoising can be treated as an image-to-image translation task ...
-
DLSpec: A Deep Learning Task Exchange Specification
Deep Learning (DL) innovations are being introduced at a rapid pace. How...
-
MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale
Machine Learning (ML) and Deep Learning (DL) innovations are being intro...
-
On Interpretability of Artificial Neural Networks
Deep learning has achieved great successes in many important areas to de...
-
Tensor Recovery from Noisy and Multi-Level Quantized Measurements
Higher-order tensors can represent scores in a rating system, frames in ...
-
ELFISH: Resource-Aware Federated Learning on Heterogeneous Edge Devices
In this work, we propose ELFISH - a resource-aware federated learning fr...
-
Enabling real-time multi-messenger astrophysics discoveries with deep learning
Multi-messenger astrophysics is a fast-growing, interdisciplinary field ...
-
The Design and Implementation of a Scalable DL Benchmarking Platform
The current Deep Learning (DL) landscape is fast-paced and is rife with ...
-
DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs
The past few years have seen a surge of applying Deep Learning (DL) mode...
-
DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs (Extended)
The past few years have seen a surge of applying Deep Learning (DL) mode...
-
NAIS: Neural Architecture and Implementation Search and its Applications in Autonomous Driving
The rapidly growing demands for powerful AI algorithms in many applicati...
-
Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs
As Deep Learning (DL) models have been increasingly used in latency-sens...
-
PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space
Finding the right reviewers to assess the quality of conference submissi...
-
SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems
Developing object detection and tracking on resource-constrained embedde...
-
MSU-Net: Multiscale Statistical U-Net for Real-time 3D Cardiac MRI Video Segmentation
Cardiac magnetic resonance imaging (MRI) is an essential tool for MRI-gu...
-
SPGNet: Semantic Prediction Guidance for Scene Parsing
Multi-scale context module and single-stage encoder-decoder structure ar...
-
Across-Stack Profiling and Characterization of Machine Learning Models on GPUs
The world sees a proliferation of machine learning/deep learning (ML) mo...
-
XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs
There has been a rapid proliferation of machine learning/deep learning (...
-
Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device
Unlike traditional PCIe-based FPGA accelerators, heterogeneous SoC-FPGA ...
-
SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection
Developing artificial intelligence (AI) at the edge is always challengin...
-
A Retrospective Recount of Computer Architecture Research with a Data-Driven Study of Over Four Decades of ISCA Publications
This study began with a research project, called DISCvR, conducted at th...
-
A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices
Developing deep learning models for resource-constrained Internet-of-Thi...
-
Challenges and Pitfalls of Reproducing Machine Learning Artifacts
An increasingly complex and diverse collection of Machine Learning (ML) m...
-
Challenges and Pitfalls of Machine Learning Evaluation and Benchmarking
An increasingly complex and diverse collection of Machine Learning (ML) ...
-
FPGA/DNN Co-Design: An Efficient Design Methodology for IoT Intelligence on the Edge
While embedded FPGAs are attractive platforms for DNN acceleration on ed...
-
Document Similarity for Texts of Varying Lengths via Hidden Topics
Measuring similarity between texts is an important task for several appl...
-
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus
Text style transfer rephrases a text from a source style (e.g., informal...
-
SCNN: A General Distribution based Statistical Convolutional Neural Network with Application to Video Object Detection
Various convolutional neural networks (CNNs) were developed recently tha...