
FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator
Recent works demonstrated the promise of using resistive random access m...
Kudu: An Efficient and Scalable Distributed Graph Pattern Mining Engine
This paper proposes Kudu, a general distributed execution engine with a ...
HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation
Tensor computations overwhelm traditional general-purpose computing devi...
IntersectX: An Efficient Accelerator for Graph Mining
Graph pattern mining applications try to find all embeddings that match ...
Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework
Deep Neural Networks (DNNs) have achieved extraordinary performance in v...
Low-Cost Floating-Point Processing in ReRAM for Scientific Computing
We propose ReFloat, a principled approach for low-cost floating-point pr...
DwarvesGraph: A High-Performance Graph Mining System with Pattern Decomposition
Graph mining tasks, which focus on extracting structural information fro...
ReversiSpec: Reversible Coherence Protocol for Defending Transient Attacks
The recent works such as InvisiSpec, SafeSpec, and CleanupSpec, among o...
A Lightweight Isolation Mechanism for Secure Branch Predictors
Recently exposed vulnerabilities reveal the necessity to improve the sec...
PERMDNN: Efficient Compressed DNN Architecture with Permuted Diagonal Matrices
Deep neural network (DNN) has emerged as the most important and popular ...
A Comprehensive Evaluation of RDMA-enabled Concurrency Control Protocols
Online transaction processing (OLTP) applications require efficient dis...
PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning
With the emergence of a spectrum of high-end mobile devices, many applic...
Heterogeneity-Aware Asynchronous Decentralized Training
Distributed deep learning training usually adopts AllReduce as the sync...
A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology
The Adiabatic Quantum-Flux-Parametron (AQFP) superconducting technology ...
Non-structured DNN Weight Pruning Considered Harmful
Large deep neural network (DNN) models pose the key challenge to energy ...
Hop: Heterogeneity-Aware Decentralized Training
Recent work has shown that decentralized algorithms can deliver superior...
HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array
With the rise of artificial intelligence in recent years, Deep Neural Ne...
ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Method of Multipliers
To facilitate efficient embedded and hardware implementations of deep ne...
E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs
Recurrent Neural Networks (RNNs) are becoming increasingly important for...
An Efficient Framework for Implementing Persist Data Structures on Remote NVM
The byte-addressable Non-Volatile Memory (NVM) is a promising technology...
Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework
Hardware accelerations of deep learning systems have been extensively in...
VIBNN: Hardware Acceleration of Bayesian Neural Networks
Bayesian Neural Networks (BNNs) have been proposed to address the proble...
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices
Large-scale deep neural networks (DNNs) are both compute and memory inte...
GraphR: Accelerating Graph Processing Using ReRAM
This paper presents GRAPHR, the first ReRAM-based graph processing accel...
SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing
With recent advancing of Internet of Things (IoTs), it becomes very attr...
