
-
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
The attention mechanism is becoming increasingly popular in Natural Lang...
-
IOS: Inter-Operator Scheduler for CNN Acceleration
To accelerate CNN inference, existing deep learning frameworks focus on ...
-
Hardware-Centric AutoML for Mixed-Precision Quantization
Model quantization is a widely used technique to compress and accelerate...
-
Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
Self-driving cars need to understand 3D scenes efficiently and accuratel...
-
Tiny Transfer Learning: Towards Memory-Efficient On-Device Learning
We present Tiny-Transfer-Learning (TinyTL), an efficient on-device learn...
-
MCUNet: Tiny Deep Learning on IoT Devices
Machine learning on tiny IoT devices based on microcontroller units (MCU...
-
Differentiable Augmentation for Data-Efficient GAN Training
The performance of generative adversarial networks (GANs) heavily deteri...
-
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
We present APQ for efficient deep learning inference on resource-constra...
-
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Transformers are ubiquitous in Natural Language Processing (NLP) tasks, ...
-
MicroNet for Efficient Language Modeling
It is important to design compact language models for efficient deployme...
-
GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning
Automatic transistor sizing is a challenging problem in circuit design d...
-
Lite Transformer with Long-Short Range Attention
Transformer has become ubiquitous in natural language processing (e.g., ...
-
A Fast Algorithm for Source-Wise Round-Trip Spanners
In this paper, we study the problem of efficiently constructing source-w...
-
GAN Compression: Efficient Architectures for Interactive Conditional GANs
Conditional Generative Adversarial Networks (cGANs) have enabled control...
-
SpArch: Efficient Architecture for Sparse Matrix Multiplication
Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous...
-
ChainSplitter: Towards Blockchain-based Industrial IoT Architecture for Supporting Hierarchical Storage
The fast developing Industrial Internet of Things (IIoT) technologies pr...
-
Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos
Deep video recognition is more computationally expensive than image reco...
-
Once for All: Train One Network and Specialize it for Efficient Deployment
Efficient deployment of deep learning models requires specialized neural...
-
Point-Voxel CNN for Efficient 3D Deep Learning
We present Point-Voxel CNN (PVCNN) for efficient, fast 3D deep learning....
-
Deep Leakage from Gradients
Exchanging gradients is a widely used method in modern multi-node machin...
-
Design Automation for Efficient Deep Learning Computing
Efficient deep learning computing requires algorithm and hardware co-des...
-
Defensive Quantization: When Efficiency Meets Robustness
Neural network quantization is becoming an industry standard to efficien...
-
SysML: The New Frontier of Machine Learning Systems
Machine learning (ML) techniques are enjoying rapidly increasing adoptio...
-
Fully Distributed Packet Scheduling Framework for Handling Disturbances in Lossy Real-Time Wireless Networks
Along with the rapid growth of Industrial Internet-of-Things (IIoT) appl...
-
Learning to Design Circuits
Analog IC design relies on human experts to search for parameters that s...
-
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Neural architecture search (NAS) has a great impact by automatically des...
-
HAQ: Hardware-Aware Automated Quantization
Model quantization is a widely used technique to compress and accelerate...
-
Temporal Shift Module for Efficient Video Understanding
The explosive growth in online video streaming gives rise to challenges ...
-
Communication-Optimal Distributed Dynamic Graph Clustering
We consider the problem of clustering graph nodes over large-scale dynam...
-
Path-Level Network Transformation for Efficient Architecture Search
We introduce a new function-preserving transformation for efficient neur...
-
Fast inference of deep neural networks in FPGAs for particle physics
Recent results at the Large Hadron Collider (LHC) have pointed to enhanc...
-
RT-DAP: A Real-Time Data Analytics Platform for Large-scale Industrial Process Monitoring and Control
In most process control systems nowadays, process measurements are perio...
-
Efficient Sparse-Winograd Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are computationally intensive, whic...
-
ADC: Automated Deep Compression and Acceleration with Reinforcement Learning
Model compression is an effective technique facilitating the deployment ...
-
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Large-scale distributed training requires significant communication band...
-
Deep Generative Adversarial Networks for Compressed Sensing Automates MRI
Magnetic resonance image (MRI) reconstruction is a severely ill-posed li...
-
Exploring the Regularity of Sparse Structure in Convolutional Neural Networks
Sparsity helps reduce the computational complexity of deep neural networ...
-
Classification of Neurological Gait Disorders Using Multi-task Feature Learning
As our population ages, neurological impairments and degeneration of the...
-
ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA
Long Short-Term Memory (LSTM) is widely used in speech recognition. In o...
-
DSD: Dense-Sparse-Dense Training for Deep Neural Networks
Modern deep neural networks have a large number of parameters, making th...
-
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
Recent research on deep neural networks has focused primarily on improvi...
-
Generate Image Descriptions based on Deep RNN and Memory Cells for Images Features
Generating natural language descriptions for images is a challenging tas...
-
EIE: Efficient Inference Engine on Compressed Deep Neural Network
State-of-the-art deep neural networks (DNNs) have hundreds of millions o...
-
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Neural networks are both computationally intensive and memory intensive,...
-
Learning both Weights and Connections for Efficient Neural Networks
Neural networks are both computationally intensive and memory intensive,...
-
Robust Face Recognition using Local Illumination Normalization and Discriminant Feature Point Selection
Face recognition systems must be robust to the variation of various fact...