
QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits
Quantum noise is the key challenge in Noisy Intermediate-Scale Quantum (...
NAAS: Neural Accelerator Architecture Search
Data-driven, automatic design space exploration of neural accelerator ar...
Efficient and Robust LiDAR-Based End-to-End Navigation
Deep learning has been used to demonstrate end-to-end neural network lea...
PatchNet – Short-range Template Matching for Efficient Video Processing
Object recognition is a fundamental problem in many video processing tas...
Anycost GANs for Interactive Image Synthesis and Editing
Generative adversarial networks (GANs) have enabled photorealistic image...
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
The attention mechanism is becoming increasingly popular in Natural Lang...
IOS: Inter-Operator Scheduler for CNN Acceleration
To accelerate CNN inference, existing deep learning frameworks focus on ...
Hardware-Centric AutoML for Mixed-Precision Quantization
Model quantization is a widely used technique to compress and accelerate...
Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
Self-driving cars need to understand 3D scenes efficiently and accuratel...
Tiny Transfer Learning: Towards Memory-Efficient On-Device Learning
We present Tiny-Transfer-Learning (TinyTL), an efficient on-device learn...
MCUNet: Tiny Deep Learning on IoT Devices
Machine learning on tiny IoT devices based on microcontroller units (MCU...
Differentiable Augmentation for Data-Efficient GAN Training
The performance of generative adversarial networks (GANs) heavily deteri...
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
We present APQ for efficient deep learning inference on resource-constra...
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Transformers are ubiquitous in Natural Language Processing (NLP) tasks, ...
MicroNet for Efficient Language Modeling
It is important to design compact language models for efficient deployme...
GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning
Automatic transistor sizing is a challenging problem in circuit design d...
Lite Transformer with Long-Short Range Attention
Transformer has become ubiquitous in natural language processing (e.g., ...
A Fast Algorithm for Source-Wise Round-Trip Spanners
In this paper, we study the problem of efficiently constructing source-w...
GAN Compression: Efficient Architectures for Interactive Conditional GANs
Conditional Generative Adversarial Networks (cGANs) have enabled control...
SpArch: Efficient Architecture for Sparse Matrix Multiplication
Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a ubiquitous...
ChainSplitter: Towards Blockchain-based Industrial IoT Architecture for Supporting Hierarchical Storage
The fast developing Industrial Internet of Things (IIoT) technologies pr...
Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos
Deep video recognition is more computationally expensive than image reco...
Once for All: Train One Network and Specialize it for Efficient Deployment
Efficient deployment of deep learning models requires specialized neural...
Point-Voxel CNN for Efficient 3D Deep Learning
We present Point-Voxel CNN (PVCNN) for efficient, fast 3D deep learning....
Deep Leakage from Gradients
Exchanging gradients is a widely used method in modern multi-node machin...
Design Automation for Efficient Deep Learning Computing
Efficient deep learning computing requires algorithm and hardware co-des...
Defensive Quantization: When Efficiency Meets Robustness
Neural network quantization is becoming an industry standard to efficien...
SysML: The New Frontier of Machine Learning Systems
Machine learning (ML) techniques are enjoying rapidly increasing adoptio...
Fully Distributed Packet Scheduling Framework for Handling Disturbances in Lossy Real-Time Wireless Networks
Along with the rapid growth of Industrial Internet-of-Things (IIoT) appl...
Learning to Design Circuits
Analog IC design relies on human experts to search for parameters that s...
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Neural architecture search (NAS) has a great impact by automatically des...
HAQ: Hardware-Aware Automated Quantization
Model quantization is a widely used technique to compress and accelerate...
Temporal Shift Module for Efficient Video Understanding
The explosive growth in online video streaming gives rise to challenges ...
Communication-Optimal Distributed Dynamic Graph Clustering
We consider the problem of clustering graph nodes over large-scale dynam...
Path-Level Network Transformation for Efficient Architecture Search
We introduce a new function-preserving transformation for efficient neur...
Fast inference of deep neural networks in FPGAs for particle physics
Recent results at the Large Hadron Collider (LHC) have pointed to enhanc...
RT-DAP: A Real-Time Data Analytics Platform for Large-scale Industrial Process Monitoring and Control
In most process control systems nowadays, process measurements are perio...
Efficient Sparse-Winograd Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are computationally intensive, whic...
ADC: Automated Deep Compression and Acceleration with Reinforcement Learning
Model compression is an effective technique facilitating the deployment ...
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Large-scale distributed training requires significant communication band...
Deep Generative Adversarial Networks for Compressed Sensing Automates MRI
Magnetic resonance image (MRI) reconstruction is a severely ill-posed li...
Exploring the Regularity of Sparse Structure in Convolutional Neural Networks
Sparsity helps reduce the computational complexity of deep neural networ...
Classification of Neurological Gait Disorders Using Multi-task Feature Learning
As our population ages, neurological impairments and degeneration of the...
ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA
Long Short-Term Memory (LSTM) is widely used in speech recognition. In o...
DSD: Dense-Sparse-Dense Training for Deep Neural Networks
Modern deep neural networks have a large number of parameters, making th...
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
Recent research on deep neural networks has focused primarily on improvi...
Generate Image Descriptions based on Deep RNN and Memory Cells for Images Features
Generating natural language descriptions for images is a challenging tas...
EIE: Efficient Inference Engine on Compressed Deep Neural Network
State-of-the-art deep neural networks (DNNs) have hundreds of millions o...
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Neural networks are both computationally intensive and memory intensive,...
Learning both Weights and Connections for Efficient Neural Networks
Neural networks are both computationally intensive and memory intensive,...
Song Han
Assistant professor in the Electrical Engineering and Computer Science Department of the Massachusetts Institute of Technology (MIT).