
Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling
While Transformer-based models have shown impressive language modeling p...
Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks
The quantization of deep neural networks (QDNNs) has been actively studi...
S-SGD: Symmetrical Stochastic Gradient Descent with Weight Noise Injection for Reaching Flat Minima
The stochastic gradient descent (SGD) method is most widely used for dee...
Quantized Neural Networks: Characterization and Holistic Optimization
Quantized deep neural networks (QDNNs) are necessary for low-power, high...
SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks
Designing a deep neural network (DNN) with good generalization capabilit...
Empirical Analysis of Knowledge Distillation Technique for Optimization of Quantized Deep Neural Networks
Knowledge distillation (KD) is a very popular method for model size redu...
Workload-aware Automatic Parallelization for Multi-GPU DNN Training
Deep neural networks (DNNs) have emerged as successful solutions for var...
Single Stream Parallelization of Recurrent Neural Networks for Low Power and Fast Inference
As neural network algorithms show high performance in many applications,...
Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations
Deep neural networks (DNNs) usually demand a large amount of operations ...
Quantized neural network design under weight capacity constraint
The complexity of deep neural network algorithms for hardware implementa...
Compact Deep Convolutional Neural Networks With Coarse Pruning
The learning capability of a neural network improves with increasing dep...
Character-Level Language Modeling with Hierarchical Recurrent Neural Networks
Recurrent neural network (RNN) based character-level language models (CL...
Dynamic Hand Gesture Recognition for Wearable Devices with Low Complexity Recurrent Neural Networks
Gesture recognition is a very essential technology for many wearable dev...
FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only
Deep neural networks (DNNs) demand a very large amount of computation an...
Character-Level Incremental Speech Recognition with Recurrent Neural Networks
In real-time speech recognition applications, the latency is an importan...
Online Keyword Spotting with a Character-Level Recurrent Neural Network
In this paper, we propose a context-aware keyword spotting model employi...
Structured Pruning of Deep Convolutional Neural Networks
Real time application of deep learning algorithms is often hindered by h...
Fixed-Point Performance Analysis of Recurrent Neural Networks
Recurrent neural networks have shown excellent performance in many appli...
Online Sequence Training of Recurrent Neural Networks with Connectionist Temporal Classification
Connectionist temporal classification (CTC) based supervised sequence tr...
Resiliency of Deep Neural Networks under Quantization
The complexity of deep neural network algorithms for hardware implementa...
Single stream parallelization of generalized LSTM-like RNNs on a GPU
Recurrent neural networks (RNNs) have shown outstanding performance on p...
Wonyong Sung
Professor of Electrical and Computer Engineering at Seoul National University