
Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling
While Transformer-based models have shown impressive language modeling p...

Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks
The quantization of deep neural networks (QDNNs) has been actively studi...

S-SGD: Symmetrical Stochastic Gradient Descent with Weight Noise Injection for Reaching Flat Minima
The stochastic gradient descent (SGD) method is most widely used for dee...

Quantized Neural Networks: Characterization and Holistic Optimization
Quantized deep neural networks (QDNNs) are necessary for low-power, high...

SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks
Designing a deep neural network (DNN) with good generalization capabilit...

Empirical Analysis of Knowledge Distillation Technique for Optimization of Quantized Deep Neural Networks
Knowledge distillation (KD) is a very popular method for model size redu...

Workload-aware Automatic Parallelization for Multi-GPU DNN Training
Deep neural networks (DNNs) have emerged as successful solutions for var...

Single Stream Parallelization of Recurrent Neural Networks for Low Power and Fast Inference
As neural network algorithms show high performance in many applications,...

Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations
Deep neural networks (DNNs) usually demand a large amount of operations ...

Quantized neural network design under weight capacity constraint
The complexity of deep neural network algorithms for hardware implementa...

Compact Deep Convolutional Neural Networks With Coarse Pruning
The learning capability of a neural network improves with increasing dep...

Character-Level Language Modeling with Hierarchical Recurrent Neural Networks
Recurrent neural network (RNN) based character-level language models (CL...

Dynamic Hand Gesture Recognition for Wearable Devices with Low Complexity Recurrent Neural Networks
Gesture recognition is a very essential technology for many wearable dev...

FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only
Deep neural networks (DNNs) demand a very large amount of computation an...

Character-Level Incremental Speech Recognition with Recurrent Neural Networks
In real-time speech recognition applications, the latency is an importan...

Online Keyword Spotting with a Character-Level Recurrent Neural Network
In this paper, we propose a context-aware keyword spotting model employi...

Structured Pruning of Deep Convolutional Neural Networks
Real time application of deep learning algorithms is often hindered by h...

Fixed-Point Performance Analysis of Recurrent Neural Networks
Recurrent neural networks have shown excellent performance in many appli...

Online Sequence Training of Recurrent Neural Networks with Connectionist Temporal Classification
Connectionist temporal classification (CTC) based supervised sequence tr...

Resiliency of Deep Neural Networks under Quantization
The complexity of deep neural network algorithms for hardware implementa...

Single stream parallelization of generalized LSTM-like RNNs on a GPU
Recurrent neural networks (RNNs) have shown outstanding performance on p...
Wonyong Sung
Professor of Electrical and Computer Engineering at Seoul National University