Recent works have widely adopted large language model pretraining for so...
Autoregressive language models (LMs) map token sequences to probabilitie...
Pretrained Transformers achieve state-of-the-art performance in various ...
Deep learning models are widely used for solving challenging code proces...
Channel decoding, channel detection, channel assessment, and resource ma...
Memorization studies of deep neural networks (DNNs) help to understand w...
Despite the conventional wisdom that using batch normalization with weig...
Source code processing heavily relies on the methods widely used in natu...
There is an emerging interest in the application of deep learning models...
Initially developed for natural language processing (NLP), Transformers ...
Ensembles of deep neural networks are known to achieve state-of-the-art ...
One of the generally accepted views of modern deep learning is that incr...
Recently, many techniques have been developed to sparsify the weights of ...
Bayesian methods have been successfully applied to sparsify weights of n...
In natural language processing, many tasks are successfully solv...
Recurrent neural networks show state-of-the-art results in many text ana...