Prompt-tuning has become an increasingly popular parameter-efficient met...
Regularization techniques are crucial to improving the generalization pe...
Knowledge Distillation (KD) is a commonly used technique for improving t...
We propose a general deep architecture for learning functions on multipl...
Slot-filling and intent detection are the backbone of conversational age...
GPT is an auto-regressive Transformer-based pre-trained language model w...
Knowledge Distillation (KD) is extensively used to compress and deploy l...
Knowledge Distillation (KD) is a model compression algorithm that helps ...
Existing Natural Language Understanding (NLU) models have been shown to ...
In this work, we examine the ability of NER models to use contextual inf...
The advent of large pre-trained language models has given rise to rapid ...
Despite recent monumental advances in the field, many Natural Language P...
Knowledge Distillation (KD) is a common knowledge transfer algorithm use...
Word-embeddings are a vital component of Natural Language Processing (NL...
Text generation with generative adversarial networks (GANs) can be divid...
Latent space based GAN methods and attention based sequence to sequence ...