Developing monolingual large Pre-trained Language Models (PLMs) is shown...
Knowledge distillation (KD) is an efficient framework for compressing la...
Slot-filling and intent detection are the backbone of conversational age...
Intermediate layer knowledge distillation (KD) can improve the standard ...
Knowledge Distillation (KD) is extensively used to compress and deploy l...
Existing Natural Language Understanding (NLU) models have been shown to ...
In this work, we examine the ability of NER models to use contextual inf...
Knowledge Distillation (KD) is a common knowledge transfer algorithm use...
Neural network approaches to Named-Entity Recognition reduce the need fo...
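Several of the abstracts above reference knowledge distillation for model compression. As a hedged sketch only (the specific objectives in these papers may differ), the classic formulation combines a hard-label cross-entropy on the student with a temperature-scaled KL divergence between teacher and student distributions; the temperature `T` and mixing weight `alpha` below are illustrative hyperparameters, not values taken from any of these papers:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T yields softer distributions.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Classic KD objective (Hinton-style):
    alpha * CE(student, labels) + (1 - alpha) * T^2 * KL(teacher_T || student_T).
    The T^2 factor keeps gradient magnitudes comparable across temperatures."""
    # Hard-label cross-entropy on the student's (T=1) predictions.
    p_student = softmax(student_logits)
    ce = -np.log(p_student[np.arange(len(labels)), labels] + 1e-12).mean()
    # Soft-target KL divergence at temperature T.
    pt = softmax(teacher_logits, T)
    ps = softmax(student_logits, T)
    kl = (pt * (np.log(pt + 1e-12) - np.log(ps + 1e-12))).sum(axis=-1).mean()
    return alpha * ce + (1 - alpha) * T**2 * kl
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the hard-label cross-entropy contributes, which is a quick sanity check for an implementation.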