EPIK: Eliminating multi-model Pipelines with Knowledge-distillation

11/27/2022
by Bhavesh Laddagiri et al.

Real-world tasks are often handled by multi-model pipelines, in which several models each perform a sub-task in a larger chain and the output of one model is fed as input to the next. MATra, for example, performs crosslingual transliteration in two stages, using English as an intermediate transliteration target when transliterating between two Indic languages. We propose a novel distillation technique, EPIK, that condenses two-stage pipelines for hierarchical tasks into a single end-to-end model without compromising performance. The method can create end-to-end models for tasks without needing a dedicated end-to-end dataset, addressing the data scarcity problem. The EPIK model has been distilled from the MATra model using this knowledge-distillation technique. The MATra model can perform crosslingual transliteration between five languages: English, Hindi, Tamil, Kannada and Bengali. The EPIK model performs transliteration without producing any intermediate English output while retaining the performance and accuracy of the MATra model. The EPIK model achieves an average CER score of 0.015 and an average phonetic accuracy of 92.1%, its execution time is reduced by 54.3%, and it achieves a 97.5% similarity score with the MATra model. In some cases, the EPIK model (student model) can even outperform the MATra model (teacher model), even though it has been distilled from it.
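The abstract describes the technique only at a high level. Below is a minimal, hypothetical sketch (assuming a PyTorch setup with toy GRU models) of the general idea: a single end-to-end student is distilled from a two-stage teacher pipeline by supervising the student on the pipeline's final soft output distribution, so the student never needs the intermediate (English) step. None of the class names, architectures, or hyperparameters below are taken from the EPIK/MATra implementation.

```python
# Hypothetical sketch of pipeline-to-single-model distillation (PyTorch assumed).
# Names and architectures are illustrative, not from the EPIK/MATra codebase.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HID = 128, 64  # toy character vocabulary and hidden size

class Seq2Seq(nn.Module):
    """Toy stand-in for one stage of the teacher (e.g. Hindi -> English)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HID)
        self.rnn = nn.GRU(HID, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)              # (batch, seq, VOCAB) logits

class TwoStageTeacher(nn.Module):
    """Pipeline: source -> intermediate (English) -> target."""
    def __init__(self):
        super().__init__()
        self.stage1, self.stage2 = Seq2Seq(), Seq2Seq()

    @torch.no_grad()
    def forward(self, src):
        inter = self.stage1(src).argmax(-1)  # hard intermediate transliteration
        return self.stage2(inter)            # final target-language logits

teacher = TwoStageTeacher().eval()
student = Seq2Seq()                           # single end-to-end model
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

# Distillation loop: the student only sees the teacher pipeline's final
# soft output distribution, never the intermediate English text.
T = 2.0                                       # softmax temperature
for step in range(100):
    src = torch.randint(0, VOCAB, (32, 16))   # unlabeled source batch
    with torch.no_grad():
        soft_targets = F.softmax(teacher(src) / T, dim=-1)
    loss = F.kl_div(F.log_softmax(student(src) / T, dim=-1),
                    soft_targets, reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because supervision comes from the teacher pipeline rather than from paired data, this kind of setup only requires unlabeled source-language text, which is consistent with the paper's claim that no dedicated end-to-end dataset is needed.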

Related research

04/17/2019 - End-to-End Speech Translation with Knowledge Distillation
End-to-end speech translation (ST), which directly translates from sourc...

05/09/2023 - Multi-Teacher Knowledge Distillation For Text Image Machine Translation
Text image machine translation (TIMT) has been widely used in various re...

09/03/2019 - Knowledge Distillation for End-to-End Person Search
We introduce knowledge distillation for end-to-end person search. End-to...
10/07/2022 - Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization
Due to the high performance of multi-channel speech processing, we can u...

10/02/2020 - Neighbourhood Distillation: On the benefits of non end-to-end distillation
End-to-end training with back propagation is the standard method for tra...
