Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance

08/09/2023
by Sourya Dipta Das, et al.

Dialect classification is used in a variety of applications, such as machine translation and speech recognition, to improve the overall performance of the system. In a real-world scenario, a deployed dialect classification model can encounter anomalous inputs that differ from the training data distribution, known as out-of-distribution (OOD) samples. These OOD samples can lead to unexpected outputs because their dialects were unseen during model training. Out-of-distribution detection is a new research area that has received little attention in the context of dialect classification. To address this, we propose a simple yet effective unsupervised Mahalanobis distance feature-based method to detect out-of-distribution samples. We utilize the latent embeddings from all intermediate layers of a wav2vec 2.0 transformer-based dialect classifier model for multi-task learning. Our proposed approach significantly outperforms other state-of-the-art OOD detection methods.
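
The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of one common way to build a Mahalanobis distance OOD score from per-layer embeddings. It assumes a single Gaussian is fitted per layer on in-distribution embeddings (no labels needed) and that the per-layer squared distances are simply summed. The function names (fit_gaussian, mahalanobis_sq, ood_scores), the summation rule, and the synthetic random "embeddings" standing in for wav2vec 2.0 hidden states are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def fit_gaussian(embeddings, eps=1e-6):
    """Estimate the mean and inverse covariance of in-distribution embeddings.

    embeddings: array of shape (n_samples, dim) for one layer.
    eps: small ridge term (an assumption) to keep the covariance invertible.
    """
    mean = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False)
    cov = cov + eps * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis_sq(x, mean, cov_inv):
    """Squared Mahalanobis distance of each row of x to the fitted Gaussian."""
    diff = x - mean
    return np.einsum("nd,de,ne->n", diff, cov_inv, diff)


def ood_scores(layer_embeddings, layer_stats):
    """Sum per-layer squared distances; a higher score suggests an OOD input.

    Summation across layers is an illustrative choice, not the paper's rule.
    """
    return sum(
        mahalanobis_sq(x, mean, cov_inv)
        for x, (mean, cov_inv) in zip(layer_embeddings, layer_stats)
    )


# Synthetic stand-ins for pooled hidden states from three layers (hypothetical).
rng = np.random.default_rng(0)
num_layers, dim = 3, 8
train = [rng.normal(0.0, 1.0, size=(500, dim)) for _ in range(num_layers)]
stats = [fit_gaussian(e) for e in train]

# Held-out in-distribution samples versus samples from a shifted distribution.
in_dist = [rng.normal(0.0, 1.0, size=(100, dim)) for _ in range(num_layers)]
shifted = [rng.normal(3.0, 1.0, size=(100, dim)) for _ in range(num_layers)]

in_scores = ood_scores(in_dist, stats)
shifted_scores = ood_scores(shifted, stats)
```

In this toy setup the shifted samples receive larger scores on average than the held-out in-distribution samples; in practice a threshold on the score would be chosen on validation data, and the embeddings would come from the trained classifier rather than a random generator.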

research
04/23/2021

Lightweight Detection of Out-of-Distribution and Adversarial Samples via Channel Mean Discrepancy

Detecting out-of-distribution (OOD) and adversarial samples is essential...
research
06/02/2021

Unsupervised Out-of-Domain Detection via Pre-trained Transformers

Deployed real-world machine learning applications are often subject to u...
research
10/27/2022

On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors

Out-of-distribution (OOD) detection is concerned with identifying data p...
research
07/19/2021

OODformer: Out-Of-Distribution Detection Transformer

A serious problem in image classification is that a trained model might ...
research
04/11/2023

Unsupervised out-of-distribution detection for safer robotically-guided retinal microsurgery

Purpose: A fundamental problem in designing safe machine learning system...
research
12/18/2022

Rainproof: An Umbrella To Shield Text Generators From Out-Of-Distribution Data

As more and more conversational and translation systems are deployed in ...
research
06/06/2023

A Functional Data Perspective and Baseline On Multi-Layer Out-of-Distribution Detection

A key feature of out-of-distribution (OOD) detection is to exploit a tra...