Filler Word Detection with Hard Category Mining and Inter-Category Focal Loss

04/12/2023
by Zhiyuan Zhao, et al.

Filler words such as "um" or "uh" are common in spontaneous speech. Automatically detecting and removing them from recordings is desirable, because they reduce the perceived fluency, confidence, and professionalism of speech. Previous studies and our preliminary experiments show that the biggest challenge in filler word detection is that fillers are easily confused with other hard categories such as "a" or "I". In this paper, we propose a novel filler word detection method that addresses this challenge by dynamically adding auxiliary categories and applying an additional inter-category focal loss. The auxiliary categories, obtained by mining hard categories, force the model to represent the confusing words explicitly. The inter-category focal loss, in turn, adaptively adjusts the penalty weight between the "filler" and "non-filler" categories to handle the confusing words that remain in the "non-filler" category. Our system achieves the best results on the PodcastFillers dataset, substantially outperforming other methods.
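To make the idea of a focal penalty between the "filler" and "non-filler" groups concrete, here is a minimal sketch. It is not the paper's formulation, which the abstract does not give. It assumes a standard binary focal loss applied to class probabilities summed per group. The class list, the `filler_ids` set, the `gamma` value, and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical label set: two fillers, two mined "hard" auxiliary
# categories ("a", "I"), and a catch-all for every other word.
CLASSES = ["uh", "um", "a", "I", "other"]
FILLER_IDS = {0, 1}  # assumption: indices of the filler classes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inter_category_focal_loss(logits, label, filler_ids=FILLER_IDS,
                              gamma=2.0, eps=1e-12):
    """Binary focal loss on the filler vs. non-filler grouping.

    Class probabilities are summed within each group, then the usual
    focal term (1 - p_t)^gamma down-weights examples whose group is
    already predicted confidently.
    """
    probs = softmax(logits)
    p_filler = sum(probs[i] for i in filler_ids)
    # Probability assigned to the group that contains the true label.
    p_t = p_filler if label in filler_ids else 1.0 - p_filler
    p_t = min(max(p_t, eps), 1.0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# An "uh" token predicted confidently vs. one confused with "a".
easy = inter_category_focal_loss([4.0, 0.0, 0.0, 0.0, 0.0], label=0)
hard = inter_category_focal_loss([0.5, 0.0, 1.0, 0.0, 0.0], label=0)
print(f"easy={easy:.4f} hard={hard:.4f}")
```

In this sketch the confused example receives a much larger loss than the confident one, which is the adaptive reweighting the abstract describes. In practice such a term would be added to an ordinary cross-entropy loss over all categories, including the auxiliary ones.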

