PSST! Prosodic Speech Segmentation with Transformers

02/03/2023
by Nathan Roll, et al.

Self-attention mechanisms have enabled transformers to achieve superhuman-level performance on many speech-to-text (STT) tasks, yet the challenge of automatic prosodic segmentation has remained unsolved. In this paper we fine-tune Whisper, a pretrained STT model, to annotate intonation unit (IU) boundaries by repurposing low-frequency tokens. Our approach achieves an accuracy of 95.8% without requiring large-scale labeled data or enterprise-grade compute resources. We also diminish input signals by applying a series of filters, finding that low-pass filters at 3.2 kHz improve segmentation performance in out-of-sample and out-of-distribution contexts. We release our model as both a transcription tool and a baseline for further improvements in prosodic segmentation.
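As a rough illustration of the low-pass filtering step mentioned in the abstract, the sketch below attenuates content above 3.2 kHz in a mono waveform. It is not the authors' preprocessing code: the windowed-sinc FIR design, the Hamming window, the 101-tap length, the 16 kHz sample rate, and the function names `lowpass_fir` and `apply_lowpass` are all assumptions chosen for illustration.

```python
import numpy as np


def lowpass_fir(cutoff_hz, sample_rate, num_taps=101):
    """Design a windowed-sinc low-pass FIR filter (hypothetical helper)."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the filter is symmetric")
    # Cutoff expressed in cycles per sample (0.5 would be the Nyquist limit).
    fc = cutoff_hz / sample_rate
    # Sample indices centred on zero.
    n = np.arange(num_taps) - (num_taps - 1) / 2
    # Ideal low-pass impulse response, tapered by a Hamming window.
    taps = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)
    # Normalise so that the gain at 0 Hz is exactly 1.
    return taps / taps.sum()


def apply_lowpass(signal, sample_rate=16000, cutoff_hz=3200.0, num_taps=101):
    """Return `signal` with content above `cutoff_hz` attenuated.

    The 16 kHz default sample rate is an assumption, not a value taken
    from the abstract.
    """
    taps = lowpass_fir(cutoff_hz, sample_rate, num_taps)
    # mode="same" keeps the output the same length as the input.
    return np.convolve(signal, taps, mode="same")
```

Calling `apply_lowpass(waveform)` on a one-dimensional NumPy array returns an array of the same length. A symmetric FIR filter is used here because it has linear phase, so the timing of events in the waveform is not distorted relative to one another; other filter designs would serve equally well for a sketch like this.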

research
05/17/2021

Pay Attention to MLPs

Transformers have become one of the most important architectural innovat...
research
02/06/2022

On Using Transformers for Speech-Separation

Transformers have enabled major improvements in deep learning. They ofte...
research
08/03/2023

Dynamic Token-Pass Transformers for Semantic Segmentation

Vision transformers (ViT) usually extract features via forwarding all th...
research
09/15/2022

Hydra Attention: Efficient Attention with Many Heads

While transformers have begun to dominate many tasks in vision, applying...
research
06/05/2020

Understanding Self-Attention of Self-Supervised Audio Transformers

Self-supervised Audio Transformers (SAT) enable great success in many do...
research
03/09/2022

Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice

Vision Transformer (ViT) has recently demonstrated promise in computer v...
research
08/05/2016

Boundary-based MWE segmentation with text partitioning

This work presents a fine-grained, text-chunking algorithm designed for ...
