Star Temporal Classification: Sequence Classification with Partially Labeled Data

01/28/2022
by   Vineel Pratap, et al.
0

We develop an algorithm which can learn from partially labeled and unsegmented sequential data. Most sequential loss functions, such as Connectionist Temporal Classification (CTC), break down when many labels are missing. We address this problem with Star Temporal Classification (STC) which uses a special star token to allow alignments which include all possible tokens whenever a token could be missing. We express STC as the composition of weighted finite-state transducers (WFSTs) and use GTN (a framework for automatic differentiation with WFSTs) to compute gradients. We perform extensive experiments on automatic speech recognition. These experiments show that STC can recover most of the performance of supervised baseline when up to 70 recognition to show that our method easily applies to other sequence classification tasks.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/01/2023

Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts

This paper presents a novel algorithm for building an automatic speech r...
research
10/30/2019

Unsupervised Star Galaxy Classification with Cascade Variational Auto-Encoder

The increasing amount of data in astronomy provides great challenges for...
research
09/15/2023

Unimodal Aggregation for CTC-based Speech Recognition

This paper works on non-autoregressive automatic speech recognition. A u...
research
10/02/2020

Differentiable Weighted Finite-State Transducers

We introduce a framework for automatic differentiation with weighted fin...
research
09/18/2019

Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes

In sequence modeling tasks the token order matters, but this information...
research
08/12/2023

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition

When labeled data is insufficient, semi-supervised learning with the pse...
research
03/11/2021

Learning with partially separable data

There are partially separable data types that make classification tasks ...

Please sign up or login with your details

Forgot password? Click here to reset