Sampling from Stochastic Finite Automata with Applications to CTC Decoding

05/21/2019
by   Martin Jansche, et al.

Stochastic finite automata arise naturally in many language and speech processing tasks. They include stochastic acceptors, which represent certain probability distributions over random strings. We consider the problem of efficient sampling: drawing random string variates from the probability distribution represented by stochastic automata and transformations of those. We show that path-sampling is effective and can be efficient if the epsilon-graph of a finite automaton is acyclic. We provide an algorithm that ensures this by conflating epsilon-cycles within strongly connected components. Sampling is also effective in the presence of non-injective transformations of strings. We illustrate this in the context of decoding for Connectionist Temporal Classification (CTC), where the predictive probabilities yield auxiliary sequences which are transformed into shorter labeling strings. We can sample efficiently from the transformed labeling distribution and use this in two different strategies for finding the most probable CTC labeling.
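The effect of the non-injective CTC transformation can be illustrated with a small Monte Carlo sketch. It is not the authors' algorithm, only a minimal illustration under stated assumptions: frames are treated as independent categorical distributions, the blank symbol is written `_`, and the names `collapse`, `sample_path`, and `estimate_best_labeling` are hypothetical. Auxiliary sequences are sampled frame by frame, mapped to labelings by merging adjacent repeats and deleting blanks, and the most frequent labeling among the samples is returned as an estimate of the most probable one.

```python
import random
from collections import Counter
from itertools import groupby

BLANK = "_"  # assumed blank symbol; not taken from the paper


def collapse(path):
    """CTC mapping: merge adjacent repeated symbols, then drop blanks."""
    return "".join(sym for sym, _ in groupby(path) if sym != BLANK)


def sample_path(frame_probs, rng):
    """Draw one auxiliary sequence, sampling each frame independently."""
    path = []
    for dist in frame_probs:
        symbols = list(dist)
        weights = [dist[s] for s in symbols]
        path.append(rng.choices(symbols, weights=weights, k=1)[0])
    return path


def estimate_best_labeling(frame_probs, num_samples=10000, seed=0):
    """Return the most frequently sampled labeling and its relative frequency."""
    rng = random.Random(seed)
    counts = Counter(
        collapse(sample_path(frame_probs, rng)) for _ in range(num_samples)
    )
    label, count = counts.most_common(1)[0]
    return label, count / num_samples


if __name__ == "__main__":
    # Two frames, each with P(a) = 0.4 and P(blank) = 0.6.
    frames = [{"a": 0.4, BLANK: 0.6}, {"a": 0.4, BLANK: 0.6}]
    print(estimate_best_labeling(frames))
```

In this toy input the single most probable auxiliary sequence is two blanks (probability 0.36), which collapses to the empty labeling. The labeling "a" is reached by three different sequences ("aa", "a_", "_a") with total probability 0.64, so sampling from the transformed distribution favors "a" even though the best path does not.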


research
02/28/2019

Sequentiality of String-to-Context Transducers

Transducers extend finite state automata with outputs, and describe tran...
research
09/10/2018

On Finding a First-Order Sentence Consistent with a Sample of Strings

We investigate the following problem: given a sample of classified strin...
research
02/04/2019

On Prefix-Sorting Finite Automata

Being able to efficiently test the membership of a word in a formal lang...
research
11/14/2022

Growing Random Strings in CA

We discuss a class of cellular automata (CA) able to produce long random...
research
03/26/2021

On the Theory of Stochastic Automata

The theory of discrete stochastic systems has been initiated by the work...
research
01/29/2020

Stochastic L-system Inference from Multiple String Sequence Inputs

Lindenmayer systems (L-systems) are a grammar system that consist of str...
research
07/05/2020

Proving Non-Inclusion of Büchi Automata based on Monte Carlo Sampling

The search for a proof of correctness and the search for counterexamples...
