Probabilistic finite automata (PFA) are an extension of classical nondeterministic finite automata (NFA) where transitions, for each state and letter, are represented as probability distributions. The PFA model was first introduced by Rabin.
There are a variety of classical problems for PFA. Let $f_{\mathcal{P}}(w)$ denote the acceptance probability of a PFA $\mathcal{P}$ on a word $w \in \Sigma^*$, where $\Sigma$ is an alphabet. A central question is that of (strict) emptiness of cutpoint languages: given some probability $\lambda \in [0,1]$, does there exist a finite input word $w$ whose probability of acceptance is at least $\lambda$, i.e., for which $f_{\mathcal{P}}(w) \geq \lambda$ (or $f_{\mathcal{P}}(w) > \lambda$ for strict emptiness)? Another important problem is that of cutpoint isolation, where we are given a probability $\lambda \in [0,1]$ and we must determine if $\lambda$ can be approached arbitrarily closely, i.e., for each $\epsilon > 0$, does there exist a word $w \in \Sigma^*$ such that $|f_{\mathcal{P}}(w) - \lambda| < \epsilon$? The injectivity problem for PFA is defined as follows: given a PFA $\mathcal{P}$ over alphabet $\Sigma$, determine whether the acceptance function $f_{\mathcal{P}}$ is injective (i.e., whether distinct words always have distinct acceptance probabilities). Finally we mention the $\lambda$-reachability problem: given a PFA $\mathcal{P}$ and $\lambda \in [0,1]$, does there exist $w \in \Sigma^*$ such that $f_{\mathcal{P}}(w) = \lambda$?
The emptiness problem is known to be decidable over a single letter alphabet (i.e., a Markov chain). The injectivity problem for PFA is undecidable, even for polynomially ambiguous PFA.
The main focus of this paper is the cutpoint isolation problem. The authors of  show that the problem of determining if a given cutpoint is isolated (resp. if a PFA has any isolated cutpoint) is undecidable and this was shown to hold even for PFA with (resp. ) states over a binary alphabet . The cutpoint isolation problem, in the special case where , is also known to be undecidable . The problem is especially interesting given the seminal result of Rabin that if a cutpoint is isolated, then the cutpoint language associated with is necessarily regular .
Given the multitude of undecidability results for PFA, it is natural to consider restricted classes for which tractable results may be found. Various classes of restrictions on PFA are possible, depending upon the structure of the PFA or on possible input words. Some restrictions relate to the number of states of the automaton, the alphabet size and whether one defines the PFA over the algebraic real numbers or the rationals. Recent work has studied PFA with finite, polynomial or exponential ambiguity (in terms of the underlying NFA) , PFA defined for restricted input words (for example those coming from regular, bounded or letter-monotonic languages) [4, 3], commutative PFA, where all transition matrices commute, for which cutpoint languages and non-free languages generated by such automata necessarily become commutative  or other structural restrictions on the PFA such as #-acyclic automata, for which some problems become decidable .
A natural restriction on PFA was studied in , where input words of the PFA are restricted to be from some letter-monotonic language of the form $a_1^* a_2^* \cdots a_\ell^*$ with distinct letters $a_1, \ldots, a_\ell$ (analogous to a $1.5$-way PFA, whose read head may “stay put” on an input letter but never moves left). The emptiness and $\lambda$-reachability problems for PFA on letter-monotonic languages were shown to be undecidable for high (finite) dimensional matrices via an encoding of Hilbert’s tenth problem on the solvability of Diophantine equations and Turakainen’s method to transform weighted integer automata to probabilistic automata . These undecidability results were later shown to also hold for polynomially ambiguous PFA with commutative matrices .
The authors of  recently studied decision problems for PFA of various degrees of ambiguity in order to map the frontier of decidability for restricted classes of PFA. The degree of ambiguity of a PFA is a structural property, giving an indication of the number of accepting runs for a given input word, and it can be used to give various classifications of ambiguity including finite, polynomial and exponential ambiguity. The degree of ambiguity of automata is a well-known and well-studied property in automata theory . The authors of  show that the emptiness problem for PFA remains undecidable even for polynomially ambiguous automata (quadratic ambiguity), before going on to show PSPACE-hardness results for finitely ambiguous PFA and that emptiness is in NP for the class of $k$-ambiguous PFA for every $k$. The emptiness problem for PFA was later shown to also be undecidable even for linearly ambiguous automata in .
1.1 Our Contributions
It is natural to consider the decidability of the cutpoint isolation problem for polynomially ambiguous PFA on letter-monotonic or commutative languages, given that the (strict) emptiness problems for such automata are undecidable. In the present paper we prove the surprising result that the cutpoint isolation problem is in fact decidable, even if the PFA is exponentially ambiguous, matrices are non-commutative, and the input language is not just the letter-monotonic language $a_1^* a_2^* \cdots a_\ell^*$ but instead a more general letter-monotonic context-free language. The results are shown in Table 1.
The result is surprising since in order to solve the cutpoint isolation problem, we must handle two cases. Either the cutpoint $\lambda$ can be reached exactly (the $\lambda$-reachability problem), or else it can only be approximated arbitrarily closely and is only reached exactly in some limit. As mentioned, the emptiness problem for cutpoint languages is undecidable for PFA on letter-monotonic languages, even when all matrices commute and the PFA has polynomial ambiguity . The proof of this result shows a construction of a PFA for which determining if a given $\lambda$ is ever reached (i.e., the $\lambda$-reachability problem) is undecidable. This may at first seem to contradict the results of this paper, since the $\lambda$-reachability problem is one of the two subproblems to be solved for cutpoint isolation. Why is there no contradiction then? It comes from the fact that as the powers of matrices used in the PFA constructed in  increase, the PFA valuation tends towards the limit value $\lambda$. Therefore, this $\lambda$ is always non-isolated and hence the cutpoint isolation problem for such a constructed PFA and $\lambda$ is decidable. However, determining if the PFA ever exactly reaches $\lambda$ is undecidable. So, there is no contradiction with the results of this paper. Our main result is stated as follows.
Theorem 1.
The cutpoint isolation problem for probabilistic finite automata where inputs are constrained to a given letter-monotonic context-free language is decidable.
The proof of Theorem 1 is found in Section 3. Our proof technique for showing the decidability of cutpoint isolation for PFA on letter-monotonic languages uses the following crucial facts. If a PFA over a letter-monotonic context-free language can approach some given cutpoint $\lambda$ arbitrarily closely, then the PFA can reach $\lambda$ exactly if we allow a subset of the matrices to be taken to one of their ‘limiting powers’. We use the property that each limiting power (of which there may be finitely many) of a stochastic matrix can be computed in polynomial time (see Lemma 4), as well as a crucial property from linear algebra that dominant eigenvalues (those of largest magnitude) of a stochastic matrix are necessarily of magnitude $1$, are roots of unity, and have equal geometric and algebraic multiplicities (see Lemma 3). Since the input words of the PFA come from a letter-monotonic CFL, we also use the fact that a letter-monotonic language is context-free if and only if its Parikh image is a stratified semilinear set (see Proposition 6).
The combination of these ideas allows us to derive Algorithm 1, which works as follows. We initially set all variables as free (rather than fixed), and compute the Parikh image $p(L)$ of the given letter-monotonic CFL $L$. Using the fact that $p(L)$ is a semilinear set, we compute which letters can be taken to arbitrarily high powers and which letters have fixed finite values. We then use the technical Proposition 7, which states that if we can reach $\lambda$ then we can either do so by setting all free variables to an infinite power, or else we can compute an integer $C$ such that the value of one of the free variables must be less than $C$. We then either set all free variables as in the first case, or nondeterministically choose one of the free variables and assign it a value less than $C$ in the latter case. In the second case we also update the semilinear set and repeat the above procedure until no free variables remain. Finally, we verify that the PFA has exactly the value $\lambda$ for the chosen values of the variables.
The crucial Proposition 7 is somewhat technical, but relies on splitting a product of stochastic matrices into a summation involving dominant and subdominant eigenvalues (a subdominant eigenvalue being one with magnitude strictly less than $1$) and then applying the spectral decomposition or Jordan normal form of each stochastic matrix in order to derive the constant $C$ which bounds the value of one of the free variables. The provided algorithm is nondeterministic in nature, and we do not currently have an upper bound on its complexity. We do however provide the following lower bound via an adaptation of a proof technique from .
Theorem 2.
Cutpoint isolation is NP-hard for -state PFA on letter-monotonic languages.
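Notation. We denote by $\mathbb{F}^{n \times n}$ the set of all $n \times n$ matrices over some field $\mathbb{F}$. We will primarily be interested in rational matrices. We define the spectrum (set of eigenvalues) of $M \in \mathbb{F}^{n \times n}$ as $\sigma(M) = \{\lambda_1, \ldots, \lambda_n\}$, arranged in monotonically nonincreasing order of absolute value, i.e., $|\lambda_i| \geq |\lambda_{i+1}|$ for all $1 \leq i < n$, and we define $\sigma_1(M)$ as the set of eigenvalues of $M$ of absolute value $1$. We call eigenvalues in $\sigma_1(M)$ dominant eigenvalues and eigenvalues in $\sigma_{<1}(M)$, those of absolute value strictly less than $1$, subdominant eigenvalues.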
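Given $A \in \mathbb{F}^{m \times m}$ and $B \in \mathbb{F}^{n \times n}$ we define the direct sum $A \oplus B$ of $A$ and $B$ by:
$$A \oplus B = \begin{pmatrix} A & \mathbf{0}_{m,n} \\ \mathbf{0}_{n,m} & B \end{pmatrix},$$
where $\mathbf{0}_{i,j}$ is the $i \times j$ zero matrix.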
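As a small illustration of the direct sum, the following sketch builds the block-diagonal matrix for two square matrices given as lists of rows. The function name `direct_sum` and the use of exact `Fraction` entries are choices made here for illustration only.

```python
from fractions import Fraction


def direct_sum(a, b):
    """Block-diagonal matrix with a in the top-left and b in the bottom-right."""
    m, n = len(a), len(b)
    # Rows of a are padded on the right with n zeros.
    top = [list(row) + [Fraction(0)] * n for row in a]
    # Rows of b are padded on the left with m zeros.
    bottom = [[Fraction(0)] * m + list(row) for row in b]
    return top + bottom


A = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(0), Fraction(1)]]
B = [[Fraction(1)]]
C = direct_sum(A, B)
```

Note that the direct sum of two row stochastic matrices is again row stochastic, since every row keeps its original entries and gains only zeros.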
We use a nonstandard form of Dirac bra-ket notation in several calculations, to simplify the notation in some complex formulae. If $u = (u_1, u_2, \ldots, u_n)^T$ is a column vector, then we write $\langle u| = u^T$ and $|u\rangle = u$, where $u^T$ denotes the transpose of $u$, i.e., $\langle u| = (u_1, u_2, \ldots, u_n)$. Note that Dirac bra-ket notation ordinarily defines that $\langle u| = u^*$, where $u^*$ denotes the conjugate transpose of $u$, however we will not use this notion at any point. Note that $|u\rangle\langle v|$ is just a rank 1 matrix $uv^T$. We use $\langle e_i|$ and $|e_i\rangle$ to denote the $i$’th basis row/column vector respectively.
Algebraic numbers. A complex number $\alpha$ is algebraic if it is a root of a polynomial $p \in \mathbb{Z}[x]$. The defining polynomial $p_\alpha \in \mathbb{Z}[x]$ for $\alpha$ is the unique polynomial of least degree with positive leading coefficient such that the coefficients of $p_\alpha$ do not have a common factor and $p_\alpha(\alpha) = 0$. The degree and height of $\alpha$ are defined to be that of $p_\alpha$.
In order to do computations with algebraic numbers we use their standard representations. Namely, an algebraic number $\alpha$ can be represented by its defining polynomial $p_\alpha$ and a sufficiently good complex rational approximation. More precisely, $\alpha$ will be represented by a tuple $(p_\alpha, a, b, r)$, where $p_\alpha$ is the defining polynomial for $\alpha$ and $a, b, r \in \mathbb{Q}$ are such that $\alpha$ is the unique root of $p_\alpha$ inside the circle in $\mathbb{C}$ with centre $a + bi$ and radius $r$. As shown in , if $\alpha \neq \beta$ are roots of $p \in \mathbb{Z}[x]$, then $|\alpha - \beta| > \frac{\sqrt{6}}{d^{(d+1)/2} H^{d-1}}$, where $d$ and $H$ are the degree and height of $p$, respectively. So, if we require $r$ to be smaller than half of this bound, the above representation is well-defined.
Let $\|\alpha\|$ be the size of the standard representation of $\alpha$, that is, the total bit size of $a$, $b$, $r$ and the coefficients of $p_\alpha$. It is a well-known fact that for given algebraic numbers $\alpha$ and $\beta$, one can compute $\alpha + \beta$, $\alpha\beta$ and $\alpha/\beta$ (for $\beta \neq 0$) in time polynomial in $\|\alpha\| + \|\beta\|$, and one can compute $\overline{\alpha}$ and $|\alpha|$ and decide whether $\alpha = 0$ in time polynomial in $\|\alpha\|$. Moreover, for a real algebraic $\alpha$, deciding whether $\alpha > 0$ can be done in time polynomial in $\|\alpha\|$. Finally, there is a polynomial time algorithm that for a given $p \in \mathbb{Z}[x]$ computes the standard representations of all roots of $p$. For more information on efficient algorithmic computations with algebraic numbers the reader is referred to [1, 8, 16, 19].
Spectral decomposition and Jordan normal forms. We will use both the spectral decomposition theorem and the Jordan normal form of stochastic matrices in later proofs. For background, see .
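Let $M_t \in \mathbb{Q}^{n \times n}$ be a matrix (we use notation $M_t$ since it will prove useful in the proof of Proposition 7), and let $\lambda_{t,1}, \ldots, \lambda_{t,n_t}$ be the eigenvalues of $M_t$ listed according to their geometric multiplicities. (Note that $n_t$ is the number of linearly independent eigenvectors of $M_t$, or the number of Jordan blocks in the Jordan normal form of $M_t$. The matrix $M_t$ is diagonalizable if and only if $n_t = n$. Jordan normal forms are unique up to permutations of the Jordan blocks.) Then $M_t$ can be written in Jordan normal form $M_t = S_t J_t S_t^{-1}$, where $S_t$ is an invertible matrix ($\det S_t \neq 0$) and $J_t = \bigoplus_{j=1}^{n_t} J_{d_{t,j}}(\lambda_{t,j})$, where $J_{d_{t,j}}(\lambda_{t,j})$ is a Jordan block for $1 \leq j \leq n_t$, with $n_t$ the number of Jordan blocks of $M_t$ and $d_{t,j}$ the size of the Jordan block corresponding to eigenvalue $\lambda_{t,j}$, such that $\sum_{j=1}^{n_t} d_{t,j} = n$. Jordan block $J_d(\lambda)$ corresponds to the eigenvalue $\lambda$ of $M_t$ and has the form:
$$J_d(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & \cdots & 0 & \lambda \end{pmatrix} \in \mathbb{C}^{d \times d}.$$
The matrix $S_t$ contains the generalised eigenvectors of $M_t$. Noting that $\binom{k}{j} = 0$ if $j > k$, we now see that
$$J_d(\lambda)^k = \begin{pmatrix} \lambda^k & \binom{k}{1}\lambda^{k-1} & \binom{k}{2}\lambda^{k-2} & \cdots & \binom{k}{d-1}\lambda^{k-d+1} \\ 0 & \lambda^k & \binom{k}{1}\lambda^{k-1} & \cdots & \binom{k}{d-2}\lambda^{k-d+2} \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda^k & \binom{k}{1}\lambda^{k-1} \\ 0 & 0 & \cdots & 0 & \lambda^k \end{pmatrix}. \qquad (1)$$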
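The spectral decomposition of a matrix is a special case of the Jordan normal form. Namely, any diagonalizable matrix $M = SJS^{-1} \in \mathbb{Q}^{n \times n}$, with $J$ diagonal, can be written as
$$M = \sum_{j=1}^{n} \lambda_j |u_j\rangle\langle v_j|,$$
where $\{\lambda_1, \ldots, \lambda_n\}$ is the set of eigenvalues of $M$, $|u_j\rangle$ is the $j$’th column of $S$ and $\langle v_j|$ is the $j$’th row of $S^{-1}$. Thus we have
$$M^k = \sum_{j=1}^{n} \lambda_j^k |u_j\rangle\langle v_j|. \qquad (2)$$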
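The closed form for powers of a Jordan block can be checked numerically. The sketch below is illustrative only: the helper names are chosen here, and the closed form used is the standard one in which entry $(i, j)$ of the $k$'th power of a $d \times d$ Jordan block with eigenvalue $\lambda$ equals $\binom{k}{j-i}\lambda^{k-(j-i)}$ for $j \geq i$ and $0$ otherwise. It compares that formula against repeated multiplication using exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def jordan_block(lam, d):
    """d x d Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    block = [[Fraction(0)] * d for _ in range(d)]
    for i in range(d):
        block[i][i] = Fraction(lam)
        if i + 1 < d:
            block[i][i + 1] = Fraction(1)
    return block


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(a, k):
    """k-th power by repeated multiplication (fine for small k)."""
    n = len(a)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, a)
    return result


def jordan_block_power(lam, d, k):
    """Closed form: entry (i, j) is C(k, j-i) * lam^(k-(j-i)) when 0 <= j-i <= k."""
    lam = Fraction(lam)
    power = [[Fraction(0)] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            shift = j - i
            if shift <= k:
                power[i][j] = comb(k, shift) * lam ** (k - shift)
    return power


closed = jordan_block_power(Fraction(1, 2), 3, 5)
direct = mat_pow(jordan_block(Fraction(1, 2), 3), 5)
growing = jordan_block_power(1, 2, 7)  # eigenvalue 1, block of size 2
```

The last line shows why a dominant eigenvalue with a Jordan block of size larger than $1$ would be problematic: the off-diagonal entry of the $k$'th power equals $k$ and therefore grows without bound.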
We will also require the following technical lemma concerning the dominant eigenvalues of stochastic matrices.
Lemma 3 ([11, Theorem 6.5.3]).
Let $\lambda$ be a dominant eigenvalue of a stochastic matrix $M \in \mathbb{Q}^{n \times n}$. Then $\lambda$ is a root of unity of order no more than $n$. Moreover, the geometric multiplicity of $\lambda$ is equal to its algebraic multiplicity. In other words, the Jordan blocks that correspond to $\lambda$ have size $1 \times 1$.
Remark. Here is a simple proof of the last statement of Lemma 3. Let $J$ be the Jordan normal form of $M$ and $S$ be a transformation matrix such that $M = SJS^{-1}$. Then we have $M^k = SJ^kS^{-1}$. Suppose there is an eigenvalue $\lambda$ of $M$ which is a root of unity but whose Jordan block has size larger than $1$. Then $J^k$ has at least one unbounded entry as $k \to \infty$. On the other hand, $M^k$ is stochastic for any $k$, in particular all entries of $M^k$ belong to $[0,1]$. It follows that the entries of $S^{-1}M^kS$ are also bounded as $k \to \infty$. Since $J^k = S^{-1}M^kS$, we get a contradiction.
We also require the following lemma about the limit set of powers of a stochastic matrix.
Lemma 4.
For any stochastic matrix $M \in \mathbb{Q}^{n \times n}$, the sequence $(M^k)_{k \geq 1}$ has a finite number of limits. Namely, there exists a computable constant $d$ such that, for each $r = 0, \ldots, d-1$, the subsequence $(M^{dm+r})_{m \geq 1}$ converges to a limit, and this limit can be computed in polynomial time given $M$ and $r$.
Proof. Let $M \in \mathbb{Q}^{n \times n}$ be a stochastic matrix. As shown in , we can compute in polynomial time the Jordan normal form $J$ of $M$ and a transformation matrix $S$ such that $M = SJS^{-1}$. Note that $M$ may have complex eigenvalues, so all computations are done using standard representations of algebraic numbers.
By Lemma 3, all dominant eigenvalues of $M$ are roots of unity of orders no more than $n$, and their Jordan blocks have size $1 \times 1$. If $\lambda$ is a root of unity of order $m$, then $(\lambda^k)_{k \geq 1}$ is a periodic sequence with period $m$. On the other hand, if $J(\mu)$ is a Jordan block corresponding to an eigenvalue $\mu$ such that $|\mu| < 1$, then $\lim_{k \to \infty} J(\mu)^k$ is equal to the zero matrix.
Let $d$ be the least common multiple of the orders of the roots of unity among the eigenvalues of $M$. Now if $\lambda$ is a dominant eigenvalue of $M$, then the values of $\lambda^{dm+r}$ do not depend on $m$, where $0 \leq r < d$. Hence $J^{dm+r}$ converges to a limit when $m \to \infty$. This limit is equal to a matrix $J_r$ obtained from $J$ by replacing all dominant $\lambda$ with $\lambda^r$ and all Jordan blocks corresponding to subdominant eigenvalues with zero matrices. So, $\lim_{m \to \infty} M^{dm+r} = SJ_rS^{-1}$.
This shows that $(M^k)_{k \geq 1}$ has at most $d$ limits. Finally, we note that $d$ may be exponential in the dimension of $M$. However, if $(M^k)_{k \geq 1}$ has a single limit, then this limit can be computed in polynomial time. ∎
For a stochastic matrix $M$, we will use notation $M^\omega$ to denote the set of all limits of the sequence $(M^k)_{k \geq 1}$. If $(M^k)_{k \geq 1}$ has a single limit $A$, then we will identify the set $\{A\}$ with the matrix $A$ and write $M^\omega = A$.
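To make the notion of several limits concrete, here is a minimal sketch with a hand-picked $4 \times 4$ stochastic matrix, chosen here for illustration and not taken from the paper. It is the direct sum of a swap matrix (eigenvalues $1$ and $-1$) and an absorbing chain (eigenvalues $1$ and $1/2$), so the roots of unity among its eigenvalues have orders $1$ and $2$ and the period is $2$. The even and odd powers converge to two different limits, which are written down by hand below.

```python
from fractions import Fraction


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(a, k):
    n = len(a)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, a)
    return result


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices of equal shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


half = Fraction(1, 2)
M = [[Fraction(x) for x in row] for row in
     [[0, 1, 0, 0],
      [1, 0, 0, 0],
      [0, 0, half, half],
      [0, 0, 0, 1]]]

# Hand-computed limits of the even and odd subsequences of powers.
even_limit = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
odd_limit = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]

err_even = max_abs_diff(mat_pow(M, 40), even_limit)
err_odd = max_abs_diff(mat_pow(M, 41), odd_limit)
```

In this example the error after $k$ steps is exactly $2^{-k}$, coming from the subdominant eigenvalue $1/2$, while the swap block alternates forever and is responsible for the two distinct limits.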
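Probabilistic Finite Automata (PFA). A PFA $\mathcal{P}$ with $n$ states over an alphabet $\Sigma$ is defined as $\mathcal{P} = (u, \{M_a \mid a \in \Sigma\}, v)$, where $u \in \mathbb{Q}^n$ is the initial probability distribution; $v \in \{0,1\}^n$ is the final state vector and each $M_a \in \mathbb{Q}^{n \times n}$ is a (row) stochastic matrix. For a word $w = w_1 w_2 \cdots w_k \in \Sigma^*$, we define the acceptance probability $f_{\mathcal{P}} : \Sigma^* \to \mathbb{Q}$ of $\mathcal{P}$ as:
$$f_{\mathcal{P}}(w) = u^T M_{w_1} M_{w_2} \cdots M_{w_k} v,$$
which denotes the acceptance probability of $w$. (Some authors interchange the order of $u$ and $v$ and use column stochastic matrices, although the two definitions are trivially isomorphic.)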
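The acceptance probability is a vector-matrix product taken along the input word. The following sketch assumes the usual convention of an initial row distribution, one row stochastic matrix per letter, and a 0/1 final state vector. The two-state automaton and the function name `acceptance_probability` are made up for this example.

```python
from fractions import Fraction


def acceptance_probability(initial, matrices, final, word):
    """Multiply the initial row vector by one matrix per letter, then by the final vector."""
    row = list(initial)
    for letter in word:
        m = matrices[letter]
        row = [sum(row[i] * m[i][j] for i in range(len(row)))
               for j in range(len(m[0]))]
    return sum(x * y for x, y in zip(row, final))


half = Fraction(1, 2)
initial = [Fraction(1), Fraction(0)]   # start in state 0
final = [0, 1]                         # state 1 is accepting
matrices = {
    "a": [[half, half], [Fraction(0), Fraction(1)]],
    "b": [[Fraction(1), Fraction(0)], [half, half]],
}

p_empty = acceptance_probability(initial, matrices, final, "")
p_a = acceptance_probability(initial, matrices, final, "a")
p_aa = acceptance_probability(initial, matrices, final, "aa")
p_ab = acceptance_probability(initial, matrices, final, "ab")
```

Keeping a single row vector and updating it letter by letter avoids forming the full matrix product, which is enough when only the acceptance probability of one word is needed.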
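For any $\lambda \in [0,1]$ and PFA $\mathcal{P}$ over alphabet $\Sigma$, we define a cutpoint language to be: $L_{\geq\lambda}(\mathcal{P}) = \{w \in \Sigma^* \mid f_{\mathcal{P}}(w) \geq \lambda\}$, and a strict cutpoint language $L_{>\lambda}(\mathcal{P})$ by replacing $\geq$ with $>$. The (strict) emptiness problem for a cutpoint language is to determine if $L_{\geq\lambda}(\mathcal{P}) = \emptyset$ (resp. $L_{>\lambda}(\mathcal{P}) = \emptyset$). The focus of the present paper is on the cutpoint isolation problem which we now define.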
Problem 5 (Cutpoint isolation).
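Given a PFA $\mathcal{P}$ with cutpoint $\lambda \in [0,1]$, determine if for each $\epsilon > 0$ there exists some $w \in \Sigma^*$ such that $|f_{\mathcal{P}}(w) - \lambda| < \epsilon$.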
Let $\Sigma = \{a_1, a_2, \ldots, a_\ell\}$ be an alphabet with $\ell$ distinct letters. A language $L \subseteq \Sigma^*$ is called letter-monotonic if $L \subseteq a_1^* a_2^* \cdots a_\ell^*$. If $L$ is letter-monotonic and also a context-free language, then it is called a letter-monotonic context-free language. We are interested in cutpoint isolation for PFA whose inputs come from a given letter-monotonic context-free language.
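Whether a word has the letter-monotonic shape, that is, consists of a block of the first letter, then a block of the second letter, and so on, can be tested with a regular expression. The sketch below assumes single-character letters given in their fixed order. The function name `exponent_vector` is hypothetical, and the function returns the block lengths when the word has the required shape.

```python
import re


def exponent_vector(word, letters):
    """Return the block lengths (k_1, ..., k_l) if word is a_1^k_1 ... a_l^k_l, else None."""
    # One capturing group per letter, e.g. (a*)(b*)(c*) for letters "abc".
    pattern = "".join("(" + re.escape(letter) + "*)" for letter in letters)
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    return tuple(len(group) for group in match.groups())


ok = exponent_vector("aabccc", "abc")
skipped = exponent_vector("cc", "abc")
empty = exponent_vector("", "abc")
bad = exponent_vector("aba", "abc")
```

Because the letters are distinct, the decomposition into blocks is unique, so the returned tuple is well defined whenever the word matches.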
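For a letter-monotonic language $L \subseteq a_1^* a_2^* \cdots a_\ell^*$, define its Parikh image as
$$p(L) = \{(k_1, \ldots, k_\ell) \in \mathbb{N}^\ell \mid a_1^{k_1} a_2^{k_2} \cdots a_\ell^{k_\ell} \in L\}.$$
(In general, the Parikh image of $L \subseteq \Sigma^*$ is defined as $\{(|w|_{a_1}, \ldots, |w|_{a_\ell}) \mid w \in L\}$, where $|w|_{a}$ is the number of occurrences of the letter $a$ in $w$.)
Recall that a subset $Q \subseteq \mathbb{N}^\ell$ is called linear if there are vectors $\vec{v}_0, \vec{v}_1, \ldots, \vec{v}_r \in \mathbb{N}^\ell$ such that
$$Q = \{\vec{v}_0 + n_1\vec{v}_1 + \cdots + n_r\vec{v}_r \mid n_1, \ldots, n_r \in \mathbb{N}\}.$$
We say that a linear set $Q$ is stratified if for each $i \geq 1$ the vector $\vec{v}_i$ has at most two nonzero coordinates, and for any $i, j \geq 1$ if both $\vec{v}_i$ and $\vec{v}_j$ have two nonzero coordinates, $i_1 < i_2$ and $j_1 < j_2$, respectively, then their order is not $i_1 < j_1 < i_2 < j_2$, i.e., they are not interlaced. A finite union of linear sets is called a semilinear set, and a finite union of stratified linear sets is called a stratified semilinear set.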
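The stratification condition is purely combinatorial and can be checked directly on the period vectors of a linear set. The following sketch uses hypothetical function names and 0-based coordinates. It rejects a list of period vectors if some vector has more than two nonzero coordinates or if two vectors with two nonzero coordinates each are interlaced.

```python
def support(vector):
    """Indices of the nonzero coordinates, in increasing order."""
    return [i for i, x in enumerate(vector) if x != 0]


def is_stratified(periods):
    """Check the stratification condition on the period vectors of a linear set."""
    pairs = []
    for vector in periods:
        nonzero = support(vector)
        if len(nonzero) > 2:
            return False
        if len(nonzero) == 2:
            pairs.append((nonzero[0], nonzero[1]))
    # Ordered pairs of period vectors cover both possible interlacing orders.
    for (a, b) in pairs:
        for (c, d) in pairs:
            if a < c < b < d:
                return False
    return True


nested = is_stratified([(1, 0, 0, 1), (0, 1, 1, 0)])
interlaced = is_stratified([(1, 0, 1, 0), (0, 1, 0, 1)])
too_wide = is_stratified([(1, 1, 1)])
```

The nested example corresponds to exponent vectors $(n, m, m, n)$, as in the context-free language $\{a^n b^m c^m d^n\}$, while the interlaced example corresponds to $(n, m, n, m)$, as in $\{a^n b^m c^n d^m\}$, which is not context-free.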
We will need the following classical facts about context-free languages.
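Proposition 6.
If $L$ is a context-free language, then its Parikh image $p(L)$ is a semilinear set that can be effectively constructed from the definition of $L$. Moreover, a letter-monotonic language is context-free if and only if its Parikh image is a stratified semilinear set.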
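Probabilistic Finite Automata on Letter-Monotonic Languages. Let $M_1, \ldots, M_\ell \in \mathbb{Q}^{n \times n}$ be row stochastic matrices. Let $u \in \mathbb{Q}^n$ be a stochastic vector (the initial vector) and $v \in \{0,1\}^n$ (the final state vector). Let $L \subseteq a_1^* a_2^* \cdots a_\ell^*$ be a letter-monotonic context-free language, and let $\lambda \in [0,1]$ be a cutpoint for which we want to decide if it is isolated or not, that is, whether $\lambda$ belongs to the closure of $\{u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v \mid a_1^{k_1} \cdots a_\ell^{k_\ell} \in L\}$.
If $\lambda$ is not isolated, then there are two scenarios: either there exists $a_1^{k_1} \cdots a_\ell^{k_\ell} \in L$ such that $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v = \lambda$, or else $\lambda$ is never reached but only approached arbitrarily closely. In the second case there is a sequence of tuples $(k_{1,i}, \ldots, k_{\ell,i})_{i \geq 1}$ with $a_1^{k_{1,i}} \cdots a_\ell^{k_{\ell,i}} \in L$ such that
$$\lim_{i \to \infty} u^T M_1^{k_{1,i}} \cdots M_\ell^{k_{\ell,i}} v = \lambda$$
and, furthermore, for every $t$, either $k_{t,i} = k_t$ for all $i$, i.e. $k_{t,i}$ is fixed, or $k_{t,i}$ is strictly increasing and $M_t^{k_{t,i}}$ converges to a limit as $i \to \infty$. It follows that if $\lambda$ is not isolated, then there exists a choice of variables $k_1, \ldots, k_\ell \in \mathbb{N} \cup \{\omega\}$ such that
$$\lambda \in u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v.$$
Note that if $k_t = \omega$, then $M_t^{k_t}$ is a finite set. In this case we substitute all limits of $M_t^{k}$ in the above formula, and so $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v$ also becomes a finite set. (If all $k_t$ are finite or if all limits are unique, then we identify the number $x$ with the one-element set $\{x\}$.)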
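The second scenario, where a cutpoint is approached but never reached by any finite word, already occurs for a one-letter automaton. The sketch below uses a two-state absorbing chain chosen here for illustration: after $k$ steps the acceptance value is $1 - 2^{-k}$, so the cutpoint $1$ is not isolated, is not attained for any finite $k$, and is attained exactly once the matrix is replaced by its limit.

```python
from fractions import Fraction

half = Fraction(1, 2)
M = [[half, half], [Fraction(0), Fraction(1)]]   # state 1 is absorbing
u = [Fraction(1), Fraction(0)]                   # start in state 0
v = [0, 1]                                       # accept in state 1
cutpoint = Fraction(1)


def value_after(k):
    """Acceptance value after reading the single letter k times."""
    row = list(u)
    for _ in range(k):
        row = [sum(row[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(x * y for x, y in zip(row, v))


# Distance to the cutpoint for k = 1, ..., 10: it shrinks but never hits zero.
gaps = [cutpoint - value_after(k) for k in range(1, 11)]

# Hand-computed limit of the powers of M, and the value it yields.
limit_matrix = [[Fraction(0), Fraction(1)], [Fraction(0), Fraction(1)]]
limit_value = sum(u[i] * limit_matrix[i][j] * v[j]
                  for i in range(2) for j in range(2))
```

Here the powers have a single limit, so substituting it gives a single value rather than a finite set of values.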
3 Decidability of Cutpoint Isolation
In this section we will give a proof of Theorem 1 which is our main result. The crucial ingredient of our proof is the following technical proposition.
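Proposition 7.
Let $F \subseteq \{1, \ldots, \ell\}$ be indices, $\lambda \in [0,1]$ a cutpoint, and let $k_1, \ldots, k_\ell$ be such that $k_t$ is a free variable, for $t \in F$, and $k_t$ is assigned a fixed finite value, for $t \notin F$. Then
either $\lambda \in u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v$, where $k_t = \omega$ for all $t \in F$,
or else there exists a constant $C > 0$ such that $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v = \lambda$ implies $k_t < C$ for at least one $t \in F$.
Moreover, we can decide whether the first case holds and compute the constant $C$ in the second case.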
Let $L$ be a given letter-monotonic CFL. We start by considering all indices as free and iteratively fix them until no free indices remain. We first use Parikh and Ginsburg’s results (Proposition 6) to compute the Parikh image $p(L)$. Then we nondeterministically choose a linear subset $Q \subseteq p(L)$ and use it to determine the indices which can be taken to arbitrarily high values while staying within $Q$. These indices will correspond to the “free variables” in the algorithm.
Let $F$ be a set of such indices (these are the free indices tracked in Algorithm 1). We then set $k_t$ for $t \notin F$ to appropriate finite values, while $k_t$ with $t \in F$ remain free variables. We wish to determine if there is a choice of $k_t \in \mathbb{N} \cup \{\omega\}$ for $t \in F$ such that
$$\lambda \in u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v,$$
that is, whether $\lambda$ can be reached by setting each free variable either to some finite value or else to $\omega$, an “infinite” power. Proposition 7 then tells us that either all free variables should be set at $\omega$ in order to reach $\lambda$ (and this is decidable), or else there exists a computable constant $C$ such that if we can reach $\lambda$ by some choice of these free variables, then some $k_t < C$ for an index $t \in F$.
In the first case, we set all free variables to $\omega$. In the second case, we nondeterministically choose some free variable, fix its value in the range $\{0, 1, \ldots, C-1\}$ and then update our linear set $Q$ to satisfy a new constraint. The procedure repeats iteratively until all free variables have been assigned a fixed value. The algorithm then verifies if this choice of variables gives a solution.
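The first step of the procedure, reading off from a linear set which indices can grow without bound, can be sketched as follows. This is only an illustration under a simplifying assumption made here: an index is treated as free exactly when some period vector of the chosen linear set has a nonzero entry at that index, and all other indices keep the value given by the base vector. The function names and the 0-based indexing are not from the paper.

```python
def free_indices(periods, dimension):
    """Indices whose coordinate is nonzero in at least one period vector."""
    return sorted({i for vector in periods
                   for i in range(dimension) if vector[i] != 0})


def element(base, periods, coefficients):
    """The element base + sum of coefficient * period of the linear set."""
    point = list(base)
    for n, vector in zip(coefficients, periods):
        point = [p + n * x for p, x in zip(point, vector)]
    return tuple(point)


base = (1, 0, 2)
periods = [(1, 0, 1), (0, 0, 2)]

free = free_indices(periods, 3)
fixed = [i for i in range(3) if i not in free]
sample = element(base, periods, (5, 3))
```

In this example indices 0 and 2 are free and index 1 is fixed to the base value 0. Fixing a free variable to a concrete value, as done in the second case of the procedure, would then restrict the admissible coefficients, which is what updating the linear set refers to.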
4 Proof of Proposition 7
We begin with a proof sketch. Since each $M_t$ is stochastic, $\sigma_1(M_t)$ contains at least one eigenvalue $1$ and all other eigenvalues in $\sigma_1(M_t)$ are roots of unity by Lemma 3. All eigenvalues in $\sigma_{<1}(M_t)$ have absolute value strictly smaller than $1$. Our approach is to rewrite the expression
$$u^T M_1^{k_1} M_2^{k_2} \cdots M_\ell^{k_\ell} v \qquad (3)$$
into the sum of two terms (which will be denoted by $A_{\mathrm{dom}}$ and $A_{\mathrm{sub}}$) such that $A_{\mathrm{dom}}$ determines the limit behaviour as all free variables tend towards infinity, since they control only dominant eigenvalues, while $A_{\mathrm{sub}}$ is vanishing, since at least one free variable controls a subdominant eigenvalue. We can then reason that if all free variables simultaneously become larger, then Eqn (3) tends towards a set of computable limits with some vanishing terms. Therefore we can determine either that we can reach $\lambda$ when all free variables are $\omega$, or else we can prove that Eqn (3) is within any $\epsilon > 0$ of a limit value once all free variables are sufficiently large, which proves the proposition (by setting $\epsilon$ as less than the smallest difference between a limit value and $\lambda$). We now proceed with the formal details.
First, we consider the simpler case when all matrices are diagonalizable and then show how to extend this argument to the general case.
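Diagonalizable matrices. Let us first assume that all matrices $M_1, \ldots, M_\ell$ are diagonalizable. By the spectral decomposition theorem (see Eqn (2)), we may write a matrix $M_t^{k_t}$ as:
$$M_t^{k_t} = \sum_{j=1}^{n} \lambda_{t,j}^{k_t} |u_{t,j}\rangle\langle v_{t,j}|,$$
where $\lambda_{t,1}, \ldots, \lambda_{t,n}$ are the eigenvalues of $M_t$ repeated according to their multiplicities, and the vectors $|u_{t,j}\rangle$ and $\langle v_{t,j}|$, for $1 \leq j \leq n$, are related to the eigenvectors of $M_t$. Now, we can write:
$$u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v = \sum_{(j_1, \ldots, j_\ell) \in \{1, \ldots, n\}^\ell} \lambda_{1,j_1}^{k_1} \cdots \lambda_{\ell,j_\ell}^{k_\ell} \, \langle u|u_{1,j_1}\rangle \langle v_{1,j_1}|u_{2,j_2}\rangle \cdots \langle v_{\ell,j_\ell}|v\rangle.$$
Let us thus define $c_{j_1, \ldots, j_\ell} = \langle u|u_{1,j_1}\rangle \langle v_{1,j_1}|u_{2,j_2}\rangle \cdots \langle v_{\ell,j_\ell}|v\rangle$. The above sum can be split in two: the first summand containing terms where only dominant eigenvalues are to the power of free variables, and the second containing terms with at least one subdominant eigenvalue to the power of a free variable (these two terms are labelled $A_{\mathrm{dom}}$ and $A_{\mathrm{sub}}$ below). This is a useful decomposition since any term which contains a subdominant eigenvalue taken to the power of a free variable will tend towards zero as the values of all free variables (simultaneously) increase. Suppose $k_t \geq C$ for $t \in F$, where $C$ is some constant to be chosen later. We can write $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v = A_{\mathrm{dom}} + A_{\mathrm{sub}}$, where
$$A_{\mathrm{dom}} = \sum_{\substack{(j_1, \ldots, j_\ell) \\ \lambda_{t,j_t} \in \sigma_1(M_t) \text{ for all } t \in F}} c_{j_1, \ldots, j_\ell} \, \lambda_{1,j_1}^{k_1} \cdots \lambda_{\ell,j_\ell}^{k_\ell}, \qquad A_{\mathrm{sub}} = \sum_{\substack{(j_1, \ldots, j_\ell) \\ \lambda_{t,j_t} \in \sigma_{<1}(M_t) \text{ for some } t \in F}} c_{j_1, \ldots, j_\ell} \, \lambda_{1,j_1}^{k_1} \cdots \lambda_{\ell,j_\ell}^{k_\ell}.$$
By Lemma 3 the dominant eigenvalues are roots of unity, and so $A_{\mathrm{dom}}$ assumes only finitely many different values as $k_t$ with $t \in F$ vary, while $k_t$ with $t \notin F$ are fixed.
Suppose $A_{\mathrm{sub}}$ is not an empty sum since otherwise $A_{\mathrm{sub}} = 0$. Then there exist $t \in F$ and $j$ such that $\lambda_{t,j} \in \sigma_{<1}(M_t)$. Let $r$ be the maximum absolute value among such eigenvalues, that is,
$$r = \max\{|\lambda_{t,j}| \mid t \in F \text{ and } \lambda_{t,j} \in \sigma_{<1}(M_t)\} < 1.$$
Recall that $k_t \geq C$ for $t \in F$. Then $|A_{\mathrm{sub}}|$ can be estimated as follows: since for every choice of $(j_1, \ldots, j_\ell)$ in the summation there is $t \in F$ with $|\lambda_{t,j_t}| \leq r$ and $|\lambda_{t',j_{t'}}| \leq 1$ for all other $t'$, we have
$$|A_{\mathrm{sub}}| \leq r^{C} \sum_{(j_1, \ldots, j_\ell)} |c_{j_1, \ldots, j_\ell}|.$$
Notice that for any rational $\epsilon > 0$, we can compute $C$ such that $r^{C} \sum_{(j_1, \ldots, j_\ell)} |c_{j_1, \ldots, j_\ell}| < \epsilon$. Now, $A_{\mathrm{dom}}$ gives a finite number of limit values for $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v$. If $\lambda$ is not equal to any of them, then choose $\epsilon$ to be less than the minimal distance between $\lambda$ and those limit values. Using this $\epsilon$, we compute $C$ as above. By definition of $C$, if all $k_t \geq C$ for $t \in F$, then the distance between $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v$ and one of the limit values of $A_{\mathrm{dom}}$ is less than $\epsilon$. Thus $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v$ cannot be equal to $\lambda$ when all $k_t \geq C$ for $t \in F$. Hence if $u^T M_1^{k_1} \cdots M_\ell^{k_\ell} v = \lambda$, then there is $t \in F$ such that $k_t < C$.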
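The step of computing a suitable constant from a given tolerance can be sketched as a simple search. The assumptions here are made for illustration: the vanishing term is bounded by a quantity of the form `weight * r**C`, where `r` is a rational upper bound (strictly below 1) on the largest subdominant modulus and `weight` is a rational upper bound on the sum of the absolute values of the coefficients. In general these quantities are algebraic numbers, so rational upper bounds would have to be obtained first.

```python
from fractions import Fraction


def smallest_power_below(r, weight, epsilon):
    """Smallest integer C >= 0 with weight * r**C < epsilon, for 0 <= r < 1."""
    r, weight, epsilon = Fraction(r), Fraction(weight), Fraction(epsilon)
    if not 0 <= r < 1:
        raise ValueError("r must satisfy 0 <= r < 1")
    if weight < 0 or epsilon <= 0:
        raise ValueError("weight must be nonnegative and epsilon positive")
    c = 0
    bound = weight
    # The bound shrinks geometrically, so the loop terminates.
    while bound >= epsilon:
        bound *= r
        c += 1
    return c


C = smallest_power_below(Fraction(1, 2), 8, Fraction(1, 100))
```

With `r = 1/2`, `weight = 8` and `epsilon = 1/100` the search returns 10, since $8 \cdot 2^{-9} = 1/64$ is still at least $1/100$ while $8 \cdot 2^{-10} = 1/128$ is below it. The number of iterations is linear in the returned value, so for tiny tolerances a logarithm-based estimate would be faster, at the cost of leaving exact arithmetic.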
The general case. We now show how to extend the proof to the case when some matrices are non-diagonalizable.
Let be the sum of the sizes of the first Jordan blocks of matrix , so that etc. Then we see that by using Eqn (1), has the form
where and , with the basis vector. Here we used the property that for any . We may now compute that: