
The No Endmarker Theorem for One-Way Probabilistic Pushdown Automata

11/04/2021
by   Tomoyuki Yamakami, et al.

In various models of one-way pushdown automata, the explicit use of two designated endmarkers on a read-once input tape has proven to be extremely useful for making a conscious, final decision on the acceptance/rejection of each input word right after reading the right endmarker. With no endmarkers, by contrast, a machine must constantly stay in either accepting or rejecting states at every moment, since it never notices the end of the input instance. This situation, however, helps us analyze the behavior of a machine whose tape head makes consecutive moves along all prefixes of an extremely long input word. Since those two machine formulations have their own advantages, it is natural to ask whether the endmarkers are truly necessary to correctly recognize languages. In the deterministic and nondeterministic models, it is well known that the endmarkers are removable without changing the acceptance criteria of each input instance. This paper proves that, for the more general model of one-way probabilistic pushdown automata, the endmarkers are also removable. This is proven by employing probabilistic transformations from an "endmarker" machine to an equivalent "no-endmarker" machine at the cost of a double-exponential increase in state complexity, without compromising the error probability. By setting this error probability appropriately, our proof also provides an alternative proof for both the deterministic and the nondeterministic models.



1 With or Without Endmarkers, That is a Question

1.1 Two Different Formulations of One-Way Pushdown Automata

In automata theory, pushdown automata are regarded as one of the most fundamental architectures operated with finite-state controls. A pushdown automaton is, in general, a finite-controlled machine equipped with a special memory device, called a stack, in which information is stored and accessed in a last-in first-out manner. Here, we focus on one-way pushdown automata, whose tape heads either move to the right or stay still at any step. In particular, a nondeterministic variant of those machines, known as one-way nondeterministic pushdown automata (or 1npda's), characterizes the context-free languages. Similarly, one-way deterministic pushdown automata (or 1dpda's) characterize the deterministic context-free languages. These machines can be seen as special cases of a much more general model of one-way probabilistic pushdown automata (or 1ppda's). Macarie and Ogihara [8] discussed the computational complexity of languages recognized by 1ppda's with unbounded-error probability. In recent literature, Hromkovič and Schnitger [4] and Yamakami [15] further discussed the limitations of the power of bounded-error 1ppda's.

In textbooks and scientific papers, ignoring small variational deviations, various one-way pushdown automata have two distinct but widely used formulations. At first glance, those formulations look quite different, and it is not obvious that they are essentially "equivalent" in recognition power. In the first formalism, an input string over a fixed alphabet is initially placed on an input tape together with two designated endmarkers, ¢ and $, which mark both ends of the input string. The machine starts by scanning the left endmarker ¢ and moves its tape head to the right whenever it accesses an input symbol until it reaches the right endmarker $. In addition, the machine can proceed without reading any input symbol even after reading $. Such special steps are called ε-moves (or ε-transitions). Even if the machine would otherwise halt before reading the right endmarker by entering halting states, the use of ε-moves makes it possible to postpone the time to halt until the machine reads $. Whenever the machine enters a halting state (i.e., either an accepting state or a rejecting state), the machine is considered to halt and its computation terminates in acceptance or rejection depending on the type of halting state.

In the second formalism, by sharp contrast, the input tape contains only a given input string, without the two endmarkers, and the machine reads the input from left to right until it reads off the rightmost symbol. Since the machine needs to process the entire input string without knowing where the input ends, whenever the machine reads off the input, it is considered to "halt," and the current inner state of the machine determines the acceptance or rejection of the input.

For convenience, we call a model described by the first formalism an endmarker model and one described by the second formalism a no-endmarker model. Kaņeps, Geidmanis, and Freivalds [6], Hromkovič and Schnitger [4], and Yamakami [15] all used a no-endmarker model of 1ppda's (succinctly called no-endmarker 1ppda's) in their analyses of machines' behaviors, whereas an endmarker model of 1ppda's (called endmarker 1ppda's) was used by Macarie and Ogihara [8] and Yamakami [16]. It is commonly assumed that endmarker 1ppda's are equivalent in recognition power to no-endmarker 1ppda's. Unfortunately, there has been no "formal" proof of the equivalence between those two models that does not compromise the error probability.

The roles of the two endmarkers, the left endmarker ¢ and the right endmarker $, are clear. In the presence of the right endmarker $, in particular, when the machine reads $, it certainly notices the end of the input string and, based on this knowledge, it can make the final transition (followed by a possible series of ε-moves) before entering either accepting or rejecting states. In the presence of $, we can also make a series of ε-moves to empty the stack before halting. Without the right endmarker, however, the machine must always be in a "constantly halting" inner state (either an accepting state or a rejecting state). Such an inner state determines the acceptance or rejection of an input just after the machine reads off the entire input string and possibly makes a following series of ε-moves.

This halting condition certainly helps us analyze the behavior of the machine simply by tracing the changes of accepting and rejecting states over all prefixes of an extremely long input string. For instance, some of the results of Hromkovič and Schnitger [4] and Yamakami [15] were proven in this way.

For 1dpda’s as well as 1npda’s, Hopcroft and Ullman [3] used in their textbook the no-endmarker model to define pushdown automata whereas Lewis and Papadimitriou [7] suggested in their textbook the use of the endmarker model. For those basic pushdown automata, the endmarker model and the no-endmarker model are in fact well-known to be “equivalent” in their computational power. We succinctly call this assertion the no endmarker theorem throughout this paper.

For 1npda’s, for instance, we can easily convert one model to the other without compromising its acceptance/rejection criteria by first transforming a 1npda to its equivalent context-free grammar, converting it to Greibach Normal Form, and then translating it back to its equivalent 1npda (see, e.g., [3]).

The question of whether the endmarker models of one-way pushdown automata are computationally equivalent to the no-endmarker models of the same machine types is fundamental and also useful in the study of various types of one-way pushdown automata. This paper extends our attention to 1ppda's, a probabilistic variant of pushdown automata. For such 1ppda's, we wish to determine whether the no endmarker theorem is indeed true because, unfortunately, not all types of pushdown automata enjoy the no endmarker theorem. The right endmarker can be eliminated if our machine model is closed under right quotient with regular languages (see, e.g., [3]). Since 1dpda's and 1npda's satisfy such a closure property [2], the no endmarker theorem holds for those machine models. For counter machines (i.e., pushdown automata with single-letter stack alphabets except for the bottom marker), on the contrary, the endmarkers are generally unremovable [1]; more precisely, deterministic reversal-bounded multi-counter machines with endmarkers are generally more powerful than the same machines with no endmarkers. This latter fact demonstrates the crucial role of the endmarkers for pushdown automata. It is therefore important to discuss the removability of the endmarkers for bounded-error and unbounded-error 1ppda's.

1.2 The No Endmarker Theorem

This paper presents the proof of the no endmarker theorem for 1ppda’s with arbitrary error probability. More precisely, we prove the following statement, which allows us to safely remove the two endmarkers of 1ppda’s without compromising error probability.

Theorem 1.1

[No Endmarker Theorem] Let $\Sigma$ be any alphabet and let $\varepsilon$ be any error-bound parameter, viewed as a function from $\Sigma^{*}$ to the unit real interval $[0,1]$ (when $\varepsilon$ expresses a one-sided error, we can take the corresponding one-sided acceptance criterion instead). For any language $L$ over $\Sigma$, the following two statements are logically equivalent.

  1. There exists a 1ppda with two endmarkers that recognizes $L$ with error probability $\varepsilon(x)$ on every input $x$.

  2. There exists a 1ppda with no endmarkers that recognizes $L$ with error probability $\varepsilon(x)$ on every input $x$.

The direction from (1) to (2) of Theorem 1.1 asserts that we can safely eliminate the two endmarkers from each 1ppda without changing the original error probability. The invariance of this error probability is important because it makes it possible to apply our proof to both 1dpda's and 1npda's as special cases: we set $\varepsilon(x)=0$ for all $x$ in the deterministic case, and we make $\varepsilon(x)<1$ for all $x\in L$ and $\varepsilon(x)=0$ for all $x\notin L$ in the nondeterministic case. Thus, we obtain the well-known fact that the endmarkers are removable for 1dpda's and 1npda's.

Corollary 1.2

Theorem 1.1 also holds for 1dpda’s as well as 1npda’s.

Since the proof of Theorem 1.1 is constructive, it is possible to discuss the increase in the number of inner states and in the stack alphabet size in the construction of the new machines. Another important factor is the maximal length of the stack strings stored by "push" operations. We call this number the push size of the machine. Throughout this paper, these three factors are collectively and succinctly referred to as the stack-state complexity of transforming one model into the other.

For brevity, we say that two 1ppda’s (with or without endmarkers) are error-equivalent if their outputs agree with each other on all inputs with exactly the same error probability.

Proposition 1.3

Given an $n$-state no-endmarker 1ppda with stack alphabet size $m$ and push size $e$, there is an error-equivalent endmarker 1ppda of the same stack alphabet size $m$ and push size $e$ with at most $n+3$ states.

Proposition 1.4

Given an $n$-state endmarker 1ppda with arbitrary push size, there is an error-equivalent no-endmarker 1ppda whose state size and stack alphabet size are at most double-exponential in the size of the original machine but whose push size is $2$.

Since Theorem 1.1 follows directly from Propositions 1.3 and 1.4, we will concentrate our efforts on proving these propositions in the rest of this paper. From the propositions, we can observe that endmarker 1ppda's are likely to be double-exponentially more succinct in descriptional complexity than no-endmarker 1ppda's. We do not know, however, whether this bound can be significantly improved.

Organization of This Paper.

In Section 2, we will start with the formal definition of 1ppda's (and their variants, 1dpda's and 1npda's) with or without endmarkers. In Section 3, we will prove Proposition 1.3, and Proposition 1.4 will be proven in Section 5. To simplify the proof of Proposition 1.4, in Section 4 we will transform each standard 1ppda into another special 1ppda that does not halt before the right endmarker and takes a "push-pop-controlled" form, called an ideal shape, which turns out to be quite useful in proving various properties of languages. As a concrete example of this usefulness, we will demonstrate in Section 4.3 the closure property of the language families induced by bounded-error endmarker 1ppda's under "reversal." Lastly, a few open questions regarding the subjects of this paper will be discussed in Section 6.

2 Various One-Way Pushdown Automata

Let us review the two machine models of 1ppda's in detail, focusing on the use of the two endmarkers. A one-way probabilistic pushdown automaton (abbreviated as a 1ppda) runs essentially in a way similar to a one-way nondeterministic pushdown automaton (or a 1npda) except that, instead of making a nondeterministic choice at each step, a 1ppda randomly chooses one of all possible transitions and then branches out to produce multiple computation paths. The probability of such a computation path is determined by the series of random choices made along the computation path. The past literature has taken two different formulations of 1ppda's, with or without the two endmarkers. We will explain these two formulations in the subsequent subsections.

2.1 Numbers, Strings, and Languages

Let $\mathbb{N}$ denote the set of all natural numbers, including $0$, and set $\mathbb{N}^{+}$ to be $\mathbb{N}-\{0\}$. Given a number $n\in\mathbb{N}^{+}$, $[n]$ denotes the set $\{1,2,\ldots,n\}$. Given a set $S$, $\mathcal{P}(S)$ denotes the power set of $S$; namely, the set of all subsets of $S$.

An alphabet is a finite nonempty set of "letters" or "symbols." Given an alphabet $\Sigma$, a string over $\Sigma$ is a finite sequence of symbols in $\Sigma$. Conventionally, we use $\varepsilon$ to express the empty string as well as the pop operation for a stack. The length of a string $x$, expressed as $|x|$, is the total number of symbols in $x$. For a string $x=x_1x_2\cdots x_n$ over $\Sigma$ with $x_i\in\Sigma$ for each index $i\in[n]$, the reverse of $x$ is the string $x_nx_{n-1}\cdots x_1$ and is denoted by $x^{R}$.

The notation $\Sigma^{*}$ indicates the set of all strings over $\Sigma$, whereas $\Sigma^{+}$ expresses $\Sigma^{*}-\{\varepsilon\}$. A language over $\Sigma$ is a subset of $\Sigma^{*}$. For two languages $A$ and $B$, $AB$ denotes the concatenation of $A$ and $B$; that is, $AB=\{xy : x\in A, y\in B\}$. In particular, when $A=\{x\}$, we write $xB$ instead of $\{x\}B$. Similarly, when $B=\{y\}$, we use the notation $Ay$. For a number $n\in\mathbb{N}$, $\Sigma^{n}$ expresses the set $\{x\in\Sigma^{*} : |x|=n\}$. Moreover, for two sets $A$ and $B$, we use the notation $[A,B]$ to denote the set $\{[a,b] : a\in A, b\in B\}$ of "bracketed" ordered pairs.

2.2 The First Formulation

We start by explaining the first formulation of 1ppda's, whose input tapes are marked with the two designated endmarkers. We always assume that those endmarkers are not included in any input alphabet. As discussed in, e.g., [7], a 1ppda with two endmarkers (which we call an endmarker 1ppda) has a read-once semi-infinite input tape, on which an input string is initially placed, surrounded by the two endmarkers ¢ (left endmarker) and $ (right endmarker). To clarify the use of those two endmarkers, we explicitly include them in the description of a machine $M$ as $M=(Q,\Sigma,\{¢,\$\},\Gamma,\Theta_{\Gamma},\delta,q_0,\bot,Q_{acc},Q_{rej})$, where $Q$ is a finite set of inner states, $\Sigma$ is an input alphabet, $\Gamma$ is a stack alphabet, $\Theta_{\Gamma}$ is a finite subset of $\Gamma^{*}$ with $\Gamma\cup\{\varepsilon\}\subseteq\Theta_{\Gamma}$, $\delta:(Q-Q_{halt})\times\check{\Sigma}_{\varepsilon}\times\Gamma\times Q\times\Theta_{\Gamma}\to[0,1]$ (with $\check{\Sigma}=\Sigma\cup\{¢,\$\}$ and $\check{\Sigma}_{\varepsilon}=\check{\Sigma}\cup\{\varepsilon\}$) is a probabilistic transition function, $q_0\in Q$ is the initial state, $\bot\in\Gamma$ is the bottom marker, $Q_{acc}\subseteq Q$ is a set of accepting states, and $Q_{rej}\subseteq Q$ is a set of rejecting states, where $Q_{halt}$ denotes $Q_{acc}\cup Q_{rej}$. We always assume that $Q_{acc}\cap Q_{rej}=\emptyset$. For convenience, let $\Gamma^{(-)}=\Gamma-\{\bot\}$. The minimum positive integer $e$ for which $\Theta_{\Gamma}\subseteq\Gamma^{\leq e}$ (the set of strings over $\Gamma$ of length at most $e$) is called the push size of $M$. Each value $\delta(q,\sigma,a\mid p,w)$ expresses the probability that, when $M$ reads $\sigma$ on the input tape with $a$ at the top of the stack in inner state $q$, $M$ changes $q$ to another inner state $p$ and replaces $a$ by $w$. For clarity, we also express $\delta(q,\sigma,a\mid p,w)$ as a transition $(q,a)\xrightarrow{\sigma}(p,w)$ with the associated probability, because this emphasizes the circumstance that $(q,a)$ is changed into $(p,w)$ after scanning $\sigma$.

All tape cells of an input tape are indexed sequentially by natural numbers from left to right in such a way that ¢ is located in the $0$th cell (called the start cell), a given input $x$ is placed from the $1$st cell to the $|x|$th cell, and $ is in the $(|x|+1)$st cell.

We demand that $\bot$ cannot be removed from the stack; that is, whenever $\delta(q,\sigma,\bot\mid p,w)>0$, $w$ must have the form $w'\bot$ for a certain $w'\in(\Gamma^{(-)})^{*}$, and whenever $\delta(q,\sigma,a\mid p,w)>0$ for $a\in\Gamma^{(-)}$, $w$ must contain no occurrence of $\bot$. If $\sigma=\varepsilon$, then the tape head must stay still, and we call this transition an ε-move (or an ε-transition); otherwise, the tape head must move to the adjacent cell on the right. Notice that, unless making an ε-move, $M$ always moves its tape head in one direction, from left to right.

We remark that, in certain literature, a pushdown automaton is assumed to halt just after reading $ without making any extra ε-move. For a general treatment of 1ppda's, this paper allows the machine to make a (possible) series of ε-moves even after reading $ until it finally enters a halting state. After reading $, the tape head is considered to move off (or leave) the input region, namely, the tape cells containing $¢x\$$.

To deal with 1ppda's ε-moves, for any segment $y$ of an input $¢x\$$, we say that $y$ is completely read if $M$ reads all symbols in $y$, moves its tape head off the string $y$, and makes all (possible) ε-moves after reading the last symbol of $y$. At each step, $M$ probabilistically selects either an ε-move or a non-ε-move, or both. Conveniently, we define $\delta[q,\sigma,a]=\sum_{(p,w)\in Q\times\Theta_{\Gamma}}\delta(q,\sigma,a\mid p,w)$ for any triplet $(q,\sigma,a)$. We demand that $\delta$ satisfies the following probability requirement: $\delta[q,\sigma,a]+\delta[q,\varepsilon,a]=1$ for any $(q,\sigma,a)\in(Q-Q_{halt})\times\check{\Sigma}\times\Gamma$.
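To make the probability requirement concrete, the following Python sketch checks it for a transition table given in a hypothetical encoding of our own (a dictionary mapping (state, symbol, stack top) to lists of (probability, next state, pushed string) triples, with None playing the role of ε); none of these names come from the paper.

```python
from itertools import product

# Hypothetical encoding: delta[(q, sigma, a)] = [(prob, p, w), ...], where
# sigma is an input symbol, an endmarker, or None (an epsilon-move), a is
# the topmost stack symbol, and w is the string that replaces a.
def satisfies_probability_requirement(delta, nonhalting, sigma_check, gamma,
                                      tol=1e-9):
    """Check that delta[q, sigma, a] + delta[q, epsilon, a] = 1 for every
    non-halting state q, tape symbol sigma, and stack-top symbol a."""
    for q, s, a in product(nonhalting, sigma_check, gamma):
        total = sum(pr for pr, _, _ in delta.get((q, s, a), []))
        total += sum(pr for pr, _, _ in delta.get((q, None, a), []))
        if abs(total - 1.0) > tol:
            return False
    return True
```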

To describe the behaviors of a stack, we follow [14] for the basic terminology. A stack content means a series $z_1z_2\cdots z_k$ of stack symbols in $\Gamma$, which are stored in the stack sequentially from the topmost symbol $z_1$ down to the lowest symbol $z_k$ ($=\bot$). Since the bottom marker $\bot$ is never popped, we often say that the stack is empty if there is no symbol in the stack except for $\bot$.

A (surface) configuration of $M$ is a triplet $(q,i,w)$, which indicates that $M$ is in inner state $q$, $M$'s tape head scans the $i$th cell, and $M$'s stack contains $w$. The initial configuration of $M$ is $(q_0,0,\bot)$. An accepting (resp., a rejecting) configuration is a configuration with an accepting (resp., a rejecting) state, and a halting configuration is either an accepting or a rejecting configuration. Given a fixed input $x$, we say that a configuration $(p,j,w')$ follows another configuration $(q,i,w)$ with probability $\gamma$ if $w=az$ and $w'=uz$ for certain $a\in\Gamma$ and $u,z\in\Gamma^{*}$, and either $\sigma$ is the symbol in the $i$th cell of $¢x\$$, $j=i+1$, and $\gamma=\delta(q,\sigma,a\mid p,u)$, or $j=i$ and $\gamma=\delta(q,\varepsilon,a\mid p,u)$.

A computation path of length $k$ describes a history of $k$ consecutive "moves" chosen by $M$ on input $x$: it is a series of $k+1$ configurations that starts at the initial configuration and in which, for each index $i\in[k]$, the $i$th configuration follows the $(i-1)$th configuration with probability $\gamma_i$, ending at a halting configuration. To such a computation path, we assign the probability $\prod_{i\in[k]}\gamma_i$. A computation path is called accepting (resp., rejecting) if the path ends with an accepting configuration (resp., a rejecting configuration). Generally, a 1ppda may produce an extremely long computation path or even an infinite computation path; therefore, we must restrict our attention to finite computation paths.

Hromkovič and Schnitger [4] and Kaņeps, Geidmanis, and Freivalds [6] both used a model of 1ppda's whose computation paths all halt eventually (i.e., in finitely many steps) on every input. We also adopt this convention in the rest of this paper. (Even if we allow infinitely long computation paths that occur with relatively small probability, we can still simulate each of the two models by the other as in Theorem 1.1, and thus the theorem still holds.) In what follows, we always assume that all 1ppda's satisfy this requirement. Standard definitions of 1dpda's and 1npda's do not impose such a runtime bound, because we can easily convert those machines to ones that halt within $O(n)$ time (see, e.g., [3, 12, 14]).

The acceptance probability of $M$ on input $x$ is the sum of the probabilities of all accepting computation paths of $M$ starting with $¢x\$$ on its input tape. We express by $p_{M,acc}(x)$ the acceptance probability of $M$ on $x$. Similarly, we define $p_{M,rej}(x)$ to be the rejection probability of $M$ on $x$. Whenever $M$ is clear from the context, we often omit the script "$M$" entirely and write, e.g., $p_{acc}(x)$ instead of $p_{M,acc}(x)$. We further say that $M$ accepts (resp., rejects) $x$ if the acceptance probability $p_{acc}(x)$ is more than $1/2$ (resp., the rejection probability $p_{rej}(x)$ is at least $1/2$). Since all computation paths are assumed to halt in linear time, for any given string $x$, either $M$ accepts it or $M$ rejects it. The notation $L(M)$ stands for the set of all strings accepted by $M$; that is, $L(M)=\{x\in\Sigma^{*} : M\text{ accepts }x\}$.

Given a language $L$, we say that $M$ recognizes $L$ if $L(M)$ coincides with the set $L$. A 1ppda $M$ is said to make bounded error if there exists a constant $\varepsilon\in[0,1/2)$ (called an error bound) such that, for every input $x$, either $p_{acc}(x)\geq 1-\varepsilon$ or $p_{rej}(x)\geq 1-\varepsilon$; otherwise, we say that $M$ makes unbounded error.
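Because every computation path is assumed to halt, the acceptance probability $p_{acc}(x)$ can, in principle, be computed by exhaustively unfolding the configuration tree. The sketch below does this for the endmarker model, reusing the hypothetical encoding from above; it is an illustration under the stated termination assumption, not an algorithm from the paper.

```python
def acceptance_probability(delta, q0, bottom, x, halting, accepting,
                           lmark="¢", rmark="$"):
    """Sum the probabilities of all accepting computation paths of an
    endmarker 1ppda on input x, assuming every path eventually halts."""
    tape = [lmark] + list(x) + [rmark]
    total = 0.0

    def walk(q, pos, stack, prob):
        nonlocal total
        if q in halting:                 # the path terminates here
            if q in accepting:
                total += prob
            return
        top = stack[0]                   # topmost stack symbol
        for pr, p, w in delta.get((q, None, top), []):   # epsilon-moves
            if pr > 0:
                walk(p, pos, list(w) + stack[1:], prob * pr)
        if pos < len(tape):                              # read one symbol
            for pr, p, w in delta.get((q, tape[pos], top), []):
                if pr > 0:
                    walk(p, pos + 1, list(w) + stack[1:], prob * pr)

    walk(q0, 0, [bottom], 1.0)
    return total

# Bounded error then means: for some eps < 1/2, the value returned above is
# at least 1 - eps on every member of L and at most eps on every non-member.
```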

Regarding the behavioral equivalence of probabilistic machines, two 1ppda's $M_1$ and $M_2$ are said to be error-equivalent to each other if $L(M_1)=L(M_2)$ and $M_1$ simulates $M_2$ (and vice versa) with exactly the same error probability on every input. In this paper, we are also concerned with the descriptional complexity of machine models. We use the following three complexity measures to describe each machine $M$. The state size (or state complexity) of $M$ is $|Q|$, the stack alphabet size of $M$ is $|\Gamma|$, and the push size of $M$ is the maximum length of any string in $\Theta_{\Gamma}$. It then follows that $|\Theta_{\Gamma}|\leq (e+1)m^{e}$ if $M$ has stack alphabet size $m$ and push size $e$.

2.3 The Second Formulation

In comparison with the first formulation, let us consider 1ppda's with no endmarkers. Such machines are succinctly called no-endmarker 1ppda's, and they are naturally obtained from the definitions stated in Section 2.2 by removing the entire use of ¢ and $. For example, an input region now consists of the cells that contain $x$, instead of $¢x\$$. We express such a machine as $(Q,\Sigma,\Gamma,\Theta_{\Gamma},\delta,q_0,\bot,Q_{acc},Q_{rej})$, with no reference to ¢ and $. For such a no-endmarker 1ppda, its probabilistic transition function $\delta$ maps $(Q-Q_{halt})\times\Sigma_{\varepsilon}\times\Gamma\times Q\times\Theta_{\Gamma}$ to $[0,1]$, where $\Sigma_{\varepsilon}=\Sigma\cup\{\varepsilon\}$. The acceptance and rejection of such no-endmarker 1ppda's are determined by whether the 1ppda's are respectively in accepting states or in rejecting states just after reading off the entire input string and leaving the input region. To ensure this, $Q$ must be partitioned into $Q_{acc}$ and $Q_{rej}$.

It is important to note that every 1ppda in the initial state must read the leftmost symbol of a given non-empty input string written on the input tape at the first step. In particular, when the input is the empty string, any machine that has the right endmarker but no left endmarker can still read $ at the first step. In contrast, since a no-endmarker machine cannot "read" any blank symbol, we must allow such a machine to make an ε-move at the first step.

2.4 Deterministic and Nondeterministic Variants

Deterministic and nondeterministic variants of 1ppda's can easily be obtained by slightly modifying the definition of 1ppda's. To obtain a one-way deterministic pushdown automaton (or a 1dpda), we require $\delta(q,\sigma,a\mid p,w)\in\{0,1\}$ (instead of the unit real interval $[0,1]$) for all tuples $(q,\sigma,a,p,w)$. Similarly, we can obtain one-way nondeterministic pushdown automata (or 1npda's) if we pick each possible next move uniformly at random and take the following acceptance/rejection criteria: a 1npda accepts input $x$ if $p_{acc}(x)>0$ and rejects $x$ if $p_{acc}(x)=0$. It is not difficult to see that these "probabilistic" definitions of 1dpda's and 1npda's coincide with the "standard" definitions found in many textbooks.
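In the same hypothetical table encoding used earlier, the two specializations amount to simple predicates; the helper names are ours, not the paper's.

```python
def is_deterministic(delta):
    """1dpda restriction: every transition probability lies in {0, 1}, so
    at most one move is available from each (state, symbol, stack top)."""
    return all(pr in (0.0, 1.0)
               for moves in delta.values()
               for pr, _, _ in moves)

def npda_accepts(acceptance_prob):
    """1npda criterion: accept exactly when some accepting path exists,
    i.e., the acceptance probability under uniform choices is nonzero."""
    return acceptance_prob > 0
```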

3 From No-Endmarker 1ppda’s to Endmarker 1ppda’s

Proposition 1.3 asserts a linear increase in the stack-state complexity when transforming no-endmarker 1ppda's into error-equivalent endmarker 1ppda's. Although the transformation is rather easy, we briefly describe how to construct, from any no-endmarker 1ppda $N$, an error-equivalent endmarker 1ppda $M$.

Proof of Proposition 1.3.   Given any no-endmarker 1ppda $N=(Q,\Sigma,\Gamma,\Theta_{\Gamma},\delta,q_0,\bot,Q_{acc},Q_{rej})$, we construct an endmarker 1ppda $M=(Q',\Sigma,\{¢,\$\},\Gamma,\Theta_{\Gamma},\delta',q_0',\bot,\{q_{acc}\},\{q_{rej}\})$, where $Q'=Q\cup\{q_0',q_{acc},q_{rej}\}$ for a new initial state $q_0'$ and new halting states $q_{acc}$ and $q_{rej}$. It follows that $|Q'|=|Q|+3$ and that the stack alphabet and the push size do not change.

Intuitively speaking, the behavior of $M$ is described as follows. In the first move, $M$ uses the new initial state $q_0'$ to read the left endmarker ¢ and then enters $N$'s initial state $q_0$ (Line 1 below). Note that, while reading a non-$ symbol, by definition, $M$ does not halt even if it enters halting states of $N$. When $M$ is in a halting state of $N$ after reading the rightmost input symbol and making a series of (possible) ε-moves, $M$ reads $ and enters the new but corresponding halting state without changing the stack content (Line 3). Formally, we define the probabilistic transition function $\delta'$ as follows. Let $q,p\in Q$, $\sigma\in\Sigma_{\varepsilon}$, $a\in\Gamma$, and $w\in\Theta_{\Gamma}$.

  1. $\delta'(q_0',¢,\bot\mid q_0,\bot)=1$.

  2. $\delta'(q,\sigma,a\mid p,w)=\delta(q,\sigma,a\mid p,w)$ for all $q,p\in Q$, $\sigma\in\Sigma_{\varepsilon}$, $a\in\Gamma$, and $w\in\Theta_{\Gamma}$.

  3. $\delta'(q,\$,a\mid q_{acc},a)=1$ if $q\in Q_{acc}$, and $\delta'(q,\$,a\mid q_{rej},a)=1$ if $q\in Q_{rej}$.

It is not difficult to show that $M$ simulates $N$ correctly with the same error probability. This completes the proof of Proposition 1.3.
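The construction of $M$ from $N$ can be sketched directly: one new initial state consumes ¢, $N$'s transitions are kept verbatim, and two new halting states are entered upon $. The state names and table encoding below are our own illustration of Lines 1–3, not the paper's literal definition.

```python
def add_endmarkers(delta, q0, acc_states, rej_states, gamma,
                   lmark="¢", rmark="$"):
    """Wrap a no-endmarker 1ppda N into an endmarker 1ppda M."""
    NEW_INIT, ACC, REJ = "q0'", "p_acc", "p_rej"
    d = dict(delta)                          # Line 2: keep N's moves as-is
    for a in gamma:
        # Line 1: on ¢, move from the new initial state into N's q0,
        # leaving the stack unchanged (only the bottom marker is present).
        d[(NEW_INIT, lmark, a)] = [(1.0, q0, a)]
        # Line 3: on $, freeze N's current state (always accepting or
        # rejecting, since Q is partitioned) into a new halting state.
        for q in acc_states:
            d[(q, rmark, a)] = [(1.0, ACC, a)]
        for q in rej_states:
            d[(q, rmark, a)] = [(1.0, REJ, a)]
    return d, NEW_INIT, {ACC}, {REJ}
```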

4 Preparatory Machine Modifications

A key to our proof of Proposition 1.4 is the step that normalizes the wild behaviors of endmarker 1ppda's without compromising their error probability. This preparatory step is quite crucial, and it will help us prove the proposition significantly more easily in Section 5. In the nondeterministic model, it is always possible to eliminate all ε-moves of 1npda's and to limit the number of inner states to one (this is a byproduct of translating a context-free grammar in Greibach Normal Form into a corresponding 1npda). For 1ppda's, however, we can neither eliminate ε-moves nor reduce the number of inner states to one. Moreover, we cannot control the number of consecutive ε-moves. Despite all those difficulties, we can still curtail certain behaviors of 1ppda's so as to control the execution of push and pop operations during their computations. Such a push-pop-controlled form is called an "ideal shape."

For 1npda's and 1dpda's, there are a few precursors in this direction: Hopcroft and Ullman [3, Chapter 10] and Pighizzini and Pisoni [11, Section 5]. We intend to utilize their basic ideas and further expand them to prove that all 1ppda's can be transformed into their ideal-shape form.

4.1 No Halting Before Reading $

We first modify a given endmarker 1ppda into another one enjoying a certain nice property, without compromising its error probability. In particular, we want to eliminate the possibility of premature halting well before reading $; namely, we force the machine to halt only on or after reading $, following a (possible) series of subsequent ε-moves. Recall that a stack is empty when it contains only the bottom marker.

Lemma 4.1

Given any endmarker 1ppda $M$ with $n$ states, stack alphabet size $m$, and push size $e$, there exists an error-equivalent endmarker 1ppda $N$ having at most $3n+2$ states with the same stack alphabet size $m$ and push size $e$ such that $N$ halts only on or after reading $; moreover, when $N$ halts, its stack is empty.

The proof idea of Lemma 4.1 is to let $N$ simulate $M$ step by step until $M$ either enters a halting state or reads $. When $M$ enters a halting state before reading $, $N$ remembers this state, empties the stack, continues reading the remaining input symbols, and finally enters a true halting state. On the contrary, when $M$ reaches $, $N$ remembers the passing of $ and continues the simulation of $M$'s ε-moves. Once $M$ enters a halting state, $N$ remembers this state, empties the stack, and enters a true halting state.

For the clarity of the state complexity term of $N$ in Lemma 4.1, and also for the later use of the lemma's proof in Section 4.2, we include a detailed proof of the lemma.

Proof of Lemma 4.1.   Let $M=(Q,\Sigma,\{¢,\$\},\Gamma,\Theta_{\Gamma},\delta,q_0,\bot,Q_{acc},Q_{rej})$ be any 1ppda with $|Q|=n$, $|\Gamma|=m$, and push size $e$. In the following, we define the desired 1ppda $N$ that simulates $M$ with the same error probability. Firstly, we prepare a new non-halting inner state $\bar{q}$, which corresponds to each state $q$ in $Q$, and new accepting and rejecting states $q_{acc}$ and $q_{rej}$. We set $\bar{Q}=\{\bar{q} : q\in Q\}$ and assume $\bar{Q}\cap Q=\emptyset$. We also prepare $\hat{q}$ for each state $q$ in $Q$ and define $\hat{Q}=\{\hat{q} : q\in Q\}$. Moreover, we define $Q'=Q\cup\bar{Q}\cup\hat{Q}\cup\{q_{acc},q_{rej}\}$, $Q'_{acc}=\{q_{acc}\}$, $Q'_{rej}=\{q_{rej}\}$, and $N=(Q',\Sigma,\{¢,\$\},\Gamma,\Theta_{\Gamma},\delta',q_0,\bot,Q'_{acc},Q'_{rej})$. Recall from Section 2.2 the notation $\check{\Sigma}$ and $\check{\Sigma}_{\varepsilon}$. In what follows, we describe how to define $\delta'$, which aims at simulating $M$ step by step on a given input. Let us consider two cases, depending on whether or not $M$ reads $.

(1) When $M$ enters an accepting state (resp., a rejecting state), say, $p$ before reading $, $N$ first enters its associated state $\bar{p}$, empties the stack, and continues reading the rest of the input string deterministically. When $N$ finally reaches $ in state $\bar{p}$ with $p$ in $Q_{acc}$ (resp., $Q_{rej}$), $N$ changes $\bar{p}$ to $q_{acc}$ (resp., $q_{rej}$) and halts. Formally, $\delta'$ is set as follows. Let $q\in Q-Q_{halt}$, $p\in Q$, $\sigma\in\Sigma\cup\{¢\}$, $a\in\Gamma$, and $b\in\Gamma^{(-)}$.

  1. $\delta'(q,\sigma,a\mid p,w)=\delta(q,\sigma,a\mid p,w)$ if $p\notin Q_{halt}$.

  2. $\delta'(q,\sigma,a\mid\bar{p},w)=\delta(q,\sigma,a\mid p,w)$ if $p\in Q_{halt}$.

  3. $\delta'(\bar{p},\varepsilon,b\mid\bar{p},\varepsilon)=1$ if $p\in Q_{halt}$.

  4. $\delta'(\bar{p},\sigma',\bot\mid\bar{p},\bot)=1$ for all $\sigma'\in\Sigma$ if $p\in Q_{halt}$.

  5. $\delta'(\bar{p},\$,\bot\mid t_p,\bot)=1$, where $t_p=q_{acc}$ if $p\in Q_{acc}$, and $t_p=q_{rej}$ if $p\in Q_{rej}$.

(2) On the contrary, when $M$ reaches $ in a non-halting state, say, $q$, the new machine $N$ enters $\hat{q}$ and then simulates $M$'s ε-moves using the corresponding states in $\hat{Q}$. After making a series of consecutive ε-moves, when $M$ enters an accepting state (resp., a rejecting state), say, $p$, $N$ enters its corresponding state $\hat{p}$, deterministically empties the stack, and finally enters $q_{acc}$ (resp., $q_{rej}$). Formally, $\delta'$ is defined as follows. Let $q\in Q-Q_{halt}$, $p\in Q$, $a\in\Gamma$, and $b\in\Gamma^{(-)}$.

  1. $\delta'(q,\$,a\mid\hat{p},w)=\delta(q,\$,a\mid p,w)$.

  2. $\delta'(\hat{q},\varepsilon,a\mid\hat{p},w)=\delta(q,\varepsilon,a\mid p,w)$.

  3. $\delta'(\hat{p},\varepsilon,b\mid\hat{p},\varepsilon)=1$ if $p\in Q_{halt}$.

  4. $\delta'(\hat{p},\varepsilon,\bot\mid t_p,\bot)=1$ if $p\in Q_{halt}$, where $t_p=q_{acc}$ if $p\in Q_{acc}$, and $t_p=q_{rej}$ otherwise.

It is not difficult to show that $N$ simulates $M$ on every input and errs with the same probability as $M$ does. Note that $|Q'|=3n+2$ and that both the stack alphabet size and the push size are unaltered.
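One way to view this construction is as a renaming scheme over states: barred copies handle premature halting before $, hatted copies handle the ε-phase after $, and two fresh states do the actual halting. The sketch below builds the new state space and the deterministic "drain the stack, then report" clauses of Case 1 (Lines 3–5); all names are our own illustration, not the paper's notation.

```python
def lemma41_state_space(states):
    """New states: originals, barred copies (premature halt remembered),
    hatted copies (running after $), and two true halting states, giving
    |Q'| = 3|Q| + 2 in total."""
    bar = {q: ("bar", q) for q in states}
    hat = {q: ("hat", q) for q in states}
    ACC, REJ = ("halt", "acc"), ("halt", "rej")
    return bar, hat, ACC, REJ

def drain_then_report(bar_state, target, gamma, bottom, sigma, rmark="$"):
    """Deterministic clauses for a barred state: pop all non-bottom
    symbols by epsilon-moves, skip over the remaining input on the bottom
    marker, and enter the true halting state upon reading $."""
    d = {}
    for a in gamma:
        if a != bottom:
            d[(bar_state, None, a)] = [(1.0, bar_state, "")]    # Line 3
    for s in sigma:
        d[(bar_state, s, bottom)] = [(1.0, bar_state, bottom)]  # Line 4
    d[(bar_state, rmark, bottom)] = [(1.0, target, bottom)]     # Line 5
    return d
```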

4.2 Transforming 1ppda’s to the Ideal-Shape Form

We next convert a 1ppda into a specific form, called an "ideal shape," in which the 1ppda is "push-pop-controlled": pop operations always take place by first reading a non-blank input symbol and then making a (possible) series of pop operations without reading any other input symbol, and push operations add only single symbols without altering any existing stack content.

Any 1ppda with state set $Q$, input alphabet $\Sigma$, stack alphabet $\Gamma$, and transition function $\delta$ is said to be in an ideal shape if the following conditions (1)–(5) are all satisfied. (1) Scanning a symbol $\sigma\in\check{\Sigma}$, it preserves the topmost stack symbol (called a stationary operation). (2) Scanning $\sigma\in\check{\Sigma}$, it pushes a new symbol $b$ ($\in\Gamma^{(-)}$) without changing any symbol stored in the stack. (3) Scanning $\sigma\in\check{\Sigma}$, it pops the topmost stack symbol. (4) Without scanning any input symbol (i.e., making an ε-move), it pops the topmost stack symbol. (5) The stack operation of (4) comes only after either (3) or (4). For later convenience, we refer to these five conditions as the ideal-shape conditions.

More formally, the ideal-shape conditions are described in terms of $\delta$ in the following way. Let $q,p\in Q$, $\sigma\in\check{\Sigma}$, $a,b\in\Gamma$, and $w\in\Theta_{\Gamma}$.

  1. If $\delta(q,\sigma,a\mid p,w)>0$, then $w$ is in $\{a,ba,\varepsilon\}$ for a certain $b\in\Gamma^{(-)}$.

  2. If $\delta(q,\varepsilon,a\mid p,w)>0$, then $w=\varepsilon$.

  3. If $\delta(p,\varepsilon,a\mid p',\varepsilon)>0$ for a certain pair $(p',a)$, then $p$ is entered only by pop operations; namely, $\delta(q,\sigma,b\mid p,w)>0$ implies $w=\varepsilon$.

  4. If $\delta(q,\sigma,a\mid p,w)>0$ and $w\neq\varepsilon$, then $\delta(p,\varepsilon,b\mid p',\varepsilon)=0$ for any pair $(p',b)$.

Notice that Conditions 1–2 correspond to the stack operations (1)–(4) described above and that Conditions 3–4 correspond to condition (5). For no-endmarker 1ppda's, nonetheless, we can define the same notion of "ideal shape" by requiring Conditions 1–4 to hold only for nonempty inputs.
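Under the hypothetical table encoding used earlier (with a pop written as the empty replacement string, a stationary operation as $w=a$, and a push as a two-symbol string ending in $a$), Conditions 1–4 become a mechanical check; the function below is our illustration.

```python
def is_ideal_shape(delta):
    """Check Conditions 1-4 on delta[(q, sigma, a)] = [(prob, p, w), ...],
    where None stands for epsilon and w replaces the topmost symbol a."""
    eps_poppers = set()              # states with an outgoing epsilon-pop
    for (q, s, a), moves in delta.items():
        for pr, p, w in moves:
            if pr == 0:
                continue
            if s is None:            # Condition 2: epsilon-moves must pop
                if w != "":
                    return False
                eps_poppers.add(q)
            elif w not in ("", a) and not (len(w) == 2 and w[1] == a):
                return False         # Condition 1: stationary, push, or pop
    for (q, s, a), moves in delta.items():
        for pr, p, w in moves:
            # Conditions 3-4: a state that starts epsilon-pops may only be
            # entered by pop operations (empty replacement string).
            if pr > 0 and p in eps_poppers and w != "":
                return False
    return True
```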

Lemma 4.2 states that any 1ppda can be converted into an error-equivalent 1ppda in an ideal shape.

Lemma 4.2

[Ideal Shape Lemma] Let $n,m,e\in\mathbb{N}^{+}$. Any $n$-state endmarker 1ppda with stack alphabet size $m$ and push size $e$ can be converted into another error-equivalent 1ppda in an ideal shape whose state size and stack alphabet size are polynomial in $n$ and $m^{e}$. The above statement is also true for no-endmarker 1ppda's.

The proof of the lemma partly follows the arguments of Hopcroft and Ullman [3, Chapter 10] and of Pighizzini and Pisoni [11, Section 5]. In the proof, the desired conversion of a given 1ppda $M$ into an error-equivalent ideal-shape 1ppda is carried out stage by stage. Firstly, $M$ is converted into another 1ppda $M_1$ whose ε-moves are limited only to pop operations. This $M_1$ is further modified into another 1ppda $M_2$, which satisfies (2)–(3) of the ideal-shape conditions together with the extra condition that all ε-moves are limited to either pop operations or exchanges of the topmost stack symbol with a single symbol. From this $M_2$, we construct another 1ppda $M_3$, which satisfies (1)–(3) of the ideal-shape conditions. Finally, we modify $M_3$ into another 1ppda $M_4$, which further meets the ideal-shape condition (5). This last machine becomes the desired 1ppda $N$.

Proof of Lemma 4.2.   Let $M=(Q,\Sigma,\{¢,\$\},\Gamma,\Theta_{\Gamma},\delta,q_0,\bot,Q_{acc},Q_{rej})$ be any endmarker 1ppda with $|Q|=n$, $|\Gamma|=m$, and push size $e$. We first apply Lemma 4.1 to obtain another 1ppda that halts only on or after reading $. We recall the notation used in the proof of Lemma 4.1. Notice that the machine obtained in that proof uses only the distinguished set $\hat{Q}$ of inner states after reading $ and that, when it halts, its stack is empty. Moreover, we recall the two halting states $q_{acc}$ and $q_{rej}$ from the proof. Here, we set $Q_{acc}=\{q_{acc}\}$ and $Q_{rej}=\{q_{rej}\}$. For readability, we also call the obtained 1ppda $M$ and assume that $|Q|=n$, $|\Gamma|=m$, and the push size is $e$. Starting with this new machine $M$, we perform a series of conversions to satisfy the desired ideal-shape conditions. In the process of such conversions, we denote by $M_i$ (with push size $e_i$) the 1ppda obtained at stage $i$.

To make our description simpler, we further introduce the succinct notation $\mathrm{Pr}_{\varepsilon}[q,a\Rightarrow p,w]$ (where $q,p\in Q$, $a\in\Gamma$, and $w\in\Gamma^{*}$) for the total probability of the event that, starting in state $q$ with stack content $az$ (for an "arbitrary" string $z$), $M$ makes a (possible) series of consecutive ε-moves without accessing any symbol in $z$, and eventually reaches inner state $p$ with stack content $wz$. This notation is formally defined as $\mathrm{Pr}_{\varepsilon}[q,a\Rightarrow p,w]=\sum_{k\geq 0}p_k(q,a\mid p,w)$, where $p_k$ is introduced in the following inductive way. Let $q,p\in Q$, $a\in\Gamma$, and $u,w\in\Gamma^{*}$.

  1. $p_0(q,u\mid q,u)=1$ and $p_0(q,u\mid p,w)=0$ for any $(p,w)\neq(q,u)$.

  2. $p_{k+1}(q,au\mid p,w)=\sum_{r,v}\delta(q,\varepsilon,a\mid r,v)\cdot p_k(r,vu\mid p,w)$ and $p_{k+1}(q,\varepsilon\mid p,w)=0$, where $r\in Q$ and $v\in\Theta_{\Gamma}$.

Notice that, in particular, $\mathrm{Pr}_{\varepsilon}[q,a\Rightarrow q,a]\geq p_0(q,a\mid q,a)=1$ follows.

By our definition of 1ppda's, all computation paths must terminate eventually on every input. Thus, any series of consecutive ε-moves makes the stack height increase by no more than $nm$ because, otherwise, the series would produce an infinite computation path. It thus follows that (*) $\mathrm{Pr}_{\varepsilon}[q,a\Rightarrow p,w]>0$ implies $|w|\leq nm+1$.
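The quantities $p_k$ can be accumulated by a straightforward dynamic program over the number of ε-moves, pruning at the stack-height bound from statement (*). The sketch below computes, for a start pair $(q,a)$, the total probability of reaching each pair $(p,w)$; the encoding and the step cap are our own illustrative choices.

```python
from collections import defaultdict

def epsilon_closure(delta, q, a, max_height, max_steps):
    """Accumulate sum_k p_k(q, a | p, w): the probability of going from
    state q with topmost symbol a to state p with prefix w in its place,
    by some series of consecutive epsilon-moves that never touches the
    rest of the stack."""
    reach = defaultdict(float)
    frontier = {(q, a): 1.0}            # p_0: stay put with probability 1
    for _ in range(max_steps):          # cap guards against eps-loops
        nxt = defaultdict(float)
        for (p, w), pr in frontier.items():
            reach[(p, w)] += pr
            if w == "":                 # tracked prefix already popped
                continue
            top, rest = w[0], w[1:]
            for dp, p2, u in delta.get((p, None, top), []):
                w2 = u + rest
                if dp > 0 and len(w2) <= max_height:   # statement (*)
                    nxt[(p2, w2)] += pr * dp
        frontier = nxt
        if not frontier:
            break
    return reach
```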

(1) We first convert the original 1ppda $M$ into an error-equivalent 1ppda, say, $M_1$ whose ε-moves are restricted to pop operations; namely, $\delta_1(q,\varepsilon,a\mid p,w)>0$ implies $w=\varepsilon$ for all elements $q,p\in Q_1$ and $a\in\Gamma_1$. For this purpose, we need to remove any ε-move by which $M$ changes the topmost stack symbol $a$ to a certain nonempty string $w$, including the ε-transitions that merely rewrite the topmost stack symbol, since these violate the requirement of $M_1$ concerning pop operations. Notice that, once $M$ reads $, it makes only ε-moves with inner states in $\hat{Q}$ and eventually empties the stack.

We define $Q_1=Q$ and $\Gamma_1=\Gamma\cup\{\bot_1\}$, where $\bot_1$ is a new bottom marker not in $\Gamma$. Moreover, we set $\Gamma_1^{(-)}$ to be $\Gamma_1-\{\bot_1\}$. It then follows from the above definition that $|Q_1|=n$ and $|\Gamma_1|=m+1$. By Statement (*), the push size $e_1$ of $M_1$ is at most $nm+e+1$. The probabilistic transition function $\delta_1$ is constructed formally in the following substages (i)–(iii). Any value of $\delta_1$ not listed below is assumed to be $0$.

(i) Recall that the first step of any 1ppda must be a non-ε-move in general. At the first step, $M_1$ changes $\bot_1$ to $w\bot_1$ (for an appropriate string $w$) so that, after this step, $M_1$ simulates $M$ using $\bot$ as a standard stack symbol but with no access to $\bot_1$ (except for the final step of $M_1$). This process is expressed as follows.

  1. $\delta_1(q_0,¢,\bot_1\mid p,w\bot_1)=\sum_{r\in Q,\,u\in\Theta_{\Gamma}}\delta(q_0,¢,\bot\mid r,u)\cdot\mathrm{Pr}_{\varepsilon}[r,u\Rightarrow p,w]$ for every $(p,w)$ satisfying $w\in\Gamma^{*}$, where $\mathrm{Pr}_{\varepsilon}$ is naturally extended to string arguments via $p_k$.

(ii) Assume that $az$ is $M$'s stack content and $M$ makes a (possible) series of consecutive ε-moves by which $M$ never accesses any symbol in $z$. Consider the case where $M$ changes $a$ into $bu$ by the end of the series and, upon reading symbol $\sigma$, replaces $b$ with $v$ at the next step in order to produce the stack content $vuz$. In this case, we merge this entire process into one single non-ε-move.

  2. $\delta_1(q,\sigma,a\mid p,w')=\sum\mathrm{Pr}_{\varepsilon}[q,a\Rightarrow r,bu]\cdot\delta(r,\sigma,b\mid p,v)$ if $\sigma\in\check{\Sigma}$ and $w'\neq\varepsilon$, where the sum is taken over all $r\in Q$, $b\in\Gamma$, $u\in\Gamma^{*}$, and $v\in\Theta_{\Gamma}$ with $vu=w'$.

  3. $\delta_1(q,\sigma,a\mid p,\varepsilon)=\sum_{r\in Q,\,b\in\Gamma}\mathrm{Pr}_{\varepsilon}[q,a\Rightarrow r,b]\cdot\delta(r,\sigma,b\mid p,\varepsilon)$ if $\sigma\in\check{\Sigma}$.

Since Line 3 yields $w'=\varepsilon$, non-ε-moves of $M_1$ may contain pop operations. Lines 1–3 indicate that the new push size is at most $nm+e+1$, and thus the bound $e_1\leq nm+e+1$ follows.

Let us consider the case where, with a certain probability, $M$ produces the stack content $z$ by the end of a series of ε-moves, popping $a$ together with every symbol pushed during the series, without reading any input symbol. In this case, we also merge this entire process into a single ε-move of a pop operation, as described below.

  4. $\delta_1(q,\varepsilon,a\mid p,\varepsilon)=\mathrm{Pr}_{\varepsilon}[q,a\Rightarrow p,\varepsilon]$ for $q,p\in Q$ and $a\in\Gamma$.

(iii) Assume that $M$ has already read $ but is still in a non-halting state, say, $\hat{q}$, making a series of consecutive ε-moves. Unless $M$ reaches $q_{acc}$ or $q_{rej}$, similarly to Line 4, we merge this series of ε-moves and one pop operation into a single ε-move.

  5. $\delta_1(\hat{q},\varepsilon,a\mid\hat{p},\varepsilon)=\mathrm{Pr}_{\varepsilon}[\hat{q},a\Rightarrow\hat{p},\varepsilon]$ for $\hat{q},\hat{p}\in\hat{Q}$ and $a\in\Gamma$.

Once $M$ enters an inner state in $\{q_{acc},q_{rej}\}$ while scanning $\bot$, $M_1$ removes $\bot$ and enters the same halting state.

  6. $\delta_1(q,\varepsilon,\bot\mid q,\varepsilon)=1$ for $q\in\{q_{acc},q_{rej}\}$.

(2) We next convert $M_1$ into another error-equivalent 1ppda $M_2$ that conducts only the following types of moves: (a) it pushes one symbol without changing the existing stack content, (b) it replaces the topmost stack symbol by a (possibly different) single symbol, and (c) it pops the topmost stack symbol. We also demand that all ε-moves of $M_2$ are limited to either (b) or (c). From these conditions, we obtain $e_2=2$.

To describe the intended conversion, we introduce two notations. The first notation $\langle u\rangle$ represents a new stack symbol that encodes a string $u$ in $\Gamma_1^{\leq e_1}$. To make $M_2$ remember the topmost stack symbol of $M_1$, we introduce another notation $[q,a]$, indicating that $M_1$ is in inner state $q$ with $a$ as the topmost symbol in the stack.

We set $Q_2=Q_1\cup\{[q,a] : q\in Q_1, a\in\Gamma_1\}$ and $\Gamma_2=\{\langle u\rangle : u\in\Gamma_1^{\leq e_1}\}$. Let $\langle\bot_2\rangle$ denote a new bottom marker and let $q_{0,2}$ denote a new initial state. It then follows that $|Q_2|\leq|Q_1|(1+|\Gamma_1|)$ and $|\Gamma_2|\leq|\Gamma_1|^{e_1+1}$ since $|\Gamma_1^{\leq e_1}|\leq|\Gamma_1|^{e_1+1}$.

(i) Consider the case where $M_1$ reads $\sigma$, substitutes $w$ (with $w=bw'$ for certain $b\in\Gamma_1$ and $w'\in\Gamma_1^{*}$) for $a$, and enters inner state $p$. In this case, $M_2$ reads $\sigma$ with a current topmost stack symbol of the form $\langle ua\rangle$, changes it to $\langle uw\rangle$ (or $\langle w\rangle$ if $u=\varepsilon$), and enters state $[p,b]$. Let $q,p\in Q_1$, $\sigma\in\check{\Sigma}$, $a,b\in\Gamma_1$, and $u\in\Gamma_1^{*}$. This process is described as follows.

  1. $\delta_2([q,a],\sigma,\langle ua\rangle\mid[p,b],\langle uw\rangle)=\delta_1(q,\sigma,a\mid p,w)$ if $w=bw'$ and $u\neq\varepsilon$.

  2. $\delta_2([q,a],\sigma,\langle a\rangle\mid[p,b],\langle w\rangle)=\delta_1(q,\sigma,a\mid p,w)$ if $w=bw'$.

(ii) Assuming that $M_1$ reads $\sigma$ and pops $a$ by entering state $p$, if $u\neq\varepsilon$, then $M_2$ reads $\sigma$ in state $[q,a]$ and changes $\langle ua\rangle$ into $\langle u\rangle$ by entering inner state $[p,c]$, where $c$ denotes the new topmost symbol of $M_1$'s stack (i.e., $u=cu'$). The function $\delta_2$ is formally defined as follows.

  1. $\delta_2([q,a],\sigma,\langle ua\rangle\mid[p,c],\langle u\rangle)=\delta_1(q,\sigma,a\mid p,\varepsilon)$ if $u=cu'$ for a certain $u'\in\Gamma_1^{*}$.

  2. $\delta_2([q,a],\sigma,\langle a\rangle\mid p,\varepsilon)=\delta_1(q,\sigma,a\mid p,\varepsilon)$ if $u=\varepsilon$.

(iii) Finally, we need to deal with the special case of $a=\bot_1$. Because $\bot_1$ is the bottom marker of $M_1$, a slightly different treatment is required, as shown below. Notice that