1 Introduction
Randomization plays an indispensable role in computer science. The celebrated result, the PCP Theorem [2], reveals the power of “interaction+randomness+error” in problem solving. Given an NP-complete problem, one may design an interactive proof system consisting of a verifier and a prover [16, 3]. Upon receiving a problem instance the verifier accepts or rejects the input with high confidence in polynomial time by using logarithmic random bits and asking a constant number of questions to the prover. The scenario can be generalized to a multiprover situation with increased power on the verifier side [9, 4, 13]. This fundamental result is significant to modern computing systems, which are open, distributed, interactive, and have both nondeterministic behaviours and randomized choices. To formalize models in which results like the PCP Theorem apply, one may introduce randomization to interaction models (process models). There are two kinds of randomness in randomized process models. A process may send a random value to another; and it may randomly choose whom it will send a value to. We call the former content randomness and the latter channel randomness. Content randomness is basically a computational issue [29, 15], whereas channel randomness is to do with interaction.
What kinds of channel randomness are there? In the literature one finds basically two answers to the question [20, 17, 34, 25, 30]. Generative models feature probabilistic choice for external actions. The standard syntax for a probabilistic choice term is of the form

(1) $\sum_{i\in I} p_i\,\alpha_i.T_i$,

where $I$ is finite and $\sum_{i\in I} p_i = 1$. The infix notation $p_1\alpha_1.T_1 + \ldots + p_n\alpha_n.T_n$ is often used. The semantics is defined by $\sum_{i\in I} p_i\,\alpha_i.T_i \xrightarrow{\alpha_i} T_i$, meaning that the term may evolve into $T_i$
with probability $p_i$
by performing the action $\alpha_i$.
by performing the action . The generative model is problematic in the presence of the interleaving composition operator and the localization operator. Let be and be . What is then the behaviour of ? And how about and ? What is the probability of interacting with at channel in ? In interaction at channel is disabled. How does that reconcile with the prescription that interacts at channel with probability ? It does not sound right to say that performs the action with probability one. A reasonable semantics is that may do the action with probability and becomes dead with probability . If this is indeed the interpretation, should really be . Symmetrically one may argue that should really be . All problems with the probabilistic choice (1) is gone if it is replaced by the random choice term(2) 
where the size of the index set is at least and . Thus for all . Early generative models are fully probabilistic [8]. Nondeterminism was considered later [25].
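Operationally a random choice term as in (2) can be implemented by a single draw from a random number generator. The following is a minimal Python sketch of this reading; all names are hypothetical, and branch probabilities are assumed to sum to one.

```python
import random

def resolve_random_choice(branches, rng):
    # branches: list of (probability, continuation) pairs summing to 1;
    # a single draw from the RNG picks the continuation
    assert abs(sum(p for p, _ in branches) - 1.0) < 1e-9
    r, acc = rng.random(), 0.0
    for p, cont in branches:
        acc += p
        if r < acc:
            return cont
    return branches[-1][1]  # guard against floating point round-off

rng = random.Random(0)
counts = {'T1': 0, 'T2': 0}
for _ in range(10000):
    counts[resolve_random_choice([(0.5, 'T1'), (0.5, 'T2')], rng)] += 1
```

With the seed fixed the simulation is reproducible; over many draws each branch is taken roughly with its prescribed probability.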
In reactive models, introduced by Larsen and Skou [20] and popularized by the work of van Glabbeek, Smolka and Steffen [30], nondeterministic choice and probabilistic choice come in alternation. Using a suggestive notation one may write for example
(3) $a.(P_1 \mathbin{{}_{\frac{1}{2}}\oplus} P_2) + b.(Q_1 \mathbin{{}_{\frac{1}{3}}\oplus} Q_2)$

This is a process that may perform an action $a$ and turn into $P_1$ with probability $\frac{1}{2}$ and into $P_2$ with probability $\frac{1}{2}$. It may also do an interaction at channel $b$ and become $Q_1$ with probability $\frac{1}{3}$ and $Q_2$ with probability $\frac{2}{3}$. It is not helpful to think of $P_1 \mathbin{{}_{\frac{1}{2}}\oplus} P_2$ simply as a distribution over processes. The distribution can only be achieved by carrying out a certain amount of computation, say invoking a random number generator. The details of the computation can be abstracted away, but it should definitely be formalized as an internal action. The best way to understand the process in (3) is to see it as a simplification of

(4) $a.(\tfrac{1}{2} P_1 \oplus \tfrac{1}{2} P_2) + b.(\tfrac{1}{3} Q_1 \oplus \tfrac{2}{3} Q_2)$
The process in (4) may do an external nondeterministic choice, and then an internal random choice. This is why reactive models are also called (strict) alternating models. However, once we have separated the two kinds of choice, there is no point in insisting on the alternation. It means that we might as well give up on generative probabilistic choice and reactive probabilistic choice altogether in favour of (2) and nondeterministic choice. A systematic exposition of the research progress on reactive models is given in Deng’s excellent book [10].
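The two-step reading of a reactive process can be sketched directly: an external action is chosen nondeterministically (here, by the caller), after which an internal random choice resolves the continuation. All process names and probabilities below are hypothetical illustrations.

```python
import random

# A hypothetical reactive process: each external action is mapped to a
# probability distribution over continuations.
REACTIVE = {
    'a': [(0.5, 'P1'), (0.5, 'P2')],
    'b': [(1 / 3, 'Q1'), (2 / 3, 'Q2')],
}

def step(process, action, rng):
    dist = process[action]       # step 1: external nondeterministic choice
    r, acc = rng.random(), 0.0   # step 2: internal random choice
    for p, cont in dist:
        acc += p
        if r < acc:
            return cont
    return dist[-1][1]           # guard against round-off

rng = random.Random(1)
assert step(REACTIVE, 'a', rng) in ('P1', 'P2')
```

The point of the sketch is that the distribution is never observed directly: it is realized by an internal computation after the external action has been fixed.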
The central issue in defining a probabilistic process model is the treatment of nondeterminism in the presence of probabilistic choice. The philosophy we shall follow in this paper is that nondeterminism is an attribute of interaction while randomness is a computational feature. Nondeterminism is a system feature, which cannot be implemented. Randomness is a process property, which can be implemented with a negligible error. We advocate a model independent methodology that turns an interaction model into a randomized interaction model by adjoining (2). The semantics of the random choice operator is defined by

(5) $\bigoplus_{i\in I} p_i T_i \xrightarrow{\tau[p_i]} T_i$, for each $i \in I$.

We emphasize that the label $\tau[p_i]$ should be understood as the same thing as $\tau$. The additional information attached by $[p_i]$ is to help reasoning with the bisimulation semantics. Talking about bisimulation equivalence it is often useful to think of the transitions defined by (5) as a single silent transition. We introduce the collective silent transition

(6) $P \xrightarrow{\tau} \{P_i^{p_i}\}_{i\in I}$, standing for the set of transitions $\{P \xrightarrow{\tau[p_i]} P_i\}_{i\in I}$.
The collective silent transition is closed under composition, localization and recursion.
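The closure under composition and localization amounts to lifting a collective silent transition pointwise through the operators. A minimal sketch, with a hypothetical encoding of a collective silent transition as a list of (probability, target) pairs:

```python
# If P -> {(p_i, P_i)} collectively, then P|Q -> {(p_i, P_i|Q)}:
# composition lifts the distribution pointwise, leaving Q untouched.
def lift_par(dist, q):
    return [(p, (s, q)) for p, s in dist]

# Localization lifts the distribution the same way: (c)P -> {(p_i, (c)P_i)}.
def lift_loc(dist, c):
    return [(p, ('loc', c, s)) for p, s in dist]

d = [(0.5, 'P1'), (0.5, 'P2')]
assert [p for p, _ in lift_par(d, 'Q')] == [0.5, 0.5]
assert sum(p for p, _ in lift_loc(d, 'a')) == 1.0
```

The probabilities are untouched by either lifting, which is why the total probability mass of a collective silent transition is preserved by the operators.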
Strong bisimulations for probabilistic labeled transition systems, pLTS for short, are well understood [20, 17, 25, 30, 10]. Weak bisimulations have been studied for reactive models [27, 10] and alternation models [23]. In the presence of probabilistic choice a silent transition sequence appears as a tree of silent transitions. Schedulers are introduced to resolve the nondeterminism when constructing such a tree. Branching bisimulations have also been studied for reactive models [27].
Our current understanding of weak/branching bisimulations for probabilistic process models is not very satisfactory on several accounts, which can be summarized as follows.

The majority of the works are about pLTS. In a pLTS process combinators disappear. As far as we know none of the weak/branching bisimulations studied in the literature is closed under all three indispensable process combinators, the composition, localization and recursion operators. In fact some of them are closed under none of the three operators. This is not surprising because a pLTS, without referring to any model, defines a semantics for automata [26], not a semantics for processes. There are suggestions to look at synchronous probabilistic process models [30, 8]. A basic problem in the synchronous scenario is whether internal actions are synchronized. A yes answer seems to contradict the very idea of observational theory. But if the silent transitions are not synchronized, the composition operator is unlikely to be associative.

As a consequence of the failure to account for composition and localization, most results, even definitions, apply only to finite state probabilistic processes [23, 1, 10]. The coincidence between the weak bisimilarity and the branching bisimilarity for example is only proved for the finite state fully probabilistic processes [8]. In fact in the literature probabilistic processes are often defined as finite labeled graphs [23] or labeled concurrent Markov chains [33]. These restricted models preempt any study of process combinators.
The issue of divergence has not been properly dealt with. This is definitely an omission, especially so in the presence of random silent actions.
The main task of the paper is to justify the model independent methodology proposed above. We shall convince the reader not only that randomization of process calculi ought to be model independent, but also that the bisimulation theory of the randomized version of any process model can be obtained from the bisimulation theory of the original model in a uniform manner. Section 2 defines a randomized process model. For simplicity the model is taken to be a submodel of Milner’s CCS. Section 3 introduces the $\epsilon$ tree and showcases its role in transferring the bisimulation theory of a model to the bisimulation theory of its randomized version. Section 4 proves the congruence property of the bisimulation equivalence. Section 5 makes some final comments.
2 Random Process Model
Let $\mathcal{C}$ be the set of channels, ranged over by lowercase letters. Let $\overline{\mathcal{C}} = \{\overline{a} \mid a \in \mathcal{C}\}$. The set $\mathcal{L} = \mathcal{C} \cup \overline{\mathcal{C}}$ will be ranged over by small Greek letters. We let $\overline{\alpha} = a$ if $\alpha = \overline{a}$. The set of actions is $\mathcal{L} \cup \{\tau\}$. We write $\lambda$ and its decorated versions for elements of $\mathcal{L} \cup \{\tau\}$. The grammar of CCS [21] is as follows:

(7) $T := X \mid \sum_{i\in I} \alpha_i.T_i \mid T \,|\, T' \mid (c)T \mid \mu X.T$

where the indexing set $I$ is finite. We write $\mathbf{0}$ for the nondeterministic term $\sum_{i\in I} \alpha_i.T_i$ in which $I$ is the empty set. A trailing $\mathbf{0}$ is often omitted. We also use the infix notation of $+$, writing for example $\alpha.T + \beta.T'$. A process variable that appears in a prefix term $\alpha.T$ is guarded. We shall assume that in the fixpoint term $\mu X.T$ the bound variable $X$ is guarded in $T$. A term is a process if it contains no free variables. We write $P, Q, \ldots$ for processes. Let $\mathcal{T}$ be the set of all CCS terms and $\mathcal{P}$ be the set of all CCS processes. A finite state term/process is a term/process that contains neither the composition operator nor the localization operator. We can define the $\tau$ prefix in the standard manner. For example $\tau.T$ can be defined by $(c)(c.T \mid \overline{c})$ for some fresh $c$. From now on we shall use this derived notation without further comment. The transition semantics of CCS is generated by the following rules, where $\lambda \in \mathcal{L} \cup \{\tau\}$:

$\sum_{i\in I} \alpha_i.T_i \xrightarrow{\alpha_i} T_i$; $\dfrac{T \xrightarrow{\lambda} T'}{T \,|\, S \xrightarrow{\lambda} T' \,|\, S}$ and symmetrically; $\dfrac{T \xrightarrow{\alpha} T' \quad S \xrightarrow{\overline{\alpha}} S'}{T \,|\, S \xrightarrow{\tau} T' \,|\, S'}$; $\dfrac{T \xrightarrow{\lambda} T'}{(c)T \xrightarrow{\lambda} (c)T'}\ (\lambda \notin \{c, \overline{c}\})$; $\dfrac{T\{\mu X.T / X\} \xrightarrow{\lambda} T'}{\mu X.T \xrightarrow{\lambda} T'}$.
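The standard CCS transition rules can be prototyped directly. The sketch below is a simplified interpreter (fixpoint terms omitted, the term encoding is hypothetical) that computes the one-step transitions of a term; complementation is written with a leading `~`.

```python
# A term is a tuple:
#   ('sum', (branch, ...))  nondeterministic choice; branches are prefix terms
#   ('pre', label, term)    prefix; '~a' is the co-name of 'a'
#   ('par', term, term)     interleaving composition
#   ('loc', channel, term)  localization (c)T
ZERO = ('sum', ())

def comp(label):
    # complement of a visible label
    return label[1:] if label.startswith('~') else '~' + label

def transitions(t):
    # one-step transitions of a term, as a set of (label, successor) pairs
    res = set()
    if t[0] == 'pre':
        _, lab, cont = t
        res.add((lab, cont))
    elif t[0] == 'sum':
        for branch in t[1]:
            res |= transitions(branch)
    elif t[0] == 'par':
        _, p, q = t
        tp, tq = transitions(p), transitions(q)
        res |= {(lab, ('par', p2, q)) for lab, p2 in tp}
        res |= {(lab, ('par', p, q2)) for lab, q2 in tq}
        # communication on complementary labels yields a silent transition
        res |= {('tau', ('par', p2, q2))
                for lab, p2 in tp if lab != 'tau'
                for lab2, q2 in tq if lab2 == comp(lab)}
    elif t[0] == 'loc':
        _, c, p = t
        res |= {(lab, ('loc', c, p2)) for lab, p2 in transitions(p)
                if lab not in (c, comp(c))}
    return res

# (a)(a.0 | ~a.0) can only do the internal communication
P = ('loc', 'a', ('par', ('pre', 'a', ZERO), ('pre', '~a', ZERO)))
assert {lab for lab, _ in transitions(P)} == {'tau'}
```

The example at the end illustrates the interplay of communication and localization: the visible `a` and `~a` transitions are blocked, and only the internal communication survives.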
For an equivalence $\mathcal{R}$ on $\mathcal{P}$ we write $P \mathrel{\mathcal{R}} Q$ for $(P, Q) \in \mathcal{R}$. The advantage of the infix notation is that we may write for example $P \mathrel{\mathcal{R}} Q \mathrel{\mathcal{R}} R$ and $P' \mathrel{\mathcal{R}} P \xrightarrow{\lambda} Q \mathrel{\mathcal{R}} Q'$. The notation $\mathcal{P}/\mathcal{R}$ stands for the set of equivalence classes defined by $\mathcal{R}$. The equivalence class containing $P$ is denoted by $[P]_{\mathcal{R}}$, or $[P]$ when the equivalence is clear from context. We write $P \to P'$ if $P \xrightarrow{\tau} P'$, and $\to^*$ for the reflexive and transitive closure of $\to$. For an equivalence class $[Q]$ we write $P \xrightarrow{\lambda} [Q]$ for the fact that $P \xrightarrow{\lambda} Q'$ for some $Q' \in [Q]$. A process $P$ is divergent if there is an infinite silent sequence $P \to P_1 \to P_2 \to \cdots$.
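The reflexive and transitive closure of the silent transition relation is plain graph reachability. A small sketch over a finite silent transition graph (state names hypothetical):

```python
# tau_steps maps a state to the set of its one-step silent successors;
# tau_closure computes all states reachable by zero or more silent steps.
def tau_closure(state, tau_steps):
    seen, stack = {state}, [state]
    while stack:
        s = stack.pop()
        for nxt in tau_steps.get(s, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

steps = {'P': {'P1'}, 'P1': {'P2'}, 'Q': set()}
assert tau_closure('P', steps) == {'P', 'P1', 'P2'}
```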
The Randomized CCS, RCCS for short, is defined on top of CCS. The RCCS terms are obtained by extending the definition in (7) with the random choice term defined in (2). A variable that appears in a random choice term is also guarded. The transition semantics of RCCS is defined by the above rules of CCS plus the rule defined in (5). The label $\lambda$ that appears in these rules ranges over $\mathcal{L} \cup \{\tau\} \cup \{\tau[p] \mid p \in (0,1)\}$. The set of RCCS terms is denoted by $\mathcal{T}_R$ and that of RCCS processes by $\mathcal{P}_R$.
We shall find it convenient to interpret $P \xrightarrow{\tau} P'$ as $P \xrightarrow{\tau[1]} P'$. So $P \xrightarrow{\tau[p]} P'$ is a random silent transition if $p < 1$ and an interaction if $p = 1$. The (reflexive and) transitive closure of $\to$ is denoted by $\to^*$ ($\to^+$). We shall say that $p_1 p_2 \cdots p_n$ is the probability of the silent transition sequence $P \xrightarrow{\tau[p_1]} P_1 \xrightarrow{\tau[p_2]} \cdots \xrightarrow{\tau[p_n]} P_n$.
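Since the labels along a silent transition sequence are independent random resolutions, the probability of the sequence is just the product of the attached probabilities. A one-line sketch:

```python
import math

# The probability of a silent transition sequence labeled
# tau[p1], ..., tau[pn] is the product p1 * ... * pn.
def seq_probability(probs):
    return math.prod(probs)

assert seq_probability([0.5, 0.5, 1.0]) == 0.25
assert seq_probability([]) == 1  # the empty sequence happens surely
```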
3 Epsilon Tree
Bisimulation equivalence is the standard equality for concurrent objects [21, 22]. Baier and Hermanns’ technique [8], applied in the proof that the weak bisimilarity coincides with the branching bisimilarity on the finite state fully probabilistic processes, offers a convincing argument that one should focus on the branching bisimulation equivalence in the probabilistic setting. For any process equality $\mathcal{R}$ on $\mathcal{P}$ one thinks of a silent transition $P \to P'$ with $P \mathrel{\mathcal{R}} P'$ as state-preserving, and a silent transition $P \to P'$ such that $(P, P') \notin \mathcal{R}$ as state-changing. The basic idea of van Glabbeek and Weijland’s branching bisimulation [31, 32] is that a state-changing silent action must be explicitly bisimulated whereas state-preserving silent actions are ignorable. Suppose $P \mathrel{\mathcal{R}} Q$. If $P \to P' \mathrel{\mathcal{R}} P$ then $Q$ does not have to do anything because $P' \mathrel{\mathcal{R}} Q$. If $P \xrightarrow{\lambda} P'$ is state-changing then it must be simulated by some transition of $Q$. Branching bisimulation requires that conversely the intermediate silent steps of $Q$ must be simulated by $P$. It is in this sense that the simulating sequence is bisimulated by $P \xrightarrow{\lambda} P'$. The difference between branching bisimilarity and weak bisimilarity is that the former is a bisimulation equivalence whereas the latter is a simulation equivalence. A minute’s thought would lead us to believe that the simulating sequence must be of the form $Q \to^* Q'' \xrightarrow{\lambda} Q'$ with $Q \mathrel{\mathcal{R}} Q''$. With these remarks in mind let us formalize the notion of branching bisimulation. An equivalence $\mathcal{R}$ on $\mathcal{P}$ is a branching bisimulation if for all $P, Q$ such that $P \mathrel{\mathcal{R}} Q$, the following statement is valid for all $\lambda$.

If $P \xrightarrow{\lambda} P'$ and $\neg(\lambda = \tau \wedge P \mathrel{\mathcal{R}} P')$, then $Q \to^* Q'' \xrightarrow{\lambda} Q' \mathrel{\mathcal{R}} P'$ for some $Q''$ such that $Q \mathrel{\mathcal{R}} Q''$.

Clearly $Q \mathrel{\mathcal{R}} Q''$ and $P \mathrel{\mathcal{R}} Q$ imply $P \mathrel{\mathcal{R}} Q''$. It follows from the definition that a state-preserving silent transition is bisimulated vacuously. That explains the condition $\neg(\lambda = \tau \wedge P \mathrel{\mathcal{R}} P')$.
The extensional equality for computation never identifies a nonterminating computation with a terminating computation. The best way to formalize this requirement in bisimulation semantics is introduced in [24]. It is the key condition that turns a bisimulation equality for interaction into an equality for both interaction and computation [14]. An equivalence $\mathcal{R}$ on $\mathcal{P}$ is codivergent if, for every equivalence class of $\mathcal{R}$, either all members of the class are divergent, or no member of the class is divergent. The union of a class of codivergent branching bisimulations on $\mathcal{P}$ is a codivergent branching bisimulation on $\mathcal{P}$ [14]. So we may define the equality as the largest such relation on $\mathcal{P}$.
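On a finite transition graph divergence is decidable: a state is divergent exactly when it survives the greatest fixpoint $D = \{s \mid s$ has a silent successor in $D\}$. A sketch of this check, together with a codivergence test on a partition (representation hypothetical):

```python
# divergent: states admitting an infinite silent path in a finite graph,
# computed as the greatest fixpoint D = { s : tau(s) meets D }.
def divergent(tau, states):
    d = set(states)
    while True:
        shrink = {s for s in d if not (tau.get(s, set()) & d)}
        if not shrink:
            return d
        d -= shrink

# codivergent: every class is all divergent or all non-divergent.
def codivergent(classes, tau, states):
    div = divergent(tau, states)
    return all(c <= div or not (c & div) for c in classes)

tau = {'a': {'a'}, 'b': {'c'}}
assert divergent(tau, {'a', 'b', 'c'}) == {'a'}
assert codivergent([{'a'}, {'b', 'c'}], tau, {'a', 'b', 'c'})
```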
Having motivated the bisimulation equality for CCS, we are in a position to randomize it, as it were, to an equality for RCCS. In RCCS a silent transition is generally a distribution over a finite set of silent transitions. A finite sequence of silent transitions in CCS then turns into a silent transition tree in RCCS. To describe that we introduce an auxiliary definition. Suppose $\mathcal{R}$ is an equivalence on $\mathcal{P}_R$ and $P \in \mathcal{P}_R$. A silent tree $t$ of $P$ is a labeled tree rendering true the following statements.

Every node of $t$ is labeled by an element of $\mathcal{P}_R$. The root of $t$ is labeled by $P$.

The edges are labeled by elements of $\{\tau[p] \mid p \in (0,1]\}$. If an edge from a node labeled $Q$ to a node labeled $Q'$ is labeled $\tau[p]$, then $Q \xrightarrow{\tau[p]} Q'$.

An $\mathcal{R}$ tree of $P$ is a silent tree of $P$ such that all the labels of the nodes of the tree are in $[P]_{\mathcal{R}}$. If we confuse a node with its label, we may say for example that $Q \xrightarrow{\tau[p]} Q'$ is an edge in $t$. Definition 3 formalizes state-preserving silent transition sequences in the probabilistic setting. An $\epsilon$ tree of $P$ with regard to $\mathcal{R}$ is an $\mathcal{R}$ tree of $P$ rendering true (1,2).

If $Q \xrightarrow{\tau[p_i]} Q_i$ is an edge in $t$ for some $i \in I$, where the collective silent transition $Q \xrightarrow{\tau} \{Q_i^{p_i}\}_{i\in I}$ exists, then $Q \xrightarrow{\tau[p_j]} Q_j$ is an edge in $t$ for all $j \in I$ and $\{Q_i\}_{i\in I}$ are the only children of $Q$.

If $Q \xrightarrow{\tau[1]} Q'$ is an edge in $t$, then $Q'$ is the only child of $Q$.
Intuitively an $\epsilon$ tree of $P$ with regard to $\mathcal{R}$ is a random version of a state-preserving silent transition sequence. All nodes of an $\epsilon$ tree with regard to $\mathcal{R}$ are equal from the viewpoint of $\mathcal{R}$. Condition 1 requires that if one of the targets of a collective silent transition is in the tree then all of the targets are in the tree. This is nothing more than the intuition that a collective silent transition is conceptually a single silent transition. The number of $\epsilon$ trees of a process with regard to an equivalence class is in general infinite. Let’s see some examples.
Let . Let be any equivalence that distinguishes a divergent process from a nondivergent one. A finite $\epsilon$ tree of the process with regard to the equivalence corresponds to a finite transition sequence of the form . In the nonrandom case an $\epsilon$ tree with regard to the equivalence is just an instance of such a sequence. There is an infinite $\epsilon$ tree of the process, corresponding to the divergent sequence .
Let . There are infinitely many $\epsilon$ trees of the process with regard to any equivalence. An $\epsilon$ tree may be a single node tree (the left diagram below), or a three node tree (the middle diagram below), or an infinite tree (the right diagram below). Unlike Example 3 the divergence in this case is immune from any intervention.
Let . Let be an equivalence such that . A finite $\epsilon$ tree of the process with regard to the equivalence is described by the left diagram below; one of its leaves cannot do an immediate action. The right diagram describes an infinite $\epsilon$ tree of the process with regard to the equivalence; all of its leaves can do an immediate action.
Let . Let be any equivalence such that . Two $\epsilon$ trees of the process with regard to the equivalence are described by the following infinite diagrams. Every leaf of the left diagram can do an immediate action, whereas none of the leaves of the right diagram can do an immediate action.
Let . An $\epsilon$ tree of the process with regard to an equivalence rendering true is described by the left diagram below. Every leaf of the tree can do an immediate action. Another $\epsilon$ tree of the process with regard to the equivalence is described by the right diagram below, in which every leaf can do an immediate action.
These examples bring out a few observations. Firstly $\epsilon$ trees are meant to generalize finite sequences of state-preserving silent transitions. This is clear from Example 3. However $\epsilon$ trees are a little too general. Two $\epsilon$ trees of a process may differ in that every leaf of one tree may do an immediate action whereas in the other this is not true.
To isolate the $\epsilon$ trees that truly correspond to finite sequences of state-preserving silent transitions, we introduce some auxiliary definitions. A path in a silent tree $t$ is either a finite path going from the root to a node or an infinite path starting from the root. A branch of $t$ is either a path ending in a leaf or an infinite path. The length $|\pi|$ of a path $\pi$ is the number of edges in $\pi$ if $\pi$ is finite; it is $\omega$ otherwise. For $i < |\pi|$ let $\tau[p_i]$ be the label of the $i$-th edge. The probability $Pr(\pi)$ of a finite path $\pi$ is $\prod_{i<|\pi|} p_i$. A path of length zero is a single node, and its probability is $1$. The probability of an infinite path is the limit of $\prod_{i<n} p_i$, whose existence is guaranteed because the decreasing sequence is bounded by $0$ from below. If $t$ is finite, define $Pr(t) = \sum_{\pi \in Branch(t)} Pr(\pi)$, where $Branch(t)$ is the set of branches of $t$. If $t$ is infinite, we need to define the probability in terms of approximation. Let $t_n$ be the subtree of $t$ defined by the nodes of height no more than $n$. Inductively

$t_0$ is induced by the root of $t$; and

$t_{n+1}$ is induced by the nodes of $t_n$ and all the children of these nodes.

It should be clear that the limit $\lim_{n\to\omega} Pr(t_n)$ exists. The probability $Pr(t)$ of the tree $t$ is defined by this limit.

Lemma . $Pr(t) \le 1$ for every silent tree $t$.

Proof.

$Pr(t_n) \le 1$ for all $n$. ∎
The probability of the finite branches of $t$ is defined by $Pr_{fin}(t)$, where

(8) $Pr_{fin}(t) = \sum_{\pi \in FBranch(t)} Pr(\pi)$

and $FBranch(t)$ denotes the set of the finite branches of $t$.
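For a finite tree the sum in (8) can be computed by a straightforward recursion over the tree. A sketch with a hypothetical tree encoding, where each node carries its outgoing edges as (probability, subtree) pairs:

```python
# A tree is (label, [(p, subtree), ...]); a leaf has no children.
# branch_probs returns the probability of every branch, i.e. the
# product of the edge probabilities along each root-to-leaf path.
def branch_probs(tree):
    _, children = tree
    if not children:
        return [1.0]
    return [p * q for p, sub in children for q in branch_probs(sub)]

t = ('P', [(0.5, ('P1', [])),
           (0.5, ('P2', [(1.0, ('P3', []))]))])
# the edge probabilities out of each node sum to 1, so the branch
# probabilities of a finite tree sum to 1
assert abs(sum(branch_probs(t)) - 1.0) < 1e-9
```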
We are now in a position to generalize a branching bisimulation for CCS processes to a branching bisimulation for RCCS processes. First of all we generalize state-preserving silent transition sequences of finite length. Intuitively such a sequence turns into an $\epsilon$ tree that probabilistically contains no infinite branches. An $\epsilon$ tree $t$ is regular if $Pr_{fin}(t) = 1$. In the same line of thinking an $\epsilon$ tree ought to be divergent if, probabilistically, it has no finite branches. An $\epsilon$ tree $t$ is divergent if $Pr_{fin}(t) = 0$. The next definition is the probabilistic counterpart of Definition 3. An equivalence $\mathcal{R}$ on $\mathcal{P}_R$ is codivergent if the following is valid:

For every equivalence class, either all members of the class have divergent $\epsilon$ trees with regard to $\mathcal{R}$, or no member of the class has any divergent $\epsilon$ tree with regard to $\mathcal{R}$.
To discuss the branching bisimulation for random processes, we need to talk about a transition from a process to an equivalence class $[Q]$. This makes sense because the processes in $[Q]$ are supposed to be all equal. We would like to formalize the idea that after a finite number of state-preserving silent transitions an action is performed and the end processes are in $[Q]$. Suppose $\lambda \ne \tau$. A $\lambda$ transition from $P$ to $[Q]$ with regard to $\mathcal{R}$ consists of a regular $\epsilon$ tree $t$ of $P$ with regard to $\mathcal{R}$ and a transition $L \xrightarrow{\lambda} Q_L \in [Q]$ for every leaf $L$ of $t$. We will write $P \xrightarrow{\lambda}_{\mathcal{R}} [Q]$ if there is a $\lambda$ transition from $P$ to $[Q]$ with regard to $\mathcal{R}$. By definition $P \xrightarrow{\lambda}_{\mathcal{R}} [Q]$ whenever $P \xrightarrow{\lambda} Q' \in [Q]$.
Let’s see some examples. For the process in Example 3 one has a transition via the regular $\epsilon$ tree described by the right diagram in Example 3. For the process in Example 3 one has a transition via the regular $\epsilon$ tree described by the left diagram in Example 3. For the process in Example 3, one transition goes via the regular $\epsilon$ tree described by the left diagram, and another via the regular $\epsilon$ tree described by the right diagram.
Now consider the situation where $P$ evolves into processes in $[Q]$ with probability greater than $0$. Define the weighted probability
(9) 
Intuitively (9) is the probability that $P$ may leave the class $[P]$ silently for elements of $[Q]$. If one leaf of the regular $\epsilon$ tree can do a silent transition that leaves $[P]$ with a nonzero probability, we require that every leaf of the tree be capable of doing a silent transition that leaves $[P]$ with that probability. This probabilistic bisimulation property is observed in [8] in the simpler setting of the finite state fully probabilistic processes. In our general setting a process may do several random silent transitions caused by different random combinators. A $\tau$ transition from $P$ to $[Q]$ with regard to $\mathcal{R}$ consists of a regular $\epsilon$ tree of $P$ with regard to $\mathcal{R}$ and, for every leaf of the tree, a collective silent transition such that
We will write if there is a transition from to with regard to . An equivalence on is a branching bisimulation if (1,2) are valid.

If such that , then .

If such that , then .
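The basic ingredient of the weighted probability in (9) is the silent probability mass that escapes a class. A minimal sketch (representation hypothetical): a collective silent transition is a list of (probability, target) pairs, and a predicate decides membership in the class.

```python
# The probability mass of a collective silent transition whose targets
# fall outside the class decided by in_class.
def leaving_mass(dist, in_class):
    return sum(p for p, s in dist if not in_class(s))

dist = [(0.25, 'P1'), (0.25, 'P2'), (0.5, 'Q')]
# half of the mass stays among the P-states, half leaves for Q
assert leaving_mass(dist, lambda s: s.startswith('P')) == 0.5
```

The weighted probability of the paper normalizes such masses over an $\epsilon$ tree; the sketch only shows the escape mass of a single collective silent transition.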
Consider . The behaviour of the process can be pictured as a ring (the left diagram below), in which all nodes are equal [32, 14]. Consider a different process . Its behaviour is pictured by the right diagram below. No two nodes in the right ring can be in any branching bisimulation. For example the top node in the ring can reach the process with probability , whereas the bottom node in the ring cannot reach with probability .
The process of Example 3 and the process of Example 3 cannot be in any codivergent branching bisimulation because the former is divergent whereas the latter is not. For a relation on , let be the equivalence closure of . Clearly is a codivergent bisimulation. And is a codivergent bisimulation, where is defined in Example 3 and . Also is a codivergent bisimulation, where is defined in Example 3, , and .
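The equivalence closure of a relation, used repeatedly above, can be computed with a standard union-find. A sketch over a finite universe of states (names hypothetical):

```python
from collections import defaultdict

# Equivalence closure of a relation given as pairs: the reflexive,
# symmetric and transitive closure, returned as equivalence classes.
def equivalence_closure(pairs, universe):
    parent = {x: x for x in universe}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in pairs:
        parent[find(a)] = find(b)
    classes = defaultdict(set)
    for x in universe:
        classes[find(x)].add(x)
    return list(classes.values())

cs = equivalence_closure([('p', 'q'), ('q', 'r')], {'p', 'q', 'r', 's'})
assert len(cs) == 2 and {'p', 'q', 'r'} in cs
```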
4 Equality for Random Process
The following lemma follows immediately from the definition. If is a codivergent equivalence for every , then so is . The proof of the next fact is slightly complicated but standard.
Proposition .
If is a branching bisimulation for every , then so is .
Proof.
Let . Assume that is due to for some . Let such that . By definition there must be a family of pairwise disjoint equivalence classes of such that . Consider a transition . It consists of an $\epsilon$ tree of with regard to and, for every leaf of , a transition . We construct by induction on the structure of the tree a simulating transition . The basic idea is to construct an $\epsilon$ tree, whose nodes are all in , for every edge of . By sticking these trees together we get an $\epsilon$ tree of with regard to . Formally the bisimulation can be derived by induction.

The root of has only one child . By definition the edge from to is labeled by . If , we construct by structural induction on the tree of . If then is bisimulated by some transition consisting of an $\epsilon$ tree of with regard to and, for every leaf of , a transition for some . We then continue to construct an $\epsilon$ tree for each by induction on the structure of the tree of .

The root of has children with the corresponding edges labeled by respectively. By definition
There are two cases. In the first case for all . We construct by structural induction on the tree of say . In the second case suppose without loss of generality that . Let . Then by definition. The transition consists of a regular $\epsilon$ tree of with regard to and, for each leaf of , a collective silent transition such that
For every process the transition reaches, we continue to construct an $\epsilon$ tree of by induction on the structure of .

The root of does the transition . Then by definition.
In Figure 1 the left is a diagram for , while the right is a diagram for the stepwise bisimulation . The above itemized cases are described by the upper, middle, and bottom parts of the diagrams respectively. We still need to verify the regularity property. Given , there is a number such that . Now every edge in is bisimulated either vacuously or by an $\epsilon$ tree . There is a number such that for every such tree it holds that . It is not difficult to see that . Therefore is regular. So is bisimulated by . For the same reason is bisimulated by some . We are done by induction.
We should also consider transitions of the form for some , which can be treated in the same fashion. ∎
Proposition 4 is reassuring. We may now define the equality on RCCS processes, denoted by , as the largest codivergent branching bisimulation on . We abbreviate to in the rest of the section.
The equality is a congruence.
Proof.
It is easy to see that is closed under both the nondeterministic choice operation and the random choice operation. Consider . We prove that is a codivergent branching bisimulation. Suppose and for some equivalence class such that . Let denote the $\epsilon$ tree of in the transition. Using the technique explained in the proof of Proposition 4 it is routine to build up a transition that bisimulates . This is inductively described as follows.

An edge from to labeled is caused by a transition . In this case . If it is caused by such that for all , then obviously for each .

An edge from to labeled is caused by a transition . Then . It should be clear that .

Suppose and and . Define and and . Then is equal to . By assumption and . It follows that .

An edge from to labeled is caused by and . Then