1 Introduction
The Aalen–Johansen estimator of transition probabilities in multistate models and the derived estimator of state occupation probabilities are known to be consistent when the Markov property holds for the multistate process, which may be subject to independent censoring. A result by Datta and Satten (2001) is that the estimator of state occupation probabilities derived from the Aalen–Johansen estimator remains valid under standard assumptions even in the non-Markov case. Some steps of the argument seem to rely on martingale properties of certain processes. Although these processes are martingales in a Markov setting, it is not clear to this author that they retain the necessary martingale properties generally in a non-Markov setting. In any case, it is of interest to establish the same result without the use of martingale arguments.
In this paper, the consistency of the Aalen–Johansen-derived estimate of state occupation probabilities is established by appealing to a simple identity for the state occupation probability and to results on additive and multiplicative transforms of interval functions, which are established here. This approach offers further insight into why the consistency continues to hold in the non-Markov case.
2 The multistate setting
Consider a càdlàg multistate process with state space and time parameter space
. The state occupation probabilities are given by the row vector
with entries . With the definition , a transition matrix is defined by . The conditional probability is when and taken to be otherwise. We define a cumulative transition hazard by for where is the expected number of direct transitions from to up to time . A cumulative transition hazard matrix is then defined by having as the entry when and as the entry.
With full information on independent replications of the multistate process , the natural estimator of the state occupation probability is an average of the state occupation indicators over replications. If independent replications of are observed, take the estimate which has entries . Estimation is complicated by censoring of the multistate process . Consider a multistate process with state space fulfilling when . Then can be considered a censored version of with denoting that is unobserved. The state 0 may or may not be absorbing for . Generally, but perhaps especially when 0 is not absorbing for , the term filtering rather than censoring of for the case may be more in line with the usual terminology, for instance with the terminology from Andersen et al. (1993).
Consider independent replications of as the observed information. Let denote the number of transitions from state to and let denote the mean. For replication , we let , and an empirical mean is defined by . Similarly, we use for state occupation with expectation . For replication , we let , and an empirical mean is defined by . The Nelson–Aalen estimate of is
(1) 
With , a matrix can be defined. Based on this, the Aalen–Johansen estimate of the matrix is, as defined in Aalen and Johansen (1978),
(2) 
where denotes the identity matrix. The derived estimate of is
(3) 
for and .
It should not be surprising that tends to converge to , the observable transition hazard, rather than as desired. In order for this approach to work, we make the assumption for all with for all and . This is a weak version of an independent censoring assumption and is equivalent to assuming
(4)  
for almost all of interest for all and . This can be called the status-independent observation assumption since, for fixed , it states that among the statuses of transitioning from to at time and being in state immediately before time , the probability of observing such status does not depend on the status. This term is along the lines of Overgaard and Hansen (2019) and the equivalence mentioned can be established using the techniques of that paper. Also, in order for to be a consistent estimate of , the assumption and for , or equivalently that , the probability of observing the initial state given that the initial state is , for with is positive and does not depend on , is appropriate.
Proposition 1.
For given , assume and that for some for almost all for all and . Also, assume that and and for all for all and . Then
(5) 
in probability as uniformly for .
Proof.
Since the Markov property is not assumed to hold, the usual martingale arguments are not expected to work. In particular, the process is not expected to be a martingale since is not expected to take all past information into account. The result can be proven by taking the functional approach of Glidden (2002) in this setting. Or the result can be proven by taking a functional approach based on variation for a as laid out in Overgaard (2019) since the underlying functionals are continuous in a variation setting and since in probability for all and for such under the assumptions where is the variation norm. This yields the convergence in variation norm on and so in particular uniformly on . Either approach can be used to study the asymptotic properties of the estimator in more detail as is done in Glidden (2002). ∎
We have, by the definitions, for any ,
(6) 
and, by iterating, this leads to
(7) 
for any choice of time points , as also pointed out by Aalen et al. (2001). On the basis of (7) and Proposition 1, the remaining task of this paper is to argue that the limit over refinements exists and equals . If this holds, by taking the limit in (7),
(8) 
which is consistently estimated by the Aalen–Johansen estimators of the state occupation probabilities under some assumptions according to Proposition 1, establishing the desired result. As a consequence of Theorem 5 below the limit exists and equals as desired under an upper continuity requirement and a bounded variation requirement on the s.
On a side note, the identity (7) for the empirical distribution with time points at transition times also explains why the Aalen–Johansen estimators of state occupation probabilities are simply the observed proportions in the uncensored case as also established in section IV.4.1.4 of Andersen et al. (1993).
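As a small numerical illustration of this remark, the following Python sketch computes the Aalen–Johansen-derived state occupation estimate for fully observed paths and compares it with the observed proportions. The path encoding (a list of (time, state) pairs starting at time 0), the function names and the five example paths are assumptions made for the illustration and are not taken from the paper. Exact rational arithmetic is used so that the comparison is an equality rather than an approximation.

```python
from fractions import Fraction


def state_before(path, t):
    """State occupied just before time t (left limit) on a path [(time, state), ...]."""
    current = path[0][1]
    for time, state in path[1:]:
        if time < t:
            current = state
    return current


def state_at(path, t):
    """State occupied at time t on a right-continuous path [(time, state), ...]."""
    current = path[0][1]
    for time, state in path[1:]:
        if time <= t:
            current = state
    return current


def empirical_occupation(paths, n_states, t):
    """Observed proportion of paths in each state at time t."""
    n = len(paths)
    return [Fraction(sum(1 for path in paths if state_at(path, t) == j), n)
            for j in range(n_states)]


def aalen_johansen_occupation(paths, n_states, t):
    """Initial proportions times the ordered product of (I + Nelson-Aalen increment)."""
    n = len(paths)
    p = [Fraction(sum(1 for path in paths if path[0][1] == j), n)
         for j in range(n_states)]
    jump_times = sorted({time for path in paths for time, _ in path[1:] if time <= t})
    for u in jump_times:
        at_risk = [0] * n_states                      # counts in each state just before u
        moves = [[0] * n_states for _ in range(n_states)]  # direct transitions at u
        for path in paths:
            j, k = state_before(path, u), state_at(path, u)
            at_risk[j] += 1
            if j != k:
                moves[j][k] += 1
        # Build I + increment of the Nelson-Aalen matrix at time u.
        step = [[Fraction(int(j == k)) for k in range(n_states)] for j in range(n_states)]
        for j in range(n_states):
            if at_risk[j] == 0:
                continue  # nobody at risk: leave the identity row
            for k in range(n_states):
                if k != j:
                    increment = Fraction(moves[j][k], at_risk[j])
                    step[j][k] += increment
                    step[j][j] -= increment
        p = [sum(p[j] * step[j][k] for j in range(n_states)) for k in range(n_states)]
    return p


# Hypothetical uncensored paths on states 0, 1, 2; no Markov property is assumed.
example_paths = [
    [(0, 0), (1, 1), (4, 2)],
    [(0, 0), (2, 1), (3, 0)],
    [(0, 0), (5, 2)],
    [(0, 1), (2.5, 2)],
    [(0, 0), (1.5, 1)],
]

for t in (0, 1, 2.5, 4.5, 6):
    assert aalen_johansen_occupation(example_paths, 3, t) == empirical_occupation(example_paths, 3, t)
```

The check passes for any set of fully observed paths in this encoding, as long as each path changes state at most once at any given time, because each factor redistributes exactly the observed transitions at that time. With censored paths the at-risk counts would have to be restricted to observed paths, which this sketch does not attempt.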
3 Interval functions and their transforms
The concept of interval functions, as known partly from Gill and Johansen (1990) but especially from Dudley and Norvaiša (2011), will be at the core of the argument presented here. Consider an interval and the set of all subintervals of , denoted . An interval function is a function defined on such a . The interval functions we will consider here map into or, more generally, into the vector space of matrices, , which will be equipped with the maximum norm and the standard matrix multiplication. We let denote the identity matrix. With as the identity element, is a unital Banach algebra, satisfying for elements , and can be considered any general unital Banach algebra in the following.
We use the notation for intervals if for any choices of and . Two types of interval functions are important here:

An interval function is said to be additive if
(9) for any such that and .

An interval function is said to be multiplicative if
(10) for any such that .
Since is not generally commutative, the order of multiplication matters in the definition of a multiplicative interval function and here the stated definition is used in line with Gill and Johansen (1990) but at odds with the definition preferred by Dudley and Norvaiša (2011).
A partition of an interval is a finite set of subintervals of such that . A variation concept for interval functions is defined by
(11) 
where the supremum is over partitions of . An interval function is then of bounded variation when . In this case, a real-valued interval function is obtained by , where the supremum is over partitions of .
If both and are partitions of an interval , is called a refinement of if any is a subinterval of an interval in . For an interval , is the length of the interval, and for a partition the mesh is . Consider a function which associates any partition, , of an interval with an element . Two notions of a limit will be of interest:

If is such that for each a partition exists such that, for any refinement of , then we say that is the limit of over refinements, which is denoted by .

If is such that for each a exists such that, for any partition with , then is the limit of in mesh, which is denoted by .
It is worth noting that if is the limit of in mesh then is also the limit of over refinements. Another useful fact is that is a limit of in mesh, , if and only if for any sequence of partitions with as .
Examples of as considered above are and for partitions of with for an interval function . Limits of these lead to what will be called additive and multiplicative transforms of .

If, for a given interval function , for any , the limit over refinements of , exists, the interval function is called the additive transform of .

If, for a given interval function , for any , the limit over refinements of , exists, the interval function is called the multiplicative transform of .
Either of the transforms will be unique when it exists. Clearly, the additive transform, when it exists, is an additive interval function and the multiplicative transform, when it exists, is a multiplicative interval function.
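To make the two transforms concrete, the following Python sketch evaluates sums and ordered products over successively refined partitions for a hypothetical matrix-valued interval function. Everything in it is an assumption chosen for illustration: the interval function mu, which assigns to an interval (s, t] of length h a two-state hazard-type matrix with off-diagonal entry h + h^2 and is therefore not additive, the use of dyadic partitions of (0, 1], and the convention that the products are taken of identity plus the interval function, in time order. For this choice, the sums approach the matrix with rows (-1, 1) and (0, 0), and the products approach the matrix with rows (exp(-1), 1 - exp(-1)) and (0, 1).

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def mat_add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_mul(a, b):
    """Standard product of two 2x2 matrices (order matters)."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mu(s, t):
    """Hypothetical non-additive interval function evaluated on (s, t]."""
    h = t - s
    rate = h + h * h  # the h*h term breaks additivity but vanishes under refinement
    return [[-rate, rate], [0.0, 0.0]]


def dyadic_partition(a, b, level):
    """Partition of (a, b] into 2**level intervals (s, t]; each level refines the previous."""
    n = 2 ** level
    points = [a + (b - a) * i / n for i in range(n + 1)]
    return list(zip(points[:-1], points[1:]))


def additive_sum(partition):
    """Sum of mu over the intervals of the partition."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for s, t in partition:
        total = mat_add(total, mu(s, t))
    return total


def multiplicative_product(partition):
    """Ordered product of (I + mu) over the intervals, earliest interval first."""
    result = IDENTITY
    for s, t in partition:
        result = mat_mul(result, mat_add(IDENTITY, mu(s, t)))
    return result


for level in (2, 6, 10, 14):
    part = dyadic_partition(0.0, 1.0, level)
    print(level, additive_sum(part)[0][1], multiplicative_product(part)[0][0])
```

Dyadic partitions are only one convenient chain of refinements. The definitions in the text require the approximation to hold for every refinement of a suitable partition, which a numerical experiment of this kind can suggest but not prove.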
At this stage it is worth noting that what we are ultimately looking for is to establish the existence of a multiplicative transform of as an interval function with an expression as a product integral. Also, the product integral corresponds to the evaluation of the multiplicative transform of seen as an interval function.
In the following, somewhat stricter versions of the additive and multiplicative transforms will be useful.

A strict additive transform of an interval function is an additive interval function such that for any a partition of exists such that
(12) for any refinement of .

A strict multiplicative transform of an interval function is a multiplicative interval function such that for any a partition of exists such that
(13) for any refinement of .
By the triangle inequality, it can be seen that a strict additive transform is, in fact, an additive transform , where the limit is over refinements of partitions of , for any . Similarly, Theorem 9.34 of Dudley and Norvaiša (2011) establishes that a strict multiplicative transform is a multiplicative transform , where the limit is over refinements of partitions of , for any under the assumption that , which is implied by , for instance. What is here called a strict multiplicative transform of corresponds to a multiplicative transform of in the terminology of Dudley and Norvaiša (2011).
An important result is the following.
Theorem 2.
Consider an interval function of bounded variation. Then has a strict additive transform, , if and only if has a strict multiplicative transform, . When this happens, is also the strict multiplicative transform of and is also the strict additive transform of .
Proof.
Assume that has a strict additive transform . Since, for any partition , for any refinement of , we see from the properties of the strict additive transform that . Using the arguments of Section 2 of Gill and Johansen (1990) or of Chapter 9 of Dudley and Norvaiša (2011), it can be established that if is an additive interval function of bounded variation then the strict multiplicative transform of exists, and similarly, if is a multiplicative interval function and is of bounded variation then the strict additive transform of exists. According to this result, has a strict multiplicative transform. Let denote the strict multiplicative transform of . Then, for any partition of ,
(14) 
For any , we can find a partition such that either term on the right-hand side is smaller than whenever is a refinement of . In particular, this shows that the strict multiplicative transform of exists and corresponds to , the strict multiplicative transform of . The other implication is shown in a similar fashion, where it is important to note that if the multiplicative transform of exists then, for any partition ,
(15) 
for any refinement of since by multiplicativity if is a partition of and since such that . ∎
An interval function is said to be upper continuous if, for all , for any with as . According to Proposition 2.6 of Dudley and Norvaiša (2011), an additive interval function is upper continuous when and only when for any with , which is called upper continuity at . For a strict additive transform of an interval function , this is the case when is upper continuous at .
A function is said to be regulated if it has limits from the left as well as from the right everywhere where applicable, potentially including at and if is unbounded. In particular, a regulated function is bounded and has at most a finite number of jumps larger than any fixed . A second important result is the following.
Theorem 3.
Consider an interval function which is upper continuous at and has bounded variation and which has a strict additive transform, . Consider also a regulated function . Define an interval function by when left end point of is in and by when is not in . Then has a strict additive transform, , which is given by the Kolmogorov integral .
Proof.
Since will be additive, upper continuous and of bounded variation and is regulated, the Kolmogorov integral exists as a consequence of Theorem 2.20 and Proposition 2.25 of Dudley and Norvaiša (2011). The Kolmogorov integral satisfies , where is the variation on . We will consider a partition of with elements of the form and . Such a partition is called a Young partition in Dudley and Norvaiša (2011). Since is regulated we can, according to Theorem 2.1 of Dudley and Norvaiša (2011), find such a partition such that the oscillation of on the interval , , does not exceed a given for any . Potentially by a refinement, we can take such that also for any refinement of since is the strict additive transform of . Now, consider any refinement of and let denote any member of and, if is the left end point of , let if and if . We then have the conclusion that
(16)  
which can be made arbitrarily small by an appropriate choice of . ∎
Since is upper continuous and of bounded variation under the assumptions of Theorem 3 and since a regulated function is bounded and Borel measurable, the Kolmogorov integral of the theorem also corresponds to the Lebesgue–Stieltjes integral.
The concept of a random interval function on probability space can be introduced as a function such that is Borel measurable for all . This concept will be useful in the following.
4 Statement and proof of main result
Let us consider for some and define the interval functions that are relevant in the multistate context. We will consider an interval function with definitions
for . Here, we can again take if and similarly if . We have and whenever , and and whenever . Similarly the matrix-valued can be considered an interval function with the interval function as the th entry. Also, as an interval function is given by for an interval . This defines an additive interval function with values in . As an interval function, is a multiplicative interval function when and only when the Chapman–Kolmogorov equation for holds. In the non-Markov setting we consider, this is not generally the case.
If we consider again the multistate process , then with , , , and for is a random, upper continuous interval function. The interval functions defined, for intervals , by for will be important. These interval functions are additive and upper continuous. Since a change in state for involves at least one direct transition somewhere, we have and so for due to additivity of . The interval function is also the candidate for the strict additive transform of . Since we consider for some , we have that for any sequence of partitions with mesh converging to 0, for almost all since separates jumps when the mesh is sufficiently small. In particular, will be the strict additive transform of in this case, and is a random interval function since, for any , is the limit of Borel measurable functions like for partitions of . We define interval functions by and by . Here, is upper continuous. As an interval function, is additive and, at least if , also upper continuous.
Proposition 4.
For a given , assume for all . Then is the strict additive transform of for all .
Proof.
For any given and any sequence of partitions with as ,
(17) 
for by dominated convergence since , which is integrable under the assumption. In particular, has limit in mesh and so over refinements, which is the requirement for to be the strict additive transform of . ∎
As a consequence of Proposition 4 and the argument found in the proof of Theorem 2, we obtain when , but this conclusion holds generally in the sense that implies which can be seen as a result of Fatou’s lemma. From the pointwise bound , we obtain also.
The main result is given as follows. Recall that we are considering a bounded interval .
Theorem 5.
Assume is upper continuous at and of bounded variation. Then is the strict additive transform of .
Proof.
The assumption implies that for all and . We now consider such a and . Since when and otherwise and similarly for other types of intervals, we have . Split into and where and which are open and closed respectively relative to . For any interval of the type an exists such that and for all by Lemma 7 of the appendix. As a function on , is then regulated. With and , Theorem 3 now implies that is the strict additive transform of on . It is worth noting about that additivity and nonnegativity mean that for any interval . Also that for , generally, and for since is the strict additive transform of on any in this case and since is upper continuous by definition.
Next, an interval partition of is considered. As an open set relative to , is the countable union of open intervals, open relative to . According to Lemma 9 of the appendix, if and then either for some if or if . The first case cannot be encountered and the second case can only be encountered a finite number of times on since is dominated by for all on these types of intervals. This means that is actually a union of finitely many open intervals and this implies the existence of a partition with open intervals or and with . Additionally, we necessarily have for all . The existence of such a partition also means that can easily be established from the results above.
Following the proof of Proposition 3.50 of Dudley and Norvaiša (2011), it can be proven that upper continuity of and the assumption lead to being upper continuous at . This means that for any , we can find a such that for all intervals among and for . In particular, for any partition of such an interval . Consider . Since , the argument of Lemma 7 of the appendix leads to the existence of an such that for all . Then, as seen above, is the strict additive transform of on . This is also trivially the case when since both