Acronyms and Abbreviations
Acronyms  
Probability density function.  
ML  Maximum likelihood. 
MLEs  Maximum likelihood estimates. 
MAP  Maximum a Posteriori. 
MRE  Mean relative error. 
MSE  Mean square error. 
CP  Coverage probability. 
CI  Credibility intervals. 
NHPP  Nonhomogeneous Poisson process. 
PLP  Power law process. 
ABAO  As bad as old. 
CMLE  Conditionaly Unbiased MLE 
Notation
Intensity function.  
Causespecific intensity function.  
Cumulative intensity function.  
Causespecific Cumulative intensity function.  
Causespecific counting process.  
Continuous random variable represent failure time 
ith failure time.  
ith failure time of the cause j.  
Shape parameter of the cause j.  
Scale parameter of the cause j.  
Mean number of failures of the cause j.  
General parameters vector. 

MLE of .  
Number of failures of the system.  
Number of failures of the cause j.  
Likelihood function.  
Loglikelihood function.  
Probability density function of gamma distribution.  
Fisher information matrix.  
Jeffreys prior distribution.  
Reference prior distribution.  
Posterior distribution.  
Maximum a posteriori estimator.  
Cause of failure indicator.  
Expectation.  
Indicator function.  
Bayes Estimator of . 
I Introduction
The study of recurrent event data is important in many areas such as engineering and medicine. In the former, interest is usually centered in failure data from repairable systems. Efficient modeling and analysis of this data allows the operators of the equipments to design better maintenance policies, see [2, 3, 4].
In the repairable system literature, it is often assumed that failures occur following a NHPP with power law intensity. The resulting process is usually referred to as PLP. The PLP process is convenient because it is easy to implement, flexible and the parameters have nice interpretation. Regarding classical inference for the PLP, see, for instance, Ascher and Feingold [4] or Ridgon and Basu [3]. Bayesian inference has been considered among others by BarLev et al. [5], Guida et al. [6], Pievatolo and Ruggeri [7] and Ruggeri [8]. In this direction, Oliveira et al. [9] introduced an orthogonal parametrization ([10, 11]) of the PLP which simplifies both the analysis and the interpretation of the results.
In complex systems, such as supercomputers, airplanes or cars in the reliability area, it is common the presence of multiple causes of failure. Traditionally, models with this characteristic are known as competing risks and are analyzed as if the causes had an independent structure, or analogously, a system where their components are connected in series, that is, when a component fails the whole system fails. In this paper we advocate the use of causespecific intensity functions because they are observable quantities (unlike of the latent failure time approach) in competing risk analysis. See Pintilie [12], Crowder et al. [13] and Lawless [14] for an overview about this approach.
Recently, Somboonsavatdee and Sen [1] discussed classical inference for a repairable system that is subjected to multiple causes. Fu et al. [15] conducted a Bayesian reference analysis to model recurrent events with competing risks. They modeled failure types through a proportional intensity homogeneous Poisson process along with fixed covariates. In this framework, the objective of the authors was to estimate the causespecific intensity functions. They used three reference priors according to different orders of grouping of model parameters. We note that, in the multiparamenter case, the overall reference prior may depend on which components of the vector of parameters are considered to be of primary interest. This problem can be quite challenging (see Berger et al. [16], for a detailed discussion). Since we are interested simultaneously in all the parameters of the model, we propose an orthogonal reparametrization in the Somboonsavatdee and Sen [1] model in order to obtain a unique overall reference prior.
The main purpose of this paper is to discuss Bayesian inference under an overall reference prior for the parameters of the PLP intensities under competing risks. We prove that the overall prior is also a matching prior, i.e., the resulting marginal posterior intervals have accurate frequentist coverage [17]. The resulting posterior is proper and has interesting properties, such as onetoone invariance, consistent marginalization, and consistent sampling properties. An extensive simulation study is presented which suggests that the resulting Bayes estimates outperform the estimates obtained from the classical approach.
The rest of the paper is organized as follows: in Section II we present the motivating examples that will be used throughout of this paper. Section III presents some basic concepts regarding counting processes, repairable systems and competing risks model and objective Bayesian inference. Section IV presents the minimal repair for competing risks model and the maximum likelihood estimators of the model parameters. In Section V, we investigated s Some properties of the data are characterized and methods of estimation are studied, from a single system assuming the independence of failure modes. In Section Z we study some results when there is a lack of prior information. The case of NHPP with a power law intensity function is studied in detail with an objective Bayesian inference approach. Finally, in section W, we present the conclusion and ideas for future research.
Ii Motivating Data
Table I shows failure times and causes for a sugarcane harvester during a crop. This machine harvest an average of 20 tons of sugarcane per hour and its malfunction can lead to major losses. It can fail due to either malfunction of electrical components, of the engine or of the elevator, which are denoted as Cause 1, 2 and 3 respectively in Table I. There are 10, 24 and 14 failures for each of these causes. During the harvest time that comprehends 254 days the machine operates on a 7x24 regime. Therefore, we will assume that each repair is minimal (i.e. it leaves the machine at exactly the same condition it was right before it failed) and that data collection time truncated at T = 254 days. The recurrence of failure causes is shown in Figure 1.
Time  Cause  Time  Cause  Time  Cause  Time  Cause 

4.987  1  7.374  1  15.716  1  15.850  2 
20.776  2  27.476  3  29.913  1  42.747  1 
47.774  2  52.722  2  58.501  2  65.258  1 
71.590  2  79.108  2  79.688  1  79.794  3 
80.886  3  85.526  2  91.878  2  93.541  3 
94.209  3  96.234  2  101.606  3  103.567  2 
117.981  2  120.442  1  120.769  3  123.322  3 
124.158  2  126.097  2  137.071  2  142.037  3 
150.342  2  150.467  2  161.743  2  161.950  2 
162.399  3  185.381  1  193.435  3  205.935  1 
206.310  2  210.767  3  212.982  2  216.284  2 
219.019  2  222.831  2  233.826  3  234.641  3 
Our second data set was introduced by Somboonsavatdee and Sen [1]
. It consists of warranty claims for a fleet of automobiles classified according to three possible causes. The data describes the 372 recurrences of three causes of failures, hereafter, Cause 1, 2 and 3. The recurrence of failure causes can be seen in Figure
2.Understanding the causes and the rate of accumulation of warranty claims very important in the automobile business. Warranty claim data can be used to estimate both the reliability of the vehicles in the field and the extent of future claim payments. Hence, it will impact on costs, investment and security. For more details regarding a modern treatment of warranty claims please see [18], [19] and [20].
Iii Background
In this section, we discuss the analysis of the failures of a single unit repairable system under the action of multiple causes of recurrent and independent failures described by causespecific intensity functions.
Iiia The NHPP
We begin by considering a repairable system with only one cause of failure. Let be the number of failures before time and be the number of failures in the time interval . An NHPP with intensity function () is a counting process having independent increments and such that
(1) 
Alternatively, for each , the random variate
follows a Poisson distribution with mean
. A versatile parametric form for the intensity is(2) 
where . In this case the NHPP is said to be a PLP and its mean function is
(3) 
The scale parameter is the time for which we expect to observe a single event, while is the elasticity of the mean number of events with respect to time [9].
IiiB Competing Risks
In reliability theory, the most common system configurations are the series systems, parallel systems, and seriesparallel systems. In a series system, components are connected in such a way that the failure of a single component results in system failure (see Figure 3). A series system is also referred to as a competing risks system since the failure of a system can be classified as one of possible risks (componentS) that compete for the systems failure.
We note that we use the term risks before failure and causes afterward: the risks compete to be the cause. This paper is based on the classical assumption that the risks act independently. Thus, independent risks are equivalent to independent causes of failure.
A considerable amount of literature involving complex systems uses the assumption of stochastic independence in which is based on the physically independent functioning of components.
Although competing risks is widely observed in medical studies, recent applications can bee seen in reliability. See, for instance, Crowder et. al [21] and references therein. For instance, an application can be seen when units under observation can experience any of several distinct failure causes, in which case for each unit we observe both the time to failure and the type of failure.
IiiC The Minimal Repair Model
The main challenge when modelling repairable systems data is how to account for the effect of a repair action performed immediately after a failure. As usual, we will assume here that repair actions are instantaneous. The most explored assumptions are either minimal repair and perfect repair. In the former, it is supposed that the repair action after a failure returns the system to exactly the same condition it was immediately before it failed. In the latter, the repair action leaves the system as if it were new. In the engineering literature this type of repair or of corrective maintenance are usually called ABAO and AGAN, for as bad as old and as good as new respectively ([22, 23, 24, 25, 26]). More sophisticated models which account for a repair action that leaves the system somewhere between the ABAO and AGAN conditions are possible, although they will not be considered here (see, for instance, [27]).
Under minimal repair, the failure history of a repairable system is modelled as a NHPP. As mentioned above, the PLP (2) provides a flexible parametric form for the intensity of the process. Under the time truncation design, i.e. when failure data is collected up to time , the likelihood becomes
(4) 
where we assume that it has been observed failures at times (see, for instance, Rigdon and Basu [3]).
Oliveira et al [9] suggest reparametrizing the model in terms of and , where
(5) 
so that the likelihood becomes
(6)  
where , is the maximum likelihood estimator of and is the PDF of the gamma distribution with shape and scale parameters and , respectively. From (6) it follows that and are orthogonal (for the advantages of having orthogonal parameters see Cox and Reid [10]).
IiiD Bayesian Inference
The use of Bayesian methods has grown due to the advance of computational techniques. This approach is very attractive especially to construct credibility intervals for the parameters of the model. While in the classical approach the obtained intervals lie on the assumptions of asymptotic results, under the Bayesian approach such results can be obtained directly from the posterior density.
In this context, the prior distribution used to obtain the posterior quantities is of primary concern. Historical data or expert knowledge can be used to obtain a prior distribution. However, the elicitation process may be difficult and time consuming. An alternative is to consider objective priors, in this case, we want to select a prior distribution in which the information provided by the data will not be obfuscated by subjective information. Based on the above reasons, we focused on objective Bayesian analysis for the parameters of the causespecific intensity functions. The formal rules to obtain such priors are discussed below.
IiiE Jeffreys Prior
Jeffreys [28] proposed a rule for deriving a noninformative prior which is invariant to any onetoone reparametrization. The Jeffreys’ prior is one of the most popular objective priors and is proportional to the square root of the determinant of the expected Fisher information matrix , i.e.,
Although Jeffreys prior performs satisfactorily in one parameter cases, Jeffreys himself noticed that it may not adequate for the multiparameter case. Indeed, such prior can lead to marginalization paradoxes and strong inconsistencies (see Bernardo [29, pg. 41]).
IiiF Reference Prior
Bernardo [30]
introduced a class of objective priors known as reference priors. Such class of priors maximize the expected KullbackLeibler divergence between the posterior distribution and the prior. The reference prior has minimal influence in a precise informationtheoretic sense. that separated the parameters into the parameters of interest and nuisance parameters. To derive the reference prior function one need to set the parameters according to their order of inferential importance (see for instance,
[30] and [29]). The main problem is that different ordering of the parameters return different priors and the selection of the more adequate prior may be quite challenging.To overcome this problem Berger et al. [16] discussed different procedures to construct overall reference prior for all parameters. Additionally, under certain conditions, such prior is unique in the sense of being the same regardless the ordering of the parameters. To obtain such prior the expected Fisher information matrix must have a diagonal structure. The following result can be used to obtain the overall reference prior.
Theorem III.1.
[Berger et al. [16]] Suppose that the Fisher information matrix of is of the form
where , is a diagonal matrix, is a positive function of and is a positive function of , for . Then, the reference prior, for any chosen interest parameter and any ordering of nuisance parameters, is given by
(7) 
The reference posterior distribution has desirable theoretical properties such as invariance under onetoone transformations of the parameters, consistency under marginalization and consistent sampling properties.
IiiG Matching Priors
Researchers attempted to evaluate inferential procures with good coverage errors for the parameters. While the frequentist methods usually relies on asymptotic confidence intervals, under the Bayesian approach formal rules have been proposed to derive such estimators. Tibshirani
[17] discussed sufficient conditions to derive a class of noninformative priors where is the parameter of interest so that the credible interval for has a coverage error () in the frequentist sense, i.e.,(8) 
where denote the
th quantile of the posterior distribution of
. The class of priors satisfying (8) are known as matching priors [31].To obtain such priors, Tibshirani [17] proposed to reparametrize the model in terms of the orthogonal parameters in the sense discussed by Cox & Reid [10]. That is, for all , where is the parameter of interest and is the orthogonal nuisance parameter. In this case, the matching priors are all priors of the form
(9) 
where () is an arbitrary function and , ) is the entry of the Fisher Information Matrix. The same idea is applied to derive priors when there are a vector of nuisance parameters. In the present study, we considered an orthogonal reparametrization in order to obtain priors with matching priors properties.
IiiH Bayesian Point Estimators
There are different types of Bayesian estimators, the three most commonly used are the posterior mean, the posterior mode and the posterior median. Here we considered the posterior mode, that is usually refer as MAP estimate, due to structure that has a simple closedform expression and can be rewritten as a bias corrected MLE. A similar approach considering MAP estimates has been considered by Ramos et al. [32]
to derive nearly unbiased estimator for the Nakagami distribution. One can define a maximum a posteriori estimator,
, which is given by maximizing the posterior distribution, i.e.,(10)  
Iv Modeling Minimal Repair with Competing Risks
The assumptions of the repairable system under examination is that the components can perform different operations, and thus be subject to different types of failures. Hence, in our model there are causes of failure. At each failure, the cause of failure (or failure mode) is denoted by for . If failures have been observed in , then we observe the data , where are the system failure times and the s represent the associated failure cause with th failure time.
One can introduce a counting process whose behavior is associated with the causespecific intensity function
(11) 
Hence, the global system failure process is a superposition of NHPPs and its intensity is
(12) 
The causespecific and the system cumulative intensities are respectively
(13) 
where we use the PLP intensity (2) for each cause, so that
(14) 
for and
(15) 
Iva Maximum Likelihood Estimation
Recalling that causes of failure act independently and they are mutually exclusive, so the general form of the likelihood function can be written as
(16)  
where represents the indicator function of the cause associated with th time of failure and .
To better understand how to compute the likelihood, we consider the following cases.
Case ,
Initially, we obtain the MLEs with only two independent failure causes, and in which the system has been observed until a fixed time . Resulting in
(17)  
where ; ; and . The MLEs are
(18) 
Case ,
For the case of the different shape parameters, , the system intensity function is no longer a PLP. The likelihood function is given by
(19)  
where , and the MLEs are
(20) 
Note that in the MLEs only exist if for .
Case , at least two s are different
The likelihood function is given by
(21) 
and the MLEs are
(22) 
where again the MLEs exist only if for .
V Objective Bayesian Inference for the Model
In this section, we present an objective Bayesian inference for the framework discussed so far by considering the reparametrization given in (5) in order to obtain an orthogonal structure in the Fisher information matrix, and as a result, a unique objective prior.
Case ,
Denote by the common value of and . The likelihood function considering the reparametrization is given by
(23)  
where now . The loglikelihood is given by
The MLE for is the same as presented in (18). On the other hand, the MLEs for are . To compute the Fisher Information Matrix, note that the partial derivatives are
(note that, since we are considering time truncation, both and the ’s are random). Hence, the expectation of the second derivatives above are given by
The Fisher information matrix is diagonal and given by
The first step in this approach begins by obtaining the Jeffreys prior distribution which is given by
(24) 
Proposition V.1.
The Jeffreys prior (24) is a matching prior for .
Proof.
Let be the parameter of interest and denote by the nuisance parameter. First, since the information matrix is diagonal, the Jeffreys’ prior can be written in the form (9) ∎
The joint posterior distribution for and produced by Jeffreys’ prior is proportional to the product of the likelihood function (23) and the prior distribution (24) resulting in
(25)  
The posterior (25) do not have closedform, this implies that it may be improper, which is undesirable. Moreover, to obtain the necessary credibility intervals we would have to resort to Monte Carlo methods. To overcome these problems, we propose the alternative reference prior described below.
based on .
Note that, if in Theorem III.1 we take , , , , and , the overall reference prior is
(26) 
Proposition V.2.
The overall reference prior (26) is a matching prior for all the parameters.
Proof.
If is the parameter of interest and , then the proof is analogous to that for the Jeffreys’ prior above but considering . If is the parameter of interest and are the nuisance parameters. Then, as and . Hence, the overall reference prior (26) can be written in the form (9). The case that is the parameter of interest is similar. ∎
The posterior distribution when using the overall reference prior (26) is
(27)  
that is,
(28) 
which is the product of independent gamma densities. Clearly, if there is at least one failure for each cause, this posterior is proper.
The marginal posterior distributions are given by
(29) 
and
(30) 
Note that, as was proved in Proposition V.2, the marginal posterior intervals have accurate frequentist coverage for all parameters.
From the posterior marginal the Bayes estimator using the MAP for is given by
(31) 
Throughout the rest of the paper we will denote the Bayes estimators of . Rigdon and Basu [3] argued that
(32) 
i.e., the proposed estimator is unbiased. Although they called CMLE, there is, however, no theoretical justification for its derivation. Here we provided a natural approach to obtain unbiased estimators for by considering the reference posterior.
In the case of Bayes estimators for , note that,
(33) 
These estimators are biased specially when are small. On the other hand
(34) 
Since , this implies that as increase, . Therefore, as a Bayes estimator, we choose the unbiased estimator
(35) 
Ramos et al. [33] presented a similar approach to obtain Bayes estimators for another distribution. It is noteworthy that the credibility interval must be evaluated considering the quantile function of the to satisfy the matching prior properties.
Case
In this case we substitute (5) in (17) too obtain the likelihood function
(36)  
where and The MLEs have explicit solutions
(37) 
Since , for , the Fisher information matrix is
The Jeffreys prior is given by
(38) 
Proposition V.3.
The Jeffreys prior (38) is matching prior for and .
Proof.
The joint posterior obtained using Jeffreys’ prior (38) is
(39) 
Therefore, the Bayes estimators using the MAP are
(40) 
On the other hand, considering Theorem III.1, the overall prior distributions is
(41) 
Proposition V.4.
The overall reference prior (41) is a matching prior for all parameters.
Proof.
The proofs for and follow the same steps as in the proof of Proposition V.3. The cases of and of follow directly from Proposition V.2.
∎
The reference posterior distribution is given by
(42) 
Hence, the MAP estimator for is given by
(43) 
To obtain the Bayes estimators for we considered the same argument described in the last section, which gives in this case
(44) 
Case 3: p causes of failure and different s
The likelihood function in this case is
(45) 
where . The MLEs have explicit solutions
(46) 
and the Fisher information matrix is
The Jeffreys prior is
(47) 
which gives the posterior distribution
(48) 
To prove that (47) is a matching prior for we can consider the same steps of the proof of Proposition V.3.
Scenario 1  Scenario 2  Scenario 3  Scenario 4  Scenario 5  

Parameter  Method  MRE  MSE  MRE  MSE  MRE  MSE  MRE  MSE  MRE  MSE 
MLE  1.2401  2.5087  1.0411  0.1484  1.2897  3.2159  1.2344  2.6494  1.1636  0.0255  
Bayes  0.9991  0.9509  1.0000  0.1313  0.9993  1.1567  0.9995  1.0319  0.9993  0.0136  
MLE  1.5693  6.7586  1.5272  13.0190  1.0812  0.0740  1.0771  0.0527  1.0101  0.0426  
Bayes  0.9980  1.8000  1.0008  3.4447  0.9998  0.0576  0.9998  0.0415  0.9999  0.0413  
MLE  1.0102  6.1686  0.9997  26.4448  1.0210  5.1477  1.0091  6.2929  1.0015  8.3588  
Bayes  1.0102  6.1686  0.9997  26.4448  1.0210  5.1477  1.0091  6.2929  1.0015  8.3588  
MLE  1.2306  2.2679  1.1700  2.5335  0.9997  14.4912  0.9998  15.1111  0.9999  99.8225  
Bayes  1.2306  2.2679  1.1700  2.5335  0.9997  14.4912  0.9998  15.1111  0.9999  99.8225 
Method  Scenario 1  Scenario 2  Scenario 3  Scenario 4  Scenario 5  

MLE  0.9556  0.9523  0.9555  0.9555  0.9553  
CMLE  0.8740  0.9357  0.8599  0.8759  0.8958  
Jeffreys  0.9503  0.9501  0.9503  0.9502  0.9503  
Reference  0.9503  0.9501  0.9503  0.9502  0.9503  
MLE  0.9538  0.9538  0.9538  0.9537  0.9501  
CMLE  0.7869  0.7989  0.9226  0.9237  0.9460  
Jeffreys  0.9501  0.9498  0.9499  0.9499  0.9497  
Reference  0.9501  0.9498  0.9499  0.9499  0.9497  
MLE  0.8884  0.9325  0.9352  0.8943  0.9196  
CMLE  0.8884  0.9325  0.9352  0.8943  0.9196  
Jeffreys  0.9674  0.9494  0.9715  0.9636  0.9661  
Reference  0.9338  0.9494  0.9715  0.9518  0.9447  
MLE  0.9971  0.9940  0.9438  0.9218  0.9453  
CMLE  0.9971  0.9940  0.9438  0.9218  0.9453  
Jeffreys  0.9203  0.9514  0.9365  0.9481  0.9497  
Reference  0.9707  0.9514  0.9526  0.9439  0.9552 
Finally, the reference prior using Theorem III.1 is
(49) 
Thus, in this case, the joint posterior distribution is
(50) 
Proposition V.5.
The overall reference prior (50) is matching prior for all parameters.
Proof.
The proof is essentially the same as that of Proposition V.4. ∎
Using the same approach of the case where , the Bayes estimators are
(51) 
and, for ,
(52) 
Vi Simulation Study
In this section we present a simulation study to compare the Bayes estimators and the MLEs. We used two criteria to evaluate the estimators behaviour: the mean relative error (MRE)
and the mean square error (MSE)
where is is the number of estimates (i.e. the Monte Carlo size), which we take throughout the section, and is the vector of parameters.
Additionally, for the objective bayesian credibility intervals and the asymptotic maximum likelihood based confidence intervals for , , and we computed the interval coverage probability, denoted by . Good estimators should have MRE close to one and MSE close to zero and good intervals should be short while showing close to 95%. We also considered a confidence interval based on the CMLE obtained from the unbiased estimator (32
) and the asymptotic variances estimated from the Fisher information matrix, similar to what is done to obtain the ML interval.
The results were computed using the software R. Below we show the results for a single system subject to two causes of failure. We assumed that the twocomponent system was observed on the fixed time interval and considered five different scenarios for and the parameters:

Scenario 1: ;

Scenario 2: ;

Scenario 3: ;

Scenario 4: ;

Scenario 5: ;
The values of the parameters were selected in order to obtain different samples sizes. The results were presented only for these five scenarios due to the lack of space. However, the obtained results are similar for other choices of the parameters. Using the fact that the causes are independent and well known results about NHPPs [3], for each Monte Carlo replication the failure times were generated as follows:

Step 1: For each cause of failure, generate random numbers ().

Step 2: For each cause of failure, let the failure times be , where
Comments
There are no comments yet.