# Posterior properties of the Weibull distribution for censored data

The Weibull distribution is one of the most widely used tools in reliability analysis. In this paper, assuming a Bayesian approach, we propose necessary and sufficient conditions to verify when improper priors lead to proper posteriors for the parameters of the Weibull distribution in the presence of complete or right-censored data. Additionally, we propose sufficient conditions to verify whether the resulting posterior moments are finite. These conditions depend only on the behavior of the improper prior, and we apply them to several objective priors to illustrate the usefulness of the new results. As an application of our theorem, we prove that whenever such an improper prior leads to a proper posterior, the posterior mean, as well as the higher moments, of the scale parameter is not finite and, therefore, should not be used.


## 1 Introduction

The Weibull distribution plays a central role in reliability analysis as one of the most important generalizations of the exponential distribution. Introduced by Weibull weibull1951statistical, this distribution has been used in many applications as a baseline distribution. Let $X$ be a non-negative random variable with the Weibull distribution given by

$$f(x\mid\eta,\beta)=\beta\eta^{\beta}x^{\beta-1}e^{-\eta^{\beta}x^{\beta}},\qquad x>0, \tag{1}$$

where $\eta>0$ and $\beta>0$ are unknown scale and shape parameters, respectively.
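
As a quick sanity check on this parametrization, the following minimal sketch evaluates the density and verifies numerically that it integrates to one. It assumes the density has the form $\beta\eta^{\beta}x^{\beta-1}\exp\{-(\eta x)^{\beta}\}$; the function names and the midpoint-rule integrator are illustrative choices, not part of the paper.

```python
import math

def weibull_pdf(x, eta, beta):
    """Density beta * eta^beta * x^(beta-1) * exp(-(eta*x)^beta) for x > 0."""
    if x <= 0:
        return 0.0
    return beta * eta**beta * x ** (beta - 1) * math.exp(-((eta * x) ** beta))

def weibull_survival(x, eta, beta):
    """Survival function exp(-(eta*x)^beta), the contribution of a right-censored point."""
    return math.exp(-((eta * x) ** beta))

def integrate(f, a, b, steps=20000):
    """Composite midpoint rule on [a, b] (avoids evaluating f at the endpoints)."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# The density should integrate to (approximately) one.
total = integrate(lambda x: weibull_pdf(x, eta=2.0, beta=1.5), 0.0, 10.0)
print(total)
```

With $\beta=1$ the density reduces to the exponential density with rate $\eta$, which is a convenient special case for checking an implementation.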

The maximum likelihood estimators, as well as other frequentist methods of inference for the Weibull distribution, are standard in the reliability literature teimouri2013comparison. From the Bayesian point of view, a prior distribution must be chosen for the parameters. Objective priors have already been derived for the Weibull distribution in the literature considering complete data. For instance, Sun sun1997note discussed many objective priors for the Weibull model, such as the Jeffreys prior jeffreys1946theory, reference priors bernardo1979a; bernardo2005; berger2015 and matching priors tibshirani1989, and Moala et al. moala2009note considered these priors as well as the maximum data information (MDI) prior zellner1; zellner2 for the reliability function. Some recent derivations of objective priors for related functions of the Weibull distribution can be seen in lee2015noninformative; kang2017noninformative; lee2017objective.

A major drawback related to the use of objective priors is that they are improper and could lead to improper posteriors. Northrop and Attalides northrop2016 argued that “there is no general theory providing simple conditions under which improper prior yields a proper posterior for a particular model, so this must be investigated case-by-case”. In this paper, for the Weibull distribution, we overcome this problem by providing necessary and sufficient conditions to check when a wide class of posterior distributions is proper. We then use this theorem to investigate whether several improper priors lead to proper posterior distributions. Another important issue is the computation of the posterior moments: even if the posterior distribution is proper, the related posterior moments (posterior mean, posterior variance, among others) can be infinite. Thus, for the Weibull model, we also discuss sufficient conditions for the posterior moments to be finite. Based on our findings, future researchers will be able to check easily whether a given posterior is proper or improper, as well as whether its posterior moments are finite, by analyzing the behavior of the improper prior directly. Although proofs of posterior propriety have been considered for complete data, there is no similar proof for random censoring in the literature. Hence, our results allow the use of objective Bayesian analysis for the Weibull distribution in the presence of censoring. The new results are applied to different objective priors such as independent uniform priors, Jeffreys’ first rule kass1996selection, Jeffreys’ prior jeffreys1946theory, the maximal data information (MDI) prior zellner1; zellner2 and reference priors bernardo1979a; bernardo2005; berger2015.

The remainder of this paper is organized as follows. Section 2 presents a theorem that provides necessary and sufficient conditions for the posterior distributions to be proper, together with sufficient conditions to check whether the posterior moments of the parameters are finite. Section 3 applies the main theorem to different objective priors. Section 4 extends the results to another commonly used parametrization. Finally, Section 5 summarizes the study. Preliminary results and the proof of the main theorem are given in the appendices.

## 2 Posterior distribution

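Let $X_1,\ldots,X_n$ be an independent and identically distributed sample where $X_i\sim\text{Weibull}(\eta,\beta)$. Moreover, suppose that the $i$th individual has a lifetime $X_i$ and a censoring time $C_i$, that the random censoring times $C_i$ are independent of the $X_i$, and that their distribution does not depend on the parameters. The data set is then $(x_i,\delta_i)$, $i=1,\ldots,n$, where $x_i=\min(X_i,C_i)$ and $\delta_i=I(X_i\leq C_i)$ is the censoring indicator, equal to 1 for a non-censored observation and 0 for a right-censored one. Considering a prior distribution $\pi(\eta,\beta)$, the joint posterior distribution for $(\eta,\beta)$ is given by

$$\pi(\eta,\beta\mid x)=\frac{\pi(\eta,\beta)}{d(x)}\,\eta^{m\beta}\beta^{m}\left(\prod_{i=1}^{n}x_i^{\delta_i\beta}\right)\exp\left\{-\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}\right\}, \tag{2}$$

where $m=\sum_{i=1}^{n}\delta_i$ is the number of non-censored observations, and $d(x)$ is a normalizing constant of the form

$$d(x)=\int_0^\infty\int_0^\infty\pi(\eta,\beta)\,\eta^{m\beta}\beta^{m}\left(\prod_{i=1}^{n}x_i^{\delta_i\beta}\right)\exp\left\{-\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}\right\}d\eta\,d\beta. \tag{3}$$

Here, our purpose is to find necessary and sufficient conditions for the posterior distribution to be proper for a general class of priors. The proof of the following theorem is left to the appendix.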
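
The unnormalized posterior in (2) is straightforward to evaluate on the log scale. The sketch below is one possible implementation, assuming `x` holds the observed times, `delta` the indicators (1 for non-censored, 0 for censored) and `log_prior` a user-supplied function; these names are hypothetical and not taken from the paper.

```python
import math

def log_posterior_kernel(eta, beta, x, delta, log_prior):
    """Log of the numerator of (2): prior times censored-data Weibull likelihood kernel."""
    m = sum(delta)  # number of non-censored observations
    log_lik = (
        m * beta * math.log(eta)
        + m * math.log(beta)
        + beta * sum(d * math.log(xi) for xi, d in zip(x, delta))
        - eta**beta * sum(xi**beta for xi in x)
    )
    return log_prior(eta, beta) + log_lik

def flat_log_prior(eta, beta):
    """Uniform prior on (0, inf)^2, used here only as an example."""
    return 0.0

value = log_posterior_kernel(0.5, 2.0, [2.0, 3.0], [1, 0], flat_log_prior)
print(value)
```

Only the non-censored observations enter the product term, while every observation, censored or not, contributes to the sum in the exponent.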

###### Theorem 2.1.

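Let $\pi(\eta,\beta)$ be a class of priors such that

$$\pi(\eta,\beta)\propto\exp(-p\beta^{-1})\,\eta^{r}\beta^{q}$$

where $p$, $q$, $r$ are constants. Then:

• If $r\neq-1$, then the posterior distribution of $(\eta,\beta)$ under the prior $\pi(\eta,\beta)$ is improper.

• If $r=-1$, $p=0$ and at least two non-censored data are distinct, then the posterior distribution of $(\eta,\beta)$ under the prior $\pi(\eta,\beta)$ is proper if and only if $q>-m$.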

• If $r=-1$, $p=0$ and $m=0$, then the posterior distribution of $(\eta,\beta)$ under the prior $\pi(\eta,\beta)$ is improper.
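
The first two items of the theorem lend themselves to a mechanical check. The following minimal sketch assumes they read as follows: the posterior is improper whenever $r\neq-1$; and for $r=-1$, $p=0$ with at least two distinct non-censored observations, it is proper if and only if $q>-m$. The function name `posterior_status` and its return strings are illustrative, and the third item is deliberately not encoded.

```python
def posterior_status(p, q, r, m, two_distinct_uncensored=True):
    """Classify a prior exp(-p/beta) * eta^r * beta^q under the assumed reading of Theorem 2.1."""
    if r != -1:
        return "improper"  # first item: any r different from -1
    if p == 0 and two_distinct_uncensored:
        return "proper" if q > -m else "improper"  # second item
    return "not covered by this sketch"

# Example priors, with m = 5 non-censored observations.
print(posterior_status(p=0, q=0, r=0, m=5))    # independent uniform priors
print(posterior_status(p=0, q=-1, r=-1, m=5))  # prior proportional to 1/(eta*beta)
print(posterior_status(p=0, q=0, r=-1, m=5))   # prior proportional to 1/eta
```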

###### Corollary 2.2.

Let $\pi(\eta,\beta)$ be a class of improper priors such that

$$\pi(\eta,\beta)\propto\eta^{r}\beta^{q}$$

where $q$ and $r$ are constants, and suppose the posterior related to $\pi(\eta,\beta)$ is proper. Then the posterior moments relative to $\beta$ are finite, and the posterior moments relative to $\eta$ are not finite, for both complete and censored data.

###### Proof.

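Since the posterior is proper, by Theorem 2.1 we have $r=-1$ and $q>-m$.

Now, given $k>0$ and $\pi^{*}(\eta,\beta)=\beta^{k}\pi(\eta,\beta)\propto\eta^{r}\beta^{q+k}$, since $q+k>-m$, from Theorem 2.1 it follows that the posterior relative to the prior $\pi^{*}$ is proper and thus

$$E[\beta^{k}\mid x]=\frac{1}{d(x)}\int_0^\infty\int_0^\infty\beta^{k}\,\pi(\eta,\beta)\,\eta^{m\beta}\beta^{m}\left(\prod_{i=1}^{n}x_i^{\delta_i\beta}\right)\exp\left\{-\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}\right\}d\eta\,d\beta<\infty.$$

Analogously, given $k>0$ it follows that $\eta^{k}\pi(\eta,\beta)\propto\eta^{r^{*}}\beta^{q}$, where $r^{*}=r+k\neq-1$, and thus from Theorem 2.1 it follows that

$$E[\eta^{k}\mid x]=\frac{1}{d(x)}\int_0^\infty\int_0^\infty\eta^{k}\,\pi(\eta,\beta)\,\eta^{m\beta}\beta^{m}\left(\prod_{i=1}^{n}x_i^{\delta_i\beta}\right)\exp\left\{-\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}\right\}d\eta\,d\beta=\infty,$$

which concludes the proof. ∎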
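
To build intuition for why the moments of $\eta$ diverge, one can look at the moment of $\eta$ conditional on $\beta$. Under a prior with $r=-1$, the substitution $u=\eta^{\beta}\sum_i x_i^{\beta}$ gives $E[\eta^{k}\mid\beta,x]=\Gamma(m+k/\beta)/\{\Gamma(m)(\sum_i x_i^{\beta})^{k/\beta}\}$, which grows without bound as $\beta\to0$. The sketch below evaluates the logarithm of this expression; it is a heuristic illustration under these assumptions, not a replacement for the proof, and the data values are made up.

```python
import math

def log_conditional_moment(k, beta, x, delta):
    """log E[eta^k | beta, x] for a prior with r = -1, via the Gamma-function identity."""
    m = sum(delta)
    s = sum(xi**beta for xi in x)
    return math.lgamma(m + k / beta) - math.lgamma(m) - (k / beta) * math.log(s)

x = [0.8, 1.3, 2.1, 2.4]   # hypothetical observed times
delta = [1, 1, 0, 1]       # 1 = non-censored, 0 = right-censored

for beta in (0.5, 0.1, 0.05, 0.01):
    print(beta, log_conditional_moment(1, beta, x, delta))
```

The printed values increase rapidly as $\beta$ decreases, reflecting the super-exponential growth of the Gamma function in $1/\beta$.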

## 3 Applications

Here, we apply the proposed methodology to different objective priors to verify whether the posterior distributions are proper and whether their posterior moments are finite. Once again, in the following we let $m=\sum_{i=1}^{n}\delta_i$. Moreover, all priors considered below take the form $\pi(\eta,\beta)\propto\exp(-p\beta^{-1})\eta^{r}\beta^{q}$ where either $r\neq-1$ or $p=0$. Thus, due to Theorem 2.1, the posterior relative to such priors is improper in the case $m=0$. Therefore, in the following propositions, we shall be concerned only with cases where $m>0$, and under this hypothesis, we shall always suppose that there are at least two distinct non-censored data.

A common objective prior is obtained by considering independent uniform priors. In the case of positive parameters, the prior is given by $\pi_1(\eta,\beta)\propto1$.

###### Proposition 3.1.

The posterior density using the uniform prior is improper for all $m$.

###### Proof.

Notice that given $\pi_1(\eta,\beta)\propto1$, the hypotheses of Theorem 2.1 hold for $p=0$, $q=0$ and $r=0$, and since $r\neq-1$, the conclusion of Theorem 2.1 implies that the posterior relative to $\pi_1$ is improper. ∎

Another common objective prior distribution can be obtained using Jeffreys’ first rule (see Sun sun1997note). This prior is invariant under power transformations. As the parameters of the Weibull distribution are contained in the interval $(0,\infty)$, the prior distribution using Jeffreys’ rule is given by

$$\pi_2(\eta,\beta)\propto\frac{1}{\eta\beta}. \tag{4}$$

###### Proposition 3.2.

The posterior density using the prior (4) is proper for all $m>1$, in which case the posterior moments relative to $\beta$ are finite and the posterior moments relative to $\eta$ are not finite.

###### Proof.

If we let $\pi_2(\eta,\beta)\propto\eta^{-1}\beta^{-1}$, the hypotheses of Theorem 2.1 are valid for $p=0$, $q=-1$ and $r=-1$, and thus, applying Theorem 2.1, it follows that the posterior obtained from the prior $\pi_2$ is proper for all $m>1$. The additional conclusions follow directly from Corollary 2.2. ∎

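The most well-known objective prior was introduced by Jeffreys jeffreys1946theory and carries his name. The Jeffreys prior is obtained from the square root of the determinant of the Fisher information matrix $I(\eta,\beta)$, that is

$$\pi_3(\eta,\beta)\propto\sqrt{\det I(\eta,\beta)} \tag{5}$$

where $I(\eta,\beta)$ is the Fisher information matrix given by

$$I(\eta,\beta)=n\begin{bmatrix}\dfrac{\beta^{2}}{\eta^{2}} & \dfrac{1-\gamma}{\eta}\\[2mm] \dfrac{1-\gamma}{\eta} & \dfrac{1}{\beta^{2}}\left(\dfrac{\pi^{2}}{6}+(1-\gamma)^{2}\right)\end{bmatrix}, \tag{6}$$

and $\gamma\approx0.5772$ is known as the Euler-Mascheroni constant.

This prior is widely used due to its invariance property under one-to-one transformations of parameters. For the Weibull distribution, computing the square root of the determinant of $I(\eta,\beta)$, we have $\det I(\eta,\beta)=n^{2}\pi^{2}/(6\eta^{2})$ and therefore

$$\pi_3(\eta,\beta)\propto\frac{1}{\eta}. \tag{7}$$
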
###### Theorem 3.3.

The posterior density using the prior (7) is proper for all $m>0$, in which case the posterior moments relative to $\beta$ are finite and the posterior moments relative to $\eta$ are not finite.

###### Proof.

By letting $\pi_3(\eta,\beta)\propto\eta^{-1}$, the hypotheses of Theorem 2.1 hold if we assume $p=0$, $q=0$ and $r=-1$. Thus, from the conclusion of Theorem 2.1 it follows that the posterior relative to the prior $\pi_3$ is proper for all $m>0$. The additional conclusions follow directly from Corollary 2.2. ∎
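
The determinant computation behind the Jeffreys prior can be checked numerically. The sketch below assumes the Fisher information matrix has diagonal entries $n\beta^{2}/\eta^{2}$ and $n(\pi^{2}/6+(1-\gamma)^{2})/\beta^{2}$ and off-diagonal entries $n(1-\gamma)/\eta$; the function names are illustrative. If the prior is proportional to $1/\eta$, then $\eta\sqrt{\det I}$ should not depend on $(\eta,\beta)$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def fisher_information(eta, beta, n=1):
    """2x2 Fisher information matrix for (eta, beta), as nested lists."""
    c = 1.0 - EULER_GAMMA
    return [
        [n * beta**2 / eta**2, n * c / eta],
        [n * c / eta, n * (math.pi**2 / 6 + c**2) / beta**2],
    ]

def jeffreys_kernel(eta, beta, n=1):
    """Square root of the determinant of the Fisher information matrix."""
    (a, b), (c, d) = fisher_information(eta, beta, n)
    return math.sqrt(a * d - b * c)

# eta * sqrt(det I) is the same constant for every (eta, beta).
print(2.0 * jeffreys_kernel(2.0, 0.7), 5.0 * jeffreys_kernel(5.0, 3.0))
```

For $n=1$ both printed values agree with $\pi/\sqrt{6}$ up to floating-point error, so the kernel depends on the parameters only through $1/\eta$.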

Zellner zellner1; zellner2 introduced an objective prior based on the Shannon entropy. The maximal data information (MDI) prior emphasizes the information in the likelihood function, so that the information it carries is weak in comparison with the data information. This prior is obtained by

$$\pi_4(\eta,\beta)\propto\exp(H(\eta,\beta)), \tag{8}$$

where $H(\eta,\beta)$ is the information measure given by

$$H(\eta,\beta)=\int_0^\infty\log\left[\beta\eta^{\beta}x^{\beta-1}\exp\{-(\eta x)^{\beta}\}\right]f(x\mid\eta,\beta)\,dx.$$

The MDI prior has invariance properties only for linear transformations on the parameters. For the Weibull distribution, using $E[\log X]=-\log\eta-\gamma/\beta$ and $E[(\eta X)^{\beta}]=1$, $H(\eta,\beta)$ can be written as

$$H(\eta,\beta)=\log(\eta\beta)-\gamma\left(1-\frac{1}{\beta}\right)-1.$$

Therefore the MDI prior (8) for the Weibull distribution is given by

$$\pi_4(\eta,\beta)\propto\eta\beta\exp(\gamma\beta^{-1}). \tag{9}$$

###### Theorem 3.4.

The posterior distribution obtained from the MDI prior (9) is improper for all $m$.

###### Proof.

Letting $\pi_4(\eta,\beta)\propto\eta\beta\exp(\gamma\beta^{-1})$, notice that the hypotheses of Theorem 2.1 are valid for $p=-\gamma$, $q=1$ and $r=1$. Thus, since $r\neq-1$, Theorem 2.1 implies directly that the posterior relative to $\pi_4$ is improper. ∎

An important non-informative prior was introduced by Bernardo bernardo1979a, with further developments given, for instance, by Bernardo bernardo2005 and Berger et al. berger2015. This prior is referred to as the reference prior, and it is obtained by maximizing the expected Kullback-Leibler divergence between the posterior distribution and the prior. The reference prior provides posterior distributions with interesting properties, such as invariance, consistent marginalization, and consistent sampling properties.

The algorithm to derive the reference prior is described in detail in Bernardo bernardo2005. However, since the Fisher information has a particular form, we present one corollary that allows the reference priors to be obtained quickly for the Weibull distribution.

###### Corollary 3.5.

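Considering that $\theta$ is the parameter of interest and $\lambda$ is the nuisance parameter, suppose the posterior distribution of $(\theta,\lambda)$ is asymptotically normal with dispersion matrix $S(\theta,\lambda)=I^{-1}(\theta,\lambda)$. If the parameter space of $\lambda$ is independent of $\theta$ and the functions $S_{\theta,\theta}(\theta,\lambda)$ and $I_{\lambda,\lambda}(\theta,\lambda)$ can be factorized in the form

$$S_{\theta,\theta}^{-\frac{1}{2}}(\theta,\lambda)=f_{\theta}(\theta)\,g_{\theta}(\lambda)\quad\text{and}\quad I_{\lambda,\lambda}^{\frac{1}{2}}(\theta,\lambda)=f_{\lambda}(\theta)\,g_{\lambda}(\lambda),$$

then the reference prior when $\theta$ is the parameter of interest and $\lambda$ is the nuisance parameter is simply $\pi_{\theta}(\theta,\lambda)=f_{\theta}(\theta)\,g_{\lambda}(\lambda)$, even if $\pi(\lambda\mid\theta)=g_{\lambda}(\lambda)$ is not proper.

Through Corollary 3.5 and using the Fisher information matrix (6), if $\beta$ is the parameter of interest and $\eta$ is the nuisance parameter, then $S_{\beta,\beta}^{-1/2}\propto\beta^{-1}$ and $I_{\eta,\eta}^{1/2}\propto\beta\,\eta^{-1}$, so the reference prior is the same as the Jeffreys’ rule prior given by (4). Additionally, considering that $\eta$ is the parameter of interest and $\beta$ is the nuisance parameter, $S_{\eta,\eta}^{-1/2}\propto\beta\,\eta^{-1}$ and $I_{\beta,\beta}^{1/2}\propto\beta^{-1}$, so the reference prior again has the form given by (4). Since the reference priors have the same form as the Jeffreys’ rule prior, it follows that the obtained posterior is proper for all $m>1$, in which case the posterior moments relative to $\beta$ are finite and the posterior moments relative to $\eta$ are not finite.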

## 4 Other parametrization

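The results presented above can be easily extended to other commonly used parametrizations. For instance, considering $\theta=1/\eta$, the PDF is given by

$$f(x\mid\theta,\beta)=\frac{\beta x^{\beta-1}}{\theta^{\beta}}e^{-\left(\frac{x}{\theta}\right)^{\beta}}. \tag{10}$$

Therefore, for a prior distribution $\pi(\theta,\beta)$, the joint posterior distribution for $(\theta,\beta)$ is given by

$$\pi(\theta,\beta\mid x)=\frac{\pi(\theta,\beta)}{d(x)}\,\theta^{-m\beta}\beta^{m}\prod_{i=1}^{n}x_i^{\delta_i\beta}\exp\left\{-\sum_{i=1}^{n}\left(\frac{x_i}{\theta}\right)^{\beta}\right\}, \tag{11}$$

where $m=\sum_{i=1}^{n}\delta_i$, and $d(x)$ is a normalizing constant of the form

$$d(x)=\int_0^\infty\int_0^\infty\pi(\theta,\beta)\,\theta^{-m\beta}\beta^{m}\prod_{i=1}^{n}x_i^{\delta_i\beta}\exp\left\{-\sum_{i=1}^{n}\left(\frac{x_i}{\theta}\right)^{\beta}\right\}d\theta\,d\beta. \tag{12}$$

In the following we consider the case where there are at least two distinct non-censored data. An analogous theorem can be proved for $\pi(\theta,\beta)$ by using Theorem 2.1.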

###### Corollary 4.1.

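Let $\pi(\theta,\beta)$ be a class of improper priors such that

$$\pi(\theta,\beta)\propto\exp(-p\beta^{-1})\,\theta^{r}\beta^{q}$$

where $p$, $q$ and $r$ are constants, and suppose that at least two non-censored data are distinct. Then the posterior distribution of $(\theta,\beta)$ under the prior $\pi(\theta,\beta)$ is improper in case $r\neq-1$, and, in case $r=-1$ and $p=0$, it is proper if and only if $q>-m$.

###### Proof.

We have that

$$d(x)=\int_0^\infty\int_0^\infty\exp(-p\beta^{-1})\,\theta^{-m\beta+r}\beta^{m+q}\prod_{i=1}^{n}x_i^{\delta_i\beta}\exp\left\{-\sum_{i=1}^{n}\left(\frac{x_i}{\theta}\right)^{\beta}\right\}d\theta\,d\beta.$$

Thus, through the change of variables $\eta=1/\theta$, for which $d\theta=-\eta^{-2}d\eta$, it follows that

$$d(x)=\int_0^\infty\int_0^\infty\pi_0(\eta,\beta)\,\eta^{m\beta}\beta^{m}\prod_{i=1}^{n}x_i^{\delta_i\beta}\exp\left\{-\sum_{i=1}^{n}(\eta x_i)^{\beta}\right\}d\eta\,d\beta$$

where $\pi_0(\eta,\beta)\propto\exp(-p\beta^{-1})\,\eta^{-r-2}\beta^{q}$. Since $-r-2=-1$ if and only if $r=-1$, Theorem 2.1 applied to the prior $\pi_0$ implies the proposed result. ∎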

###### Corollary 4.2.

Let $\pi(\theta,\beta)$ be a class of improper priors such that

$$\pi(\theta,\beta)\propto\theta^{r}\beta^{q}$$

where $q$ and $r$ are constants, and suppose the posterior related to $\pi(\theta,\beta)$ is proper. Then the posterior moments relative to $\beta$ are finite, and the posterior moments relative to $\theta$ are not finite.

###### Proof.

Just as in Corollary 4.1, the proof follows directly from the argument of Corollary 2.2 combined with the change of variables $\eta=1/\theta$ in the integral. ∎

## 5 Discussion

In this paper, we have studied the posterior properties of the Weibull distribution in the presence of complete and censored data. We proposed necessary and sufficient conditions to check whether an improper prior distribution leads to a proper posterior. An appealing aspect of our findings is that one can check whether the posterior distribution is proper by analyzing only the behavior of the prior distribution. For proper posterior distributions, we also provided sufficient conditions to check whether the posterior moments are finite or infinite.

The main theorem was applied to different objective priors, namely the uniform prior, Jeffreys’ first rule prior, the Jeffreys prior, the MDI prior, and reference priors. We proved that, among the considered priors, the uniform prior and the MDI prior lead to improper posteriors. Although the remaining priors lead to proper posteriors, we showed that none of them leads to finite posterior moments for the parameter $\eta$; therefore, the posterior mean of $\eta$ should not be used as a Bayes estimator. Further, the results were also presented for another standard parametrization, which can be used, for instance, with the objective priors obtained by Sun sun1997note. As with the first parametrization, the results showed that none of the objective priors leads to finite posterior moments for the parameter $\theta$. Sun sun1997note used the posterior mean of $\theta$ in a real data example for complete data but, since we proved that this posterior mean is not finite, it should not be used in practice. This finding is an interesting example in which MCMC methods may return a finite estimate when the true value is infinite. Hence, future research should be careful about computing posterior moments without first checking that they are finite.
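
The warning about simulation-based estimates can be illustrated with a toy example that is unrelated to the Weibull posterior itself: the variable $X=U^{-2}$, with $U$ uniform on $(0,1]$, has an infinite mean, yet every Monte Carlo average of a finite number of draws is finite. The sketch below is only an analogy under that assumption; the helper name `running_means` is illustrative.

```python
import math
import random

random.seed(0)

def running_means(draws):
    """Return the sequence of cumulative averages of the draws."""
    total, out = 0.0, []
    for i, value in enumerate(draws, start=1):
        total += value
        out.append(total / i)
    return out

# X = U^(-2) with U uniform on (0, 1]; E[X] is infinite.
draws = [(1.0 - random.random()) ** -2.0 for _ in range(100_000)]
means = running_means(draws)

# Every running average is a finite number, even though the true mean is not.
print(means[999], means[9_999], means[-1])
```

The averages never settle: they jump whenever an extreme draw arrives, which is the kind of behavior that can go unnoticed if only a single final estimate is reported.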

## Acknowledgment

Pedro L. Ramos is grateful to the São Paulo State Research Foundation (FAPESP Proc. 2017/25971-0).

## Appendix A - Preliminaries

In the appendix, will denote the extended real number line and the subscript in and will denote the exclusion of in these sets.

###### Definition 5.1.

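Let $g$ and $h$ be non-negative functions defined on $U$, where $U\subset\mathbb{R}$. We say that $g(x)\propto h(x)$ if there exist $c_0>0$ and $c_1>0$ such that $c_0h(x)\leq g(x)\leq c_1h(x)$ for every $x\in U$.

###### Definition 5.2.

Let $a\in\overline{\mathbb{R}}$, and let $g$ and $h$ be positive functions defined on $U$, where $U\subset\mathbb{R}$. We say that $g(x)\underset{x\to a}{\propto}h(x)$ if

$$\liminf_{x\to a}\frac{g(x)}{h(x)}>0\quad\text{and}\quad\limsup_{x\to a}\frac{g(x)}{h(x)}<\infty.$$

The relations $g(x)\underset{x\to a^{+}}{\propto}h(x)$ and $g(x)\underset{x\to a^{-}}{\propto}h(x)$ for $a\in\mathbb{R}$ are defined analogously.

Note that, from the above definition, if for some $c>0$ we have that $\lim_{x\to a}g(x)/h(x)=c$, then it follows that $g(x)\underset{x\to a}{\propto}h(x)$. In the propositions below, the functions $g$ and $h$ are supposed to be positive in their domain.

The following propositions give a relation between Definition 5.1 and Definition 5.2; their proofs can be seen in Ramos et al. ramos2017.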

###### Proposition 5.3.

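Let $g$ and $h$ be continuous functions on $(a,b)\subset\mathbb{R}$, where $a\in\overline{\mathbb{R}}$ and $b\in\overline{\mathbb{R}}$. Then $g(x)\propto h(x)$ if and only if $g(x)\underset{x\to a}{\propto}h(x)$ and $g(x)\underset{x\to b}{\propto}h(x)$.

###### Proposition 5.4.

Let $g$ and $h$ be continuous functions in $(a,b)\subset\mathbb{R}$, where $a\in\overline{\mathbb{R}}$ and $b\in\overline{\mathbb{R}}$, and let $c\in(a,b)$. Then, if either $g(x)\underset{x\to a}{\propto}h(x)$ or $g(x)\underset{x\to b}{\propto}h(x)$, it will follow respectively that

$$\int_a^c g(x)\,dx\propto\int_a^c h(x)\,dx\quad\text{or}\quad\int_c^b g(x)\,dx\propto\int_c^b h(x)\,dx.$$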

## Appendix B - Proof of Theorem 2.1

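During this proof we shall use the fact that, for $h>0$, $\int_0^\infty x^{k-1}e^{-hx}\,dx<\infty$ if and only if $k>0$, which can be easily proved for $k\leq0$ by using Proposition 5.4 and the fact that $x^{k-1}e^{-hx}\underset{x\to0^{+}}{\propto}x^{k-1}$, and for $k>0$ by the change of variables $u=hx$ in the integral and the definition of the Gamma function.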

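Before we prove the items we develop the integrals used. From the hypothesis it follows that

$$\pi(\eta,\beta\mid x)\propto\pi(\eta,\beta)\,\eta^{m\beta}\beta^{m}\left(\prod_{i=1}^{n}x_i^{\delta_i\beta}\right)\exp\left\{-\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}\right\}\propto\exp(-p\beta^{-1})\,\eta^{m\beta+r}\beta^{m+q}\left(\prod_{i=1}^{n}x_i^{\delta_i\beta}\right)\exp\left\{-\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}\right\}.$$

Thus, from the change of variable $u=\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}$ in the integral it follows that

$$\begin{aligned}d(x)&\propto\int_0^\infty\int_0^\infty\exp(-p\beta^{-1})\,\beta^{m+q}\left(\prod_{i=1}^{n}x_i^{\delta_i\beta}\right)\eta^{m\beta+r}\exp\left\{-\eta^{\beta}\sum_{i=1}^{n}x_i^{\beta}\right\}d\eta\,d\beta\\&=\int_0^\infty\exp(-p\beta^{-1})\,\beta^{m+q-1}\frac{\prod_{i=1}^{n}x_i^{\delta_i\beta}}{\left(\sum_{i=1}^{n}x_i^{\beta}\right)^{m+\frac{r+1}{\beta}}}\int_0^\infty u^{m-1+\frac{r+1}{\beta}}\exp\{-u\}\,du\,d\beta.\end{aligned}$$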
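
The inner change of variable can be verified numerically. Writing $S=\sum_i x_i^{\beta}$, it amounts to the identity $\int_0^\infty\eta^{m\beta+r}e^{-S\eta^{\beta}}d\eta=\Gamma\!\left(m+\tfrac{r+1}{\beta}\right)\big/\big(\beta S^{\,m+(r+1)/\beta}\big)$, valid when $m+(r+1)/\beta>0$. The sketch below compares a Simpson-rule approximation with this closed form; the function names, the truncation point `upper` and the test values are assumptions made for illustration.

```python
import math

def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inner_integral_numeric(m, beta, r, s, upper=10.0):
    """Numerical value of the eta-integral, truncated at `upper`."""
    return simpson(lambda eta: eta ** (m * beta + r) * math.exp(-s * eta**beta), 0.0, upper)

def inner_integral_closed_form(m, beta, r, s):
    """Gamma-function expression obtained from u = s * eta^beta."""
    a = m + (r + 1) / beta
    return math.gamma(a) / (beta * s**a)

print(inner_integral_numeric(3, 2.0, -1, 1.5), inner_integral_closed_form(3, 2.0, -1, 1.5))
```

The two printed numbers should agree to many decimal places; the closed form ceases to apply when $m+(r+1)/\beta\leq0$, which is the regime where the integral diverges.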

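On the other hand, if we denote $x_{\max}=\max\{x_1,\ldots,x_n\}$ and if $k$ corresponds to the number of indexes $i$ such that $x_i=x_{\max}$ then, since $x_i/x_{\max}<1$ for all $i$ with $x_i\neq x_{\max}$, it follows that

$$\lim_{\beta\to\infty}\frac{\sum_{i=1}^{n}x_i^{\beta}}{(x_{\max})^{\beta}}=k\;\Rightarrow\;\sum_{i=1}^{n}x_i^{\beta}\underset{\beta\to\infty}{\propto}(x_{\max})^{\beta},$$

and, moreover, trivially it follows that

$$\lim_{\beta\to0}\frac{\sum_{i=1}^{n}x_i^{\beta}}{(x_{\max})^{\beta}}=n\;\Rightarrow\;\sum_{i=1}^{n}x_i^{\beta}\underset{\beta\to0}{\propto}(x_{\max})^{\beta}.$$

Thus, from Proposition 5.3 it follows that $\sum_{i=1}^{n}x_i^{\beta}\propto(x_{\max})^{\beta}$ in $(0,\infty)$ and therefore

$$d(x)\propto\int_0^\infty\exp(-p\beta^{-1})\,\beta^{m+q-1}\frac{\prod_{i=1}^{n}x_i^{\delta_i\beta}}{(x_{\max})^{m\beta}}\int_0^\infty u^{m-1+\frac{r+1}{\beta}}\exp\{-u\}\,du\,d\beta. \tag{13}$$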
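
The behavior of the ratio $\sum_i x_i^{\beta}/(x_{\max})^{\beta}$ at the two ends of $(0,\infty)$ can be checked numerically. In the sketch below the data are made up, with the maximum attained twice; the ratio should approach the number of observations as $\beta\to0$ and the multiplicity of the maximum as $\beta\to\infty$, and it stays between 1 and $n$ throughout.

```python
def ratio(x, beta):
    """sum_i x_i^beta / x_max^beta, computed in a numerically stable way."""
    x_max = max(x)
    return sum((xi / x_max) ** beta for xi in x)

x = [0.5, 1.2, 3.0, 3.0]  # hypothetical sample: n = 4, maximum attained k = 2 times

for beta in (1e-9, 0.5, 1.0, 5.0, 200.0):
    print(beta, ratio(x, beta))
```

Dividing each term by $x_{\max}$ before raising to the power $\beta$ avoids overflow for large $\beta$.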

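Proof of the first item: We divide the proof into the cases $r<-1$ and $r>-1$.

First let us suppose that $r<-1$ and $m=0$. Then it follows from (13) that

$$d(x)\propto\int_0^\infty\exp(-p\beta^{-1})\,\beta^{q-1}\int_0^\infty u^{-1+\frac{r+1}{\beta}}\exp\{-u\}\,du\,d\beta,$$

and since $-1+\frac{r+1}{\beta}<-1$ for all $\beta>0$, it follows that $\int_0^\infty u^{-1+\frac{r+1}{\beta}}\exp\{-u\}\,du=\infty$ for all $\beta>0$, and thus $d(x)=\infty$ in this case.

Now let us suppose that $r<-1$ and $m>0$. Therefore, for every $0<\beta\leq-\frac{r+1}{m}$ it follows that $m-1+\frac{r+1}{\beta}\leq-1$ and thus

$$\exp(-p\beta^{-1})\,\beta^{m+q-1}\frac{\prod_{i=1}^{n}x_i^{\delta_i\beta}}{(x_{\max})^{m\beta}}\int_0^\infty u^{m-1+\frac{r+1}{\beta}}\exp\{-u\}\,du=\infty,$$

for all $\beta\in\left(0,-\frac{r+1}{m}\right]$. Therefore it follows from (13) that

$$\begin{aligned}d(x)&\propto\int_0^\infty\exp(-p\beta^{-1})\,\beta^{m+q-1}\frac{\prod_{i=1}^{n}x_i^{\delta_i\beta}}{(x_{\max})^{m\beta}}\int_0^\infty u^{m-1+\frac{r+1}{\beta}}\exp\{-u\}\,du\,d\beta\\&\geq\int_0^{-\frac{r+1}{m}}\exp(-p\beta^{-1})\,\beta^{m+q-1}\frac{\prod_{i=1}^{n}x_i^{\delta_i\beta}}{(x_{\max})^{m\beta}}\int_0^\infty u^{m-1+\frac{r+1}{\beta}}\exp\{-u\}\,du\,d\beta=\infty,\end{aligned}$$

and hence $d(x)=\infty$ in case $r<-1$.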

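Now let us suppose that $r>-1$. Therefore, from (13) and letting $h=\log\left(x_{\max}^{m}\big/\prod_{i=1}^{n}x_i^{\delta_i}\right)\geq0$, it follows that

$$d(x)\propto\int_0^\infty\exp(-p\beta^{-1})\,\beta^{m+q-1}\exp(-h\beta)\,\Gamma\!\left(m+\frac{r+1}{\beta}\right)d\beta. \tag{14}$$

Thus, using the change of variables $\alpha=m+\frac{r+1}{\beta}$ in the integral in (14), and using that $d\beta=-\frac{r+1}{(\alpha-m)^{2}}\,d\alpha$, it follows that

$$d(x)\propto\int_m^\infty\frac{\Gamma(\alpha)}{\exp\left(\frac{p\alpha}{r+1}\right)(\alpha-m)^{m+q+1}\exp\left(\frac{h(r+1)}{\alpha-m}\right)}\,d\alpha. \tag{15}$$

On the other hand, from the Stirling formula and since $\exp\left(\frac{h(r+1)}{\alpha-m}\right)\to1$ and $(\alpha-m)^{m+q+1}/\alpha^{m+q+1}\to1$ as $\alpha\to\infty$, it follows that

$$\begin{aligned}\lim_{\alpha\to\infty}\frac{\Gamma(\alpha)}{\exp\left(\frac{p\alpha}{r+1}\right)(\alpha-m)^{m+q+1}\exp\left(\frac{h(r+1)}{\alpha-m}\right)}&=\sqrt{2\pi}\lim_{\alpha\to\infty}\frac{\alpha^{\alpha-\frac{1}{2}}\exp(-\alpha)}{\exp\left(\frac{p\alpha}{r+1}\right)\alpha^{m+q+1}}\\&=\sqrt{2\pi}\lim_{\alpha\to\infty}\exp\left(\alpha\left(\ln(\alpha)-\left(1+\frac{p}{r+1}\right)-\frac{\ln(\alpha)}{\alpha}\left(m+q+\frac{3}{2}\right)\right)\right)=\infty,\end{aligned}$$

where the last equality follows directly from $\ln(\alpha)\to\infty$ and $\ln(\alpha)/\alpha\to0$ as $\alpha\to\infty$. Therefore from (15) we conclude that $d(x)=\infty$ in case $r>-1$, and thus the first item is proved.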

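Proof of the second item: If we suppose that $r=-1$ and $p=0$ then, once again from (13) and letting $h=\log\left(x_{\max}^{m}\big/\prod_{i=1}^{n}x_i^{\delta_i}\right)$, it follows that

$$d(x)\propto\int_0^\infty\beta^{m+q-1}\exp(-h\beta)\,\Gamma(m)\,d\beta\propto\int_0^\infty\beta^{m+q-1}\exp(-h\beta)\,d\beta,$$

and since by the hypothesis of the second item there are at least two distinct non-censored data, it follows that $h>0$, and thus by the last proportionality it follows that $d(x)<\infty$ if and only if $q>-m$, which concludes the proof of the second item.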

Proof of item : If we suppose that , and , then from (13) it follows that

$$d(x)\propto\int_0^\infty\beta^{q-1}\int_0^\infty\exp\{-u\}\,du\,d\beta=\int_0^\infty\beta^{q-1}\,d\beta,$$

and thus $d(x)=\infty$ in this case.

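Finally, if we suppose that $r=-1$, $p=0$ and $m=0$, then from (13) it follows directly that

$$d(x)\propto\int_0^\infty\beta^{q-1}\int_0^\infty u^{-1}\exp\{-u\}\,du\,d\beta, \tag{16}$$

and since $\int_0^\infty u^{-1}\exp\{-u\}\,du=\infty$, it follows that $d(x)=\infty$, which completes the proof.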