# Modified Cox regression with current status data

In survival analysis, the lifetime under study is not always observed. In certain applications, for some individuals, the value of the lifetime is only known to be smaller or larger than some random duration. This framework represents an extension of standard situations where the lifetime is only left or only right randomly censored. We consider the case where the independent observation units also include some covariates, and we propose two semiparametric regression models. The new models extend the standard Cox proportional hazards model to the situation of a more complex censoring mechanism. However, as in Cox's model, in both models the nonparametric baseline hazard function can still be expressed as an explicit functional of the distribution of the observations. This allows one to define the estimator of the finite-dimensional parameters as the maximizer of a likelihood-type criterion which is an explicit function of the data. Given an estimate of the finite-dimensional parameter, the estimation of the baseline cumulative hazard function is straightforward.

## 1 Introduction

Driven by applications, there is a constant interest in time-to-event analysis in extending predictive models to situations where the lifetimes of interest suffer from complex censoring mechanisms. Here we consider the case where, instead of the lifetime of interest $T$, one observes independent copies of a finite nonnegative duration $X$ and of a discrete variable $A\in\{0,1,2\}$ such that

$$\begin{cases} X = T & \text{if } A = 0,\\ X < T & \text{if } A = 1,\\ X > T & \text{if } A = 2. \end{cases} \tag{1.1}$$

Depending on the application, the inequality signs in (1.1) could be strict or not. Let us point out that the limit case where the event $\{A=2\}$ (resp. $\{A=1\}$) has zero probability corresponds to the usual random right-censoring (resp. left-censoring) setup, while the case where the probability of the event $\{A=0\}$ is null corresponds to the current status framework.

Let us assume that and let $Z$ be a vector of random covariates. All the random variables we consider are defined on some probability space.

Although $X$ takes values only on the real line, we allow a positive probability for the event $\{T=\infty\}$, that is, we allow for cured individuals (see, for instance, Fang et al. (2005) and Zheng et al. (2006) and the references therein for applications where infinite lifetimes could occur). Symmetrically, we also allow the zero lifetime to have positive probability, that is, a zero-inflated law for $T$ could be taken into account (see Braekers & Grouwels (2015) for some motivations).

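Let $\mathcal{Z}$ be the support of $Z$. The conditional law of $(X,A)$ given $Z=z$ is characterized by the sub-distribution functions

$$H_k([0,t]\mid z)=\mathbb{P}(X\le t,\,A=k\mid Z=z),\qquad t\ge 0,\ k\in\{0,1,2\},\ z\in\mathcal{Z}.$$

Let $H_k(dt\mid z)$ denote the associated measures. Moreover, let

$$H_k([0,t])=\mathbb{P}(X\le t,\,A=k)$$

be the unconditional versions of these sub-distributions. Clearly,

$$H_k([0,t])=\mathbb{E}\{H_k([0,t]\mid Z)\},\qquad t\ge 0,\ k\in\{0,1,2\}.$$

The conditional distribution function of $X$ given $Z=z$ is then

$$H([0,t]\mid z)=\mathbb{P}(X\le t\mid Z=z)=H_0([0,t]\mid z)+H_1([0,t]\mid z)+H_2([0,t]\mid z).$$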

It is important to understand that, based on the data, one could only identify the conditional sub-distributions $H_k(\cdot\mid z)$. To identify and consistently estimate the conditional law of the lifetime of interest $T$, one should introduce some assumptions on the censoring mechanism. In other words, one has to consider a latent model. Several censoring mechanisms have been proposed in the case without covariates. Turnbull (1974) considered two censoring times such that the case (resp. ) (resp. ) corresponds to the event and (resp. and ) (resp. and ). Patilea & Rolin (2006b) relaxed the condition and proposed two models that could be easily illustrated using simple electric circuits with three components connected in series and/or parallel. Patilea & Rolin (2006a) extended the standard right-censoring (resp. left-censoring) model by allowing uncensored lifetimes for which one only knows that they are smaller (resp. larger) than the observation $X$. This corresponds, for instance, to the case of a medical study where a disease is detected for a patient, but the onset time could not be determined from medical records, personal information, etc., while for other patients with the disease detected the onset time is available. The model of Turnbull does not allow one to express the law of the lifetime of interest as an explicit function of the sub-distributions $H_k$, as is the case for the models proposed by Patilea & Rolin (2006a, 2006b). Thus a numerical algorithm is necessary to compute Turnbull’s estimator. It is important to keep in mind that any of these latent models could be correct and useful for a specific application. The data do not allow one to check the validity of the model. Turnbull’s model, perhaps the most popular model for data structures such as the one we consider here, is not necessarily justified in applications where there is no natural interpretation of the variables and .

The aim of this paper is to extend the modeling of data as in equation (1.1) to the case where some covariates are available. Kim et al. (2010) extended Turnbull’s model to the case with covariates using a proportional hazards approach. Here we consider the extension of the approaches proposed by Patilea & Rolin (2006a), imposing the same proportional hazards assumption. More precisely, we propose two novel latent models for observed lifetimes as in (1.1) in the presence of covariates. Both models are well suited for data as in (1.1), and hence could be used in applications. The decision to use one of them, or the one proposed by Kim et al. (2010), could be made only on the basis of additional information on the application. Current status data corresponds to $\mathbb{P}(A=0)=0$. Right (resp. left) censored data corresponds to $\mathbb{P}(A=2)=0$ (resp. $\mathbb{P}(A=1)=0$). This explains the terminology we propose for our models: modified Cox regressions with current status lifetimes. For each of the new models, we introduce a semiparametric estimator for the finite-dimensional parameters, together with the corresponding estimators of the baseline cumulative hazard functions. Our estimators are easy to implement.

The paper is organized as follows. Our semiparametric models are introduced in section 2. They extend the standard right, respectively left, random censoring proportional hazard models. In section 3 we introduce the semiparametric estimators of the covariates coefficients, and the estimators of the cumulative hazard and survival functions. In particular, we provide an estimator for the cure rate and the zero-lifetime probability. The theoretical results are presented

## 2 Censored and current status lifetimes

In our models we follow the idea of Cox’s semiparametric proportional hazards model. In both models we are able to express the baseline cumulative hazard function as a functional of the distribution of the observations, characterized by the conditional sub-distributions $H_k(\cdot\mid z)$ and the law of $Z$, and the coefficients of the covariates. As a result, the coefficients of the covariates can be estimated by maximizing a likelihood-type criterion that is built as an explicit function of the observations. Thus the numerical aspects are very much simplified, compared to the model considered by Kim et al. (2010). With the estimate of the finite-dimensional parameters at hand, we can easily build the estimator of the baseline cumulative hazard function. In particular, using the estimate of the total mass of the baseline cumulative hazard, we provide a simple estimate of the conditional cure rate $\mathbb{P}(T=\infty\mid Z=z)$. Similarly, we can provide an estimator for the conditional zero-lifetime probability $\mathbb{P}(T=0\mid Z=z)$. The extension to the case of mixture models, such as those considered by Fang et al. (2005), where the cure rate or the zero-lifetime probability could depend on a possibly different set of covariates, is left for future work.

### 2.1 Right-censoring case

Let $C$ be a random censoring time and $\Delta$ be a Bernoulli random variable with success probability $p$. Let $F_T(\cdot\mid z)$ and $S_T(\cdot\mid z)$ be the conditional distribution function and survivor function of $T$ given $Z=z$. Similarly, $F_C(\cdot\mid z)$ and $S_C(\cdot\mid z)$ denote the conditional distribution function and the survivor function of $C$. Following Patilea & Rolin (2006a), the latent model for $(X,A,Z)$ is defined by:

$$\begin{cases}(X,A,Z)=(T,0,Z) & \text{if } 0\le T\le C \text{ and } \Delta=1,\\ (X,A,Z)=(C,1,Z) & \text{if } 0\le C<T,\\ (X,A,Z)=(C,2,Z) & \text{if } 0\le T\le C \text{ and } \Delta=0.\end{cases}$$

Let us notice that $p=1$ is the classical right-censoring limit case, while $p=0$ would correspond to the pure current status setup. The latter limit case is not included in what follows since we assume In the case $C<T$ the observed outcome is not influenced by the value of $\Delta$.
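The latent mechanism of this subsection can be made concrete with a small simulation. The sketch below is only an illustration under assumptions that are not in the paper: a scalar covariate uniform on (-1, 1), a constant baseline hazard, an exponential censoring time, and the hypothetical function name `simulate_right_model`.

```python
import math
import random


def simulate_right_model(n, beta, p, base_rate=1.0, cens_rate=0.5, seed=0):
    """Draw n triples (x, a, z) from the right-censoring latent model.

    Illustrative assumptions (not from the paper): scalar Z ~ Uniform(-1, 1),
    constant baseline hazard `base_rate`, exponential censoring time C.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.uniform(-1.0, 1.0)
        # Proportional hazards: the hazard of T given Z = z is base_rate * exp(beta * z).
        t = rng.expovariate(base_rate * math.exp(beta * z))
        # C is drawn independently of T given Z.
        c = rng.expovariate(cens_rate)
        # Delta ~ Bernoulli(p), drawn independently of (T, C, Z).
        delta = rng.random() < p
        if t <= c and delta:
            x, a = t, 0   # lifetime observed exactly
        elif c < t:
            x, a = c, 1   # right-censored: T > X, whatever Delta is
        else:
            x, a = c, 2   # T <= C but only "T <= X" is recorded
        data.append((x, a, z))
    return data
```

With `p=1.0` the label `a == 2` never occurs and the output is ordinary right-censored data, which matches the limit case mentioned in the text.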

For identification purposes we consider the following assumption.

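A1: Assume that:

• a) conditionally on $Z$, the latent variables $T$ and $C$ are independent;

• b) $\Delta$ and $(T,C,Z)$ are independent.

The independence assumptions allow us to write

$$\begin{cases}H_0(dt\mid z)=p\,F_T(dt\mid z)\,S_C(t-\mid z),\\ H_1(dt\mid z)=F_C(dt\mid z)\,S_T(t\mid z),\\ H_2(dt\mid z)=(1-p)\,F_C(dt\mid z)\,F_T(t\mid z).\end{cases}\tag{2.2}$$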

The system could be solved for the quantities and First, let us write

 H0([t,∞)|z)+pH1([t,∞)|z)=pST(t−|z)SC(t−|z).

Since , we deduce

 H0([t,∞)|z)=p{H0([t,∞)|z)+H2([t,∞)|z)},t≥0.

Integrating out the covariate and taking $t=0$ we could derive the simple representation

$$p=\frac{H_0([0,\infty))}{H_0([0,\infty))+H_2([0,\infty))}=\frac{\mathbb{P}(\Delta=1,\,T\le C)}{\mathbb{P}(T\le C)}.\tag{2.3}$$
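Since $X$ is finite, $H_k([0,\infty))=\mathbb{P}(A=k)$, so representation (2.3) suggests replacing the two probabilities by sample proportions; this is the estimator $\hat p$ used in section 3.1. A minimal sketch, where the function name `estimate_p_right` is hypothetical:

```python
def estimate_p_right(labels):
    """Plug-in estimate of p from the observed labels A_i in {0, 1, 2}.

    Implements #{A_i = 0} / #{A_i != 1}, the empirical analogue of (2.3).
    """
    n_exact = sum(1 for a in labels if a == 0)
    n_not_right_censored = sum(1 for a in labels if a != 1)
    if n_not_right_censored == 0:
        raise ValueError("no observation with A in {0, 2}: p is not estimable")
    return n_exact / n_not_right_censored
```

For instance, with labels `[0, 0, 1, 2, 0, 1]` there are three exact observations and one with `A = 2`, so the estimate is 3/4.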

Let us point out that one could replace the condition A1b) by the weaker condition that $\Delta$ and $(T,C)$ are independent given $Z$, and still write the equations (2.2) with $p$ replaced by some function $p(z)$ of the covariates. In this case one would derive the conditional version of the representation (2.3), but then the estimation of $p(\cdot)$ would require the estimation of the conditional versions of $H_0([0,\infty))$ and $H_2([0,\infty))$. For the sake of a simpler setup we suppose that $p$ does not depend on the covariates.

Next, we solve (2.2) for the conditional distribution of $T$. For this purpose we follow a proportional hazards model approach and we suppose that the risk function of $T$ given $Z=z$ could be written as

$$\lambda(t\mid z)=\lambda(t)\exp(\beta^\top z),\qquad \forall t>0,\ \forall z\in\mathcal{Z},\tag{2.4}$$

where $\lambda(\cdot)$ is some unknown baseline hazard function and $\beta$ is a vector of unknown regression parameters. (Herein the vectors are column matrices and $\beta^\top$ denotes the transpose of $\beta$.) With this assumption, for each $t>0$ and $z\in\mathcal{Z}$ we could write

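$$\begin{aligned}H_0(dt\mid z)&=p\,F_T(dt\mid z)\,S_C(t-\mid z)\\&=\frac{F_T(dt\mid z)}{S_T(t-\mid z)}\,p\,S_T(t-\mid z)\,S_C(t-\mid z)\\&=\lambda(t)\exp(\beta^\top z)\,p\,S_T(t-\mid z)\,S_C(t-\mid z)\,dt\\&=\lambda(t)\exp(\beta^\top z)\{H_0([t,\infty)\mid z)+p\,H_1([t,\infty)\mid z)\}\,dt.\end{aligned}$$

Hence,

$$H_0(dt)=\mathbb{E}\{H_0(dt\mid Z)\}=\mathbb{E}\{\exp(\beta^\top Z)(H_0([t,\infty)\mid Z)+p\,H_1([t,\infty)\mid Z))\}\,\lambda(t)\,dt.$$

Moreover,

$$\mathbb{E}\{\exp(\beta^\top Z)H_k([t,\infty)\mid Z)\}=\mathbb{E}\{\exp(\beta^\top Z)\mathbf{1}(X\ge t,\,A=k)\},\qquad k\in\{0,1\}.$$

As a consequence, for any $t$ such that $\mathbb{E}\{\exp(\beta^\top Z)[\mathbf{1}(X\ge t,A=0)+p\,\mathbf{1}(X\ge t,A=1)]\}>0$,

$$\lambda(t)\,dt=\frac{H_0(dt)}{\mathbb{E}\{\exp(\beta^\top Z)[\mathbf{1}(X\ge t,A=0)+p\,\mathbf{1}(X\ge t,A=1)]\}}.\tag{2.5}$$

Thus, the baseline cumulative hazard function $\Lambda$ could be expressed as a functional of the observed variables and the finite-dimensional parameters of the model $(p,\beta)$:

$$\Lambda(t)=\Lambda(t;p,\beta)=\int_{[0,t]}\frac{H_0(ds)}{\mathbb{E}\{\exp(\beta^\top Z)[\mathbf{1}(X\ge s,A=0)+p\,\mathbf{1}(X\ge s,A=1)]\}}.\tag{2.6}$$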

The conditional survival function of the lifetime of interest can be expressed as

$$S_T(t\mid z)=\prod_{s\in(0,t]}\bigl(1-\exp(\beta^\top z)\Lambda(ds)\bigr).$$

Herein, the notation $\prod_{s\in(0,t]}$ means the product-integral over the interval $(0,t]$, as formally defined in Gill & Johansen (1990). In particular, the conditional cure probability can be expressed as

$$S_T(\infty\mid z)=\prod_{s\in(0,\infty)}\bigl(1-\exp(\beta^\top z)\Lambda(ds)\bigr).$$

### 2.2 Left-censoring case

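Let $C$ be a random censoring time and $\Delta$ be a Bernoulli random variable with success probability $p$. In this case the latent model for $(X,A,Z)$ is defined by:

$$\begin{cases}(X,A,Z)=(T,0,Z) & \text{if } 0<C\le T \text{ and } \Delta=1,\\ (X,A,Z)=(C,1,Z) & \text{if } 0<C\le T \text{ and } \Delta=0,\\ (X,A,Z)=(C,2,Z) & \text{if } 0\le T<C.\end{cases}$$

The case $p=1$ corresponds to the classical left-censored data situation. Consider the assumptions A1a) and A1b). Then we can write

$$\begin{cases}H_0(dt\mid z)=p\,F_C(t\mid z)\,F_T(dt\mid z),\\ H_1(dt\mid z)=(1-p)\,F_C(dt\mid z)\,S_T(t-\mid z),\\ H_2(dt\mid z)=F_C(dt\mid z)\,F_T(t-\mid z).\end{cases}\tag{2.7}$$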

This system also could be solved for the quantities and First, combining the first and the third equation, deduce

 H0([0,t]|z)+pH2([0,t]|z)=pFT(t|z)FC(t|z),

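so that

$$p=\frac{H_0([0,\infty))}{H_0([0,\infty))+H_1([0,\infty))}.$$

Moreover, for each $t$ and $z\in\mathcal{Z}$ we could write

$$\begin{aligned}H_0(dt\mid z)&=p\,F_T(dt\mid z)\,F_C(t\mid z)\\&=\frac{F_T(dt\mid z)}{F_T(t\mid z)}\,p\,F_T(t\mid z)\,F_C(t\mid z)\\&=R(dt\mid z)\{H_0([0,t]\mid z)+p\,H_2([0,t]\mid z)\},\end{aligned}$$

where

$$R(dt\mid z)=\frac{F_T(dt\mid z)}{F_T(t\mid z)}$$

is the conditional reverse hazard measure. The quantity $R(dt\mid z)$ could be interpreted as the conditional probability that the event occurs in the interval $(t-dt,t]$, given that the event occurs no later than $t$. This measure has the property

$$F_T(t\mid z)=\prod_{s\in(t,\infty)}\bigl(1-R(ds\mid z)\bigr),\qquad \forall t\ge 0.$$

In particular,

$$F_T(0\mid z)=\prod_{s\in(0,\infty)}\bigl(1-R(ds\mid z)\bigr).$$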

Inspired by the proportional hazards approach, let us consider that the conditional reverse hazard function of $T$ given $Z=z$ could be written as

$$r(t\mid z)=r(t)\exp(\beta^\top z),\qquad \forall t>0,\ \forall z\in\mathcal{Z},\tag{2.8}$$

where $r(\cdot)$ is some unknown baseline reverse hazard function and $\beta$ is a vector of unknown regression parameters.

Similar to the right-censoring case, one can deduce

$$r(t)\,dt=\frac{H_0(dt)}{\mathbb{E}\{\exp(\beta^\top Z)[\mathbf{1}(X\le t,A=0)+p\,\mathbf{1}(X\le t,A=2)]\}},\tag{2.9}$$

and the baseline cumulative reverse hazard is obtained as .

## 3 Semiparametric likelihood estimation

Let $(X_i,A_i,Z_i)$, $1\le i\le n$, denote the observations that are independent copies of $(X,A,Z)$. In the following, we consider $Z\in\mathbb{R}^q$ with $q$ some positive integer. With observations of the covariates and of lifetimes as in (1.1), a natural likelihood-type criterion is the one considered by Kim et al. (2010):

 (3.1)

In this criterion, the factors involving the distribution of are dropped, as they are supposed uninformative.

To write the likelihood-type criterion , we only used a hazard rate as in (2.4), without specifying any censoring mechanism or latent model. Alternatively, one could write the likelihood in terms of the cumulative reverse hazard we defined in section 2.2, using only the assumption (2.8). The two criteria are equivalent and would be valid for the type of data we consider. Next, one could follow the profiling idea. In the case where this leads to Cox’s partial likelihood with right-censored data. See Murphy & van der Vaart (2000). A similar situation, Cox’s partial likelihood with left-censored data, occurs when . Unfortunately, given a value , the maximization with respect to (or ) of does not have a nondegenerate, explicit solution when both and are positive. See Kim et al. (2010), the Remark on page 1341. A possible solution, proposed by Kim et al., would be to consider a numerical approximation. Here we propose an alternative, more convenient and sound route. To estimate the parameters of interest, one has to consider a model for the censoring mechanism. In the model considered by Kim et al. (2010), there is no way to connect the infinite-dimensional parameter (or ) to the quantities that could be easily estimated from the data, such as . This makes the profiling approach complicated. The profiling approach is very appealing in the standard right-censoring (resp. left-censoring) case because there could be easily expressed in terms of , and (resp. ).

In the two models we propose, the relationship between quantities that could be estimated by sample means from the data and the infinite-dimensional parameter (the baseline cumulative hazard or the baseline cumulative reverse hazard) is explicit, and this allows us to build a user-friendly approximate likelihood. These models do not only make the optimization of the likelihood-type criteria simpler. First of all, they induce censoring mechanisms that make sense in some applications. See Patilea & Rolin (2006a) for a discussion.

### 3.1 The right-censoring and current status data case

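The parameters of our first model are $\theta=(p,\beta)$ and the hazard function $\lambda$. Let $\theta_0=(p_0,\beta_0)$ and $\Lambda_0$ denote the true values of the parameters. Using the notation from equation (2.6) we can also write $\Lambda_0(t)=\Lambda(t;p_0,\beta_0)$.

In view of equation (2.3) let us consider

$$\hat p=\frac{\sum_{i=1}^n\mathbf{1}(A_i=0)}{\sum_{i=1}^n\mathbf{1}(A_i\neq 1)}$$

as estimator of $p_0$. For estimating $\beta_0$ we shall use a partial likelihood approach. With an estimate of $\theta_0$ at hand, we will use an empirical version of equation (2.6) and build an estimate of $\Lambda_0$. For these purposes let us define the empirical quantities

$$N_{ki}(t)=\mathbf{1}(X_i\le t,\,A_i=k),\qquad 1\le i\le n,\ k\in\{0,2\},$$

$$N_{n,0}(t)=\frac{1}{n}\sum_{i=1}^n\mathbf{1}(X_i\le t,\,A_i=0)=\frac{1}{n}\sum_{i=1}^nN_{0i}(t).$$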

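For a column vector $a$, let $a^{\otimes 0}=1$ and $a^{\otimes 1}=a$. Let

$$S^{(l)}_{n,k}(t;\beta)=\frac{1}{n}\sum_{i=1}^n\exp(\beta^\top Z_i)\,Z_i^{\otimes l}\,\mathbf{1}(X_i\ge t,\,A_i=k),\qquad l=0,1,\ k\in\{0,1,2\},$$

and

$$E^{(l)}_n(t;\theta)=E^{(l)}_n(t;p,\beta)=S^{(l)}_{n,0}(t;\beta)+p\,S^{(l)}_{n,1}(t;\beta).\tag{3.2}$$

Consider

$$\Lambda_n(t;\theta)=\Lambda_n(t;p,\beta)=\int_{[0,t]}\frac{N_{n,0}(ds)}{S^{(0)}_{n,0}(s;\beta)+p\,S^{(0)}_{n,1}(s;\beta)}=\int_{[0,t]}\frac{N_{n,0}(ds)}{E^{(0)}_n(s;\theta)}$$

as the empirical version of the cumulative hazard function $\Lambda$, as defined in (2.6).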
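Because $N_{n,0}$ only jumps at uncensored observations, $\Lambda_n$ is a step function whose jumps can be computed directly. The following sketch assumes a scalar covariate and data given as a list of `(x, a, z)` triples; the function names are hypothetical and the quadratic-time loops are chosen for readability, not speed.

```python
import math


def hazard_increments(data, p, beta):
    """Jumps of Lambda_n(.; p, beta) at the distinct uncensored times.

    data: list of (x, a, z) with a in {0, 1, 2} and scalar z (an assumption).
    Returns a list of (s, jump) pairs sorted by s.
    """
    n = len(data)
    event_times = sorted({x for x, a, _ in data if a == 0})
    increments = []
    for s in event_times:
        # N_{n,0}({s}): mass of uncensored observations located at s
        mass = sum(1 for x, a, _ in data if a == 0 and x == s) / n
        # S_{n,0}^{(0)}(s; beta) and S_{n,1}^{(0)}(s; beta)
        s0 = sum(math.exp(beta * z) for x, a, z in data if a == 0 and x >= s) / n
        s1 = sum(math.exp(beta * z) for x, a, z in data if a == 1 and x >= s) / n
        # E_n^{(0)}(s; p, beta) > 0 because the observation at s contributes to s0
        increments.append((s, mass / (s0 + p * s1)))
    return increments


def cumulative_hazard(increments, t):
    """Lambda_n(t): sum of the jumps located in [0, t]."""
    return sum(jump for s, jump in increments if s <= t)
```

Observations with `a == 2` do not enter the denominator, in line with (3.2); they only contribute through the likelihood-type criterion.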

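Using these empirical quantities, and recalling that , we can write the following approximation of the criterion defined in (3.1):

$$\prod_{i=1}^n\Biggl\{\prod_{t\in[0,\tau]}\bigl[\exp(\beta^\top Z_i)\Lambda_n(dt;\theta)\bigr]^{N_{0i}(dt)}\Bigl[1-\exp\Bigl(-\int_{[0,X_i]}\exp(\beta^\top Z_i)\Lambda_n(ds;\theta)\Bigr)\Bigr]^{N_{2i}(dt)}\Biggr\}\times\exp\Bigl(-\int_{[0,\tau]}\bigl\{S^{(0)}_{n,0}(t;\beta)+S^{(0)}_{n,1}(t;\beta)\bigr\}\Lambda_n(dt;\theta)\Bigr),$$

where $\tau$ is some threshold that prevents division by zero; it will be specified below. Hence, let us define the approximate log-likelihood function

$$\begin{aligned}\ell_n(p,\beta;\tau)={}&\frac{1}{n}\sum_{i=1}^nD^\tau_{0i}\Bigl(\beta^\top Z_i-\log E^{(0)}_n(X_i;p,\beta)\Bigr)\\&+\frac{1}{n}\sum_{i=1}^nD^\tau_{2i}\log\Bigl(1-\exp\Bigl(-\int_{[0,X_i]}\frac{\exp(\beta^\top Z_i)}{E^{(0)}_n(s;p,\beta)}N_{n,0}(ds)\Bigr)\Bigr)\\&-\int_{[0,\tau]}\frac{E^{(0)}_n(t;1,\beta)}{E^{(0)}_n(t;p,\beta)}N_{n,0}(dt),\end{aligned}$$

where $D^\tau_{ki}=\mathbf{1}(X_i\le\tau,\,A_i=k)$. The regression parameter $\beta_0$ is then estimated by

$$\hat\beta=\arg\max_{\beta\in B}\ell_n(\hat p,\beta;\tau),$$

where $B$ is a set of parameters and $\tau$ is fixed by the statistician. For theoretical results, one needs conditions allowing one to control for small values of $E^{(0)}_n(\cdot;p,\beta)$. This is a technical condition that is usually ignored in practice, where one would simply take $\tau$ equal to the largest uncensored observation. Next, the cumulative hazard function is estimated by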

$$\hat\Lambda(t)=\Lambda_n(t;\hat p,\hat\beta)$$

and the conditional survival function of the lifetime of interest is estimated by

$$\hat S_T(t\mid z)=\prod_{s\in(0,t]}\bigl(1-\exp(\hat\beta^\top z)\hat\Lambda(ds)\bigr),\qquad t<\tau.$$
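Since $\hat\Lambda$ is a step function, the product-integral above reduces to a finite product over its jump points. A minimal sketch, assuming a scalar covariate and jumps supplied as `(s, jump)` pairs; the function name is hypothetical, and the truncation of negative factors at zero is a numerical safeguard added here, not part of the formula in the text.

```python
import math


def conditional_survival(increments, beta, z, t):
    """Product-limit estimate of S_T(t | z) from the jumps of the hazard estimate.

    increments: list of (s, jump) pairs, the jumps of the estimated Lambda.
    """
    surv = 1.0
    for s, jump in increments:
        if 0.0 < s <= t:
            factor = 1.0 - math.exp(beta * z) * jump
            # Safeguard (an addition of this sketch): a factor can fall below
            # zero in small samples, so it is truncated at zero.
            surv *= max(0.0, factor)
    return surv
```

Evaluating the same function at `t = tau` gives the plug-in value used for the cure probability.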

The conditional cure probability is then estimated by

$$\hat S_T(\infty\mid z)=\hat S_T(\tau\mid z)=\prod_{s\in(0,\tau]}\bigl(1-\exp(\hat\beta^\top z)\hat\Lambda(ds)\bigr).$$
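The criterion $\ell_n(p,\beta;\tau)$ of this section is an explicit function of the data, so it can be evaluated without any inner optimization. The sketch below does this for a scalar covariate and maximizes over a user-supplied grid of candidate values. The function names, the grid search, and the floor `1e-300` inside the logarithm are assumptions of this illustration; the paper does not prescribe an optimizer.

```python
import math


def approx_loglik(data, p, beta, tau):
    """Evaluate l_n(p, beta; tau) for data = [(x, a, z), ...] with scalar z."""
    n = len(data)

    def e0(s, weight):
        # E_n^{(0)}(s; weight, beta) = S_{n,0}^{(0)}(s) + weight * S_{n,1}^{(0)}(s)
        return sum(math.exp(beta * z) * (1.0 if a == 0 else weight)
                   for x, a, z in data if a in (0, 1) and x >= s) / n

    def cum_hazard(t):
        # Lambda_n(t; p, beta): each uncensored X_j <= t adds (1/n) / E_n^{(0)}(X_j)
        return sum(1.0 / (n * e0(x, p)) for x, a, _ in data if a == 0 and x <= t)

    total = 0.0
    for x, a, z in data:
        if x > tau:
            continue
        if a == 0:
            denom = e0(x, p)
            total += (beta * z - math.log(denom)) / n   # first sum
            total -= e0(x, 1.0) / (denom * n)           # integral w.r.t. N_{n,0}
        elif a == 2:
            prob = 1.0 - math.exp(-math.exp(beta * z) * cum_hazard(x))
            # Floor added in this sketch to avoid log(0) when no uncensored
            # observation precedes x.
            total += math.log(max(prob, 1e-300)) / n    # second sum
    return total


def fit_beta_grid(data, p_hat, tau, grid):
    """Return the grid value of beta maximizing the criterion (illustrative)."""
    return max(grid, key=lambda b: approx_loglik(data, p_hat, b, tau))
```

The evaluation costs a number of operations that grows faster than linearly in the sample size; cumulative sums over the sorted observations would be the natural refinement for larger samples.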

### 3.2 The left-censoring and current status data case

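In the case of the model for left-censored and current status data the estimate of $p_0$ is

$$\hat p=\frac{\sum_{i=1}^n\mathbf{1}(A_i=0)}{\sum_{i=1}^n\mathbf{1}(A_i\neq 2)}.$$

Next, using the same notation as above, let us define

$$F^{(l)}_{n,k}(t;\beta)=\frac{1}{n}\sum_{i=1}^n\exp(\beta^\top Z_i)\,Z_i^{\otimes l}\,\mathbf{1}(X_i\le t,\,A_i=k),\qquad l=0,1,\ k\in\{0,1,2\}.$$

Let us denote

$$L^{(l)}_n(t;p,\beta)=F^{(l)}_{n,0}(t;\beta)+p\,F^{(l)}_{n,2}(t;\beta)$$

and, for any $t$ such that $L^{(0)}_n(t;p,\beta)>0$, consider

$$R_n(dt;p,\beta)=\frac{N_{n,0}(dt)}{L^{(0)}_n(t;p,\beta)}.$$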

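Let us fix some (small) value $\varrho$ such that and, similarly to the construction presented in section 3.1, define the approximate log-likelihood function

$$\begin{aligned}\ell_n(p,\beta;\varrho)={}&\frac{1}{n}\sum_{i=1}^nD^\varrho_{0i}\Bigl(\beta^\top Z_i-\log L^{(0)}_n(X_i;p,\beta)\Bigr)\\&+\frac{1}{n}\sum_{i=1}^nD^\varrho_{1i}\log\Bigl(1-\exp\Bigl(-\int_{[X_i,\infty)}\frac{\exp(\beta^\top Z_i)}{L^{(0)}_n(s;p,\beta)}N_{n,0}(ds)\Bigr)\Bigr)\\&-\int_{[\varrho,\infty)}\frac{L^{(0)}_n(t;1,\beta)}{L^{(0)}_n(t;p,\beta)}N_{n,0}(dt),\end{aligned}$$

where $D^\varrho_{ki}=\mathbf{1}(X_i\ge\varrho,\,A_i=k)$. The regression parameter is then estimated by

$$\hat\beta=\arg\max_{\beta\in B}\ell_n(\hat p,\beta;\varrho),$$

where $B$ is a set of parameters and $\varrho$ is fixed by the statistician. As in the previous model, imposing a bound $\varrho$, here a lower one, is a technical condition usually ignored in applications. Next, the conditional distribution function of the lifetime of interest is estimated by

$$\hat F_T(t\mid z)=\prod_{s\in(t,\infty)}\bigl(1-\exp(\hat\beta^\top z)R_n(ds;\hat p,\hat\beta)\bigr),\qquad t\ge\varrho.$$

The zero lifetime conditional probability is then estimated by

$$\hat F_T(0\mid z)=\hat F_T(\varrho\mid z)$$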

and the baseline cumulative reverse hazard is estimated by .
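The left-censoring estimators mirror those of section 3.1, with at-risk sets replaced by the sets $\{X_i\le t\}$ and the product running over jump points to the right of $t$. A minimal sketch for a scalar covariate, with hypothetical function names and the same truncation safeguard as an added assumption:

```python
import math


def reverse_hazard_increments(data, p, beta):
    """Jumps of R_n(.; p, beta) at the distinct uncensored times.

    data: list of (x, a, z) with a in {0, 1, 2} and scalar z (an assumption).
    """
    n = len(data)
    event_times = sorted({x for x, a, _ in data if a == 0})
    increments = []
    for s in event_times:
        mass = sum(1 for x, a, _ in data if a == 0 and x == s) / n
        # F_{n,0}^{(0)}(s; beta) and F_{n,2}^{(0)}(s; beta)
        f0 = sum(math.exp(beta * z) for x, a, z in data if a == 0 and x <= s) / n
        f2 = sum(math.exp(beta * z) for x, a, z in data if a == 2 and x <= s) / n
        increments.append((s, mass / (f0 + p * f2)))
    return increments


def conditional_cdf(increments, beta, z, t):
    """Product-limit estimate of F_T(t | z): product over jump points s > t."""
    cdf = 1.0
    for s, jump in increments:
        if s > t:
            # Truncation at zero is a safeguard added in this sketch.
            cdf *= max(0.0, 1.0 - math.exp(beta * z) * jump)
    return cdf
```

With $\beta=0$ and no `a == 2` observation at or before the smallest uncensored time, the jump of $R_n$ at that time equals one, so the estimated distribution function vanishes below it. This is one way to see why a lower bound $\varrho$ is introduced.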

## 4 Asymptotic results

For the asymptotic results we only consider the investigation of the right-censored and current status data case. For the left-censored and current status data case the results are similar and could be obtained after obvious modifications.

Let $P$ be the probability distribution of $(X,A,Z)$ and for any integrable function $f$ let $Pf=\int f\,dP$. Let

$$P_n=\frac{1}{n}\sum_{i=1}^n\delta_{(X_i,A_i,Z_i)}$$

be the empirical measure and $G_n=\sqrt{n}(P_n-P)$.

Let us introduce the following additional assumptions.

• The vector of covariates lies in , with

fixed, has a positive definite variance and is bounded, that is

a.s. Moreover, is an interior point of the parameter set that is a compact subset of , and ;

• The value is such that

For simplicity we rule out the case because in this case and a.s., that is we are exactly in the classical PH model under right-censoring. Since is strictly positive, Assumption A3) is equivalent to Also for simplicity, in the sequel we assume that the lifetime of interest and the censoring time are almost surely different. Let us notice that the construction we propose in sections 2.1 and 2.2 adapts to the case where depends on the sample size, or to the case where is an infinite-dimensional space. The study of the properties of the estimators defined in such cases is left for future work.

###### Theorem 4.1 (Consistency).

Let . Assume and Assumptions A1A3 hold true. Then:

1. , in probability;

2. in probability.

###### Theorem 4.2 (I.i.d. representation).

Under the assumptions of Theorem 4.1 we have:

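$$\sqrt{n}\begin{pmatrix}\hat p-p_0\\ \hat\beta-\beta_0\\ \hat\Lambda(t)-\Lambda_0(t)\end{pmatrix}=G_n\tilde\ell_{t;p_0,\beta_0,\Lambda_0}+R_n(t),\qquad t\in[0,\tau],$$

where $\tilde\ell_{t;p_0,\beta_0,\Lambda_0}$ is some square integrable function and $R_n(t)$ is a remainder term that is uniformly negligible, that is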

###### Corollary 4.3 (Clt).

Under the assumptions of Theorem 4.2

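$$\sqrt{n}\begin{pmatrix}\hat p-p_0\\ \hat\beta-\beta_0\\ \hat\Lambda(\cdot)-\Lambda_0(\cdot)\end{pmatrix}\rightsquigarrow G\quad\text{in }\mathbb{R}^{q+1}\times\ell^\infty([0,\tau]),$$

where $G$ is a tight, zero-mean Gaussian process with covariance function

$$\rho_G(s,t)=P\,\tilde\ell_{s;p_0,\beta_0,\Lambda_0}\,\tilde\ell^\top_{t;p_0,\beta_0,\Lambda_0},\qquad 0\le s,t\le\tau.$$

We could also derive the asymptotic law of the estimator of the survivor function for an arbitrary value $z$ in the support of the covariates. The following result is a straightforward extension of classical results for the Cox PH model, see Link (1984).

###### Corollary 4.4 (CLT for the conditional survivor).

Under the assumptions of Theorem 4.2 and for any fixed $z\in\mathcal{Z}$,

$$\sqrt{n}\bigl(\hat S_T(\cdot\mid z)-S_T(\cdot\mid z)\bigr)\rightsquigarrow S_z\quad\text{in }\ell^\infty([0,\tau]),$$

where $S_z$ is a tight, zero-mean Gaussian process.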

Let us now investigate the estimator of the cure rate. Suppose that has a bounded support and let be its right endpoint. Assume that Then in our model we necessarily have and Since one cannot identify the law of beyond the last uncensored observation, by an usual convention, These quantities could be estimated by The following corollary is a direct consequence of Corollary 4.4.

###### Corollary 4.5 (CLT for the conditional cure rate).

Suppose that the assumptions A1, A2 hold true. Moreover, has a bounded support with right endpoint . Assume that Then

 √n(ˆST(X0(n)∣z)−ST(∞∣z))⇝N(0,V(z)),

where is the largest uncensored observation and with from Corollary 4.4.

The estimation of the covariance functions of the processes $G$ and $S_z$, and of the variance $V(z)$, is quite difficult. Therefore we propose an alternative route, based on the weighted bootstrap, for estimating the asymptotic law of our estimators. Let us consider that $\tilde\ell_{\cdot;\hat p,\hat\beta,\hat\Lambda}$ is a uniformly consistent estimator of $\tilde\ell_{\cdot;p_0,\beta_0,\Lambda_0}$. Next, let us define

$$G'_n=\frac{1}{\sqrt{n}}\sum_{i=1}^n(\xi_i-\bar\xi)\,\delta_{(X_i,A_i,Z_i)}$$

where $\xi_1,\dots,\xi_n$ are i.i.d. random variables with zero mean and unit variance, for instance Gaussian, independent of the data, and $\bar\xi$ denotes their sample mean.
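Applied to a function $f$, the weighted bootstrap process is simply a centered, randomly weighted sum of the values $f(X_i,A_i,Z_i)$. The sketch below draws replicates of $G'_n f$ for a scalar-valued $f$ with Gaussian multipliers. The function names are hypothetical, and the input `values` stands for the already evaluated $f(X_i,A_i,Z_i)$; how to compute the estimated influence function itself is not covered here.

```python
import math
import random


def multiplier_draw(values, rng):
    """One draw of G'_n f = n^{-1/2} * sum_i (xi_i - xi_bar) * f_i.

    values: list of scalars f_i = f(X_i, A_i, Z_i), computed beforehand.
    rng: a random.Random instance supplying the Gaussian multipliers.
    """
    n = len(values)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]   # zero mean, unit variance
    xi_bar = sum(xi) / n
    return sum((w - xi_bar) * v for w, v in zip(xi, values)) / math.sqrt(n)


def multiplier_replicates(values, n_rep, seed=0):
    """Draw n_rep independent replicates, holding the data fixed."""
    rng = random.Random(seed)
    return [multiplier_draw(values, rng) for _ in range(n_rep)]
```

Empirical quantiles of such replicates can then serve as approximations of the quantiles of the limit law. Because the multipliers are centered at their sample mean, a constant `values` vector always yields zero up to rounding.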

###### Theorem 4.6 (Asymptotic law approximation).

Under the assumptions of Theorem 4.2:

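$$\bigl(G_n\tilde\ell_{\cdot;p_0,\beta_0,\Lambda_0},\;G'_n\tilde\ell_{\cdot;\hat p,\hat\beta,\hat\Lambda}\bigr)\rightsquigarrow(G,G')\quad\text{in }\bigl(\mathbb{R}^{q+1}\times\ell^\infty([0,\tau])\bigr)^2$$

where $G$ and $G'$ are independent and identically distributed.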

As a direct consequence of Theorem 4.6 one could obtain the validity of the bootstrap approximation of the asymptotic laws stated in Corollaries 4.3 to 4.5. The details are omitted.

## References

• [1] Braekers, R. & Grouwels, Y. (2015). A semi-parametric Cox’s regression model for zero-inflated left-censored time to event data. Communications in Statistics – Theory and Methods 45(7), 1969–1988.
• [2] Cox, D.R. (1972). Regression models and life tables (with discussion). J. Roy. Statist. Soc. Ser. B 34, 187–220.
• [3] Cox, D.R. (1975). Partial likelihood. Biometrika 62, 269–276.
• [4] Fang, H.B., Li, G., & Sun, J. (2005). Maximum likelihood estimation in a semiparametric Logistic/proportional-hazards mixture model. Scand. J. Statist. 32, 59–75.
• [5] Gill, R.D. (1994). Lectures on survival analysis. Lectures on probability theory: Ecole d’été de probabilités de Saint-Flour XXII. Lecture notes in mathematics 1581. Springer.

• [6] Gill, R.D., Johansen, S. (1990). A Survey of Product-Integration with a View Toward Application in Survival Analysis. Ann. Statist. 18(4), 1501–1555.
• [7] Huang, J. (1999). Asymptotic properties of nonparametric estimation based on partly interval-censored data. Statistica Sinica 9, 501–519.
• [8] Kim, J.S. (2003). Maximum likelihood estimation for the proportional hazards models with partly interval-censored data. J. Royal Stat. Soc. B 65, 489–502.
• [9] Kim, Y., Kim, B., Jang, W. (2010). Asymptotic properties of the maximum likelihood estimator for the proportional hazards model with doubly censored data. J. Multivar. Anal. 101, 1339–1351.
• [10] Kosorok, M.D. (2008). Introduction to empirical process and semiparametric inference. Springer Series in Statistics, Springer: New-York.
• [11] Link, C.L. (1984). Confidence intervals for the survival function using Cox’s proportional-hazard model with covariates. Biometrics 40, 601–609.
• [12] Murphy, S. A., & A. W. van der Vaart (2000). On profile likelihood. J. Amer. Statist. Assoc. 95(450), 449–465.
• [13] Patilea, V., & Rolin, J.-M. (2006a). Product-limit estimators of the survival function for two modified forms of current-status data. Bernoulli 12, 801–819.
• [14] Patilea, V., & Rolin, J.-M. (2006b). Product-limit estimators of the survival function with twice censored data. Ann. Statist. 34, 925–938.
• [15] Turnbull, B.W. (1974). Nonparametric estimation of a survivorship function with doubly censored data. J. Amer. Statist. Assoc. 69, 169–173.
• [16] van der Vaart, A.W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge.
• [17] van der Vaart, A.W., & Wellner, J.A. (1996). Weak convergence and empirical processes. Springer Series in Statistics. Springer-Verlag, New-York.
• [18] van der Vaart, A.W., & Wellner, J.A. (2007). Empirical processes indexed by estimated functions. In Asymptotics: Particles, Processes and Inverse Problems, IMS Lecture Notes–Monograph Series, Vol. 55, 234–252.
• [19] Zheng, D., Yin, G. & Ibrahim, J.G. (2006). Semiparametric Transformation Models for Survival Data With a Cure Fraction. J. Amer. Statist. Assoc. 101(474), 670–684.

## 5 Appendix

### 5.1 Notation

For any matrix , we denote by Let us recall that vectors are considered as column matrices. The spaces of functions we consider are endowed with the uniform (supremum) norm that is denoted by Let and denote the partial derivation operators with respect to and respectively.

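Let

$$s^{(l)}_k(t;\beta)=\mathbb{E}\{S^{(l)}_{n,k}(t;\beta)\}=\mathbb{E}\{\exp(\beta^\top Z)\,Z^{\otimes l}\,\mathbf{1}(X\ge t,\,A=k)\},$$

and

$$e^{(l)}(t;\theta)=e^{(l)}(t;p,\beta)=\mathbb{E}\{E^{(l)}_n(t;\theta)\}=s^{(l)}_0(t;\beta)+p\,s^{(l)}_1(t;\beta),\qquad l=0,1,\ k\in\{0,1,2\}.$$

Let

$$\begin{aligned}\ell(p,\beta;\tau)={}&\mathbb{E}\bigl[\beta^\top Z\,\mathbf{1}(X\le\tau,\,A=0)\bigr]-\int_{[0,\tau]}\log\bigl(e^{(0)}(t;p,\beta)\bigr)H_0(dt)\\&+\mathbb{E}\bigl[\log\bigl(1-\exp(-\exp(\beta^\top Z)\Lambda(X;p,\beta))\bigr)\mathbf{1}(X\le\tau,\,A=2)\bigr]\\&-\int_{[0,\tau]}\frac{e^{(0)}(t;1,\beta)}{e^{(0)}(t;p,\beta)}H_0(dt).\end{aligned}$$

The criterion $\ell(p,\beta;\tau)$ is expected to be the limit of the approximate log-likelihood function $\ell_n(p,\beta;\tau)$. Let us recall that $P$ denotes the probability distribution of $(X,A,Z)$ and for any integrable function $f$ let $Pf=\int f\,dP$. Moreover,

$$P_n=\frac{1}{n}\sum_{i=1}^n\delta_{(X_i,A_i,Z_i)}$$

is the empirical measure, and $G_n=\sqrt{n}(P_n-P)$. Finally, define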

 δk(a)=1(a=k),k∈{0,1,2}.

### 5.2 Proof of Theorem 4.1

To prove consistency for , it suffices, for instance, to use the results from section 5.2 of van der Vaart (1998). This means to check that

 ℓ(p0,β0;τ)>