Lifetime Data Anal
DOI 10.1007/s10985-015-9325-0
Estimating the survival function based on the semi-Markov model for dependent censoring
Ziqiang Zhao1 · Ming Zheng1 · Zhezhen Jin2
Received: 10 March 2014 / Accepted: 7 March 2015
© Springer Science+Business Media New York 2015
Abstract In this paper, we study a nonparametric maximum likelihood estimator (NPMLE) of the survival function based on a semi-Markov model under dependent censoring. We show that the NPMLE is asymptotically normal and achieves asymptotic nonparametric efficiency. We also provide a uniformly consistent estimator of the corresponding asymptotic covariance function based on an information operator. The finite-sample performance of the proposed NPMLE is examined with simulation studies, which show that the NPMLE has smaller mean squared error than the existing estimators and its corresponding pointwise confidence intervals have reasonable coverages. A real example is also presented.
Keywords Semi-Markov model · Dependent censoring · NPMLE · Survival function
Electronic supplementary material The online version of this article (doi:10.1007/s10985-015-9325-0) contains supplementary material, which is available to authorized users.
B Zhezhen [email protected]
Ziqiang [email protected]
Ming [email protected]
1 Department of Statistics, School of Management, Fudan University, 670 Guoshun Road, Shanghai, China
2 Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY, USA
1 Introduction
In survival analysis, the survival function of the failure time is commonly estimated by the Kaplan-Meier estimator and the Nelson-Aalen estimator. For these two estimators, a key assumption is that the censoring time and the survival time are independent. It is challenging to estimate the survival function under dependent censorship (Tsiatis 1975).
In oncology studies, independent dropout yields independent censoring. In addition, subjects are often censored due to the ending of the study or to the onset of progressive disease (PD). The study-ending censoring is usually independent of the survival time, while the PD-related censoring might depend on the survival time since PD could be a precursor of both death and loss to follow-up. In other words, in the presence of PD, the patients have a much higher risk of death and are more likely to leave the study. Ignoring such dependence would yield biased and inconsistent estimation of the survival function. On the other hand, it is possible to improve the estimation if the dependency information is properly used. Datta et al. (2000) considered nonparametric estimation using a three-stage irreversible illness–death model.
In the case of dependent censoring, Lee and Tsai (2005) proposed a semi-Markov model and developed an empirical-type estimator of the survival function. The asymptotic variance of their proposed estimator, however, is too complicated to compute. In this paper, we present a nonparametric maximum likelihood estimator (NPMLE) of the survival function based on the semi-Markov model. We show that our proposed NPMLE converges weakly to a Gaussian process and achieves asymptotic efficiency. In addition, we develop a consistent estimator of its asymptotic covariance function based on an information operator, which can be easily calculated.
The remainder of the paper is organized as follows. In Sect. 2, we will introduce the semi-Markov model. In Sect. 3, we will derive the NPMLE of the survival function. In Sect. 4, we will establish the asymptotic properties of the NPMLE, and construct a consistent estimator of its asymptotic covariance function based on an information operator. In Sect. 5, we will present simulation studies and re-analysis of the example in Lee and Tsai (2005). We will conclude with a short discussion in Sect. 6 and provide sketches of the proofs of theorems in the Appendix.
2 The semi-Markov model
Let T be the survival time and U be the PD censoring time. The semi-Markov model proposed by Lee and Tsai (2005) assumes that:
\[
\lambda_{T|U}(t \mid u) = \lambda_0^{(0,2)}(t)\, I\{t \le u\} + \lambda_0^{(1,2)}(t - u)\, I\{t > u\}, \quad \text{for } t, u \ge 0, \tag{1}
\]
where $\lambda_{T|U}$ is the conditional hazard function of T given U, $I\{\cdot\}$ is the indicator function taking the value 1 if the condition is satisfied and the value 0 otherwise, and $\lambda_0^{(0,2)}$ and $\lambda_0^{(1,2)}$ are unknown hazard functions for death without PD and death with PD, respectively.
Fig. 1 Transition
Note that the cumulative form of Model (1) is:
\[
\Lambda_{T|U}(t \mid u) = \Lambda_0^{(0,2)}(\min\{t, u\}) + \Lambda_0^{(1,2)}(t - u)\, I\{t > u\}, \quad \text{for } t, u \ge 0, \tag{2}
\]
and that for any $t, u \ge 0$,
\[
S_{T|U}(t \mid u) =
\begin{cases}
S_0^{(0,2)}(t), & t \le u, \\
S_0^{(0,2)}(u)\, S_0^{(1,2)}(t - u), & t > u,
\end{cases}
\]
where $\Lambda_0^{(0,1)}, \Lambda_0^{(0,2)}, \Lambda_0^{(1,2)}$ and $S_0^{(0,1)}, S_0^{(0,2)}, S_0^{(1,2)}$ are the corresponding cause-specific cumulative hazard functions and the corresponding cause-specific survival functions for $\lambda_0^{(0,1)}$, $\lambda_0^{(0,2)}$ and $\lambda_0^{(1,2)}$, respectively. The superscript values 0, 1, 2 indicate three different states, with 0 being the state of alive without PD, 1 being the state of alive with PD and 2 being the state of death.
More precisely, Model (1) corresponds to a non-homogeneous semi-Markov process J with the state space {0, 1, 2}:
\[
J(t) =
\begin{cases}
0, & T > t,\ U > t \quad (\text{Alive without PD at time } t), \\
1, & T > t,\ U \le t \quad (\text{Alive with PD at time } t), \\
2, & T \le t \quad (\text{Died at time } t),
\end{cases}
\quad \text{for } t \ge 0.
\]
With this specification, if $\lambda_0^{(0,1)}$ denotes the hazard function of U, i.e. the hazard function for PD, it can be shown that $\lambda_0^{(0,1)}$, $\lambda_0^{(0,2)}$ and $\lambda_0^{(1,2)}$ are respectively the cause-specific hazard functions for the transition from state 0 to state 1, state 0 to state 2, and state 1 to state 2. Figure 1 provides a plot of transition for illustration.
In addition, we use C to denote independent censoring, which is independent of (T, U). Let X = min{T, C} and Δ = I{T ≤ C}. When X ≤ U, the subject is dead or censored before PD, i.e., (X, Δ) is observed while U is not. When X > U, PD occurs before both death and the independent censoring C, and the subject leaves the study at time U with probability 1 − θ0; that is, U is observed and (X, Δ) is observed with probability θ0, where θ0 is an unknown parameter.
Let V = min{X, U}, R = I{X ≤ U} and ξ be the indicator of observing (X, Δ). The observed data for a subject is (V, R, ξ, ξX, ξΔ). Throughout the paper, it is assumed that ξ is conditionally independent of (T, U, C) given R with P(ξ = 1 | R = 1) = 1 and P(ξ = 1 | R = 0) = θ0. For a sample of size n, we observe $(V_i, R_i, \xi_i, \xi_i X_i, \xi_i \Delta_i)$, i = 1, ..., n.
3 Nonparametric maximum likelihood estimation
In general, without additional information, likelihood based estimation is not available for dependent censoring problems. Using the semi-Markov model specification, it is possible to develop likelihood based estimation.
Next, we present the likelihood for the unknown parameters:
\[
\left(\Lambda^{(0,1)}, \Lambda^{(0,2)}, \Lambda^{(1,2)}, \Lambda^{(c)}, \theta\right),
\]
where $\Lambda^{(c)}$ is the parameter for $\Lambda_0^{(c)}$, which is the true cumulative hazard function of C. Note that $\Lambda^{(c)}$ and $\theta$ are nuisance parameters.
Suppose that $\Lambda^{(0,1)}, \Lambda^{(0,2)}, \Lambda^{(1,2)}$ and $\Lambda^{(c)}$ are differentiable with corresponding derivatives $\lambda^{(0,1)}, \lambda^{(0,2)}, \lambda^{(1,2)}$ and $\lambda^{(c)}$. Let $S^{(0,1)}, S^{(0,2)}, S^{(1,2)}$ and $S^{(c)}$ be the corresponding survival functions of $\Lambda^{(0,1)}, \Lambda^{(0,2)}, \Lambda^{(1,2)}$ and $\Lambda^{(c)}$. With these notations, the likelihood of $\left(\Lambda^{(0,1)}, \Lambda^{(0,2)}, \Lambda^{(1,2)}, \Lambda^{(c)}, \theta\right)$ can be derived.
For subjects with $\xi_i = 1$, $(V_i, X_i, \Delta_i, R_i, \xi_i)$ is observed and its corresponding likelihood can be obtained based on the joint distribution of $(V, X, \Delta, R, \xi)$. Specifically, when $\xi_i = 1$, $R_i = 1$ and $\Delta_i = 1$, the likelihood is:
\[
\lambda^{(0,2)}(V_i)\, S^{(0,2)}(V_i)\, S^{(c)}(V_i)\, S^{(0,1)}(V_i);
\]
when $\xi_i = 1$, $R_i = 1$ and $\Delta_i = 0$, the likelihood is:
\[
\lambda^{(c)}(V_i)\, S^{(c)}(V_i)\, S^{(0,2)}(V_i)\, S^{(0,1)}(V_i);
\]
when $\xi_i = 1$, $R_i = 0$ and $\Delta_i = 1$, the likelihood is:
\[
\lambda^{(1,2)}(X_i - V_i)\, S^{(1,2)}(X_i - V_i)\, S^{(c)}(X_i)\, \lambda^{(0,1)}(V_i)\, S^{(0,1)}(V_i)\, S^{(0,2)}(V_i)\, \theta;
\]
when $\xi_i = 1$, $R_i = 0$ and $\Delta_i = 0$, the likelihood is:
\[
\lambda^{(c)}(X_i)\, S^{(c)}(X_i)\, S^{(1,2)}(X_i - V_i)\, \lambda^{(0,1)}(V_i)\, S^{(0,1)}(V_i)\, S^{(0,2)}(V_i)\, \theta.
\]
For subjects with $\xi_i = 0$, $(V_i, R_i, \xi_i)$ is observed and its corresponding likelihood is:
\[
\lambda^{(0,1)}(V_i)\, S^{(0,1)}(V_i)\, S^{(0,2)}(V_i)\, S^{(c)}(V_i)\, (1 - \theta).
\]
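Hence, the likelihood based on the observed data $(V_i, R_i, \xi_i, \xi_i X_i, \xi_i \Delta_i)$, i = 1, ..., n, is proportional to
\[
\begin{aligned}
&\prod_{i=1}^{n} \left[ \left[\lambda^{(0,1)}(V_i)\right]^{(1 - R_i)} \prod_{y \in [0, V_i)} \left(1 - \Lambda^{(0,1)}(dy)\right) \right] \\
&\quad \times \prod_{i=1}^{n} \left[ \left[\lambda^{(0,2)}(V_i)\right]^{\Delta_i R_i} \prod_{y \in [0, V_i)} \left(1 - \Lambda^{(0,2)}(dy)\right) \right] \\
&\quad \times \prod_{i=1}^{n} \left[ \left[\lambda^{(1,2)}(X_i - V_i)\right]^{\Delta_i} \prod_{y \in [0, X_i - V_i)} \left(1 - \Lambda^{(1,2)}(dy)\right) \right]^{\xi_i (1 - R_i)}.
\end{aligned}
\]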
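Unfortunately, this function is unbounded from above and the usual maximum likelihood estimator (MLE) does not exist when $\Lambda_0^{(0,1)}$, $\Lambda_0^{(0,2)}$ and $\Lambda_0^{(1,2)}$ are restricted to continuous functions. With discretized extensions by allowing $\Lambda_0^{(0,1)}$, $\Lambda_0^{(0,2)}$ and $\Lambda_0^{(1,2)}$ to be discontinuous, the NPMLE is well defined in the sense of Kiefer and Wolfowitz (1956) and Scholz (1980). Specifically, we assume that $\Lambda_0^{(0,1)}$, $\Lambda_0^{(0,2)}$ and $\Lambda_0^{(1,2)}$ are piecewise constant and cadlag, that is, right continuous with left limits.
To obtain the NPMLE, we rewrite the likelihood function with the discretized $\Lambda^{(0,1)}$, $\Lambda^{(0,2)}$ and $\Lambda^{(1,2)}$:
\[
L_n\left(\Lambda^{(0,1)}, \Lambda^{(0,2)}, \Lambda^{(1,2)}\right) = L_n^{(0,1)}\left(\Lambda^{(0,1)}\right) L_n^{(0,2)}\left(\Lambda^{(0,2)}\right) L_n^{(1,2)}\left(\Lambda^{(1,2)}\right),
\]
where for any cumulative hazard function $\Lambda$,
\[
\begin{aligned}
L_n^{(0,1)}(\Lambda) &= \prod_{i=1}^{n} \left[ \left[\Lambda\{V_i\}\right]^{(1 - R_i)} \prod_{y \in [0, V_i)} \left(1 - \Lambda(dy)\right) \right], \\
L_n^{(0,2)}(\Lambda) &= \prod_{i=1}^{n} \left[ \left[\Lambda\{V_i\}\right]^{\Delta_i R_i} \left(1 - \Lambda\{V_i\}\right)^{1 - \Delta_i R_i} \prod_{y \in [0, V_i)} \left(1 - \Lambda(dy)\right) \right], \\
L_n^{(1,2)}(\Lambda) &= \prod_{i=1}^{n} \left[ \left[\Lambda\{X_i - V_i\}\right]^{\Delta_i} \left(1 - \Lambda\{X_i - V_i\}\right)^{1 - \Delta_i} \prod_{y \in [0, X_i - V_i)} \left(1 - \Lambda(dy)\right) \right]^{\xi_i (1 - R_i)},
\end{aligned}
\]
with $\Lambda\{t\} = \Lambda(t) - \Lambda(t-)$ for all $t \ge 0$.
It is also assumed that $\Lambda_0^{(0,1)}$ does not share jump points with $\Lambda_0^{(0,2)}$ and $\Lambda_0^{(c)}$, which means that the time of PD censoring occurrence is different from the time of death or the time of independent censoring. Then, for any cumulative hazard function $\Lambda$,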
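\begin{align}
\log L_n^{(0,1)}(\Lambda) &\le \int_0^{+\infty} \log\left[\Lambda\{t\}\right] N_n^{(0,1)}(dt) + \sum_{t \ge 0} Y_n^{(0)}(t+) \log\left[1 - \Lambda\{t\}\right] \tag{3} \\
&\le \int_0^{+\infty} \log\left[\Lambda_n^{(0,1)}\{t\}\right] N_n^{(0,1)}(dt) + \sum_{t \ge 0} Y_n^{(0)}(t+) \log\left[1 - \Lambda_n^{(0,1)}\{t\}\right] \tag{4} \\
&= \log L_n^{(0,1)}\left(\Lambda_n^{(0,1)}\right), \notag \\
\log L_n^{(0,2)}(\Lambda) &\le \int_0^{+\infty} \log\left[\Lambda\{t\}\right] N_n^{(0,2)}(dt) + \sum_{t \ge 0} Y_n^{(0)}(t+) \log\left[1 - \Lambda\{t\}\right] \tag{5} \\
&\le \int_0^{+\infty} \log\left[\Lambda_n^{(0,2)}\{t\}\right] N_n^{(0,2)}(dt) + \sum_{t \ge 0} Y_n^{(0)}(t+) \log\left[1 - \Lambda_n^{(0,2)}\{t\}\right] \tag{6} \\
&= \log L_n^{(0,2)}\left(\Lambda_n^{(0,2)}\right), \notag \\
\log L_n^{(1,2)}(\Lambda) &\le \int_0^{+\infty} \log\left[\Lambda\{t\}\right] N_n^{(1,2)}(dt) + \sum_{t \ge 0} Y_n^{(1)}(t+) \log\left[1 - \Lambda\{t\}\right] \tag{7} \\
&\le \int_0^{+\infty} \log\left[\Lambda_n^{(1,2)}\{t\}\right] N_n^{(1,2)}(dt) + \sum_{t \ge 0} Y_n^{(1)}(t+) \log\left[1 - \Lambda_n^{(1,2)}\{t\}\right] \tag{8} \\
&= \log L_n^{(1,2)}\left(\Lambda_n^{(1,2)}\right), \notag
\end{align}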
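where for any $t \ge 0$,
\begin{align*}
Y_n^{(0)}(t) &= \sum_{i=1}^{n} I\{V_i \ge t\}, &
Y_n^{(1)}(t) &= \sum_{i=1}^{n} \xi_i (1 - R_i)\, I\{X_i - V_i \ge t\}, \\
N_n^{(0,1)}(t) &= \sum_{i=1}^{n} (1 - R_i)\, I\{V_i \le t\}, &
N_n^{(0,2)}(t) &= \sum_{i=1}^{n} \Delta_i R_i\, I\{V_i \le t\}, \\
N_n^{(1,2)}(t) &= \sum_{i=1}^{n} \xi_i \Delta_i (1 - R_i)\, I\{X_i - V_i \le t\},
\end{align*}
\begin{align*}
\Lambda_n^{(0,1)}(t) &= \int_{[0,t]} \left(Y_n^{(0)}(y)\right)^{-1} I\left\{Y_n^{(0)}(y) > 0\right\} N_n^{(0,1)}(dy), \\
\Lambda_n^{(0,2)}(t) &= \int_{[0,t]} \left(Y_n^{(0)}(y)\right)^{-1} I\left\{Y_n^{(0)}(y) > 0\right\} N_n^{(0,2)}(dy), \\
\Lambda_n^{(1,2)}(t) &= \int_{[0,t]} \left(Y_n^{(1)}(y)\right)^{-1} I\left\{Y_n^{(1)}(y) > 0\right\} N_n^{(1,2)}(dy).
\end{align*}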
The equalities for (3), (5), and (7) hold if and only if $\Lambda$ is a pure jump function. The inequalities in (4), (6), and (8) follow from the fact that for any $\lambda \in (0, 1)$ and any $\alpha, \beta > 0$,
\[
\alpha \log \lambda + \beta \log(1 - \lambda) \le \alpha \log\left(\frac{\alpha}{\alpha + \beta}\right) + \beta \log\left(\frac{\beta}{\alpha + \beta}\right).
\]
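For completeness, this elementary inequality can be verified by a one-variable maximization. The short derivation below is a standard calculus argument supplied for illustration; the auxiliary function $f$ and the maximizer $\lambda^*$ are notation introduced here only.

```latex
\begin{align*}
f(\lambda) &= \alpha \log \lambda + \beta \log(1 - \lambda), \qquad \lambda \in (0, 1), \\
f'(\lambda) &= \frac{\alpha}{\lambda} - \frac{\beta}{1 - \lambda}, \qquad
f''(\lambda) = -\frac{\alpha}{\lambda^{2}} - \frac{\beta}{(1 - \lambda)^{2}} < 0, \\
f'(\lambda^{*}) = 0 \;&\Longleftrightarrow\; \lambda^{*} = \frac{\alpha}{\alpha + \beta},
\qquad 1 - \lambda^{*} = \frac{\beta}{\alpha + \beta}, \\
f(\lambda) \le f(\lambda^{*}) &= \alpha \log\left(\frac{\alpha}{\alpha + \beta}\right) + \beta \log\left(\frac{\beta}{\alpha + \beta}\right).
\end{align*}
```

Since $f$ is strictly concave, $\lambda^{*}$ is its unique maximizer on $(0, 1)$, so the bound is attained only at $\lambda = \alpha / (\alpha + \beta)$.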
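According to Criteria (B) of Scholz (1980), $\left(\Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)}\right)$ is the NPMLE of $\left(\Lambda_0^{(0,1)}, \Lambda_0^{(0,2)}, \Lambda_0^{(1,2)}\right)$, since $L_n\left(\Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)}\right) > 0$.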
Let $S_0$ be the marginal survival function of T. Note that for any $t \ge 0$,
\begin{align*}
S_0(t) = P(T > t) &= -\int_{\mathbb{R}_+} S_{T|U}(t \mid u)\, S_0^{(0,1)}(du) \\
&= S_0^{(0,1)}(t)\, S_0^{(0,2)}(t) - \int_{[0,t]} S_0^{(0,2)}(u)\, S_0^{(1,2)}(t - u)\, S_0^{(0,1)}(du).
\end{align*}
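Correspondingly, for any $t \ge 0$, define
\[
S_n(t) = S_n^{(0,1)}(t)\, S_n^{(0,2)}(t) - \int_{[0,t]} S_n^{(0,2)}(u)\, S_n^{(1,2)}(t - u)\, S_n^{(0,1)}(du),
\]
where for any $t \ge 0$,
\begin{align*}
S_n^{(0,1)}(t) &= \prod_{y \in [0,t]} \left[1 - \Lambda_n^{(0,1)}(dy)\right], &
S_n^{(0,2)}(t) &= \prod_{y \in [0,t]} \left[1 - \Lambda_n^{(0,2)}(dy)\right], \\
S_n^{(1,2)}(t) &= \prod_{y \in [0,t]} \left[1 - \Lambda_n^{(1,2)}(dy)\right].
\end{align*}
By the invariance property, it follows that $\left(S_n^{(0,1)}, S_n^{(0,2)}, S_n^{(1,2)}\right)$ is the NPMLE of $\left(S_0^{(0,1)}, S_0^{(0,2)}, S_0^{(1,2)}\right)$ and $S_n$ is the NPMLE of $S_0$.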
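Because $S_n^{(0,1)}$ is a step function, the integral in the definition of $S_n$ reduces to a finite sum over the jump times of $\Lambda_n^{(0,1)}$. The following Python sketch shows one way to evaluate the plug-in formula. It is an illustration under stated assumptions rather than the authors' implementation: each hazard estimator is assumed to be supplied as a pair (sorted jump times, jump sizes), and the names `step_survival` and `marginal_survival` are hypothetical.

```python
import numpy as np

def step_survival(jump_times, jumps):
    """Product-limit survival S(t) = prod over jump times y <= t of (1 - jump at y)."""
    jump_times = np.asarray(jump_times, dtype=float)
    surv = np.cumprod(1.0 - np.asarray(jumps, dtype=float))

    def S(t):
        k = np.searchsorted(jump_times, t, side="right")  # number of jumps <= t
        return 1.0 if k == 0 else float(surv[k - 1])

    return S

def marginal_survival(t, h01, h02, h12):
    """Plug-in S_n(t) = S01(t) S02(t) - int_[0,t] S02(u) S12(t - u) S01(du)."""
    S01 = step_survival(*h01)
    S02 = step_survival(*h02)
    S12 = step_survival(*h12)
    total = S01(t) * S02(t)
    prev = 1.0  # value of S01 just before the current jump time
    for u, d in zip(*h01):
        cur = prev * (1.0 - d)
        if u <= t:
            # -S01(du) at a jump time u equals S01(u-) - S01(u) >= 0
            total += S02(u) * S12(t - u) * (prev - cur)
        prev = cur
    return total
```

For example, with `h01 = ([2.0], [0.5])`, `h02 = ([1.0], [0.25])` and `h12 = ([3.0], [1.0])`, the call `marginal_survival(2.5, h01, h02, h12)` evaluates 0.5 × 0.75 + 0.75 × 1 × 0.5. Note that the difference `t - u` is computed in floating point, so ties between `t - u` and a jump time of the 1 → 2 hazard may need care with real data.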
4 Asymptotic results
In this section, we present the asymptotic properties of the NPMLE.
4.1 Regularity conditions
We list regularity conditions in this subsection.
For any $t \ge 0$, define $L_0^{(0)}(t) = S_0^{(0,1)}(t)\, S_0^{(0,2)}(t)\, S_0^{(c)}(t)$ and
\[
L_0^{(1)}(t) = -\theta_0\, S_0^{(1,2)}(t) \int_{\mathbb{R}_+} S_0^{(0,2)}(u)\, S_0^{(c)}(u + t)\, S_0^{(0,1)}(du).
\]
The regularity conditions are as follows:
A.1 For a sample of size n, $(T_i, U_i, C_i, X_i, \Delta_i, V_i, R_i, \xi_i)$, i = 1, ..., n, are n independent copies of $(T, U, C, X, \Delta, V, R, \xi)$.
A.2 Model (2) holds with $\Lambda_0^{(1,2)}(0) = 0$.
A.3 The censoring time C is independent of (T, U).
A.4 The indicator ξ is conditionally independent of (T, U, C) given R, with $P(\xi = 1 \mid R = 1) = 1$ and $P(\xi = 1 \mid R = 0) = \theta_0 \in (0, 1)$.
A.5 There exists $\tau > 0$ such that $L_0^{(0)}(\tau-) > 0$ and $L_0^{(1)}(\tau-) > 0$.
A.6 For any $t \in [0, \tau]$, $\Lambda_0^{(0,1)}\{t\}\, \Lambda_0^{(0,2)}\{t\} = \Lambda_0^{(0,1)}\{t\}\, \Lambda_0^{(c)}\{t\} = 0$.
Assumption A.1 is commonly satisfied. Assumption A.2 is a technical condition for model specification. Assumption A.3 indicates that C is independent censoring. Assumption A.4 assumes that there is a subgroup of the potentially dependent censored subjects whose (X, Δ) are observed. Assumption A.5 is equivalent to $L_0^{(0)}(0) > 0$ and $L_0^{(1)}(0) > 0$. The required τ is generally not unique and could be chosen as large as possible. Assumption A.6 is a mild technical condition which is weaker than continuity.
4.2 Asymptotic normality
In this subsection, we establish the asymptotic normality and the asymptotic efficiency of the NPMLE estimators. The asymptotic normality is established by convergence to a tight Gaussian process in the space of uniformly bounded functions on $[0, \tau]$. Specifically, $S_n$ is viewed as a random element in $\ell^\infty([0, \tau])$, and $\left(\Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)}\right)$ and $\left(S_n^{(0,1)}, S_n^{(0,2)}, S_n^{(1,2)}\right)$ are viewed as random elements in $\ell_3^\infty([0, \tau]) = \ell^\infty([0, \tau]) \times \ell^\infty([0, \tau]) \times \ell^\infty([0, \tau])$, where $\ell^\infty([0, \tau])$ denotes the space of all uniformly bounded real valued functions defined on $[0, \tau]$ equipped with the uniform norm. The asymptotic efficiency is shown by the convolution theorem.
The first theorem gives the asymptotic normality for $\left(\Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)}\right)$.
Theorem 1 Under assumptions A.1–A.6,
\[
n^{1/2}\left(\Lambda_n^{(0,1)} - \Lambda_0^{(0,1)},\ \Lambda_n^{(0,2)} - \Lambda_0^{(0,2)},\ \Lambda_n^{(1,2)} - \Lambda_0^{(1,2)}\right)
\]
weakly converges to a tight zero-mean Gaussian process in $\ell_3^\infty([0, \tau])$ with covariance function $U_0$, as $n \to \infty$, where for any $t, s \in [0, \tau]$,
\[
U_0(t, s) = \mathrm{diag}\left(U_0^{(0,1)}(t, s),\ U_0^{(0,2)}(t, s),\ U_0^{(1,2)}(t, s)\right),
\]
in which, for any $(i, j) \in \{(0,1), (0,2), (1,2)\}$ and any $t, s \in [0, \tau]$,
\[
U_0^{(i,j)}(t, s) = \int_{[0, \min\{t, s\}]} \left(L_0^{(i)}(y-)\right)^{-1} \left(1 - \Lambda_0^{(i,j)}\{y\}\right) \Lambda_0^{(i,j)}(dy).
\]
The second theorem gives the asymptotic normality for $\left(S_n^{(0,1)}, S_n^{(0,2)}, S_n^{(1,2)}\right)$.
Theorem 2 Under assumptions A.1–A.6,
\[
n^{1/2}\left(S_n^{(0,1)} - S_0^{(0,1)},\ S_n^{(0,2)} - S_0^{(0,2)},\ S_n^{(1,2)} - S_0^{(1,2)}\right)
\]
weakly converges to a tight zero-mean Gaussian process in $\ell_3^\infty([0, \tau])$ with covariance function $V_0$, as $n \to \infty$, where for any $t, s \in [0, \tau]$,
\[
V_0(t, s) = \mathrm{diag}\left(V_0^{(0,1)}(t, s),\ V_0^{(0,2)}(t, s),\ V_0^{(1,2)}(t, s)\right),
\]
in which, for any $(i, j) \in \{(0,1), (0,2), (1,2)\}$ and any $t, s \in [0, \tau]$,
\begin{align*}
V_0^{(i,j)}(t, s) &= S_0^{(i,j)}(t)\, S_0^{(i,j)}(s)\, W_0^{(i,j)}(t, s), \\
W_0^{(i,j)}(t, s) &= \int_{[0, \min\{t, s\}]} \left(L_0^{(i)}(y-)\right)^{-1} \left(1 - \Lambda_0^{(i,j)}\{y\}\right)^{-1} \Lambda_0^{(i,j)}(dy).
\end{align*}
The next theorem establishes the asymptotic normality for $S_n$.
Theorem 3 Under assumptions A.1–A.6, $n^{1/2}(S_n - S_0)$ weakly converges to a tight zero-mean Gaussian process in $\ell^\infty([0, \tau])$ with covariance function $\Omega_0$, as $n \to \infty$, where for any $t, s \in [0, \tau]$,
\begin{align*}
\Omega_0(t, s) ={}& \Omega_0^{(0,1)}(t, s) + \Omega_0^{(0,2)}(t, s) + \Omega_0^{(1,2)}(t, s), \\
\Omega_0^{(0,1)}(t, s) ={}& \int_{(0,t]} \int_{(0,s]} V_0^{(0,1)}(x-, y-)\, S_0^{(0,2)}(dy)\, S_0^{(0,2)}(dx) \\
& - S_0^{(0,2)}(0) \int_{(0,t]} \int_{(0,s]} V_0^{(0,1)}(x-, s - y)\, S_0^{(1,2)}(dy)\, S_0^{(0,2)}(dx) \\
& - S_0^{(0,2)}(0) \int_{(0,t]} \int_{(0,s]} V_0^{(0,1)}(t - x, y-)\, S_0^{(0,2)}(dy)\, S_0^{(1,2)}(dx) \\
& + \left[S_0^{(0,2)}(0)\right]^2 \int_{(0,t]} \int_{(0,s]} V_0^{(0,1)}(t - x, s - y)\, S_0^{(1,2)}(dy)\, S_0^{(1,2)}(dx), \\
\Omega_0^{(0,2)}(t, s) ={}& S_0^{(0,1)}(t)\, S_0^{(0,1)}(s)\, V_0^{(0,2)}(t, s) \\
& - S_0^{(0,1)}(t) \int_{[0,s]} V_0^{(0,2)}(t, y)\, S_0^{(1,2)}(s - y)\, S_0^{(0,1)}(dy) \\
& - S_0^{(0,1)}(s) \int_{[0,t]} V_0^{(0,2)}(x, s)\, S_0^{(1,2)}(t - x)\, S_0^{(0,1)}(dx) \\
& + \int_{[0,t]} \int_{[0,s]} V_0^{(0,2)}(x, y)\, S_0^{(1,2)}(t - x)\, S_0^{(1,2)}(s - y)\, S_0^{(0,1)}(dy)\, S_0^{(0,1)}(dx), \\
\Omega_0^{(1,2)}(t, s) ={}& \int_{[0,t]} \int_{[0,s]} V_0^{(1,2)}(t - x, s - y)\, S_0^{(0,2)}(x)\, S_0^{(0,2)}(y)\, S_0^{(0,1)}(dy)\, S_0^{(0,1)}(dx).
\end{align*}
It is worth noting that, although $\Lambda_n^{(0,1)} - \Lambda_0^{(0,1)}$, $\Lambda_n^{(0,2)} - \Lambda_0^{(0,2)}$ and $\Lambda_n^{(1,2)} - \Lambda_0^{(1,2)}$ themselves are martingales, they do not form a joint martingale, since no common σ-field is available. Hence, the martingale limit theory cannot be directly used here. The proofs of the above theorems are based on empirical process theory and are provided in the Appendix.
4.3 Asymptotic efficiency
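In this subsection, we turn to the asymptotic nonparametric efficiency, which is formally defined by the convolution theorem (Theorem VIII.3.1 of Andersen et al. 1993).
We explicitly specify the Hilbert space required by the convolution theorem. Define $H = H^{(0,1)} \times H^{(0,2)} \times H^{(1,2)}$, where for $(i, j) \in \{(0,1), (0,2), (1,2)\}$,
\[
H^{(i,j)} = \left\{ h : \int_{[0,\tau]} [h(y)]^2\, L_0^{(i)}(y-) \left(1 - \Lambda_0^{(i,j)}\{y\}\right)^{-1} \Lambda_0^{(i,j)}(dy) < +\infty \right\}.
\]
For any $g = \left(g^{(0,1)}, g^{(0,2)}, g^{(1,2)}\right)$, $h = \left(h^{(0,1)}, h^{(0,2)}, h^{(1,2)}\right) \in H$, define
\[
\langle g, h \rangle_H = \sum_{(i,j)} \int_{[0,\tau]} g^{(i,j)}(y)\, h^{(i,j)}(y)\, L_0^{(i)}(y-) \left(1 - \Lambda_0^{(i,j)}\{y\}\right)^{-1} \Lambda_0^{(i,j)}(dy),
\]
where the summation is taken over $\{(0,1), (0,2), (1,2)\}$. For any $h \in H$, define $\|h\|_H = \langle h, h \rangle_H^{1/2}$. It can be shown that H is a Hilbert space with the inner product $\langle \cdot, \cdot \rangle_H$ and the norm $\|\cdot\|_H$.
For any $(i, j) \in \{(0,1), (0,2), (1,2)\}$ and any $h = \left(h^{(0,1)}, h^{(0,2)}, h^{(1,2)}\right) \in H$, define
\[
\Lambda_{n,h}^{(i,j)}(t) = \int_{[0,t]} \left(1 + n^{-1/2} h^{(i,j)}(y)\right) \Lambda_0^{(i,j)}(dy).
\]
Then the likelihood ratio of $\left(\Lambda_{n,h}^{(0,1)}, \Lambda_{n,h}^{(0,2)}, \Lambda_{n,h}^{(1,2)}\right)$ to $\left(\Lambda_0^{(0,1)}, \Lambda_0^{(0,2)}, \Lambda_0^{(1,2)}\right)$ is:
\[
R_n(h) = R_n^{(0,1)}(h)\, R_n^{(0,2)}(h)\, R_n^{(1,2)}(h),
\]
where for any $h = \left(h^{(0,1)}, h^{(0,2)}, h^{(1,2)}\right) \in H$,
\begin{align*}
\log R_n^{(0,1)}(h) ={}& \sum_{i=1}^{n} (1 - R_i) \log\left[1 + n^{-1/2} h^{(0,1)}(V_i)\right] \\
& + \sum_{i=1}^{n} \left[ \log \prod_{y \in [0, V_i)} \left(1 - \Lambda_{n,h}^{(0,1)}(dy)\right) - \log \prod_{y \in [0, V_i)} \left(1 - \Lambda_0^{(0,1)}(dy)\right) \right], \\
\log R_n^{(0,2)}(h) ={}& \sum_{i=1}^{n} \Delta_i R_i \log\left[1 + n^{-1/2} h^{(0,2)}(V_i)\right] \\
& + \sum_{i=1}^{n} (1 - \Delta_i R_i) \left[ \log\left(1 - \Lambda_{n,h}^{(0,2)}\{V_i\}\right) - \log\left(1 - \Lambda_0^{(0,2)}\{V_i\}\right) \right] \\
& + \sum_{i=1}^{n} \left[ \log \prod_{y \in [0, V_i)} \left(1 - \Lambda_{n,h}^{(0,2)}(dy)\right) - \log \prod_{y \in [0, V_i)} \left(1 - \Lambda_0^{(0,2)}(dy)\right) \right], \\
\log R_n^{(1,2)}(h) ={}& \sum_{i=1}^{n} \xi_i (1 - R_i) \Delta_i \log\left[1 + n^{-1/2} h^{(1,2)}(X_i - V_i)\right] \\
& + \sum_{i=1}^{n} \xi_i (1 - R_i)(1 - \Delta_i) \log\left(1 - \Lambda_{n,h}^{(1,2)}\{X_i - V_i\}\right) \\
& - \sum_{i=1}^{n} \xi_i (1 - R_i)(1 - \Delta_i) \log\left(1 - \Lambda_0^{(1,2)}\{X_i - V_i\}\right) \\
& + \sum_{i=1}^{n} \xi_i (1 - R_i) \log \prod_{y \in [0, X_i - V_i)} \left(1 - \Lambda_{n,h}^{(1,2)}(dy)\right) \\
& - \sum_{i=1}^{n} \xi_i (1 - R_i) \log \prod_{y \in [0, X_i - V_i)} \left(1 - \Lambda_0^{(1,2)}(dy)\right).
\end{align*}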
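Theorem 4 Under assumptions A.1–A.6, for any $m > 0$, $h_1, \ldots, h_m \in H$,
\[
\left(\log R_n(h_1), \ldots, \log R_n(h_m)\right)
\]
weakly converges to a Gaussian random vector in $\mathbb{R}^m$ with mean
\[
-\frac{1}{2}\left(\|h_1\|_H^2, \ldots, \|h_m\|_H^2\right)
\]
and covariance matrix
\[
\begin{pmatrix}
\langle h_1, h_1 \rangle_H & \cdots & \langle h_1, h_m \rangle_H \\
\vdots & \ddots & \vdots \\
\langle h_m, h_1 \rangle_H & \cdots & \langle h_m, h_m \rangle_H
\end{pmatrix}.
\]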
Hence, the information operator $I_0 : H \times H \to \mathbb{R}$ is: for any $g, h \in H$,
\[
I_0(g, h) = \langle g, h \rangle_H.
\]
For any $h = \left(h^{(0,1)}, h^{(0,2)}, h^{(1,2)}\right) \in H$, define
\[
\kappa_n(h) = \int_{[0,\tau]} \left( h^{(0,1)}(y)\, \Lambda_n^{(0,1)}(dy) + h^{(0,2)}(y)\, \Lambda_n^{(0,2)}(dy) + h^{(1,2)}(y)\, \Lambda_n^{(1,2)}(dy) \right).
\]
Note that for any $g, h \in H$, the asymptotic covariance of $\kappa_n(g)$ and $\kappa_n(h)$ is
\[
\sum_{(i,j)} \int_{[0,\tau]} g^{(i,j)}(y)\, h^{(i,j)}(y)\, L_0^{(i)}(y-) \left(1 - \Lambda_0^{(i,j)}\{y\}\right)^{-1} \Lambda_0^{(i,j)}(dy) \tag{9}
\]
and can be interpreted as the “inverse” of the information operator $I_0$, where the summation is taken over $\{(0,1), (0,2), (1,2)\}$. The efficiency result for $\left(\Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)}\right)$ and $S_n$ is stated in the following theorem.
Theorem 5 Under assumptions A.1–A.6, $\left(\Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)}\right)$ and $S_n$ are efficient.
The proofs of the above theorems are given in the Appendix.
4.4 Asymptotic covariance function estimation
To carry out statistical inference, it is necessary to estimate the asymptotic covariance function. In this subsection, we provide uniformly consistent estimators for the asymptotic covariance functions.
For any $g = \left(g^{(0,1)}, g^{(0,2)}, g^{(1,2)}\right)$, $h = \left(h^{(0,1)}, h^{(0,2)}, h^{(1,2)}\right) \in H$, define
\[
I_n(g, h) = \sum_{(i,j)} \int_{[0,\tau]} g^{(i,j)}(y)\, h^{(i,j)}(y) \left(1 - \Lambda_n^{(i,j)}\{y\}\right)^{-1} N_n^{(i,j)}(dy),
\]
where the summation is taken over $\{(0,1), (0,2), (1,2)\}$.
For any $g = \left(g^{(0,1)}, g^{(0,2)}, g^{(1,2)}\right)$, $h = \left(h^{(0,1)}, h^{(0,2)}, h^{(1,2)}\right) \in H$, it can be shown that $I_n(g, h)$ is the negative second order directional derivative of $\log L_n$ at $\left(\Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)}\right)$ along the direction $[G_n, H_n]$, where
\begin{align*}
G_n &= \left( \int g^{(0,1)}\, d\Lambda_n^{(0,1)},\ \int g^{(0,2)}\, d\Lambda_n^{(0,2)},\ \int g^{(1,2)}\, d\Lambda_n^{(1,2)} \right), \\
H_n &= \left( \int h^{(0,1)}\, d\Lambda_n^{(0,1)},\ \int h^{(0,2)}\, d\Lambda_n^{(0,2)},\ \int h^{(1,2)}\, d\Lambda_n^{(1,2)} \right).
\end{align*}
Since the function $L_n$ plays the role of the likelihood, $I_n$ should be a reasonable estimator for $I_0$. The “inverse” operation in (9) leads to an estimator for the asymptotic covariance function of $\kappa_n$:
\[
\sum_{(i,j)} \int_{[0,\tau]} g^{(i,j)}(y)\, h^{(i,j)}(y) \left(1 - \Lambda_n^{(i,j)}\{y\}\right)^{-1} Y_n^{(0)}(y)\, \Lambda_n^{(i,j)}(dy)
\]
for all $g = \left(g^{(0,1)}, g^{(0,2)}, g^{(1,2)}\right)$, $h = \left(h^{(0,1)}, h^{(0,2)}, h^{(1,2)}\right) \in H$, where the summation is taken over $\{(0,1), (0,2), (1,2)\}$.
Hence, the estimators for $U_0$ and $V_0$ are given as follows: for any $t, s \in [0, \tau]$,
\begin{align*}
U_n(t, s) &= \mathrm{diag}\left(U_n^{(0,1)}(t, s),\ U_n^{(0,2)}(t, s),\ U_n^{(1,2)}(t, s)\right), \\
V_n(t, s) &= \mathrm{diag}\left(V_n^{(0,1)}(t, s),\ V_n^{(0,2)}(t, s),\ V_n^{(1,2)}(t, s)\right),
\end{align*}
where for any $(i, j) \in \{(0,1), (0,2), (1,2)\}$ and any $t, s \in [0, \tau]$,
\begin{align*}
V_n^{(i,j)}(t, s) &= S_n^{(i,j)}(t)\, S_n^{(i,j)}(s)\, W_n^{(i,j)}(t, s), \\
U_n^{(i,j)}(t, s) &= n \int_{[0, \min\{t, s\}]} \left(1 - \Lambda_n^{(i,j)}\{y\}\right) \left(Y_n^{(i)}(y)\right)^{-1} \Lambda_n^{(i,j)}(dy), \\
W_n^{(i,j)}(t, s) &= n \int_{[0, \min\{t, s\}]} \left(Y_n^{(i)}(y)\right)^{-1} \left(1 - \Lambda_n^{(i,j)}\{y\}\right)^{-1} \Lambda_n^{(i,j)}(dy).
\end{align*}
The estimator for $\Omega_0$ is given as $\Omega_n$: for any $t, s \in [0, \tau]$,
\[
\Omega_n(t, s) = \Omega_n^{(0,1)}(t, s) + \Omega_n^{(0,2)}(t, s) + \Omega_n^{(1,2)}(t, s),
\]
where for any $(i, j) \in \{(0,1), (0,2), (1,2)\}$ and any $t, s \in [0, \tau]$,
\begin{align*}
\Omega_n^{(0,1)}(t, s) ={}& \int_{(0,t]} \int_{(0,s]} V_n^{(0,1)}(x-, y-)\, S_n^{(0,2)}(dy)\, S_n^{(0,2)}(dx) \\
& - S_n^{(0,2)}(0) \int_{(0,t]} \int_{(0,s]} V_n^{(0,1)}(x-, s - y)\, S_n^{(1,2)}(dy)\, S_n^{(0,2)}(dx) \\
& - S_n^{(0,2)}(0) \int_{(0,t]} \int_{(0,s]} V_n^{(0,1)}(t - x, y-)\, S_n^{(0,2)}(dy)\, S_n^{(1,2)}(dx) \\
& + \left[S_n^{(0,2)}(0)\right]^2 \int_{(0,t]} \int_{(0,s]} V_n^{(0,1)}(t - x, s - y)\, S_n^{(1,2)}(dy)\, S_n^{(1,2)}(dx), \\
\Omega_n^{(0,2)}(t, s) ={}& S_n^{(0,1)}(t)\, S_n^{(0,1)}(s)\, V_n^{(0,2)}(t, s) \\
& - S_n^{(0,1)}(t) \int_{[0,s]} V_n^{(0,2)}(t, y)\, S_n^{(1,2)}(s - y)\, S_n^{(0,1)}(dy) \\
& - S_n^{(0,1)}(s) \int_{[0,t]} V_n^{(0,2)}(x, s)\, S_n^{(1,2)}(t - x)\, S_n^{(0,1)}(dx) \\
& + \int_{[0,t]} \int_{[0,s]} V_n^{(0,2)}(x, y)\, S_n^{(1,2)}(t - x)\, S_n^{(1,2)}(s - y)\, S_n^{(0,1)}(dy)\, S_n^{(0,1)}(dx), \\
\Omega_n^{(1,2)}(t, s) ={}& \int_{[0,t]} \int_{[0,s]} V_n^{(1,2)}(t - x, s - y)\, S_n^{(0,2)}(x)\, S_n^{(0,2)}(y)\, S_n^{(0,1)}(dy)\, S_n^{(0,1)}(dx).
\end{align*}
The following theorem states the uniform consistency for $U_n$, $V_n$ and $\Omega_n$.
Theorem 6 Under assumptions A.1–A.6, as $n \to \infty$,
\[
\sup_{t, s \in [0, \tau]} \left| U_n(t, s) - U_0(t, s) \right|, \quad
\sup_{t, s \in [0, \tau]} \left| V_n(t, s) - V_0(t, s) \right|, \quad
\sup_{t, s \in [0, \tau]} \left| \Omega_n(t, s) - \Omega_0(t, s) \right|
\]
converge to zero in probability.
The proof of the theorem is given in the Appendix.
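Based on the asymptotic theory and the estimation of the asymptotic covariance function, various types of inference can be conducted. For example, for any fixed $t_0 \in [0, \tau]$, since $\Omega_n(t_0, t_0)$ estimates the asymptotic variance of $n^{1/2}(S_n(t_0) - S_0(t_0))$ in Theorem 3, a 95 % pointwise confidence interval can be constructed as:
\[
S_n(t_0) \pm 1.96\, n^{-1/2}\, \Omega_n^{1/2}(t_0, t_0).
\]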
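A small Python sketch of this interval is given below. Theorem 3 places $\Omega_0$ on the scale of $n^{1/2}(S_n - S_0)$, so the half-width on the scale of $S_n$ is taken here as $1.96\,(\Omega_n(t_0, t_0)/n)^{1/2}$. The function name `pointwise_ci`, its arguments, and the truncation of the limits to [0, 1] are assumptions added for illustration; the truncation in particular is a convenience that is not part of the stated formula.

```python
import math

def pointwise_ci(s_hat, omega_hat, n, z=1.96):
    """Normal-approximation pointwise confidence interval for S_0(t0).

    s_hat     : estimate S_n(t0)
    omega_hat : estimate Omega_n(t0, t0) of the asymptotic variance
                of sqrt(n) * (S_n(t0) - S_0(t0))
    n         : sample size
    z         : normal quantile (1.96 for a 95 % interval)
    """
    half_width = z * math.sqrt(omega_hat / n)
    # Truncation to [0, 1] is an added convenience, not part of the formula.
    return max(0.0, s_hat - half_width), min(1.0, s_hat + half_width)
```

For instance, `pointwise_ci(0.5, 0.25, 100)` uses a half-width of 1.96 × 0.05.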
5 Numerical studies
5.1 Simulation studies
Four sets of simulation studies were carried out to examine the finite-sample performance of the proposed NPMLE.
In the first and the third sets of the simulation studies, the independent censoring time C was specified as +∞. In the second and the fourth sets, the independent censoring time C was generated from a uniform distribution such that P(T > C) = 0.15.
In the first and the second sets of the simulation studies, the PD censoring time U was generated from Uniform(0, η), while in the third and the fourth sets, the PD censoring time U was generated from the exponential distribution with hazard rate η.
In all simulation studies, the survival time T was generated as
\[
T =
\begin{cases}
T^{(1)}, & \text{if } T^{(1)} \le U, \\
U + T^{(2)}, & \text{if } T^{(1)} > U,
\end{cases}
\]
where $T^{(1)}$ was generated from the exponential distribution with hazard rate 1 and $T^{(2)}$ was generated from the exponential distribution with hazard rate λ. The parameters were set to be λ ∈ {1, 2}, P(U < T) ∈ {0.3, 0.7} and θ0 ∈ {0.3, 0.6, 1}. For each scenario, we generated 1000 samples of size n = 100.
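The data-generating mechanism just described, combined with the observation scheme of Sect. 2, can be sketched in Python as follows. This is a hedged illustration, not the authors' simulation code. The function name `simulate` and its arguments are hypothetical; `eta` and `c_max` are left as free inputs because the values that yield P(U < T) ∈ {0.3, 0.7} and P(T > C) = 0.15 are not stated in the text; and the uniform censoring distribution is assumed, for the sketch only, to be Uniform(0, `c_max`).

```python
import numpy as np

def simulate(n, lam, eta, theta0, pd_dist="exponential", c_max=None, rng=None):
    """Draw one sample of the observed data (V, R, xi, xi*X, xi*Delta)."""
    rng = np.random.default_rng(rng)
    # PD time U: exponential with hazard eta, or Uniform(0, eta)
    if pd_dist == "exponential":
        U = rng.exponential(1.0 / eta, n)
    else:
        U = rng.uniform(0.0, eta, n)
    T1 = rng.exponential(1.0, n)        # death without PD, hazard 1
    T2 = rng.exponential(1.0 / lam, n)  # sojourn in the PD state, hazard lam
    T = np.where(T1 <= U, T1, U + T2)
    # Independent censoring: none (C = +inf) or Uniform(0, c_max) (assumed form)
    C = np.full(n, np.inf) if c_max is None else rng.uniform(0.0, c_max, n)
    X = np.minimum(T, C)
    Delta = (T <= C).astype(int)
    V = np.minimum(X, U)
    R = (X <= U).astype(int)
    # (X, Delta) is always observed when R = 1, and with probability theta0 when R = 0
    xi = np.where(R == 1, 1, rng.binomial(1, theta0, n))
    return V, R, xi, xi * X, xi * Delta
```

A call such as `simulate(100, lam=2.0, eta=0.5, theta0=0.6, rng=0)` returns one sample of size 100; repeating it over 1000 seeds would mimic one scenario, subject to the assumptions above about `eta` and `c_max`.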
The simulation results are summarized in Tables 1, 2, 3, 4, 5, 6, 7 and 8, which report the empirical bias (Bias), the empirical standard deviation (Std), the square root of the empirical mean squared error (sqrt(MSE)) and the average estimated standard deviation (AES) of the proposed estimator, as well as the empirical coverage rate (CR) of the corresponding 95 % pointwise confidence interval, at the 0.3th, 0.5th and 0.7th quantiles of T. The values of Bias, Std, sqrt(MSE) and AES were multiplied by 10,000. For comparison, the results from the Kaplan–Meier approach are also included in the tables. In addition, the empirical bias, the empirical standard deviation and the square root of the empirical mean squared error of Lee and Tsai’s estimator are included in Tables 1, 2, 5 and 6. Because there is no valid estimator of the variance of Lee and Tsai’s estimator, the AES and the CR are not reported for it. Lee and Tsai’s estimator is not included in Tables 3, 4, 7 and 8, because it does not incorporate the additional independent censoring.
For the cases with C = +∞, in which Lee and Tsai’s approach is applicable, the proposed NPMLE and Lee and Tsai’s estimator were nearly unbiased and consistent. The NPMLE had smaller sqrt(MSE) than Lee and Tsai’s estimator for all three quantiles; the improvement ranged from 0.6 to 7.9 %. The coverage of the 95 % pointwise confidence interval for the proposed NPMLE was close to the nominal level.
Table 1 Simulation results when C = +∞ and U is generated from the uniform distribution with P(T > U) = 0.3

               Lee and Tsai’s estimator      Kaplan–Meier estimator                         NPMLE
λ θ   τ        Bias       Std sqrt(MSE)      Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1 0.3 0.3    16.322   449.049   449.346    17.504   452.963   453.301   464.366  0.946    15.095   444.900   445.156   453.465  0.948
1 0.3 0.5    20.922   514.706   515.131    19.302   519.672   520.030   519.932  0.952    18.855   507.437   507.788   508.858  0.950
1 0.3 0.7    15.460   493.339   493.581     4.205   483.332   483.350   497.773  0.954    20.245   487.241   487.662   511.333  0.966
1 0.6 0.3    11.145   477.190   477.320     9.971   487.063   487.165   460.309  0.932     9.911   473.493   473.596   447.447  0.928
1 0.6 0.5    11.651   505.172   505.306     9.105   517.079   517.160   510.235  0.945    10.361   497.356   497.464   489.322  0.942
1 0.6 0.7    29.273   472.164   473.071    30.350   486.644   487.589   479.243  0.935    25.349   461.293   461.989   467.520  0.946
1 1.0 0.3    19.700   456.914   457.339    19.700   456.914   457.339   454.647  0.949    17.716   444.797   445.150   444.825  0.940
1 1.0 0.5    −8.600   507.468   507.541    −8.600   507.468   507.541   497.407  0.947    −9.648   481.701   481.798   480.689  0.953
1 1.0 0.7    −9.100   457.686   457.777    −9.100   457.686   457.777   455.098  0.948   −10.352   437.898   438.020   446.506  0.951
2 0.3 0.3     9.379   467.485   467.579    89.851   471.065   479.558   460.533  0.935    12.260   464.082   464.244   453.541  0.944
2 0.3 0.5    −3.479   507.639   507.651   183.352   510.943   542.845   518.393  0.932    −2.961   502.958   502.966   512.860  0.946
2 0.3 0.7   −11.359   480.614   480.749   282.399   502.682   576.575   505.049  0.927   −12.246   477.239   477.396   518.828  0.965
2 0.6 0.3    −7.299   457.866   457.924    36.817   463.476   464.936   459.002  0.940    −5.320   448.094   448.126   446.429  0.943
2 0.6 0.5     1.588   523.249   523.251   103.364   533.389   543.312   508.829  0.931     8.289   507.991   508.059   493.961  0.943
2 0.6 0.7   −16.654   478.814   479.104   133.208   504.190   521.490   481.332  0.935   −15.314   464.329   464.582   488.198  0.961
2 1.0 0.3    −9.700   440.306   440.413    −9.700   440.306   440.413   456.149  0.956   −10.120   426.256   426.376   443.491  0.952
2 1.0 0.5   −11.000   473.924   474.051   −11.000   473.924   474.051   497.740  0.949    −8.285   460.149   460.224   485.975  0.957
2 1.0 0.7     9.900   449.504   449.613     9.900   449.504   449.613   456.035  0.960     9.806   430.754   430.865   476.470  0.970

The AES and the CR are not reported for the Lee and Tsai’s estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000
Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
Table 2 Simulation results when C = +∞ and U is generated from the uniform distribution with P(T > U) = 0.7

               Lee and Tsai’s estimator      Kaplan–Meier estimator                         NPMLE
λ θ   τ        Bias       Std sqrt(MSE)      Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1 0.3 0.3   −11.337   499.437   499.565    −0.496   520.377   520.377   503.210  0.930   −14.576   492.276   492.492   474.698  0.932
1 0.3 0.5   −31.311   658.791   659.534   −33.695   675.060   675.901   644.222  0.925   −30.850   647.774   648.508   631.552  0.939
1 0.3 0.7    −2.828   716.866   716.871     0.973   760.752   760.753   728.592  0.918    −3.802   710.683   710.694   697.288  0.920
1 0.6 0.3     4.242   442.553   442.574     1.272   469.836   469.838   480.906  0.946     2.315   418.693   418.699   437.047  0.959
1 0.6 0.5     5.565   525.029   525.059     3.813   558.605   558.618   562.140  0.950     4.559   496.985   497.006   511.491  0.952
1 0.6 0.7    −5.090   531.379   531.404    −6.285   560.956   560.991   556.616  0.946    −5.814   510.060   510.093   521.131  0.951
1 1.0 0.3     1.900   467.809   467.813     1.900   467.809   467.813   455.301  0.941    −9.390   430.119   430.221   420.918  0.932
1 1.0 0.5    −2.100   489.127   489.131    −2.100   489.127   489.131   497.592  0.945    −5.547   447.852   447.886   454.722  0.950
1 1.0 0.7    −6.200   448.980   449.023    −6.200   448.980   449.023   455.336  0.950    −6.642   414.769   414.822   427.758  0.956
2 0.3 0.3    −5.362   495.129   495.158   245.676   499.341   556.506   483.087  0.895    −5.022   479.782   479.808   467.545  0.945
2 0.3 0.5    12.071   602.643   602.763   513.093   595.947   786.395   593.462  0.851     8.185   589.272   589.329   581.017  0.935
2 0.3 0.7     1.222   645.529   645.530   466.958   739.829   874.869   721.418  0.881     6.178   634.232   634.262   623.757  0.932
2 0.6 0.3     7.979   460.885   460.954   138.658   474.635   494.474   470.687  0.931    16.016   438.841   439.133   428.777  0.943
2 0.6 0.5    11.827   527.470   527.603   247.354   563.289   615.206   546.354  0.914    11.581   494.984   495.120   491.471  0.950
2 0.6 0.7    −6.385   511.517   511.557   193.050   566.862   598.833   559.222  0.938    −1.412   485.327   485.329   479.425  0.944
2 1.0 0.3    −8.200   457.627   457.700    −8.200   457.627   457.700   455.865  0.953   −15.557   409.854   410.149   413.332  0.956
2 1.0 0.5   −23.800   483.329   483.915   −23.800   483.329   483.915   497.645  0.956   −24.953   436.142   436.856   449.685  0.958
2 1.0 0.7   −14.800   445.847   446.093   −14.800   445.847   446.093   454.984  0.960   −18.720   392.816   393.261   406.233  0.951

The AES and the CR are not reported for the Lee and Tsai’s estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000
Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
Table 3 Simulation results when P(T > C) = 0.15 and U is generated from the uniform distribution with P(T > U) = 0.3

               Kaplan–Meier estimator                         NPMLE
λ θ   τ        Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1 0.3 0.3     3.157   475.857   475.868   465.098  0.943     1.603   464.227   464.229   454.039  0.944
1 0.3 0.5    −8.948   518.809   518.886   520.695  0.949   −12.346   515.592   515.740   510.107  0.952
1 0.3 0.7   −24.535   480.116   480.742   497.576  0.949   −26.005   488.030   488.722   512.812  0.961
1 0.6 0.3   −55.071   457.719   461.020   463.386  0.952   −53.657   444.683   447.908   450.794  0.943
1 0.6 0.5   −27.605   521.312   522.042   509.740  0.944   −23.875   496.274   496.848   489.593  0.939
1 0.6 0.7    −4.198   477.638   477.657   477.467  0.952    −6.434   444.951   444.998   466.468  0.960
1 1.0 0.3     5.800   456.393   456.430   455.269  0.957     2.850   450.296   450.305   445.537  0.951
1 1.0 0.5    −1.200   517.326   517.327   497.304  0.938    −0.957   504.409   504.410   480.853  0.945
1 1.0 0.7    −9.700   457.368   457.470   455.068  0.949    −4.476   439.057   439.080   446.865  0.947
2 0.3 0.3    73.204   466.208   471.920   461.281  0.932    −3.787   461.198   461.213   454.278  0.936
2 0.3 0.5   186.877   519.638   552.220   518.041  0.932     5.280   508.992   509.020   513.616  0.955
2 0.3 0.7   285.552   520.929   594.060   504.759  0.911     9.750   494.252   494.348   519.513  0.964
2 0.6 0.3    54.006   466.143   469.262   458.288  0.947    11.512   451.738   451.884   445.745  0.937
2 0.6 0.5   124.852   533.052   547.478   509.020  0.923    18.940   508.319   508.671   494.004  0.936
2 0.6 0.7   168.701   481.221   509.935   483.245  0.939    16.348   447.174   447.473   489.082  0.963
2 1.0 0.3    −4.200   473.568   473.587   455.499  0.945    −4.680   459.618   459.642   442.950  0.934
2 1.0 0.5     8.000   498.583   498.647   497.498  0.948     4.254   478.715   478.734   485.907  0.955
2 1.0 0.7     8.600   444.391   444.474   456.052  0.961     5.087   423.137   423.167   476.635  0.974

The Lee and Tsai’s estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000
Bias the empirical bias, Std the empirical standard deviation, sqrt(MSE) the square root of the empirical mean squared error, AES the average estimated standard deviation, CR the empirical coverage rate of the 95 % confidence interval
Table 4 Simulation results when P(T > C) = 0.15 and U is generated from the uniform distribution with P(T > U) = 0.7

                 Kaplan–Meier estimator                         NPMLE
λ  θ    τ        Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1  0.3  0.3   −11.321   508.927   509.053   504.172  0.952    −7.032   491.383   491.433   473.164  0.938
1  0.3  0.5    −2.953   674.774   674.780   642.816  0.938     4.681   649.068   649.085   629.126  0.930
1  0.3  0.7    35.088   748.811   749.633   731.977  0.936    21.350   714.376   714.695   698.724  0.933
1  0.6  0.3    −1.934   492.202   492.205   480.297  0.941    −1.255   440.665   440.666   437.101  0.948
1  0.6  0.5    −1.948   557.711   557.714   561.540  0.950    11.016   501.484   501.605   511.521  0.945
1  0.6  0.7    21.931   556.991   557.423   556.910  0.947    21.264   510.031   510.474   521.362  0.935
1  1.0  0.3    −5.000   475.711   475.738   455.495  0.943    −5.471   432.805   432.840   420.981  0.937
1  1.0  0.5   −15.700   510.455   510.696   497.374  0.938   −12.767   461.219   461.395   454.286  0.945
1  1.0  0.7   −12.200   468.413   468.572   454.838  0.946   −21.023   433.071   433.581   427.045  0.938
2  0.3  0.3   268.558   468.726   540.211   482.080  0.889     9.309   466.171   466.264   466.745  0.940
2  0.3  0.5   500.114   593.135   775.837   594.952  0.851    −1.386   587.409   587.410   581.482  0.945
2  0.3  0.7   420.009   739.105   850.108   722.839  0.900   −13.806   641.460   641.608   621.044  0.924
2  0.6  0.3    97.707   453.296   463.707   473.168  0.945   −27.117   413.581   414.469   430.981  0.960
2  0.6  0.5   217.812   555.667   596.831   546.970  0.916   −17.538   483.365   483.683   492.237  0.950
2  0.6  0.7   190.462   561.836   593.242   559.105  0.942    −7.811   482.644   482.708   479.955  0.937
2  1.0  0.3     0.100   453.780   453.780   455.556  0.957     1.700   407.523   407.527   412.883  0.948
2  1.0  0.5   −18.300   514.198   514.523   497.333  0.936    −7.882   450.739   450.807   449.654  0.938
2  1.0  0.7   −20.600   465.653   466.108   454.485  0.943   −19.324   415.083   415.533   405.605  0.937

The Lee and Tsai's estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000. Bias: the empirical bias; Std: the empirical standard deviation; sqrt(MSE): the square root of the empirical mean squared error; AES: the average estimated standard deviation; CR: the empirical coverage rate of the 95 % confidence interval
Table 5 Simulation results when C = +∞ and U is generated from the exponential distribution with P(T > U) = 0.3

              Lee and Tsai's estimator              Kaplan–Meier estimator                         NPMLE
λ  θ    τ        Bias       Std sqrt(MSE)      Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1  0.3  0.3    10.296   456.820   456.936    11.247   469.847   469.981   468.018  0.941    10.442   450.538   450.659   456.574  0.944
1  0.3  0.5     4.645   522.233   522.254    10.906   527.338   527.451   526.662  0.955     3.852   517.105   517.120   521.405  0.953
1  0.3  0.7    −2.288   513.833   513.838    −2.456   503.665   503.671   505.226  0.943    −3.613   510.711   510.724   529.396  0.952
1  0.6  0.3    21.891   464.160   464.676    19.899   470.593   471.014   461.643  0.940    20.407   457.762   458.216   446.718  0.942
1  0.6  0.5    14.251   495.721   495.925    14.596   503.023   503.235   513.288  0.951    12.513   489.975   490.135   492.027  0.947
1  0.6  0.7    −0.230   475.155   475.155     0.286   482.683   482.683   481.632  0.944     1.168   464.456   464.457   473.894  0.954
1  1.0  0.3    19.800   456.154   456.583    19.800   456.154   456.583   454.665  0.958    19.156   446.730   447.141   442.640  0.952
1  1.0  0.5     5.600   485.494   485.526     5.600   485.494   485.526   497.628  0.947     4.393   469.216   469.236   479.710  0.944
1  1.0  0.7     6.200   453.594   453.636     6.200   453.594   453.636   455.838  0.956     7.415   429.566   429.630   448.339  0.963
2  0.3  0.3    18.762   440.773   441.172   116.418   447.944   462.825   462.360  0.938    19.215   433.321   433.747   456.426  0.953
2  0.3  0.5     2.862   514.297   514.305   222.344   511.420   557.662   523.404  0.934     2.159   508.523   508.527   522.428  0.950
2  0.3  0.7    −1.447   496.163   496.165   318.350   506.289   598.059   512.735  0.915    −3.142   489.412   489.422   526.006  0.961
2  0.6  0.3   −12.466   454.920   455.091    43.213   460.479   462.502   460.518  0.934   −15.996   438.648   438.939   446.454  0.944
2  0.6  0.5    −8.363   503.876   503.945   113.093   513.398   525.707   512.080  0.943   −10.435   488.754   488.865   496.818  0.955
2  0.6  0.7    −0.668   472.414   472.414   167.725   498.620   526.074   486.202  0.943    −1.017   458.369   458.371   491.470  0.971
2  1.0  0.3   −22.600   457.667   458.224   −22.600   457.667   458.224   456.493  0.948   −18.267   440.063   440.442   441.278  0.939
2  1.0  0.5   −22.500   517.350   517.839   −22.500   517.350   517.839   497.298  0.941   −16.422   489.006   489.282   484.956  0.944
2  1.0  0.7     1.400   452.751   452.753     1.400   452.751   452.753   455.626  0.951    −2.116   435.387   435.392   476.523  0.955

The AES and the CR are not reported for the Lee and Tsai's estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000. Bias: the empirical bias; Std: the empirical standard deviation; sqrt(MSE): the square root of the empirical mean squared error; AES: the average estimated standard deviation; CR: the empirical coverage rate of the 95 % confidence interval
Table 6 Simulation results when C = +∞ and U is generated from the exponential distribution with P(T > U) = 0.7

              Lee and Tsai's estimator              Kaplan–Meier estimator                         NPMLE
λ  θ    τ        Bias       Std sqrt(MSE)      Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1  0.3  0.3    −5.225   538.632   538.657    −3.818   550.361   550.374   526.103  0.935    −4.202   524.809   524.825   507.769  0.936
1  0.3  0.5    24.438   685.975   686.410    29.837   672.759   673.420   649.392  0.932    22.835   677.260   677.645   655.729  0.931
1  0.3  0.7    20.986   710.008   710.318    35.836   701.711   702.626   683.782  0.933    14.405   703.195   703.343   685.009  0.928
1  0.6  0.3     0.793   458.252   458.253    −0.823   479.782   479.783   490.888  0.956     0.680   435.084   435.085   445.071  0.956
1  0.6  0.5   −17.919   536.533   536.832   −19.713   560.278   560.625   566.159  0.939   −18.499   514.169   514.501   521.047  0.946
1  0.6  0.7   −17.088   526.254   526.532   −13.209   553.737   553.894   545.974  0.929   −15.800   505.719   505.966   516.219  0.946
1  1.0  0.3   −17.500   450.740   451.080   −17.500   450.740   451.080   456.367  0.956   −12.883   413.698   413.899   417.697  0.960
1  1.0  0.5   −11.100   484.336   484.463   −11.100   484.336   484.463   497.638  0.945   −16.014   445.981   446.268   457.365  0.959
1  1.0  0.7   −13.400   443.302   443.505   −13.400   443.302   443.505   455.083  0.957   −10.796   411.697   411.839   430.601  0.955
2  0.3  0.3   −24.212   523.877   524.437   289.561   503.138   580.511   497.892  0.881   −23.854   509.650   510.208   498.870  0.934
2  0.3  0.5   −24.450   618.911   619.393   503.502   612.361   792.780   612.976  0.854   −18.987   606.850   607.147   613.808  0.945
2  0.3  0.7   −14.467   622.933   623.101   598.470   678.103   904.428   666.906  0.856   −12.684   614.458   614.588   619.218  0.939
2  0.6  0.3    −5.620   447.090   447.125   153.853   465.542   490.307   478.634  0.929     4.358   422.673   422.695   435.396  0.946
2  0.6  0.5     0.120   511.504   511.504   259.435   540.600   599.629   554.792  0.926     1.101   489.888   489.889   503.871  0.964
2  0.6  0.7    −3.514   498.036   498.048   266.248   542.172   604.018   549.819  0.930   −10.181   469.893   470.004   488.709  0.955
2  1.0  0.3    21.600   469.163   469.660    21.600   469.163   469.660   454.390  0.942    21.138   417.597   418.132   406.461  0.933
2  1.0  0.5    31.400   501.483   502.465    31.400   501.483   502.465   497.459  0.938    23.507   455.782   456.388   451.001  0.943
2  1.0  0.7     6.100   457.320   457.361     6.100   457.320   457.361   455.782  0.954    15.701   421.121   421.414   424.771  0.955

The AES and the CR are not reported for the Lee and Tsai's estimator, since the corresponding variance estimator has not been developed. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000. Bias: the empirical bias; Std: the empirical standard deviation; sqrt(MSE): the square root of the empirical mean squared error; AES: the average estimated standard deviation; CR: the empirical coverage rate of the 95 % confidence interval
Table 7 Simulation results when P(T > C) = 0.15 and U is generated from the exponential distribution with P(T > U) = 0.3

                 Kaplan–Meier estimator                         NPMLE
λ  θ    τ        Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1  0.3  0.3   −37.142   486.215   487.632   469.845  0.948   −32.751   468.945   470.088   458.227  0.946
1  0.3  0.5    −9.926   527.004   527.098   526.192  0.944    −7.799   521.589   521.647   520.417  0.948
1  0.3  0.7    15.360   508.382   508.614   505.214  0.940    14.140   511.564   511.759   528.501  0.956
1  0.6  0.3    −1.646   440.079   440.082   463.025  0.960    −0.956   421.928   421.929   448.100  0.965
1  0.6  0.5    16.974   487.732   488.027   513.509  0.961    19.143   461.533   461.930   492.530  0.963
1  0.6  0.7     8.971   464.171   464.258   482.019  0.957     9.506   438.781   438.884   473.189  0.960
1  1.0  0.3    −5.500   473.206   473.238   455.534  0.948    −2.166   459.658   459.663   443.548  0.939
1  1.0  0.5     2.600   517.978   517.985   497.297  0.930     7.841   491.710   491.773   479.562  0.935
1  1.0  0.7    −8.700   451.950   452.033   455.201  0.950    −7.242   431.096   431.157   448.312  0.950
2  0.3  0.3   106.272   444.514   457.041   462.601  0.942     4.028   445.565   445.583   457.066  0.954
2  0.3  0.5   218.342   499.220   544.879   523.112  0.941     2.863   501.590   501.598   522.948  0.955
2  0.3  0.7   320.370   500.009   593.840   512.668  0.919     1.890   481.911   481.914   528.818  0.972
2  0.6  0.3    57.458   475.318   478.778   459.332  0.936     6.531   458.627   458.673   444.893  0.933
2  0.6  0.5   148.054   524.993   545.470   511.501  0.931    27.558   500.385   501.143   496.141  0.949
2  0.6  0.7   192.218   481.920   518.840   487.073  0.941    22.617   447.304   447.875   492.088  0.973
2  1.0  0.3   −24.800   474.120   474.768   456.404  0.944   −24.034   449.961   450.602   441.301  0.947
2  1.0  0.5   −12.100   491.289   491.438   497.570  0.943   −21.879   480.151   480.650   485.176  0.948
2  1.0  0.7   −23.100   457.478   458.060   454.476  0.961   −25.766   429.097   429.870   476.058  0.970

The Lee and Tsai's estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000. Bias: the empirical bias; Std: the empirical standard deviation; sqrt(MSE): the square root of the empirical mean squared error; AES: the average estimated standard deviation; CR: the empirical coverage rate of the 95 % confidence interval
Table 8 Simulation results when P(T > C) = 0.15 and U is generated from the exponential distribution with P(T > U) = 0.7

                 Kaplan–Meier estimator                         NPMLE
λ  θ    τ        Bias       Std sqrt(MSE)       AES     CR      Bias       Std sqrt(MSE)       AES     CR
1  0.3  0.3    11.840   530.033   530.165   525.895  0.928    15.547   504.538   504.778   506.972  0.947
1  0.3  0.5    15.475   636.189   636.378   650.307  0.947    15.571   625.898   626.092   654.501  0.955
1  0.3  0.7    54.060   674.713   676.876   684.240  0.942    34.744   662.199   663.110   685.833  0.949
1  0.6  0.3    −7.869   501.497   501.559   490.966  0.931   −10.250   451.606   451.723   446.259  0.937
1  0.6  0.5    −7.043   577.838   577.881   565.730  0.944   −15.676   525.635   525.869   521.541  0.944
1  0.6  0.7   −19.481   547.570   547.916   546.076  0.943   −21.699   510.310   510.771   515.834  0.938
1  1.0  0.3   −20.200   448.635   449.089   456.502  0.965   −22.370   400.871   401.494   418.539  0.960
1  1.0  0.5   −13.100   500.169   500.340   497.480  0.938   −20.281   457.657   458.107   457.652  0.957
1  1.0  0.7   −11.600   465.061   465.206   454.898  0.944    −4.168   428.946   428.966   430.929  0.941
2  0.3  0.3   317.395   496.285   589.100   496.722  0.873   −10.674   501.787   501.900   497.296  0.945
2  0.3  0.5   559.884   620.296   835.606   612.199  0.826   −10.036   621.620   621.701   613.743  0.943
2  0.3  0.7   658.430   663.681   934.881   670.392  0.849     4.927   621.533   621.552   618.646  0.932
2  0.6  0.3   163.339   486.410   513.103   478.177  0.917     5.585   437.106   437.142   435.284  0.949
2  0.6  0.5   288.481   550.581   621.579   555.227  0.919    14.533   495.363   495.576   504.191  0.954
2  0.6  0.7   286.414   549.552   619.710   550.980  0.927    16.611   472.103   472.395   490.128  0.952
2  1.0  0.3    36.000   460.734   462.139   453.857  0.951    27.417   398.346   399.288   406.952  0.945
2  1.0  0.5    45.700   495.830   497.932   497.505  0.954    35.270   434.118   435.548   451.011  0.956
2  1.0  0.7    32.400   458.891   460.033   456.942  0.951    31.639   410.367   411.585   425.044  0.959

The Lee and Tsai's estimator is not included, since the additional independent censoring is not incorporated in this approach. The values of Bias, Std, sqrt(MSE) and AES have been multiplied by 10,000. Bias: the empirical bias; Std: the empirical standard deviation; sqrt(MSE): the square root of the empirical mean squared error; AES: the average estimated standard deviation; CR: the empirical coverage rate of the 95 % confidence interval
Fig. 2 Kaplan–Meier estimate and NPMLE. [Plot "Survival Function Estimate": survival rate (0.0 to 1.0) against t (0 to 100), showing the NPMLE, the Kaplan–Meier estimator, the empirical distribution, and the pointwise 95 % confidence intervals based on the NPMLE approach and on the Kaplan–Meier approach.]
For the cases with λ = 1, when T was independent of U, and the cases with θ = 1, when (X, Δ) was fully observed, the Kaplan–Meier estimator was nearly unbiased and consistent and the corresponding 95 % confidence intervals provided reasonable coverage rates; the proposed NPMLE and Lee and Tsai's estimator were also nearly unbiased and consistent, and the coverage rates of the proposed 95 % confidence intervals were close to the nominal level. Nevertheless, the proposed NPMLE had smaller sqrt(MSE) than the Kaplan–Meier estimator.
For the cases with λ = 2 and θ < 1, the Kaplan–Meier estimator had larger bias and appeared inconsistent, while the other two estimators performed well.
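The summary measures reported in the simulation tables can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `summarize`, the normal-quantile interval `estimate ± 1.96 × standard error`, and the use of the divide-by-n form of the empirical standard deviation are all assumptions made here for the example.

```python
import math

def summarize(estimates, std_errors, truth, z=1.96):
    """Return (bias, std, sqrt_mse, aes, cr) over simulation replicates.

    estimates  -- point estimates of S(tau), one per replicate
    std_errors -- estimated standard deviations, one per replicate
    truth      -- true value of S(tau)
    """
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - truth
    # Empirical standard deviation; dividing by n is an assumption of this sketch.
    std = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / n)
    # Square root of the empirical mean squared error.
    sqrt_mse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    # Average estimated standard deviation (AES).
    aes = sum(std_errors) / n
    # Empirical coverage rate of the pointwise 95 % intervals (CR).
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    return bias, std, sqrt_mse, aes, covered / n

# Toy replicates (made-up numbers, for illustration only).
estimates = [0.48, 0.52, 0.55, 0.45]
std_errors = [0.02, 0.02, 0.02, 0.02]
bias, std, sqrt_mse, aes, cr = summarize(estimates, std_errors, truth=0.5)

# With the divide-by-n convention, sqrt(MSE)^2 = Bias^2 + Std^2.
# Bias and Std of the Kaplan–Meier row (λ = 2, θ = 0.3, τ = 0.3) of Table 3:
check = math.hypot(73.204, 466.208)
```

With this convention the identity sqrt(MSE)² = Bias² + Std² holds exactly, and `check` agrees with the reported sqrt(MSE) of 471.920 for that row up to rounding, which is a convenient way to sanity-check individual table entries.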
5.2 An example
We illustrate our method with the data from a clinical trial of lung cancer conducted by the Eastern Cooperative Oncology Group, which was initially analyzed by Lagakos and Williams (1978). The data were also used by Lee and Wolfe (1998) and Lee and Tsai (2005) for the illustration of their methods. The complete data can be found in Lee and Tsai (2005). The study consists of 61 patients with inoperable carcinoma of the lung who were treated with the drug cyclophosphamide. Among the 61 patients, 28 patients experienced metastatic disease or a significant increase in the size of their primary lesion, which are certain types of PD. In addition, Lee and Tsai (2005) did a pseudo second stage sampling, in which they selected 10 of these 28 patients as a random sample and pretended that the death times of the remaining 18 patients were unknown.

Fig. 3 Lee and Tsai's estimator and NPMLE. [Plot "Survival Function Estimate": survival rate (0.0 to 1.0) against t (0 to 100), showing the NPMLE, Lee and Tsai's estimator, the empirical distribution, and the pointwise 95 % confidence interval based on the NPMLE approach.]
For illustration, we also pretended that the death times of these 18 patients were unknown. We treated the 10 selected patients as if they did not leave the study, while the remaining 18 did. Figure 2 shows the NPMLE and its pointwise 95 % confidence intervals along with the Kaplan–Meier estimator and its pointwise 95 % confidence intervals and the empirical distribution based on the complete data. Figure 3 presents the NPMLE and its pointwise 95 % confidence intervals along with Lee and Tsai's estimator and the empirical distribution based on the complete data.
Figure 2 clearly shows that the Kaplan–Meier estimator is very different from the empirical distribution, and far away from the other estimators. Figures 2 and 3 show that both the NPMLE and Lee and Tsai's estimator are very close to the empirical distribution. The proposed pointwise 95 % confidence interval contains Lee and Tsai's estimator and the empirical distribution.
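For reference, the Kaplan–Meier (product-limit) estimator used as the comparison curve can be sketched in a few lines. This is the standard textbook formula, which is valid under independent censoring; it is not the proposed NPMLE, and the function name `kaplan_meier` and the toy data below are illustrative assumptions rather than the lung cancer data.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed failure time.

    times  -- observed times (failure or censoring, whichever came first)
    events -- 1 if the failure was observed, 0 if the time is censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects whose observed time is tied at t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Failures and censored subjects at t both leave the risk set.
        n_at_risk -= removed
    return curve

# Toy data: failures at 2, 3 and 5; censoring at 3 and 8.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

On the toy data the curve steps down at t = 2, 3 and 5. When censoring depends on the failure time, as with the PD-related dropout in the example, treating the censored subjects as a random subset of the risk set is what drives the discrepancy between this estimator and the empirical distribution.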
6 Discussion
In oncology studies, in addition to the usual independent censoring, the censoring caused by the PD is a certain type of dependent censoring. In this paper, we have adopted the semi-Markov model provided by Lee and Tsai (2005) with an additional independent censoring and studied the NPMLE of the cause specific cumulative hazard functions and the marginal survival function. We have shown the asymptotic normality of the NPMLE in the space of uniformly bounded functions and established its asymptotic efficiency. We have also developed uniformly consistent estimators for the covariance function based on the information operator.
A limitation of the model (1) is that it assumes that the risk of death after disease progression only depends on the time since progression, but not the duration of the progression. The limitation could be addressed by regression type models with the inclusion of the time to progression as a covariate. It is of interest to develop diagnostic methods for the model and to investigate regression type models in future studies.
Acknowledgments The research is supported by the National Natural Science Foundation of China (11271081) and the Student Growth Fund Scholarship of School of Management, Fudan University.
Appendix
Preliminary
Lemmas 1 and 2 provide related Donsker properties.
Lemma 1 The classes $\{(x, \delta) \mapsto \delta I\{x \le t\} : t \ge 0\}$ and $\{(x, \delta) \mapsto \delta I\{x \ge t\} : t \ge 0\}$ are Donsker classes of functions on $\mathbb{R}_+ \times \{0, 1\}$.

Proof Combining Lemma 2.6.15 and Lemma 2.6.18(iii, vi) of Vaart and Wellner (1996), it can be shown that $\{(x, \delta) \mapsto \delta I\{x \le t\} : t \ge 0\}$ and $\{(x, \delta) \mapsto \delta I\{x \ge t\} : t \ge 0\}$ are VC-classes on $\mathbb{R}_+ \times \{0, 1\}$. By Theorem 2.6.8 of Vaart and Wellner (1996), they are Donsker classes, where the measurability conditions could be verified via the denseness of the rational numbers.
Lemma 2 Let $\eta$ be a positive real number and $\Lambda$ be a nondecreasing cadlag function on $[0, \eta]$. The class

\[
\left\{ (x, \delta) \mapsto \int_0^t \delta I\{x \ge y\}\, \Lambda(dy) : t \in [0, \eta] \right\}
\]

is a Donsker class of functions on $\mathbb{R}_+ \times \{0, 1\}$.

Proof The result is easy to see by combining Example 2.6.21 and Example 2.10.8 of Vaart and Wellner (1996).
Lemma 3 Let $\eta$ be a positive real number and $BV([0, \eta])$ be the space of all cadlag functions defined on $[0, \eta]$ whose total variation is bounded by 2. For any $(F, G, H) \in BV([0, \eta]) \times BV([0, \eta]) \times BV([0, \eta])$ and any $t \in [0, \eta]$, let

\[
\phi(F, G, H)[t] = F(t)\, G(t) - \int_{[0,t]} G(u)\, H(t - u)\, F(du).
\]

For any $F_0, G_0, H_0 \in BV([0, \eta])$, $\phi$ is Hadamard differentiable at $(F_0, G_0, H_0)$ with derivative $\phi'(F_0, G_0, H_0)$, where for any $t \in [0, \eta]$,

\[
\begin{aligned}
\phi'(\beta_0)[\beta][t] ={}& F_0(t)\, G(t) + F(t)\, G_0(t) - \int_{[0,t]} G_0(x)\, H_0(t - x)\, F(dx) \\
& - \int_{[0,t]} G_0(x)\, H(t - x)\, F_0(dx) - \int_{[0,t]} G(x)\, H_0(t - x)\, F_0(dx),
\end{aligned}
\]

where $\beta = (F, G, H)$ and $\beta_0 = (F_0, G_0, H_0)$.
Proof Let $F_0, G_0, H_0, F, G, H \in BV([0, \eta])$, let $\{(F_m, G_m, H_m) : m \in \mathbb{N}\}$ be a sequence converging to $(F, G, H)$ in $BV([0, \eta]) \times BV([0, \eta]) \times BV([0, \eta])$, and let $\{h_m : m \in \mathbb{N}\}$ be a sequence of real numbers converging to 0.

Denote $\beta_0 = (F_0, G_0, H_0)$ and $\beta_m = \beta_0 + h_m (F_m, G_m, H_m)$. Note that for any $t \in [0, \eta]$, $\phi(\beta_m)[t] = \phi(\beta_0)[t] + h_m \Gamma_m(t) + h_m^2 W_m(t)$, where

\[
\begin{aligned}
\Gamma_m(t) ={}& F_0(t)\, G_m(t) + F_m(t)\, G_0(t) - \int_{[0,t]} G_0(x)\, H_0(t - x)\, F_m(dx) \\
& - \int_{[0,t]} G_0(x)\, H_m(t - x)\, F_0(dx) - \int_{[0,t]} G_m(x)\, H_0(t - x)\, F_0(dx), \\
W_m(t) ={}& F_m(t)\, G_m(t) - \int_{[0,t]} H_0(t - x)\, G_m(x)\, F_m(dx) \\
& - \int_{[0,t]} G_0(x)\, H_m(t - x)\, F_m(dx) - \int_{[0,t]} G_m(x)\, H_m(t - x)\, F_0(dx) \\
& - h_m \int_{[0,t]} G_m(x)\, H_m(t - x)\, F_m(dx).
\end{aligned}
\]

For any $t \in [0, \eta]$, let

\[
\begin{aligned}
\Gamma_0(t) ={}& F_0(t)\, G(t) + F(t)\, G_0(t) - \int_{[0,t]} G_0(x)\, H_0(t - x)\, F(dx) \\
& - \int_{[0,t]} G_0(x)\, H(t - x)\, F_0(dx) - \int_{[0,t]} G(x)\, H_0(t - x)\, F_0(dx).
\end{aligned}
\]

Note that for any $t \in [0, \eta]$, $|W_m(t)| \le 28 + 8 h_m$ and

\[
|\Gamma_m(t) - \Gamma_0(t)| \le 6 \sup_{s \in [0,\eta]} |F_m(s) - F(s)| + 6 \sup_{s \in [0,\eta]} |G_m(s) - G(s)| + 4 \sup_{s \in [0,\eta]} |H_m(s) - H(s)|.
\]

Hence, as $m \to \infty$, $\sup_{t \in [0,\eta]} |W_m(t)| = O(1)$ and $\sup_{t \in [0,\eta]} |\Gamma_m(t) - \Gamma_0(t)| = o(1)$.
Therefore, $\phi$ is Hadamard differentiable at $\beta_0$ and for any $t \in [0, \eta]$,

\[
\begin{aligned}
\phi'(\beta_0)[\beta][t] ={}& F_0(t)\, G(t) + F(t)\, G_0(t) - \int_{[0,t]} G_0(x)\, H_0(t - x)\, F(dx) \\
& - \int_{[0,t]} G_0(x)\, H(t - x)\, F_0(dx) - \int_{[0,t]} G(x)\, H_0(t - x)\, F_0(dx),
\end{aligned}
\]

where $\beta = (F, G, H)$.
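The derivative formula of Lemma 3 can be checked numerically on a discretised version of the map. The sketch below is only an illustration under stated assumptions: functions are represented by their values on an integer grid 0, 1, ..., n, the measure F(du) is replaced by the grid increments of F (with the jump at 0 taken to be F(0)), and all names (`jumps`, `conv_int`, `phi`, `dphi`, `max_error`) and the numerical values are hypothetical.

```python
def jumps(f):
    """Increments of a step function on the grid; the jump at 0 equals f[0]."""
    return [f[0]] + [f[k] - f[k - 1] for k in range(1, len(f))]

def conv_int(g, h, f, t):
    """Discrete analogue of the integral over [0, t] of G(u) H(t - u) F(du)."""
    df = jumps(f)
    return sum(g[k] * h[t - k] * df[k] for k in range(t + 1))

def phi(f, g, h, t):
    """Discrete analogue of phi(F, G, H)[t]."""
    return f[t] * g[t] - conv_int(g, h, f, t)

def dphi(f0, g0, h0, f, g, h, t):
    """Derivative of phi at (f0, g0, h0) in direction (f, g, h), as in Lemma 3."""
    return (f0[t] * g[t] + f[t] * g0[t]
            - conv_int(g0, h0, f, t)
            - conv_int(g0, h, f0, t)
            - conv_int(g, h0, f0, t))

def shifted(base, direction, step):
    """Return base + step * direction, elementwise."""
    return [b + step * d for b, d in zip(base, direction)]

# Arbitrary base point and direction on a four-point grid (illustrative values).
f0, g0, h0 = [0.0, 0.2, 0.5, 0.9], [1.0, 0.8, 0.6, 0.3], [1.0, 0.7, 0.4, 0.2]
f, g, h = [0.1, -0.2, 0.3, 0.0], [0.0, 0.5, -0.1, 0.2], [0.3, 0.1, 0.0, -0.4]

def max_error(step):
    """Largest gap over the grid between the difference quotient and dphi."""
    fm, gm, hm = shifted(f0, f, step), shifted(g0, g, step), shifted(h0, h, step)
    return max(
        abs((phi(fm, gm, hm, t) - phi(f0, g0, h0, t)) / step
            - dphi(f0, g0, h0, f, g, h, t))
        for t in range(len(f0))
    )

coarse = max_error(1e-3)
fine = max_error(1e-6)
```

Because the map is a sum of a bilinear and a trilinear term, the difference quotient differs from the derivative by a remainder of order `step`, mirroring the role of $h_m W_m(t)$ in the proof; accordingly `fine` should be much smaller than `coarse`.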
Proof of Theorems 1 to 3
For any $(i, j) \in \{(0, 1), (0, 2), (1, 2)\}$ and any $t \ge 0$, define

\[
M_n^{(i,j)}(t) = N_n^{(i,j)}(t) - \int_{[0,t]} Y_n^{(i)}(y)\, \Lambda_0^{(i,j)}(dy).
\]
Proof Combining Theorem 2.10.6 of Vaart and Wellner (1996) with Lemmas 1 and 2, it can be shown that, as $n \to \infty$,

\[
n^{-1/2} \left( M_n^{(0,1)},\, M_n^{(0,2)},\, M_n^{(1,2)},\, Y_n^{(0)} - E Y_n^{(0)},\, Y_n^{(1)} - E Y_n^{(1)} \right)
\]

weakly converges to a tight zero mean Gaussian process in

\[
\ell^\infty_5([0, \tau]) = \ell^\infty([0, \tau]) \times \ell^\infty([0, \tau]) \times \ell^\infty([0, \tau]) \times \ell^\infty([0, \tau]) \times \ell^\infty([0, \tau]).
\]

Combining this with Lemma 3.9.17, Lemma 3.9.25, and Theorem 3.9.4 of Vaart and Wellner (1996), for any $(i, j) \in \{(0, 1), (0, 2), (1, 2)\}$,

\[
\sup_{t \in [0,\tau]} \left| \Lambda_n^{(i,j)}(t) - \Lambda_0^{(i,j)}(t) - \int_{[0,t]} \left( L_0^{(i)}(y-) \right)^{-1} n^{-1} M_n^{(i,j)}(dy) \right| = o_p\left( n^{-1/2} \right). \tag{10}
\]

By the continuity and the linearity of the integral operator, as $n \to \infty$,

\[
n^{1/2} \left( \Lambda_n^{(0,1)} - \Lambda_0^{(0,1)},\, \Lambda_n^{(0,2)} - \Lambda_0^{(0,2)},\, \Lambda_n^{(1,2)} - \Lambda_0^{(1,2)} \right)
\]

weakly converges to a tight Gaussian process in $\ell^\infty_3([0, \tau])$ with covariance function $U_0$.

By Lemma 3.9.30 and Theorem 3.9.4 of Vaart and Wellner (1996), as $n \to \infty$,

\[
n^{1/2} \left( S_n^{(0,1)} - S_0^{(0,1)},\, S_n^{(0,2)} - S_0^{(0,2)},\, S_n^{(1,2)} - S_0^{(1,2)} \right)
\]

weakly converges to a tight Gaussian process in $\ell^\infty_3([0, \tau])$ with covariance function $V_0$.
Combining Lemma 3 with Theorem 3.9.4 of Vaart and Wellner (1996),

\[
\sup_{t \in [0,\tau]} \left| S_n(t) - S_0(t) - \phi'(\beta_0)[\beta_n - \beta_0][t] \right| = o_p\left( n^{-1/2} \right),
\]

where $\beta_0 = \left( S_0^{(0,1)}, S_0^{(0,2)}, S_0^{(1,2)} \right)$ and $\beta_n = \left( S_n^{(0,1)}, S_n^{(0,2)}, S_n^{(1,2)} \right)$.

Therefore, as $n \to \infty$, $n^{1/2}(S_n - S_0)$ weakly converges to a tight zero mean Gaussian process in $\ell^\infty([0, \tau])$ with covariance function $\Omega_0$.
Proof of Theorems 4 and 5
Proof To obtain the efficiency, we will verify the conditions of Theorem VIII.3.2 and Theorem VIII.3.3 of Andersen et al. (1993).

First, we verify the local asymptotic normality (LAN) Assumption (Assumption VIII.3.1 of Andersen et al. 1993).

By the Taylor expansion, it is easy to show that for any $h = (h^{(0,1)}, h^{(0,2)}, h^{(1,2)}) \in \mathcal{H}$,

\[
\log R_n(h) = S_n^{(0,1)}(h) + S_n^{(0,2)}(h) + S_n^{(1,2)}(h) - \frac{1}{2} \|h\|_{\mathcal{H}}^2 + o_p(1), \tag{11}
\]

where for any $(i, j) \in \{(0, 1), (0, 2), (1, 2)\}$ and any $h = (h^{(0,1)}, h^{(0,2)}, h^{(1,2)}) \in \mathcal{H}$,

\[
S_n^{(i,j)}(h) = n^{-1/2} \int_{[0,\tau]} h^{(i,j)}(y) \left( 1 - \Lambda_0^{(i,j)}\{y\} \right)^{-1} M_n^{(i,j)}(dy).
\]

Hence, for any $m \in \mathbb{N}_+$ and any $h_1, \ldots, h_m \in \mathcal{H}$,

\[
\left( \log R_n(h_1), \ldots, \log R_n(h_m) \right)
\]

weakly converges to a Gaussian random vector in $\mathbb{R}^m$ with mean

\[
-\frac{1}{2} \left( \|h_1\|_{\mathcal{H}}^2, \ldots, \|h_m\|_{\mathcal{H}}^2 \right)
\]

and covariance matrix

\[
\begin{pmatrix}
\langle h_1, h_1 \rangle_{\mathcal{H}} & \cdots & \langle h_1, h_m \rangle_{\mathcal{H}} \\
\vdots & \ddots & \vdots \\
\langle h_m, h_1 \rangle_{\mathcal{H}} & \cdots & \langle h_m, h_m \rangle_{\mathcal{H}}
\end{pmatrix}.
\]

Next, we verify the Differentiability Assumption (Assumption VIII.3.2 of Andersen et al. 1993).
For any $(i, j) \in \{(0, 1), (0, 2), (1, 2)\}$, any $h = (h^{(0,1)}, h^{(0,2)}, h^{(1,2)}) \in \mathcal{H}$, and any $t \in [0, \tau]$,

\[
n^{1/2} \left( \Lambda_{n,h}^{(i,j)}(t) - \Lambda_0^{(i,j)}(t) \right) = \int_{[0,t]} h^{(i,j)}(y)\, \Lambda_0^{(i,j)}(dy).
\]

Note that, for any $(i, j) \in \{(0, 1), (0, 2), (1, 2)\}$, any $h = (h^{(0,1)}, h^{(0,2)}, h^{(1,2)}) \in \mathcal{H}$, and any $t \in [0, \tau]$,

\[
\left| \int_{[0,t]} h^{(i,j)}(y)\, \Lambda_0^{(i,j)}(dy) \right| \le \|h\|_{\mathcal{H}}\, \Lambda_0^{(i,j)}(\tau) \left( L_0^{(i)}(\tau-) \right)^{-1}.
\]

Hence, the Differentiability Assumption is verified.

Next, we show that $\left( \Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)} \right)$ is regular.

Recall the weak convergence of the log-likelihood ratio $\log R_n(h)$. The continuity follows from Le Cam's third lemma and the weak convergence can be shown via arguments similar to those in the previous subsection. Hence, the regularity follows.

It remains to check Equation (8.3.5) of Andersen et al. (1993) for each coordinate. For any $t \in [0, \tau]$, define

\[
\gamma_t^{(0,1)} = \left( \varphi_t^{(0,1)}, 0, 0 \right), \quad
\gamma_t^{(0,2)} = \left( 0, \varphi_t^{(0,2)}, 0 \right), \quad
\gamma_t^{(1,2)} = \left( 0, 0, \varphi_t^{(1,2)} \right),
\]

where for any $(i, j) \in \{(0, 1), (0, 2), (1, 2)\}$ and any $s \in [0, \tau]$,

\[
\varphi_t^{(i,j)}(s) = I\{s \le t\}\, L_0^{(i)}(s-)^{-1} \left( 1 - \Lambda_0^{(i,j)}\{s\} \right).
\]

By (10) and (11), for any $(i, j) \in \{(0, 1), (0, 2), (1, 2)\}$ and any $t \in [0, \tau]$,

\[
\Lambda_n^{(i,j)}(t) - \Lambda_0^{(i,j)}(t) - n^{-1/2} \left( \log R_n\left( \gamma_t^{(i,j)} \right) + \frac{1}{2} \left\| \gamma_t^{(i,j)} \right\|_{\mathcal{H}}^2 \right) = o_p\left( n^{-1/2} \right).
\]

Therefore, combining all the above results with Theorem VIII.3.2 and Theorem VIII.3.3 of Andersen et al. (1993), it follows that $\left( \Lambda_n^{(0,1)}, \Lambda_n^{(0,2)}, \Lambda_n^{(1,2)} \right)$ is asymptotically efficient.

Combining Theorem VIII.3.4 of Andersen et al. (1993) with Lemma 3, the efficiency of $S_n$ follows.
References
Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical models based on counting processes. Springer, New York
Datta S, Satten GA, Datta S (2000) Nonparametric estimation for the three-stage irreversible illness-death model. Biometrics 56:841–847
Kiefer J, Wolfowitz J (1956) Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. Ann Math Stat 27(4):887–906
Lagakos SW, Williams JS (1978) Models for censored survival analysis: a cone class of variable-sum models. Biometrika 65(1):181–189
Lee SY, Tsai WY (2005) An estimator of the survival function based on the semi-Markov model under dependent censorship. Lifetime Data Anal 11:193–211
Lee SY, Wolfe RA (1998) A simple test for independent censoring under the proportional hazards model. Biometrics 54:1176–1182
Scholz FW (1980) Towards a unified definition of maximum likelihood. Can J Stat 8:193–203
Tsiatis AA (1975) A nonidentifiability aspect of the problem of competing risks. Proc Natl Acad Sci USA 72:20–22
Van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes: with applications to statistics. Springer, New York