
4. Single Decision Treatment Regimes: Additional Methods

Date post: 07-Mar-2021

4. Single Decision Treatment Regimes: Additional Methods
4.1 Optimal Regimes from a Classification Perspective
4.2 Outcome Weighted Learning
4.3 Interpretable Treatment Regimes via Decision Lists
4.4 Additional Approaches
4.5 Key References

189 ST 790, Dynamic Treatment Regimes


Classification analogy

Premise:
• Estimation of an optimal regime can be viewed as a weighted classification problem
• This allows the vast literature on classification and machine learning to be exploited to define a restricted class of regimes and to estimate an optimal regime within it
• Rules characterizing regimes can be likened to classifiers, so can involve high-dimensional parameterizations that are complex, flexible functions of individual characteristics
• Algorithms and software developed for classification problems can be exploited for implementation

• Demonstrated by Zhang et al. (2012) and Zhao et al. (2012)


Generic classification problem

• Z = outcome or class label; here, $Z \in \{0, 1\}$ (binary)
• X = vector of covariates or features taking values in $\mathcal{X}$, the feature space
• d is a classifier: $d : \mathcal{X} \to \{0, 1\}$
• $\mathcal{D}$ is a family of classifiers; e.g., with $X = (X_1, X_2)^T$
  - Hyperplanes of the form
$$d(X) = I(\eta_{11} + \eta_{12} X_1 + \eta_{13} X_2 > 0)$$
  - Rectangular regions of the form
$$d(X) = I(X_1 < \eta_{11}) + I(X_1 \geq \eta_{11}, X_2 < \eta_{12})$$


Generic classification problem

Implementation:
• Training set: $(X_i, Z_i)$, $i = 1, \ldots, n$
• Find classifier $d \in \mathcal{D}$ that minimizes
  - Classification error
$$\sum_{i=1}^n \{Z_i - d(X_i)\}^2 = \sum_{i=1}^n I\{Z_i \neq d(X_i)\}$$
  - Weighted classification error
$$\sum_{i=1}^n w_i \{Z_i - d(X_i)\}^2 = \sum_{i=1}^n w_i I\{Z_i \neq d(X_i)\}$$
    for $w_i$, $i = 1, \ldots, n$, fixed, known weights
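As a minimal sketch of the weighted criterion above, assuming a two-feature hyperplane classifier and a made-up training set (the function names and the data are illustrative, not taken from the slides):

```python
def weighted_classification_error(labels, predictions, weights=None):
    """Sum of w_i * I{Z_i != d(X_i)}; unit weights give the plain error count."""
    if weights is None:
        weights = [1.0] * len(labels)
    return sum(w * (z != d) for z, d, w in zip(labels, predictions, weights))


def hyperplane_classifier(eta):
    """d(X) = I(eta[0] + eta[1]*X1 + eta[2]*X2 > 0) for a two-feature X."""
    return lambda x: int(eta[0] + eta[1] * x[0] + eta[2] * x[1] > 0)


# Toy training set (made-up values for illustration only).
X = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (-1.0, -1.0)]
Z = [1, 1, 1, 0]
w = [0.5, 2.0, 1.0, 1.5]

d = hyperplane_classifier((-0.5, 1.0, 0.0))  # predicts 1 when X1 > 0.5
pred = [d(x) for x in X]

unweighted = weighted_classification_error(Z, pred)
weighted = weighted_classification_error(Z, pred, w)
# Because Z and d take values in {0, 1}, the squared form gives the same number.
squared_form = sum(wi * (z - p) ** 2 for wi, z, p in zip(w, Z, pred))
```

Only the first observation is misclassified here, so the weighted error equals that observation's weight.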


Generic classification problem

• This problem has been studied extensively by statisticians and computer scientists
• Is a form of supervised learning, an approach within the broad area of machine learning
• Many methods and software are available
• Recursive partitioning (CART): Rectangular regions
• Support vector machines (SVM): Hyperplanes (linear SVM), nonlinear SVM


Value search estimation, revisited

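Zhang et al. (2012): $\mathcal{A}_1 = \{0, 1\}$, restricted class $\mathcal{D}_\eta$
• Elements $d_\eta = \{d_1(h_1; \eta_1)\}$, optimal restricted regime
$$d^{opt}_\eta = \{d_1(h_1; \eta_1^{opt})\}, \qquad \eta_1^{opt} = \arg\max_{\eta_1} V(d_\eta)$$
• AIPW estimator (3.43) for $V(d_\eta)$ for fixed $\eta = \eta_1$
$$\hat{V}_{AIPW}(d_\eta) = n^{-1} \sum_{i=1}^n \left[ \frac{C_{d_\eta,i} Y_i}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)} - \frac{C_{d_\eta,i} - \pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)} Q_{d_\eta,1}(H_{1i}; \eta_1, \hat\beta_1) \right]$$
where
$$C_{d_\eta} = I\{A_1 = d_1(H_1; \eta_1)\} = A_1 I\{d_1(H_1; \eta_1) = 1\} + (1 - A_1) I\{d_1(H_1; \eta_1) = 0\}$$
$$\pi_{d_\eta,1}(H_1; \eta_1, \gamma_1) = \pi_1(H_1; \gamma_1) I\{d_1(H_1; \eta_1) = 1\} + \{1 - \pi_1(H_1; \gamma_1)\} I\{d_1(H_1; \eta_1) = 0\}$$
$$Q_{d_\eta,1}(H_1; \eta_1, \beta_1) = Q_1(H_1, 1; \beta_1) I\{d_1(H_1; \eta_1) = 1\} + Q_1(H_1, 0; \beta_1) I\{d_1(H_1; \eta_1) = 0\}$$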


Value search estimation, revisited

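Estimator for $d^{opt}_\eta$: $\hat d^{opt}_{\eta,AIPW} = \{d_1(h_1; \hat\eta^{opt}_{1,AIPW})\}$
• $\hat\eta^{opt}_{1,AIPW}$ maximizes $\hat V_{AIPW}(d_\eta)$ in $\eta_1$

Algebra:
$$\frac{C_{d_\eta} Y}{\pi_{d_\eta,1}(H_1; \eta_1, \gamma_1)} = \frac{[A_1 I\{d_1(H_1; \eta_1) = 1\} + (1 - A_1) I\{d_1(H_1; \eta_1) = 0\}]\, Y}{\pi_1(H_1; \gamma_1) I\{d_1(H_1; \eta_1) = 1\} + \{1 - \pi_1(H_1; \gamma_1)\} I\{d_1(H_1; \eta_1) = 0\}}$$
$$= \frac{A_1 Y}{\pi_1(H_1; \gamma_1)} I\{d_1(H_1; \eta_1) = 1\} + \frac{(1 - A_1) Y}{1 - \pi_1(H_1; \gamma_1)} I\{d_1(H_1; \eta_1) = 0\}$$

$$\frac{C_{d_\eta} - \pi_{d_\eta,1}(H_1; \eta_1, \gamma_1)}{\pi_{d_\eta,1}(H_1; \eta_1, \gamma_1)} Q_{d_\eta,1}(H_1; \eta_1, \beta_1) = \frac{A_1 - \pi_1(H_1; \gamma_1)}{\pi_1(H_1; \gamma_1)} Q_1(H_1, 1; \beta_1) I\{d_1(H_1; \eta_1) = 1\}$$
$$- \frac{A_1 - \pi_1(H_1; \gamma_1)}{1 - \pi_1(H_1; \gamma_1)} Q_1(H_1, 0; \beta_1) I\{d_1(H_1; \eta_1) = 0\}$$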


Classification analogy

Define:

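$$\psi_1(H_1, A_1, Y) = \frac{A_1 Y}{\pi_1(H_1)} - \frac{A_1 - \pi_1(H_1)}{\pi_1(H_1)} Q_1(H_1, 1) \tag{4.1}$$
$$\psi_0(H_1, A_1, Y) = \frac{(1 - A_1) Y}{1 - \pi_1(H_1)} + \frac{A_1 - \pi_1(H_1)}{1 - \pi_1(H_1)} Q_1(H_1, 0) \tag{4.2}$$
• Under SUTVA, NUC, positivity
$$E\{\psi_1(H_1, A_1, Y) \mid H_1\} = Q_1(H_1, 1), \qquad E\{\psi_0(H_1, A_1, Y) \mid H_1\} = Q_1(H_1, 0)$$
• Thus
$$E\{\psi_1(H_1, A_1, Y) - \psi_0(H_1, A_1, Y) \mid H_1\} = C_1(H_1) = Q_1(H_1, 1) - Q_1(H_1, 0),$$
the contrast function (3.33)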
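The first conditional expectation can be checked in two lines. The sketch below assumes that $\pi_1(H_1) = P(A_1 = 1 \mid H_1)$ is the true propensity, that it is positive, and that $Q_1(H_1, a_1) = E(Y \mid H_1, A_1 = a_1)$:

```latex
\begin{align*}
E\{\psi_1(H_1, A_1, Y) \mid H_1\}
  &= \frac{E(A_1 Y \mid H_1)}{\pi_1(H_1)}
     - \frac{E\{A_1 - \pi_1(H_1) \mid H_1\}}{\pi_1(H_1)}\, Q_1(H_1, 1) \\
  &= \frac{\pi_1(H_1)\, E(Y \mid H_1, A_1 = 1)}{\pi_1(H_1)} - 0
   = Q_1(H_1, 1).
\end{align*}
```

The argument for $\psi_0$ is the same with $1 - A_1$ and $1 - \pi_1(H_1)$ in place of $A_1$ and $\pi_1(H_1)$.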


Classification analogy

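Thus, by all of this algebra: Can write
$$\hat V_{AIPW}(d_\eta) = n^{-1} \sum_{i=1}^n \left[ \hat\psi_1(H_{1i}, A_{1i}, Y_i) I\{d_1(H_{1i}; \eta_1) = 1\} + \hat\psi_0(H_{1i}, A_{1i}, Y_i) I\{d_1(H_{1i}; \eta_1) = 0\} \right]$$
• $\hat\psi_1(H_{1i}, A_{1i}, Y_i)$ and $\hat\psi_0(H_{1i}, A_{1i}, Y_i)$ are (4.1) and (4.2) evaluated at $(H_{1i}, A_{1i}, Y_i)$ with the fitted models $Q_1(H_1, 1; \hat\beta_1)$, $Q_1(H_1, 0; \hat\beta_1)$, and $\pi_1(H_1; \hat\gamma_1)$ substituted
• Rewrite using $I\{d_1(H_1; \eta_1) = 1\} = d_1(H_1; \eta_1)$, $I\{d_1(H_1; \eta_1) = 0\} = 1 - d_1(H_1; \eta_1)$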


Classification analogy

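By further algebra: $\hat V_{AIPW}(d_\eta)$ can be expressed as
$$\hat V_{AIPW}(d_\eta) = n^{-1} \sum_{i=1}^n \left[ \hat\psi_1(H_{1i}, A_{1i}, Y_i) d_1(H_{1i}; \eta_1) + \hat\psi_0(H_{1i}, A_{1i}, Y_i) \{1 - d_1(H_{1i}; \eta_1)\} \right]$$
$$= n^{-1} \sum_{i=1}^n \left[ d_1(H_{1i}; \eta_1) \{\hat\psi_1(H_{1i}, A_{1i}, Y_i) - \hat\psi_0(H_{1i}, A_{1i}, Y_i)\} + \hat\psi_0(H_{1i}, A_{1i}, Y_i) \right]$$
$$= n^{-1} \sum_{i=1}^n \left\{ d_1(H_{1i}; \eta_1) \hat C_1(H_{1i}, A_{1i}, Y_i) + \hat\psi_0(H_{1i}, A_{1i}, Y_i) \right\}$$
• Predictor of the contrast function
$$\hat C_1(H_{1i}, A_{1i}, Y_i) = \hat\psi_1(H_{1i}, A_{1i}, Y_i) - \hat\psi_0(H_{1i}, A_{1i}, Y_i)$$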


Classification analogy

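Result: Maximizing $\hat V_{AIPW}(d_\eta)$ in $\eta_1$ is equivalent to maximizing
$$n^{-1} \sum_{i=1}^n d_1(H_{1i}; \eta_1) \hat C_1(H_{1i}, A_{1i}, Y_i)$$
because the remaining term $\hat\psi_0(H_{1i}, A_{1i}, Y_i)$ does not depend on $\eta_1$

More algebra: Using $a = I(a > 0)|a| - I(a \leq 0)|a|$ for any $a$ and writing $d_{\eta_1,1i} = d_1(H_{1i}; \eta_1)$, $\hat C_{1i} = \hat C_1(H_{1i}, A_{1i}, Y_i)$
$$d_{\eta_1,1i} \hat C_{1i} = d_{\eta_1,1i} I(\hat C_{1i} > 0)|\hat C_{1i}| - d_{\eta_1,1i} I(\hat C_{1i} \leq 0)|\hat C_{1i}|$$
$$= I(\hat C_{1i} > 0)|\hat C_{1i}| - |\hat C_{1i}| \{(1 - d_{\eta_1,1i}) I(\hat C_{1i} > 0) + d_{\eta_1,1i} I(\hat C_{1i} \leq 0)\}$$
$$= I(\hat C_{1i} > 0)|\hat C_{1i}| - |\hat C_{1i}| \{I(\hat C_{1i} > 0) - d_{\eta_1,1i}\}^2$$

Thus:
$$d_1(H_{1i}; \eta_1) \hat C_1(H_{1i}, A_{1i}, Y_i) = I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} |\hat C_1(H_{1i}, A_{1i}, Y_i)|$$
$$- |\hat C_1(H_{1i}, A_{1i}, Y_i)| \left[ I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} - d_1(H_{1i}; \eta_1) \right]^2$$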


Classification analogy

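Final result: Maximizing $\hat V_{AIPW}(d_\eta)$ in $\eta_1$ is equivalent to minimizing in $\eta_1$
$$n^{-1} \sum_{i=1}^n |\hat C_1(H_{1i}, A_{1i}, Y_i)| \left[ I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} - d_1(H_{1i}; \eta_1) \right]^2$$
$$= n^{-1} \sum_{i=1}^n |\hat C_1(H_{1i}, A_{1i}, Y_i)| \, I\left[ I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} \neq d_1(H_{1i}; \eta_1) \right] \tag{4.3}$$
• A weighted classification error with
  - “Label” $I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\}$ ($Z_i$)
  - “Weight” $|\hat C_1(H_{1i}, A_{1i}, Y_i)|$ ($w_i$)
  - “Classifier” $d_1(h_1; \eta_1)$ ($d$)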
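The equivalence can be checked numerically. The sketch below is illustrative only: the records and fitted values are made up, and the helper names are hypothetical. It computes the contrast predictors from (4.1) and (4.2), then enumerates every possible assignment of decisions to confirm that the value part and the weighted classification error (4.3) differ only by a constant.

```python
from itertools import product


def psi1(a, y, pi1, q1):
    """(4.1): pseudo-outcome for option 1."""
    return a * y / pi1 - (a - pi1) / pi1 * q1


def psi0(a, y, pi1, q0):
    """(4.2): pseudo-outcome for option 0."""
    return (1 - a) * y / (1 - pi1) + (a - pi1) / (1 - pi1) * q0


def weighted_error(contrasts, decisions):
    """(4.3): mean of |C_i| * I[ I(C_i > 0) != d_i ]."""
    n = len(contrasts)
    return sum(abs(c) * (int(c > 0) != d) for c, d in zip(contrasts, decisions)) / n


def value_part(contrasts, decisions):
    """Mean of d_i * C_i, the part of the AIPW value that depends on the rule."""
    return sum(d * c for c, d in zip(contrasts, decisions)) / len(contrasts)


# Made-up records: (A1, Y, fitted pi1, fitted Q1(H,1), fitted Q1(H,0)).
records = [(1, 4.0, 0.5, 3.0, 2.0), (0, 1.0, 0.5, 2.0, 2.5),
           (1, 2.0, 0.25, 1.0, 3.0), (0, 5.0, 0.25, 4.0, 4.0)]
C = [psi1(a, y, p, q1) - psi0(a, y, p, q0) for a, y, p, q1, q0 in records]

# Constant term: mean of I(C_i > 0) * |C_i|, free of the decisions.
const = sum(abs(c) * (c > 0) for c in C) / len(C)

all_decisions = list(product([0, 1], repeat=len(C)))
max_gap = max(abs(value_part(C, d) - (const - weighted_error(C, d)))
              for d in all_decisions)
best_by_value = max(all_decisions, key=lambda d: value_part(C, d))
best_by_error = min(all_decisions, key=lambda d: weighted_error(C, d))
```

Enumerating all decision vectors is only feasible for a toy sample; it stands in for the search over $\eta_1$ within a restricted class.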


Classification analogy

$$n^{-1} \sum_{i=1}^n |\hat C_1(H_{1i}, A_{1i}, Y_i)| \, I\left[ I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} \neq d_1(H_{1i}; \eta_1) \right] \tag{4.3}$$

Intuitive interpretation:
• From (3.34), $d^{opt} \in \mathcal{D}$ satisfies
$$d_1^{opt}(h_1) = I\{C_1(h_1) > 0\}$$
• The indicator term in (4.3) compares a predictor of the option selected by the global $d^{opt}$ to that selected by a rule in $\mathcal{D}_\eta$
• The “weight” $|\hat C_1(H_{1i}, A_{1i}, Y_i)|$ in (4.3) places greater importance on contributions from individuals for whom the absolute difference in expected outcomes for options 0 and 1 is large


Classification analogy

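Similarly: Analogous argument applies to
$$\hat V_{IPW}(d_\eta) = n^{-1} \sum_{i=1}^n \frac{C_{d_\eta,i} Y_i}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)}$$
• Can be shown: The same formulation applies with
$$\psi_1(H_1, A_1, Y) = \frac{A_1 Y}{\pi_1(H_1)}, \qquad \psi_0(H_1, A_1, Y) = \frac{(1 - A_1) Y}{1 - \pi_1(H_1)}$$
$$\hat C_1(H_{1i}, A_{1i}, Y_i) = \hat\psi_1(H_{1i}, A_{1i}, Y_i) - \hat\psi_0(H_{1i}, A_{1i}, Y_i) = \frac{A_{1i} Y_i}{\pi_1(H_{1i}; \hat\gamma_1)} - \frac{(1 - A_{1i}) Y_i}{1 - \pi_1(H_{1i}; \hat\gamma_1)}$$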


Classification analogy

Summary: Value (policy) search estimation of an optimal restricted regime $d^{opt}_\eta$ by maximizing $\hat V_{IPW}(d_\eta)$ or $\hat V_{AIPW}(d_\eta)$ is equivalent to minimizing a weighted classification error
• Choice of classification approach dictates the restricted class $\mathcal{D}_\eta$
• Can be implemented using off-the-shelf software and algorithms for classification problems
• E.g., for CART, SVM

However: This analogy does not circumvent the need to optimize a nonsmooth function of $\eta_1$


Demonstration

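Decision function: Write $d_1(h_1; \eta_1) = I\{f_1(h_1; \eta_1) > 0\}$
• E.g.,
$$f_1(h_1; \eta_1) = \eta_{11} + \eta_{12} x_{11} + \eta_{13} x_{12}$$
• By algebra, can write
$$n^{-1} \sum_{i=1}^n |\hat C_1(H_{1i}, A_{1i}, Y_i)| \, I\left[ I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} \neq d_1(H_{1i}; \eta_1) \right]$$
$$= n^{-1} \sum_{i=1}^n |\hat C_1(H_{1i}, A_{1i}, Y_i)| \, \ell_{0\text{-}1}\left( \left[ 2 I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} - 1 \right] f_1(H_{1i}; \eta_1) \right)$$
in terms of the 0-1 loss function
$$\ell_{0\text{-}1}(x) = I(x \leq 0)$$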


Demonstration

$$n^{-1} \sum_{i=1}^n |\hat C_1(H_{1i}, A_{1i}, Y_i)| \, \ell_{0\text{-}1}\left( \left[ 2 I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} - 1 \right] f_1(H_{1i}; \eta_1) \right)$$

Source of difficulty: The 0-1 loss function is nonconvex
$$\ell_{0\text{-}1}(x) = I(x \leq 0)$$
• Optimization involving nonconvex loss functions is challenging; standard techniques cannot be used
• This problem has been well studied in the classification literature
• E.g., with SVM, replace $\ell_{0\text{-}1}(x)$ by a convex “surrogate” such as the hinge loss function
$$\ell_{hinge}(x) = (1 - x)_+, \qquad x_+ = \max(0, x)$$
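The two loss functions are easy to compare directly. The sketch below, with function names chosen here for illustration, evaluates both on a grid and checks two properties that make the hinge loss a usable surrogate: it lies on or above the 0-1 loss, and it satisfies the midpoint convexity inequality on the grid while the 0-1 loss does not.

```python
def zero_one_loss(x):
    """l_{0-1}(x) = I(x <= 0)."""
    return 1.0 if x <= 0 else 0.0


def hinge_loss(x):
    """l_hinge(x) = (1 - x)_+ = max(0, 1 - x)."""
    return max(0.0, 1.0 - x)


def midpoint_convex_on(loss, points):
    """Check loss((a + b) / 2) <= (loss(a) + loss(b)) / 2 for all grid pairs."""
    return all(loss((a + b) / 2) <= (loss(a) + loss(b)) / 2
               for a in points for b in points)


grid = [k / 4 for k in range(-8, 9)]  # x from -2 to 2 in steps of 0.25

hinge_dominates = all(hinge_loss(x) >= zero_one_loss(x) for x in grid)
hinge_is_convex_on_grid = midpoint_convex_on(hinge_loss, grid)
zero_one_is_convex_on_grid = midpoint_convex_on(zero_one_loss, grid)
```

A grid check is not a proof of convexity, but a single violating pair (for example $a = -2$, $b = 1$) is enough to show that the 0-1 loss is not convex.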


Hinge loss vs. 0-1 loss

[Figure: hinge loss and 0-1 loss plotted against x, for x from −2 to 2, with the vertical axis running from 0.0 to 3.0]


4.2 Outcome Weighted Learning


Original formulation

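Zhao et al. (2012): Approach based on the IPW estimator
$$\hat V_{IPW}(d_\eta) = n^{-1} \sum_{i=1}^n \frac{C_{d_\eta,i} Y_i}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)}$$
• Assume that $Y$ is bounded and $Y \geq 0$
• Can be developed as a special case with
$$\psi_1(H_1, A_1, Y) = \frac{A_1 Y}{\pi_1(H_1)}, \qquad \psi_0(H_1, A_1, Y) = \frac{(1 - A_1) Y}{1 - \pi_1(H_1)}$$
$$\hat C_1(H_{1i}, A_{1i}, Y_i) = \hat\psi_1(H_{1i}, A_{1i}, Y_i) - \hat\psi_0(H_{1i}, A_{1i}, Y_i)$$
$$= \frac{A_{1i} Y_i}{\pi_1(H_{1i}; \hat\gamma_1)} - \frac{(1 - A_{1i}) Y_i}{1 - \pi_1(H_{1i}; \hat\gamma_1)} = \frac{Y_i \{A_{1i} - \pi_1(H_{1i}; \hat\gamma_1)\}}{\pi_1(H_{1i}; \hat\gamma_1) \{1 - \pi_1(H_{1i}; \hat\gamma_1)\}}$$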


As a special case

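• With $Y \geq 0$ and the positivity assumption
$$I\{\hat C_1(H_{1i}, A_{1i}, Y_i) > 0\} = I(A_{1i} = 1) = A_{1i}$$
• By considering $A_{1i} = 1$ and $A_{1i} = 0$
$$|\hat C_1(H_{1i}, A_{1i}, Y_i)| = \frac{Y_i}{A_{1i} \pi_1(H_{1i}; \hat\gamma_1) + (1 - A_{1i}) \{1 - \pi_1(H_{1i}; \hat\gamma_1)\}}$$
• Substitute in (4.3): Maximizing $\hat V_{IPW}(d_\eta)$ in $\eta_1$ is equivalent to minimizing
$$n^{-1} \sum_{i=1}^n \underbrace{\frac{Y_i}{A_{1i} \pi_1(H_{1i}; \hat\gamma_1) + (1 - A_{1i}) \{1 - \pi_1(H_{1i}; \hat\gamma_1)\}}}_{\text{“Weight”}} \, I\{A_{1i} \neq d_1(H_{1i}; \eta_1)\} \tag{4.4}$$
• “Label” $A_{1i}$, “Classifier” $d_1(h_1; \eta_1)$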
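A small numerical check of this special case, using made-up records with strictly positive outcomes (the function names are illustrative): the label implied by the IPW contrast is the observed treatment, and the weight is the outcome divided by the propensity of the treatment actually received.

```python
def ipw_contrast(a, y, pi1):
    """IPW contrast predictor: A*Y/pi1 - (1 - A)*Y/(1 - pi1), with A in {0, 1}."""
    return a * y / pi1 - (1 - a) * y / (1 - pi1)


def owl_weight(a, y, pi1):
    """Weight in (4.4): Y over the propensity of the option actually received."""
    return y / (a * pi1 + (1 - a) * (1 - pi1))


def objective_4_4(records, rule):
    """Weighted classification error (4.4) for records (h, a, y, pi1)."""
    return sum(owl_weight(a, y, p) * (a != rule(h))
               for h, a, y, p in records) / len(records)


# Made-up records: (H1, A1, Y, fitted propensity pi1), all with Y > 0.
records = [(0.2, 1, 2.0, 0.5), (0.8, 0, 3.0, 0.25),
           (0.4, 1, 1.0, 0.25), (0.9, 0, 4.0, 0.5)]

labels_match = all(int(ipw_contrast(a, y, p) > 0) == a for _, a, y, p in records)
weights_match = all(abs(abs(ipw_contrast(a, y, p)) - owl_weight(a, y, p)) < 1e-12
                    for _, a, y, p in records)

# A rule that reproduces every observed treatment incurs zero weighted error.
error_of_matching_rule = objective_4_4(records, lambda h: int(h < 0.5))
```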


Original formulation

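Randomized study: With known
$$\pi_1(H_1) = P(A_1 = 1 \mid H_1) = P(A_1 = 1) = \pi_1$$
• Recode options: $\mathcal{A}_1 = \{-1, 1\}$
• $d_1(h_1; \eta_1) = \text{sign}\{f_1(h_1; \eta_1)\}$ for decision function $f_1(h_1; \eta_1)$

Weighted classification error: (4.4) can be rewritten as
$$n^{-1} \sum_{i=1}^n \frac{Y_i}{A_{1i} \pi_1 + (1 - A_{1i})/2} \, I\{A_{1i} \neq d_1(H_{1i}; \eta_1)\}$$
$$= n^{-1} \sum_{i=1}^n \frac{Y_i}{A_{1i} \pi_1 + (1 - A_{1i})/2} \, I[A_{1i} \neq \text{sign}\{f_1(H_{1i}; \eta_1)\}]$$
• Involves the 0-1 loss function
$$I[A_{1i} \neq \text{sign}\{f_1(H_{1i}; \eta_1)\}] = I\{A_{1i} f_1(H_{1i}; \eta_1) \leq 0\} = \ell_{0\text{-}1}\{A_{1i} f_1(H_{1i}; \eta_1)\}$$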


Outcome weighted learning (OWL)

Minimize:
$$n^{-1} \sum_{i=1}^n \frac{Y_i}{A_{1i} \pi_1 + (1 - A_{1i})/2} \, \ell_{0\text{-}1}\{A_{1i} f_1(H_{1i}; \eta_1)\}$$

Original OWL: Zhao et al. (2012)
• Restricted class $\mathcal{D}_\eta$ induced by linear or nonlinear SVM
• Replace 0-1 loss by the convex surrogate hinge loss function
$$\ell_{hinge}(x) = (1 - x)_+, \qquad x_+ = \max(0, x)$$
• Minimize in $\eta_1$ the penalized objective function
$$n^{-1} \sum_{i=1}^n \frac{Y_i}{A_{1i} \pi_1 + (1 - A_{1i})/2} \, \{1 - A_{1i} f_1(H_{1i}; \eta_1)\}_+ + \lambda_n \|f_1\|^2 \tag{4.5}$$
• Flexible, highly parameterized representation of $f_1(h_1; \eta_1)$, penalty for overfitting
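To make (4.5) concrete, the sketch below evaluates the penalized hinge objective for a linear decision function with a single covariate. Several choices here are assumptions made for illustration, not the method of Zhao et al. (2012): the data are made up, the penalty $\|f_1\|^2$ is taken to be the squared slope, and the objective is only evaluated at two candidate values of $\eta_1$ rather than minimized.

```python
def owl_objective(eta, data, pi1, lam):
    """Penalized hinge objective (4.5) for f1(h; eta) = eta[0] + eta[1] * h.

    data: iterable of (h, a, y) with a in {-1, +1} and y >= 0.
    The penalty uses eta[1] ** 2 as ||f1||^2 (an assumption of this sketch).
    """
    total = 0.0
    for h, a, y in data:
        propensity = a * pi1 + (1 - a) / 2  # pi1 if a = +1, 1 - pi1 if a = -1
        margin = a * (eta[0] + eta[1] * h)
        total += y / propensity * max(0.0, 1.0 - margin)
    return total / len(data) + lam * eta[1] ** 2


# Made-up data: large outcomes occur when the sign of a agrees with the sign of h.
data = [(-1.0, -1, 3.0), (-0.5, 1, 1.0), (0.5, -1, 1.0), (1.0, 1, 3.0)]

follows_pattern = owl_objective((0.0, 2.0), data, pi1=0.5, lam=0.1)
against_pattern = owl_objective((0.0, -2.0), data, pi1=0.5, lam=0.1)
```

The decision function that recommends the option received by the individuals with large outcomes attains the smaller objective, which is the behavior the outcome weighting is designed to produce.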


Outcome weighted learning

Remarks:
• Can take a similar approach (flexible $f_1(h_1; \eta_1)$, penalization) with the full AIPW formulation
• Important: Minimizing the original objective (4.4) and minimizing (4.5) with hinge loss substituted are different optimization problems and will lead to different $\hat\eta_1^{opt}$ and thus different estimated optimal regimes
• Similarly for the AIPW formulation
• Simulation evidence: Suggests this might not be such a big deal in practice; resulting estimated optimal regimes perform well

Refinements and extensions of OWL: Zhou et al. (2017), Liu et al. (2018)


4.3 Interpretable Treatment Regimes via Decision Lists


Flexibility vs. interpretability

Classification approach: Flexible representation
• Complex, highly parameterized estimated decision rules
• Pro: Can synthesize high-dimensional patient information and achieve performance close to a true optimal regime $d^{opt} \in \mathcal{D}$
• Con: Difficult to interpret, “black box,” hard to glean new scientific insights

Opposing view: Emphasize parsimony and interpretability
• Deliberately focus on $\mathcal{D}_\eta$ with rules that can be understood by clinicians and patients
• Pro: Accessibility, more readily accepted, can generate new scientific insights and hypotheses
• Con: Optimal such regimes may not approach performance of $d^{opt} \in \mathcal{D}$


Decision rules as decision lists

Zhang et al. (2015): Focus on $\mathcal{D}_\eta$ with decision rules characterizing regimes in the form of a decision list
• Decision list: A sequence of if-then clauses
• “If” is a condition involving patient information that, if true, leads to selection of an option $a_1 \in \mathcal{A}_1$
• Natural for $\mathcal{A}_1$ with $m_1 \geq 2$ options

Example: Acute leukemia, $\mathcal{A}_1 = \{C_1, C_2\} = \{0, 1\}$
• Rule $d_1(h_1) = I(\text{age} < 50 \text{ and WBC} < 10)$ as a list

If age < 50 and WBC < 10 then C2;
else C1


Decision rules as decision lists

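Fancier example: Acute leukemia, {C1, C2, C3}

If age < 50 and WBC < 10 then C2;
else if age ≥ 50 then C1;
else C3                                        (4.6)

[Figure: partition of the (age, WBC) plane into regions labeled C1, C2, and C3, with the age axis marked at 50 and the WBC axis marked at 10]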


Decision rules as decision lists

Fancier still:

If age < 50 and ECOG < 2 then C2;
else if WBC ≥ 20 then C1;
else if PLAT > 300 then C1;
else C3.                                       (4.7)


Decision rules as decision lists

In general: $\mathcal{A}_1$ with $m_1 \geq 2$ options; rule in form of decision list of length $L_1$

If $c_{11}$ then $a_{11}$;
else if $c_{12}$ then $a_{12}$;
...
else if $c_{1L_1}$ then $a_{1L_1}$;
else $a_{10}$,

• Summarized as $\{(c_{11}, a_{11}), \ldots, (c_{1L_1}, a_{1L_1}), a_{10}\}$
• For example, in (4.7), $L_1 = 3$, $c_{11}$ = {age < 50 and ECOG < 2}, and $a_{11}$ = C2
• Treatment options can be repeated in different clauses
• $L_1 = 0$ corresponds to a static regime
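A decision list maps naturally onto a sequence of (condition, option) pairs evaluated in order. The sketch below encodes the list in (4.7); the representation of patient information as a dictionary and the function name are choices made here for illustration, and the patient values are made up.

```python
def apply_decision_list(clauses, default, h):
    """Return the option of the first clause whose condition holds, else the default."""
    for condition, option in clauses:
        if condition(h):
            return option
    return default


# The list in (4.7): three clauses (L1 = 3) and the default option C3.
list_4_7 = [
    (lambda h: h["age"] < 50 and h["ECOG"] < 2, "C2"),
    (lambda h: h["WBC"] >= 20, "C1"),
    (lambda h: h["PLAT"] > 300, "C1"),
]

patient_a = {"age": 62, "ECOG": 1, "WBC": 12, "PLAT": 350}  # third clause fires
patient_b = {"age": 40, "ECOG": 0, "WBC": 25, "PLAT": 100}  # first clause fires
patient_c = {"age": 62, "ECOG": 1, "WBC": 12, "PLAT": 200}  # no clause fires

option_a = apply_decision_list(list_4_7, "C3", patient_a)
option_b = apply_decision_list(list_4_7, "C3", patient_b)
option_c = apply_decision_list(list_4_7, "C3", patient_c)

# With no clauses (L1 = 0) every patient receives the default: a static regime.
static_option = apply_decision_list([], "C3", patient_b)
```

Because clauses are checked in order and evaluation stops at the first match, later conditions are never examined for patients who match an earlier clause.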


Basic formulation

Can mathematize: Define

$$T_1(c_{1\ell}) = \{h_1 : c_{1\ell} \text{ is true}\}, \qquad \ell = 1, \ldots, L_1$$
$$R_{11} = T_1(c_{11})$$
$$R_{1\ell} = \left\{ \bigcap_{j < \ell} T_1(c_{1j})^c \right\} \cap T_1(c_{1\ell}), \qquad \ell = 2, \ldots, L_1,$$
$$R_{10} = \bigcap_{j=1}^{L_1} T_1(c_{1j})^c$$
• Each $R_{1\ell}$, $\ell = 0, \ldots, L_1$, represents the conditions that must be satisfied for an individual to receive option $a_{1\ell}$
• For the diligent student: Determine the sets $R_{11}$, $R_{12}$, $R_{13}$, and $R_{10}$ for the example in (4.7)
• Clearly: A given $h_1$ can belong to at most one set $R_{1\ell}$, $\ell = 0, 1, \ldots, L_1$
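The set definitions can be checked against the "first clause that fires" reading of a list. The sketch below uses the two conditions of the simpler list (4.6), with $h_1$ stored as an (age, WBC) pair; that layout, the function names, and the grid of test points are assumptions made for illustration.

```python
def region_index(conditions, h):
    """Index l of the region R_1l containing h: first true condition, or 0 if none."""
    for l, condition in enumerate(conditions, start=1):
        if condition(h):
            return l
    return 0


def memberships(conditions, h):
    """Membership flags for R_10, R_11, ..., R_1L computed from the set definitions."""
    truth = [c(h) for c in conditions]
    flags = [not any(truth)]  # R_10: every condition fails
    # R_1(l+1): condition l+1 holds and all earlier conditions fail.
    flags += [truth[l] and not any(truth[:l]) for l in range(len(truth))]
    return flags


# Conditions c_11 and c_12 of the list in (4.6); h = (age, WBC).
conditions_4_6 = [lambda h: h[0] < 50 and h[1] < 10, lambda h: h[0] >= 50]

grid = [(age, wbc) for age in range(20, 81, 10) for wbc in range(0, 31, 5)]
exactly_one_region = all(sum(memberships(conditions_4_6, h)) == 1 for h in grid)
definitions_agree = all(
    memberships(conditions_4_6, h).index(True) == region_index(conditions_4_6, h)
    for h in grid
)
```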


Basic formulation

Treatment regime: Decision rules of form

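$$d_1(h_1) = \sum_{\ell=0}^{L_1} a_{1\ell} \, I(h_1 \in R_{1\ell}). \tag{4.8}$$
• Characterized by $\{(c_{11}, a_{11}), \ldots, (c_{1L_1}, a_{1L_1}), a_{10}\}$

Zhang et al. (2015): For parsimony and interpretability, restrict to $c_{1\ell}$ involving at most 2 components of $h_1$
• For $h_1$ with $p_1$ components, $j_1 < j_2 \in \{1, \ldots, p_1\}$, restrict to $T_1(c_{1\ell})$ of any of the forms

$$\begin{array}{ll}
\{h_1 : h_{1j_1} \leq \tau_{11}\} & \{h_1 : h_{1j_1} \leq \tau_{11} \text{ or } h_{1j_2} \leq \tau_{12}\} \\
\{h_1 : h_{1j_1} \leq \tau_{11} \text{ and } h_{1j_2} \leq \tau_{12}\} & \{h_1 : h_{1j_1} \leq \tau_{11} \text{ or } h_{1j_2} > \tau_{12}\} \\
\{h_1 : h_{1j_1} \leq \tau_{11} \text{ and } h_{1j_2} > \tau_{12}\} & \{h_1 : h_{1j_1} > \tau_{11} \text{ or } h_{1j_2} \leq \tau_{12}\} \\
\{h_1 : h_{1j_1} > \tau_{11} \text{ and } h_{1j_2} \leq \tau_{12}\} & \{h_1 : h_{1j_1} > \tau_{11} \text{ or } h_{1j_2} > \tau_{12}\} \\
\{h_1 : h_{1j_1} > \tau_{11} \text{ and } h_{1j_2} > \tau_{12}\} & \{h_1 : h_{1j_1} > \tau_{11}\}
\end{array} \tag{4.9}$$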


Regimes

Restricted class $\mathcal{D}_\eta$: Define $\eta_1$ to be a collection
$$\{(c_{11}, a_{11}), \ldots, (c_{1L_1}, a_{1L_1}), a_{10}\}$$
with conditions $c_{1\ell}$ as in one of the $T_1(c_{1\ell})$ in (4.9)
• A rule as in (4.8) can be written as $d_1(h_1; \eta_1)$, and $\mathcal{D}_\eta$ comprises all regimes with rules of this form
• Feature: Do not need to collect all patient variables up front; can ascertain as needed, useful if some are expensive or burdensome to collect


Nonuniqueness

A decision rule can be represented with more than one list:
• For a decision list with $\eta_1 = \{(c_{11}, a_{11}), \ldots, (c_{1L_1}, a_{1L_1}), a_{10}\}$ and decision rule $d_1(h_1; \eta_1)$, there may exist another decision list with $\eta_1' = \{(c_{11}', a_{11}'), \ldots, (c_{1L_1'}', a_{1L_1'}'), a_{10}'\}$ and decision rule $d_1(h_1; \eta_1')$ such that $d_1(h_1; \eta_1) = d_1(h_1; \eta_1')$ for all $h_1$ but $L_1 \neq L_1'$, or $L_1 = L_1'$ but $c_{1j} \neq c_{1j}'$ or $a_{1j} \neq a_{1j}'$ for some $j = 1, \ldots, L_1$


Nonuniqueness

Example: (4.6) and alternative

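List (4.6):
If age < 50 and WBC < 10 then C2;
else if age ≥ 50 then C1;
else C3

Alternative list:
If age ≥ 50 then C1;
else if WBC < 10 then C2;
else C3

[Figure: partition of the (age, WBC) plane into regions labeled C1, C2, and C3, with the age axis marked at 50 and the WBC axis marked at 10]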
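That the two lists define the same rule can be confirmed by evaluating both over a grid of (age, WBC) values that includes the thresholds. The function names and grid are chosen here for illustration.

```python
def list_4_6(age, wbc):
    """The list in (4.6)."""
    if age < 50 and wbc < 10:
        return "C2"
    elif age >= 50:
        return "C1"
    else:
        return "C3"


def alternative_list(age, wbc):
    """The alternative list, which checks age first."""
    if age >= 50:
        return "C1"
    elif wbc < 10:
        return "C2"
    else:
        return "C3"


# Grid covering both sides of the thresholds age = 50 and WBC = 10.
grid = [(age, wbc) for age in range(20, 81, 5) for wbc in range(0, 31, 2)]
same_rule = all(list_4_6(a, w) == alternative_list(a, w) for a, w in grid)
```

The alternative list selects C1 after inspecting age alone, whereas (4.6) requires both age and WBC in its first clause; this difference is what motivates comparing equivalent lists by cost.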


Optimal regime

Value of a regime: For any regime $d_\eta \in \mathcal{D}_\eta$, $V(d_\eta)$ is the same regardless of which version of $d_\eta$ is considered
• Optimal regime $d^{opt}_\eta$ with rule
$$d_1(h_1; \eta_1^{opt}), \qquad \eta_1^{opt} = \arg\max_{\eta_1} V(d_\eta)$$
• Suggests: If there are equivalent versions of $d^{opt}_\eta$, estimate the version that is least costly/burdensome to implement
• Value search: Maximize $\hat V_{AIPW}(d_\eta)$ on Slide 185 subject to targeting the version of an optimal regime minimizing a measure of “cost”


Optimal regime

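Value search: As on Slide 185, with $\mathcal{A}_1 = \{1, \ldots, m_1\}$
$$\pi_{d_\eta,1}(H_1; \eta_1, \gamma_1) = \sum_{a_1=1}^{m_1} I\{d_1(H_1; \eta_1) = a_1\} \, \omega_1(H_1, a_1; \gamma_1)$$
$$Q_{d_\eta,1}(H_1; \eta_1, \beta_1) = \sum_{a_1=1}^{m_1} I\{d_1(H_1; \eta_1) = a_1\} \, Q_1(H_1, a_1; \beta_1)$$
$$\hat V_{AIPW}(d_\eta) = n^{-1} \sum_{i=1}^n \left[ \frac{C_{d_\eta,i} Y_i}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)} - \frac{C_{d_\eta,i} - \pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat\gamma_1)} Q_{d_\eta,1}(H_{1i}; \eta_1, \hat\beta_1) \right]$$
$$= n^{-1} \sum_{i=1}^n \sum_{a_1=1}^{m_1} \left( \left[ \frac{I(A_{1i} = a_1)}{\omega_1(H_{1i}, a_1; \hat\gamma_1)} \{Y_i - Q_1(H_{1i}, a_1; \hat\beta_1)\} + Q_1(H_{1i}, a_1; \hat\beta_1) \right] I\{d_1(H_{1i}; \eta_1) = a_1\} \right)$$
as in (2) of Zhang et al. (2015)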
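The second form of the estimator is straightforward to compute, because for each individual only the term with $a_1 = d_1(H_{1i}; \eta_1)$ survives the inner sum. The sketch below is a hedged illustration: the records, the constant propensity model `omega`, the outcome model `q`, and the threshold rule are all made up, and the function name is hypothetical.

```python
def aipw_value(records, rule, omega, q):
    """AIPW value estimate for a rule over options {1, ..., m1}.

    records: list of (h, a, y); rule(h) gives the recommended option;
    omega(h, a) is a fitted propensity model; q(h, a) is a fitted outcome model.
    """
    total = 0.0
    for h, a, y in records:
        d = rule(h)  # the only option whose indicator I{d1(h) = a1} equals 1
        total += (a == d) / omega(h, d) * (y - q(h, d)) + q(h, d)
    return total / len(records)


# Made-up inputs with m1 = 4 equally likely options.
records = [(-1.0, 1, 3.0), (-2.0, 3, 0.0), (1.0, 2, 5.0), (2.0, 4, 1.0)]


def omega(h, a):
    return 0.25


def q(h, a):
    return float(a)


def rule(h):
    return 1 if h < 0 else 2


value = aipw_value(records, rule, omega, q)
```

Individuals whose received option differs from the recommendation contribute only the model-based term `q(h, d)`; those who received the recommended option also contribute an inverse-weighted residual.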


Optimal regime

Cost: If $N_{1\ell}$ = cost of measuring components of $h_1$ necessary to check $c_{11}, \ldots, c_{1\ell}$ (fewer comparisons to thresholds is better), the cost of implementing regime $d_\eta$ with rule as in (4.8)
$$d_1(h_1; \eta_1) = \sum_{\ell=0}^{L_1} a_{1\ell} \, I(h_1 \in R_{1\ell})$$
in the population is
$$N_1(d_\eta) = \sum_{\ell=1}^{L_1} N_{1\ell} P(H_1 \in R_{1\ell}) + N_{1L_1} P(H_1 \in R_{10})$$
• Zhang et al. (2015): Describe a computational algorithm to maximize $\hat V_{AIPW}(d_\eta)$ while minimizing an estimator of the cost $N_1(d_\eta)$
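A plug-in version of the cost replaces each probability with a sample proportion. The sketch below applies this to the two equivalent lists from the nonuniqueness example. It is not the algorithm of Zhang et al. (2015): the unit cost per measured variable, the sample, and the function name are assumptions for illustration, and the function assumes a list with at least one clause.

```python
def estimated_cost(conditions, clause_costs, sample):
    """Sample analogue of N1(d_eta) for a list with L1 >= 1 clauses.

    clause_costs[l] is the cost of the measurements needed to check the first
    l + 1 conditions. Individuals matching no clause (region R_10) incur the
    cost of checking all L1 conditions.
    """
    last = len(conditions) - 1
    total = 0.0
    for h in sample:
        fired = next((l for l, c in enumerate(conditions) if c(h)), last)
        total += clause_costs[fired]
    return total / len(sample)


# h = (age, WBC); one unit of cost per variable measured (hypothetical).
conditions_4_6 = [lambda h: h[0] < 50 and h[1] < 10, lambda h: h[0] >= 50]
costs_4_6 = [2.0, 2.0]          # first clause already needs age and WBC

conditions_alt = [lambda h: h[0] >= 50, lambda h: h[1] < 10]
costs_alt = [1.0, 2.0]          # first clause needs age only

sample = [(60, 5), (70, 15), (40, 5), (30, 20)]
cost_4_6 = estimated_cost(conditions_4_6, costs_4_6, sample)
cost_alt = estimated_cost(conditions_alt, costs_alt, sample)
```

Both lists give the same rule and hence the same value, but the alternative list is cheaper in this sample because patients aged 50 or over never need WBC measured.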


4.4 Additional Approaches


Extensive literature

Numerous approaches: We highlight two additional approaches to estimation of an optimal regime
• Regression-based estimation: To mitigate concern over parametric model misspecification, represent
$$Q_1(h_1, a_1) = E(Y \mid H_1 = h_1, A_1 = a_1)$$
nonparametrically, e.g., using generalized additive models, support vector regression, random forests, etc., to obtain a nonparametric estimator $\hat Q_1(h_1, a_1)$
• Use $\hat Q_1(h_1, a_1)$ as the fitted model and thus obtain
$$\hat d^{opt}_{Q,1}(h_1) = \arg\max_{a_1 \in \mathcal{A}_1} \hat Q_1(h_1, a_1)$$
• E.g., Qian and Murphy (2011)
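Once a fitted outcome model is available, the regression-based rule is a pointwise argmax over options. In the sketch below the function `q_hat` is a hypothetical stand-in for a nonparametric fit; in practice it would come from a fitted model such as a random forest.

```python
def regression_based_rule(q_hat, options):
    """d(h) = argmax over a in options of q_hat(h, a); ties go to the first option listed."""
    return lambda h: max(options, key=lambda a: q_hat(h, a))


def q_hat(h, a):
    """Hypothetical fitted outcome model: option 1 is better exactly when h < 1."""
    return 1.0 + 0.5 * h + a * (1.0 - h)


rule = regression_based_rule(q_hat, options=[0, 1])
recommended_low = rule(0.0)    # q_hat(0, 1) = 2.0 exceeds q_hat(0, 0) = 1.0
recommended_high = rule(2.0)   # q_hat(2, 0) = 2.0 exceeds q_hat(2, 1) = 1.0
recommended_tie = rule(1.0)    # equal fitted values; first listed option is returned
```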


Extensive literature

• Alternative form of value search: Because for $d_\eta$ in a restricted class $\mathcal{D}_\eta$
$$V(d_\eta) = E\left[ Q_1(H_1, 1) I\{d_1(H_1; \eta_1) = 1\} + Q_1(H_1, 0) I\{d_1(H_1; \eta_1) = 0\} \right]$$
maximize in $\eta_1$
$$\hat V(d_\eta) = n^{-1} \sum_{i=1}^n \left[ \hat Q_1(H_{1i}, 1) I\{d_1(H_{1i}; \eta_1) = 1\} + \hat Q_1(H_{1i}, 0) I\{d_1(H_{1i}; \eta_1) = 0\} \right]$$
• As above, $\hat Q_1(h_1, a_1)$ is a nonparametric estimator for $Q_1(h_1, a_1)$
• But here $\hat Q_1(h_1, a_1)$ is used only to ensure faithful representation of the value and not to define the optimal regime estimator directly (Taylor et al., 2015)
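As a minimal sketch of this form of value search, assume a restricted class of threshold rules $d_1(h_1; \eta_1) = I(h_1 < \eta_1)$ with a single covariate, a hypothetical fitted model `q_hat`, and a small grid of candidate thresholds. None of these choices come from Taylor et al. (2015); they only illustrate the plug-in criterion.

```python
def plug_in_value(eta, sample, q_hat):
    """Mean of q_hat(H_i, 1) I{d = 1} + q_hat(H_i, 0) I{d = 0} with d(h; eta) = I(h < eta)."""
    return sum(q_hat(h, 1) if h < eta else q_hat(h, 0) for h in sample) / len(sample)


def q_hat(h, a):
    """Hypothetical fitted outcome model: option 1 is better exactly when h < 1."""
    return 1.0 + 0.5 * h + a * (1.0 - h)


sample = [0.0, 0.5, 1.5, 2.0]
candidates = [0.25, 1.0, 1.75, 2.5]

values = {eta: plug_in_value(eta, sample, q_hat) for eta in candidates}
eta_hat = max(candidates, key=lambda eta: values[eta])
```

The fitted model enters only through the estimated value of each candidate rule; the estimated regime itself is the best rule within the threshold class, not the argmax of `q_hat`.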


4.5 Key References


References

Liu, Y., Wang, Y., Kosorok, M. R., Zhao, Y., and Zeng, D. (2018). Augmented outcome-weighted learning for estimating optimal dynamic treatment regimens. Statistics in Medicine 37, 3776–3788.

Qian, M. and Murphy, S. (2011). Performance guarantees for individualized treatment rules. Annals of Statistics 39, 1180–1210.

Taylor, J. M. G., Cheng, W., and Foster, J. C. (2015). Reader reaction to “A robust method for estimating optimal treatment regimes” by Zhang et al. (2012). Biometrics 71, 267–273.

Zhang, B., Tsiatis, A. A., Davidian, M., Zhang, M., and Laber, E. B. (2012). Estimating optimal treatment regimes from a classification perspective. Stat 1, 103–114.


Zhang, Y., Laber, E. B., Tsiatis, A. A., and Davidian, M. (2015). Using decision lists to construct interpretable and parsimonious treatment regimes. Biometrics 71, 895–904.

Zhao, Y., Zeng, D., Rush, A. J., and Kosorok, M. R. (2012). Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association 107, 1106–1118.

Zhou, X., Mayer-Hamblett, N., Khan, U., and Kosorok, M. R. (2017). Residual weighted learning for estimating individualized treatment rules. Journal of the American Statistical Association 112, 169–187.
