Empirical Efficiency Maximization:
Locally Efficient Covariate Adjustment in Randomized Experiments
Daniel B. Rubin
Joint work with Mark J. van der Laan
Outline
• Review adjustment in experiments.
• Locally efficient estimation. Problems with standard methods.
• New method addressing problems.
• Abstract formulation.
• Back to experiments, with simulations and numerical results.
• Application to survival analysis.
Randomized Experiments (no covariates yet)
• Example: Women recruited. Randomly assigned to diaphragm or no diaphragm. See if they get HIV.
• Example: Men recruited. Randomly assigned to circumcision or no circumcision. See if they get HIV.
Randomized Experiments
• Randomization allows causal inference.
• No confounding. Differences in outcomes between treatment and control groups are due to the treatment.
• Unverifiable assumptions needed for causal inference in observational studies.
Counterfactual Outcome Framework: Neyman-Rubin Model
Causal Parameters
Causal Parameters (binary response)
Estimating Causal Parameters in Randomized Experiments
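The formulas for these slides are not shown; as a hedged sketch of the standard Neyman-Rubin setup the titles refer to (notation is assumed, not taken verbatim from the slides):

% Each subject has potential outcomes Y(0), Y(1); we observe the
% randomized treatment A and the response Y = Y(A).
\[
\psi = E[Y(1)] - E[Y(0)] \qquad \text{(average treatment effect)}
\]
% Randomization makes A independent of (Y(0), Y(1)), so the unadjusted
% difference in sample means is unbiased for psi:
\[
\hat{\psi}_{\mathrm{unadj}}
  = \frac{\sum_{i} A_i Y_i}{\sum_{i} A_i}
  - \frac{\sum_{i} (1 - A_i) Y_i}{\sum_{i} (1 - A_i)}
\]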
Randomized Experiments with Covariates
• Same setup, only now demographic or clinical measurements taken prior to randomization.
• Question: With extra information, can we more precisely estimate causal parameters?
• Answer: Yes (Fisher, 1932). A subject's covariates carry information about how he or she would have responded in either arm.
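A minimal simulation illustrating the precision gain (my own toy example, not from the talk; a strongly prognostic covariate is assumed):

# Adjusting for a prognostic baseline covariate shrinks the variance of
# the treatment-effect estimate. (Illustrative sketch only.)
import numpy as np

rng = np.random.default_rng(0)
n, reps, true_effect = 200, 2000, 1.0
unadj, adj = [], []
for _ in range(reps):
    w = rng.normal(size=n)                    # baseline covariate
    a = rng.binomial(1, 0.5, size=n)          # randomized treatment
    y = true_effect * a + 2.0 * w + rng.normal(size=n)
    # Unadjusted: difference in sample means.
    unadj.append(y[a == 1].mean() - y[a == 0].mean())
    # Adjusted: regress out the covariate, compare means of residuals.
    resid = y - np.polyval(np.polyfit(w, y, 1), w)
    adj.append(resid[a == 1].mean() - resid[a == 0].mean())
print(np.var(unadj), np.var(adj))  # adjusted variance is much smaller

Both estimators are unbiased here; the adjusted one simply removes the outcome variation explained by the covariate.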
Covariate Adjustment (has at least two meanings)
1: Gaining precision in randomized experiments.
2: Accounting for confounding in observational studies.
This talk only deals with the first meaning.
Covariate Adjustment
• Not very difficult when covariates divide subjects into a handful of strata.
• Even a single continuous covariate (e.g., age) requires thought. Modern studies can collect a lot of baseline information: gene expression profiles, complete medical history, biometric measurements.
• Important longstanding problem, but lots of confusion. Not “solved.” Recent work by others.
Covariate Adjustment
• Pocock et al. (2002) recently surveyed 50 clinical trial reports.
• Of 50 reports, 36 used covariate adjustment for estimating causal parameters, and 12 emphasized adjusted over unadjusted analysis.
• “Nevertheless, the statistical emphasis on covariate adjustment is quite complex and often poorly understood, and there remains confusion as to what is an appropriate statistical strategy.”
Recent Work on this Problem
• Koch et al. (1998).
  - Modifications to ANCOVA.
• Tsiatis et al. (2000, 2007a, 2007b).
  - Locally efficient estimation.
• Moore and van der Laan (2007).
  - Targeted maximum likelihood.
• Freedman (2007a, 2007b).
  - Classical methods under misspecification.
Covariate Adjustment
• We can always choose to ignore the baseline measurements, so why might the extra precision be worth it?
• Narrow confidence intervals for treatment effect.
• Stop trials earlier.
• Subgroup analyses, which have smaller sample sizes.
Example: Intention-to-treat
Example: Log Odds Ratio
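The slide's own formula is not shown; as a hedged sketch, the standard log odds ratio for a binary response is:

\[
\psi = \log \frac{P(Y(1)=1)\,/\,P(Y(1)=0)}{P(Y(0)=1)\,/\,P(Y(0)=0)}
\]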
Locally Efficient Estimation
• Primarily motivated by causal inference problems in observational studies.
• Origin in Robins and Rotnitzky (1992), Robins, Rotnitzky, and Zhao (1994).
• Surveyed in van der Laan and Robins (2003), Tsiatis (2006).
Locally Efficient Estimation in Randomized Experiments
• Working model for treatment distribution given covariates known by design.
• So what does local efficiency tell us?
• Model outcome distribution given (covariates, treatment).
• We’ll be asymptotically efficient if working model is correct, but still asymptotically normal otherwise.
• But what does this mean if there’s no reason to believe the working model? Unadjusted estimators are also asymptotically normal. What about precision?
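A minimal sketch of an estimator of this type, assuming the standard augmented (influence-curve-based) form with known randomization probability pi; the function and variable names are mine, not the slides':

# Locally efficient (augmented) estimator of E[Y(1)] - E[Y(0)] in a
# randomized experiment with known P(A=1) = pi. Uses working-model
# predictions m1 ~ E[Y|A=1,W] and m0 ~ E[Y|A=0,W]; consistency does
# not require the working model to be correct.
import numpy as np

def locally_efficient_ate(y, a, m1, m0, pi=0.5):
    aug1 = a * (y - m1) / pi + m1
    aug0 = (1 - a) * (y - m0) / (1 - pi) + m0
    ic = aug1 - aug0                     # influence-curve contributions
    est = ic.mean()
    se = ic.std(ddof=1) / np.sqrt(len(y))
    return est, se

If the working model is correct, this achieves the efficiency bound; otherwise it is still consistent and asymptotically normal, but its variance depends on how the working model was fit.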
Empirical Efficiency Maximization
• Working model for outcome distribution given (treatment, covariates) typically fit with likelihood-based methods.
• Often linear, logistic, or Cox regression models.
• “Factorization of likelihood” means such estimates lead to double robustness in observational studies. But such robustness is superfluous in controlled experiments.
• We try to find the working model element resulting in the parameter estimate with smallest asymptotic variance.
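A hedged sketch of the proposal, assuming a linear working model in each arm and choosing its coefficients to minimize the empirical variance of the influence-curve contributions (names are mine; this is not the paper's exact algorithm):

# Empirical efficiency maximization (sketch): instead of fitting the
# working model by least squares or likelihood, pick its parameters to
# minimize the estimated asymptotic variance of the effect estimate.
import numpy as np
from scipy.optimize import minimize

def eem_fit(y, a, w, pi=0.5):
    X = np.column_stack([np.ones_like(w), w])   # linear working model

    def ic_variance(beta):
        b1, b0 = beta[:2], beta[2:]
        m1, m0 = X @ b1, X @ b0
        ic = a * (y - m1) / pi + m1 - ((1 - a) * (y - m0) / (1 - pi) + m0)
        return ic.var()   # empirical variance of the influence curve

    # Quadratic (convex) for a linear working model; nonlinear working
    # models can make this objective non-convex.
    return minimize(ic_variance, np.zeros(4)).x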
Interlude: Abstract Formulation
Asymptotics
Back to Randomized Experiments
Working Model Loss Function
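The slide's formula is not shown; a hedged sketch, based on the summary slide's description of the nonstandard loss as the squared influence curve, for a working-model element m:

\[
L(m) = E\!\left[\mathrm{IC}(O;\, m,\, \psi)^2\right]
\]
% Minimizing the empirical mean of this loss over the working model
% directly minimizes the asymptotic variance of the resulting estimator.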
Connection to High Dimensional or Complex Data
• Suppose a high dimensional covariate is related to the outcome, and we would like to adjust for it to gain precision.
• Many steps in data processing can be somewhat arbitrary (e.g. dimension reduction, smoothing, noise removal).
• With cross-validation, new loss function can guide selection of tuning parameters governing this data processing.
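A minimal sketch of such cross-validated selection, where fit_working_model and ic_contributions are hypothetical placeholders for the data-processing pipeline and the resulting influence-curve contributions:

# Choose a tuning parameter (e.g., a dimension-reduction level) by
# cross-validating the squared-influence-curve loss.
import numpy as np

def cv_select(y, a, w, params, fit_working_model, ic_contributions, folds=5):
    idx = np.array_split(np.random.permutation(len(y)), folds)
    scores = []
    for p in params:
        loss = 0.0
        for k in range(folds):
            test = idx[k]
            train = np.concatenate([idx[j] for j in range(folds) if j != k])
            model = fit_working_model(y[train], a[train], w[train], p)
            # Held-out variance of the influence-curve contributions.
            loss += np.var(ic_contributions(model, y[test], a[test], w[test]))
        scores.append(loss / folds)
    return params[int(np.argmin(scores))]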
Numerical Asymptotic Efficiency Calculations
1=Unadjusted estimator ignoring covariates. 2=Likelihood-based locally efficient. 3=Empirical efficiency maximization. 4=Efficient.
Intention-to-treat Parameter
Numerical Asymptotic Efficiency Calculations
1=Unadjusted difference in means. 2=Standard likelihood-based locally efficient estimator. 3=Empirical efficiency maximization. 4=Efficient estimator.
Treatment Effects
Sneak Peek: Survival Analysis
• Form a locally efficient estimate. The working model for the full-data distribution is now typically a proportional hazards model.
• For estimating, e.g., five-year survival, the estimator will be asymptotically efficient if the model is correct, and still asymptotically normal otherwise.
• But the locally efficient estimator can be worse than Kaplan-Meier if the model is wrong (the Kaplan-Meier benchmark is sketched below).
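For concreteness, a minimal numpy sketch of the unadjusted Kaplan-Meier benchmark for five-year survival in a single arm (my own illustration, not the slides' estimator; ties between events and censorings are handled naively):

# Kaplan-Meier estimate of S(t) at t = 5 years for one arm.
# times: follow-up times in years; events: 1 = event observed, 0 = censored.
import numpy as np

def km_survival(times, events, t=5.0):
    order = np.argsort(times)
    times, events = times[order], events[order]
    at_risk = len(times)
    surv = 1.0
    for time, event in zip(times, events):
        if time > t:
            break
        if event == 1:
            surv *= 1.0 - 1.0 / at_risk   # one event among those at risk
        at_risk -= 1                      # subject leaves the risk set
    return surv

The unadjusted five-year effect estimate is then km_survival in the treated arm minus km_survival in the control arm.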
Sneak Peek: Survival Analysis
• Simulated data; the target is five-year survival.
1=Kaplan-Meier. 2=Likelihood-based locally efficient. 3=Empirical efficiency maximization. 4=Efficient.
Summary
• Robins and Rotnitzky’s locally efficient estimation developed for causal inference in observational studies.
• In experiments, estimators can gain or lose efficiency, depending on validity of working model.
• Often there is no reason to place any credence in the working model.
• A robustness result implies we can better fit working model (or select tuning parameters in data processing) with a nonstandard loss function (the squared influence curve).
Troublesome Issues
• When working regression model is linear, our procedure usually reduces to existing procedures.
• When nonlinear, fitting regression model with our loss function can entail non-convex optimization.
• It remains to prove convergence to working model elements for the well-known logistic and Cox regression models.
References
• Rubin, D.B. and van der Laan, M.J. Empirical Efficiency Maximization: Improved Locally Efficient Covariate Adjustment. In press, International Journal of Biostatistics.
• Technical reports at www.bepress.com/ucbbiostat/paper229/ and www.bepress.com/ucbbiostat/paper220/.