Empirical Efficiency Maximization:

Locally Efficient Covariate Adjustment in Randomized Experiments

Daniel B. Rubin

Joint work with Mark J. van der Laan

Outline

• Review adjustment in experiments.

• Locally efficient estimation. Problems with standard methods.

• New method addressing problems.

• Abstract formulation.

• Back to experiments, with simulations and numerical results.

• Application to survival analysis.

Randomized Experiments (no covariates yet)

Randomized Experiments

• Example: Women recruited. Randomly assigned to diaphragm or no diaphragm. See if they get HIV.

• Example: Men recruited. Randomly assigned to circumcision or no circumcision. See if they get HIV.

Randomized Experiments

• Randomization allows causal inference.

• No confounding. Differences in outcomes between treatment and control groups are due to the treatment.

• Unverifiable assumptions needed for causal inference in observational studies.

Counterfactual Outcome Framework: Neyman-Rubin Model

Causal Parameters

Causal Parameters (binary response)

Estimating Causal Parameters in Randomized Experiments

Randomized Experiments with Covariates

• Same setup, only now demographic or clinical measurements taken prior to randomization.

• Question: With extra information, can we more precisely estimate causal parameters?

• Answer: Yes. (Fisher, 1932). Subject’s covariate has information about how he or she would have responded in both arms.
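The precision gain can be seen in a toy simulation (not from the talk): a trial with a single baseline covariate, comparing the variance of the unadjusted difference in means against an ANCOVA-style adjusted estimate. The data-generating process, effect size, and sample sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_trial(n=200, effect=1.0):
    W = rng.normal(size=n)                         # baseline covariate
    A = rng.integers(0, 2, size=n)                 # randomized treatment
    Y = effect * A + 2.0 * W + rng.normal(size=n)  # outcome depends on W
    # Unadjusted: difference in sample means
    unadj = Y[A == 1].mean() - Y[A == 0].mean()
    # Adjusted: OLS of Y on (1, A, W); coefficient on A
    X = np.column_stack([np.ones(n), A, W])
    beta = np.linalg.lstsq(X, Y, rcond=None)[0]
    return unadj, beta[1]

ests = np.array([one_trial() for _ in range(2000)])
print("var(unadjusted) =", ests[:, 0].var())
print("var(adjusted)   =", ests[:, 1].var())
```

Because the covariate explains most of the outcome variance here, the adjusted estimator's sampling variance is several times smaller.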

Covariate Adjustment (has at least two meanings)

1: Gaining precision in randomized experiments.

2: Accounting for confounding in observational studies.

This talk only deals with the first meaning.

Covariate Adjustment

• Not very difficult when covariates divide subjects into a handful of strata.

• Requires more thought with even a single continuous covariate (e.g. age). Modern studies can collect a lot of baseline information: gene expression profiles, complete medical histories, biometric measurements.

• An important longstanding problem, but one with lots of confusion. Not “solved.” Recent work by others is summarized next.

Covariate Adjustment

• Pocock et al. (2002) recently surveyed 50 clinical trial reports.

• Of 50 reports, 36 used covariate adjustment for estimating causal parameters, and 12 emphasized adjusted over unadjusted analysis.

• “Nevertheless, the statistical emphasis on covariate adjustment is quite complex and often poorly understood, and there remains confusion as to what is an appropriate statistical strategy.”

Recent Work on this Problem

• Koch et al. (1998).

- Modifications to ANCOVA.

• Tsiatis et al. (2000, 2007a, 2007b).

- Locally efficient estimation.

• Moore and van der Laan (2007).

- Targeted maximum likelihood.

• Freedman (2007a, 2007b).

- Classical methods under misspecification.

Covariate Adjustment

• Can choose to ignore baseline measurements. Why might extra precision be worth it?

• Narrow confidence intervals for treatment effect.

• Stop trials earlier.

• Subgroup analyses, which involve smaller sample sizes.

Covariate Adjustment

Example: Intention-to-treat

Example: Log Odds Ratio

Locally Efficient Estimation

• Primarily motivated by causal inference problems in observational studies.

• Origin in Robins and Rotnitzky (1992), Robins, Rotnitzky, and Zhao (1994).

• Surveyed in van der Laan and Robins (2003), Tsiatis (2006).

Locally Efficient Estimation in Randomized Experiments

• The treatment distribution given covariates is known by design.

• So what does local efficiency tell us?

• Model outcome distribution given (covariates, treatment).

• We’ll be asymptotically efficient if working model is correct, but still asymptotically normal otherwise.

• But what does this mean if there’s no reason to believe the working model? Unadjusted estimators are also asymptotically normal. What about precision?
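As a concrete sketch (not from the talk), here is a minimal augmented, locally efficient estimator of E[Y1] − E[Y0] with a deliberately misspecified linear working model. The data-generating process is an illustrative assumption; the known randomization probability of 0.5 is what keeps the estimator consistent and asymptotically normal under misspecification.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
W = rng.normal(size=n)
A = rng.integers(0, 2, size=n)               # randomized, P(A=1) = 0.5 by design
Y = A + np.sin(3 * W) + rng.normal(size=n)   # nonlinear in W: linear model is wrong

# Working model fit by ordinary least squares (the likelihood-based fit)
X = np.column_stack([np.ones(n), A, W])
beta = np.linalg.lstsq(X, Y, rcond=None)[0]
pred = X @ beta                                            # m(A, W; beta)
m1 = np.column_stack([np.ones(n), np.ones(n), W]) @ beta   # m(1, W; beta)
m0 = np.column_stack([np.ones(n), np.zeros(n), W]) @ beta  # m(0, W; beta)

# Augmented estimator: consistent even though the working model is wrong,
# because the treatment mechanism g(W) = 0.5 is known
psi = np.mean((A / 0.5 - (1 - A) / 0.5) * (Y - pred) + m1 - m0)
print("estimated effect:", psi)
```

The estimate stays near the true effect of 1, but nothing in this construction says its variance beats the unadjusted estimator when the working model is wrong, which is the question raised above.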

Empirical Efficiency Maximization

• Working model for outcome distribution given (treatment, covariates) typically fit with likelihood-based methods.

• Often linear, logistic, or Cox regression models.

• “Factorization of likelihood” means such estimates lead to double robustness in observational studies. But such robustness is superfluous in controlled experiments.

• We try to find the working model element resulting in the parameter estimate with smallest asymptotic variance.
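A minimal sketch of this idea (my illustration, not the authors' implementation): instead of fitting the working model by least squares, choose its coefficients to minimize the empirical variance of the estimator's influence curve, then plug that fit into the augmented estimator. The data-generating process and the generic optimizer are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 1000
W = rng.normal(size=n)
A = rng.integers(0, 2, size=n)           # randomized, P(A=1) = 0.5
Y = A + W ** 2 + rng.normal(size=n)      # linear working model is misspecified

def m(beta, a, w):
    # hypothetical linear working model m(A, W; beta)
    return beta[0] + beta[1] * a + beta[2] * w

def influence_curve(beta):
    resid = Y - m(beta, A, W)
    aug = (A / 0.5 - (1 - A) / 0.5) * resid
    return aug + m(beta, 1, W) - m(beta, 0, W)

def eem_loss(beta):
    # the nonstandard loss: empirical variance of the influence curve
    return influence_curve(beta).var()

beta_eem = minimize(eem_loss, np.zeros(3), method="Nelder-Mead").x
psi_eem = influence_curve(beta_eem).mean()
print("EEM estimate:", psi_eem)
```

Any fixed beta yields a consistent augmented estimator here, so minimizing over beta directly targets the asymptotic variance of the parameter estimate rather than the fit of the regression.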

Interlude: Abstract Formulation

Asymptotics

Back to Randomized Experiments

Working Model Loss Function

Connection to High Dimensional or Complex Data

• Suppose a high dimensional covariate is related to the outcome, and we would like to adjust for it to gain precision.

• Many steps in data processing can be somewhat arbitrary (e.g. dimension reduction, smoothing, noise removal).

• With cross-validation, new loss function can guide selection of tuning parameters governing this data processing.
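A sketch of this selection step (my illustration under assumed simulated data): candidate working models are polynomials in the covariate of increasing degree, and K-fold cross-validation with the squared-influence-curve loss picks the degree. Since the true outcome below is quadratic in W, degree 1 should lose.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 600
W = rng.normal(size=n)
A = rng.integers(0, 2, size=n)           # randomized, P(A=1) = 0.5
Y = A + W ** 2 + rng.normal(size=n)      # quadratic in W; degree 1 misses this

def design(w, a, degree):
    # hypothetical working model: polynomial in W plus a treatment term
    return np.column_stack([np.ones_like(w), a] +
                           [w ** d for d in range(1, degree + 1)])

def influence_curve(beta, w, a, y, degree):
    pred = design(w, a, degree) @ beta
    m1 = design(w, np.ones_like(a), degree) @ beta
    m0 = design(w, np.zeros_like(a), degree) @ beta
    return (a / 0.5 - (1 - a) / 0.5) * (y - pred) + m1 - m0

folds = np.array_split(rng.permutation(n), 5)

def cv_risk(degree):
    # cross-validated squared-influence-curve loss for one candidate degree
    risk = 0.0
    for fold in folds:
        train = np.setdiff1d(np.arange(n), fold)
        beta = np.linalg.lstsq(design(W[train], A[train], degree),
                               Y[train], rcond=None)[0]
        d = influence_curve(beta, W[fold], A[fold], Y[fold], degree)
        risk += (d ** 2).sum()
    return risk / n

best_degree = min([1, 2, 3, 4], key=cv_risk)
print("selected degree:", best_degree)
```

The same template applies to any tuning parameter of the data processing: score each candidate by the cross-validated squared influence curve and keep the minimizer.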

Numerical Asymptotic Efficiency Calculations

1=Unadjusted estimator ignoring covariates. 2=Likelihood-based locally efficient. 3=Empirical efficiency maximization. 4=Efficient.

Intention-to-treat Parameter

Numerical Asymptotic Efficiency Calculations

1=Unadjusted difference in means. 2=Standard likelihood-based locally efficient estimator. 3=Empirical efficiency maximization. 4=Efficient estimator.

Treatment Effects

Sneak Peek: Survival Analysis

• Form locally efficient estimate. Working model for full data distribution now likely a proportional hazards model.

• For estimating a survival parameter (e.g. five-year survival), the estimator will be asymptotically efficient if the model is correct, and still asymptotically normal otherwise.

• But locally efficient estimator can be worse than Kaplan-Meier if model is wrong.
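The Kaplan-Meier benchmark is easy to state in code. Below is a minimal hand-rolled Kaplan-Meier estimate of five-year survival on simulated exponential event and censoring times; the distributions are illustrative assumptions, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
T = rng.exponential(scale=8.0, size=n)    # true event times
C = rng.exponential(scale=10.0, size=n)   # independent censoring times
time = np.minimum(T, C)
event = (T <= C).astype(int)              # 1 = event observed, 0 = censored

# Kaplan-Meier: S(5) = product over event times t <= 5 of (1 - d_t / n_t)
order = np.argsort(time)
time, event = time[order], event[order]
surv = 1.0
for t, e in zip(time, event):
    if t > 5.0:
        break
    at_risk = (time >= t).sum()
    if e:
        surv *= 1.0 - 1.0 / at_risk
print("KM estimate of S(5):", surv)       # true S(5) = exp(-5/8) ~ 0.535
```

The point of the slide is that a locally efficient estimator built on a wrong proportional hazards working model can have larger variance than this simple nonparametric baseline.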

Sneak Peek: Survival Analysis

• Simulated data; the target is five-year survival.

1=Kaplan-Meier. 2=Likelihood-based locally efficient. 3=Empirical efficiency maximization. 4=Efficient.

Summary

• Robins and Rotnitzky’s locally efficient estimation developed for causal inference in observational studies.

• In experiments, estimators can gain or lose efficiency, depending on validity of working model.

• Often there might be no reason to have any credence in a working model.

• A robustness result implies we can better fit working model (or select tuning parameters in data processing) with a nonstandard loss function (the squared influence curve).

Troublesome Issues

• When working regression model is linear, our procedure usually reduces to existing procedures.

• When nonlinear, fitting regression model with our loss function can entail non-convex optimization.

• Proving convergence to the optimal working model element is difficult for well-known logistic and Cox regression models.

References

• Empirical Efficiency Maximization: Improved Locally Efficient Covariate Adjustment. In press at the International Journal of Biostatistics.

• Technical reports at www.bepress.com/ucbbiostat/paper229/ and www.bepress.com/ucbbiostat/paper220/.