OXYGEN DESATURATION IN INTERSTITIAL PNEUMONIA PROGNOSIS
Pawel Paczuski1 – I like to code with R in Emacs and read Plato.
Teng Sun1 – I'm a dog person but I don't have a dog yet.
Wenbo Sun2 – I am from mainland China.
Anatoli Zaremba2 – I am 192cm and I love to travel.
By signing below, I agree that each of our group members will receive an identical project grade.
Pawel Paczuski______________________________
Teng Sun __________________________________
Wenbo Sun ________________________________
Anatoli Zaremba ____________________________
Contributions: Each group member contributed to every part of the project, from the univariate summaries and Kaplan-Meier curve estimates to the exploration of interaction effects and the explanation of the final model. The entire analysis was a team effort.
1 Department of Biostatistics, School of Public Health, University of Michigan. 2 Department of Statistics, University of Michigan.
Oxygen desaturation in interstitial pneumonia prognosis. 18 December 2012
Abstract We studied whether a decrease in oxygen saturation in patients diagnosed with interstitial pulmonary fibrosis is a marker of disease progression by fitting a Cox proportional hazards model. In the adjusted model, oxygen desaturation was a significant predictor of death: the hazard ratio for death was 3.23 (95% CI: 1.40, 7.48).
Introduction We studied the effect of a decrease in oxygen saturation in patients diagnosed with interstitial pulmonary fibrosis to determine whether it is a marker of disease progression. We explored the prognostic value of this new variable, in association with related variables and standard covariates. Specifically, we had time-to-death data on 104 subjects, with indicators for: death or censoring; diagnosis type, usual interstitial pneumonia (UIP) or non-specific interstitial pneumonia (NSIP); oxygen desaturation to below 88% during a timed walk (Desat1, the key predictor of interest); gender; and smoking status. The continuous variables were: time to death or censoring in days; forced vital capacity in units of 10% of predicted; resting oxygen saturation in percent; and age. We performed Kaplan-Meier non-parametric analysis, as well as Cox proportional hazards regression with appropriate diagnostics.
Statistical methods We obtained overall and desaturation-stratified univariate summaries and ran Kaplan-Meier survival analyses on the full data set and stratified by each categorical covariate. Follow-up time was estimated with the reverse Kaplan-Meier method, that is, by treating censoring as the event and death as censoring in the Kaplan-Meier estimation. A full main-effects model including all covariates was fit using Cox proportional hazards regression.
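The switch between survival time and follow-up time can be illustrated with a minimal Kaplan-Meier sketch. This is not the code used for the analysis (which was run in SAS and R); the function names and the toy data below are hypothetical and serve only to show how reversing the event indicator turns a survival estimate into a follow-up estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : observed times (event or censoring)
    events : 1 if the event of interest occurred at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # everyone leaving the risk set at time t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = event_counts.get(t, 0)
        if d > 0:
            # Subjects censored at t are still counted as at risk at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def km_median(curve):
    """Smallest time at which the estimated curve is at or below 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical toy data (NOT the study data): time in days, died = 1 for death.
times = [100, 200, 300, 400, 500, 600]
died = [1, 0, 1, 0, 1, 0]

survival_curve = kaplan_meier(times, died)
# Reverse Kaplan-Meier: censoring becomes the "event", death becomes censoring.
followup_curve = kaplan_meier(times, [1 - e for e in died])

print("median survival :", km_median(survival_curve))
print("median follow-up:", km_median(followup_curve))
```

In the SAS appendix the same switch corresponds to changing `time*alive(0)` to `time*alive(1)` in PROC LIFETEST.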
We investigated the functional form of resting oxygen saturation using martingale and deviance residual plots. The initial plots showed a non-zero trend, so we constructed a linear spline with a knot at 95%. The model including this spline term was our final model.
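A linear spline with one knot adds a single extra term that is zero below the knot and rises linearly above it, so the slope of the log hazard is allowed to change at 95. The sketch below is illustrative only: the function names are hypothetical, and the two coefficients are the rounded estimates reported for "Resting O2 saturation" and "Spline term" in Table 4.

```python
KNOT = 95.0  # knot location reported in the text


def spline_term(rest_sp, knot=KNOT):
    """(rest_sp - knot)+ : zero below the knot, linear above it."""
    return max(rest_sp - knot, 0.0)


def rest_sp_log_hazard(rest_sp, beta_main, beta_spline, knot=KNOT):
    """Contribution of resting saturation to the Cox linear predictor."""
    return beta_main * rest_sp + beta_spline * spline_term(rest_sp, knot)


# Rounded estimates from Table 4 (main term and spline term).
beta_main, beta_spline = -0.02, -0.13

for value in (92.0, 95.0, 98.0):
    print(value, spline_term(value),
          round(rest_sp_log_hazard(value, beta_main, beta_spline), 2))

# Slope below the knot is beta_main; above the knot it is beta_main + beta_spline.
print("slope below knot:", beta_main)
print("slope above knot:", round(beta_main + beta_spline, 2))
```

This mirrors the construction `rest_Sp95*rest_ge_95` in the SAS appendix, where `rest_Sp95 = rest_Sp-95` and `rest_ge_95` is the indicator of `rest_Sp ge 95`.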
A key assumption of the Cox model is that hazards are proportional over time. We checked this assumption with two approaches, both of which presume that the functional form of the predictors is already correct.
The first approach is a graphical check based on Kaplan-Meier estimates. It can only be used for covariates with few levels (all discrete variables in our project are binary). If a predictor satisfies the proportional hazards assumption, the estimated survival curves for its levels should not cross, and the corresponding log cumulative hazard curves, log(-log S(t)), plotted against time should be approximately parallel. We used this approach for the binary variables desat1, uip1, SEX_M1 and Smoking.
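The reason the check is made on the log cumulative hazard scale can be written out in a few lines. The notation below (baseline hazard $h_0$, cumulative hazard $H$, a single binary covariate $x$ with coefficient $\beta$) is standard and is introduced here only for illustration.

```latex
\begin{aligned}
h(t \mid x) &= h_0(t)\, e^{\beta x}, \qquad x \in \{0, 1\} \\
H(t \mid x) &= \int_0^t h(u \mid x)\, du = H_0(t)\, e^{\beta x} \\
\log\{-\log S(t \mid x)\} &= \log H(t \mid x) = \log H_0(t) + \beta x \\
\log H(t \mid x = 1) - \log H(t \mid x = 0) &= \beta \quad \text{for all } t
\end{aligned}
```

Under proportional hazards the two log cumulative hazard curves are therefore vertical shifts of one another, and their difference is a horizontal line at the log hazard ratio.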
The second approach uses Schoenfeld residuals and applies to both discrete and continuous variables. The idea is to test for a non-zero slope in a regression of the scaled Schoenfeld residuals on functions of time; a non-zero slope is evidence against the null hypothesis of proportional hazards. Relying on the p-value of this test alone is not recommended: if the smoothed residual curve goes up and then down, the two trends can offset each other and produce a large p-value even though the assumption is violated, and outliers can also distort the result. We therefore inspected the residual plots as well.
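For reference, the quantities behind this check can be sketched as follows. The notation is an assumption made for illustration: $R(t_i)$ is the risk set at event time $t_i$, $x_i$ the covariate vector of the subject who fails at $t_i$, and $s^{*}_{ij}$ the scaled Schoenfeld residual for covariate $j$.

```latex
\begin{aligned}
\bar{x}(\hat\beta, t_i) &=
  \frac{\sum_{l \in R(t_i)} x_l \exp(x_l^{\top} \hat\beta)}
       {\sum_{l \in R(t_i)} \exp(x_l^{\top} \hat\beta)} \\
r_i &= x_i - \bar{x}(\hat\beta, t_i) \\
E\left[ s^{*}_{ij} \right] + \hat\beta_j &\approx \beta_j(t_i)
\end{aligned}
```

The last line is the approximation that motivates the plot: if the coefficient does not vary with time, a smooth of the scaled residuals against time should be roughly flat around zero.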
Model selection After obtaining the full main-effects model, we explored possible interactions between oxygen desaturation and each covariate by adding one interaction term at a time to the Cox proportional hazards model. None of the interaction effects was close to statistical significance, and model fit (by AIC) was always worse than for the main-effects model, except for the model with the UIP by Desat1 interaction (AIC of 257.818 vs. 258.332 for the main-effects model).
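The AIC comparison can be related to a likelihood-ratio test with a little arithmetic. The sketch below assumes that AIC is computed as -2 log partial likelihood + 2k and that both models were fit to the same observations; the helper names are hypothetical, and only the two AIC values come from the text.

```python
import math


def lr_statistic_from_aic(aic_reduced, aic_extended, extra_params=1):
    """Likelihood-ratio statistic implied by two AIC values.

    With AIC = -2 * logPL + 2 * k,
    -2 * (logPL_reduced - logPL_extended)
        = (aic_reduced - aic_extended) + 2 * extra_params.
    """
    return (aic_reduced - aic_extended) + 2.0 * extra_params


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


aic_main = 258.332          # main-effects model (from the text)
aic_uip_by_desat = 257.818  # model with the UIP by Desat1 interaction (from the text)

lr = lr_statistic_from_aic(aic_main, aic_uip_by_desat)
p_value = chi2_sf_1df(lr)

print(f"AIC difference      : {aic_main - aic_uip_by_desat:.3f}")
print(f"Implied LR statistic: {lr:.3f}")
print(f"Approximate p-value : {p_value:.3f}")
```

Under these assumptions, an AIC improvement of about half a unit for one extra parameter corresponds to a likelihood-ratio test that is not significant at the 5% level, which agrees with the statement that no interaction was close to significance.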
Results
Univariate summary Table 1 shows the univariate summaries of the 8 covariates in the data with a sample size of 104. As we can see from the table, the mean age of subjects is 70.0 with a standard deviation 10.1. Resting oxygen saturation percent (rest_Sp) has a mean of 95.5 with standard deviation 2.28. Forced vital capacity in units of 10% of predicted (fvcppd10) has a mean of 6.50 with standard deviation 1.97. Table 1: Univariate summaries (overall and stratified by Desaturation).
Variable                 Overall       No Desaturation   Desaturation   P-value
                         n=104*        n=59              n=45
                                       (mean ± sd)       (mean ± sd)
Demographic
  Gender (% Male)        58 (56%)      31 (52%)          27 (60%)       0.45#
  Age                    70.0 ± 10.1   60.9 ± 10.7       61.0 ± 9.5     0.97^
Treatment                                                               0.46#
  UIP                    82 (79%)      45 (76%)          37 (82%)
  NSIP                   22 (21%)      14 (24%)          8 (18%)
Death                    35 (34%)      10 (17%)          25 (56%)       <0.001#
Censored                 69 (66%)      49 (83%)          20 (44%)
Covariates
  Resting Saturation, %  95.5 ± 2.28   96.3 ± 1.9        94.3 ± 2.3     <0.001^
  Vital Capacity         6.50 ± 1.97   6.9 ± 2.14        6.0 ± 1.6      0.022^
  Smoking                70 (68%)      39 (67%)          31 (69%)       0.86#
*Note: overall n=104, but the full model used n=103 (1 observation had a missing value for smoking).
^ t-test
# chi-squared test
Kaplan-Meier survival estimation Table 2 shows the overall Kaplan-Meier estimates of survival and follow-up time; the survival curve is shown in Figure 1. Of the 104 subjects, 35 died during follow-up and 69 (66.35%) were censored. The mean survival time was 1257.83 days (standard error 61.73), and the median survival time was 1567 days.
Table 2: Overall Kaplan-Meier survival statistics.
                  Mean (SE)        Median    n=104
Survival time     1257.8 ± 61.7    1567      Deaths=35
Follow-up time    1114.3 ± 58.3    1122      Censored=69 (66.3%)
NB: time in days.
Figure 1: Overall KM survival curve.
Figure 2 provides estimated Kaplan-Meier curves stratified by usual interstitial pneumonia, oxygen desaturation, sex, and smoking status. We see clear differences in estimated survival for the first two variables: patients diagnosed with usual interstitial pneumonia and patients who experienced desaturation were more likely to die. The survival curves for smoking status and sex suggest that neither is an important factor for death in interstitial pulmonary fibrosis patients.
Figure 2: Kaplan-Meier curves for patients stratified by usual interstitial pneumonia (top left), desaturation (top right), smoking status (bottom left), and sex (bottom right).
Table 3 provides a summary of each covariate's association with death. UIP and desaturation are strongly associated with death, which confirms the observation above, while smoking status and sex do not have a statistically significant association with death in interstitial pulmonary fibrosis patients.
Table 3: Covariate association with death (separate, stratified KM models).

Usual Interstitial Pneumonia (P Value 0.0077*)
Variable   NSIP              UIP
N/F/C      22/3/19           82/32/50
Mean       1579 ± 96         1111 ± 66
75%        1691              606
50%        .                 1514

Oxygen Desaturation (P Value <0.0001*)
Variable   No Desaturation   Desaturation
N/F/C      59/10/49          45/25/20
Mean       1333 ± 61         1013 ± 93
75%        1514              1567.
50%        .                 989

Smoking Status (P Value 0.3960**)
Variable   Non-Smoker        Smoker
N/F/C      33/11/22          70/23/47
Mean       1143 ± 103        1282 ± 73
75%        578               856
50%        .                 1567

Sex (P Value 0.4768*)
Variable   Female            Male
N/F/C      46/14/32          58/21/37
Mean       1224 ± 87         1221 ± 83
75%        670               838
50%        .                 1534

NB: time in days. N = count. F = Failed. C = Censored.
* Log-Rank Test
** Wilcoxon Test
Resting Oxygen Functional Form Initial and post-spline diagnostic plots for the functional form of rest_Sp are shown in Figure 3. After adding the spline (second panel), the residuals scatter around the y=0 line with no remaining trend, as desired.
Figure 3: Martingale residual plot before spline construction (left) and after spline construction (right).
Final model We included the spline term in our final model. As shown in Table 4, we detected a significant association between patient survival and our main predictor of interest, oxygen desaturation status. Desat1 had a p-value < 0.01, and after adjusting for all other covariates, the hazard ratio for desat1 was 3.23 (95% CI: 1.40, 7.48). This means that patients who had a fall in oxygen saturation had 3.23 times the rate of death (a 223% higher rate) of patients who did not. We also fit the model without the spline term, which gave a similar AIC (258.33, versus 260.14 for our final model), indicating that the two models fit about equally well. In addition, UIP diagnosis (uip1) has a significant p-value (0.0489) in the non-spline model, which becomes non-significant (0.07) in our final model. The other six terms fail to show statistically significant effects on patient survival time.
Table 4: Final model with linear spline for Resting Oxygen.
Parameter               Parameter   Standard   Chi-Square   P-value   Hazard   95% Hazard Ratio
                        Estimate    Error                             Ratio    Confidence Limits
Age                      0.03       0.02       1.20         0.27      1.03     0.98    1.07
Uip diagnosis            1.18       0.65       3.28         0.07      3.27     0.91   11.77
Vital capacity          -0.16       0.12       1.83         0.18      0.85     0.68    1.07
Gender                   0.33       0.41       0.64         0.42      1.39     0.62    3.14
Smoking status          -0.09       0.43       0.05         0.83      0.91     0.40    2.10
Resting O2 saturation   -0.02       0.13       0.02         0.89      0.98     0.77    1.26
Spline term             -0.13       0.29       0.19         0.66      0.87     0.50    1.56
O2 desaturation          1.17       0.43       7.50         0.01      3.23     1.40    7.48
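The hazard ratio and its confidence limits follow directly from the parameter estimate and standard error. The sketch below assumes a Wald interval with z = 1.96 and uses the rounded values from the O2 desaturation row, so the results differ slightly from the table, which was computed from unrounded estimates. The function name is hypothetical.

```python
import math


def hazard_ratio_ci(beta, se, z=1.96):
    """Hazard ratio exp(beta) with Wald limits exp(beta -/+ z * se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Rounded estimate and standard error for O2 desaturation (Table 4).
beta, se = 1.17, 0.43

hr, lower, upper = hazard_ratio_ci(beta, se)
percent_increase = (hr - 1.0) * 100.0

print(f"HR = {hr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print(f"Relative increase in hazard: {percent_increase:.0f}%")
```

A hazard ratio of about 3.2 corresponds to an increase in the hazard of a little over 220%, not 320%, because the percentage increase is (HR - 1) x 100.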
Figure 4a: Proportional hazards diagnostics using cumulative hazard plots (left) and their differences (right).
Proportional hazards diagnostics We first checked the proportional hazards assumption graphically by plotting the log cumulative hazard against time on study for each level of the four dichotomous variables. We also plotted the difference in log cumulative hazard between the two levels over time to check whether it remained constant (see Figure 4a).
From these plots, the proportional hazards assumption appears to hold for each of the four dichotomous variables: the two curves are nearly parallel in each of the four plots on the left, and the plots on the right show that the difference in log cumulative hazard is nearly constant over time (except sometimes at the endpoints). Furthermore, the plots on the left for smoking status and gender show two nearly identical curves, indicating that neither variable has much effect on the hazard at any time during the study, which is consistent with our Kaplan-Meier curves. The plots for oxygen desaturation and UIP diagnosis indicate that each of these two factors acts multiplicatively on the hazard rate.
Second, we checked the proportional hazards assumption using Schoenfeld residuals. We fit the model with rest_Sp and its linear spline term and plotted each variable's scaled Schoenfeld residuals against functions of time (Figure 4b).
As a further check, we compared the old model, in which rest_Sp enters only as a linear predictor, with the new model, in which rest_Sp and its linear spline term both enter as predictors. From Figure 4b, the new model satisfies the proportional hazards assumption much better than the old one.
Figure 4b: Proportional hazards diagnostics using Schoenfeld residuals.
Conclusion Our results show that oxygen desaturation may be an important covariate in interstitial pneumonia disease progression. Larger studies with other covariates should be undertaken to confirm these findings.
Appendix
SAS Code
libname bios "M:\0FinalSurvival";

proc print data=bios.proj12; run;

*other univariate stats;
title 'univariate';
proc freq data = bios.proj12; tables desat1*alive / chisq; run;
proc freq data = bios.proj12; tables desat1*SEX_M1 / chisq; run;
proc freq data = bios.proj12; tables desat1*uip1 / chisq; run;
proc freq data = bios.proj12; tables desat1*smoking / chisq; run;

title 'kms';
*kaplan meiers;
proc lifetest data=bios.proj12 plots=(s, lls);
  time time*alive(0);
run;

*kaplan meiers follow up;
proc lifetest data=bios.proj12 plots=(s, lls);
  time time*alive(1);
run;

*univariate summary of covariate association with death;
*finding the median with the KM proc;
title'univ surv time';
proc lifetest data=bios.proj12;
  time time*alive(0);
  strata uip1;
run;
proc lifetest data=bios.proj12;
  time time*alive(0);
  strata desat1;
run;
proc lifetest data=bios.proj12;
  time time*alive(0);
  strata SEX_M1;
run;
proc lifetest data=bios.proj12;
  time time*alive(0);
  strata Smoking;
run;
title'';

/*********FULL PHREG MODEL********/
title 'full model';
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 / risklimits;
run;
* the log says one obs was deleted;

*now exploring interactions one by one;
title 'interactions';
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 desat1*age / risklimits;
run;
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 desat1*uip1 / risklimits;
run;
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 desat1*fvcppd10 / risklimits;
run;
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking
        rest_Sp desat1 desat1*SEX_M1 / risklimits;
run;
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 desat1*Smoking / risklimits;
run;
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 desat1*rest_Sp / risklimits;
run;

*now exploring interactions one by one;
******but including new functional form of rest_Sp;
* adding spline;
title'';
data bios.proj0;
  set bios.proj12;
  rest_Sp95 = rest_Sp-95;
  rest_ge_95 = (rest_Sp ge 95);
run;

title 'new';
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 rest_Sp95*rest_ge_95 desat1*age / risklimits;
run;
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 rest_Sp95*rest_ge_95 desat1*uip1 / risklimits;
run;
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 rest_Sp95*rest_ge_95 desat1*fvcppd10 / risklimits;
run;
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 rest_Sp95*rest_ge_95 desat1*SEX_M1 / risklimits;
run;
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 rest_Sp95*rest_ge_95 desat1*Smoking / risklimits;
run;
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 rest_Sp95*rest_ge_95 desat1*rest_Sp / risklimits;
run;
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 rest_Sp95*rest_ge_95 desat1*rest_Sp95*rest_ge_95 / risklimits;
run;
title '';
/* no interactions found significant, either with rest_Sp by itself or with the new splines */

/*********FUNCTIONAL FORM CHECKS **********/
*look at residuals for functional form of age;
*just to double check linearity;
proc phreg data=bios.proj12;
  model time*alive(0) = / risklimits;
  output out=Outp xbeta=Xb resmart=Mart resdev=Dev dfbeta=delta_beta ressch=sch;
run;
* these plots look good (linear at y=0);
proc sgplot data=Outp;
  yaxis grid;
  refline 0 / axis=y;
  loess y=Mart x=age / smooth=0.6;
run;
proc sgplot data=Outp;
  refline 0 / axis=y;
  loess y=Dev x=age / smooth=0.6;
run;

*now, look at residuals for functional form of rest_Sp;
*include the two predictors for which we know the functional form (accd to textbook);
*these plots have some problems;
proc phreg data=bios.proj12;
  model time*alive(0) = age fvcppd10 / risklimits;
  output out=Outp xbeta=Xb resmart=Mart resdev=Dev dfbeta=delta_beta ressch=sch;
run;
proc sgplot data=Outp;
  yaxis grid;
  refline 0 / axis=y;
  loess y=Mart x=rest_Sp / smooth=0.6;
run;
proc sgplot data=Outp;
  refline 0 / axis=y;
  loess y=Dev x=rest_Sp / smooth=0.6;
run;

* adding spline;
data bios.proj0;
  set bios.proj12;
  rest_Sp95 = rest_Sp-95;
  rest_ge_95 = (rest_Sp ge 95);
run;

*now re-checking functional form of rest_Sp;
* accd to ucla chapter 11;
proc phreg data=bios.proj0;
  model time*alive(0) = age fvcppd10 rest_Sp rest_Sp95*rest_ge_95/ risklimits;
  output out=Outp xbeta=Xb resmart=Mart resdev=Dev dfbeta=delta_beta ressch=sch;
run;

*here is the graphical check, and it is good;
proc sgplot data=Outp;
  yaxis grid;
  refline 0 / axis=y;
  loess y=Mart x=rest_Sp / smooth=0.6;
run;
proc sgplot data=Outp;
  refline 0 / axis=y;
  loess y=Dev x=rest_Sp / smooth=0.6;
run;

*here is an alternative graphical check from ucla site;
*also good fit;
proc loess data=Outp;
  ods output OutputStatistics=figureHere;
  model Mart = rest_Sp / smooth=0.6 direct;
run; quit;

************FULL MODEL again;
title 'full model';
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1 / risklimits;
run;

title 'full model, with spline';
proc phreg data=bios.proj0;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp rest_Sp95*rest_ge_95 desat1 / risklimits;
run;

/******************* CHECK FOR OVERALL MODEL FIT ***************/
/******************* COX SNELL RESIDUALS******/
/**********FIRST MODEL*******/
*cox snell;
*-logsurv is the cox-snell residual;
title '';
proc phreg data=bios.proj12;
  model time*alive(0) = age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1;
  output out= coxfig logsurv = h /method=ch;
run;
data cox;
  set coxfig;
  h=-h;
  cons=1;
run;
proc phreg data=cox;
  model h*alive(0) = cons;
  output out = coxfig2 logsurv =ls /method=ch;
run;
data cox2;
  set coxfig2;
  haz = - ls;
run;
proc sort data = cox2; by h; run;

title "Cox-Snell Residual Plot for Assessing Model Fit";
axis1 order = (0 to 2 by .2) minor = none;
axis2 order = (0 to 2 by .2) minor = none label = ( a=90);
symbol1 v=none i = stepjl c= blue;
symbol2 v=none i = join c = red l = 3;
proc gplot data = cox2;
  plot haz*h =1 h*h =2 /overlay haxis=axis1 vaxis= axis2;
  label haz = "Estimated Cumulative Hazard Rates";
  label h = "Residual";
run; quit;

/**********SECOND MODEL - stratify by desat1*******/
*cox snell;
*-logsurv is the cox-snell residual;
proc sort data = cox; by desat1; run;
proc phreg data=cox;
  model h*alive(0) = cons;
  output out = fill_2b logsurv =ls /method=ch;
  by desat1;
run;
data fill_2b1;
  set fill_2b;
  if desat1 = 0 then haz1 = -ls;
  if desat1 = 1 then haz2 = -ls;
run;
proc sort data = fill_2b1; by h; run;

title "Cox-Snell residual plots for Desat1=1 and Desat1=0 separately.";
*blue (the shorter step function) is desat1=0;
symbol1 i = stepjl c= blue;
symbol2 i = stepjl c = red l = 3;
symbol3 i = join c = black;
proc gplot data = fill_2b1;
  plot haz1*h = 1 haz2*h = 2 h*h=3 /overlay haxis=axis1 vaxis=axis2 ;
  label haz1 = "Log Cumulative Hazard Rate";
  label h= "Residual";
run; quit;

/******************* CHECK FOR PROPORTIONAL HAZARDS ***************/
*from
http://www.ats.ucla.edu/stat/sas/examples/sakm/chapter11.htm;
*names: age uip1 fvcppd10 SEX_M1 Smoking rest_Sp desat1;
*checking age desat1 fvcppd10 rest_Sp;
data project1;
  set bios.proj12;
  cons = 1;
run;
proc lifetest data=bios.proj12 plot=lls;
  time time*alive(0);
  strata desat1;
run;

title 'desat';
proc phreg data = bios.proj12 ;
  model time*alive(0) = desat1;
  * strata desat1;
  output out = figure11_9 logsurv = ls /method = ch;
run;
data figure11_9a;
  set figure11_9 ;
  logh = log (-ls);
  if desat1 = 0 then logh1 = logh;
  if desat1 = 1 then logh2 = logh;
run;
proc sort data = figure11_9a; by time; run;

title "desat";
axis1 order = (0 to 2000 by 200) minor = none;
axis2 order = (-4 to 1 by 1) minor = none label = ( a=90);
symbol1 i = stepjl c= blue v=none;
symbol2 i = stepjl c = red l = 3 v=none;
proc gplot data = figure11_9a;
  plot logh1*time = 1 logh2*time = 2 /overlay haxis=axis1 vaxis=axis2 ;
  label logh1 = "Log Cumulative Hazard Rate";
  label time= "Time on Study";
run; quit;

*another plot type of differences;
data figure11_9b;
  set figure11_9a;
  retain l1 l2 -5;
  if logh1 ~= . then l1 = logh1;
  if logh2 ~= . then l2 = logh2;
  diff = l2 - l1;
run;
axis1 order = (0 to 2000 by 200) minor = none;
axis2 order = (0 to 1.6 by .4) minor = none label = ( a=90);
symbol1 i = stepjl c= blue;
title "Figure 11.10";
proc gplot data = figure11_9b;
  plot diff*time /haxis= axis1 vaxis=axis2;
  label diff = "Difference in Cumulative Hazard Rates";
  label time = "Time on Study";
run; quit;

/******** now checking uip ********/
title 'uip';
proc phreg data = project1 ;
  model time*alive(0) = uip1;
  * strata desat1;
  output out = figure11_9 logsurv = ls /method = ch;
run;
data figure11_9a;
  set figure11_9 ;
  logh = log (-ls);
  if uip1 = 0 then logh1 = logh;
  if uip1 = 1 then logh2 = logh;
run;
proc sort data = figure11_9a; by time; run;

title "uip";
axis1 order = (0 to 2000 by 200) minor = none;
axis2 order = (-4 to 1 by 1) minor = none label = ( a=90);
symbol1 i = stepjl c= blue;
symbol2 i = stepjl c = red l = 3;
proc gplot data = figure11_9a;
  plot logh1*time = 1 logh2*time = 2 /overlay haxis=axis1 vaxis=axis2 ;
  label logh1 = "Log Cumulative Hazard Rate";
  label time= "Time on Study";
run; quit;

*another plot type of differences;
data figure11_9b;
  set figure11_9a;
  retain l1 l2 -5;
  if logh1 ~= . then l1 = logh1;
  if logh2 ~= . then l2 = logh2;
  diff = l2 - l1;
run;
axis1 order = (0 to 2000 by 200) minor = none;
axis2 order = (0 to 1.6 by .4) minor = none label = ( a=90);
symbol1 i = stepjl c= blue;
title "Figure 11.10";
proc gplot data = figure11_9b;
  plot diff*time /haxis= axis1 vaxis=axis2;
  label diff = "Difference in Cumulative Hazard Rates";
  label time = "Time on Study";
run; quit;
R Code
library(psych)     # describe()
library(survival)  # Surv(), survfit(), coxph(), cox.zph()

##* read in data
origin <- read.csv("D:/Statistics/Biostat675/final-project/proj12.csv")
##* check for missing values (a non-zero variance flags a column with missing entries)
var(is.na(as.matrix(origin)))
##* delete missing values
origin <- origin[!is.na(origin$Smoking), ]

##* univariate analysis
n <- nrow(as.matrix(origin))
describe(origin)
summary(origin)
origing0 <- origin[origin$desat1 == 0, ]
describe(origing0)
summary(origing0)
origing1 <- origin[origin$desat1 == 1, ]
describe(origing1)
summary(origing1)

##* K-M estimation, stratified by desat1
survdata <- survfit(Surv(time, alive) ~ desat1, type = "kaplan-meier",
                    error = "greenwood", data = origin)
summary(survdata)
#* fun = "cumhaz" plots the cumulative hazard, so the y-axis is labelled accordingly
plot(survdata, conf.int = F, fun = "cumhaz", xlab = c("Time"),
     ylab = c("Cumulative Hazard"), xlim = c(0,2200), lty = 1:2, las = 1)

#* overall K-M fit (median survival = 1567)
survdataall <- survfit(Surv(time, alive) ~ 1, type = "kaplan-meier",
                       error = "greenwood", data = origin)
summary(survdataall)
#* median observed time among subjects with alive == 1
median(origin$time[origin$alive == 1])

##* interactions
cor(origin)
lmlog <- glm(desat1 ~ alive + uip1 + fvcppd10 + rest_Sp + SEX_M1 + Smoking + age,
             family = binomial, data = origin)
step(lmlog)

##* proportional hazard test1
time.dep <- coxph(Surv(time, alive) ~ desat1 + uip1 + fvcppd10 + rest_Sp +
                    SEX_M1 + Smoking + age,
                  method = "breslow", na.action = na.exclude, data = origin)
time.dep.zph <- cox.zph(time.dep, transform = "km", global = TRUE)
print(time.dep.zph)  ##* check for significance
summary(time.dep)

##* add spline: (rest_Sp - 95) when rest_Sp >= 95, otherwise 0
irest_Sp95 <- as.numeric(origin$rest_Sp >= 95)
rest_Sp95 <- origin$rest_Sp - 95
newrest <- rest_Sp95 * irest_Sp95
time.dep <- coxph(Surv(origin$time, origin$alive) ~ origin$desat1 + origin$uip1 +
                    origin$fvcppd10 + origin$rest_Sp + newrest + origin$SEX_M1 +
                    origin$Smoking + origin$age,
                  method = "breslow", na.action = na.exclude)
time.dep.zph <- cox.zph(time.dep, transform = "km", global = TRUE)
print(time.dep.zph)

##* one Schoenfeld residual plot per model term, in model order
zph.labels <- c("Residuals for desat1", "Residuals for uip1",
                "Residuals for fvcppd10", "Residuals for rest_Sp",
                "Residuals for rest_Sp's spline", "Residuals for SEX_M1",
                "Residuals for Smoking", "Residuals for age")
for (i in seq_along(zph.labels)) {
  plot(time.dep.zph[i], xlab = "Time", ylab = zph.labels[i])
  abline(h = 0, lty = 3)
}