Survival analysis

PRESENTED BY:

DR SANJAYA KUMAR SAHOO, PGT, AIIH&PH, KOLKATA

SURVIVAL:

• It is the probability of remaining alive for a specific length of time.

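• It is often the point of interest when describing the prognosis of a disease, e.g. 5-year survival.

e.g. A 5-year survival of 0.19 for AML indicates that 19% of patients with AML will survive for 5 years after diagnosis.

e.g. For 2-year survival: S = (A - D)/A = (6 - 1)/6 = 5/6 = 0.83 = 83%, where A is the number of patients alive at the start of the period and D is the number who died during it.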

CENSORING:

• Subjects are said to be censored if:

- they are lost to follow-up,

- they drop out of the study, or

- the study ends before they die or have an outcome of interest.

• They are counted as alive or disease-free for the time they were enrolled in the study.

• In simple words, some important information required to make a calculation is not available to us, i.e. it is censored.

Types of censoring:

• Right censoring

• Left censoring

• Interval censoring

Right Censoring:

• Right censoring is the most common type.

• It means that we are not certain what happened to people after some point in time.

• This happens when some people cannot be followed for the entire study period because they died of an unrelated cause, were lost to follow-up, or withdrew from the study.

Left Censoring:

• Left censoring is when we are not certain what happened to people before some point in time.

• The commonest example is when people already have the disease of interest when the study starts.

Interval/Random Censoring:

• Interval/random censoring is when we know that something happened in an interval (i.e. not before the starting time and not after the ending time of the study), but do not know exactly when in the interval it happened.

• For example, we know that the patient was well at the start of the study and was diagnosed with the disease at the end of the study, so when did the disease actually begin?

• All we know is the interval.

What is survival analysis?

• Statistical methods for analyzing longitudinal data on the occurrence

of events.

• Events may include death, onset of illness, recovery from illness (binary variables), failure, etc.

• Accommodates data from randomized clinical trial or cohort study

design.

Need for survival analysis:

• Investigators frequently must analyze data before all patients have died;

otherwise, it may be many years before they know which treatment is better.

• Survival analysis gives patients credit for how long they have been in the study,

even if the outcome has not yet occurred.

• The Kaplan–Meier procedure is the most commonly used method to illustrate

survival curves.


Objectives of survival analysis:

• To estimate time-to-event for a group of individuals:

- such as time until death for heart transplant patients (mortality studies)

- time to remission for leukemic patients (in therapy trials)

• To compare time-to-event between two or more groups:

- such as treated vs. placebo MI patients in a randomized controlled trial

• To assess prognostic covariables (survival models):

- such as whether weight, insulin resistance, or cholesterol influences the survival time of MI patients

Survival Analysis: Terms

• Time-to-event: The time from entry into a study until a

subject has a particular outcome.

• Censoring: Subjects are said to be censored if they are

lost to follow up or drop out of the study, or if the

study ends before they die or have an outcome of

interest. They are counted as alive or disease-free for

the time they were enrolled in the study.

Importance of censoring in survival analysis:

• Example: we want to know the survival rates of a disease in two groups, and our outcome of interest is death due to the disease.

Group 1                      Group 2
Time in months   Event       Time in months   Event
5                death       9                death
6                death       8                death
8                death       12               death
9                death       20               death
10               death       6                death
12               death       7                death
16               death       4                death

These data do not need survival analysis methods, as there are no censored observations. Because all patients died, we can simply take the mean time to death in each group to see which group has the longer survival time.

Also, the data should not have more than 50% censored observations.
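As a quick check of the comparison described above, the mean time to death in each group can be computed directly. This is only a minimal Python sketch: the variable names are illustrative, and the times are the ones listed for the two groups.

```python
from statistics import mean

# Times to death in months, copied from the two groups above.
group_1 = [5, 6, 8, 9, 10, 12, 16]
group_2 = [9, 8, 12, 20, 6, 7, 4]

# With no censoring, every time is a complete observation,
# so an ordinary mean is a fair summary of each group.
mean_1 = mean(group_1)
mean_2 = mean(group_2)

print(f"Group 1 mean survival: {mean_1:.2f} months")
print(f"Group 2 mean survival: {mean_2:.2f} months")
```

With these particular numbers both groups total 66 months over 7 patients, so the two means come out equal (about 9.43 months).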

SURVIVAL FUNCTION:

Let T = time of death (or disease).

• Survival function: S(t) = 1 - F(t) = prob.(alive at time t) = prob.(T > t), where F(t) = prob.(T ≤ t) is the cumulative distribution function of T.

In simple terms, it can be defined as:

S(t) = (No. of patients surviving longer than t) / (Total no. of patients)
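The simple ratio above can be sketched in a few lines of Python. This assumes every patient's time of death is known (no censoring); the function name `survival_fraction` and the sample times are hypothetical, chosen only for illustration.

```python
def survival_fraction(death_times, t):
    """Fraction of patients whose time of death is later than t (no censoring)."""
    if not death_times:
        raise ValueError("need at least one patient")
    survivors = sum(1 for time in death_times if time > t)
    return survivors / len(death_times)

# Hypothetical times of death in months for 7 patients.
times = [5, 6, 8, 9, 10, 12, 16]

print(survival_fraction(times, 0))    # everyone is alive just after entry
print(survival_fraction(times, 9))    # 3 of 7 patients live longer than 9 months
print(survival_fraction(times, 16))   # nobody lives longer than 16 months
```

Once some observations are censored this ratio is no longer appropriate, which is where the Kaplan-Meier estimate comes in.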

Kaplan-Meier estimate of survival function:

• Calculates the survival of the study population.

• Easy to calculate.

• Non-parametric estimate of the survival function.

• Commonly used to compare two study populations.

• Applicable to small, moderate and large samples.

Kaplan-Meier Estimate:

• The survival probability can be calculated in the following way:

P1 = Probability of surviving for at least 1 day after transplant

P2 = Probability of surviving the second day after having survived the first day

P3 = Probability of surviving the third day after having survived the second day

• To calculate S(t) we need to estimate each of P1, P2, P3, ..., Pt.

• The probability of survival on day t is calculated as:

Pt = (No. of patients followed for at least (t-1) days who also survived day t) / (No. of patients alive at the end of day (t-1))

S(t) = P1 x P2 x P3 x ... x Pt
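Written compactly, the product above is usually expressed over the distinct event times rather than over every day, since Pt = 1 on days with no event. Assuming events occur at times t_1 < t_2 < ..., with n_i patients at risk and d_i events at t_i (the ni, di notation this deck uses in its tumor-patient example), a standard way to write it is:

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \frac{n_i - d_i}{n_i}
```

Censored patients contribute to n_i for as long as they are under observation and simply leave the risk set afterwards.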

Example: 10 tumor patients (remission time in months)

Event time (T)   Number at risk (ni)   Number of events (di)   (ni - di)/ni   Survival S(t)
3                10                    1                       9/10           9/10 = 0.90
4+
5.7+
6.5              7                     2                       5/7            9/10 * 5/7 = 0.64
8.4+
10               4                     1                       3/4            9/10 * 5/7 * 3/4 = 0.48
10+
12               2                     1                       1/2            9/10 * 5/7 * 3/4 * 1/2 = 0.24
15               1                     1                       0/1            0

+ denotes censored data.

• In this method, the first step is to list the times when a death or drop-out occurs, as in the column "Event time".

• The patients whose disease progressed at 3, 6.5, 10, 12 and 15 months are listed under the column "Number of events" (di); ni denotes the number of patients at risk at that point of time.

• Then, each time an event or outcome occurs, the probability of survival at that point of time and the cumulative survival S(t) are calculated.
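The table for the 10 tumor patients can be reproduced with a short Python sketch. The function `kaplan_meier` below is a hypothetical helper written for this example, not a library API; it assumes right-censored data coded as 1 = event observed and 0 = censored, and it assumes the patient followed to 15 months had the event, as the accompanying text states.

```python
from fractions import Fraction

def kaplan_meier(times, events):
    """Return [(event_time, n_at_risk, n_events, S(t))] for right-censored data.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = Fraction(1)
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        # Gather everyone whose follow-up ends exactly at time t.
        same_time = [e for (u, e) in data[i:] if u == t]
        d = sum(same_time)
        if d > 0:
            survival *= Fraction(n_at_risk - d, n_at_risk)
            rows.append((t, n_at_risk, d, survival))
        # Events and censored patients both leave the risk set after t.
        n_at_risk -= len(same_time)
        i += len(same_time)
    return rows

# The 10 tumor patients; a "+" in the table means censored (event = 0).
times = [3, 4, 5.7, 6.5, 6.5, 8.4, 10, 10, 12, 15]
events = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]

rows = kaplan_meier(times, events)
for t, n, d, s in rows:
    print(f"t={t:<5} at risk={n:<3} events={d}  S(t)={float(s):.3f}")
```

Exact fractions are used so that each S(t) can be compared directly with the products in the table (9/10, 9/14, 27/56, 27/112 and 0).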

Survival Data (right-censored)

[Figure: follow-up timelines for Subjects A to E from the beginning of the study (0) to the end of the study (12 months), with the corresponding Kaplan-Meier curve built up step by step; X marks a death.]

1. Subject E dies at 4 months. Probability of surviving to 4 months is 100% = 5/5. Fraction surviving this death = 4/5.

2. Subject A drops out after 6 months.

3. Subject C dies at 7 months. Fraction surviving this death = 2/3.

4. Subjects B and D survive for the whole year-long study period.

Rule from probability theory:

P(A&B) = P(A) * P(B) if A and B are independent

In Kaplan-Meier, intervals are defined by failures (2 intervals leading to failures here).

P(surviving intervals 1 and 2) = P(surviving interval 1) * P(surviving interval 2)

Product-limit estimate of survival = P(surviving interval 1 | at risk up to failure 1) * P(surviving interval 2 | at risk up to failure 2) = 4/5 * 2/3 = 0.5333

The probability of surviving the entire year, taking censoring into account = (4/5)(2/3) = 53%
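To see what "taking censoring into account" changes, the five-subject example can be redone in Python next to two naive alternatives. The two naive calculations are not from the slides; they are added here only for comparison, and the variable names are illustrative.

```python
from fractions import Fraction

# Five subjects: E dies at 4 months, A drops out at 6, C dies at 7,
# B and D are still alive when the 12-month study ends.
p_interval_1 = Fraction(4, 5)   # 5 at risk at month 4, 1 death
p_interval_2 = Fraction(2, 3)   # 3 at risk at month 7 (A has left), 1 death

km_estimate = p_interval_1 * p_interval_2

# Two naive alternatives, shown only for comparison:
drop_censored = Fraction(2, 4)      # ignore subject A entirely
censored_as_alive = Fraction(3, 5)  # pretend subject A survived the year

print(f"Kaplan-Meier:            {float(km_estimate):.3f}")
print(f"Dropping the censored:   {float(drop_censored):.3f}")
print(f"Censored counted alive:  {float(censored_as_alive):.3f}")
```

The product-limit estimate (8/15) lies between the two naive figures because subject A is credited with exactly the 6 months of follow-up actually observed.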

Properties of survival function:

1. Step function

2. Median survival time estimate (i.e. the time by which 50% of patients have had the event)

[Figure: two Kaplan-Meier survival curves]

• Median survival? 12 and 22

• Which has better survival? The 2nd one

• What proportion survives 20 days? Around 35% in the 1st graph and around 62% in the 2nd

Limitations of Kaplan-Meier:

1. Must have >50% uncensored observations.

2. Median survival time can be estimated only if the curve falls to 50% survival.

3. Does not control for covariates.

4. Assumes that censoring occurs independently of survival times (what if a person develops an adverse effect due to some treatment and is forced to leave the study, or dies?).

Median survival time = (t1 + t2)/2, where t1 and t2 are the two time points marked on the survival curve in the original figure.
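Reading the median off a Kaplan-Meier step function can be sketched as follows. The figure that defined t1 and t2 is not recoverable, so this sketch assumes the common convention that the midpoint (t1 + t2)/2 is used when the curve sits at exactly 50% between two event times; `median_survival` is a hypothetical helper, and the first data set is the 10-tumor-patient example from this deck.

```python
from fractions import Fraction

def median_survival(steps):
    """steps: [(event_time, S(t))] in time order, S(t) being the value
    of the Kaplan-Meier curve from that event time onward."""
    half = Fraction(1, 2)
    for i, (t, s) in enumerate(steps):
        if s < half:
            return t
        if s == half:
            # Curve is flat at exactly 50% until the next event:
            # take the midpoint (t1 + t2) / 2 when a next event exists.
            if i + 1 < len(steps):
                return (t + steps[i + 1][0]) / 2
            return t
    return None  # curve never reaches 50%, median not estimable

# Steps from the 10-tumor-patient example.
tumor_steps = [
    (3, Fraction(9, 10)),
    (6.5, Fraction(9, 14)),
    (10, Fraction(27, 56)),
    (12, Fraction(27, 112)),
    (15, Fraction(0)),
]

# Hypothetical curve that is flat at exactly 50% between months 5 and 9.
flat_steps = [(2, Fraction(3, 4)), (5, Fraction(1, 2)), (9, Fraction(1, 4))]

print(median_survival(tumor_steps))
print(median_survival(flat_steps))
```

If the curve never drops to 50%, the function returns `None`, which mirrors the limitation that the median cannot be estimated when too many observations are censored.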

Comparison between 2 survival curves:

• Don't make judgments simply on the basis of the amount of separation between two lines.

• Several methods may be used to compare survival curves:

- Logrank statistic

- Breslow statistic

- Tarone-Ware statistic

LOGRANK TEST:

• The logrank statistic is one of the most commonly used methods to learn if two curves are significantly different.

• This method is also known as the Mantel-logrank statistic or the Cox-Mantel-logrank statistic.

• The logrank statistic is approximately distributed as χ2 (with 1 degree of freedom for two groups), with a null hypothesis (H0) that the survival functions of the two groups are the same.

• Emphasizes failures in the tail of the survival curve, where the number at risk decreases over time, yet equal weight is given to each failure time.

• Usually gives statistically significant results.

BRESLOW STATISTICS

• Gives greater weight to early observations. It is less sensitive than the Log-Rank test to late events, when few subjects remain in the study.

TARONE-WARE STATISTICS

• Provides a compromise between the Log-Rank test and Breslow statistics, with an intermediate weighting scheme. This test maintains power across a wider range of alternatives than do the other two tests.

• Usually applied.
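The three tests differ only in the weight given to each event time, which a single Python sketch can show. This is a minimal two-group implementation written for illustration, not a library function: the name `weighted_logrank` is hypothetical, the weights assumed are w = 1 (log-rank), w = n (Breslow) and w = sqrt(n) (Tarone-Ware), where n is the total number at risk, and the p-value uses the χ2 approximation with 1 degree of freedom.

```python
import math

def weighted_logrank(times_1, events_1, times_2, events_2, weight="logrank"):
    """Chi-square statistic (1 df) and p-value comparing two groups.

    weight: "logrank" (w = 1), "breslow" (w = n), "tarone-ware" (w = sqrt(n)),
    where n is the total number at risk at each event time.
    """
    weights = {
        "logrank": lambda n: 1.0,
        "breslow": lambda n: float(n),
        "tarone-ware": lambda n: math.sqrt(n),
    }
    w_of = weights[weight]
    event_times = sorted({t for t, e in zip(times_1, events_1) if e} |
                         {t for t, e in zip(times_2, events_2) if e})
    numerator = 0.0
    variance = 0.0
    for t in event_times:
        n1 = sum(1 for u in times_1 if u >= t)   # at risk in group 1
        n2 = sum(1 for u in times_2 if u >= t)   # at risk in group 2
        d1 = sum(1 for u, e in zip(times_1, events_1) if u == t and e)
        d2 = sum(1 for u, e in zip(times_2, events_2) if u == t and e)
        n, d = n1 + n2, d1 + d2
        if n < 2:
            continue  # variance term is undefined with one subject left
        w = w_of(n)
        expected_1 = d * n1 / n                  # expected deaths under H0
        numerator += w * (d1 - expected_1)
        variance += w * w * d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("not enough information to compare the groups")
    chi_square = numerator ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square, 1 df
    return chi_square, p_value

# The two groups from the censoring example in this deck (all deaths observed).
g1_times, g1_events = [5, 6, 8, 9, 10, 12, 16], [1] * 7
g2_times, g2_events = [9, 8, 12, 20, 6, 7, 4], [1] * 7

for name in ("logrank", "breslow", "tarone-ware"):
    chi2, p = weighted_logrank(g1_times, g1_events, g2_times, g2_events, name)
    print(f"{name:12s} chi-square={chi2:.3f}  p={p:.3f}")
```

Because Breslow multiplies each term by n, early event times (when many subjects are still at risk) dominate its statistic, while the unweighted log-rank lets late events count just as much.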

Hazard function:

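• Opposite to the survival function: it describes failure rather than survival.

• The hazard function is the negative derivative of the survival function over time, divided by the survival function:

h(t) = -[dS(t)/dt] / S(t)

• It is the instantaneous risk of the event at time t (conditional failure rate).

• It is approximately the probability that a person will die in the next short interval of time, given that he or she survived until the beginning of the interval, divided by the length of that interval.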
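For reference, the standard textbook relation between the hazard and the survival function (not derived in the slides) can be written out as a short math block. Here T is the time of the event and S(t) = P(T > t), as defined earlier in this deck; the constant rate λ in the last line is an illustrative assumption.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = -\frac{1}{S(t)}\,\frac{dS(t)}{dt}
     = -\frac{d}{dt}\ln S(t)

% Example: a constant hazard h(t) = \lambda gives S(t) = e^{-\lambda t}.
```

The minus sign appears because S(t) can only decrease over time, while a hazard is never negative.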

Hazard function

• The hazard function, as a function of the covariates, is given by

h(t, x1, x2, ..., x5) = λ0(t) e^(β1x1 + β2x2 + ... + β5x5)

• λ0(t) is the baseline hazard at time t.

• For any individual subject i the hazard at time t is hi(t).

• hi(t) is linked to the baseline hazard λ0(t) by loge{hi(t)} = loge{λ0(t)} + β1X1 + β2X2 + ... + βpXp

• where X1, X2, ..., Xp are variables associated with the subject

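Cox-Proportional hazards:

$$
HR_{i,j} = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_1 x_{i1} + \dots + \beta_k x_{ik}}}{\lambda_0(t)\, e^{\beta_1 x_{j1} + \dots + \beta_k x_{jk}}} = e^{\beta_1 (x_{i1} - x_{j1}) + \dots + \beta_k (x_{ik} - x_{jk})}
$$

• h_i(t): hazard for person i (e.g. a smoker)

• h_j(t): hazard for person j (e.g. a non-smoker)

• HR_{i,j}: hazard ratio

Hazard functions should be strictly parallel (on the log scale)!

Produces covariate-adjusted hazard ratios!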

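The model: binary predictor

$$
HR_{\text{lung cancer/smoking}} = \frac{h_i(t)}{h_j(t)} = \frac{\lambda_0(t)\, e^{\beta_{\text{smoking}}(1) + \beta_{\text{age}}(60)}}{\lambda_0(t)\, e^{\beta_{\text{smoking}}(0) + \beta_{\text{age}}(60)}} = e^{\beta_{\text{smoking}}(1 - 0)}
$$

$$
HR_{\text{lung cancer/smoking}} = e^{\beta_{\text{smoking}}}
$$

This is the hazard ratio for smoking adjusted for age.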
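The cancellation of the baseline hazard and of the shared age term can be checked numerically. In this Python sketch the coefficient values are hypothetical, picked only for illustration (they are not estimates from any study), and `hazard_ratio` is a helper defined here rather than a library function.

```python
import math

def hazard_ratio(beta, x_i, x_j):
    """exp(sum_k beta_k * (x_ik - x_jk)); the baseline hazard cancels out."""
    return math.exp(sum(b * (xi - xj) for b, xi, xj in zip(beta, x_i, x_j)))

# Hypothetical coefficients, chosen only for illustration.
beta_smoking = 0.9
beta_age = 0.04
beta = [beta_smoking, beta_age]

smoker_60 = [1, 60]       # person i: smoking = 1, age = 60
non_smoker_60 = [0, 60]   # person j: smoking = 0, age = 60

hr = hazard_ratio(beta, smoker_60, non_smoker_60)
print(round(hr, 3))
print(round(math.exp(beta_smoking), 3))  # same value: the age term cancels
```

Changing both ages from 60 to any other common value leaves the ratio unchanged, which is what "adjusted for age" means in this model.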

Importance

• Provides a valid method of analysing a time-dependent outcome when follow-up is incomplete; many health-related outcomes are related to time.

• Hazard ratios can be interpreted in a similar way to relative risks or odds ratios.

• Gives survival curves with control of confounding variables.

• Can be used with multiple events for a subject.

Take Home Message

• Survival analysis is used to estimate time-to-event for a group of individuals and to compare time-to-event between two or more groups.

• In survival analysis, data are divided into censored and uncensored observations:

• all those who achieve the outcome of interest are "uncensored" data

• those who do not achieve the outcome are "censored" data

• The Kaplan-Meier method uses the next death, whenever it occurs, to define the end of the last class interval and the start of the new class interval.

• The Log-Rank test is used to compare 2 survival curves but does not control for confounding.

• To control for confounding, use 'Cox Proportional Hazards Regression'.

Date post:	25-Jun-2015
Upload:	sanjaya-sahoo