Date post: 08-Oct-2020
Single Institution Studies: What You can and can’t do David Schoenfeld Massachusetts General Hospital Biostatistics Center
Transcript

Single Institution Studies: What You Can and Can’t Do

David Schoenfeld

Massachusetts General Hospital Biostatistics Center

Outline

• Anecdote: sometimes big is small

• Basic rule for small studies: the purpose is to set the stage for the next study

• How to set the stage

– Things you can do to set the stage

– Things you can’t do in a small study

• Control groups

• Sample size

Anecdote: Sometimes Big Studies Are Small

• In 1984, Vincent Zurawski, founder of Centocor, tried to get a grant to study 5000 Swedish women to see whether he could detect ovarian cancer using his CA-125 test

• Test 5000 women, follow them for two years, and see who gets ovarian cancer

• The grant was rejected for lack of a sample size justification

Is this sample size adequate?

• Ovarian cancer is rare

• There will be relatively few positive tests

• If you detect even one ovarian cancer, you have shown sensitivity

• If you detect one ovarian cancer, the test works

• With 5000 patients we had a 90% chance of detecting at least one cancer

Basic purpose of a small study

• To write the protocol so that it can be carried out remotely

• To make sure that the protocol can be followed and that measurement issues are resolved

• To decide whether to go on to a larger study – the GO/NO-GO decision

• A grant for a small study should contain a Clinical Development Plan

Now some statistics

• The most important formula for small studies

• You want things that will happen to happen to you

• Pr = 1 - (1 - p)^n

• Pr = probability of it happening in your study

• p = frequency of occurrence in the big study

• n = size of the small study
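
A minimal sketch of this formula in Python (standard library only). The function name `prob_at_least_one` is illustrative and not from the slides, and the sketch assumes patients are independent and each has the same chance p of the event.

```python
def prob_at_least_one(p: float, n: int) -> float:
    """Chance of seeing the event at least once among n independent patients."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # (1 - p) ** n is the chance that none of the n patients has the event.
    return 1.0 - (1.0 - p) ** n


# Illustrative inputs: an event with a 10% frequency in a 20-patient study.
print(round(prob_at_least_one(0.10, 20), 3))  # prints 0.878
```

For events more frequent than p the chance is higher, so the result can be read as a lower bound for "any event that occurs with frequency p or more".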

Concerns that follow this rule

• Severe adverse events

• Unavoidable protocol violations

• Basically anything that needs to be anticipated

With a sample of 20 patients there will be an 88% chance (1 - 0.9^20 ≈ 0.878) of seeing at least one occurrence of any event that would occur with a frequency of 10% or more

• n = log(1 - Pr) / log(1 - p)
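
The inverse formula can be sketched the same way. This is an illustration only: the name `patients_needed` is hypothetical, and rounding up to a whole patient is an assumption added here, since the slide gives the unrounded formula.

```python
import math


def patients_needed(pr: float, p: float) -> int:
    """Smallest whole n for which 1 - (1 - p)**n reaches the target probability pr."""
    if not (0.0 < pr < 1.0 and 0.0 < p < 1.0):
        raise ValueError("pr and p must lie strictly between 0 and 1")
    # n = log(1 - Pr) / log(1 - p), rounded up to a whole patient.
    return math.ceil(math.log(1.0 - pr) / math.log(1.0 - p))


# Illustrative inputs: a 90% chance of seeing an event that occurs in 10% of patients.
print(patients_needed(0.90, 0.10))  # prints 22
```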

Tolerance

• How many patients have to be able to tolerate a new treatment to make it feasible?

• Considerations

• In a big study, what is the lowest acceptable tolerance?

• This should lead to a Go/No-Go (No-Fix/Fix) rule: if, in n patients, more than m tolerate the treatment, then GO; otherwise FIX

Basic Idea

[Figure with two labelled cases: "Expected Tolerance" and "Tolerance Unacceptable"]

Some statistics

• Uses the binomial distribution

• http://stattrek.com/online-calculator/binomial.aspx

• prob1: your expected tolerance rate

• prob2: the lowest acceptable tolerance rate

• pbinom(m, n, prob1, lower.tail=F) > 80-90%

• pbinom(m, n, prob2, lower.tail=F) <= 10-20%
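
The two conditions can be turned into a search for acceptable cut-offs m. The sketch below is a Python stand-in for the R calls, using only the standard library. The function names and the default thresholds of 80% and 20% (one end of each range quoted above) are assumptions made for illustration.

```python
from math import comb


def upper_tail(m: int, n: int, prob: float) -> float:
    """P(X > m) for X ~ Binomial(n, prob), the quantity pbinom(m, n, prob, lower.tail=F) returns."""
    return sum(
        comb(n, k) * prob**k * (1.0 - prob) ** (n - k)
        for k in range(m + 1, n + 1)
    )


def go_cutoffs(n: int, prob1: float, prob2: float,
               min_go: float = 0.80, max_false_go: float = 0.20) -> list[int]:
    """All m for which the rule 'GO if more than m of n tolerate' meets both conditions."""
    return [
        m
        for m in range(n)
        if upper_tail(m, n, prob1) >= min_go      # likely GO at the expected rate
        and upper_tail(m, n, prob2) <= max_false_go  # unlikely GO at the unacceptable rate
    ]


# Illustrative inputs: 40 patients, expected tolerance 80%, lowest acceptable 60%.
print(go_cutoffs(40, 0.80, 0.60))
```

When several cut-offs qualify, a lower m favours GO and a higher m guards more strongly against going ahead with a poorly tolerated treatment.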

Example

• Tolerance (prob1) is expected to be 80%

• Tolerance (prob2) needs to be above 60%

• We will consider the treatment tolerable if more than 27 of the 40 patients tolerate it. If the true tolerance rate is 60% or less, there is at most a 13% chance that this will happen; if the true tolerance rate is 80%, there is a 96% chance that it will happen

R code

> pbinom(27, 40, 0.6, lower.tail=FALSE)

[1] 0.1285097

> pbinom(27, 40, 0.8, lower.tail=FALSE)

[1] 0.9567584
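
To see how the "more than 27 of 40" rule behaves at other true tolerance rates, the same upper-tail probability can be tabulated. This is a Python sketch using only the standard library. The function name `prob_go` and the grid of rates are illustrative choices, not part of the slides.

```python
from math import comb


def prob_go(true_rate: float, m: int = 27, n: int = 40) -> float:
    """Chance that more than m of n patients tolerate treatment at a given true rate."""
    return sum(
        comb(n, k) * true_rate**k * (1.0 - true_rate) ** (n - k)
        for k in range(m + 1, n + 1)
    )


# Tabulate the chance of a GO decision across a range of true tolerance rates.
for rate in (0.50, 0.60, 0.70, 0.80, 0.90):
    print(f"true tolerance {rate:.0%}: P(GO) = {prob_go(rate):.3f}")
```

The rows for 60% and 80% correspond to the two R results above.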

Dose (and other choices)

• It takes far fewer patients to pick the winner than to prove you have the winner

• Example: two doses, say 1 and 2, are 0.25 standard deviations apart

• N = 506 to achieve 80% power for a significant difference

• N = 46 to achieve 80% power to pick the better of the doses

• Use a sample size calculator with p = 0.5, one-sided
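
A rough check of these two sample sizes, assuming a two-group comparison of means with equal group sizes and a normal approximation. The function name and the choice of approximation are assumptions of this sketch. The slides do not say which calculator produced their figures.

```python
from math import ceil
from statistics import NormalDist


def total_n_two_groups(effect_size: float, alpha_one_sided: float, power: float) -> int:
    """Total patients for a two-group comparison of means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha_one_sided)
    z_beta = z.inv_cdf(power)
    # Per-group size for a difference of `effect_size` standard deviations.
    per_group = 2.0 * ((z_alpha + z_beta) / effect_size) ** 2
    return 2 * ceil(per_group)


# Prove a difference: two-sided 5% test, i.e. 2.5% in each tail.
print(total_n_two_groups(0.25, 0.025, 0.80))  # prints 504
# Pick the winner: one-sided alpha of 0.5, so z_alpha is 0.
print(total_n_two_groups(0.25, 0.5, 0.80))    # prints 46
```

The normal approximation gives 504 for the significance test, slightly below the 506 on the slide, which presumably comes from a t-based calculation. The pick-the-winner figure matches at 46.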

Efficacy

• Is this a pilot study or not?

– Pilot studies need to have a go/no-go rule and are not powered to achieve statistical significance

– Other studies need to have reasonable power on their primary endpoint

Efficacy for pilot studies

• Similar considerations as for tolerance

• What treatment difference do you expect? (Y1)

• What treatment difference would be unacceptable? (Y2)

• Choose the Go/No-Go cut-off

Example

• Effect size is 0.5

• With 50 patients we will have more than an 80% chance of achieving a one-sided p-value of less than 0.20 if the true effect size is 0.5

• This is symmetric: the type I and type II errors are each 20%

• Schoenfeld D. Statistical considerations for pilot studies. Int J Radiat Oncol Biol Phys. 1980 Mar;6(3):371-4
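
The 80% figure can be checked with a short Python sketch. It assumes the 50 patients are split into two equal groups of 25 and uses a normal approximation. Both are assumptions made here, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def pilot_go_probability(effect_size: float, total_n: int, alpha_one_sided: float) -> float:
    """Chance of a one-sided p-value below alpha with two equal groups (normal approximation)."""
    z = NormalDist()
    per_group = total_n / 2
    # Expected z-statistic when the groups differ by `effect_size` standard deviations.
    shift = effect_size * sqrt(per_group / 2)
    return z.cdf(shift - z.inv_cdf(1.0 - alpha_one_sided))


# True effect size 0.5: chance of a GO (p < 0.20).
print(round(pilot_go_probability(0.5, 50, 0.20), 2))  # prints 0.82
# No true effect: chance of a false GO equals the one-sided alpha.
print(round(pilot_go_probability(0.0, 50, 0.20), 2))  # prints 0.2
```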

Brief aside: what is an effect size?

• D = (difference in treatments) / standard deviation

• Large effect: D = 1; small effect: D = 0.25

• The problem is that D only translates into sample size when there are no covariates or baseline measurements

• The standard deviation used to measure effect is the population standard deviation

• The standard deviation used for calculating sample size is the standard deviation of the residuals based on the design

Can a pilot study be used to estimate the effect size?

• The problem is that the estimated effect size has a lot of error, so it can’t be relied upon

• Leon AC, Davis LL, Kraemer HC. The role and interpretation of pilot studies in clinical research. J Psychiatr Res. 2011 May;45(5):626-9

Example

• Effect size is 0.25

• The larger study needs ~500 patients

• We do a 40-patient pilot study

• The chance of a negative effect size is 20%

• There is a 20% chance of getting a sample size of 160 or less and a 30% chance of requiring 2000 patients
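
How noisy the pilot estimate is can be illustrated with a normal approximation. The sketch below assumes the 40 pilot patients are split 20 per group and that the larger study is planned for 80% power with a two-sided 5% test. These are assumptions of the sketch, so its outputs are approximations and need not reproduce the slide's rounded figures.

```python
from math import sqrt
from statistics import NormalDist

TRUE_EFFECT = 0.25  # true effect size from the example
PILOT_TOTAL = 40    # pilot patients, assumed split 20 per group

# Approximate standard error of the estimated effect size with two equal groups.
se = sqrt(2.0 / (PILOT_TOTAL / 2))
estimate = NormalDist(mu=TRUE_EFFECT, sigma=se)

z = NormalDist()
Z_SUM = z.inv_cdf(0.975) + z.inv_cdf(0.80)


def planned_total_n(estimated_effect: float) -> float:
    """Total sample size planned from a positive estimated effect size."""
    return 4.0 * (Z_SUM / estimated_effect) ** 2


p_negative = estimate.cdf(0.0)
print(f"true effect {TRUE_EFFECT}: planned total n = {planned_total_n(TRUE_EFFECT):.0f}")
print(f"chance the pilot estimate is negative = {p_negative:.2f}")

# Planned sample size at a few percentiles of the pilot estimate.
for q in (0.30, 0.50, 0.80):
    est = estimate.inv_cdf(q)
    print(f"{q:.0%} percentile: estimate = {est:.2f}, planned total n = {planned_total_n(est):.0f}")
```

The spread of planned sample sizes across percentiles is the point of the example: the same true effect can lead to plans that differ by more than an order of magnitude.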

What about the variance?

• Same example, but the variance is estimated in a 40-patient pilot study

• There is a 20% chance the sample size will be 380 or below and a 20% chance it will be above 564

• Overall, the power of the combined procedure will be 0.77, very close to the nominal 0.8

Aside: Adaptive Trials

• Do a single trial with two phases

• In phase I, estimate the sample size for phase II

• You need to correct the type I error, which is relatively easy

• You need to have a range of possible effect sizes

• You can stop at the end of phase I for futility or efficacy

• You need to show reasonable power for all effect sizes in the range

• You flatten the power curve at the cost of a possibly larger sample size

Aside: Why Phase III Trials Fail Testing Treatments That Were Effective in Phase II

Only a fraction of what we test is effective, say R

For instance, 10% of the things we test really work: R = 0.1

True positive: P(true) = 0.8 × R = 0.08

False positive: P(false) = 0.05 × (1 - R) = 0.045

The probability that a positive result is a true positive is

P(true | positive) = P(true) / (P(true) + P(false)) = 0.08 / (0.08 + 0.045) = 0.64

• There is a 36% chance that a positive phase II study is a false positive
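
The same arithmetic as a small Python sketch. The function and parameter names are illustrative, with `prior` standing for R, `power` for the 0.8 and `alpha` for the 0.05 used above.

```python
def positive_predictive_value(prior: float, power: float, alpha: float) -> float:
    """Probability that a positive study reflects a treatment that really works."""
    true_pos = power * prior            # works and the study is positive
    false_pos = alpha * (1.0 - prior)   # does not work but the study is positive
    return true_pos / (true_pos + false_pos)


ppv = positive_predictive_value(prior=0.10, power=0.80, alpha=0.05)
print(f"P(true | positive) = {ppv:.2f}, P(false | positive) = {1 - ppv:.2f}")
```

Rerunning with a smaller `prior` shows how quickly the share of false positives grows when fewer of the tested treatments really work.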


How I usually find sample sizes

• What sample size is feasible?

• What is important is often not clear for most measures

• What effect is reasonable?

– What effect was found in other studies, even if the treatment is quite different?

– What effect differentiates healthy patients from sick patients?

• I try to find the residual standard deviation, often back-calculated from a reported p-value

Do you really need a control group?

• Case 1: first-in-man study; the goal is safety

• In ALS they often assign 6 patients to active treatment for every 2 on placebo, in escalating doses

• The placebo patients give us no information whatsoever

• This is a 1 - (1 - p)^n situation, with p = 0.25

Do you need a control group?

• Case 2: activity in cancer

• Rule of 14: treat 14 patients with a new agent; if you see one response, the drug is active (Edmund Gehan)

• Again, 1 - (1 - 0.20)^14 ≈ 0.96
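
Where the number 14 comes from can be sketched in a few lines of Python. The 20% response rate is from the slide. The 5% cut-off for missing an active drug is an assumption made here, and the function name is illustrative.

```python
def smallest_n_to_rule_out(response_rate: float, miss_chance: float = 0.05) -> int:
    """Smallest n for which seeing zero responses in n patients has probability below miss_chance."""
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    if not 0.0 < miss_chance < 1.0:
        raise ValueError("miss_chance must be strictly between 0 and 1")
    n = 1
    # (1 - response_rate) ** n is the chance of no responses at all in n patients.
    while (1.0 - response_rate) ** n >= miss_chance:
        n += 1
    return n


print(smallest_n_to_rule_out(0.20))  # prints 14
```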

When control groups are important

• When there can be spontaneous improvement, a placebo response, or regression to the mean

• The rule of thumb is that a controlled study takes four times as many patients, but this is somewhat of an illusion
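
The factor of four follows from the variance of a difference of two means. The Python sketch below assumes the uncontrolled study compares its mean against a fixed reference value treated as known exactly, and uses a normal approximation. Both assumptions, and the function names, are introduced here for illustration.

```python
from statistics import NormalDist


def _z_sum(alpha_one_sided: float, power: float) -> float:
    z = NormalDist()
    return z.inv_cdf(1.0 - alpha_one_sided) + z.inv_cdf(power)


def single_arm_n(effect_size: float, alpha_one_sided: float = 0.025, power: float = 0.80) -> float:
    """Patients needed when one mean (variance sigma^2 / n) is compared with a fixed value."""
    return (_z_sum(alpha_one_sided, power) / effect_size) ** 2


def controlled_total_n(effect_size: float, alpha_one_sided: float = 0.025, power: float = 0.80) -> float:
    """Total patients when two arms are compared (variance of the difference is 2 * sigma^2 / m)."""
    per_arm = 2.0 * (_z_sum(alpha_one_sided, power) / effect_size) ** 2
    return 2.0 * per_arm


ratio = controlled_total_n(0.5) / single_arm_n(0.5)
print(f"controlled / single-arm = {ratio:.1f}")  # prints 4.0
```

Each arm needs twice as many patients because the difference has twice the variance, and there are two arms.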

[Figures: "Change in ALS by time" and "Power Trade-Off"]

Some references

• http://hedwig.mgh.harvard.edu/sample_size/size.html

• Binomial calculator: http://stattrek.com/online-calculator/binomial.aspx

• Schoenfeld D. Statistical considerations for pilot studies. International Journal of Radiation Oncology, Biology, Physics (1980) 6(3): 371-374

Now some statistics
• The most important formula for small studies
• You want things that will happen to happen to you
• $\Pr = 1 - (1 - p)^n$
• $\Pr$ = probability of it happening in your study
• $p$ = frequency of occurrence in the big study
• $n$ = size of the small study

Concerns that follow this rule
• Severe adverse events
• Unavoidable protocol violations
• Basically anything that needs to be anticipated
With a sample of 20 patients there will be at least an 87% chance of seeing at least one occurrence of any event that would occur with a frequency of 10% or more
• $n = \log(1 - \Pr) / \log(1 - p)$
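
The rule and its inverse can be sketched in R as follows. The function names `pr_at_least_one` and `n_for_detection` are illustrative and do not come from the talk, and the 90% target in the second call is an assumed example value.

```r
# Chance of seeing at least one event in n patients when each patient
# independently has the event with probability p.
pr_at_least_one <- function(p, n) {
  1 - (1 - p)^n
}

# Smallest n that gives at least probability pr of seeing one or more events.
n_for_detection <- function(pr, p) {
  ceiling(log(1 - pr) / log(1 - p))
}

pr_20 <- pr_at_least_one(p = 0.10, n = 20)     # the 20-patient example
n_90  <- n_for_detection(pr = 0.90, p = 0.10)  # assumed 90% target
print(c(pr_20 = pr_20, n_90 = n_90))
```

The first value reproduces the 20-patient figure; the second shows how many patients the same event rate would need for the assumed 90% target.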

Tolerance
• How many patients have to be able to tolerate a new treatment to make it feasible?
• Considerations
  – In a big study, what is the lowest acceptable tolerance?
• This should lead to a Go/No-Go (No-Fix/Fix) rule: if, in n patients, more than m tolerate the treatment, then GO; otherwise FIX

Basic Idea
[Figure: labels "Expected Tolerance" and "Tolerance Unacceptable"]

Some statistics
• Uses the binomial distribution
• http://stattrek.com/online-calculator/binomial.aspx
• prob1: your expected tolerance rate
• prob2: the lowest acceptable tolerance rate
• pbinom(m, n, prob1, lower.tail = FALSE) > 80-90%
• pbinom(m, n, prob2, lower.tail = FALSE) <= 10-20%

Example
• Tolerance (prob1) is expected to be 80%
• Tolerance (prob2) needs to be above 60%
• We will consider the treatment tolerable if more than 27 of the 40 patients tolerate it. If the true tolerance rate is 60% or less, there is at most a 13% chance that this happens; if the true tolerance rate is 80%, there is a 96% chance that it happens.

R code
> pbinom(27, 40, 0.6, lower.tail = FALSE)
[1] 0.1285097
> pbinom(27, 40, 0.8, lower.tail = FALSE)
[1] 0.9567584
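
A cut-off such as 27 out of 40 can be found by scanning all candidate values of m. The sketch below is one way to do that; `find_cutoffs` and its arguments are hypothetical names, and the 80% and 20% limits are picked from the ranges quoted above rather than prescribed by the talk.

```r
# Scan every cut-off m for a study of n patients and keep those where
#   P(more than m tolerate | prob1) is high (GO when tolerance is as expected)
#   P(more than m tolerate | prob2) is low  (rarely GO when tolerance is too low)
find_cutoffs <- function(n, prob1, prob2, min_go = 0.80, max_false_go = 0.20) {
  m <- 0:(n - 1)
  go_if_expected     <- pbinom(m, n, prob1, lower.tail = FALSE)
  go_if_unacceptable <- pbinom(m, n, prob2, lower.tail = FALSE)
  keep <- go_if_expected >= min_go & go_if_unacceptable <= max_false_go
  data.frame(m = m, go_if_expected, go_if_unacceptable)[keep, ]
}

cutoffs <- find_cutoffs(n = 40, prob1 = 0.8, prob2 = 0.6)
print(cutoffs)
```

If no row is returned for a given n, the study is too small to separate the two tolerance rates at the chosen limits.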

Dose (and other choices)
• It takes far fewer patients to pick the winner than to prove you have the winner
• Example: two doses, say 1 and 2, whose effects are 0.25 standard deviations apart
• N = 506 to achieve 80% power for a significant difference
• N = 46 to achieve 80% power to pick the better of the two doses
• Use a sample size calculator with a one-sided p = 0.5
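
The two sample sizes can be checked in R. This is a sketch under assumptions not spelled out on the slide: two equal groups, a two-sided 5% t-test for "proving" the difference (via `power.t.test` from base R), and a normal approximation for "picking the winner".

```r
d <- 0.25  # difference between the two doses, in standard deviations

# Proving a difference: two-sided 5% test with 80% power (t-test based).
n_prove <- power.t.test(delta = d, sd = 1, sig.level = 0.05, power = 0.80)$n

# Picking the winner: one-sided alpha = 0.5, so the critical value is 0 and
# only the power term remains (normal approximation).
n_pick <- 2 * ((qnorm(1 - 0.5) + qnorm(0.80)) / d)^2

# Both are per-group sizes; round up and double for the total N.
total <- c(prove = 2 * ceiling(n_prove), pick = 2 * ceiling(n_pick))
print(total)
```

Setting the one-sided level to 0.5 amounts to choosing whichever dose looks better, which is why the requirement drops by roughly a factor of eleven.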

Efficacy
• Is this a pilot study or not?
  – Pilot studies need to have a go/no-go rule and are not powered to achieve statistical significance
  – Other studies need to have reasonable power on their primary endpoint

Efficacy for pilot studies
• Similar considerations as for tolerance
• What treatment difference do you expect? (Y1)
• What treatment difference would be unacceptable? (Y2)
• Choose the Go/No-Go cut-off

Example
• Effect size is 0.5
• With 50 patients we will have more than an 80% chance of achieving a one-sided p-value of less than 0.20 if the true effect size is 0.5
• This is symmetric: the type I and type II errors are each 20%
• Schoenfeld D. Statistical considerations for pilot studies. Int J Radiat Oncol Biol Phys. 1980 Mar;6(3):371-4.
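
The 50-patient figure can be checked with a normal approximation. The sketch assumes a two-arm study with 25 patients per arm, which the slide does not state; `pilot_power` is an illustrative name.

```r
# Power of a two-arm pilot judged by a one-sided p-value cut-off,
# using the normal approximation with equal arms.
pilot_power <- function(n_total, effect_size, alpha_one_sided) {
  n_per_arm <- n_total / 2
  ncp <- effect_size * sqrt(n_per_arm / 2)  # expected z statistic
  pnorm(ncp - qnorm(1 - alpha_one_sided))
}

power_50 <- pilot_power(n_total = 50, effect_size = 0.5, alpha_one_sided = 0.20)
print(round(power_50, 3))
```

Under these assumptions the chance of a GO when the true effect size is 0.5 comes out slightly above 80%, while the chance of a GO when there is no effect is 20% by construction.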

Brief aside: what is an effect size?
• $D = (\text{difference in treatments}) / (\text{standard deviation})$
• Large effect: D = 1; small effect: D = 0.25
• The problem is that D only translates into sample size when there are no covariates or baseline measurements
• The standard deviation used to measure effect is the population standard deviation
• The standard deviation for calculating sample size is the standard deviation of the residuals based on the design
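
To illustrate why the two standard deviations differ, the sketch below assumes a design that adjusts for a baseline measurement correlated with the outcome. The correlation `rho = 0.7` is a made-up value, and the relation residual SD = SD × sqrt(1 − rho²) is the usual result for linear adjustment, not something taken from the slides.

```r
# Per-group sample size for a two-sided 5% test with 80% power
# (normal approximation), given a difference and the SD used in the design.
n_per_group <- function(difference, sd, alpha = 0.05, power = 0.80) {
  2 * ((qnorm(1 - alpha / 2) + qnorm(power)) * sd / difference)^2
}

pop_sd <- 1      # population SD, so the raw difference equals D
D      <- 0.25
rho    <- 0.7    # assumed baseline-outcome correlation (hypothetical)

resid_sd <- pop_sd * sqrt(1 - rho^2)  # SD left after adjusting for baseline

n_unadjusted <- n_per_group(D * pop_sd, pop_sd)
n_adjusted   <- n_per_group(D * pop_sd, resid_sd)
print(c(unadjusted = n_unadjusted, adjusted = n_adjusted))
```

The same D leads to about half the sample size once the residual standard deviation is used, which is the sense in which D alone does not determine the sample size.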

Can a pilot study be used to estimate the effect size?
• The problem is that the estimated effect size has a lot of error, so it can't be relied upon
• Leon AC, Davis LL, Kraemer HC. The role and interpretation of pilot studies in clinical research. J Psychiatr Res. 2011 May;45(5):626-9.

Example
• Effect size is 0.25
• The larger study needs ~500 patients
• We do a 40-patient pilot study
• The chance of a negative estimated effect size is 20%
• There is a 20% chance of getting a sample size of 160 or less and a 30% chance of requiring 2000 patients
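
The size of the error in a pilot estimate can be sketched with a normal approximation. The equal split into two arms of 20 and the approximation sqrt(2 / n per arm) for the standard error of the standardized difference are assumptions made for this illustration.

```r
true_d    <- 0.25
n_per_arm <- 20   # 40-patient pilot, assumed to be split equally

# Approximate standard error of the estimated standardized difference.
se_d <- sqrt(2 / n_per_arm)

# Chance the pilot estimate points the wrong way (estimate below zero).
p_negative <- pnorm(0, mean = true_d, sd = se_d)

# 20th and 80th percentiles of the pilot estimate.
d_range <- qnorm(c(0.20, 0.80), mean = true_d, sd = se_d)
print(round(c(se = se_d, p_negative = p_negative,
              d20 = d_range[1], d80 = d_range[2]), 3))
```

Because the planned sample size scales with one over the squared effect size, estimates spread this widely translate into planned sizes that range from far too small to impractically large.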

What about the variance?
• Same example, but the variance is estimated in a 40-patient pilot study
• There is a 20% chance the sample size will be 380 or below and a 20% chance it will be above 564
• Overall, the power of the combined procedure will be 0.77, very close to the nominal 0.8

Aside: Adaptive Trials
• Do a single trial with two phases
• In phase I, estimate the sample size for phase II
• You need to correct the type I error, which is relatively easy
• You need to have a range of possible effect sizes
• You stop at the end of phase I for futility or efficacy
• You need to show reasonable power for all effect sizes in the range
• You flatten the power curve at the cost of a possibly larger sample size

Aside: Why Phase III Trials Fail Testing Treatments That Were Effective in Phase II
Only a fraction of what we test is effective, say R
For instance, if 10% of the things we test really work, R = 0.1
True positive: $P(\text{true}) = 0.8 \times R = 0.08$
False positive: $P(\text{false}) = 0.05 \times (1 - R) = 0.045$
The probability that a positive result is a true positive is
$P(\text{true}) / (P(\text{true}) + P(\text{false})) = 0.08 / (0.08 + 0.045) = 0.64$
• There is a 36% chance that a positive phase II study is a false positive
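
The same arithmetic as a small R function, so that other values of R, power, and significance level can be tried. The function and argument names are illustrative.

```r
# Share of positive studies that are true positives.
positive_predictive_value <- function(prior, power, alpha) {
  true_pos  <- power * prior          # effective treatments that test positive
  false_pos <- alpha * (1 - prior)    # ineffective treatments that test positive
  true_pos / (true_pos + false_pos)
}

ppv <- positive_predictive_value(prior = 0.1, power = 0.8, alpha = 0.05)
print(c(true_positive = ppv, false_positive = 1 - ppv))
```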


How I usually find sample sizes
• What sample size is feasible?
• What is important? This is often not clear for most measures
• What effect is reasonable?
  – What effect was found in other studies, even if the treatment is quite different?
  – What effect differentiates healthy patients from sick patients?
• I try to find the residual standard deviation, often back-calculated from a reported p-value
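
A minimal sketch of the back-calculation, assuming the published p-value comes from a two-sided, two-sample t-test with known group sizes. The reported difference, p-value, and group sizes below are invented for illustration, and `sd_from_p_value` is a hypothetical helper.

```r
# Back-calculate the residual SD from a reported two-sided t-test p-value.
sd_from_p_value <- function(difference, p_value, n1, n2) {
  df     <- n1 + n2 - 2
  t_stat <- qt(1 - p_value / 2, df)    # |t| implied by the p-value
  se     <- abs(difference) / t_stat   # standard error of the difference
  se / sqrt(1 / n1 + 1 / n2)           # pooled (residual) SD
}

# Hypothetical published result: difference of 5 units, p = 0.04, 30 per arm.
sd_hat <- sd_from_p_value(difference = 5, p_value = 0.04, n1 = 30, n2 = 30)

# Round trip: this SD reproduces the reported p-value.
p_check <- 2 * pt(-5 / (sd_hat * sqrt(1 / 30 + 1 / 30)), df = 58)
print(c(sd_hat = sd_hat, p_check = p_check))
```

If the published analysis adjusted for covariates or used paired data, the SD recovered this way is already the residual SD for that design.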

Do you really need a control group?
• Case 1: First-in-man study, where the goal is safety
• In ALS they often use cohorts of 6 on active treatment to 2 on placebo, in escalating doses
• The placebo patients give us no information whatsoever
• This is a $1 - (1 - p)^n$ situation, with p = 0.25

Do you need a control group?
• Case 2: Activity in cancer
• Rule of 14: treat 14 patients with a new agent; if you see at least one response, the drug is active (Edmund Gehan)
• Again, $1 - (1 - 0.20)^{14} \approx 0.956$

When control groups are important
• When there can be spontaneous improvement, a placebo response, or regression to the mean
• The rule of thumb is that a controlled study takes four times as many patients, but this is somewhat of an illusion
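
One way to see where the factor of four comes from, under assumptions not stated on the slide: a common variance $\sigma^2$, $n$ patients per arm, and a single-arm study compared against a historical mean $\mu_0$ treated as known exactly.

```latex
\operatorname{Var}\left(\bar{X}_T - \mu_0\right) = \frac{\sigma^2}{n}
\qquad\text{versus}\qquad
\operatorname{Var}\left(\bar{X}_T - \bar{X}_C\right) = \frac{2\sigma^2}{n}
```

Matching the single-arm precision takes twice as many patients per arm, and there are two arms, hence four times the total. One reading of the "illusion" is that $\mu_0$ is never really known for the patients actually being studied, so the apparent saving of the uncontrolled design is overstated.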

[Figures: "Change in ALS by time"; "Power Trade Off"]

Some references
• http://hedwig.mgh.harvard.edu/sample_size/size.html
• Binomial calculator: http://stattrek.com/online-calculator/binomial.aspx
• Schoenfeld D. Statistical considerations for pilot studies. International Journal of Radiation Oncology, Biology, Physics (1980) 6(3): 371-374


Some statistics

• Uses the binomial distribution

• http://stattrek.com/online-calculator/binomial.aspx

• prob1: your expected tolerance rate

• prob2: the lowest expected tolerance rate

• pbinom(m, n, prob1, lower.tail=F) > 80-90%

• pbinom(m, n, prob2, lower.tail=F) <= 10-20%

Example

• Tolerance (prob1) is expected to be 80%

• Tolerance (prob2) needs to be above 60%

• We will consider the treatment tolerable if more than 27 out of the 40 patients tolerate the treatment. If the true tolerance rate is 60% or less, we will have a 13% chance or less that this would happen; if the true tolerance rate is 80%, there is a 96% chance that this will happen

R code

• > pbinom(27, 40, 0.6, lower.tail=FALSE)

• [1] 0.1285097

• > pbinom(27, 40, 0.8, lower.tail=FALSE)

• [1] 0.9567584
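To see how a cut-off such as 27 might be chosen, one can tabulate both probabilities for every possible cut-off m. This is a sketch, assuming n=40, prob1=0.8 and prob2=0.6 as in the example, and using the loosest targets from the "Some statistics" slide (at least 80% and at most 20%); the variable names are made up.

```r
n     <- 40
prob1 <- 0.8  # expected tolerance rate
prob2 <- 0.6  # lowest expected tolerance rate
m     <- 0:n  # declare "tolerable" if more than m patients tolerate

p_pass_good <- pbinom(m, n, prob1, lower.tail = FALSE)  # want >= 0.80
p_pass_bad  <- pbinom(m, n, prob2, lower.tail = FALSE)  # want <= 0.20

ok <- m[p_pass_good >= 0.80 & p_pass_bad <= 0.20]
ok  # cut-offs meeting both targets; 27 is among them
```

Tightening the targets to 90% and 10% shrinks the set of acceptable cut-offs and may leave none, which is a sign that n is too small.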

Dose (and other choices)

• It takes far fewer patients to pick the winner than to prove you have the winner

• Example: two doses, say 1 and 2, 0.25 standard deviations apart

• N=506 to achieve 80% power for a significant difference

• N=46 to achieve 80% power to pick the best of the doses

• Use a sample size calculator with p=0.5, one-sided
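The two sample sizes can be reproduced approximately in R. The sketch assumes two equal arms, a difference of 0.25 standard deviations, and a two-sided 5% test for the "significant difference" case; picking the winner is treated as a one-sided test at level 0.5, as the slide suggests. power.t.test comes from base R's stats package; the helper n_per_arm_normal is a hypothetical name and uses the normal approximation.

```r
d <- 0.25  # difference between doses in standard deviation units

# Prove a difference: two-sided 5% test, 80% power
n_signif     <- power.t.test(delta = d, sd = 1, sig.level = 0.05, power = 0.80)$n
total_signif <- 2 * ceiling(n_signif)  # 506

# Pick the winner: one-sided test at level 0.5, so qnorm(1 - 0.5) is 0
n_per_arm_normal <- function(d, alpha_one_sided, power) {
  2 * (qnorm(1 - alpha_one_sided) + qnorm(power))^2 / d^2
}
total_pick <- 2 * ceiling(n_per_arm_normal(d, 0.5, 0.80))  # 46
```

With the level set to 0.5 the critical value vanishes, so only the power term drives the sample size; that is why the second number is so much smaller.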

Efficacy

• Is this a pilot study or not?

– Pilot studies need to have a go-no-go rule and are not powered to achieve statistical significance

– Other studies need to have reasonable power on their primary endpoint

Efficacy for pilot studies

• Similar considerations as for tolerance

• What treatment difference do you expect? (Y1)

• What treatment difference would be unacceptable? (Y2)

• Choose the Go-No-Go cut-off

Example

• Effect size is 0.5

• With 50 patients we will have more than an 80% chance of achieving a one-sided p-value of less than 0.20 if the true effect size is 0.5

• This is symmetric: the type one and type two errors are each 20%

• Schoenfeld D. Statistical considerations for pilot studies. Int J Radiat Oncol Biol Phys. 1980 Mar;6(3):371-4
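A quick check of this example under the normal approximation. The sketch assumes the 50 patients are split into two equal arms of 25 and that "go" means a one-sided p-value below 0.20; neither detail is stated on the slide, and the variable names are made up.

```r
d         <- 0.5   # true effect size in standard deviation units
n_per_arm <- 25    # assumed: 50 patients in two equal arms
alpha     <- 0.20  # one-sided go threshold

# Chance of a "go" decision when the true effect size is d
power_go <- pnorm(d * sqrt(n_per_arm / 2) - qnorm(1 - alpha))
power_go  # a little above 0.80
```

Setting d to 0 in the same formula returns 0.20, the chance of a "go" when the treatment does nothing.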

Brief aside: what is an effect size?

• D = (Difference in treatments)/standard deviation

• Large effect D=1, small effect D=0.25

• The problem is that D only translates into sample size when there are no covariates or baseline measurements

• The standard deviation to measure effect is the population standard deviation

• The standard deviation for calculating sample size is the standard deviation of the residuals based on the design

Can a pilot study be used to estimate the effect size?

• The problem is that the estimated effect size has a lot of error, so it can't be relied upon

• Leon AC, Davis LL, Kraemer HC. The role and interpretation of pilot studies in clinical research. J Psychiatr Res. 2011 May;45(5):626-9

Example

• Effect size is 0.25

• The larger study needs ~500 patients

• We do a 40-patient pilot study

• The chance of a negative effect size is 20%

• There is a 20% chance of getting a sample size of 160 or less and a 30% chance of requiring 2000 patients
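The 20% figure for a negative estimate can be approximated as follows. The sketch assumes the 40 pilot patients are split into two arms of 20 and that the estimated effect size is roughly normal around the true value of 0.25 with standard error sqrt(2/20); these are assumptions, since the slide does not say how its numbers were obtained.

```r
d_true    <- 0.25
n_per_arm <- 20                   # assumed: 40 pilot patients in two equal arms
se_d      <- sqrt(2 / n_per_arm)  # approximate standard error of the estimate

# Chance the pilot estimate of the effect size comes out negative
p_negative <- pnorm(0, mean = d_true, sd = se_d)
p_negative  # about 0.21

# An 80% interval for the pilot estimate, showing how wide the spread is
qnorm(c(0.10, 0.90), mean = d_true, sd = se_d)
```

Because the planned sample size scales with 1/D^2, this spread in the estimate translates into a very wide range of implied sample sizes.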

What about the variance?

• Same example, but the variance is estimated in a 40-patient pilot study

• There is a 20% chance the sample size will be 380 or below and a 20% chance it will be above 564

• Overall the power of the combined procedure will be 0.77, very close to the nominal 0.8

Aside: Adaptive Trials

• Do a single trial with two phases

• In phase I, estimate the sample size for phase II

• You need to correct the type I error, which is relatively easy

• You need to have a range for possible effect sizes

• You stop at the end of phase I for futility or efficacy

• You need to show reasonable power for all effect sizes in the range

• You flatten the power curve at the cost of a possibly larger sample size

Page 16: Single Institution Studies: What You can do a and can’t · •Pr=1-(1-p)n •Pr=Probability it happening in your study •p=frequency of occurrence in big study •n=size of small

Example

bull Effect size is 5

bull With 50 patients we will have more than an 80 chance of achieving a one sided p-value of more than 020 if the true effect size is 5

bull This is symmetric the type one and two errors are each 20

bull Int J Radiat Oncol Biol Phys 1980 Mar6(3)371-4

bull Statistical considerations for pilot studies

bull Schoenfeld D

Brief aside what is an effect size

bull D=(Difference in treatments)standard deviationbull Large effect D=1 small effect D=025bull The problem is that D only translates into sample

size when there are no covariates or baseline measurements

bull The standard deviation to measure effect is the population standard deviation

bull The standard deviation for calculating sample size is the standard deviation of the residuals based on the design

Can a pilot study be used to estimate the effect size

bull The problem is that the estimated effect size has a lot of error so canrsquot be relied upon

bull The role and interpretation of pilot studies in clinical research Leon AC1 Davis LL Kraemer HC J Psychiatr Res 2011 May45(5)626-9

Example

• Effect size is 0.25

• The larger study needs ~500 patients

• We do a 40-patient pilot study

• The chance of a negative estimated effect size is 20%

• There is a 20% chance of getting a sample size of 160 or less and a 30% chance of requiring 2000 patients
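The instability in this example can be sketched with a normal approximation. The code assumes the 40 pilot patients are split 20 per arm, that the estimated effect size is roughly normal with standard error sqrt(2/20), and that the larger study is sized for a two-sided 0.05 test with 80% power. These are assumptions of the sketch, so its output need not match the slide's rounded figures exactly.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def total_n_for(effect_size, alpha=0.05, power=0.80):
    """Total patients (two equal arms) for a two-sided z-test."""
    k = (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)) ** 2
    return 4 * k / effect_size ** 2


true_d = 0.25
pilot_per_arm = 20                       # assumed 20 + 20 split of the pilot
se = sqrt(2 / pilot_per_arm)             # approximate SE of the estimated effect size
estimate = NormalDist(mu=true_d, sigma=se)

planned_n = total_n_for(true_d)          # size of the larger study if d were known
p_negative = estimate.cdf(0.0)           # chance the pilot points the wrong way

# Sample size implied by a moderately lucky pilot (80th percentile of the estimate)
lucky_d = estimate.inv_cdf(0.80)
n_if_lucky = total_n_for(lucky_d)

print(f"planned n with true d: {planned_n:.0f}")
print(f"P(estimated d < 0): {p_negative:.2f}")
print(f"n implied by 80th-percentile estimate: {n_if_lucky:.0f}")
```

Even a modestly optimistic pilot estimate implies a study several times too small, while a pessimistic one implies an impractically large study.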

What about the variance?

• Same example, but the variance is estimated in a 40-patient pilot study

• There is a 20% chance the sample size will be 380 or below and a 20% chance it will be above 564

• Overall the power of the combined procedure will be 0.77, very close to the nominal 0.8
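A small simulation shows why estimating the variance is far less damaging than estimating the effect size. The sketch assumes normal outcomes, a pooled variance estimate on 38 degrees of freedom (two arms of 20), and the same normal-approximation sample size formula as before. The slide does not give its exact method, so the percentiles produced here are only expected to be in the same neighbourhood as the quoted 380 and 564, not identical.

```python
import random
from statistics import NormalDist, quantiles

random.seed(0)  # fixed seed so the sketch is reproducible

Z = NormalDist()
true_sd = 1.0
difference = 0.25 * true_sd              # true effect size of 0.25
df = 40 - 2                              # pooled variance, two arms of 20 (assumed)
k = (Z.inv_cdf(0.975) + Z.inv_cdf(0.80)) ** 2


def planned_total_n():
    """Total sample size planned from one simulated pilot variance estimate."""
    # Pooled sample variance is true variance times chi-square(df) / df
    s2 = true_sd ** 2 * sum(random.gauss(0, 1) ** 2 for _ in range(df)) / df
    return 4 * k * s2 / difference ** 2


sizes = [planned_total_n() for _ in range(20000)]
cuts = quantiles(sizes, n=5)             # 20th, 40th, 60th, 80th percentiles
low, high = cuts[0], cuts[-1]
print(f"20th percentile: {low:.0f}, 80th percentile: {high:.0f}")
```

The planned sample size stays within roughly 20% of the target in most simulated pilots, in contrast to the order-of-magnitude swings seen when the effect size itself is estimated.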

Aside: Adaptive Trials

• Do a single trial with two phases

• In phase I, estimate the sample size for phase II

• You need to correct the type I error, which is relatively easy

• You need to have a range for possible effect sizes

• You stop at the end of phase I for futility or efficacy

• You need to show reasonable power for all effect sizes in the range

• You flatten the power curve at the cost of a possibly larger sample size

Aside: Why Phase III Trials Fail Testing Treatments That Were Effective in Phase II

Only a fraction of what we test is effective, say R

For instance, 10% of things we test really work: R=0.1

True Positive: P(true)=0.8×R=0.08

False Positive: P(false)=0.05×(1-R)=0.045

Probability that a positive result is a true positive is

P(true | positive)=P(true)/(P(true)+P(false))=0.08/(0.08+0.045)=0.64

• There is a 36% chance that a positive phase II study is a false positive
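The calculation above is a direct application of Bayes' rule and is easy to rerun with other inputs. In this sketch the function and parameter names are illustrative; the inputs are the slide's own values (R = 0.1, power 0.8, type I error 0.05).

```python
def positive_predictive_value(prior, power, alpha):
    """Probability that a positive trial reflects a truly effective treatment."""
    true_positive = power * prior          # effective and detected
    false_positive = alpha * (1 - prior)   # ineffective but significant by chance
    return true_positive / (true_positive + false_positive)


ppv = positive_predictive_value(prior=0.10, power=0.80, alpha=0.05)
print(f"P(true | positive) = {ppv:.2f}")
print(f"P(false | positive) = {1 - ppv:.2f}")
```

Lowering the prior R or the phase II power pushes the share of false positives up quickly, which is the point of the slide.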


How I usually find sample sizes

• What sample size is feasible?

• What is important? Often not clear for most measures

• What effect is reasonable?

– What effect was found in other studies, even if the treatment is quite different?

– What effect differentiates healthy patients from sick patients?

• I try to find the residual standard deviation, often back-calculated from a reported p-value
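One way to back-calculate a standard deviation from a published result is sketched below. It assumes the paper reports a difference between two equal-sized arms with a two-sided p-value from a test that is approximately a z-test; the numbers in the usage lines are hypothetical and are not taken from any study.

```python
from math import sqrt
from statistics import NormalDist


def residual_sd_from_p(difference, p_two_sided, n_per_arm):
    """Recover the SD implied by a reported difference and two-sided p-value.

    Assumes two equal arms and a test statistic that is roughly normal.
    """
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)  # |test statistic| implied by p
    standard_error = abs(difference) / z            # SE of the reported difference
    return standard_error / sqrt(2 / n_per_arm)     # SE = SD * sqrt(2 / n)


# Hypothetical report: difference of 2.0 units, p = 0.05, 50 patients per arm
sd = residual_sd_from_p(difference=2.0, p_two_sided=0.05, n_per_arm=50)
print(f"implied residual SD: {sd:.2f}")
```

Because the reported p-value comes from the analysis the authors actually ran, the SD recovered this way reflects their design (covariates included), which is what a sample size calculation needs.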

Do you really need a control group?

• Case 1: First-in-man study, goal is safety

• In ALS they often use 6 active treatments to 2 placebos in escalating doses

• The placebo patients give us no information whatsoever

• This is a 1-(1-p)^n situation, with p=0.25

Do you need a control group?

• Case 2: Activity in cancer

• Rule of 14: treat 14 patients with a new agent; if you see one response, the drug is active (Edmund Gehan)

• Again, 1-(1-0.20)^14 > 0.95
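The rule of 14 follows from the same 1-(1-p)^n formula used in the safety case. The sketch below evaluates it for a 20% response rate and searches for the smallest cohort that gives at least a 95% chance of seeing one or more responses; the function names are made up for this example.

```python
def prob_at_least_one(p, n):
    """Chance of seeing the event at least once in n independent patients."""
    return 1 - (1 - p) ** n


def smallest_n(p, target):
    """Smallest n for which the chance of at least one event reaches target."""
    n = 1
    while prob_at_least_one(p, n) < target:
        n += 1
    return n


rule_of_14 = prob_at_least_one(0.20, 14)
n_needed = smallest_n(0.20, 0.95)
print(f"P(at least one response in 14) = {rule_of_14:.3f}")
print(f"smallest n for a 95% chance: {n_needed}")
```

With 13 patients the chance falls just short of 95%, which is why the rule lands on 14.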

When control groups are important

• When there can be spontaneous improvement, a placebo response, or regression to the mean

• The rule of thumb is that a controlled study takes four times as many patients, but it is somewhat of an illusion
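One common way to arrive at the factor of four, offered here as an assumption since the slide does not spell it out, is to compare a single-arm study tested against a fixed reference value with a two-arm study of equal arms. The difference between two arm means has twice the variance of a single mean, and there are two arms to fill.

```python
from statistics import NormalDist


def _k(alpha, power):
    """Squared sum of normal quantiles used in both formulas below."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def n_single_arm(delta, sd, alpha=0.05, power=0.80):
    """Patients needed when comparing one arm with a fixed reference value."""
    return _k(alpha, power) * sd ** 2 / delta ** 2


def n_controlled_total(delta, sd, alpha=0.05, power=0.80):
    """Total patients for two equal arms detecting the same difference."""
    per_arm = 2 * _k(alpha, power) * sd ** 2 / delta ** 2
    return 2 * per_arm


ratio = n_controlled_total(0.5, 1.0) / n_single_arm(0.5, 1.0)
print(f"controlled / single-arm sample size: {ratio:.1f}")
```

The single-arm figure treats the reference value as known exactly. In practice that value carries its own uncertainty, which is one reading of why the apparent saving is "somewhat of an illusion".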

Change in ALS by time

[Figure: PowerTradeOff]

Some references

• http://hedwig.mgh.harvard.edu/sample_size/size.html

• Binomial Calculator

• http://stattrek.com/online-calculator/binomial.aspx

• Schoenfeld D. Statistical considerations for pilot studies. International Journal of Radiation Oncology, Biology, Physics (1980) 6(3):371-374
