
Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Date post: 06-Jan-2016
Upload: lonato
Transcript
Page 1: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Katheryne Downes, MPH

Statistical Data Analyst/Research Specialist

Office of Clinical Research/GME

Page 2

A Little Study Design Terminology: Descriptive Studies

1. Case Study: A single patient is reviewed in detail.

2. Case Series: Similar to the above; just expand the number to a small handful.

3. **Ecological Studies: Describe what’s going on at the population (summary) level. All data are collected at the same time; no individual data are collected.

4. **Cross-Sectional Studies: Similar in many ways to ecological studies, but examine individual-level data instead of population-level data.

** These studies have potential for some weak analytic statistics.

Page 3

A Little Study Design Terminology: Analytic Studies

1. **Cohort Studies: This study identifies people by their exposure status (yes/no) and follows them to determine whether they develop the outcome (yes/no). Great study for unusual or rare exposures.

a. Retrospective: Past… E ------ O?

b. Prospective: Present… E ------ O?

** Cohort studies are sometimes used for purely descriptive purposes when we aren’t sure what phenomenon may occur.

Page 4

A Little Study Design Terminology: Analytic Studies

2. Case-Control Studies: Identify subjects by their disease/outcome status and then look backward to determine if they had the exposure of interest.

E?----O

3. Randomized-Controlled Trials: Ah, yes… The Golden Child of research.

             Expose ---------------- Outcome?
Randomize --
             Don’t Expose ---------- Outcome?

Page 5

Study Design Quiz

A young epidemiologist (& budding statistician!) was assigned to investigate an outbreak of an unusual fungus in the lungs of patients undergoing bronchoscopy. There are about 15 patients, and she’ll need to do a thorough review of the patients’ records to gather information to determine how these events may have taken place. (She’ll eventually spend DAYS in the medical record department and countless hours crawling through ventilation duct work and the hospital roof…but that’s another story…)

Is it:
A: A prospective cohort study
B: A case-control study
C: A cross-sectional study
D: A case series
E: A study on crazy epidemiologists

Why did you select your answer?

Page 6

Recap: 15 patients underwent bronchoscopy and ended up with a really weird fungus growing in their lungs. In-depth review of charts.

A: A prospective cohort study- NO. We need both exposed & unexposed groups for

B: A case-control study- NO. We’d need both disease AND no disease groups.

C: A cross-sectional study- NO. We’d need everyone that underwent bronchoscopy

D: A case series- YES!!

E: A prospective study on crazy epidemiology interns- NO. It would be a case study on crazy epidemiology interns.

Page 7

Another Example…

A group of researchers is interested in whether VAP is associated with the use of a particular tube type. They begin by identifying all patients diagnosed with VAP in 2007 and also identify a similar group that did NOT develop VAP. They then look at the frequency of tube types among these two groups.

Is it:
A: A retrospective cohort study
B: A case-control study
C: A prospective cohort study

Why did you select your answer?

Page 8

Recap: Is VAP associated with a certain ET tube type? Start by looking at patients with and without VAP…then look at their tube type.

A: A retrospective cohort study- NO. Pts need to be identified by exposure status in a cohort study.

B: A case-control study- YES!!

C: A prospective cohort study- NO. Again, pts would be identified by exposure status.

Page 9

Almost done! (with this section, anyway)

A research group is interested in the impact of methadone use during pregnancy on baby outcomes. They have decided to follow a large group of pregnant women classified as either methadone users or non-users and will later gather information on GA, birthweight, Apgar scores, etc.

Is it:
A: A prospective cohort study
B: A case series
C: An ecological study
D: A case-control study

Why did you choose your answer?

Page 10

Recap: LOTS of pregnant women- some are on drugs. What happens to all the babies?

A: A prospective cohort study- YES!!!

B: A case series- NO. This is for small groups, unusual phenomena. Maybe a small group on extremely high doses?

C: An ecological study- NO. We have individual-level data here and we have a timeline.

D: A case-control study- NO. That’s identification by outcome status- we’re identifying on exposure status.

Page 11

We’ve now made it to… The Stats Section!

Questions So Far?

Page 12

Basic Stats: Data Types

*Data Types*

Categorical: The data have “categories” instead of numeric values. (ex: male/female, disease/no disease, red/orange/yellow)

Dichotomous: Categorical variable with only two possible categories.

Continuous: The variable can take on a range of possible values. (weight, bp, height, etc.)

Page 13

Categorical and Continuous Data

Remember…

Categorical data: yes/no, male/female, disease/no disease

Continuous data: weight, height, scores, blood values, etc.

Page 14

Drill!

BMI: Continuous
Disease (Yes/No): Categorical
Temperature: Continuous
Test Score (1-10): Continuous

Page 15

Drill!

Test (positive/negative): Categorical
Height: Continuous
Survival (months): Continuous
Gender: Categorical

Page 16

Descriptive Statistics

Page 17

Describing the data…

Data: 2, 7, 7, 8, 9, 11, 15

Mode: most frequently occurring number (7)

Mean: average (59/7 ≈ 8.4)

Median: put numbers in order, middle number or average of two middle numbers (8). AKA: 50th Percentile.
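The three summaries can be checked with a minimal sketch using Python's standard `statistics` module. The variable names are illustrative only; the data are the seven values listed on the slide.

```python
import statistics

# Data from the slide above.
data = [2, 7, 7, 8, 9, 11, 15]

mode = statistics.mode(data)      # most frequently occurring value
mean = statistics.mean(data)      # sum of the values divided by n
median = statistics.median(data)  # middle value once sorted (50th percentile)

print(f"mode={mode}, mean={mean:.2f}, median={median}")
```

Swapping in the drill data from the next slide is a quick way to check your answers.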

Page 18

Drill!

Data: 1, 1, 3, 5, 6

Mean, Median, Mode?

Mean: 3.2

Median: 3

Mode: 1

Page 19

Basic Stats: Descriptive Stats for continuous data

N, or n: We need to know how many people were in the sample. Results drawn from a sample with n=5 aren’t very likely to be reliable. However, a sample of n=100 will make you feel a little more comfortable.

Central tendency: Mean, median, mode

Variation: Standard deviation, variance, standard error
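As a rough illustration of the three measures of variation, here is a small sketch using only the Python standard library. It reuses the seven values from the "Describing the data" slide purely as example input, and it assumes the usual sample (n - 1) definitions, which the slides do not spell out.

```python
import math
import statistics

# Illustrative data, reused from the "Describing the data" slide.
sample = [2, 7, 7, 8, 9, 11, 15]

n = len(sample)
variance = statistics.variance(sample)  # sample variance (divides by n - 1)
sd = statistics.stdev(sample)           # standard deviation = sqrt(variance)
se = sd / math.sqrt(n)                  # standard error of the mean

print(f"n={n}, variance={variance:.2f}, SD={sd:.2f}, SE={se:.2f}")
```

The SD describes the spread of individual values, while the SE describes the uncertainty in the mean, so the SE shrinks as n grows.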

Page 20

Descriptive stats: Continuous

Normally distributed?

Normal: mean, SD

Not Normal: median, range or 95% CI

* Special Case: Survival data are usually described with median and confidence interval.

Page 21

The Empirical Rule

How do we know if a distribution is “normal”??

-Visual Inspection (boxplots are very helpful)

-Kolmogorov-Smirnov (sorry, no vodka involved)

-Other tests
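The slide's figure did not survive in this transcript. For reference, the empirical rule named in its title says that in a normal distribution roughly 68%, 95% and 99.7% of values fall within 1, 2 and 3 standard deviations of the mean. A minimal sketch that reproduces those proportions with the standard library (the helper name `share_within` is made up for this example):

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Proportion of a normal distribution lying within k SDs of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

Data whose observed proportions are far from these values are a hint that the distribution may not be normal; a formal test such as Kolmogorov-Smirnov needs a statistics package and is not shown here.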

Page 22

Basic Descriptive Stats for Categorical Data

Remember- you can’t take an average of yes/no (maybe?). (Well, some people have tried to put that in papers…)

So, how do we describe categorical data?
– N, or n
– Frequencies
– Percentages
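A short sketch of how n, frequencies and percentages might be tabulated for a categorical variable. The ten yes/no responses are made-up data for illustration only.

```python
from collections import Counter

# Hypothetical yes/no outcomes for ten patients (made-up data).
responses = ["yes", "no", "no", "yes", "no", "no", "no", "yes", "no", "no"]

n = len(responses)
frequencies = Counter(responses)  # count of each category
percentages = {cat: 100 * count / n for cat, count in frequencies.items()}

for cat, count in frequencies.items():
    print(f"{cat}: {count}/{n} ({percentages[cat]:.0f}%)")
```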

Page 23

Question:

Best approximation of the actual value for non-normally distributed data?

A: mean +/- standard error of the mean

B: median +/- standard deviation

C: median +/- confidence interval

Page 24

Before we move on to statistical tests- some basic terminology…

Independent Variable: a predictor, a variable of interest

Dependent Variable: the thing you’re trying to predict or the outcome of interest

Ex: I’m conducting a study to determine whether administering antibiotic “x” approximately 12hrs before surgery reduces post-operative infection rates.

What’s the independent variable?

What’s the dependent variable?

Page 25

Another Example…

Does serum albumin level pre-surgery affect the 90-day survival of patients receiving an LVAD?

What’s the dependent variable?

What’s the independent variable?

Page 26

Statistical Tests: Continuous

(Student’s) T-test: compares 2 groups on a continuous variable

Paired t-test: compares 1 group, before and after, on a continuous variable

ANOVA: compares 3+ groups on a continuous variable

*Post-hoc tests REQUIRED*
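To make the two-group comparison concrete, here is a sketch of the pooled-variance (Student's) t statistic computed by hand with the standard library. The function name and the BMI values are hypothetical. Turning the statistic into a p-value needs the t distribution, which the standard library does not provide, so that step is left to a statistics package.

```python
import math
import statistics


def students_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df


# Made-up BMI values for two groups.
placebo = [31.0, 29.5, 33.2, 30.1, 32.4]
drug = [28.9, 27.5, 30.2, 29.0, 28.1]
t_stat, df = students_t(placebo, drug)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```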

Page 27

How the Guinness Brewery Changed History…

Page 28

Statistical Tests: Non-Parametric (the rebels!)

Mann-Whitney U: compares 2 groups on a continuous variable (non-parametric version of t-test)

Wilcoxon Signed Ranks: compares 1 group, before and after, on a continuous variable (non-parametric version of paired t-test)

Kruskal-Wallis: compares 3+ groups on a continuous variable (non-parametric version of ANOVA)

*Post-hoc tests REQUIRED*

Page 29

Statistical tests: Categorical

Chi-Square: used with categorical data with expected cell values 5+

McNemar: paired proportions

Fisher Exact: categorical data with expected cell values <5.

Page 30

Details, Details….

If all expected cell values are “5” or higher, you can use the Chi-Square.

If you have at least one cell with an expected value below “5”, you should use the Fisher Exact test.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: Umm, what’s a “cell?” Phospholipid bi-layers!?!?

A: In this case, a cell refers to one of the compartments of this 2x2 table:

          Group 1   Group 2
Males     5         4
Females   16        10
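The earlier "Statistical tests: Categorical" slide states the rule in terms of expected cell values, which are computed from the row and column totals rather than read directly off the table. A minimal sketch, using the counts from the 2x2 table above (the function and variable names are illustrative):

```python
def expected_counts(table):
    """Expected cell counts for a two-way table, assuming no association."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count = (row total * column total) / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed counts from the 2x2 table (rows: males, females).
observed = [[5, 4], [16, 10]]
expected = expected_counts(observed)
smallest = min(min(row) for row in expected)
suggested = "Fisher exact" if smallest < 5 else "Chi-square"
print(f"smallest expected count = {smallest:.1f}, suggested test: {suggested}")
```

For this table the smallest expected count falls below 5, so the rule of thumb points to the Fisher exact test.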

Page 31

Analytic Statistics for Categorical variables

Q: But what about normality & that crazy Kolmo-whatchamacallit vodka test??!?!

A: The tests for categorical variables don’t have any normality assumptions built in, so your data can look as crazy as can be and you will be fine!

Page 32

Drill !

A study is being conducted to evaluate the effectiveness of a new diet pill. There are two groups- one is receiving a placebo, the other the experimental drug. The outcome is BMI and is assumed to be normally distributed.

Q: What type of data?
A: Continuous

Q: How would you summarize the data?
A: Mean +/- SD

Q: What type of statistical test would you run?
A: Two group t-test

Page 33

Other Stats: Relative Risk/Odds Ratios

Relative Risk: Used in cohort studies when you have the incidence. RR = (IR in exposed) / (IR in unexposed) = [a/(a+b)] / [c/(c+d)]

Odds Ratio: Used in case-control studies to approximate relative risk. OR = ad/bc

      D    ND
E     a    b
NE    c    d

(D = disease, ND = no disease, E = exposed, NE = not exposed)

Page 34

Drill !

A group of patients is identified based on their exposure status to the H1N1 vaccine. They are being followed to determine whether they successfully develop antibodies to the novel virus.

Q: What type of study is this?

Q: What type of data does the outcome represent?

Q: Name two statistical tests that could be used to evaluate this association.

A: Prospective Cohort Study

A: Categorical (yes/no for outcome)

A: Chi-square (or Fisher exact) and Relative Risk

Page 35

Drill continued…

So, here’s the data:

        O     NO    Total
E       90    10    100
NE      10    40    50
Total   100   50

Calculate the RR !!!

RR = (90/100) / (10/50) = 4.5
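The same calculation as a small sketch, using the a/b/c/d cell labels from the Relative Risk/Odds Ratios slide. The function name is illustrative.

```python
def relative_risk(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)] for a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


rr = relative_risk(a=90, b=10, c=10, d=40)
print(f"RR = {rr:.1f}")
```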

Page 36

Drill !

A group of patients is classified based on whether they have stomach cancer or not. They are then asked questions about their hot pepper consumption habits in the past 5 years (high consumption vs. low consumption).

Q: What type of study is this?

Q: What type of statistics could be used to evaluate the association?

A: Case-Control Study

A: Chi-Square/Fisher Exact or Odds Ratio

Page 37

Drill continued…

So, here’s the data:

        O     NO    Total
E       90    10    100
NE      10    40    50
Total   100   50

Calculate the OR !!!

OR = (90*40) / (10*10) = 36
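A matching sketch for the odds ratio, again with the a/b/c/d cell labels from the Relative Risk/Odds Ratios slide (the function name is illustrative):

```python
def odds_ratio(a, b, c, d):
    """OR = (a * d) / (b * c), the cross-product ratio of a 2x2 table."""
    return (a * d) / (b * c)


or_value = odds_ratio(a=90, b=10, c=10, d=40)
print(f"OR = {or_value:.0f}")
```

Note how far the OR sits from the RR computed on the same counts: the odds ratio only approximates the relative risk well when the outcome is rare, which is not the case in this table.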

Page 38

Regression Analysis

Regression Analysis?

– Everything we’ve looked at so far is termed “univariate analysis”, meaning we just look at the effect of ONE variable at a time. But what if there are a lot of different risk factors? What if they interact with each other?

– Regression analysis is used when we want to look at the complex interaction between different predictive variables on the outcome of interest. This analysis allows us to determine the effect of each variable on the outcome when ALL the others are controlled.

Page 39

Regression Analysis…

Regression Type is based on OUTCOME type (not predictor variables)

– Two Basic Types
  LOGISTIC Regression: Outcome is “dichotomous”
  LINEAR Regression: Outcome is “continuous”

– In both types of regression, you can enter BOTH continuous and categorical predictors.

Page 40

Hypothesis Testing…

Null hypothesis: assumes that all the groups will behave similarly- no meaningful differences.

Alternate hypothesis: There IS a difference

– One-sided: Group A is better than B
– Two-sided: Group A is different from B

Note: This is the main type of hypothesis testing. There are some variations in which logic is flipped on its head: equivalence testing & non-inferiority testing are just two of them…

Page 41

Hypothesis Testing

Test Result ↓ / Reality →    No Difference           Difference
Fail to Reject Null          CORRECT                 Type II Error (beta)
Reject Null                  Type I Error (alpha)    CORRECT

Page 42

Hypothesis Testing

Type I Error: Incorrectly reject the null. Its probability is alpha (usually set at 0.05 or 0.01).

Type II Error: Incorrectly fail to reject the null. Its probability is beta (1 - beta = power; power is commonly set at 80%). Common causes:

– 1: Sample size too small !!!
– 2: Observed difference was smaller than specified difference

P-value: probability of observing a result at least as extreme as the one seen if chance alone were at work (that is, if the null hypothesis were true).

Page 43

Drill !

Large randomized multicenter trial where no difference is seen. Why?

A: Inclusion criteria too strict
B: Populations too different because of different centers
C: The clinical difference is smaller than the expected difference

Page 44

Hypothesis Testing: 95% CI

95% CI: provides a range of plausible values for the true value. In hypothesis testing, we’re looking for a certain value in the interval that corresponds to the null…

Sooooo….in Relative Risk or Odds Ratios, we’re looking at the ratio of risks (or odds) for two groups.

Q: If the risk is the same between the two groups, the ratio = ?

Q: What value are we looking for in the associated 95% CI?

A: Yes, we’re looking for the value of “1”. If that value is in the confidence interval, then “no difference” is in the range of plausible true values and the result wouldn’t be significant.
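To show what "looking for 1 in the interval" means in practice, here is a sketch that puts a 95% CI around the relative risk from the earlier cohort drill (90/100 exposed vs. 10/50 unexposed). The slides do not say how the interval is computed; this example assumes the common log-scale (Wald-type) approximation, and the function name is made up.

```python
import math
from statistics import NormalDist


def relative_risk_ci(a, b, c, d, level=0.95):
    """Relative risk with a log-scale (Wald-type) confidence interval."""
    rr = (a / (a + b)) / (c / (c + d))
    # Approximate standard error of ln(RR).
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


rr, lower, upper = relative_risk_ci(a=90, b=10, c=10, d=40)
significant = not (lower <= 1 <= upper)  # is the null value 1 outside the CI?
print(f"RR = {rr:.1f}, 95% CI {lower:.2f} to {upper:.2f}, significant: {significant}")
```

Because the whole interval sits above 1 for these counts, "no difference" is not among the plausible values and the result would be called significant.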

Page 45

Hypothesis Testing: 95% CI

What about a paired t-test?

Q: What type of data is the test used for?

Q: What’s the null value in this case?

Q: So, what value are we looking for in the CI?

A: Remember, this is generally used for before/after tests. So, if before = after, then after - before = 0. Therefore, we’re looking for a value of “0” in the CI. If we find it, the result is considered non-significant.

Page 46

Trials and Studies

RCT: Reduces bias, evens distribution of confounding factors, but sometimes can’t be used.

Double Blind: Neither the doctor nor the patient knows what the patient is getting. Reduces observational bias.

Cohort Study: Patients are identified by exposure status and followed for outcome.

Case-Control Study: Patients are identified by outcome status (case or control), and we look back for exposures.

Page 47

Drill !

For a cohort study, what type of ratio can be calculated?

A: Relative Risk

For a case-control study, what type of ratio can be calculated?

A: Odds Ratio

Page 48: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Drill!

What’s the formula for Relative Risk?

A: IR in exposed/IR in unexposed

What’s the formula for Odds Ratio?

A: ad/bc
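As a rough illustration of the two formulas, here is a short Python sketch. It assumes the usual 2x2 layout, which the slides do not spell out for this case: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. It also uses cumulative incidence (cases divided by group size) as the "IR" in each group. The counts and function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Incidence in the exposed divided by incidence in the unexposed."""
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc."""
    return (a * d) / (b * c)


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop disease.
a, b, c, d = 30, 70, 10, 90
print(f"RR = {relative_risk(a, b, c, d):.2f}")
print(f"OR = {odds_ratio(a, b, c, d):.2f}")
```

A value of 1 for either ratio means no association between exposure and outcome, which is why 1 is the null value to look for in a CI around an RR or OR.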

Page 49: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Misc…

Meta-Analysis: combines the data from several different studies. Often used when individual sample sizes are too small and underpowered. Be careful when the studies are too different from each other.

Prevalence: # of current cases/total population

Incidence: # of new cases/total population at risk

Page 50: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Test Diagnostics

        D        ND       Total
+       9 (a)    9 (b)    18
-       1 (c)    81 (d)   82
Total   10       90       100

Sensitivity: positive/all diseased, a/(a+c) = 90%

Specificity: negative/all not diseased, d/(b+d) = 90%

PPV: diseased/all positive, a/(a+b) = 50%

NPV: no disease/all negative, d/(c+d) = 98.8%

Accuracy: correct results/all, (a+d)/(a+b+c+d) = 90%

Page 51: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Drill!

        D        ND       Total
+       10 (a)   15 (b)   25
-       10 (c)   30 (d)   40
Total   20       45       65

Calculate Sensitivity, Specificity, PPV, NPV, Accuracy.

Sensitivity: (10/20) 50%

Specificity: (30/45) 66.7%

PPV: (10/25) 40%

NPV: (30/40) 75%

Accuracy: ((10+30)/65) 61.5%

Page 52: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Sample Question

Prevalence of disease is 20%. Test is 80% sensitive and specific. What is the likelihood that a positive test is correct?

First- What is this question asking for?

A: Positive Predictive Value (PPV). So, we’re going to be reading across that first row…

Second- How do we set up this table?

Page 53: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Steps to the answer…

Step 1: Draw the basic table with the correct orientation.

        D        ND
+
-

Page 54: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Steps to the answer…

Step 2: Begin with the prevalence they gave you…use EASY numbers

“Prevalence of disease is 20%”

        D        ND       Total
+
-
Total   20       80       100

Page 55: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Steps to the answer…

Step 3: Use the other information to fill in the table…

“Test is 80% sensitive and specific.”

        D        ND       Total
+       16       16       32
-       4        64       68
Total   20       80       100

Page 56: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Steps to the answer…

Step 4: Answer the question! What are we looking for?

PPV!

        D        ND       Total
+       16       16       32
-       4        64       68
Total   20       80       100

PPV = Disease/all positives, a/(a+b)

16/32 = 50%
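The four steps can be written out as a short Python sketch. It mirrors the table-filling approach above, starting from an "easy" population of 100; the function and parameter names are hypothetical and not part of the original slides.

```python
def ppv_from_prevalence(prevalence, sensitivity, specificity, population=100):
    """Fill in the 2x2 table from prevalence, sensitivity and specificity,
    then read the PPV across the first (test-positive) row."""
    # Step 2: column totals from the prevalence.
    diseased = prevalence * population
    not_diseased = population - diseased
    # Step 3: split each column using sensitivity and specificity.
    a = sensitivity * diseased        # test +, diseased
    d = specificity * not_diseased    # test -, not diseased
    b = not_diseased - d              # test +, not diseased
    # Step 4: PPV = a / (a + b).
    return a / (a + b)


# The sample question: prevalence 20%, test 80% sensitive and 80% specific.
ppv = ppv_from_prevalence(prevalence=0.20, sensitivity=0.80, specificity=0.80)
print(f"PPV = {ppv:.0%}")
```

The choice of population size does not change the result, since it cancels out of a/(a+b); 100 simply keeps the arithmetic easy to do by hand.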

Page 57: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

QuestionsQuestions

Page 58: Katheryne Downes, MPH Statistical Data Analyst/Research Specialist Office of Clinical Research/GME

Thank you!

