Group Comparisons Part 3:
Nonparametric Tests, Chi-squares and Fisher Exact

Robert Boudreau, PhD
Co-Director of Methodology Core
PITT-Multidisciplinary Clinical Research Center for Rheumatic and Musculoskeletal Diseases

Core Director for Biostatistics
Center for Aging and Population Health
Dept. of Epidemiology, GSPH

Flow chart for group comparisons

Measurements to be compared:

  Continuous
    Distribution approx normal or N ≥ 20?
      No  => Non-parametrics
      Yes => T-tests

  Discrete (binary, nominal, ordinal with few values)
    => Chi-square, Fisher’s Exact
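The flow chart can be read as a small decision function. The sketch below is only an illustration: the function name, argument names and return strings are made up here, and `n` is assumed to be whatever sample size the "N ≥ 20" box refers to.

```python
def choose_test(measurement_type, approx_normal=False, n=0):
    """Return the family of tests the flow chart points to.

    measurement_type: "continuous" or "discrete" (hypothetical labels).
    approx_normal:    True if the distribution looks roughly normal.
    n:                sample size used for the "N >= 20" check (assumption).
    """
    if measurement_type == "continuous":
        # Continuous branch: t-tests if roughly normal OR the sample is large enough.
        if approx_normal or n >= 20:
            return "t-test"
        return "nonparametric"
    if measurement_type == "discrete":
        # Binary, nominal, or ordinal with few values.
        return "chi-square or Fisher's exact"
    raise ValueError("measurement_type must be 'continuous' or 'discrete'")


print(choose_test("continuous", approx_normal=False, n=10))
print(choose_test("continuous", approx_normal=False, n=2928))
print(choose_test("discrete"))
```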

A physiologic index of comorbidity – relationship to mortality and disability.

Anne B. Newman, MD, MPH, Robert M. Boudreau, PhD, Barbara L. Naydeck, MPH, Linda F. Fried, MD, MPH and Tamara B. Harris, MD, MS

J Gerontol Med Sci. 2008

5 Physiologic System Measures

  Cystatin C
  Internal Carotid Artery Wall Thickness (ICA)
  Pulmonary: Forced Vital Capacity (FVC)
  Fasting Glucose
  White Matter Grade

N=2928 elderly participants in longitudinal cohort study

0-2 scale on each: 0=healthiest, 2=worst
  tertiles or clinical cutpoints (e.g. glucose <100, 100-126, 126+)

Physiologic Index = sum (range = 0 to 10)
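As a rough illustration of the scoring, the sketch below scores glucose with the example cutpoints and sums five component scores. The function names are hypothetical, and scoring a value of exactly 126 as 2 is an assumption, since the slide lists both "100-126" and "126+".

```python
def glucose_score(glucose):
    """Score fasting glucose on the 0-2 scale using the example cutpoints."""
    if glucose < 100:
        return 0
    if glucose < 126:
        return 1
    return 2  # assumption: exactly 126 falls in the worst category


def physiologic_index(component_scores):
    """Sum five component scores (each 0, 1 or 2) into the 0-10 index."""
    if len(component_scores) != 5:
        raise ValueError("expected one score per physiologic system (5 total)")
    if any(score not in (0, 1, 2) for score in component_scores):
        raise ValueError("each component score must be 0, 1 or 2")
    return sum(component_scores)


# Hypothetical participant: the four non-glucose scores are invented for illustration.
scores = [1, 0, 2, glucose_score(110), 0]
print(physiologic_index(scores))
```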

* Mortality rates based on 9 yrs followup

Comparisons using 2-sample independent t-tests?

Comparisons using chi-square? (categorical)

Pooled or Unequal Variance 2-sample T-test?

Pooled: df = (1237-1) + (1691-1) = 2926

Unequal Vars (Satterthwaite)
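For reference, the two versions of the test can be written out as follows. These are the standard textbook forms rather than anything shown on the slides; the symbols (sample means, sample standard deviations s_1 and s_2, group sizes n_1 and n_2) are notation introduced here.

```latex
\[
t_{\text{pooled}} = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\tfrac{1}{n_1}+\tfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2},
\qquad
df_{\text{pooled}} = (n_1-1)+(n_2-1)
\]
\[
t_{\text{unequal}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\tfrac{s_1^2}{n_1}+\tfrac{s_2^2}{n_2}}},
\qquad
df_{\text{Satterthwaite}} =
\frac{\left(\tfrac{s_1^2}{n_1}+\tfrac{s_2^2}{n_2}\right)^2}
     {\tfrac{(s_1^2/n_1)^2}{n_1-1}+\tfrac{(s_2^2/n_2)^2}{n_2-1}}
\]
```

With n_1 = 1237 and n_2 = 1691 the pooled df is 2926, as on the slide. The Satterthwaite df is generally not an integer and is never larger than the pooled df.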

2-Sample T-test; Non-parametric: Wilcoxon Rank-Sum Test

Three-dimensional and thermal surface imaging produces reliable measures of joint shape and temperature: a potential tool for quantifying arthritis

Steven J Spalding, C Kent Kwoh, Robert Boudreau, Joseph Enama, Julie Lunich, Daniel Huber, Louis Denes and Raphael Hirsch

Arthritis Research & Therapy 2008

Will focus on HDI

Heat Distribution Index = SD of temps in standard reproducibly defined region

HDI of MCPs: RA vs Controls

MCP Region

HDI (Heat Distribution Index) of MCPs
10 adult controls vs 9 adults with active RA


T-test (2-sample independent) vs Wilcoxon Rank-Sum (aka Mann-Whitney)

          Control (n=10)   Arthritis (n=9)
               1.2              1.4
               1.1              2.4
               1.0              2.3
               1.2              2.1
               0.6              3.0
               0.5              1.1
               1.0              1.4
               1.0              1.3
               1.3              1.1
               1.2
Mean           1.01             1.79
SD             0.26             0.70
Median         1.05             1.40


T-test (2-sample independent)

T-Tests

Variable   Method          Variances   DF     t Value   Pr > |t|
HDI        Pooled          Equal       17     3.36      0.0037
HDI        Satterthwaite   Unequal     10.2   3.23      0.0089

Test for Equality of Variances

Variable   Method     Num DF   Den DF   F Value   Pr > F
HDI        Folded F   8        9        6.60      0.0105

“pooled” df = 10+9-2 = 17


Test of equality of variances is rejected

=> Use Unequal Variance t-test (Satterthwaite)
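The t statistics, degrees of freedom and folded F in the output can be recomputed from the HDI values listed earlier. The sketch below uses only the Python standard library; the function and variable names are hypothetical. It stops at the test statistics, because the standard library has no t or F distribution for the p-values.

```python
import math
import statistics

# HDI values as listed in the data table (controls first, then arthritis).
control = [1.2, 1.1, 1.0, 1.2, 0.6, 0.5, 1.0, 1.0, 1.3, 1.2]
arthritis = [1.4, 2.4, 2.3, 2.1, 3.0, 1.1, 1.4, 1.3, 1.1]


def two_sample_t(x, y):
    """Pooled and Satterthwaite t statistics with their df, plus the folded F."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # sample variances
    diff = statistics.mean(y) - statistics.mean(x)

    # Pooled (equal variance) version.
    df_pooled = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df_pooled
    t_pooled = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Unequal variance version with Satterthwaite degrees of freedom.
    a, b = v1 / n1, v2 / n2
    t_unequal = diff / math.sqrt(a + b)
    df_unequal = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

    # Folded F: larger sample variance over the smaller one.
    folded_f = max(v1, v2) / min(v1, v2)

    return {
        "t_pooled": t_pooled,
        "df_pooled": df_pooled,
        "t_unequal": t_unequal,
        "df_unequal": df_unequal,
        "folded_f": folded_f,
    }


result = two_sample_t(control, arthritis)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The printed values can be compared with the Pooled, Satterthwaite and Folded F rows of the output above.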


Wilcoxon Rank-Sum (aka Mann-Whitney)

The idea/motivation:
  Method should work for any distribution (non-parametric)
  Base statistical test on ranks
    rank = order when all data are sorted from lowest to highest
    each group then gets a “rank sum”
  Largely unaffected by outliers (an extreme value only counts as the highest or lowest rank)
  Like all statistical tests, p-value is based on a distribution (of the difference in rank-sums here), assuming there is no difference between the groups



Base statistical test on ranks
  each group gets a “rank sum”
  p-value is based on the distribution of the difference in rank-sums, assuming there is no difference between the groups
    just like shuffling cards (with only two colors on cards; even if different n’s)
  the critical values are the “extreme” differences in rank-sums between the two groups
    (α = 0.05 => the most extreme 5% of differences)

Sorted then assigned ranks

Obs   group       HDI   HDI_rank
  1   Control     0.5     1.0
  2   Control     0.6     2.0
  3   Control     1.0     4.0
  4   Control     1.0     4.0
  5   Control     1.0     4.0
  6   Control     1.1     7.0
  7   Arthritis   1.1     7.0
  8   Arthritis   1.1     7.0
  9   Control     1.2    10.0
 10   Control     1.2    10.0
 11   Control     1.2    10.0
 12   Control     1.3    12.5
 13   Arthritis   1.3    12.5
 14   Arthritis   1.4    14.5
 15   Arthritis   1.4    14.5
 16   Arthritis   2.1    16.0
 17   Arthritis   2.3    17.0
 18   Arthritis   2.4    18.0
 19   Arthritis   3.0    19.0

Tied values get the average rank (e.g. observations 12 and 13, both HDI = 1.3, each get rank 12.5).



Wilcoxon Scores (Rank Sums) for Variable HDI
Classified by Variable Group

Group       N    Sum of Scores   Expected Under H0   Std Dev Under H0   Mean Score
Control     10       64.50            100.0             12.172013         6.45000
Arthritis    9      125.50             90.0             12.172013        13.94444

Average scores were used for ties.
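The ranking step and the rank-sum table can be reproduced with a short sketch. The helper name `midranks` and the variable names are hypothetical. The standard deviation under H0 uses the usual tie-corrected variance formula for the rank sum, which is an assumption about how the output was produced, although it does match the printed 12.172013.

```python
import math
from collections import Counter

control = [1.2, 1.1, 1.0, 1.2, 0.6, 0.5, 1.0, 1.0, 1.3, 1.2]
arthritis = [1.4, 2.4, 2.3, 2.1, 3.0, 1.1, 1.4, 1.3, 1.1]


def midranks(values):
    """Rank values from lowest to highest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


pooled = control + arthritis
ranks = midranks(pooled)
n_control, n_arthritis = len(control), len(arthritis)
total_n = len(pooled)

rank_sum_control = sum(ranks[:n_control])
rank_sum_arthritis = sum(ranks[n_control:])

# Expected rank sum if group labels were assigned at random.
expected_control = n_control * (total_n + 1) / 2
expected_arthritis = n_arthritis * (total_n + 1) / 2

# Tie-corrected standard deviation of a rank sum under H0.
tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
sd_h0 = math.sqrt(
    n_control * n_arthritis / 12 * ((total_n + 1) - tie_term / (total_n * (total_n - 1)))
)

print(rank_sum_control, expected_control)
print(rank_sum_arthritis, expected_arthritis)
print(round(sd_h0, 6))
```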


Wilcoxon Rank-Sum (aka Mann-Whitney)

Wilcoxon Two-Sample Test

Statistic (S)                   125.5000

Normal Approximation
  Z                               2.8754
  One-Sided Pr > Z                0.0020
  Two-Sided Pr > |Z|              0.0040

t Approximation
  One-Sided Pr > Z                0.0050
  Two-Sided Pr > |Z|              0.0101

Exact Test
  One-Sided Pr >= S               0.0012
  Two-Sided Pr >= |S - Mean|      0.0023

Z includes a continuity correction of 0.5.
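The normal approximation and the "shuffling cards" idea behind the exact test can both be sketched from the ranks. The code below is an illustration under stated assumptions: the ranks and the H0 standard deviation are copied from the tables above, the variable names are hypothetical, and the exact p-values come from brute-force enumeration of every way to deal 9 of the 19 ranks to the arthritis group, which may not be the algorithm the statistical package uses internally.

```python
import math
from itertools import combinations

# Mid-ranks copied from the sorted table (ties share their average rank).
control_ranks = [1, 2, 4, 4, 4, 7, 10, 10, 10, 12.5]
arthritis_ranks = [7, 7, 12.5, 14.5, 14.5, 16, 17, 18, 19]

all_ranks = control_ranks + arthritis_ranks
n_arthritis = len(arthritis_ranks)
total_n = len(all_ranks)

S = sum(arthritis_ranks)                  # observed rank sum for the arthritis group
mean_h0 = n_arthritis * (total_n + 1) / 2  # expected rank sum under H0
sd_h0 = 12.172013                          # "Std Dev Under H0" from the output above

# Normal approximation with the 0.5 continuity correction.
z = (S - mean_h0 - 0.5) / sd_h0
p_one_normal = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of a standard normal
p_two_normal = 2 * p_one_normal

# Exact test: every possible assignment of 9 of the 19 ranks to "arthritis".
rank_sums = [sum(combo) for combo in combinations(all_ranks, n_arthritis)]
n_tables = len(rank_sums)
p_one_exact = sum(s >= S for s in rank_sums) / n_tables
p_two_exact = sum(abs(s - mean_h0) >= abs(S - mean_h0) for s in rank_sums) / n_tables

print(f"Z = {z:.4f}, one-sided = {p_one_normal:.4f}, two-sided = {p_two_normal:.4f}")
print(f"{n_tables} assignments, exact one-sided = {p_one_exact:.4f}, two-sided = {p_two_exact:.4f}")
```

The enumeration is feasible here only because the groups are small; for larger samples the normal approximation or a random-shuffle (Monte Carlo) version would be the practical choice.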

Comparing Groups in the Percentage Falling into Categories

Example: Treatment for RA
  Compare MTX vs MTX+ETN
  Outcomes (@ 3 months)
    Dichotomous: e.g. % in remission
                      % with DAS28 drop > 1.2 pts
    Multiple Categories: ACR 20/50/70
                      % of pts reaching each level (sum to 100%)

Comparisons using chi-square? (categorical)

Comparing Groups on the Percentage Falling into Categories

Rule of thumb:
  [1] All cell sizes ≥ 5 => Use Chi-square
  [2] Any cell size < 5 => Use Fisher’s Exact

Reason: Criterion [1] is a condition for the Central Limit Theorem to hold with good accuracy (… so p-values are accurate)
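The rule of thumb is simple enough to express as a one-line check. In this sketch the function name is hypothetical and "cell size" is taken to mean the observed count in each cell, as the rule is worded on the slide. The second table in the usage lines contains made-up counts purely to show the other branch.

```python
def choose_categorical_test(table):
    """Apply the rule of thumb to a table given as a list of rows of counts."""
    smallest_cell = min(min(row) for row in table)
    return "chi-square" if smallest_cell >= 5 else "fisher-exact"


print(choose_categorical_test([[28, 10], [20, 20]]))  # all cells >= 5
print(choose_categorical_test([[3, 9], [8, 2]]))      # hypothetical counts, some < 5
```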


Sharma L, et al. Quadriceps Strength and OA Progression in Malaligned and Lax Knees. Ann Intern Med. 2003

Inclusions:
  KL grade ≥ 2
  At least a little difficulty (Likert category) on at least two items in the Western Ontario and McMaster University osteoarthritis index physical function scale

Exclusions: corticosteroid injection < 3 months, avascular necrosis, rheumatoid or other inflammatory arthritis, periarticular fracture, Paget disease, villonodular synovitis, … (etc.)


                                         JSN Progression
                                         No             Yes            # Knees

More neutral alignment (< 5 degrees)
  Low quadriceps strength                111 (88.8%)    14 (11.2%)     125
  High quadriceps strength               111 (88.8%)    14 (11.2%)     125

Malalignment (≥ 5 degrees)
  Low quadriceps strength                28 (73.7%)     10 (26.3%)     38
  High quadriceps strength               20 (50.0%)     20 (50.0%)     40



Malalignment (≥ 5 degrees) only:

                               No             Yes            # Knees
Low quadriceps strength        28 (73.7%)     10 (26.3%)     38 (48.7%)
High quadriceps strength       20 (50.0%)     20 (50.0%)     40 (51.3%)
Column totals                  48 (61.5%)     30 (38.5%)     Total = 78


Chi-square Statistic

Chi-square = sum over all cells of (nij - eij)^2 / eij
df = (rows-1) x (cols-1)

Note:
  nij = observed (actual) cell count
  eij = (row %) x (col %) x (total # knees)
      = (# knees in row) x (col %)
      = expected cell count as if groups are the “same”
  (eij effectively applies the “pooled” average JSN Progression rate to both groups)

Cells are: # observed (# expected)

                               JSN Progression
                               No             Yes            Row %’s
Low quadriceps strength        28 (23.4)      10 (14.6)      38 (48.7%)
High quadriceps strength       20 (24.6)      20 (15.4)      40 (51.3%)
Column %’s                     61.5%          38.5%          Total = 78

High quadriceps strength: Expected # Yes = 0.513 * 0.385 * 78 = 0.1975 * 78 = 0.385 * 40 knees = 15.4
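The expected counts and the chi-square statistic built from them can be checked with a few lines of Python. This is a minimal sketch with hypothetical function names. The p-value shortcut via `math.erfc` is valid only for df = 1, where the chi-square variable is the square of a standard normal.

```python
import math

# Observed counts for the malaligned knees: rows = low/high strength, cols = No/Yes.
observed = [[28, 10], [20, 20]]


def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Return the chi-square statistic and its degrees of freedom."""
    expected = expected_counts(table)
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


expected = expected_counts(observed)
statistic, df = chi_square(observed)

# Upper-tail p-value; this closed form applies to df = 1 only.
p_value = math.erfc(math.sqrt(statistic / 2)) if df == 1 else None

print([[round(e, 1) for e in row] for row in expected])
print(round(statistic, 4), df, p_value)
```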



                               No             Yes            # Knees
Low quadriceps strength        28 (73.7%)     10 (26.3%)     38 (48.7%)
High quadriceps strength       20 (50.0%)     20 (50.0%)     40 (51.3%)
Column totals                  48 (61.5%)     30 (38.5%)     Total = 78

Chi-square = 4.6184, p = 0.0316
df = (2-1) x (2-1) = 1

Fisher’s Exact: p = 0.0383

Cells are: Obs # (Alt #)
Fisher’s Exact uses all (Alt #)’s that retain the same row/col counts

                               JSN Progression
                               No             Yes            # Knees
Low quadriceps strength        28 (29)        10 (9)         38
High quadriceps strength       20 (19)        20 (21)        40
Column totals                  48 (61.5%)     30 (38.5%)     Total = 78

Fisher’s Exact p-value is the hypergeometric proportion of tables that are at least as “extreme” as the observed table. (The alternative table shown in parentheses above is more “extreme” than the observed one.)
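A minimal sketch of that calculation for a 2x2 table follows. It assumes the common two-sided definition in which "at least as extreme" means "no more probable than the observed table" under the hypergeometric distribution with fixed row and column totals. The function names are hypothetical.

```python
from math import comb


def table_probability(table):
    """Hypergeometric probability of a 2x2 table given its row and column totals."""
    (a, b), (c, d) = table
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_exact_two_sided(table):
    """Sum the probabilities of all tables with the same margins that are
    no more probable than the observed table."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(table)
    p_value = 0.0
    # x is the top-left cell; its range keeps every cell non-negative.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        alternative = [[x, row1 - x], [col1 - x, row2 - (col1 - x)]]
        p_alternative = table_probability(alternative)
        # The small relative tolerance guards against floating-point ties.
        if p_alternative <= p_observed * (1 + 1e-9):
            p_value += p_alternative
    return p_value


observed = [[28, 10], [20, 20]]
alternative_shown = [[29, 9], [19, 21]]  # the (Alt #) table from the slide

print(fisher_exact_two_sided(observed))
print(table_probability(alternative_shown) < table_probability(observed))
```

The last line checks that the alternative table from the slide is less probable than the observed one, which is why it counts toward the p-value.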


Rule of thumb:
  [1] All cell sizes ≥ 5 => Use Chi-square
  [2] Any cell size < 5 => Use Fisher’s Exact
