
Homework Chapter 11: 13 Chapter 12: 1, 2, 14, 16.

Date post: 20-Dec-2015
Page 1: Homework Chapter 11: 13 Chapter 12: 1, 2, 14, 16.

Homework

• Chapter 11: 13
• Chapter 12: 1, 2, 14, 16

Page 2:

Assumptions in Statistics

Page 3:

The wrong way to make a comparison of two groups

“Group 1 is significantly different from a constant, but Group 2 is not. Therefore Group 1 and Group 2 are different from each other.”

Page 4:

A more extreme case...

Page 5:

Interpreting Confidence Intervals

[Figure: example confidence intervals, labeled Different, Not different, Unknown]

Page 6:

Assumptions

• Random sample(s)
• Populations are normally distributed
• Populations have equal variances
  – 2-sample t only

Page 7:

Assumptions

• Random sample(s)
• Populations are normally distributed
• Populations have equal variances
  – 2-sample t only

What do we do when these are violated?

Page 8:

Assumptions in Statistics

• Are any of the assumptions violated?

• If so, what are we going to do about it?

Page 9:

Detecting deviations from normality

• Histograms

• Quantile plots

• Shapiro-Wilk test

Page 10:

Detecting deviations from normality: by histogram

[Figure: histogram; x-axis: Biomass ratio, y-axis: Frequency]

Page 11:

Detecting deviations from normality: by quantile plot

[Figure: quantile plot; axes: Normal data, Normal Quantile]

Page 12:

Detecting deviations from normality: by quantile plot

[Figure: quantile plot; axes: Biomass ratio, Normal Quantile]

Page 13:

Detecting deviations from normality: by quantile plot

[Figure: quantile plot; axes: Biomass ratio, Normal Quantile]

Normal distribution = straight line

Non-normal = non-straight line
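The coordinates behind a normal quantile plot can be sketched with the Python standard library. This is only an illustration: the function name `normal_quantile_points` and the plotting positions (i - 0.5)/n are assumptions (other conventions exist), and the seven values are the biomass ratios listed on the log-transformation slide.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Pair each sorted observation with its expected standard-normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are one common convention (an assumption here).
    return [(std_normal.inv_cdf((i - 0.5) / n), y)
            for i, y in enumerate(ordered, start=1)]

# Biomass ratios as listed on the log-transformation slide.
sample = [1.34, 1.96, 2.49, 1.27, 1.19, 1.15, 1.29]
points = normal_quantile_points(sample)

for quantile, value in points:
    print(f"{quantile:6.2f}  {value:.2f}")
```

Plotting the second column against the first gives the quantile plot; points that bend away from a straight line suggest a non-normal distribution.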

Page 14:

Detecting differences from normality: Shapiro-Wilk test

The Shapiro-Wilk test is used to test statistically whether a set of data comes from a normal distribution.

H0: The data come from a normal distribution
HA: The data come from some other distribution

Page 15:

What to do when the distribution is not normal

• If the sample sizes are large, sometimes the standard tests work OK anyway

• Transformations
• Non-parametric tests
• Randomization and resampling

Page 16:

The normal approximation

• Means of large samples are approximately normally distributed

• So, the parametric tests on large samples work relatively well, even for non-normal data.

• Rule of thumb: if n > ~50, the normal approximation may work

Page 17:

Data transformations

A data transformation changes each data point by some simple mathematical formula

Page 18:

Log-transformation

$Y' = \ln[Y]$

ln = “natural log”, base e
log = “log”, base 10
EITHER WORKS

Page 19:

Biomass ratio   ln[Biomass ratio]
1.34            0.30
1.96            0.67
2.49            0.91
1.27            0.24
1.19            0.18
1.15            0.14
1.29            0.26

Carry out the test on the transformed data!

Page 20:

The log transformation is often useful when:

• the variable is likely to be the result of multiplication of various components.

• the frequency distribution of the data is skewed to the right

• the variance seems to increase as the mean gets larger (in comparisons across groups).

Page 21:

Variance and mean increase together --> try the log-transform

Page 22:

Other transformations

Arcsine: $p' = \arcsin\left[\sqrt{p}\right]$
Square-root: $Y' = \sqrt{Y + 1/2}$
Square: $Y' = Y^2$
Reciprocal: $Y' = 1/Y$
Antilog: $Y' = e^{Y}$

Page 23:

Example: Confidence interval with log-transformed data

Data: 5 12 1024 12398
ln data: 1.61 2.48 6.93 9.43

$\bar{Y}' = 5.11$, $s_{\ln[Y]} = 3.70$

$\bar{Y}' \pm t_{0.05(2),3} \dfrac{s_{\ln[Y]}}{\sqrt{n}} = 5.11 \pm 3.18 \dfrac{3.70}{\sqrt{4}} = 5.11 \pm 5.88$

$-0.773 < \mu_{\ln[Y]} < 10.99$

$0.46 < \mu < 59278$
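The confidence-interval arithmetic above can be retraced with a short script. This is a sketch, not the lecture's own code: the critical value 3.18 is copied from the slide rather than computed, and all variable names are illustrative.

```python
import math
import statistics

data = [5, 12, 1024, 12398]
log_data = [math.log(y) for y in data]   # natural-log transform of every value

n = len(log_data)
mean_log = statistics.mean(log_data)
sd_log = statistics.stdev(log_data)      # sample SD (n - 1 in the denominator)

T_CRIT = 3.18  # t_{0.05(2),3}, taken from the slide (assumed table value)

half_width = T_CRIT * sd_log / math.sqrt(n)
lower_log = mean_log - half_width
upper_log = mean_log + half_width

# Back-transform the limits to the original scale with the inverse function e^x.
lower, upper = math.exp(lower_log), math.exp(upper_log)

print(f"ln scale:       {lower_log:.2f} < mu_ln[Y] < {upper_log:.2f}")
print(f"original scale: {lower:.2f} < mu < {upper:.0f}")
```

Because the slide rounds its intermediate values and the exponential magnifies small differences, the back-transformed upper limit printed here will not match the slide's figure exactly.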

Page 24:

Valid transformations...

• Require the same transformation be applied to each individual

• Must be backwards convertible to the original value, without ambiguity

• Have one-to-one correspondence to original values

$X = \ln[Y]$ and $Y = e^{X}$

Page 25:

Choosing transformations

• Must transform each individual in the same way

• You CAN try different transformations until you find one that makes the data fit the assumptions

• You CANNOT keep trying transformations until P <0.05!!!

Page 26:

Assumptions

• Random sample(s)
• Populations are normally distributed
• Populations have equal variances
  – 2-sample t only

Do the populations have equal variances?

If not, what should we do about it?

Page 27:

Comparing the variance of two groups

$H_0: \sigma_1^2 = \sigma_2^2$

$H_A: \sigma_1^2 \neq \sigma_2^2$

One possible method: the F test

Page 28:

The test statistic F

$F = \dfrac{s_1^2}{s_2^2}$

Put the larger $s^2$ on top in the numerator.

Page 29:

F...

• F has two different degrees of freedom, one for the numerator and one for the denominator. (Both are df = n_i − 1.) The numerator df is listed first, then the denominator df.

• The F test is very sensitive to its assumption that both distributions are normal.

Page 30:

Example: Variation in insect genitalia

                  Polygamous species   Monogamous species
Mean              -19.3                10.25
Sample variance   243.9                2.27
Sample size       7                    9

Page 31:

Example: Variation in insect genitalia

$s_1^2 = 243.9$, $s_2^2 = 2.27$

$F = \dfrac{243.9}{2.27} = 107.4$

Page 32:

Degrees of freedom

$df_1 = 7 - 1 = 6$

$df_2 = 9 - 1 = 8$

$F_{0.025,6,8} = 4.7$

For a 2-tailed test, we compare to $F_{\alpha/2,\,df_1,\,df_2}$ from Table A3.4

Page 33:

Why $\alpha/2$ for the critical value?

By putting the larger $s^2$ in the numerator, we are forcing F to be greater than 1.

By the null hypothesis there is a 50:50 chance of either $s^2$ being greater, so we want the higher tail to include just $\alpha/2$.

Page 34:

Critical value for F

Page 35:

Conclusion

The F = 107.4 from the data is greater than $F_{0.025,6,8} = 4.7$, so we can reject the null hypothesis that the variances of the two groups are equal.

The variance in insect genitalia is much greater for polygamous species than monogamous species.
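The F-test steps for this example can be written out as a small sketch. The helper name `variance_ratio` is hypothetical, and the critical value 4.7 is the table value quoted on the slides rather than something computed here.

```python
def variance_ratio(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Sample variances and sample sizes from the insect genitalia example.
f_stat, df_num, df_den = variance_ratio(243.9, 7, 2.27, 9)

F_CRIT = 4.7  # F_{0.025,6,8}, as read from the statistical table on the slide

reject_h0 = f_stat > F_CRIT
print(f"F = {f_stat:.1f} with df = ({df_num}, {df_den}); reject H0: {reject_h0}")
```

Putting the larger variance in the numerator inside the helper keeps the comparison against the upper-tail critical value valid whichever group is passed first.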

Page 36:

F-test

Null hypothesis: The two populations have the same variance ($\sigma_1^2 = \sigma_2^2$)

Sample → Test statistic: $F = \dfrac{s_1^2}{s_2^2}$

Null distribution: F with $n_1 - 1$, $n_2 - 1$ df

Compare: How unusual is this test statistic?

P < 0.05: Reject H0
P > 0.05: Fail to reject H0

Page 37:

What if we have unequal variances?

• Welch’s t-test would work

• If sample sizes are equal and large, then even a ten-fold difference in variance is approximately OK

Page 38:

Comparing means when variances are not equal

Welch’s t test compares the means of two normally distributed populations that have unequal variances

Page 39:

Burrowing owls and dung traps

Page 40:

Dung beetles

Page 41:

Experimental design

• 20 randomly chosen burrowing owl nests

• Randomly divided into two groups of 10 nests

• One group was given extra dung; the other not

• Measured the number of dung beetles in the owls’ diets

Page 42:

Number of beetles caught

• Dung added: $\bar{Y} = 4.8$, $s = 3.26$

• No dung added: $\bar{Y} = 0.51$, $s = 0.89$

Page 43:

Hypotheses

H0: Owls catch the same number of dung beetles with or without extra dung (μ1 = μ2)

HA: Owls do not catch the same number of dung beetles with or without extra dung (μ1 ≠ μ2)

Page 44:

Welch’s t

$t = \dfrac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$

Round down df to nearest integer

Page 45:

Owls and dung beetles

$t = \dfrac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{4.8 - 0.51}{\sqrt{\dfrac{3.26^2}{10} + \dfrac{0.89^2}{10}}} = 4.01$

Page 46:

Degrees of freedom

$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \dfrac{\left(\dfrac{3.26^2}{10} + \dfrac{0.89^2}{10}\right)^2}{\dfrac{\left(3.26^2/10\right)^2}{10 - 1} + \dfrac{\left(0.89^2/10\right)^2}{10 - 1}} = 10.33$

Which we round down to df = 10

Page 47:

Reaching a conclusion

$t_{0.05(2),10} = 2.23$

$t = 4.01 > 2.23$

So we can reject the null hypothesis with P<0.05.

Extra dung near burrowing owl nests increases the number of dung beetles eaten.
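The Welch's t calculation for the owl data can be checked with a short sketch built from the summary statistics on the slides. The function name `welch_t` is hypothetical, and the critical value 2.23 is the table value quoted above, not computed here.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Welch's t statistic and its approximate degrees of freedom."""
    v1 = s1 ** 2 / n1   # s1^2 / n1
    v2 = s2 ** 2 / n2   # s2^2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Group 1: dung added; group 2: no dung added (10 nests each).
t_stat, df = welch_t(4.8, 3.26, 10, 0.51, 0.89, 10)
df_rounded = math.floor(df)   # round df down to the nearest integer

T_CRIT = 2.23  # t_{0.05(2),10}, as quoted on the slide

print(f"t = {t_stat:.2f}, df = {df:.2f} -> {df_rounded}")
print("reject H0" if abs(t_stat) > T_CRIT else "fail to reject H0")
```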

Page 48:

Assumptions

• Random sample(s)
• Populations are normally distributed
• Populations have equal variances
  – 2-sample t only

What if you don’t want to make so many assumptions?

Page 49:

Welch’s t-test

Null hypothesis: The two populations have the same mean ($\mu_1 = \mu_2$)

Sample → Test statistic: $t = \dfrac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

Null distribution: t with df from formula

Compare: How unusual is this test statistic?

P < 0.05: Reject H0
P > 0.05: Fail to reject H0

Page 50:

Non-parametric methods

• Assume less about the underlying distributions

• Also called "distribution-free"

• "Parametric" methods assume a distribution or a parameter

Page 51:

Most non-parametric methods use RANKS

• Rank each data point in all samples from lowest to highest

• Lowest data point gets rank 1, next lowest gets rank 2, ...

Page 52:

Sign test

• Non-parametric test
• Compares data from one sample to a constant

• Simple: for each data point, record whether individual is above (+) or below (-) the hypothesized constant.

• Use a binomial test to compare result to 1/2.

Page 53:

Example: Polygamy and the origin of species

• Is polygamy associated with higher or lower speciation rates?

Page 54:

Order | Family | Multiple mating group | Number of species | Single mating group | Number of species
Beetles | Anobiidae | Ernobius | 53 | Xestobium | 10
 | Dermestidae | Dermestes | 73 | Trogoderma | 120
 | Elateridae | Agriotes | 228 | Selatosomus | 74
Flies | Muscidae | Coenosia | 353 | Delia | 289
 | Cecidomyiidae | Rhopalomyia | 157 | Mayetiola | 30
 | Chironomidae | Chironomus | 300 | Pontomyia | 4
 | Chironomidae | Stictochironomus | 34 | Clunio | 18
 | Drosophilidae and Culicidae | Drosophilidae | 3,400 | Culicidae | 3,500
 | Dryomyzidae and Calliphoridae | Dryomyzidae | 20 | Calliphoridae | 1,000
 | Tephritidae | Anastrepha | 196 | Bactrocera | 486
 | Sciaridae and Bibionidae | Sciaridae | 1,750 | Bibionidae | 660
 | Scatophagidae | Scatophaga | 55 | Musca | 63
Mayflies | Siphlonuridae | Siphlonurus | 37 | Caenis | 115
Homoptera | Psyllidae | Cacopsylla | 100 | Aonidiella | 30
Butterflies and moths | Noctuidae and Psychidae | Noctuidae | 21,000 | Psychidae | 600
 | Tortricidae | Choristoneura | 37 | Epiphyas | 40
 | Nymphalidae | Eueides (aliphera clade) | 7 | Eueides (vibilia clade) | 5
 | Nymphalidae | Heliconius (silvaniform clade) | 15 | Heliconius (sarasapho clade) | 7
 | Nymphalidae | Polygonia / | 18 | Nymphalis | 6

Etc....

Page 55:

The differences are not normal

[Figure: histogram of the differences; x-axis ticks: -5000, 0, 5000, 10000, 20000]

43 -47 154 64 127 296 16
-100 -980 -290 1090 -8 -78 70
20940 -3 2 8 12 227 1
61 1 79 78

Page 56:

Hypotheses

H0: The median difference in number of species between singly-mating and multiply-mating insect groups is 0.

HA: The median difference in number of species between these groups is not 0.

Page 57:

7 out of 25 comparisons are negative

43 -47 154 64 127 296 16
-100 -980 -290 1090 -8 -78 70
20940 -3 2 8 12 227 1
61 1 79 78

$\Pr[X \le 7] = \sum_{i=0}^{7} \binom{25}{i} (0.5)^{i} (0.5)^{25-i} = 0.02164$

$P = 2(0.02164) = 0.043$
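The sign test above reduces to a binomial tail sum, which can be sketched with the standard library. The 25 differences are the values listed on the slide; the variable names are illustrative, and doubling the smaller tail follows the slide's two-tailed calculation.

```python
from math import comb

# Differences in species number (multiple-mating minus single-mating group).
differences = [43, -47, 154, 64, 127, 296, 16,
               -100, -980, -290, 1090, -8, -78, 70,
               20940, -3, 2, 8, 12, 227, 1,
               61, 1, 79, 78]

n = sum(1 for d in differences if d != 0)        # zeros would carry no sign
n_negative = sum(1 for d in differences if d < 0)
k = min(n_negative, n - n_negative)              # count of the less frequent sign

# Pr[X <= k] for X ~ Binomial(n, 0.5); each outcome has probability 0.5**n.
p_one_tail = sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n
p_two_tail = min(1.0, 2 * p_one_tail)

print(f"{n_negative} of {n} negative; one-tailed Pr = {p_one_tail:.5f}, P = {p_two_tail:.3f}")
```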

Page 58:

The sign test has very low power

So it is quite likely to not reject a false null hypothesis.

Page 59:

Non-parametric test to compare 2 groups

The Mann-Whitney U test compares the central tendencies of two groups using ranks

Page 60:

Performing a Mann-Whitney U test

• First, rank all individuals from both groups together in order (for example, smallest to largest)

• Sum the ranks for all individuals in each group --> R1 and R2

Page 61:

Calculating the test statistic, U

$U_1 = n_1 n_2 + \dfrac{n_1 (n_1 + 1)}{2} - R_1$

$U_2 = n_1 n_2 - U_1$

Page 62:

Example: Garter snake resistance to newt toxin

[Photo: Rough-skinned newt]

Page 63:

Comparing snake resistance to TTX (tetrodotoxin)

Locality    Resistance
Benton      0.29
Benton      0.77
Benton      0.96
Benton      0.64
Benton      0.70
Benton      0.99
Benton      0.34
Warrenton   0.17
Warrenton   0.28
Warrenton   0.20
Warrenton   0.20
Warrenton   0.37

This variable is known to be not normally distributed within populations.

Page 64:

Hypotheses

H0: The TTX resistance for snakes from Benton is the same as for snakes from Warrenton.

HA: The TTX resistance for snakes from Benton is different from snakes from Warrenton.

Page 65:

Calculating the ranks

Locality    Resistance   Rank
Benton      0.29         5
Benton      0.77         10
Benton      0.96         11
Benton      0.64         8
Benton      0.70         9
Benton      0.99         12
Benton      0.34         6
Warrenton   0.17         1
Warrenton   0.28         4
Warrenton   0.20         2.5
Warrenton   0.20         2.5
Warrenton   0.37         7

Rank sum for Warrenton: R1 = 1 + 4 + 2.5 + 2.5 + 7 = 17

Page 66:

Calculating U1 and U2

$U_1 = n_1 n_2 + \dfrac{n_1 (n_1 + 1)}{2} - R_1 = 5(7) + \dfrac{5(6)}{2} - 17 = 33$

$U_2 = n_1 n_2 - U_1 = 5(7) - 33 = 2$

For a two-tailed test, we pick the larger of U1 or U2:

U = U1 = 33

Page 67:

Compare U to the U table

• Critical value for U for n1 =5 and n2=7 is 30

• 33 >30, so we can reject the null hypothesis

• Snakes from Benton have higher resistance to TTX.
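The whole Mann-Whitney calculation for the snake data, from ranking to the comparison with the critical value, can be sketched as follows. The helper `average_ranks` is a hypothetical name, and the critical value 30 is the table value quoted above.

```python
from collections import defaultdict

def average_ranks(values):
    """Rank values from smallest (rank 1) upward; tied values share their mean rank."""
    positions = defaultdict(list)
    for pos, value in enumerate(sorted(values), start=1):
        positions[value].append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

warrenton = [0.17, 0.28, 0.20, 0.20, 0.37]            # group 1
benton = [0.29, 0.77, 0.96, 0.64, 0.70, 0.99, 0.34]   # group 2
n1, n2 = len(warrenton), len(benton)

ranks = average_ranks(warrenton + benton)   # rank both groups together
r1 = sum(ranks[:n1])                        # rank sum for group 1 (Warrenton)

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 - u1
u = max(u1, u2)                             # two-tailed test uses the larger U

U_CRIT = 30  # critical value for n1 = 5, n2 = 7, as quoted on the slide

print(f"R1 = {r1}, U1 = {u1}, U2 = {u2}")
print("reject H0" if u > U_CRIT else "fail to reject H0")
```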

Page 68:

How to deal with ties

• Determine the ranks that the values would have got if they were slightly different.

• Average these ranks, and assign that average to each tied individual

• Count all those individuals when deciding the rank of the next largest individual

Page 69:

Ties

Group   Y    Rank
2       12   1
2       14   2
1       17   3
1       19   4.5
2       19   4.5
1       24   6
2       27   7
1       28   8
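The tie-handling steps can be spelled out in code against the small table above. This is one possible sketch (the function name `assign_ranks` is hypothetical); the comments map each step back to the rules for ties.

```python
def assign_ranks(values):
    """Return ranks (1 = smallest) with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the run of tied values starting at this sorted position.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Ranks the run would get if the values differed slightly: pos+1 .. end+1.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        # The next distinct value continues counting after every tied individual.
        pos = end + 1
    return ranks

y = [12, 14, 17, 19, 19, 24, 27, 28]
group = [2, 2, 1, 1, 2, 1, 2, 1]

ranks = assign_ranks(y)
r1 = sum(r for r, g in zip(ranks, group) if g == 1)
r2 = sum(r for r, g in zip(ranks, group) if g == 2)

print(ranks)
print(f"R1 = {r1}, R2 = {r2}")
```

A quick sanity check is that the two rank sums always add up to N(N + 1)/2 for N individuals in total, with or without ties.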

Page 70:

Mann-Whitney: Large sample approximation

For n1 and n2 both greater than 10, use

$Z = \dfrac{2U - n_1 n_2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1) / 3}}$

Compare this Z to the standard normal distribution

Page 71:

Example:

U1 = 245, U2 = 80
n1 = 13, n2 = 25

$Z = \dfrac{2U - n_1 n_2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1) / 3}} = \dfrac{2(245) - 13(25)}{\sqrt{13(25)(13 + 25 + 1) / 3}} = 2.54$

$Z_{0.05(2)} = 1.96$; Z > 1.96, so we could reject the null hypothesis
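The large-sample approximation is a one-line formula, sketched here with the numbers from this example. The function name `mann_whitney_z` is hypothetical, and 1.96 is the two-tailed 5% critical value quoted above.

```python
import math

def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic for large samples."""
    return (2 * u - n1 * n2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 3)

u1, u2 = 245, 80
n1, n2 = 13, 25
z = mann_whitney_z(max(u1, u2), n1, n2)   # use the larger U, as in the exact test

Z_CRIT = 1.96  # Z_{0.05(2)}, as quoted on the slide

print(f"Z = {z:.2f}")
print("reject H0" if z > Z_CRIT else "fail to reject H0")
```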

Page 72:

Assumption of Mann-Whitney U test

Both samples are random samples.

