
Z-test and t-test Xuhua Xia xxia@uottawa.ca

Date post: 14-Jan-2016
Transcript
Page 1: Z-test and t-test Xuhua Xia xxia@uottawa.ca .

z-test and t-test

Xuhua Xia

xxia@uottawa.ca

http://dambe.bio.uottawa.ca

Page 2

Properties of a Normal Distribution

68.27% of the measurements lie within the range of $\mu \pm \sigma$, 95.44% lie within $\mu \pm 2\sigma$, and 99.73% lie within $\mu \pm 3\sigma$.

50% lie within $\mu \pm 0.67\sigma$, 95% lie within $\mu \pm 1.96\sigma$, 97.5% lie within $\mu \pm 2.24\sigma$, 99% lie within $\mu \pm 2.58\sigma$, 99.5% lie within $\mu \pm 2.81\sigma$, and 99.9% lie within $\mu \pm 3.29\sigma$.

Given $\mu$ = 70 kg and $\sigma$ = 10 kg for a normal distribution (of body weight), what is the probability of a body weight of 40 kg belonging to the population?

The normal deviate:

$Z = \frac{X_i - \mu}{\sigma}$

Standard deviation and standard error of the mean:

$\sigma_{\bar{X}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$

The standard deviate pertaining to the normal distribution of means:

$Z = \frac{\bar{X}_i - \mu}{\sigma_{\bar{X}}}$
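The body-weight question can be worked through numerically. The sketch below is illustrative only: it assumes the population mean of 70 kg and standard deviation of 10 kg given on the slide, and the helper name `normal_cdf` is our own, built on the standard library's `math.erf`.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 70.0, 10.0   # population mean and standard deviation (kg)
x = 40.0                 # the observed body weight (kg)

z = (x - mu) / sigma                 # normal deviate: 3 standard deviations below the mean
p_lower = normal_cdf(z)              # probability of a weight this low or lower
p_two_tailed = 2 * normal_cdf(-abs(z))  # probability of a deviation this large in either direction

print(f"z = {z:.2f}")
print(f"one-tailed p = {p_lower:.5f}, two-tailed p = {p_two_tailed:.5f}")
```

The two-tailed value is the complement of the "99.73% lie within three standard deviations" property, so it should come out near 0.0027.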

Page 3

The z-score

$Z = \frac{X_i - \mu}{\sigma}$, to be compared with the critical value 1.96.

The government has certain regulations on commercial products. Suppose that packages of sugar labeled as 2 kg should have a mean weight of 2 kg and a standard deviation equal to 0.10 kg. If a package of sugar labeled 2 kg that you bought from a store has a weight of 1.82 kg, what is the z-score? Can you present the package as evidence that the manufacturer has violated the government regulation?
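A minimal sketch of the sugar-package calculation follows, assuming the stated mean of 2 kg and standard deviation of 0.10 kg describe a normal distribution of package weights. The helper `normal_cdf` is our own name; whether a one-tailed or a two-tailed probability is the appropriate evidence is a judgment the code does not make.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 2.0, 0.10    # regulated mean and standard deviation (kg)
x = 1.82                 # weight of the purchased package (kg)

z = (x - mu) / sigma                     # about -1.8
p_one_tailed = normal_cdf(z)             # chance of a package this light or lighter
p_two_tailed = 2 * normal_cdf(-abs(z))   # chance of a deviation this large either way

print(f"z = {z:.2f}")
print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")
print("|z| exceeds 1.96:", abs(z) > 1.96)
```

Because |z| is smaller than 1.96, a single package does not fall outside the central 95% range of the distribution.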

Page 4

Normal Distribution

[Figure: histogram of the body weight of 10,000 adult men (mean = 70 kg, std dev = 10 kg). x-axis: Body Weight, from 29.91 to 106.89; y-axis: Frequency, from 0 to 350.]

Page 5

Frequency Distribution of Means

[Figure: histogram with x-axis Body Weight, from 29.91 to 106.89, and y-axis Frequency, from 0 to 350.]

$s_{\bar{x}} = \frac{s}{\sqrt{n}}$
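The relation between the spread of sample means and the standard error can be checked by simulation. This is a sketch under our own assumptions: the sample size of 25, the 2000 repeated samples, and the random seed are arbitrary choices, while the population mean of 70 kg and standard deviation of 10 kg are taken from the body-weight example.

```python
import math
import random
import statistics

random.seed(1)                      # fixed seed so the run is repeatable
mu, sigma = 70.0, 10.0              # population mean and standard deviation (kg)
n = 25                              # size of each sample (arbitrary choice)
n_samples = 2000                    # number of repeated samples (arbitrary choice)

# Draw many samples and record the mean of each one.
means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(n_samples)
]

sd_of_means = statistics.stdev(means)     # observed spread of the sample means
expected_se = sigma / math.sqrt(n)        # standard error predicted by sigma / sqrt(n)

print(f"observed SD of means = {sd_of_means:.3f}")
print(f"expected SE          = {expected_se:.3f}")
```

The two printed values should be close, which is the sense in which the standard error is the standard deviation of the distribution of means.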

Page 6

Darwin’s Breeding Experiment

Species  Outbreed  Inbreed  Difference
1        100       51       49
2        222       289      -67
3        121       113      8
4        433       417      16
5        222       216      6
6        111       88       23
7        534       506      28
8        432       391      41
9        99        85       14
10       445       416      29
11       112       56       56
12       333       309      24
13       222       147      75
14       422       362      60
15       101       149      -48

Is the mean difference significantly larger than 0?

Wrong method assuming normal distribution:

$\bar{X}$ = 20.933; $s$ = 37.744; $n$ = 15;

$s_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{37.744}{\sqrt{15}} = 9.75$

$Z = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{20.933 - 0}{9.75} = 2.147 > 1.96$

Therefore, the mean difference is significantly larger than zero, i.e., inbreeding does reduce seed production.
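The summary statistics for Darwin's data can be reproduced from the Difference column. The sketch below uses only the 15 differences listed in the table; the variable names are our own.

```python
import math
import statistics

# Outbreed minus Inbreed, one value per row of the table.
diffs = [49, -67, 8, 16, 6, 23, 28, 41, 14, 29, 56, 24, 75, 60, -48]

n = len(diffs)
mean_d = statistics.mean(diffs)      # mean difference
sd_d = statistics.stdev(diffs)       # sample standard deviation (n - 1 in the denominator)
se_d = sd_d / math.sqrt(n)           # standard error of the mean difference
z = (mean_d - 0) / se_d              # deviate under the hypothesis of zero mean difference

print(f"n = {n}, mean = {mean_d:.3f}, SD = {sd_d:.3f}, SE = {se_d:.2f}")
print(f"ratio = {z:.3f}, exceeds 1.96: {z > 1.96}")
```

With the unrounded standard error the ratio is about 2.148; the slide's 2.147 appears to come from dividing by the rounded value 9.75.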

Page 7

Problem of Small Samples

I may premise that if we took by chance a dozen or score of men belonging to two nations and measured them, it would I presume be very rash to form any judgment from such small numbers on their (the nation’s) average heights. But the case is somewhat different with my … plants, as they were exactly of the same age, were subjected from first to last to the same conditions, and were descended from the same parents. -- Darwin, quoted in Fisher’s The Design of Experiments.

Page 8

William S. Gosset & t Distribution

[Figure: normal distribution and t distribution curves. x-axis: Body Weight, from 29.91 to 106.89; y-axis: Frequency, from 0 to 350.]

The t distribution is wider and flatter than the normal distribution.

Page 9

t distribution
• The t distribution depends on the degrees of freedom (DF). For Darwin’s data with a sample size of 15, DF = 15 - 1 = 14.
• With the t distribution with DF = 14, we expect 95% of the observations to fall within the range of mean ± 2.145 STD.
• Remember that for a normal distribution, 95% of the observations are expected to fall within the range of $\mu \pm 1.96\sigma$.
• For the paired-sample t-test with the null hypothesis being Mean1 = Mean2 (or MeanD = 0):

$t = \frac{\bar{D} - 0}{s_{\bar{X}}} = \frac{20.933}{9.75} = 2.147 > 2.145$
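The paired-sample test can be sketched with the standard library alone. The critical value 2.145 is the one quoted on the slide; the two-tailed probability is obtained here by numerically integrating the Student's t density with Simpson's rule, which is our own choice of method, and the function names are hypothetical.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed probability of |T| >= |t|, using Simpson's rule on [0, |t|]."""
    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # probability mass between 0 and |t|
    return 1 - 2 * area

# Outbreed minus Inbreed differences from Darwin's data.
diffs = [49, -67, 8, 16, 6, 23, 28, 41, 14, 29, 56, 24, 75, 60, -48]

n = len(diffs)
df = n - 1
se = statistics.stdev(diffs) / math.sqrt(n)
t_stat = (statistics.mean(diffs) - 0) / se
p = t_two_tailed_p(t_stat, df)

print(f"t = {t_stat:.3f}, DF = {df}, two-tailed p = {p:.4f}")
print("exceeds the 5% critical value 2.145:", t_stat > 2.145)
```

The statistic only just exceeds the critical value, so the two-tailed probability sits just below 0.05, a much weaker result than the normal approximation suggests.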

Page 10

T-Test
• T-Test can be used to test
  – the difference in mean between two samples (paired or unpaired),
  – a sample mean against a mean of a known population (e.g., the concentration of a medicine set as a standard by the government),
  – whether a single individual observation belongs to a sample with sample size larger than one.
• The normal distribution and the Student’s t distribution. Why should the statistic t take into consideration both the mean difference and the variance?
• How to apply the test using Excel or SAS.
• The assumptions.
• Alternative methods: Wilcoxon rank-sum test or Mann-Whitney U test.

Page 11

The Essence of the t Statistic

$t = \frac{\bar{X}_1 - \bar{X}_2}{S_{\text{pooled}\,\bar{X}}}$

[Figure: three panels of paired distributions, with panel labels "Same variance, smaller mean difference" and "Same mean difference, larger variance". Two x-axes run from -6 to 6 and one runs from -18 to 18.]

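Page 12

More on variance and SE

Two independent variables: x1, x2 sampled from two normal distributions

$s^2_{x_1 - x_2} = E(s^2_{x_1} + s^2_{x_2})$

$s^2_{x_1 + x_2} = E(s^2_{x_1} + s^2_{x_2})$

A better estimate:

$s^2_{x_1 - x_2} = E(s^2_{x_1} + s^2_{x_2}) = E\left(\frac{SS_1 + SS_2}{DF_1 + DF_2} + \frac{SS_1 + SS_2}{DF_1 + DF_2}\right)$

Standard errors of the two means:

$S_{\bar{x}_1} = \frac{s_1}{\sqrt{n_1}}$; $S_{\bar{x}_2} = \frac{s_2}{\sqrt{n_2}}$

With $n_1 = n_2 = n$:

$S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2 + s_2^2}{n}}$

With $n_1 \neq n_2$, but both large:

$S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Estimate of $S_{\bar{x}_1 - \bar{x}_2}$ assuming equal variance:

$S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s^2_{x_1 x_2}}{n_1} + \frac{s^2_{x_1 x_2}}{n_2}}$, where $s^2_{x_1 x_2} = \frac{SS_1 + SS_2}{DF_1 + DF_2}$ is the pooled variance.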
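The two ways of estimating the standard error of a difference between means can be compared side by side. This is a sketch only: the function names and the small data sets are hypothetical, chosen to show that the separate-variance and pooled-variance estimates coincide when the sample sizes are equal and generally differ when they are not.

```python
import math
import statistics

def se_diff_separate(x1, x2):
    """SE of the difference between means using each sample's own variance."""
    return math.sqrt(statistics.variance(x1) / len(x1)
                     + statistics.variance(x2) / len(x2))

def se_diff_pooled(x1, x2):
    """SE of the difference between means using the pooled variance."""
    m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
    ss1 = sum((v - m1) ** 2 for v in x1)        # sum of squares, sample 1
    ss2 = sum((v - m2) ** 2 for v in x2)        # sum of squares, sample 2
    df1, df2 = len(x1) - 1, len(x2) - 1
    pooled_var = (ss1 + ss2) / (df1 + df2)      # (SS1 + SS2) / (DF1 + DF2)
    return math.sqrt(pooled_var / len(x1) + pooled_var / len(x2))

# Hypothetical samples with equal sizes: both estimates agree.
a, b = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
print(se_diff_separate(a, b), se_diff_pooled(a, b))

# Hypothetical samples with unequal sizes: the estimates differ.
c, d = [4.0, 5.0, 6.0, 7.0], [6.0, 8.0, 10.0]
print(se_diff_separate(c, d), se_diff_pooled(c, d))
```

Pooling is only justified when the two populations can reasonably be assumed to share a variance.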

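Page 13

Computation for unpaired t-test

                 Sample 1     Sample 2
Sample size      $n_1$        $n_2$
Mean             $\bar{x}_1$  $\bar{x}_2$
Standard dev.    $s_1$        $s_2$

                 Sample 1     Sample 2
Sample size      7            7
Mean             76.857       82.714
Standard dev.    2.545        3.147

$t = \frac{\bar{x}_1 - \bar{x}_2}{S_{\bar{x}_1 - \bar{x}_2}}$

With $n_1 = n_2 = n$:

$S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2 + s_2^2}{n}}$

With $n_1 \neq n_2$, but both large:

$S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Estimate of $S_{\bar{x}_1 - \bar{x}_2}$ assuming equal variance:

$S_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s^2_{x_1 x_2}}{n_1} + \frac{s^2_{x_1 x_2}}{n_2}}$

For the data above:

$t = \frac{76.857 - 82.714}{\sqrt{\frac{2.545^2 + 3.147^2}{7}}} = -3.828$

Df = (7-1) + (7-1) = 12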
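The unpaired computation can be reproduced from the summary statistics alone. The sketch below uses the equal-sample-size form of the standard error; the variable names are our own, and the critical value 2.179 is the usual tabulated two-tailed 5% value for 12 degrees of freedom, not a number from the slide.

```python
import math

# Summary statistics from the table.
n1, mean1, s1 = 7, 76.857, 2.545
n2, mean2, s2 = 7, 82.714, 3.147

# With n1 = n2 = n, the SE of the difference is sqrt((s1^2 + s2^2) / n).
n = n1
se_diff = math.sqrt((s1 ** 2 + s2 ** 2) / n)

t_stat = (mean1 - mean2) / se_diff
df = (n1 - 1) + (n2 - 1)

critical = 2.179   # tabulated two-tailed 5% critical value for DF = 12 (assumption: standard t table)
print(f"SE = {se_diff:.4f}, t = {t_stat:.3f}, DF = {df}")
print("significant at the 5% level:", abs(t_stat) > critical)
```

Recomputing from the rounded summary values gives about -3.829, which differs from the slide's figure only in the last digit, presumably through rounding.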

Page 14

Paired-sample t-test: 3

How should we allocate the two crop varieties to the plots? What comparison would be fair?

Using blocks to reduce confounding environmental factors (everything else being equal except for the treatment effect) in evaluating the protein content of two wheat varieties.

[Figure: two plot layouts, each labeled Block 1 to Block 4, showing which variety (1 or 2) is planted in each plot.]

First layout:

1 1 1 1
1 1 1 1
2 2 2 2
2 2 2 2

Second layout:

1 2 2 1
2 1 1 2
2 1 2 1
1 2 1 2
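One way to obtain a layout in which every block contains both varieties is to randomize the varieties within each block. The sketch below is a hypothetical illustration, not the procedure used for the slide: it assumes four blocks of four plots each with two plots per variety, and the function name and seed are our own.

```python
import random

def randomize_within_blocks(n_blocks=4, plots_per_block=4, varieties=(1, 2), seed=0):
    """Return one list of variety labels per block, shuffled independently."""
    rng = random.Random(seed)                     # local generator for repeatability
    repeats = plots_per_block // len(varieties)   # plots given to each variety in a block
    layout = []
    for _ in range(n_blocks):
        block = list(varieties) * repeats         # equal share of each variety
        rng.shuffle(block)                        # random positions inside the block
        layout.append(block)
    return layout

layout = randomize_within_blocks()
for number, block in enumerate(layout, start=1):
    print(f"Block {number}:", *block)
```

Because each variety appears equally often in every block, differences between blocks affect both varieties alike, and the comparison can be made within blocks.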

Page 15

The Wilcoxon-Mann-Whitney Test
• Statistical significance tests can be grouped into
  – Parametric tests, e.g., t-test, ANOVA
  – Non-parametric tests, e.g., Wilcoxon-Mann-Whitney test, sign test, runs test.

Page 16

When to Use Non-parametric Tests
• Parametric tests depend on the assumed probability distributions, e.g., normal distribution, t distribution, etc., and would give misleading results when the assumptions are violated.
• Non-parametric tests are called distribution-free tests and can be used in cases where the parametric tests are inappropriate.
• Parametric tests are more powerful than their non-parametric counterparts when the underlying assumptions are met.

Page 17

Wilcoxon-Mann-Whitney Test
• The Wilcoxon-Mann-Whitney test is the non-parametric equivalent of the t-test.
• The original data are rank-transformed before applying the test.
• The test statistic is U.
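The rank transformation and the U statistic can be sketched in a few lines. This is an illustration under our own assumptions: the two small samples are hypothetical, tied values receive the average of their ranks, and U is computed from the rank sum of the first sample as R1 - n1(n1 + 1)/2.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U1, U2) for samples x and y from the pooled rank transformation."""
    ranks = average_ranks(list(x) + list(y))
    n1, n2 = len(x), len(y)
    r1 = sum(ranks[:n1])                    # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1                       # the two U values always sum to n1 * n2
    return u1, u2

# Hypothetical data for illustration only.
x = [1.1, 2.3, 3.0]
y = [2.9, 4.2, 5.5, 6.1]
u1, u2 = mann_whitney_u(x, y)
print(f"U1 = {u1}, U2 = {u2}")
```

The smaller of the two U values is the one usually compared with a table of critical values; with larger samples a normal approximation is commonly used instead.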

