
STATISTICS HYPOTHESES TEST (III) Nonparametric Goodness-of-fit (GOF) tests Professor Ke-Sheng Cheng...

Date post: 27-Mar-2015

STATISTICS
HYPOTHESES TEST (III)

Nonparametric Goodness-of-fit (GOF) tests

Professor Ke-Sheng Cheng
Department of Bioenvironmental Systems Engineering
National Taiwan University


Description of nonparametric Problems

• Until now, in estimation and hypothesis-testing problems, we have assumed that the available observations come from distributions whose exact form is known, even though the values of some parameters are unknown. In other words, we have assumed that the observations come from a certain parametric family of distributions, and that statistical inference must be made about the values of the parameters defining that family.

04/10/23 Dept of Bioenvironmental Systems Engineering, National Taiwan University


• In many situations, we do not assume that the available observations come from a particular family of distributions. Instead, we want to study inferences that can be made about the distribution from which the observations come, without making special assumptions about the form of that distribution.


• For example, we might simply assume that observations form a random sample from a continuous distribution, without specifying the form of this distribution any further; and we then investigate the possibility that this distribution is a normal distribution.


• Problems in which the possible distributions of the observations are not restricted to a specific parametric family are called nonparametric problems, and the statistical methods that are applicable in such problems are called nonparametric methods.


Goodness-of-fit test

• A very common statistical problem in hydrological frequency analysis or water resources planning is whether the available observations (a random sample) come from a particular type of distribution. For example, before we can estimate the 24-hour rainfall depth with a 100-year return period, we must identify the type of probability distribution of the rainfall data (the annual maximum series) through statistical tests.


• Consider statistical problems in which each observation can be classified as belonging to one of a finite number of possible categories. Suppose a large population consists of items of k different categories, and let $p_i$ denote the probability that an observation will belong to category i (i = 1, 2, …, k). Of course, $p_i \ge 0$ for i = 1, 2, …, k and $\sum_{i=1}^{k} p_i = 1$.


• Therefore, it seems reasonable to base a test on the values of the differences $n_i - e_i$ for i = 1, 2, …, k, where $n_i$ is the observed number of observations in category i and $e_i$ is the number expected under H0, and to reject H0 when the magnitudes of these differences are relatively large.
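As a minimal R sketch of this idea, the differences between observed and expected category counts can be tabulated directly. The counts and the hypothesized probabilities p0 below are invented for illustration, and the expected count is assumed to be e_i = n·p_i under H0.

```r
# Hypothetical observed counts in k = 4 categories (illustrative data only)
n_i <- c(18, 30, 27, 25)
n   <- sum(n_i)

# Hypothesized category probabilities under H0 (assumed for illustration)
p0 <- c(0.25, 0.25, 0.25, 0.25)

# Expected counts under H0 and the differences the test is based on
e_i <- n * p0
d_i <- n_i - e_i

print(data.frame(category = 1:4, observed = n_i, expected = e_i, difference = d_i))
```

Because the observed and expected counts both add up to n, the signed differences always sum to zero; this is why the test looks at their magnitudes rather than their plain sum.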


Chi-square GOF test


[Figure: plot with axes labeled "Sample size" and "Number of categories"]


Kolmogorov-Smirnov GOF test

• The chi-square test compares the empirical histogram against the theoretical histogram.

• In contrast, the K-S test compares the empirical cumulative distribution function (ECDF) against the theoretical CDF.


• To measure the difference between $F_n(x)$ and $F(x)$, ECDF statistics based on the vertical distances between $F_n(x)$ and $F(x)$ have been proposed.
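One such statistic is the Kolmogorov-Smirnov distance, the largest vertical distance between the ECDF and the hypothesized CDF. The R sketch below computes it by hand and compares it with the value reported by ks.test. The simulated sample and the fully specified N(30, 10) hypothesis are assumptions made only for illustration.

```r
set.seed(1)
x <- rnorm(50, mean = 30, sd = 10)   # simulated sample (illustrative)
n <- length(x)

xs <- sort(x)
F0 <- pnorm(xs, mean = 30, sd = 10)  # hypothesized CDF at the order statistics

# The ECDF jumps at each order statistic, so the largest gap occurs either
# just after a jump (ECDF = i/n) or just before it (ECDF = (i-1)/n).
D_plus  <- max(seq_len(n) / n - F0)
D_minus <- max(F0 - (seq_len(n) - 1) / n)
D_n     <- max(D_plus, D_minus)

# Built-in one-sample K-S test for comparison
D_builtin <- unname(ks.test(x, "pnorm", 30, 10)$statistic)
print(c(by_hand = D_n, ks.test = D_builtin))
```

Checking both sides of every jump matters because the ECDF is a step function: the supremum of the vertical distance need not be attained at the top of a step.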


Values of $D_{n,\alpha}$ for the Kolmogorov-Smirnov test
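When a printed table of critical values is not at hand, a commonly quoted large-sample approximation for the two-sided critical value at level α is sqrt(−ln(α/2)/2)/sqrt(n). The R sketch below assumes that asymptotic formula; it does not reproduce exact small-sample table values, and ks_crit is a hypothetical helper name rather than a base R function.

```r
# Large-sample approximation to the two-sided K-S critical value.
# ks_crit is a hypothetical helper, not part of base R.
ks_crit <- function(n, alpha = 0.05) {
  sqrt(-log(alpha / 2) / 2) / sqrt(n)
}

# Approximate critical values at the 5% level for several sample sizes
print(ks_crit(n = c(50, 100, 200), alpha = 0.05))
```

Under this approximation, H0 would be rejected when the observed K-S distance exceeds the returned value; for small n, exact tabulated values should be preferred.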


Goodness-of-fit tests using R

• χ² test for GOF

– chisq.test

– chisq.test does not account for any parameters estimated from the data when determining the expected values.

– The degrees of freedom of the test statistic are therefore k − 1.

• Kolmogorov-Smirnov GOF test

– ks.test (one-sample test)
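A minimal sketch of the chisq.test item above, assuming hypothetical counts and fully specified category probabilities p0 (both invented for illustration). With the p argument, chisq.test compares the counts against p0 using k − 1 degrees of freedom.

```r
# Hypothetical observed counts in k = 4 categories (illustrative data only)
observed <- c(18, 30, 27, 25)
p0 <- rep(0.25, 4)                 # hypothesized probabilities, fully specified

res <- chisq.test(x = observed, p = p0)
print(res$statistic)               # Pearson statistic: sum((O - E)^2 / E)
print(res$parameter)               # degrees of freedom: k - 1 = 3
print(res$p.value)

# The same statistic computed by hand, as a check
expected     <- sum(observed) * p0
stat_by_hand <- sum((observed - expected)^2 / expected)
print(stat_by_hand)
```

If distribution parameters had been estimated from the same data, the reference distribution would need fewer degrees of freedom; chisq.test does not make that adjustment, so the p-value would have to be recomputed separately, for example with pchisq.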


ks.test(x, y, parameters, alternative = "…")

where x is the data vector to be tested, y is a character string naming the cumulative distribution function of the hypothesized distribution (for example "pnorm"), parameters are the values of the distribution parameters corresponding to y, and alternative is a character string ("less", "greater", or "two.sided") selecting a one-tailed or two-tailed test.

• Examples

ks.test(x, "pnorm", 30, 10, alternative = "two.sided")

ks.test(x, "pexp", 0.2, alternative = "greater")
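The calls above assume that a data vector x already exists. A self-contained sketch follows, using simulated data so that the calls run as written; the samples and the seed are assumptions made for illustration only.

```r
set.seed(42)

# Simulated samples; the parameters mirror the examples, but the data are made up
x_norm <- rnorm(100, mean = 30, sd = 10)
x_exp  <- rexp(100, rate = 0.2)

# Two-sided test against a normal distribution with mean 30 and sd 10
res_norm <- ks.test(x_norm, "pnorm", 30, 10, alternative = "two.sided")

# One-sided test against an exponential distribution with rate 0.2;
# "greater" takes as the alternative that the CDF of the data lies above
# the hypothesized CDF
res_exp <- ks.test(x_exp, "pexp", 0.2, alternative = "greater")

print(res_norm)
print(res_exp)
```

Both calls specify the distribution parameters in advance. When the parameters are instead estimated from the same sample, the p-value reported by ks.test is no longer valid as stated in the R documentation.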
