Statistics 11 Hypothesis Testing Discover the relationships that exist between events/things...

Date post: 15-Jan-2016
Upload: teresa-craig

Statistics 1 1

Hypothesis Testing

Discover the relationships that exist between events/things

Accomplished by:

Asking questions

Getting answers

In accord with certain rules ... the scientific method.

Question is a hypothesis

Answer is obtained by testing the hypothesis

Which gives the general model………

With some IMPORTANT restrictions about
– How the hypothesis is formed
– How the hypothesis is tested

Forming hypotheses is an "everyday-everybody" activity

I will do better on examinations if I relax the night before

Is a "hypothesis" ... a statement of a relationship

OK, BUT NOT a scientific hypothesis

A scientific hypothesis must meet certain criteria

A scientific hypothesis must be:

Specific

Empirically testable

Strictly related to some experimental procedure

Moreover, a scientific hypothesis actually consists of two separate mutually exclusive hypotheses

A null hypothesis

An alternative hypothesis

Null Hypothesis

A statement reflecting the possibility that there are no differences between the objects and/or events that are being observed

In formal terms:

Ho: µ1 = µ2

Where: µ1 and µ2 are the means (averages) of the two sets of observations being compared

Alternative Hypothesis

A statement reflecting the possibility that there are differences between the objects and/or events that are being observed

In formal terms:

H1: µ1 ≠ µ2 or H1: µ1 < µ2 or H1: µ1 > µ2

Testing between the null and alternative hypothesis

Accomplished through collection of data

Data must be scientifically acceptable, i.e.
– Observable
– Public
– Replicable

The test concentrates on the null hypothesis, which you either
– Reject
– Fail to reject

If there are no differences between your observations you
– Fail to reject the null hypothesis and
– Disregard the alternative hypothesis

If there are differences between your observations you
– Reject the null hypothesis and
– Accept the alternative hypothesis

Some things to note about hypothesis testing

Failing to reject the null hypothesis
– Does not mean that the null hypothesis is TRUE
– The null hypothesis can never be proven
– You can only fail to reject it

Rejecting the null hypothesis
– Means you accept the alternative hypothesis
– It does not establish the validity of a relationship

Validity is a function of experimental design

When testing a hypothesis

Two possible outcomes re: null hypothesis

Two possible states of real world

Thus four possible decisions – Two are incorrect ... i.e. errors

                      The Real World
Your Decision         Ho: true                         Ho: false
Reject Ho             Type I error: alpha (p level)    Correct (power): 1 - beta
Do not reject Ho      Correct: 1 - alpha               Type II error: beta

Some things to file for future reference

Type I error
– You can directly "set" this
– It is the chance (the probability of making the error) you are willing to accept when you test your hypothesis

Type II error
– You cannot directly "set" this
– You can attempt to control it through good experimental design
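
To see what "setting" the Type I error means, the sketch below simulates many experiments in which the null hypothesis is true by construction and counts how often a two-tailed z-test rejects it anyway. The population values (mean 50, standard deviation 8), the sample size, the number of trials and the function name are all assumptions chosen for illustration; they do not come from the slides.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=25, trials=4000, seed=1):
    """Fraction of true-null experiments that a two-tailed z-test rejects."""
    rng = random.Random(seed)
    mu, sigma = 50.0, 8.0                          # hypothetical population; Ho is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    sem = sigma / n ** 0.5                         # standard error of the mean
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (mean(sample) - mu) / sem
        if abs(z) > z_crit:                        # falls in the critical region
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f} (alpha was set to 0.05)")
```

The observed rate should land close to the chosen alpha, which is the sense in which the Type I error is under your direct control.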

Alpha Level or the level of significance is a probability value that is used to define the very unlikely sample outcomes if the null hypothesis is true.

Critical region is composed of extreme sample values that are very unlikely to be obtained if the null hypothesis is true. The boundaries for the critical region are determined by the alpha level. If sample data fall in the critical region, the null hypothesis is rejected.

Estimating Population Parameters from Samples

Sample mean

Unlikely to be exactly equal to population mean

BUT

Not more likely to be greater

Not more likely to be less

So sample mean is an unbiased estimate of population mean

Sample standard deviation

Unlikely to be exactly equal to population standard deviation

BUT

More likely to be less

Is usually an underestimate of the population parameter

So sample standard deviation is a biased estimate of the population standard deviation

To understand why this is so you must understand the nature and concept of a sampling distribution

What it all means

When you take a sample

Sample mean is an unbiased estimate of the population mean
– There is no reason to suspect it is higher or lower than the population mean

Sample standard deviation is a biased estimate of the population standard deviation
– The sample variance and standard deviation are more likely to be smaller than the population variance and standard deviation

And so you must correct any estimate of the population variance by increasing it (i.e. use "n-1" rather than "n" in the denominator when calculating the estimate)
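
The effect of the "n-1" correction can be checked on a small data set. The sketch below uses Python's standard `statistics` module; the eight scores are hypothetical values made up for illustration.

```python
from statistics import mean, pvariance, variance

scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical sample
n = len(scores)
m = mean(scores)
sum_sq = sum((x - m) ** 2 for x in scores)          # sum of squared deviations

uncorrected = sum_sq / n          # divides by n: tends to underestimate
corrected = sum_sq / (n - 1)      # divides by n - 1: the corrected estimate

# The library functions implement the same two formulas.
assert abs(uncorrected - pvariance(scores)) < 1e-9
assert abs(corrected - variance(scores)) < 1e-9

print(f"divide by n:     {uncorrected:.3f}")
print(f"divide by n - 1: {corrected:.3f}")
```

Dividing by n - 1 always gives the larger value, which is the "increase" the correction is meant to supply.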

Parametric Tests

Tests that make assumptions and test hypotheses about population parameters.

• z & t

• ANOVA

• F test

Involves an assessment of whether your observed data is related to your independent variable

Or is simply what might be expected by chance random sampling – i.e. no relation between

• Independent variable

• Dependent measure

Requires knowing or estimating population parameters

Mean: (μ)

Standard deviation: (σ)

Assumption of normality

For example: consider
• Pat (individual score) = 64
• Population mean (μ) = 50
• Population standard deviation (σ) = 8

And if population is normal, then you know

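~68% (68.26%) of data points fall within ±1 standard deviation of the mean

~95% (95.44%) of data points fall within ±2 standard deviations of the mean

~99% (99.74%) of data points fall within ±3 standard deviations of the mean

And remember: these are percentages, not absolute values

[Figure: Areas under the normal curve]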
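
These areas can be reproduced from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard library; the helper name `area_within` is an assumption made for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

def area_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {area_within(k):.4f}")
```

The computed proportions can differ from printed table values in the last digit because printed tables round each half of the area before doubling.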

Remember the z-distribution ?

Provided areas (proportion of scores) under a normal distribution according to z-score

And if we convert Pat's raw score to a z-score

z = (X – μ) ∕ σ = (64 – 50) ∕ 8 = 1.75

And then look up Pat's z-score in the Z-table

Meaning that ~96% (1.00 - .0401 = .9599) of scores in distribution are below Pat. (page 699, G&W).

OR PUT OTHERWISE
– If we were to randomly select a score from Pat's distribution
– The probability that the score would be greater than Pat's would only be 4 in 100
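
The table lookup for Pat can be checked numerically. This is a minimal sketch using the values given for Pat (score 64, μ = 50, σ = 8) and the standard library's `NormalDist` in place of the printed Z-table; the variable names are assumptions.

```python
from statistics import NormalDist

score, mu, sigma = 64, 50, 8

z = (score - mu) / sigma              # Pat's z-score
p_below = NormalDist().cdf(z)         # proportion of scores below Pat
p_above = 1 - p_below                 # proportion of scores above Pat

print(f"z = {z:.2f}")
print(f"proportion below Pat: {p_below:.4f}")
print(f"proportion above Pat: {p_above:.4f}")
```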

And we have done what we set out to do

Accomplished a statistical test

… i.e. comparing
– Observed data and
– What would be expected by chance

And thus,

Pat's score is significant at p < .05

P-value

The P-value is the probability, when Ho is true, of a test statistic value at least as contradictory to Ho as the value actually observed. The smaller the P-value, the more strongly the data contradict Ho. The P-value is denoted by P.

The P-value summarizes the evidence in the data about the null hypothesis. A moderate to large P-value means that the data are consistent with Ho.

E.g. a P-value of .26 or .83 indicates that the observed data would not be unusual if Ho were true. However, a P-value such as .001 means that such data would be very unlikely if Ho were true.

The P-value is the primary reported result of a significance test.

If the P-value is sufficiently small, one rejects Ho and accepts H1.

Standard Error of the Mean

When comparing an individual to a population, we needed to know two things about the population

Mean: μ

Standard deviation: σ

• Only slightly different when comparing a sample to a population

Since you are not concerned with a single individual but with a sample of individuals

The "population" of interest is not
– A population of individuals but rather
– A population of samples, i.e. a SAMPLING DISTRIBUTION

And the measure of "variability" is not
– The standard deviation of a population but rather
– The standard deviation of a sampling distribution, i.e. the STANDARD ERROR OF THE MEAN

The calculation details

The standard error of the mean

σm = σ ∕ √n

where: n = sample size

The z comparison

Z = (M – μ) ∕ σm

Which is not really different than what we did when comparing an individual to a population

Suppose: Herd of 10 cows (n = 10)

Mean milk production is 1.8 gal/cow

Question: How unique is this herd?

μ = 1.5, σ = .55, n=10, M=1.8
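
Plugging these values into the two formulas for the standard error and the z comparison gives the following worked arithmetic. The rounding to three decimals and the tail area read from a standard normal table are the only assumptions here.

```latex
\sigma_m = \frac{\sigma}{\sqrt{n}} = \frac{0.55}{\sqrt{10}} \approx 0.174
\qquad
Z = \frac{M - \mu}{\sigma_m} = \frac{1.8 - 1.5}{0.174} \approx 1.72
\qquad
P(Z > 1.72) \approx 0.043
```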

Thus the herd is pretty unique, since the likelihood that any random sample of 10 cows from the population would produce more milk is less than 5 times in 100

Or in "statistics"
– The probability of selecting a herd of greater milk producers is p < 0.05

Problem: compare a sample to a population

Method:

1. Use population parameters to calculate the standard error of the mean of a sampling distribution.

σm = σ ∕ √n

2. Use the standard error of the mean to compare sample mean with population mean by calculating a z-score

Z = (M – μ) ∕ σm

3. Use z-table to determine the probability that a random sample would yield a mean greater than the mean of the sample
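
The three steps of the method can be sketched as one small function. This is an illustrative sketch, not part of the original slides: the function name is hypothetical, and `NormalDist` from Python's standard library stands in for the printed z-table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_sample_vs_population(sample_mean, mu, sigma, n):
    """Compare a sample mean with a population whose mean and SD are known."""
    sem = sigma / sqrt(n)                    # step 1: standard error of the mean
    z = (sample_mean - mu) / sem             # step 2: z-score of the sample mean
    p_greater = 1 - NormalDist().cdf(z)      # step 3: P(random sample mean is greater)
    return sem, z, p_greater

# Usage with the herd example: μ = 1.5, σ = .55, n = 10, M = 1.8
sem, z, p = z_test_sample_vs_population(1.8, 1.5, 0.55, 10)
print(f"standard error = {sem:.3f}, z = {z:.2f}, p = {p:.3f}")
```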

A word on the logic and requirements of the statistic

• The "uniqueness" of your sample is judged by the probability that another random sample of the same size would have a mean at least as extreme as the mean of your sample.

• Or put otherwise: is your sample mean what would be expected by chance from a random selection?

• The more unique your sample, the more likely it reflects a relationship between:

• Your independent variable

• Your dependent measure

Two requirements

The population is normally distributed

You know the population

• Mean

• Standard deviation

The t statistic: an alternative to z

MUST know the population mean

But can estimate population standard deviation from sample data

A sample standard deviation is given by (as you know)

s = √( Σ(X – M)² ∕ (n – 1) )

And so an estimated standard error of the mean is:

sm = s ∕ √n

And to use the estimated standard error of the mean to compare your sample to the population, you must make one adjustment
– The adjustment is necessary to account for the fact that you are estimating

Comparisons When Estimating Population Parameters

• The adjustment part

• Estimating the standard deviation requires a different sampling distribution

• Sampling distribution is the t-distribution

The t-distribution

• Bell-shaped and symmetric, like the normal distribution

• Flatter and more spread out than the z-distribution, especially with few degrees of freedom

• Tails more elevated

And comparison becomes

t = (M – μ) ∕ sm

Thus:

Because you use an estimate of the population standard deviation to estimate the standard error of the mean

You must use the t-distribution to get the probability of randomly selecting a sample with a mean at least as extreme as the mean of your sample

But not conceptually different -- just an adjustment

For Example…

Suppose: Milk production of your herd of 10 cows is: 1.8, 1.7, 2.4, 2.3, 1.1, 1.7, 1.5, 2.4, 1.9, 1.2 gal (Mean = 1.8 gal/cow)

Question: How unique is your herd?

μ = 1.5 gal/day (given)

σ = unknown

σm = unknown, because we do not know the population standard deviation
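
With σ unknown, the t statistic has to be built from the sample itself. The sketch below applies sm = s ∕ √n and t = (M – μ) ∕ sm to the ten milk values listed in the example; the variable names are assumptions, and `statistics.stdev` uses the n - 1 denominator.

```python
from math import sqrt
from statistics import mean, stdev

milk = [1.8, 1.7, 2.4, 2.3, 1.1, 1.7, 1.5, 2.4, 1.9, 1.2]   # gal per cow
mu = 1.5                                                     # given population mean

n = len(milk)
m = mean(milk)            # sample mean M
s = stdev(milk)           # sample standard deviation (n - 1 denominator)
sm = s / sqrt(n)          # estimated standard error of the mean
t = (m - mu) / sm         # observed t
df = n - 1                # degrees of freedom

print(f"M = {m:.2f}, s = {s:.3f}, sm = {sm:.3f}, t = {t:.2f}, df = {df}")
```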

QUESTION now is what does that "t-value" mean

• i.e. What is the probability of a random sample of 10 cows being like your cows

• To find out, consult a t-table

The t-Distribution

• Z-table gives exact probabilities

• t-table gives ranges of probabilities

Enter table with degrees of freedom of your sample
– Degrees of freedom
• Number of values in a calculation that are free to vary
– That is:
– The degrees of freedom for a mean of 10 values is 9 ... because
– If the mean of 10 numbers is, for example, 5
– Nine of the numbers are "free" to be any value, but when these are established, the 10th number is determined if the mean is to be 5
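
The "nine free, one determined" argument can be made concrete. In this sketch the nine free values are arbitrary numbers picked for illustration, and the helper name is hypothetical.

```python
def last_value_given_mean(free_values, target_mean):
    """Return the one remaining value forced by a fixed mean."""
    n = len(free_values) + 1                  # total number of values
    return target_mean * n - sum(free_values)

free = [3, 8, 1, 6, 5, 9, 2, 7, 4]            # nine values chosen freely
tenth = last_value_given_mean(free, 5)        # the 10th is no longer free

print(f"10th value must be {tenth}")
print(f"mean of all ten = {sum(free + [tenth]) / 10}")
```

Changing any of the nine free values changes the tenth, but never the count of values that were free to vary, which stays at n - 1.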

Determine the probability of your t-observed from the tabled t-values that it falls between.

For example:

• With 10 data points there are 9 degrees of freedom (df=9)

• If the t-observed statistic is 2.04

The t-table gives a probability of that occurring by chance between 0.10 and 0.05 two-tailed (between 10 and 5 times in 100)

And for your cows

The observed t-value of 2.04, df=9 gives a tabled two-tailed probability of p > 0.05
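
Reading a range of probabilities from a t-table amounts to finding the two tabled values that the observed t falls between. The sketch below hard-codes the df = 9 row of a standard two-tailed t-table; the function and constant names are assumptions made for this example.

```python
# Row for df = 9 of a standard t-table: (two-tailed probability, critical t)
T_TABLE_DF9 = [(0.20, 1.383), (0.10, 1.833), (0.05, 2.262), (0.02, 2.821), (0.01, 3.250)]

def bracket_two_tailed_p(t_observed, row):
    """Return (lower, upper) bounds on the two-tailed probability of t_observed."""
    upper = 1.0
    for p, critical_t in row:              # probabilities run from large to small
        if abs(t_observed) >= critical_t:
            upper = p                      # t is at least this extreme, so p <= this
        else:
            return p, upper                # t falls short of this column
    return 0.0, upper                      # beyond the last tabled value

low, high = bracket_two_tailed_p(2.04, T_TABLE_DF9)
print(f"{low} < p < {high} (two-tailed)")
```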

Which is traditionally not sufficient to reject the Null hypothesis
– Any event that has a probability of occurring 5 times or more in 100 is considered by most psychologists an indication of a chance event

And thus your cows are "just average old cows"

Well maybe NOT……….

Directionality of Statistical Tests

Statistical tests have a property called "directionality"

• Nondirectional, called "two-tailed" tests

• Directional, called "one-tailed" tests

Directionality is determined BEFORE you run your experiment

Based upon prior knowledge or data

You predict the outcome of your observations, i.e. the effect of your independent variable
– Your independent variable "will improve performance"
– Your independent variable "will interfere with learning"
– etc

Your ability to predict an outcome means that you are better able to determine whether an event is a chance occurrence

More likely to reject Null hypothesis

In statistical terms, the region of the sampling distribution indicating that an event is something different than what would be expected by chance is larger in the predicted direction, because all of alpha is placed in that one tail

And cows again

If you had valid reasons to predict that your cows produced more milk

You would use a directional test, i.e. one-tailed test

And you would reject, at p < .05, the Null hypothesis that your herd was not different from what you would expect from another random sample of cows
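
The change in outcome comes entirely from the critical value. The sketch below compares the observed t of 2.04 (df = 9) against the tabled .05 critical values for a two-tailed and a one-tailed test; the critical values are standard t-table entries, and the names are assumptions for illustration.

```python
# Critical t-values at alpha = .05 for df = 9, taken from a standard t-table
T_CRITICAL_DF9 = {"two-tailed": 2.262, "one-tailed": 1.833}

def decide(t_observed, critical_t):
    """Reject Ho only if the observed t reaches the critical value."""
    return "reject Ho" if t_observed >= critical_t else "fail to reject Ho"

t_observed = 2.04   # the herd's t, predicted in advance to be positive
decisions = {kind: decide(t_observed, crit) for kind, crit in T_CRITICAL_DF9.items()}

for kind, decision in decisions.items():
    print(f"{kind}: {decision}")
```

The one-tailed comparison is only legitimate when the direction was predicted before the data were collected, as the preceding slides stress.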

• Z-test: a statistical test used to decide whether a sample mean does or does not come from a specified population, when the standard deviation of the population is known.

• When the standard deviation of the population is unknown, a t-test is performed instead.

• Hypothesis testing: the goal is to decide whether to reject the null hypothesis.

• Alpha level: traditionally set at .05; it also determines the acceptance and rejection regions.

• Critical value: the absolute value of the test statistic that defines the boundary of the rejection region(s).

• Non-directional (two-tailed): the null hypothesis is rejected if the sample mean falls far enough either above or below the hypothesized population mean.

• Directional (one-tailed): the direction of the difference that leads to rejection is specified prior to experimentation.

Comparing a sample to a population:

When population parameters are known…..

Sample to population: assume a normal population and known standard deviation

Z = (M – μ) ∕ σm

When population parameters are unknown….

Sample to population: assume a normal population and unknown standard deviation

t = (M – μ) ∕ sm

