
Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Date post: 21-Dec-2015
Transcript
Page 1: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Chapter 6-7-8 Sampling Distributions and Hypothesis Testing

Page 2: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

When we have a frequency distribution, or histogram, we can determine probabilities. Look at the M&M example.

What is one of the most common shapes of frequency distributions?

Page 3: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

The normal distribution.

Again, all normal distributions are characterized by the mean and the standard deviation. There are an infinite number of normal distributions.

But some are very special to us, like the Standardized Normal Distribution.
– ALL normal distributions can be standardized.
– All scores are put in terms of standard deviation units from the mean.
– So we know the proportions, and hence the probabilities, associated with scores that fall in a normal distribution. We just did that in Chapter 5.

Page 4: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

100% of our observations appear in the normal distribution.

Proportions and probabilities are the same.

What proportion of scores fall above a z-score of 1?

What is the probability that a randomly chosen z-score will be 1 or higher?

What is the probability that a randomly chosen z-score will fall between 0 and .5?

There is a .05 probability (or a 5% chance) of a z-score being this high or higher?
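The questions above can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are illustrative and not from the slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z_dist = NormalDist(mu=0, sigma=1)

# Proportion of scores at or above z = 1 (the upper tail).
p_above_1 = 1 - z_dist.cdf(1)

# Proportion of scores between z = 0 and z = 0.5.
p_0_to_half = z_dist.cdf(0.5) - z_dist.cdf(0)

# The z-score that leaves 5% of the distribution above it.
z_top_5 = z_dist.inv_cdf(0.95)

print(f"P(z >= 1)        = {p_above_1:.4f}")
print(f"P(0 <= z <= 0.5) = {p_0_to_half:.4f}")
print(f"z with 5% above  = {z_top_5:.3f}")
```

Because proportions and probabilities are the same thing here, each printed value answers both the "what proportion" and the "what probability" form of the question.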

Page 5: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

More

We can also look at specific scores (X), convert them into z-scores, and find the probability of getting a score that high or higher, lower than that score, and so on.
– Given sigma = 100 and the mean = 500, what is the probability of getting a 600 or higher?
– 1) Convert to z: (600-500)/100 = 1.
– 2) What proportion of the distribution falls at or above a z-score of 1?
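The two steps above can be sketched in a few lines. The helper name `prob_at_or_above` is hypothetical; it simply converts a raw score to z and reads the upper-tail proportion from the standard normal distribution.

```python
from statistics import NormalDist

def prob_at_or_above(x, mean, sigma):
    """Return (z, probability of a score at or above x) for a normal distribution."""
    z = (x - mean) / sigma          # step 1: convert the raw score to a z-score
    p = 1 - NormalDist().cdf(z)     # step 2: proportion at or above that z
    return z, p

z, p = prob_at_or_above(600, mean=500, sigma=100)
print(f"z = {z:.2f}, P(X >= 600) = {p:.4f}")
```

The same function answers "lower than that score" questions by taking `1 - p`.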

Page 6: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

The past

What we have been doing is descriptive statistics.

We have come up with distributions, measures of central tendency and measures of variability, all of which describe a population or a sample.

We can use these, as we have found out, to find the probability of a score, or range of scores, etc.

But statistics, z-scores, probabilities, etc., can be used for more interesting purposes.

Page 7: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

The future

Inferential statistics – Estimate population parameters from a sample, or determine if two samples are different
– Hypothesis testing – Is the population parameter equal to some specific value?
– Ex. This class (random sample) takes a study skills course: Seating, classroom tips, study habits
– G.P.A. – Is the G.P.A. of this class now different from that of MSU students generally (population)?

Page 8: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Well, let's think about this.

Of course, if we were to randomly sample 50 MSU students and get their mean GPA, it would be a little different than the actual population mean GPA.

There will always be a little error; the sample mean will probably not equal the population mean until all of the members in the population are in our sample.

The quantification of this discrepancy is called Sampling Error –

The discrepancy, or amount of error, between a sample statistic and its corresponding parameter.
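Sampling error can be seen directly in a small simulation. This is only a sketch: the population below is invented (10,000 simulated GPAs with an assumed center of 2.74 and an assumed spread of 0.6), since the slides give no real data.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# ASSUMPTION: a made-up population of 10,000 GPAs, clipped to the 0.0-4.0 scale.
population = [min(4.0, max(0.0, rng.gauss(2.74, 0.6))) for _ in range(10_000)]
population_mean = mean(population)

# One random sample of 50 students, drawn without replacement.
sample = rng.sample(population, 50)
sample_mean = mean(sample)

# Sampling error: the sample statistic minus the corresponding parameter.
sampling_error = sample_mean - population_mean

# When every member of the population is in the sample, the error disappears.
census = rng.sample(population, len(population))
census_error = mean(census) - population_mean

print(f"population mean = {population_mean:.3f}")
print(f"sample mean     = {sample_mean:.3f}")
print(f"sampling error  = {sampling_error:+.3f}")
print(f"census error    = {census_error:+.3f}")
```

Changing the seed gives a different sample and therefore a different sampling error, which is exactly the point of the definition.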

Page 9: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Well, let's think about this.

Also, we can take numerous samples. For example, the next day I can get the GPAs of 40 different students. The mean GPA for this sample will also be a little different than the true population mean. ALSO, this second sample will have a mean that is slightly different from our first sample mean.
– In fact, we could take a huge number of samples, and get a huge number of sample means.

So, how do we use a given sample to estimate the population if every sample will be a little different?

Page 10: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Sampling Distribution

To answer this we have to create a sampling distribution of a statistic (mean, median).

In particular, we will use a Sampling Distribution of Sample Means =
– This is the collection of sample means for all the possible random samples of a particular size (n) that could be obtained from a population.
OR
– The distribution of a statistic (the mean) over repeated sampling from a specified population.

Sampling distribution of sample means (most common):

G.P.A.: Say MSU population mean is 2.74; distribution of means of an infinity of random samples.
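We cannot draw an infinity of samples, but a large number of them gives a reasonable picture. The sketch below assumes a made-up population of GPAs (center 2.74 as in the example, spread 0.6 as an illustrative guess) and approximates the sampling distribution of the mean with 2,000 samples of size 50.

```python
import random
from statistics import mean, pstdev

rng = random.Random(1)  # fixed seed so the run is repeatable

# ASSUMPTION: invented population of 10,000 GPAs, clipped to the 0.0-4.0 scale.
population = [min(4.0, max(0.0, rng.gauss(2.74, 0.6))) for _ in range(10_000)]
population_mean = mean(population)

n = 50                # size of each sample
num_samples = 2_000   # stand-in for "an infinity" of samples

# Each entry is the mean of one random sample of n students.
sample_means = [mean(rng.sample(population, n)) for _ in range(num_samples)]

mean_of_means = mean(sample_means)
spread_of_means = pstdev(sample_means)

print(f"population mean      = {population_mean:.3f}")
print(f"mean of sample means = {mean_of_means:.3f}")
print(f"SD of sample means   = {spread_of_means:.3f}")
print(f"SD of population     = {pstdev(population):.3f}")
```

The sample means pile up around the population mean and vary much less than individual GPAs do.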

Page 11: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

We have been looking at distributions of SCORES; now we are going to look at distributions of all possible SAMPLE MEANS.

We are dealing with a particular type of distribution: a sampling distribution = a distribution of statistics (e.g., means) obtained by selecting all the possible samples of a specific size from a population.
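For a tiny population, "all the possible samples" can actually be listed. The sketch below uses an invented population of four scores and enumerates every sample of size 2, assuming sampling with replacement (so the same score may be drawn twice and order matters).

```python
from itertools import product
from collections import Counter
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 8]   # ASSUMPTION: toy population chosen for illustration
n = 2                       # sample size

# Every possible ordered sample of size n, drawn with replacement.
all_samples = list(product(population, repeat=n))

# The sampling distribution of the mean: one mean per possible sample.
sample_means = [sum(s) / n for s in all_samples]

# Frequency table of the sample means.
frequencies = Counter(sample_means)
for value in sorted(frequencies):
    print(f"mean = {value:.1f}  occurs {frequencies[value]} time(s)")

print(f"number of samples    = {len(all_samples)}")
print(f"population mean      = {mean(population)}")
print(f"mean of sample means = {mean(sample_means)}")
print(f"SD of sample means   = {pstdev(sample_means):.3f}")
print(f"sigma / sqrt(n)      = {pstdev(population) / sqrt(n):.3f}")
```

The middle values occur most often and the extreme means are rare, which is why a sample mean far from the center counts as unlikely.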

Page 12: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

DRAW SAMPLING DISTRIBUTION OF MEANS: N = 50

Distribution of means if we sample 50 students and assume the population mean is 2.74:

Sample 1: 2.77
Sample 2: 2.91
Sample 3: 2.55
Sample 4: 3.77

Page 13: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

NOTE: This is similar to what we were doing with z scores. We were looking at where a z score falls in a distribution of scores. Now we are looking at where a sample statistic (in this case the mean) falls in a distribution of sample means.

If close to the middle of the distribution, we retain the null hypothesis (no difference).

If far from the middle, the sample is unlikely, so we reject the null hypothesis.

Page 14: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Sampling Error: Variability of a statistic from sample to sample. Due to chance.

Standard Error: The standard deviation of the sampling distribution of sample means drawn from the population (sigma / sqrt n).

Page 15: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

As usual, n = sample size, which should be taken into account when calculating standard deviations.

Obviously, the larger the sample, the closer the sample means will be to the population mean (i.e., less error). So, we have to take sample size into account.

Law of large numbers = the larger the sample size, the more probable it is that the sample mean will be close to the population mean.
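The law of large numbers can be illustrated by counting how often a sample mean lands "close" to the population mean as n grows. This is a sketch under stated assumptions: scores are drawn from a normal population with mean 2.74 and standard deviation 0.6 (the spread is invented), and "close" is arbitrarily defined as within 0.1 GPA points.

```python
import random
from statistics import mean

rng = random.Random(2)  # fixed seed so the run is repeatable

MU = 2.74        # population mean used in the GPA example
SIGMA = 0.6      # ASSUMPTION: population standard deviation, chosen for illustration
TOLERANCE = 0.1  # ASSUMPTION: what counts as "close" to the population mean
TRIALS = 500     # number of samples drawn for each sample size

closeness = {}
for n in (5, 50, 500):
    hits = 0
    for _ in range(TRIALS):
        sample_mean = mean(rng.gauss(MU, SIGMA) for _ in range(n))
        if abs(sample_mean - MU) <= TOLERANCE:
            hits += 1
    closeness[n] = hits / TRIALS  # proportion of sample means that were close

for n, proportion in closeness.items():
    print(f"n = {n:3d}: {proportion:.1%} of sample means within {TOLERANCE} of mu")
```

The proportion climbs toward 100% as n increases, which is the sense in which larger samples carry less error.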

Page 16: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

When n = 1, se = sd

As n increases, the standard error should decrease. The equation takes this into account.
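A short sketch of the equation sigma / sqrt n makes both points concrete. The value sigma = 100 is borrowed from the earlier z-score example; the sample sizes are arbitrary choices for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sigma / sqrt(n)

sigma = 100  # population standard deviation from the earlier example

for n in (1, 4, 25, 100):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.1f}")
```

With n = 1 the standard error equals the standard deviation, and because n sits under a square root, quadrupling the sample size only halves the standard error.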

There is this great mathematical Theorem that allows us to know the general properties of our sampling distribution as our samples (and population) get larger and larger.

Page 17: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Central Limit Theorem:

From the book: For any population with a mean (mu) and a standard deviation (sigma), the distribution of sample means for sample size n will have a mean of mu and a standard deviation of sigma/sqrt n and will approach a normal distribution as n approaches infinity.
– So what is this saying?

As n increases, a sample's mean and standard deviation approach those of the population, and the sample means cluster more tightly around mu.
– With a sample size of 30+, the distribution of sample means is practically normal.
– So, we have a clue about the mean of the sampling distribution, the standard deviation, and its shape (normal). What can we do with this information?

Page 18: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

So what is this saying?

As n increases, a sample's mean and standard deviation approach those of the population, and the sample means cluster more tightly around mu.

With a sample size of 30+, the distribution of sample means is practically normal.

So, we have a clue about the mean of the sampling distribution, the standard deviation, and its shape (normal). What can we do with this information?

This allows us to know the distribution of sample means for any population, regardless of the mean and SD, and even if the population distribution is not normal.
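The claim about non-normal populations can be checked with a simulation. The sketch below assumes a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen only because it is clearly not normal) and looks at means of samples of size 30.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(3)  # fixed seed so the run is repeatable

MU = 1.0      # mean of the exponential population (rate 1)
SIGMA = 1.0   # standard deviation of the same population
N = 30        # sample size
NUM_SAMPLES = 5_000

# Means of many samples drawn from the skewed population.
sample_means = [
    mean(rng.expovariate(1.0) for _ in range(N)) for _ in range(NUM_SAMPLES)
]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)
predicted_se = SIGMA / sqrt(N)

# In a normal distribution about 68% of values fall within 1 SD of the mean.
within_one_se = sum(
    abs(m - MU) <= predicted_se for m in sample_means
) / NUM_SAMPLES

print(f"mean of sample means = {mean_of_means:.3f} (theorem predicts {MU})")
print(f"SD of sample means   = {sd_of_means:.3f} (theorem predicts {predicted_se:.3f})")
print(f"share within 1 SE    = {within_one_se:.1%}")
```

Even though individual scores are heavily skewed, the sample means have roughly the center, spread, and shape that the theorem describes.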

Page 19: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Back to our example:

MSU Mean: 2.53
Class Mean: 3.02

There may be no relationship between this class (the intervention) and G.P.A.

Page 20: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

Goal:

Determine whether this difference is due to chance (sampling error).

Can determine with probabilities how likely/unlikely it is that this difference is due to chance.

If this class is different, then we can classify it as a different population with different population parameters (higher mean).

A statistical test will answer this question for us:

Page 21: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

HYPOTHESIS TESTING!

A hypothesis test = a statistical procedure that uses sample data to evaluate hypotheses about a population parameter.

General steps.
– 1) Generate a hypothesis about the population mean.
– 2) So, we hypothesize that our sample mean will be close to this guess regarding the population mean.
– 3) Obtain a sample and sample mean.
– 4) Compare the sample and population means.
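The four steps can be sketched as a z-test on the class example. The two means come from the slides, but the population standard deviation (0.6) and the class size (50) are NOT given there; they are assumed purely so the arithmetic can run, and the result depends entirely on them.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: hypothesize the population mean and expect the sample mean near it.
mu_0 = 2.53          # MSU mean (from the slides)

# Step 3: obtain a sample and its mean.
sample_mean = 3.02   # class mean (from the slides)
sigma = 0.6          # ASSUMPTION: population standard deviation, not in the slides
n = 50               # ASSUMPTION: class size, not in the slides

# Step 4: compare the sample mean with the hypothesized mean in standard error units.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Probability of a sample mean at least this far from mu_0 in either direction.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"standard error = {standard_error:.4f}")
print(f"z = {z:.2f}")
print(f"two-tailed p = {p_two_tailed:.2g}")
```

A large z places the class mean far from the middle of the sampling distribution, which is the situation in which the difference is unlikely to be sampling error alone.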

Page 22: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

1) Set up Null Hypothesis:

The null hypothesis always says the opposite of that in which we are interested:
– We can never prove something is true; we can only prove that it is false

In other words:
– There is no difference between our groups or:
– If we are only interested in whether our group is better:

Null Hypothesis would say our group is equal to or worse than other.
– We are usually working to reject the null hypothesis
– Note: Assuming the null is true, we create our sampling distribution. In this case the sampling distribution of means.
– M class = 2.53

Page 23: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

2. Set up the “Alternative hypothesis” (What we want to find)

M class ≠ 2.53

We do this before we collect our data. The mean could be higher or lower. Maybe our class hurts people's G.P.A.

Page 24: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

3. Set a criterion level for our Decision:

How far away does the mean have to be for us to reasonably doubt that this sample came from the same population?

When are we going to say this sample is the same as the population (just sampling error), and when are we going to say this sample is different from the population?

Page 25: Chapter 6-7-8 Sampling Distributions and Hypothesis Testing.

3. Set a criterion level for our Decision:

When are we going to say this sample is the same as the population (just sampling error), and when are we going to say this sample is different from the population?

Significance level – Predetermined probability that represents a sample result that is so rare or unusual that it casts doubt on the accuracy of Ho: alpha
– The probability with which we are willing to reject Ho when it is correct.
– Rejection region: the set of outcomes from an experiment that will lead to a rejection of Ho.

Typically:
– Choose: alpha = 5%
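With alpha = 5%, the rejection region can be expressed as cutoff z-scores. The sketch below assumes the sampling distribution of the mean is normal and that the test is two-tailed (the mean could be higher or lower); the helper name `in_rejection_region` is hypothetical.

```python
from statistics import NormalDist

alpha = 0.05  # significance level chosen before collecting data

# Two-tailed test: alpha is split evenly between the two tails.
z_crit_two_tailed = NormalDist().inv_cdf(1 - alpha / 2)

# One-tailed test (only "higher" counts): all of alpha sits in one tail.
z_crit_one_tailed = NormalDist().inv_cdf(1 - alpha)

def in_rejection_region(z, alpha=0.05):
    """Return True if a two-tailed test at this alpha would reject Ho."""
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) >= cutoff

print(f"two-tailed cutoff: z <= -{z_crit_two_tailed:.2f} or z >= {z_crit_two_tailed:.2f}")
print(f"one-tailed cutoff: z >= {z_crit_one_tailed:.3f}")
print("z = 2.50 rejects Ho:", in_rejection_region(2.50))
print("z = 1.00 rejects Ho:", in_rejection_region(1.00))
```

Any sample mean whose z-score falls beyond the cutoff is in the rejection region; anything closer to the middle is treated as ordinary sampling error.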

