
Introduction to Inference Sampling Distributions.

Date post: 17-Jan-2018
Upload: clyde-mark-wilcox
Page 1: Introduction to Inference Sampling Distributions.

Introduction to Inference

Sampling Distributions

Page 2: Introduction to Inference Sampling Distributions.

Inference with a Single Observation

• Each observation Xi in a random sample is a representative of the unobserved values in the population

• How different would this observation be if we took a different random sample?

[Diagram: Population (Parameter: μ) → Sampling → Observation Xi; Inference runs from the observation back to the unknown (?) population parameter]

Page 3: Introduction to Inference Sampling Distributions.

Inference with Sample Mean

• Sample mean is our estimate of population mean

• How much would the sample mean change if we took a different sample?

• Key to this question: Sampling Distribution of x̄

[Diagram: Population (Parameter: μ) → Sampling → Sample (Statistic: x̄); Inference (Estimation) runs from the sample statistic back to the unknown (?) population parameter]

Page 4: Introduction to Inference Sampling Distributions.

Sampling Distribution of a Sample Statistic

• Sampling Distribution of a Sample Statistic: The distribution of values for a sample statistic obtained from repeated samples, all of the same size and all drawn from the same population

Example: Consider the set {1, 2, 3, 4}:

1) Make a list of all samples of size 2 that can be drawn from this set (sample with replacement)

2) Construct the sampling distribution for the sample mean for samples of size 2

3) Construct the sampling distribution for the minimum for samples of size 2

Page 5: Introduction to Inference Sampling Distributions.

Table of All Possible Samples

Sample    x̄     Minimum   Probability
{1, 1}    1.0   1         1/16
{1, 2}    1.5   1         1/16
{1, 3}    2.0   1         1/16
{1, 4}    2.5   1         1/16
{2, 1}    1.5   1         1/16
{2, 2}    2.0   2         1/16
{2, 3}    2.5   2         1/16
{2, 4}    3.0   2         1/16
{3, 1}    2.0   1         1/16
{3, 2}    2.5   2         1/16
{3, 3}    3.0   3         1/16
{3, 4}    3.5   3         1/16
{4, 1}    2.5   1         1/16
{4, 2}    3.0   2         1/16
{4, 3}    3.5   3         1/16
{4, 4}    4.0   4         1/16

This table lists all possible samples of size 2, the mean and the minimum for each sample, and the probability of each sample occurring (all equally likely)

# of possible samples (with replacement) = N^n = 4^2 = 16

Page 6: Introduction to Inference Sampling Distributions.

Sampling Distribution

• Summarize the information in the previous table to obtain the sampling distribution of the sample mean and the sample minimum:

Sampling Distribution of the Sample Mean

x̄      P(x̄)
1.0    1/16
1.5    2/16
2.0    3/16
2.5    4/16
3.0    3/16
3.5    2/16
4.0    1/16

[Figure: Histogram: Sampling Distribution of the Sample Mean; horizontal axis x̄ from 1.0 to 4.0, vertical axis P(x̄) from 0.00 to 0.25]
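The enumeration in this example can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the helper name sampling_distribution is made up here, and exact fractions are used so the probabilities can be compared with the 1/16 entries in the table (note that Python prints fractions in lowest terms, so 2/16 appears as 1/8).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
n = 2

# All ordered samples of size n drawn with replacement: N**n = 4**2 = 16.
samples = list(product(population, repeat=n))

def sampling_distribution(statistic):
    """Map each value of the statistic to its exact probability.

    Hypothetical helper for illustration: every sample is equally likely,
    so a probability is (number of samples giving the value) / (number of samples).
    """
    counts = Counter(statistic(s) for s in samples)
    return {value: Fraction(count, len(samples))
            for value, count in sorted(counts.items())}

mean_dist = sampling_distribution(lambda s: Fraction(sum(s), len(s)))
min_dist = sampling_distribution(min)

for value, prob in mean_dist.items():
    print(f"mean = {float(value):.1f}   P = {prob}")
for value, prob in min_dist.items():
    print(f"min  = {value}     P = {prob}")
```

The same helper also gives the sampling distribution of the minimum asked for in step 3, because it only needs a function that maps a sample to a number.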

Page 7: Introduction to Inference Sampling Distributions.

Sampling Distribution of Sample Mean

• Distribution of values taken by the statistic in all possible samples of size n from the same population

• Model assumption: our observations xi are sampled from a population with mean μ and variance σ²

[Diagram: Population (Unknown Parameter: μ) → Sample 1 of size n → x̄, Sample 2 of size n → x̄, ..., Sample 8 of size n → x̄, ...; Distribution of these values?]

Page 8: Introduction to Inference Sampling Distributions.

Mean of Sample Mean

• First, we examine the center of the sampling distribution of the sample mean.

• Center of the sampling distribution of the sample mean is the unknown population mean:

mean(x̄) = μ

• Over repeated samples, the sample mean will, on average, be equal to the population mean, but there are no guarantees for any one sample!

Page 9: Introduction to Inference Sampling Distributions.

Variance of Sample Mean

• Next, we examine the spread of the sampling distribution of the sample mean

• The variance of the sampling distribution of the sample mean is

variance(x̄) = σ²/n

• As sample size increases, variance of the sample mean decreases!

• Averaging over many observations is more accurate than just looking at one or two observations
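Both results, mean(x̄) = μ and variance(x̄) = σ²/n, can be checked exactly on a small population. The sketch below reuses the set {1, 2, 3, 4} from the earlier example as the whole population and samples with replacement; the function name is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]
N = len(population)

mu = Fraction(sum(population), N)                      # population mean
sigma2 = sum((x - mu) ** 2 for x in population) / N    # population variance

def mean_and_variance_of_sample_mean(n):
    """Enumerate every sample of size n (with replacement) and return
    the exact mean and variance of the resulting sample means."""
    means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
    center = sum(means) / len(means)
    spread = sum((m - center) ** 2 for m in means) / len(means)
    return center, spread

for n in (1, 2, 3, 4):
    center, spread = mean_and_variance_of_sample_mean(n)
    print(f"n = {n}: mean = {center}, variance = {spread}, sigma^2/n = {sigma2 / n}")
```

Because every possible sample is enumerated, the agreement is exact rather than approximate; with a large population this enumeration is not feasible and a simulation would be used instead.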

Page 10: Introduction to Inference Sampling Distributions.

• Comparing the sampling distribution of the sample mean when n = 1 (parent population) vs. n = 10

Page 11: Introduction to Inference Sampling Distributions.

Law of Large Numbers

• Remember the Law of Large Numbers:

• If one draws independent samples from a population with mean μ, then as the number of observations increases, the sample mean x̄ gets closer and closer to the population mean μ

• This is easier to see now since we know that

mean(x̄) = μ

variance(x̄) = σ²/n → 0 as n gets large
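A small simulation can make the Law of Large Numbers visible. The sketch below is an illustration under assumed inputs, not data from the slides: the population is a fair six-sided die (μ = 3.5), and the seed and checkpoints are arbitrary choices.

```python
import random

rng = random.Random(42)   # fixed seed so the run is repeatable (arbitrary choice)
mu = 3.5                  # mean of a fair six-sided die (assumed population)

checkpoints = {10, 100, 1_000, 10_000, 100_000}
running_means = {}

total = 0
for i in range(1, 100_001):
    total += rng.randint(1, 6)          # one more independent observation
    if i in checkpoints:
        running_means[i] = total / i    # sample mean of the first i observations

for n, m in running_means.items():
    print(f"n = {n:>6}   sample mean = {m:.4f}   |error| = {abs(m - mu):.4f}")
```

The error need not shrink at every single checkpoint, but for large n the sample mean should sit very close to μ, as the shrinking variance σ²/n suggests.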

Page 12: Introduction to Inference Sampling Distributions.

Example

• Population: seasonal home-run totals for 7032 baseball players from 1901 to 1996

• Take different samples from this population and compare the sample mean we get each time

• In real life, we can’t do this because we don’t usually have the entire population!

Sample Size                     Mean    Variance
100 samples of size n = 1       3.69    46.8
100 samples of size n = 10      4.43    4.43
100 samples of size n = 100     4.42    0.43
100 samples of size n = 1000    4.42    0.06

Population Parameter: μ = 4.42

Page 13: Introduction to Inference Sampling Distributions.

Distribution of Sample Mean

• We now know the center and spread of the sampling distribution for the sample mean.

• What about the shape of the distribution?

• If our data x1, x2, …, xn follow a Normal distribution, then the sample mean x̄ will also follow a Normal distribution!

Page 14: Introduction to Inference Sampling Distributions.

Example

• Mortality in US cities (deaths/100,000 people)

• This variable seems to approximately follow a Normal distribution, so the sample mean will also approximately follow a Normal distribution irrespective of the sample size drawn.

Page 15: Introduction to Inference Sampling Distributions.

Central Limit Theorem

• What if the original data doesn’t follow a Normal distribution?

• HR/Season for sample of baseball players

• If the sample is large enough, it doesn’t matter!

Page 16: Introduction to Inference Sampling Distributions.

Central Limit Theorem

• If the sample size is large enough (a common rule of thumb is n ≥ 30), then the sample mean x̄ has an approximately Normal distribution

• This is true no matter what the shape of the distribution of the original data, provided the population variance is finite!
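The effect can be explored with a simulation. The sketch below is only an illustration under assumed inputs: it uses an Exponential(1) population as a stand-in for strongly right-skewed data (it is not the home-run data), and it uses skewness as a rough measure of how far the distribution of x̄ is from the symmetric Normal shape. The function names, seed, and number of repetitions are arbitrary.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable (arbitrary choice)

def skewness(values):
    """Skewness of a list of numbers: 0 for a symmetric shape such as the Normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_sample_means(n, reps=5000):
    """Draw `reps` samples of size n from Exponential(1) and return their means."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

skew_by_n = {}
for n in (1, 10, 30):
    means = simulate_sample_means(n)
    skew_by_n[n] = skewness(means)
    print(f"n = {n:>2}   mean of sample means = {statistics.fmean(means):.3f}"
          f"   skewness = {skew_by_n[n]:.3f}")
```

For this population the skewness of x̄ is 2/√n in theory, so it shrinks toward 0 as n grows. A histogram of the simulated means would show the same thing visually.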

Page 17: Introduction to Inference Sampling Distributions.

Example: Home Runs per Season

• Take many different samples from the seasonal HR totals for a population of 7032 players

• Calculate sample mean for each sample

[Figure: distributions of the sample means for n = 1, n = 10, and n = 100]

Page 18: Introduction to Inference Sampling Distributions.

Important Definition & Theorem

Central Limit Theorem
The sampling distribution of sample means will more closely resemble the normal distribution as the sample size increases.

Sampling Distribution of Sample Means
If all possible random samples, each of size n, are taken from any population with a mean μ and a standard deviation σ, the sampling distribution of sample means will:

1. have a mean μ_x̄ equal to μ

2. have a standard deviation σ_x̄ equal to σ/√n

Further, if the sampled population has a normal distribution, then the sampling distribution of x̄ will also be normal for samples of all sizes

Page 19: Introduction to Inference Sampling Distributions.

Summary

• The standard deviation of the sampling distribution of x̄ (also called the standard error of the mean) is equal to the standard deviation of the original population divided by the square root of the sample size: σ_x̄ = σ/√n

Notes:
– The distribution of x̄ becomes more compact as n increases. (Why?)
– The variance of x̄: σ_x̄² = σ²/n

• The distribution of x̄ is (exactly) normal when the original population is normal

• The CLT says: the distribution of x̄ is approximately normal regardless of the shape of the original distribution, when the sample size is large enough!

• The mean of the sampling distribution of x̄ is equal to the mean of the original population: μ_x̄ = μ

Page 20: Introduction to Inference Sampling Distributions.

Standard Error of the Mean

Notes:

• The n in the formula for the standard error of the mean is the size of the sample

• The proof of the Central Limit Theorem is beyond the scope of this course

• The following example illustrates the results of the Central Limit Theorem

Standard Error of the Mean: The standard deviation of the sampling distribution of sample means: σ_x̄ = σ/√n
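As a quick numerical illustration of the formula σ_x̄ = σ/√n, the sketch below tabulates the standard error for a few sample sizes. The function name standard_error and the value σ = 15 are illustrative choices, not part of the definition.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation / sqrt(sample size)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)

sigma = 15  # illustrative population standard deviation
for n in (1, 4, 9, 36, 100):
    print(f"n = {n:>3}   standard error = {standard_error(sigma, n):.2f}")
```

Because n enters through a square root, the sample size has to be quadrupled to cut the standard error in half.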

Page 21: Introduction to Inference Sampling Distributions.

Graphical Illustration of the Central Limit Theorem

[Figure: four panels showing the Original Population and the Distribution of x̄ for n = 2, n = 10, and n = 30]

Page 22: Introduction to Inference Sampling Distributions.

7.3 ~ Applications of the Central Limit Theorem

• When the sampling distribution of the sample mean is (exactly) normally distributed, or approximately normally distributed (by the CLT), we can answer probability questions using the standard normal distribution and the standard score z = (x̄ − μ)/(σ/√n)

Page 23: Introduction to Inference Sampling Distributions.

Example 2

Example: Consider a normal population with μ = 50 and σ = 15. Suppose a sample of size 9 is selected at random. Find:

1) P(45 ≤ x̄ ≤ 60)

2) P(x̄ ≤ 47.5)

Solutions: Since the original population is normal, the distribution of the sample mean is also (exactly) normal

1) μ_x̄ = μ = 50

2) σ_x̄ = σ/√n = 15/√9 = 15/3 = 5

Page 24: Introduction to Inference Sampling Distributions.

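Example 2

z = (x̄ − μ)/(σ/√n)

P(45 ≤ x̄ ≤ 60) = P((45 − 50)/5 ≤ z ≤ (60 − 50)/5)

= P(−1.00 ≤ z ≤ 2.00)

= 0.3413 + 0.4772 = 0.8185

[Figure: normal curve with the x̄ axis marked at 45, 50, and 60 and the z axis marked at −1.00, 0, and 2.00; shaded areas 0.3413 (from z = −1.00 to 0) and 0.4772 (from z = 0 to 2.00)]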

Page 25: Introduction to Inference Sampling Distributions.

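Example 2

z = (x̄ − μ)/(σ/√n)

P(x̄ ≤ 47.5) = P((x̄ − 50)/5 ≤ (47.5 − 50)/5)

= P(z ≤ −0.50)

= 0.5000 − 0.1915 = 0.3085

[Figure: normal curve with the x̄ axis marked at 47.5 and 50 and the z axis marked at −0.50 and 0; area 0.1915 from z = −0.50 to 0 and area 0.3085 to the left of z = −0.50]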
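The two probabilities in Example 2 can be cross-checked with Python's standard library instead of a z table. This is a sketch for verification only; the variable names are arbitrary, and the inequalities are taken as "less than or equal", which makes no difference for a continuous distribution.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 15, 9

# Sampling distribution of the sample mean: Normal with mean mu and
# standard deviation sigma / sqrt(n) = 15 / 3 = 5.
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p1 = xbar_dist.cdf(60) - xbar_dist.cdf(45)   # P(45 <= xbar <= 60)
p2 = xbar_dist.cdf(47.5)                     # P(xbar <= 47.5)

print(f"P(45 <= xbar <= 60) = {p1:.4f}")
print(f"P(xbar <= 47.5)     = {p2:.4f}")
```

The first result may differ from 0.8185 in the last decimal place, because the table entries 0.3413 and 0.4772 are each rounded before they are added.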

