Post on 12-Jan-2016

Vegas Baby

A trip to Vegas is just a sample of a random variable (e.g. 100 card games, 100 slot plays, or 100 video poker games)

Which is more likely?

Win on a short trip

Win on a long trip

Vegas Baby

The house is likely to win in the long run, but loses in the short run all the time.

The player is likely to lose in the long run, but wins in the short run all the time.

Example: Blackjack

What if you went to Vegas 1000 times and each weekend played exactly 8 hands?

How often would you win all 8 hands?

How often would you win 6 out of 8 hands?

How often would you lose all 8 hands?

Example: Blackjack

The outcomes of the trips would form a distribution.

This distribution would be affected by:

The distribution of each blackjack event

Sampling (i.e. luck)

Distribution – 8 Hands

Win: .49

Lose: .51

Winning or losing in the short run is greatly affected by LUCK
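The blackjack questions above can be answered with the binomial distribution. The sketch below is a minimal illustration, assuming every hand is an independent win/lose event with the win probability of .49 shown on the slide (ties, splits, and bet sizes are ignored). The function name `prob_wins` is made up for this example.

```python
from math import comb

P_WIN = 0.49   # per-hand win probability from the slide (assumed constant)
N_HANDS = 8    # hands played on each trip

def prob_wins(k, n=N_HANDS, p=P_WIN):
    """Binomial probability of winning exactly k of n independent hands."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

for k in (8, 6, 0):
    share = prob_wins(k)
    # Multiply by 1000 to get the expected number of trips out of 1000
    print(f"win {k} of {N_HANDS}: {share:.4f} (about {share * 1000:.0f} of 1000 trips)")
```

Under these assumptions, winning 6 of 8 hands is far from rare, which is the sense in which luck dominates a short trip.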

Distribution – 256 Hands

Flipping out

Very important concept.

The mean of a larger sample is more likely to be close to expected value.

Flip 1,000,000 coins then proportion of heads is very close to .5

(e.g. .4993).

Flip 100 coins then the proportion of heads is not so close to .5

(e.g. .7).

Almost impossible to get a proportion of .7 heads with 1,000,000 flips.
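A small simulation can make the sample-size effect visible. This is only a sketch: it assumes fair coins, uses Python's standard `random` module, and the helper name `proportion_heads` and the seed are arbitrary choices for illustration.

```python
import random

def proportion_heads(n_flips, rng):
    """Flip n_flips fair coins at once and return the proportion of heads."""
    bits = rng.getrandbits(n_flips)          # each random bit is one flip
    return bin(bits).count("1") / n_flips    # count the 1-bits as heads

rng = random.Random(2016)  # arbitrary seed so the run is repeatable
for n_flips in (10, 100, 1_000_000):
    # Repeat each experiment five times to see how much the proportion moves
    props = [proportion_heads(n_flips, rng) for _ in range(5)]
    print(n_flips, [round(p, 4) for p in props])
```

The proportions for small samples should scatter widely, while those for 1,000,000 flips should all sit very close to .5.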

In class demo: flip 10 coins

• Go to online text sampling demo

Sampling Mean Distribution

CENTRAL LIMIT THEOREM

1) MEAN OF A DISTRIBUTION OF MEANS = MEAN OF POPULATION OF INDIVIDUAL CASES

$\mu_{\bar{X}} = \mu_X$

2) THE SPREAD OF A DISTRIBUTION OF MEANS IS LESS THAN THE SPREAD OF THE POPULATION OF INDIVIDUAL CASES

$\sigma^2_{\bar{X}} = \dfrac{\sigma^2_X}{n}$

$\sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{n}}$

$\sigma_{\bar{X}}$ = STANDARD ERROR

3) THE SHAPE OF THE DISTRIBUTION OF MEANS TENDS TO BE NORMAL (REGARDLESS OF SHAPE OF DISTRIBUTION OF X, IF n > 30)

Sampling with and without Replacement

Let's look at how sampling with and without replacement works. In this example, we have a population of 1, 3, 5 and 7. The sample size is 2.

With four scores in the population, there are 4 × 4 = 16 possible samples with two scores in the sample.

These 12 combinations of scores involve a sample where the first score and second score are not the same. These are all the combinations when sampling without replacement is used.

When sampling with replacement is used, then there are some additional samples that are possible. These four combinations of scores involve a sample where the same score was selected twice.

Therefore, sampling with replacement always involves a larger number of possible samples than sampling without replacement.
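The counts above can be checked by listing every sample. The following sketch uses the population 1, 3, 5, 7 and sample size 2 from the example; treating samples as ordered pairs is an assumption that matches the 16 and 12 counts. It also compares the mean and variance of the sample means under each scheme.

```python
from itertools import permutations, product
from statistics import mean, pvariance

population = [1, 3, 5, 7]
n = 2  # sample size

with_repl = list(product(population, repeat=n))   # same score may be drawn twice
without_repl = list(permutations(population, n))  # no score is drawn twice

means_with = [mean(s) for s in with_repl]
means_without = [mean(s) for s in without_repl]

print(len(with_repl), len(without_repl))          # 16 and 12 samples
# Population mean vs. mean of the sample means under each scheme
print(mean(population), mean(means_with), mean(means_without))
# sigma^2 / n vs. the actual variance of the sample means under each scheme
print(pvariance(population) / n, pvariance(means_with), pvariance(means_without))
```

With replacement, the variance of the means equals the population variance divided by n. Without replacement it comes out smaller, because the four samples that repeat a score are excluded.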

With Replacement

•Central Limit Theorem Holds

Without Replacement

•Central Limit Theorem DOES NOT Hold

But when the population is large relative to the sample, this is not a big problem

Sampling Mean Distribution

$\sigma^2_{\bar{X}} = \dfrac{\sigma^2_X}{n}$

Good samples and Bad samples

When making an estimate of a population, you want your sample to be a random subset of the population.

This is often not the case in some samples that are commonly reported.

Example: Political opinion polls that sample by telephone – why?

Consider a distribution of means that come from a distribution of the weights of men. The score distribution has mean μ = 150 and standard deviation σ = 10, and the means are from samples of size n = 25. The maximum score value is 160.

What is the proportion of means above a mean of 152? 

Proportion Problem for Means

Is a Mean Distribution Normal?

Consider a distribution of means that come from a distribution of the weights of men and women. The score distribution has mean μ = 150 and standard deviation σ = 10, and the means are from samples of size n = 36. What is the proportion of means above a mean of 152?

Proportion Problem for Means
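A possible way to work the n = 36 problem is sketched below. It assumes the distribution of means is approximately normal (n > 30), so the standard error and a z-score can be used; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu = 150        # mean of the score distribution
sigma = 10      # standard deviation of the scores
n = 36          # sample size behind each mean
cutoff = 152    # mean of interest

standard_error = sigma / sqrt(n)             # spread of the mean distribution
z = (cutoff - mu) / standard_error           # distance in standard errors
proportion_above = 1 - NormalDist().cdf(z)   # upper-tail area of the normal curve

print(f"standard error = {standard_error:.3f}, z = {z:.2f}, "
      f"proportion above = {proportion_above:.4f}")
```

Under that normality assumption, z = 1.2 and roughly 11.5% of means fall above 152.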

t distributions

Area problems so far use the standard error from population scores

• But we usually don't have population information; we do have sample information

• What if we estimate the standard error with $s_{\bar{X}} = S_X / \sqrt{n}$?

t distributions

• This is the 'estimated standard error', which we can use to calculate how far a mean is from the mean of the mean distribution

• Example:
– Sample mean = 102
– Population mean = 100
– Estimated standard error = 2
– This sample mean is (102 - 100)/2 = 1 estimated standard error away from the population mean

t distributions

• The distance of a mean from the mean of the mean distribution, measured in estimated standard errors, is called the t statistic

• Similar to a z-score for means, but used with estimated standard errors instead of actual population standard errors.

Can calculate areas under the t distribution with t statistics, similar to areas under the normal curve for z
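As a rough illustration of the t statistic, the sketch below computes it from raw scores. The sample values are hypothetical, chosen only so that the result reproduces the earlier example (sample mean 102, estimated standard error 2). It assumes the estimated standard error is the sample standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample, chosen so the numbers match the example in the text
sample = [96, 96, 96, 96, 102, 108, 108, 108, 108]
population_mean = 100

sample_mean = mean(sample)                         # 102
estimated_se = stdev(sample) / sqrt(len(sample))   # S / sqrt(n) = 6 / 3 = 2
t_statistic = (sample_mean - population_mean) / estimated_se

print(sample_mean, estimated_se, t_statistic)
```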

t distribution

• PROBLEM: Because we estimated the standard error, this t statistic has error in it, and the error depends upon the sample size

• EFFECT: The t distribution that defines areas for the t statistic is affected by sample size.

t distribution

• BIG PROBLEM: If we want to calculate areas for t distributions, then we need 1 complete table of values (like z area table) for each sample size.

• SOLUTION: Only provide info for certain important areas that are useful for hypothesis testing and confidence intervals

T critical values

.05(2) = 5% in two tails

Range is specified by:
• Lower value < μ < Higher value
Examples:
• 100 < μ < 110
• 0 < μ < 120

Use of Sample Means: Confidence Intervals

What if we have a score distribution with a standard deviation $\sigma_X$?

If we collect a sample mean $\bar{X}$, then how accurate is that mean?

In other words, can we specify a range of population means that are likely to be true?

Use of Sample Means: Confidence Intervals

Confidence intervals can specify different degrees of confidence, for example:

• 95% (want to be 95% sure μ is in range)
• 99% (want to be 99% sure μ is in range)
• 99.9% (want to be 99.9% sure μ is in range)

Use of Sample Means: Confidence Intervals

σX known. Z distribution.

• (Not typical -- easier)

σX unknown. t distributions.

• (Typical - harder)

Solving Confidence Intervals

Two types

Use normal distribution

Start with score distribution

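Mean distribution is $\sqrt{n}$ times skinnier

Formula: $\bar{X} - z\,\sigma_{\bar{X}} < \mu < \bar{X} + z\,\sigma_{\bar{X}}$, where $\sigma_{\bar{X}} = \sigma_X / \sqrt{n}$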

Solving Confidence Intervals

When σx is known

If we have a sample mean of 85 and sample size of 16 and a σX=10, then what range of population values can we be 95% sure contain the true population mean?

Use of Sample Means: Confidence Intervals With Z Distribution

If we have a sample mean of 85 and sample size of 16 and a σX=10, then what range of population values can we be 99% sure contain the true population mean?

Use of Sample Means: Confidence Intervals With Z Distribution
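The two known-σ problems (95% and 99%) can be worked with one small helper. This is a sketch only: the function name `z_confidence_interval` is invented for illustration, and the critical z value is taken from Python's `statistics.NormalDist` instead of a printed z table.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (lower, upper) bounds for mu when the population sigma is known."""
    standard_error = sigma / sqrt(n)
    # Two-tailed critical value: leave (1 - confidence) / 2 in each tail
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * standard_error
    return sample_mean - margin, sample_mean + margin

for confidence in (0.95, 0.99):
    lower, upper = z_confidence_interval(85, 10, 16, confidence)
    print(f"{confidence:.0%}: {lower:.2f} < mu < {upper:.2f}")
```

Asking for more confidence widens the range: the 99% interval is wider than the 95% interval for the same sample.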

If we have a sample mean of 85 and sample size of 16 and a SX=10, then what range of population values can we be 95% sure contain the true population mean?

Use of Sample Means: Confidence Intervals With t Distribution

If we have a sample mean of 85 and sample size of 16 and a SX=10, then what range of population values can we be 95% sure contain the true population mean?

Solution:
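One way to work this, as a sketch: because $S_X$ comes from the sample, the interval uses a t critical value with $n - 1 = 15$ degrees of freedom. The value $t_{.05(2),\,15} = 2.131$ is taken from a standard t table, not from the slides.

```latex
\begin{aligned}
s_{\bar{X}} &= \frac{S_X}{\sqrt{n}} = \frac{10}{\sqrt{16}} = 2.5 \\
\bar{X} \pm t_{.05(2),\,15}\; s_{\bar{X}} &= 85 \pm (2.131)(2.5) = 85 \pm 5.33 \\
79.67 &< \mu < 90.33
\end{aligned}
```

This range is slightly wider than the 95% interval built with z for the same numbers, reflecting the extra uncertainty from estimating the standard error.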