Chapter 15: Inference in Practice (PSLS/2e)

Effective use of inferential methods requires more than knowing the facts. It requires understanding the reasoning behind the process.

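z Procedures

• If we know the population standard deviation σ before the data are collected, the confidence interval for µ is:

$$ \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} $$

• To test H0: µ = µ0, we use this statistic:

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$

• These are called z procedures because they rely on critical values from the standard Normal distribution, Z ~ N(0, 1)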
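The two z procedures can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names and the input values (a sample mean of 272 from n = 100 observations, σ = 60, null value 260) are hypothetical and chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # Z ~ N(0, 1)


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for x_bar +/- z* * sigma / sqrt(n)."""
    # z* cuts off the central `confidence` area under the standard Normal curve.
    z_star = STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


def z_test(x_bar, mu_0, sigma, n):
    """Return (z statistic, two-sided P-value) for H0: mu = mu_0."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    return z, p_value


# Hypothetical inputs, for illustration only.
low, high = z_confidence_interval(x_bar=272, sigma=60, n=100, confidence=0.95)
z, p = z_test(x_bar=272, mu_0=260, sigma=60, n=100)
print(f"95% CI: ({low:.2f}, {high:.2f})")
print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

Both functions assume the conditions for z procedures hold, in particular that σ is known in advance.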


Conditions for z Procedures

1. Data must resemble an SRS from the population

– Ask: “Where did the data come from?”

– Bad samples (see next slide) invalidate the methods

2. The population must be Normal, BUT a fact known as the Central Limit Theorem tells us that the sampling distribution of x-bar will be approximately Normal even if the population is not Normal, if the sample is “large enough”

– In practice, z procedures are robust in large samples

3. The population standard deviation σ must be known before the data are collected. Chapter 17 will introduce procedures that can be used when σ is not known


Examples of Bad Samples

• Convenience samples: selecting the members of the population that are easiest to reach

– Example: in a sample of mall shoppers, teenagers and retired people will be over-represented

• Voluntary response samples: people who choose themselves by responding to a broad appeal

– Example: online polls are scientifically useless (people who take the trouble to respond are not representative of the larger population)

• Under-coverage: some groups in the population are left out or underrepresented

– Example: using a telephone listing to select subjects (not everyone has a listed phone number)

• If the data do not come from an SRS or a randomized experiment, conclusions are open to challenge.

• Always ask where the data came from.


Normality Assumption and the Central Limit Theorem

Normality can be assumed when n is large because of the Central Limit Theorem

• Sample size less than 15: “Normality” can be assumed if data are symmetric, have a single peak and no outliers. If data are highly skewed, avoid z [and t] procedures.

• Sample size at least 15: Normality can be assumed unless data are strongly skewed or have outliers.

• Large samples (n greater than about 30 to 60; n ≥ ~40 is a common rule of thumb): Normality can be assumed even for skewed distributions
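A small simulation can make these guidelines concrete. The sketch below is an illustration under stated assumptions, not part of the original slides: it draws repeated samples from a strongly right-skewed population (an exponential distribution with rate 1, chosen arbitrarily) and measures how skewed the resulting sample means are. The helper names, the seed, and the number of replications are all hypothetical choices.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Skewness computed as the mean of the cubed z-scores."""
    mean = fmean(values)
    sd = pstdev(values)
    return fmean(((v - mean) / sd) ** 3 for v in values)


def sample_mean_skewness(n, replications=2000, seed=1):
    """Skewness of x-bar across repeated samples of size n from Exp(1)."""
    rng = random.Random(seed)
    means = [
        fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]
    return skewness(means)


skew_n5 = sample_mean_skewness(n=5)
skew_n60 = sample_mean_skewness(n=60)
print(f"skewness of x-bar, n = 5:  {skew_n5:.2f}")
print(f"skewness of x-bar, n = 60: {skew_n60:.2f}")
```

A skewness near 0 indicates a roughly symmetric sampling distribution. The value for n = 60 should be much closer to 0 than the value for n = 5, which is the Central Limit Theorem at work.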


Can Normality be assumed?

Moderately sized data set (n = 20) with strong skew. Normality cannot be assumed.

Do NOT use z [or t] procedures


Can Normality be assumed?

Extremely large data set (n ≈ 1000)

The data have a strong positive skew.

But since the sample is large, the Central Limit Theorem applies and we can assume Normality.

Do use z [or t] procedures.


Can Normality be assumed?

n is moderate. The distribution has no clear departures from Normality. Therefore, we can trust z [and t] procedures.

Additional Caution: GIGO


• Garbage In, Garbage Out

• A study is only as good as the quality of the data

• CIs and P-values are valueless when the information is of poor quality

• Example: self-reported data can be inaccurate and biased

Additional Caution: P-values

• P-values (significance tests) are often misunderstood

• Even large differences can fail to be significant if the sample is small

• Statistical significance does NOT tell us whether a finding is important: statistical significance is NOT the same as practical significance

• A P-value is NOT the probability that H0 is true; it is the probability, computed assuming H0 is true, of observing data at least as extreme as the data actually observed

• Failure to reject H0 is NOT the same as accepting H0

• Although α = 0.05 is a common cut-off, there is NO set border between “significant” and “insignificant” results; surely God loves P = .06 nearly as much as P = .05.
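The gap between statistical and practical significance can be illustrated numerically. The sketch below uses a two-sided z test with made-up numbers (null mean 500, σ = 60, and the two sample means are all hypothetical): a 20-point difference observed in a sample of 4, and a 2-point difference observed in a sample of 10,000.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(x_bar, mu_0, sigma, n):
    """Two-sided P-value of the z test for H0: mu = mu_0."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical scenario 1: a large observed difference but a tiny sample.
p_large_diff_small_n = two_sided_p(x_bar=520, mu_0=500, sigma=60, n=4)

# Hypothetical scenario 2: a tiny observed difference but a huge sample.
p_small_diff_large_n = two_sided_p(x_bar=502, mu_0=500, sigma=60, n=10_000)

print(f"20-point difference, n = 4:      P = {p_large_diff_small_n:.3f}")
print(f"2-point difference,  n = 10000:  P = {p_small_diff_large_n:.5f}")
```

Under these assumptions the large difference is not significant at α = 0.05 while the small one is, even though the 2-point difference may be of no practical importance.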


Margin of Error (m)

• When estimating µ with confidence level C, the margin of error is:

$$ m = z^* \frac{\sigma}{\sqrt{n}} $$

• The margin of error, which is half the length of the CI, indicates the precision of the estimate

• At a given level of confidence z* is fixed, and σ is a fixed property of the population

• To increase precision, increase the sample size:

↑ n → ↓ m → ↑ precision
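The effect of sample size on the margin of error can be checked with a short sketch. The function name and the inputs (σ = 60, 90% confidence, and the three sample sizes) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Margin of error m = z* * sigma / sqrt(n) for a z confidence interval."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z_star * sigma / sqrt(n)


# Illustrative values: sigma = 60 at 90% confidence.
m_100 = margin_of_error(sigma=60, n=100, confidence=0.90)
m_400 = margin_of_error(sigma=60, n=400, confidence=0.90)
m_1600 = margin_of_error(sigma=60, n=1600, confidence=0.90)

for n, m in ((100, m_100), (400, m_400), (1600, m_1600)):
    print(f"n = {n:5d}  ->  m = {m:.2f}")
```

Because n enters through its square root, quadrupling the sample size only halves the margin of error.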

Choosing a Sample Size


To determine the sample size required to achieve margin of error m when estimating µ, solve m = z*σ/√n for n:

$$ n = \left( \frac{z^* \sigma}{m} \right)^2 $$

Example: National Assessment of Educational Progress (NAEP) Math Scores


NAEP math scores predict success after high school.

Suppose that we want to estimate the population mean NAEP score with 90% confidence and want the margin of error to be no more than ±5 points.

We know that NAEP math scores have σ = 60.

What sample size will be required to enable us to create such an interval?

Example


NAEP Quantitative Scores

$$ n = \left( \frac{z^* \sigma}{m} \right)^2 = \left( \frac{(1.645)(60)}{5} \right)^2 = 389.67 $$

If you round down, your margin of error will be bigger than requested. If you round up, your margin of error will be smaller (a good thing).

Always round UP to the next integer. Study 390 individuals so that m is no greater than 5.

Example: Decrease margin of error m


Now suppose we want to estimate the population mean NAEP score with 90% confidence and want the margin of error not to exceed 3 points (recall that σ = 60).

What sample size will be required to enable us to create such an interval?

Case Study


NAEP Quantitative Scores

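$$ n = \left( \frac{z^* \sigma}{m} \right)^2 = \left( \frac{(1.645)(60)}{3} \right)^2 = 1082.41 $$

Therefore, resolve to study 1083 individuals so that the margin of error does not exceed 3 points.

Note that lowering the margin of error to 3 points required a much larger sample size!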
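The sample size calculation, including the round-up rule, can be written as a short helper. This is a sketch: the function name is hypothetical, and z* = 1.645 is the rounded 90% critical value used in the NAEP examples.

```python
from math import ceil


def required_sample_size(z_star, sigma, m):
    """Smallest whole n whose margin of error z* * sigma / sqrt(n) is at most m."""
    # ceil() implements "always round UP to the next integer".
    return ceil((z_star * sigma / m) ** 2)


# NAEP examples: sigma = 60, 90% confidence (z* = 1.645).
n_for_5_points = required_sample_size(z_star=1.645, sigma=60, m=5)
n_for_3_points = required_sample_size(z_star=1.645, sigma=60, m=3)
print(n_for_5_points)  # 390
print(n_for_3_points)  # 1083
```

Tightening the margin of error from 5 to 3 points multiplies the required sample size by about (5/3)² ≈ 2.8, since n grows with the inverse square of m.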
