
Chapter 5 Introduction to Inferential Statistics.

Date post: 19-Dec-2015
Page 1: Chapter 5 Introduction to Inferential Statistics.

Chapter 5

Introduction to Inferential Statistics

Page 2: Chapter 5 Introduction to Inferential Statistics.

Definition

infer - vt., to arrive at a decision or opinion by reasoning from known facts or evidence.

Page 3: Chapter 5 Introduction to Inferential Statistics.

Sample

A sample comprises a part of the population selected for a study.

Page 4: Chapter 5 Introduction to Inferential Statistics.

Random Samples

If every score in the population has an equal chance of being selected each time you choose a score, the resulting sample is called a random sample.

Random samples, and only random samples, are representative of the population from which they are drawn.

Page 5: Chapter 5 Introduction to Inferential Statistics.

Q: ON WHAT MEASURES IS A RANDOM SAMPLE REPRESENTATIVE OF THE POPULATION?

A: ON EVERY MEASURE.

Page 6: Chapter 5 Introduction to Inferential Statistics.

REPRESENTATIVE ON EVERY MEASURE

The mean height of a random sample will be similar to the mean height of the population.

The same holds for weight, IQ, ability to remember faces or numbers, the size of their livers, self-confidence, how many children their aunts had, etc. ON EVERY MEASURE THAT EVER WAS OR CAN BE.
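As a rough illustration of this claim, the sketch below builds a made-up population with two unrelated measures and draws one random sample from it. The population values, the sample size, and the names `population` and `sample` are assumptions chosen for illustration; they are not data from the chapter.

```python
import random
from statistics import mean

random.seed(5)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 people, each with a height (cm) and an IQ score.
population = [
    {"height": random.gauss(170, 10), "iq": random.gauss(100, 15)}
    for _ in range(10_000)
]

# Every member has an equal chance of being chosen
# (random.sample draws without replacement).
sample = random.sample(population, 1_000)

# Compare the sample mean with the population mean on each measure.
for measure in ("height", "iq"):
    mu = mean(person[measure] for person in population)
    x_bar = mean(person[measure] for person in sample)
    print(f"{measure}: population mean = {mu:.2f}, sample mean = {x_bar:.2f}")
```

The same loop could be extended to any other measure stored for each person; the sample was not chosen with any particular measure in mind.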

Page 7: Chapter 5 Introduction to Inferential Statistics.

All sample statistics are representative of their population parameters

The sample mean is a least squares, unbiased, consistent estimate of the population mean.

Page 8: Chapter 5 Introduction to Inferential Statistics.

REPRESENTATIVE ON measures of central tendency (the mean), on measures of variability (e.g., $\sigma^2$), and on all derivative measures

For example, the way scores fall around the mean of a random sample (as indexed by MSW) will be similar to the way scores fall around the mean of the population (as indexed by $\sigma^2$).

Page 9: Chapter 5 Introduction to Inferential Statistics.

THERE ARE OCCASIONAL RANDOM SAMPLES THAT ARE POOR REPRESENTATIVES OF THEIR POPULATION

But 1.) we will take that into account

And 2.) most are fairly to very good representatives of their populations

Page 10: Chapter 5 Introduction to Inferential Statistics.

Population Parameters and Sample Statistics: Nomenclature

The characteristics of a population are called population parameters. They are usually represented by Greek letters ($\mu$, $\sigma$).

The characteristics of a sample are called sample statistics. They are usually represented by the English alphabet ($\bar{X}$, $s$).

Page 11: Chapter 5 Introduction to Inferential Statistics.

Three things we can do with random samples

Estimate population parameters. This is called estimation research.

Estimate the relationship between variables in the population from their relationship in a random sample. This is called correlational research.

Compare the responses of random samples to different conditions. This is called experimental research.

Page 12: Chapter 5 Introduction to Inferential Statistics.

Estimating population parameters

Sample statistics are least squares, unbiased, consistent estimates of their population parameters.

We’ll get to this in a minute, in detail.

Page 13: Chapter 5 Introduction to Inferential Statistics.

Correlational Research

We observe the relationship among variables in a random sample. We are unlikely to find strong relationships purely by chance. When you study a sample and the relationship between two variables is strong enough, you can infer that a similar relationship between the variables will be found in the population as a whole.

This is called correlational research. For example, height and weight are co-related.

Page 14: Chapter 5 Introduction to Inferential Statistics.

What is needed for correlational research

In Chapter 6, you will learn to turn scores on different measures from a sample into scores that can be directly compared to each other.

In Chapter 7, you will learn to compute a single number that describes the direction and consistency of the relationship between two variables. That number is called the correlation coefficient.

In Chapter 8, you will learn to predict scores on one variable from scores on another variable when you know (or can estimate) the correlation coefficient.

In Chapter 8, you will also learn when not to do that and to go back to predicting that everyone will score at the mean of their distribution.
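To give a feel for what a "single number that describes the direction and consistency of the relationship" looks like, here is a minimal sketch using the Pearson correlation coefficient. That is a common choice, but this chapter does not say which formula Chapter 7 uses, so treat it as an assumption. The heights and weights are invented for illustration.

```python
from math import sqrt

# Hypothetical heights (cm) and weights (kg) for five people in a sample.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 69, 80, 92]

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of xs and ys, scaled to lie in [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

r = pearson_r(heights, weights)
print(f"r = {r:.3f}")  # positive: taller people in this toy sample are heavier
```

A value near +1 or -1 indicates a very consistent relationship; the sign gives its direction.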

Page 15: Chapter 5 Introduction to Inferential Statistics.

Experimental Research

In Chapters 9 – 11 you will learn about experiments. In an experiment, we start with samples that can be assumed to be similar and then treat them differently. Then we measure response differences among the samples and make inferences about whether or not similar differences would occur in response to similar treatment in the whole population. For example, we might expose randomly selected groups of depressed patients to different doses of a new drug to see which dose produces the best result. If we got clear differences, we might suggest that all patients be treated with that dose.

Page 16: Chapter 5 Introduction to Inferential Statistics.

Experimental Research

We apply different treatments to samples and then measure the response differences. If, and only if, the differences among samples are large enough, we can infer that the same differences would occur in the population. This is called experimental research.

For example, studying the effect of Vitamin C on the likelihood of catching a cold.

Page 17: Chapter 5 Introduction to Inferential Statistics.

In this chapter, we will focus on estimating population parameters from sample statistics.

Page 18: Chapter 5 Introduction to Inferential Statistics.

Estimation research

We measure the characteristics of a random sample and then we infer that they are similar to the characteristics of the population.

Characteristics are things like the mean and standard deviation.

Estimation underlies both correlational and experimental research.

Page 19: Chapter 5 Introduction to Inferential Statistics.

Definition

A least squares estimate is the number with the smallest average squared distance from the scores it summarizes: no other single number sits closer, in squared distance, to the scores as a whole. We will study sample statistics that are least squares estimates of their population parameters.
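The least squares idea can be checked numerically. The sketch below takes the scores 6, 8, and 4 from the chapter's single-sample example and tries a handful of candidate "centers"; the candidate grid and the helper name `sum_of_squares` are assumptions made for illustration.

```python
scores = [6, 8, 4]  # the three scores from the single-sample example

def sum_of_squares(center, xs):
    """Sum of squared distances between each score and a candidate center."""
    return sum((x - center) ** 2 for x in xs)

sample_mean = sum(scores) / len(scores)

# Try candidate centers from 4.0 to 8.0 in steps of 0.5.
candidates = [4 + 0.5 * i for i in range(9)]
for c in candidates:
    print(f"center = {c:.1f}  sum of squares = {sum_of_squares(c, scores):.2f}")

# The candidate with the smallest sum of squares is the sample mean.
best = min(candidates, key=lambda c: sum_of_squares(c, scores))
print("smallest sum of squares at", best, "and the sample mean is", sample_mean)
```

Moving the center away from the mean in either direction makes the sum of squared distances larger.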

Page 20: Chapter 5 Introduction to Inferential Statistics.

Definition

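An unbiased estimate is one around which deviations sum to zero: overestimates and underestimates cancel out. In the standard technical sense, a statistic is unbiased when its average over all possible random samples equals the population parameter. We will study sample statistics that are unbiased estimates of their population parameters.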
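A small exhaustive check can make the idea of an unbiased estimate concrete. The sketch below assumes a tiny made-up population of three scores and lists every possible sample of size 2 drawn with replacement; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

population = [4, 6, 8]  # hypothetical tiny population
mu = Fraction(sum(population), len(population))

# Every possible ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))
sample_means = [Fraction(sum(s), len(s)) for s in samples]

# Within any one sample, deviations around its own mean sum to zero.
deviation_sums = [sum(x - m for x in s) for s, m in zip(samples, sample_means)]

# Across all possible samples, the sample means average out to mu.
average_of_means = sum(sample_means) / len(sample_means)

print("all deviation sums are zero:", all(d == 0 for d in deviation_sums))
print("mu =", mu, " average of all sample means =", average_of_means)
```

Individual sample means miss the population mean, some high and some low, but the misses balance out across the full set of samples.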

Page 21: Chapter 5 Introduction to Inferential Statistics.

Definition

A consistent estimator is one where the larger the number of randomly selected scores underlying the sample statistic, the closer the statistic will tend to come to the population parameter. We will study sample statistics that are consistent estimates of their population parameters.
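Consistency can be seen in a simulation. The sketch below assumes a normally distributed population with the mean (72) and standard deviation (12) of the chapter's later book example; the normal shape, the sample sizes, and the number of replications are assumptions made for illustration.

```python
import random
from statistics import mean

random.seed(35)  # fixed seed so the sketch is repeatable
MU, SIGMA = 72.0, 12.0  # assumed population mean and standard deviation

def average_miss(n, replications=500):
    """Average absolute distance between the sample mean and MU for samples of size n."""
    misses = []
    for _ in range(replications):
        sample = [random.gauss(MU, SIGMA) for _ in range(n)]
        misses.append(abs(mean(sample) - MU))
    return mean(misses)

results = {n: average_miss(n) for n in (4, 25, 100, 400)}
for n, miss in results.items():
    print(f"n = {n:3d}: average miss = {miss:.2f}")
```

The average miss shrinks as n grows, which is what "the closer the statistic will tend to come to the population parameter" describes.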

Page 22: Chapter 5 Introduction to Inferential Statistics.

The sample mean

The sample mean is called X-bar and is represented by $\bar{X}$.

$\bar{X}$ is the best estimate of $\mu$, because it is a least squares, unbiased, consistent estimate.

$\bar{X} = \sum X / n$

Page 23: Chapter 5 Introduction to Inferential Statistics.

Estimated variance

The estimate of $\sigma^2$ is called the mean squared error and is represented by MSW.

It is also a least squares, unbiased, consistent estimate.

$SSW = \sum (X - \bar{X})^2$

$MSW = \sum (X - \bar{X})^2 / (n - k)$

Page 24: Chapter 5 Introduction to Inferential Statistics.

Estimated standard deviation

The estimate of $\sigma$ is called s.

$s = \sqrt{MSW}$

Page 25: Chapter 5 Introduction to Inferential Statistics.

In English

We estimate the population mean by finding the mean of the sample.

We estimate the population variance ($\sigma^2$) with MSW by first finding the sum of the squared differences between our best estimate of $\mu$ (the sample mean) and each score. Then we divide the sum of squares by n-k, where n is the number of scores and k is the number of groups in our sample.

We estimate $\sigma$ by taking the square root of MSW, our best estimate of $\sigma^2$.

Page 26: Chapter 5 Introduction to Inferential Statistics.

Estimating mu and sigma – single sample

S#     X      X-bar    (X - X-bar)    (X - X-bar)^2
A      6      6.00      0.00          0.00
B      8      6.00      2.00          4.00
C      4      6.00     -2.00          4.00

$\sum X = 18$, $N = 3$, $\bar{X} = 6.00$

$\sum (X - \bar{X}) = 0.00$

$\sum (X - \bar{X})^2 = 8.00 = SSW$

$MSW = SSW/(n-k) = 8.00/2 = 4.00$

$s = \sqrt{MSW} = 2.00$
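The single-sample arithmetic can be reproduced step by step. This is a minimal sketch that follows the slide's columns; the variable names are assumptions, and the standard library's `statistics.variance` and `statistics.stdev` are used only as a cross-check.

```python
from math import sqrt
from statistics import variance, stdev

scores = {"A": 6, "B": 8, "C": 4}  # student -> X, from the slide
xs = list(scores.values())

n, k = len(xs), 1                       # three scores, one group
x_bar = sum(xs) / n                     # 18 / 3
deviations = [x - x_bar for x in xs]    # X minus X-bar for each student
ssw = sum(d ** 2 for d in deviations)   # sum of squared deviations
msw = ssw / (n - k)                     # estimate of the population variance
s = sqrt(msw)                           # estimate of the population standard deviation

print(f"X-bar = {x_bar:.2f}, SSW = {ssw:.2f}, MSW = {msw:.2f}, s = {s:.2f}")

# With a single group, n - k equals n - 1, so the standard library's
# sample variance and standard deviation should agree with MSW and s.
print("library variance:", variance(xs), " library stdev:", stdev(xs))
```

With one group, MSW is the familiar sample variance with n - 1 in the denominator.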

Page 27: Chapter 5 Introduction to Inferential Statistics.

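Group    S#     X      X-bar    (X - X-bar)    (X - X-bar)^2
1        1.1    50     71.00    -21.00         441.00
1        1.2    77     71.00     +6.00          36.00
1        1.3    69     71.00     -2.00           4.00
1        1.4    88     71.00    +17.00         289.00

$\bar{X}_1 = 71.00$, $\sum (X - \bar{X}_1) = 0.00$, $\sum (X - \bar{X}_1)^2 = 770.00$

2        2.1    78     70.00     +8.00          64.00
2        2.2    57     70.00    -13.00         169.00
2        2.3    82     70.00    +12.00         144.00
2        2.4    63     70.00     -7.00          49.00

$\bar{X}_2 = 70.00$, $\sum (X - \bar{X}_2) = 0.00$, $\sum (X - \bar{X}_2)^2 = 426.00$

3        3.1    74     72.00     +2.00           4.00
3        3.2    70     72.00     -2.00           4.00
3        3.3    63     72.00     -9.00          81.00
3        3.4    81     72.00     +9.00          81.00

$\bar{X}_3 = 72.00$, $\sum (X - \bar{X}_3) = 0.00$, $\sum (X - \bar{X}_3)^2 = 170.00$

$SSW = 770.00 + 426.00 + 170.00 = 1366.00$

$MSW = SSW/(n-k) = 1366.00/9 = 151.78$

$s = \sqrt{MSW} = \sqrt{151.78} = 12.32$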
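The three-group computation can be sketched as a short program. The scores are the twelve from the slide; the dictionary layout and the helper name `ss_within` are assumptions made for illustration.

```python
from math import sqrt

# Scores from the slide, keyed by group number.
groups = {
    1: [50, 77, 69, 88],
    2: [78, 57, 82, 63],
    3: [74, 70, 63, 81],
}

def ss_within(xs):
    """Sum of squared deviations of one group's scores around that group's mean."""
    group_mean = sum(xs) / len(xs)
    return sum((x - group_mean) ** 2 for x in xs)

ssw = sum(ss_within(xs) for xs in groups.values())  # pooled sum of squares
n = sum(len(xs) for xs in groups.values())          # total number of scores
k = len(groups)                                     # number of groups
msw = ssw / (n - k)
s = sqrt(msw)

print(f"SSW = {ssw:.2f}, df = {n - k}, MSW = {msw:.2f}, s = {s:.2f}")
```

Each group's deviations are taken around that group's own mean, which is why one degree of freedom is given up per group.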

Page 28: Chapter 5 Introduction to Inferential Statistics.

Why n-k?

This has to do with “degrees of freedom.”

As you saw last chapter, each time you add a score to a sample, you pull the sample statistic toward the population parameter.

Page 29: Chapter 5 Introduction to Inferential Statistics.

Any score that isn’t free to vary does not tend to pull the sample statistic toward the population parameter.

One deviation in each group is constrained by the rule that deviations around the mean must sum to zero. So one deviation in each group is not free to vary.

Deviation scores underlie our computation of SSW, which in turn underlies our computation of MSW.
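The constraint on the last deviation can be shown with Group 1 from the worked example (scores 50, 77, 69, 88). In this minimal sketch, the variable names are assumptions; the point is that once the first n - 1 deviations are known, the last one carries no new information.

```python
group = [50, 77, 69, 88]  # Group 1 from the worked example
group_mean = sum(group) / len(group)
deviations = [x - group_mean for x in group]

# Suppose we are told only the first n - 1 deviations.
known = deviations[:-1]

# Because all n deviations must sum to zero, the last one is forced.
forced_last = -sum(known)

print("known deviations:", known)
print("forced last deviation:", forced_last, " actual:", deviations[-1])
```

Only three of the four deviations in this group are free to vary, so the group contributes three degrees of freedom.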

Page 30: Chapter 5 Introduction to Inferential Statistics.

n-k is the number of degrees of freedom for MSW

You use the deviation scores as the basis of estimating $\sigma^2$ with MSW.

The number of scores that are free to vary is called the degrees of freedom.

Since one deviation score in each group is not free to vary, you lose one degree of freedom for each group. With k groups you lose k*1 = k degrees of freedom.

There are n deviation scores in total. k are not free to vary. That leaves n-k that are free to vary: n-k degrees of freedom for MSW, your estimate of $\sigma^2$.

The precision or “goodness” of an estimate is based on degrees of freedom. The more df, the closer the estimate tends to get to its population parameter.
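One way to see why the divisor is the degrees of freedom rather than n is to check every possible sample from a tiny population. The sketch below assumes a made-up population of three scores, a single group (k = 1), and samples of size 2 drawn with replacement; exact fractions avoid rounding. It illustrates the standard unbiasedness argument and is not taken from the slides.

```python
from fractions import Fraction
from itertools import product

population = [4, 6, 8]  # hypothetical tiny population
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def ss(sample):
    """Sum of squared deviations around the sample's own mean."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))  # all ordered samples, with replacement

# Average each candidate estimate over every possible sample.
avg_divide_by_df = sum(ss(s) / (n - 1) for s in samples) / len(samples)
avg_divide_by_n = sum(ss(s) / n for s in samples) / len(samples)

print("population variance      =", sigma2)
print("average of SS / (n - 1)  =", avg_divide_by_df)
print("average of SS / n        =", avg_divide_by_n)
```

Dividing by n - 1 averages out to the population variance, while dividing by n comes out too small on average.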

Page 31: Chapter 5 Introduction to Inferential Statistics.

Group    n    X-bar    SSW
1        4    71.00    770.00
2        4    70.00    426.00
3        4    72.00    170.00

n = 12 scores, k = 3 groups, so n-k = 9 degrees of freedom.

$MSW = SSW/(n-k) = 1366.00/9 = 151.78$

$s = \sqrt{MSW} = \sqrt{151.78} = 12.32$

Page 32: Chapter 5 Introduction to Inferential Statistics.

More scores that are free to vary = better estimates: the mean as an example.

Each time you add a randomly selected score to your sample, it is most likely to pull the sample mean closer to mu, the population mean.

Any particular score may pull it further from mu. But, on average, as you add more and more scores, the odds are that you will be getting closer to mu.

Page 33: Chapter 5 Introduction to Inferential Statistics.

Book example

Population is 1320 students taking a test.

$\mu = 72.00$, $\sigma = 12$

Unlike estimating the variance (where df = n-k), when estimating the mean all the scores are free to vary. So each score in the sample will tend to make the sample mean a better estimate of mu. Let’s randomly sample one student at a time and see what happens.

Page 34: Chapter 5 Introduction to Inferential Statistics.

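Test Scores

[Figure: frequency distribution of test scores. Horizontal axis: scores of 36, 48, 60, 72, 84, 96, and 108, marking 3, 2, 1, 0, 1, 2, and 3 standard deviations below and above the mean.]

Sample scores (in the order drawn): 102, 72, 66, 76, 66, 78, 69, 63

Sample means (after 2 through 8 scores): 87, 80, 79, 76.4, 76.7, 75.6, 74.0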
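The sample means on this slide are running means: each one is the mean of all scores drawn so far. The sketch below recomputes them from the eight sample scores, under the assumption that the scores are listed in the order they were drawn, which matches the means shown.

```python
from itertools import accumulate

# Sample scores, assumed to be in the order they were drawn.
scores = [102, 72, 66, 76, 66, 78, 69, 63]

# Running total after each draw, divided by the number of scores so far.
running_means = [total / n for n, total in enumerate(accumulate(scores), start=1)]

for n, (score, m) in enumerate(zip(scores, running_means), start=1):
    print(f"n = {n}: drew {score:3d}, sample mean = {m:.1f}")
```

The first score (102) is far above the population mean of 72, and the running mean drifts back toward 72 as more scores are added, ending at 74.0 after eight draws.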

Page 35: Chapter 5 Introduction to Inferential Statistics.

Consistent estimators

This tendency to pull the sample mean back to the population mean is called “regression to the mean”.

We call estimates that improve when you add scores to the sample consistent estimators.

Recall that the statistics that we will learn are: consistent, least squares, and unbiased.

