QM STEM Ed 2018

Quantitative Methods in STEM Education Research

Topic 4: Inferential statistics

Judy Sheard

Faculty of Information Technology

Monash University, Australia

[email protected]

Overview of topic 4

Hypothesis testing

Central Limit Theorem

Level of significance

Z-scores

Confidence intervals

Categories of statistical tests

Descriptive vs. inferential statistics

Descriptive statistics – used to describe sets of quantitative data. This involves descriptions of distributions of data and relationships between variables.

Inferential statistics – used to make inferences about populations from analysis of subsets (samples) of the population.

Inferential statistics

“In inferential statistics, statistics are measures of the sample and parameters are measures of the population. Inferences are made about the parameters from the statistics”. (Wiersma, 1995, p.363)

Inferences are made about a population based on a subset or random sample of that population.

Note that in educational research it is often not possible to have a random sample – instead we attempt to show that the sample is typical of the population by comparing demographics, e.g. gender, age, educational background.

Hypothesis testing

In inferential statistics, a hypothesis is used to determine whether an observation has an underlying cause or whether it was due to some random fluctuation or error in a sample.

The researcher will test to see if the hypothesis is consistent with the sample data. If it is not, the hypothesis is rejected.

Two different ways of stating a hypothesis:

Looking for a difference between groups;

Looking for relationships between variables.

Hypothesis testing

On what basis do we accept or reject a hypothesis? Consider this example:

A set of exercises was designed to encourage reflection on program design. It was hypothesized that these exercises improved students’ skills in program design. This method was used on a class of 30 students. In a test on program design, the class scored a mean of 60% with a standard deviation of 10. The same test on another class that had not used these exercises resulted in a mean score of 55% with a standard deviation of 12.

Does the hypothesis seem reasonable? What if the class mean was 70%? What about 57%?

Null hypothesis

In inferential statistics we test the opposite of a research hypothesis using the null hypothesis. For example:

Research hypothesis: Skills in program design will be improved with the use of exercises to encourage reflection on program design.

Null hypothesis: There will be no difference in skill levels in program design between students who have completed exercises to encourage reflection on program design and those who have not.

Research hypothesis: The performance of introductory programming students is related to prior programming experience.

Null hypothesis: There is no relationship between programming performance and prior programming experience.

If your study finds a statistically significant difference or relationship then you can reject the null hypothesis (H0) and you can state that there is support for your research hypothesis (H1).

Sampling distribution

We need more than intuition here.

We will connect probability with a statistic, using the concept of a sampling distribution of the statistic.

A sampling distribution consists of the values of a statistic computed from all possible samples of a given size. (Wiersma, 2005, p.375).

Note that the sampling distribution is not the sample distribution.

What does this mean?

We have a population.

We can take a sample of size n from the population and compute a statistic of this sample, e.g. the mean.

We take all possible samples of size n and compute the statistic of these samples.

We now have a distribution of the statistic.
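The steps above can be sketched in code for a population small enough to enumerate. The six marks are hypothetical, and the samples are assumed to be drawn without replacement with order ignored; neither choice comes from the slides.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of six test marks (illustrative values only).
population = [52, 58, 61, 67, 70, 76]
n = 2  # sample size

# Every possible sample of size n (without replacement) and its mean.
sample_means = [mean(sample) for sample in combinations(population, n)]

print(len(sample_means))    # how many samples of size 2 exist
print(mean(population))     # population mean
print(mean(sample_means))   # mean of the sampling distribution of the mean
```

The list `sample_means` is the sampling distribution of the mean for this population and sample size. Its mean matches the population mean.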

Central limit theorem

The shape, location (central tendency) and variability (dispersion) of the sampling distribution are described by the central limit theorem.

The central limit theorem (CLT) states:

Given any population, the distribution of the sample mean is approximately a normal distribution, provided the sample size is large.

This is the key theorem in statistics!

Central limit theorem

The central limit theorem specifies that the sampling distribution of the mean has a mean equal to the population mean (μ), a standard deviation equal to σ/√n, and is approximately normally distributed when the sample size is large.

(σ is the standard deviation of the population and n is the sample size)

Some simulations to illustrate this:

http://www.stat.sc.edu/~west/javahtml/CLT.html

http://www.rand.org/statistics/applets/clt.html

http://en.wikipedia.org/wiki/Concrete_illustration_of_the_central_limit_theorem
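A small simulation in the same spirit can also be run locally. This is only a sketch: the skewed exponential population, the sample size of 50 and the number of repeats are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma = 1.0, 1.0   # mean and std deviation of the assumed exponential population
n = 50                 # size of each sample
repeats = 5000         # number of samples drawn

# Draw many samples from a clearly non-normal population and keep each mean.
sample_means = [
    fmean(random.expovariate(1 / mu) for _ in range(n))
    for _ in range(repeats)
]

standard_error = sigma / sqrt(n)
# Share of sample means within 1.96 standard errors of mu.
within = sum(abs(m - mu) <= 1.96 * standard_error for m in sample_means) / repeats

print(round(fmean(sample_means), 3))   # close to mu
print(round(stdev(sample_means), 3))   # close to sigma / sqrt(n)
print(round(within, 3))                # close to 0.95 if roughly normal
```

Although the population is strongly skewed, the sample means centre on μ with a spread near σ/√n.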

Level of significance

The level of significance is a probability used in testing hypotheses. It is a criterion used in making a decision about the hypothesis.

The common level used in educational research is 0.05. Occasionally other levels are used: 0.01, 0.001 and 0.1.

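A level of 0.05 means that the null hypothesis is rejected when the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true, is lower than 0.05.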

It then follows that if the null hypothesis is true it will only be rejected 5% of the time.
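The 5% figure can be checked by simulation. The sketch below assumes a normal population with known σ and a two-sided z-test; the values of `mu0`, `sigma` and `n` are illustrative assumptions. Every sample is drawn with the null hypothesis true, so each rejection is a wrong decision.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(2)  # fixed seed so the run is repeatable

alpha = 0.05
mu0, sigma, n = 60.0, 10.0, 30   # assumed values for illustration
trials = 10_000

def p_value(sample):
    """Two-sided p-value of a z-test of H0: population mean equals mu0."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# H0 is true in every trial because samples come from a population with mean mu0.
rejections = sum(
    p_value([random.gauss(mu0, sigma) for _ in range(n)]) < alpha
    for _ in range(trials)
)
rejection_rate = rejections / trials
print(rejection_rate)   # close to alpha
```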

We now connect the sampling distribution with the level of significance.

The “68.3 - 95.4 - 99.7” rule

http://en.wikipedia.org/wiki/Image:Standard_deviation_diagram.svg
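The percentages in this rule are the areas of a normal distribution within one, two and three standard deviations of the mean. A minimal sketch using Python's standard library (the helper name `central_area` is hypothetical):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def central_area(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * central_area(k), 2))   # percentage within k std deviations
```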


Z-score

The z-score (also called standard score) indicates how far, and in what direction, a score deviates from its distribution's mean, expressed in units of the distribution's standard deviation.

The formula for creating z-scores is:

z = (x − μ) / σ

Where:

x is a raw score to be standardized

μ is the mean of the population

σ is the standard deviation of the population

Standard z-score

The z-score indicates if a score was above or below the distribution mean.

A z-score of +1 indicates one standard deviation above the population mean.

A z-score of -1 indicates one standard deviation below the population mean.

For example, a mark of 53 on a test where the mean of all marks was 67 and the standard deviation of marks was 7 would give a standard score of -2.0.
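The calculation in this example can be written as a one-line function. The function name `z_score` is an illustrative choice.

```python
def z_score(x, mu, sigma):
    """Standardize raw score x using population mean mu and standard deviation sigma."""
    return (x - mu) / sigma

print(z_score(53, 67, 7))   # the example above: -2.0
print(z_score(74, 67, 7))   # one standard deviation above the mean: 1.0
```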

Properties of standard scores

A z-score makes it possible to compare scores from different distributions.

z-scores have the following properties:

The mean of any set of z-scores is zero.

The standard deviation of any set of z-scores is always equal to 1.

The distribution of z-scores has the same shape as the distribution of raw scores from which they were derived.
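These properties can be checked on any set of scores. The sketch below uses hypothetical raw scores and treats them as the whole distribution, so the population standard deviation (`pstdev`) is used.

```python
from statistics import fmean, pstdev

# Hypothetical raw scores, treated as the whole distribution.
raw = [42, 55, 61, 67, 67, 73, 80, 91]

mu = fmean(raw)
sigma = pstdev(raw)   # population standard deviation
z_scores = [(x - mu) / sigma for x in raw]

print(fmean(z_scores))    # zero, up to floating-point rounding
print(pstdev(z_scores))   # one, up to floating-point rounding
```

Standardizing only shifts and rescales the scores, so their order and relative spacing (the shape of the distribution) are unchanged.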

Confidence intervals

A confidence interval specifies a range within which we can have some degree of confidence of finding another value, usually the population mean.

To construct a confidence interval based on the normal distribution we need:

a random sample of size n

the sample mean

the standard deviation of the population

a level of confidence

Defining confidence intervals

To find the lower (L) and upper (U) limits for a confidence interval we use the following:

L = x̄ − z × σ/√n

U = x̄ + z × σ/√n

Where:

x̄ is the sample mean

n is the sample size

σ is the standard deviation of the population

z is a z-score indicating the confidence level

Increasing the confidence level widens the confidence interval.

Increasing the sample size narrows the confidence interval.

Increasing the standard deviation makes the interval wider.

Common confidence levels are 90%, 95% and 99%, but we can specify any level below 100%.
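The three effects listed above can be seen by computing a few intervals. This is a sketch under the stated assumption of a known population standard deviation. The function names and the input values (a mean of 70, σ of 10, n of 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def normal_ci(sample_mean, sigma, n, level=0.95):
    """Confidence interval for a population mean when sigma is known."""
    z = NormalDist().inv_cdf((1 + level) / 2)   # z-score for the confidence level
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

def width(interval):
    lower, upper = interval
    return upper - lower

base = width(normal_ci(70, sigma=10, n=25, level=0.95))
print(width(normal_ci(70, 10, 25, level=0.99)) > base)    # higher confidence: wider
print(width(normal_ci(70, 10, 100, level=0.95)) < base)   # larger sample: narrower
print(width(normal_ci(70, 20, 25, level=0.95)) > base)    # larger std deviation: wider
```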

Choosing the z-score

For 95% confidence we choose a central area of 0.95 on the standard normal curve, which lies between z = −1.96 and z = +1.96.

For 90% confidence we choose a central area of 0.90 on the standard normal curve, which lies between z = −1.645 and z = +1.645.
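The z-score for any confidence level can be looked up from the inverse of the standard normal cumulative distribution. A central area of `level` leaves (1 − level)/2 in each tail, so the upper limit sits at cumulative probability (1 + level)/2. The helper name below is hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """z such that the central area between -z and +z equals `level`."""
    return NormalDist().inv_cdf((1 + level) / 2)

print(round(z_for_confidence(0.95), 3))   # 1.96
print(round(z_for_confidence(0.90), 3))   # 1.645
```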

Example

The numbers below were randomly drawn from a normal population with σ = 10.

56.87, 73.96, 59.77, 75.89, 71.60, 81.94, 69.11, 80.07, 74.70, 63.32

The sample mean = 70.72 and we want a 95% confidence interval. So,

L = 70.72 − 1.96 × 10/√10 = 64.52

U = 70.72 + 1.96 × 10/√10 = 76.92
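The same calculation can be reproduced directly from the ten data values:

```python
from math import sqrt
from statistics import fmean

data = [56.87, 73.96, 59.77, 75.89, 71.60, 81.94, 69.11,
        80.07, 74.70, 63.32]
sigma = 10   # population standard deviation, given in the example
z = 1.96     # z-score for 95% confidence

x_bar = fmean(data)
margin = z * sigma / sqrt(len(data))
lower, upper = x_bar - margin, x_bar + margin

print(round(x_bar, 2), round(lower, 2), round(upper, 2))   # 70.72 64.52 76.92
```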

QM STEM Ed 201823

Example cont..

So we are 95% confident that the population mean is between 64.52 and 76.92.

What does this really mean?

Would you get the same result from another random sample of size 10?

What if you took another 100 samples and constructed 100 confidence intervals? They would all be different and about 5% of them would not even contain the population mean.
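This repeated-sampling interpretation can be simulated. The sketch below assumes a "true" population mean of 70, which is unknown in practice, and uses 10,000 repetitions rather than 100 so that the proportion is stable.

```python
import random
from math import sqrt
from statistics import fmean

random.seed(3)  # fixed seed so the run is repeatable

mu, sigma, n = 70.0, 10.0, 10   # mu is an assumed true mean, for simulation only
z = 1.96                        # z-score for 95% confidence
repeats = 10_000
margin = z * sigma / sqrt(n)

covered = 0
for _ in range(repeats):
    x_bar = fmean(random.gauss(mu, sigma) for _ in range(n))
    # Does this sample's interval contain the population mean?
    if x_bar - margin <= mu <= x_bar + margin:
        covered += 1

coverage = covered / repeats
print(coverage)   # close to 0.95
```

Each interval is centred on a different sample mean, and roughly 95% of them contain μ.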

QM STEM Ed 201824

The standard error

The standard error of the sample mean is:

σ_x̄ = σ/√n

You can see that the standard error gets smaller as the sample size increases.

The standard error also shows up in the confidence interval formula:

x̄ ± z × σ/√n

This is why the interval gets narrower as n increases.
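A short numeric illustration, assuming σ = 10: because n sits under a square root, the sample size must be quadrupled to halve the standard error.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean."""
    return sigma / sqrt(n)

for n in (10, 40, 160, 640):
    print(n, round(standard_error(10, n), 3))   # each step quadruples n
```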

QM STEM Ed 201825

Null hypothesis

The null hypothesis H0 is the hypothesis of no difference or no relationship.

But there is a possibility of a wrong decision.

Reducing the risk of one error increases the risk of another error.

Researcher’s decision vs. “state of the world” (actual situation):

Decision      H0 True                        H0 False
Accept H0     Correct                        Error (Type II error, p = β)
Reject H0     Error (Type I error, p = α)    Correct (power, p = 1 − β)

QM STEM Ed 201826

Type I and Type II errors

Type I error occurs when the decision is to reject the null hypothesis when it is actually true. The probability of this error equals the significance level and is symbolized by α.

Type II error occurs when the decision is not to reject the null hypothesis when it is actually false. The probability of this error is symbolized by β.

QM STEM Ed 201827

Power of a statistical test

The power of a statistical test is the probability that the test will lead to a decision to reject the null hypothesis when the null hypothesis is indeed false.

How to increase the power of a test:

Increase the significance level (this also increases the risk of a Type I error)

Increase the sample size

Reduce variability – e.g. use homogeneous groups, exclude outliers

http://www.cas.buffalo.edu/classes/psy/segal/2072001/Hyptest/Hyptsting.htm
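The effect of each of these three factors can be sketched for one simple case. The code assumes a one-sided z-test with known σ, which is a simplification chosen for illustration and not a method named in the slides. The true difference of 5 marks, σ = 10 and n = 30 are illustrative values.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test when the true mean exceeds the null mean by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection threshold under H0
    return 1 - std_normal.cdf(z_crit - effect * sqrt(n) / sigma)

base = power_one_sided_z(effect=5, sigma=10, n=30, alpha=0.05)
print(round(base, 3))
print(power_one_sided_z(5, 10, 30, alpha=0.10) > base)   # higher significance level
print(power_one_sided_z(5, 10, 60, alpha=0.05) > base)   # larger sample
print(power_one_sided_z(5, 8, 30, alpha=0.05) > base)    # less variability
```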

QM STEM Ed 201828

Power

[Figure: the predicted distribution and the actual distribution]

QM STEM Ed 201829

Review of inferential statistics reasoning

We have a population whose measures (parameters) we wish to know.

We select a random sample and compute measures of the sample (statistics).

The statistics reflect the corresponding parameters and sampling distribution.

We observe the statistics, and infer back to the parameters in the light of the sampling distribution and probability.

QM STEM Ed 201830

Analysis using inferential statistics

Data may be analysed using inferential statistics.

A common process is through hypothesis testing.

The role of hypothesis testing is to determine whether the result obtained from analysis occurred by chance.

The null hypothesis is tested through these statistical tests.

There are numerous tests.

There are many different sampling distributions.

QM STEM Ed 201831

Tests using inferential statistics

Broadly two categories:

Parametric analyses – interval scale measurement and assumptions about the population.

Nonparametric analyses – typically nominal and ordinal scale measurement and generally no assumptions about the population.

QM STEM Ed 201832

Deciding upon which test to use

Factors to consider:

Number of independent and dependent variables.

Measurement levels of independent and dependent variables.

Related vs. non-related variables (only relevant when comparing groups).

Number of categories for the independent variables.

QM STEM Ed 201833

Why is measurement important?

Compute the average mark of these results.

83 47 34 23 85 33 84 83 72 94 30

Compute the average hair colour.

black brown red blonde blonde blonde black red
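The point of the two exercises can be made concrete: a mean is meaningful for the marks, while for a nominal variable such as hair colour only counts and the most frequent category (the mode) make sense. A minimal sketch with the values above:

```python
from statistics import mean, mode

marks = [83, 47, 34, 23, 85, 33, 84, 83, 72, 94, 30]
hair = ["black", "brown", "red", "blonde", "blonde", "blonde", "black", "red"]

print(round(mean(marks), 1))   # 60.7
print(mode(hair))              # blonde: the most frequent category, not an "average"
```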

QM STEM Ed 201834

Statistical vs. practical significance

We may be able to demonstrate statistical significance but the effects may be very small.

Hence, the effects may be statistically significant but not practically significant.

Perhaps the cost of the new teaching method or technology may outweigh the benefits gained.
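A sketch of how this can happen, using assumed figures: two very large groups whose mean marks differ by one point on a test with a standard deviation of 10, compared with a two-sample z-test (an illustrative choice of test with σ assumed known).

```python
from math import sqrt
from statistics import NormalDist

# Assumed figures for illustration only.
mean_a, mean_b = 61.0, 60.0
sigma, n_per_group = 10.0, 10_000

standard_error = sigma * sqrt(2 / n_per_group)   # std error of the difference in means
z = (mean_a - mean_b) / standard_error
p = 2 * (1 - NormalDist().cdf(abs(z)))           # two-sided p-value

# The difference expressed in standard deviation units.
standardized_difference = (mean_a - mean_b) / sigma

print(round(z, 2), p < 0.05, standardized_difference)
```

The difference is statistically significant at the 0.05 level, yet it amounts to only a tenth of a standard deviation, which may not justify the cost of a new teaching method.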

QM STEM Ed 201835

Meta analysis

Quantitative methods are typically applied to individual studies.

Quantitative methods may also be used to review results across studies – this is called meta analysis.

Vote counting – this technique has poor power

p-values – uses the size of the p-value

Lost studies – estimates of the number of studies not published due to non-significant results

Effect sizes – estimates of the effect size of studies are compared

QM STEM Ed 201836

Categories of statistical tests

Differences between groups

Degree of relationship between variables

Clustering of variables or individuals

Analyses across time

In the remaining lectures we will review some of the tests most commonly used in educational research.

QM STEM Ed 201837

Date post:	10-Oct-2020
Category:	Documents
Upload:	others
View:	2 times
Download:	0 times

New Topic 4: Inferential statistics · 2019. 9. 4. · QM STEM Ed 2018 4 Inferential statistics...

Documents