Data Analysis Using SPSS EDU5950 SEM1 2014-15 Assoc. Prof. Dr. Rohani Ahmad Tarmizi Institute for...

Data Analysis Using SPSS

EDU5950SEM1 2014-15

•Assoc. Prof. Dr. Rohani Ahmad Tarmizi•Institute for Mathematical Research/

•Faculty of Educational Studies•UPM

LEARNING OUTCOMES

First - students will be able to conceptualize the importance of choosing appropriate statistical analyses

Second - students will be able to conduct DATA ENTRY procedures

Third - students will be able to conduct descriptive statistical analysis and interpret the findings

Fourth - students will be able to conduct tests of hypotheses of differences and interpret the findings

Fifth - students will be able to conduct tests of correlation or relationship and interpret the findings

Statistics ANALYSES Some background

► As we all know, human beings are complex entities complete with knowledge , beliefs, feelings, opinions, attitudes, etc.

► Studying human subjects by examining a single independent variable (IV) and a single dependent variable (DV) is truly impractical since these variables do not co-exist in isolation as part of the human mind or set of behaviors.

► These two variables (an IV and the examined DV) may affect or be affected by several other variables.

► In order to draw conclusions and offer accurate explanations of the phenomenon of interest, the researcher should be willing to examine many variables simultaneously.

Variables

EXTERNAL REWARDS INTRINSIC MOTIVATION

Independent Variables

Dependent Variable

Variables

EXTERNAL REWARDS

TASK INTEREST

TASK STRUCTURE

INTRINSIC MOTIVATION


Dependent Variable

Variables

Spiritual well-being

Experience

Training

Demography: ► Gender ► Educational levels

Counseling competency: ► skills ► knowledge ► awareness

Level of integration of religious perspectives


Dependent Variable

Variables

Characteristics studied that assume different values for different elements

Demography: ► Gender ► Job tenure ► Occupational status

Job characteristics: ► Work condition ► Job demand ► Job control

Perceived quality of ICT facilities

Career commitment

Quality of work life


Intervening Variable

Dependent Variable

BASIC CONCEPT: STATISTICAL ANALYSIS

MAJOR GROUPS OF HYPOTHESIS TESTINGS

• GROUP DIFFERENCES

• RELATIONSHIP BETWEEN VARIABLES

• PREDICTION OF GROUP MEMBERSHIP

• STRUCTURAL ANALYSES

Group Differences

1. t Test (independent t-test)

Compares the mean of an interval/ratio DV between the groups of a qualitative IV. It analyzes the difference between the means of two groups.

Example hypothesis: There is a significant difference in mean literacy performance between male and female preschoolers.

2. t Test (dependent t-test)

Compares the mean of an interval/ratio DV based on paired or matched scores. It analyzes the difference between means that are paired/matched within the same group.

Example hypothesis: There is a significant difference in mean literacy performance from pre to post remedial program among preschoolers who undergo the remedial program.

Group Differences

3. One-Way Analysis of Variance (ANOVA) and t Test

Compares the mean of an interval/ratio DV among the groups of a qualitative IV. It analyzes the variation between and within the groups. Since ANOVA only determines whether the groups differ and does not identify which groups are significantly different, post hoc tests are usually conducted.

Example hypothesis: There are significant differences in mean literacy performance between preschoolers from the low, middle and high SES groups.

Group Differences

4. One-Way Analysis of Covariance (ANCOVA)

Assesses group differences on a single metric DV after the effects of one or more covariates are statistically removed. Covariates are chosen because of their known relationship with the DV.

Example research question: Do preschoolers of low, middle and high SES have different literacy test scores after adjusting for family type?

Example hypothesis: There are significant differences in mean literacy performance between preschoolers from the low, middle and high SES groups after adjusting for family type.

Group Differences

5. Factorial Analysis of Variance (factorial ANOVA)

Compares differences in one metric DV among the groups of several nonmetric IVs, including the interactions among the IVs.

Example research question: Do ethnicity and learning preference (IVs) significantly affect reading achievement (DV) among primary school students?

Relationship and Prediction between Variables

6. Bivariate Correlation and Regression

Bivariate correlation assesses the degree of relationship between two metric variables.

Example research question: What is the relationship between motivation achievement and CGPA of UPM freshman students?

7. In contrast, bivariate regression uses the relationship between the IV and the DV to predict the score on the DV from the IV.

Example research question: To what extent do motivation achievement scores (IV) predict the CGPA of UPM freshman students?

8. Multiple correlation - the degree of relationship between one metric DV and a set of metric IVs.

Example research question: What is the relationship between motivation achievement, learning preference and locus of control (IVs) and the CGPA of UPM freshman students?

9. Multiple Regression
- Objective: to predict changes in the DV in response to changes in several IVs
- One metric DV
- One or more metric IVs

Example research question: To what extent do motivation achievement scores, learning preference and locus of control (IVs) predict the CGPA of UPM freshman students?

Relationship and Prediction between Variables

No. & Type of DV   No. & Type of IV        Test              Purpose of Analysis
1 DV               1 IV (2 categories)     t-test            Determine significance of mean group differences
1 DV               1 IV (>2 categories)    One-way ANOVA     Determine significance of mean group differences
1 DV               ≥ 2 IVs                 Factorial ANOVA   Determine significance of mean group differences

Decision-making Tree – Test of Group Differences
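The decision tree above can be read as a small lookup. The sketch below is only an illustration of that table: the function name and its arguments are hypothetical, and it assumes a single interval/ratio DV, as in every row of the table.

```python
def choose_group_difference_test(n_ivs: int, n_categories: int = 2) -> str:
    """Return the test of group differences suggested by the decision tree.

    Assumes one interval/ratio DV. `n_categories` is the number of groups
    of the IV and is only used when there is a single IV.
    """
    if n_ivs < 1:
        raise ValueError("at least one IV is required")
    if n_ivs >= 2:
        return "Factorial ANOVA"
    if n_categories < 2:
        raise ValueError("a grouping IV needs at least two categories")
    if n_categories == 2:
        return "t-test"
    return "One-way ANOVA"


# Gender (2 categories) as the only IV
print(choose_group_difference_test(1, 2))
# SES with low, middle and high groups as the only IV
print(choose_group_difference_test(1, 3))
# Ethnicity and learning preference as two IVs
print(choose_group_difference_test(2))
```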

Null Hypothesis Significance Testing

• This addresses:

• How likely it is to obtain an observed (i.e., sample) result given a specific assumption about the population.

• The assumption about the population is called the null hypothesis (e.g., there is no difference, there is no relationship, there is no predictive model, etc.) and the observed result is what the sample produces (e.g., there is a difference, there is a relationship, there is a predictive model).

Null Hypothesis Significance Testing

• Statistical tests such as z, t, F (ANOVA), etc., determine how likely the sample result or any result more distant from the null hypothesis would be if the null hypothesis were true.

• This probability is then compared to a set criterion, the alpha value (α), which is also called the Type I error rate or alpha error rate.

• POWER ANALYSIS focuses on situations for which the expectation is that the null hypothesis is false.
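The comparison of the probability with alpha can be written as a one-line rule. This is a minimal sketch, assuming the usual convention that the null hypothesis is rejected when the significance value is smaller than alpha; the function name is hypothetical. The two example values (.926 and .018) are Sig. values that appear in the SPSS output later in this deck.

```python
def nhst_decision(sig_value: float, alpha: float = 0.05) -> str:
    """Compare the SPSS 'Sig.' value with alpha and state the decision."""
    if not 0.0 <= sig_value <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    if sig_value < alpha:
        return "Reject H0"
    return "Fail to reject H0"


# Independent t-test by gender: Sig. (2-tailed) = .926
print(nhst_decision(0.926))
# One-way ANOVA by race: Sig. = .018
print(nhst_decision(0.018))
```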

Levels of Measurement

• Which statistics you can use to analyze your data is determined by the level of measurement of each variable

• Four levels of measurement:

• Nominal - a variable is grouped into classes with no particular order (race, favorite color, etc.)

• Ordinal - categories that can be ranked, but you don’t know how much higher or lower one category is than another (e.g., weight categories: underweight, normal, overweight, obese)

• Interval - data that have an inherent order and equal intervals between values, so differences between scores represent true magnitudes

• Ratio - data that have an inherent order, equal intervals between values, and a true 0 point

• For purposes of choosing statistical analyses, the distinction between interval and ratio is unimportant!

Levels of Measurement - Quiz

1. IQ scores

2. Gender

3. Income (as a dollar amount)

4. Income (in 6 categories)

5. Agreement scores (1=strongly disagree, 2=slightly disagree, 3=neutral, 4=slightly agree, 5=strongly agree)

6. Cancer status (has cancer, does not have cancer)

7. Practice location (rural, urban)

8. Cigarette smoking (no. of cig/day)

9. Cigarette smoking (none, up to ½ ppd, ½ ppd-<1 ppd, 1 ppd+)

Statistical Tools For Descriptive Analyses

• Frequency/percentage table
• Pie or bar charts
• Histogram
• Frequency polygon
• Cross-tabulation
• Scatter diagram
• Mean, Median, Mode, Maximum, Minimum
• Range, Variance, Standard Deviation, Coefficient of Variation, Standard Scores

Statistical Tools For Inferential Statistics

• PARAMETRIC TESTS:
  – Test of hypothesis of differences between means: Z-test, t-test, F-test, MANOVA
  – Test of hypothesis of relationship: Pearson r, Point-biserial, Regression

• NON-PARAMETRIC TESTS:
  – Mann-Whitney
  – Kruskal-Wallis
  – Spearman rho
  – Chi-Square, Cramer’s V, Lambda, etc.

In most research projects, it is likely that you will use quite a variety of different types of statistics, depending on the question you are addressing and the nature (level of measurement) of the data that you have.

It is therefore important that you have a basic understanding of:
• the different statistical tools
• the types of objectives, research questions and hypotheses to address
• the underlying assumptions and requirements

• TO DESCRIBE MEASURED VARIABLES

• TO COMPARE MEANS or MEDIANS or FREQUENCIES – test of differences

• TO CORRELATE OR DETERMINE RELATIONSHIP OR ASSOCIATION – test of association or relationship

THREE MAJOR STATISTICAL TECHNIQUES

ACTIVITY 1- DATA ENTRY

INITIAL DATA FILE – VARIABLE VIEW

INITIAL DATA FILE – DATA VIEW

Go to Variable view

To define the IVs & DVs.

Use a separate line for each variable and give sensible names.

Decide the specification/format of the data: NAME, TYPE, WIDTH, DECIMALS, LABEL, VALUES, MISSING, COLUMNS, ALIGN, MEASURE. For example, String = text and Numeric = numbers; numeric is generally the best format.

Variable view – use to define or give specifications for the IVs and DVs

Go to Data view

To insert data – the measured and collected responses for variables.

Data are entered in columns under the appropriate variable names.

Each row designates one respondent of the study.

DATA VIEW – use to input data (respondents by rows and variables by columns)

EXAMPLE OF DATA SET IN SPSS DATA EDITOR – Variable view

EXAMPLE OF DATA SET IN SPSS DATA EDITOR – Data View

DATA TRANSFORMATION

• Used when variables need to be transformed as intended by the researcher or as stated in the objectives.

• TRANSFORM – COMPUTE
  To compute or sum the scores

• TRANSFORM – RECODE
  Recoding negatively worded scale items
  Collapsing continuous variables
  Replacing missing values

TO RECODE:
• CLICK TRANSFORM => RECODE
• YOU WILL GET THE RECODE DIALOG BOX
• MOVE THE VARIABLE TO THE EMPTY RIGHT-HAND BOX
• NAME THE NEW VARIABLE AND LABEL
• CLICK CHANGE
• CLICK THE OLD AND NEW VALUES BUTTON

To COMPUTE a score (TEACHER_EFFICACY)

• Click Transform => Compute
• You will get a Compute Variable dialog box
• Name your Target Variable
• Type in the required Numeric Expression
• Click OK
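The two transformations above can be illustrated outside SPSS. The sketch below makes several assumptions that are not stated in the slides: the items are scored 1 to 6 (the minimum and maximum shown in the Descriptive Statistics output), a negatively worded item is reversed with low + high - score, and the computed score is the mean of the answered items. The function names and the respondent data are hypothetical.

```python
def reverse_code(score, low=1, high=6):
    """Recode a negatively worded item so that a high score means agreement."""
    return low + high - score


def compute_scale_score(item_scores):
    """Compute one respondent's scale score as the mean of the answered items.

    Missing answers are represented by None and are skipped (an assumption;
    SPSS behaviour depends on the numeric expression that is typed in).
    """
    valid = [s for s in item_scores if s is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical respondent: four items, the third one left unanswered
answers = [4, 5, None, 3]
print(compute_scale_score(answers))

# Hypothetical negatively worded item answered with 2 on a 1-6 scale
print(reverse_code(2))
```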

DATA SET WITH NEW VARIABLES - TEACHER_FACTOR

You should be able to calculate descriptive statistics such as frequencies, descriptives, and crosstabs, bar charts, scattergram, box plot, histogram, etc.

Remember: output appears in a separate window.

ACTIVITY 2- DESCRIPTIVE ANALYSIS

Use the following Menu:

– DESCRIPTIVE STATISTICS > FREQUENCIES

– DESCRIPTIVE STATISTICS > DESCRIPTIVES
– CUSTOM TABLES

– DISPLAY DATA: HISTOGRAM, BOXPLOT, STEM AND LEAF

TO DESCRIBE MEASURED VARIABLES

Gender

                              Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Male (lelaki)            22         34.4        34.4              34.4
        Female (perempuan)       42         65.6        65.6             100.0
        Total                    64        100.0       100.0

Race

                              Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Malay (MELAYU)           15         23.4        23.4              23.4
        Chinese (CINA)           41         64.1        64.1              87.5
        Indian (INDIA)            8         12.5        12.5             100.0
        Total                    64        100.0       100.0

TO OBTAIN FREQUENCY DISTRIBUTION

Religion

                              Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Islam (ISLAM)          25         39.1        39.7              39.7
          Buddhist (BUDDHA)      24         37.5        38.1              77.8
          Hindu (HINDU)           1          1.6         1.6              79.4
          Christian (KRISTIAN)   13         20.3        20.6             100.0
          Total                  63         98.4       100.0
Missing   System                  1          1.6
Total                            64        100.0
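The difference between Percent and Valid Percent in the Religion table comes from the one missing case: Percent divides by all 64 respondents, while Valid Percent and Cumulative Percent divide by the 63 valid answers. A minimal sketch that recomputes the table from the frequencies shown above (the function name is hypothetical):

```python
def frequency_table(counts, n_missing=0):
    """Rebuild Percent, Valid Percent and Cumulative Percent from frequencies."""
    n_valid = sum(counts.values())
    n_total = n_valid + n_missing
    rows = []
    running = 0
    for label, freq in counts.items():
        running += freq
        rows.append((
            label,
            freq,
            round(100 * freq / n_total, 1),     # Percent: base is all cases
            round(100 * freq / n_valid, 1),     # Valid Percent: base is valid cases
            round(100 * running / n_valid, 1),  # Cumulative Percent
        ))
    return rows


# Frequencies copied from the Religion table; one case is system-missing
religion = {"ISLAM": 25, "BUDDHA": 24, "HINDU": 1, "KRISTIAN": 13}
for row in frequency_table(religion, n_missing=1):
    print(row)
```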

TO OBTAIN DESCRIPTIVE STATISTICS OF DATA

Descriptive Statistics

Item                                                                                 N    Minimum   Maximum   Mean   Std. Deviation
My teacher wants us to enjoy learning maths                                          60      1         6      3.75      1.580
My teacher understand our problems in learning maths                                 36      1         6      3.89      1.833
My teacher try to make mathematics lessons interesting                               64      1         6      4.00      1.533
My teacher appreciates it when we try hard, even when our results are not so good    64      1         6      4.16      1.514
My teacher show us step by step and how to solve maths problems                      63      2         6      4.25      1.534
My teacher listen carefully to what we say                                           64      1         6      4.16      1.185
My teacher is friendly to us                                                         64      1         6      3.52      1.491
My teacher gives us time to explore new maths problems                               63      1         6      3.81      1.216
Valid N (listwise)                                                                   34

Report

Item 1: My teacher wants us to enjoy learning maths
Item 2: My teacher understand our problems in learning maths
Item 3: My teacher try to make mathematics lessons interesting

Gender                                 Item 1   Item 2   Item 3
Male (lelaki)        Mean               3.74     3.75     3.91
                     N                  19       12       22
                     Std. Deviation     1.821    1.913    1.716
Female (perempuan)   Mean               3.76     3.96     4.05
                     N                  41       24       42
                     Std. Deviation     1.480    1.829    1.447
Total                Mean               3.75     3.89     4.00
                     N                  60       36       64
                     Std. Deviation     1.580    1.833    1.533

TO OBTAIN DESCRIPTIVE -COMPARE MEANS OF DIFFERENT GROUPS

Plot graphs – you should be able to plot bar charts for sets of scores & plot scattergrams of relationships between the two sets of scores.

Remember: Select Graphs then explore the alternatives.

TO OBTAIN DESCRIPTIVE -COMPARE MEANS OF DIFFERENT GROUPS

Report

Item 1: My instructor wants us to enjoy learning maths
Item 2: My instructor understand our problems in learning maths
Item 3: My instructor try to make mathematics lessons interesting

Gender                                 Item 1   Item 2   Item 3
Male (lelaki)        Mean               3.71     3.86     3.91
                     N                  21       22       22
                     Std. Deviation     1.736    1.521    1.716
Female (perempuan)   Mean               3.74     4.07     4.05
                     N                  42       42       42
                     Std. Deviation     1.466    1.504    1.447
Total                Mean               3.73     4.00     4.00
                     N                  63       64       64
                     Std. Deviation     1.547    1.501    1.533

Summary of Statistical Tools For Descriptive Analyses

• Frequency/percentage table
• Pie or bar charts
• Histogram
• Frequency polygon
• Cross-tabulation
• Scatter diagram
• Mean, Median, Mode, Maximum, Minimum
• Range, Variance, Standard Deviation, Coefficient of Variation, Standard Scores

ACTIVITY 3- COMPARISON OF MEANS OF TWO GROUPS

EXPLORING DIFFERENCES BETWEEN TWO GROUPS

1. t-test

t-tests are used when you have two groups (e.g. males and females) or two sets of data (before and after), and you wish to compare the mean score on some continuous variable.

There are two main types of t-tests.

Paired sample t-tests (also called repeated measures) are used when you are interested in changes in scores for subjects tested at Time 1, and then at Time 2 (often after some intervention or event). The samples are ‘related’ because they are the same people tested each time.

Independent sample t-tests are used when you have two different (independent) groups of people (males and females), and you are interested in comparing their scores. In this case, you collect information on only one occasion, but from two different sets of people.

• TO MAKE COMPARISONS BETWEEN GROUPS ON ANY MEASURED VARIABLES AT INTERVAL AND RATIO LEVEL

• CLICK ANALYZE => COMPARE MEANS

• You will get the following sub-menus:
  – MEANS
  – ONE-SAMPLE T-TEST
  – INDEPENDENT SAMPLES T-TEST
  – PAIRED SAMPLES T-TEST
  – ONE-WAY ANOVA

PURPOSE: Comparing means of two groups

EXAMPLE OF RESEARCH QUESTION: Is there a difference in instructors’ efficacy in teaching and learning mathematics as perceived by students of different gender?

PARAMETRIC STATISTIC: Independent t-test

INDEPENDENT VARIABLE: One categorical independent variable: gender, with two levels (males and females)

DEPENDENT VARIABLE: One continuous dependent variable: students’ perception of instructors’ efficacy in teaching and learning

To Compare Means of Two Groups

• Click: Analyze > Compare Means > Independent Samples T-Test
• You will get an Independent Samples T-Test dialog box
• Select your variables – Test variables & Group variables
• Click OK

Independent Samples Test: INSTRUCTORS’ EFFICACY

                               Levene's Test for         t-test for Equality of Means
                               Equality of Variances
                               F       Sig.              t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed        .883    .351              -.094   60       .926              -.02315           .24740                  -.51803        .47173
Equal variances not assumed                              -.095   42.237   .925              -.02315           .24347                  -.51440        .46811

(95% CI = 95% Confidence Interval of the Difference)

Group Statistics: INSTRUCTORS’ EFFICACY

Gender               N    Mean     Std. Deviation   Std. Error Mean
Male (lelaki)        21   3.9490   .89190           .19463
Female (perempuan)   41   3.9721   .93662           .14628

NULL HYPOTHESIS: There is no significant difference in the variance of students’ perception of instructors’ efficacy in T&L by gender

ALTERNATIVE HYPOTHESIS: There is a significant difference in the variance of students’ perception of instructors’ efficacy in T&L by gender

ALPHA VALUE: 0.05

SIGNIFICANT VALUE (FROM THE SPSS OUTPUT): .351

EVALUATING: Sig. value > α

DECISION: Fail to reject the null hypothesis (accept the null hypothesis)

Therefore, we choose t from the "Equal variances assumed" row.

DECISION MATRIX

NULL HYPOTHESIS: There is no significant difference in mean students’ perception of instructors’ efficacy in T&L by gender

ALTERNATIVE HYPOTHESIS: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L by gender

ALPHA VALUE: 0.05

SIGNIFICANT VALUE (FROM THE SPSS OUTPUT): .926

EVALUATING: The Sig. value is greater than α. This means the likelihood that the null hypothesis is true is high.

DECISION: Fail to reject the null hypothesis

CONCLUSION: There is no significant difference in students’ mean perception of instructors’ efficacy in T&L by gender, t(60) = -.094, p > .05 (or p = .926).

PURPOSE: Comparing means of two related sets of scores

EXAMPLE OF RESEARCH QUESTION: Is there a difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting?

PARAMETRIC STATISTIC: Dependent t-test

DEPENDENT VARIABLES: Two continuous dependent variables: students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting (Item 1 vs Item 3)

To Compare Means of Two Dependent Groups

• Click: Analyze > Compare Means > Paired Samples T-Test
• You will get a Paired Samples T-Test dialog box
• Select your variables – Paired variables
• Click OK

Paired Samples Correlations

Pair 1: "My instructor wants us to enjoy learning maths" with "My teacher try to make mathematics lessons interesting"

         N    Correlation   Sig.
Pair 1   63   .708          .000

Paired Samples Test

         Paired Differences
         Mean    Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t        df   Sig. (2-tailed)
Pair 1   -.238   1.174            .148              -.534          .058           -1.610   62   .112

(95% CI = 95% Confidence Interval of the Difference)


NULL HYPOTHESIS: There is no significant difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting

ALTERNATIVE HYPOTHESIS: There is a significant difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting

ALPHA VALUE: 0.05

SIGNIFICANT VALUE (FROM THE SPSS OUTPUT): .112

EVALUATING: The Sig. value is greater than α. This means the likelihood that the null hypothesis is true is high.

DECISION: Fail to reject the null hypothesis

CONCLUSION: There is no significant difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting, t(62) = -1.610, p > .05 (or p = .112).

DECISION MATRIX

EXPLORING DIFFERENCES BETWEEN GROUPS

One-way analysis of variance

One-way analysis of variance is similar to a t-test, but is used when you have two or more groups and you wish to compare their mean scores on a continuous variable. It is called one-way because you are looking at the impact of only one independent variable on your dependent variable.

A one-way analysis of variance (ANOVA) will let you know whether your groups differ, but it won’t tell you where the significant difference is (gp1/gp2, gp3/gp4 etc). You can conduct post-hoc comparisons to find out which groups are significantly different from one another.

You could also choose to test differences between specific groups, rather than comparing all the groups, by using planned comparisons. Similar to t-tests, there are two types of one-way ANOVAs: repeated measures ANOVA (same people on more than two occasions), and between-groups (or independent samples) ANOVA, where you are comparing the mean scores of two or more different groups of people.

PURPOSE: Comparing means of three groups

EXAMPLE OF RESEARCH QUESTION: Is there a difference in students’ perception of instructors’ efficacy in T&L mathematics by race?

PARAMETRIC STATISTIC: One-way between-groups ANOVA

INDEPENDENT VARIABLE: One categorical independent variable (three levels of race)

DEPENDENT VARIABLE: One continuous dependent variable: students’ perception of instructors’ efficacy in T&L mathematics

To Compare Means of Three or More Groups

• Click: Analyze > Compare Means > One-Way ANOVA
• You will get a One-Way ANOVA dialog box
• Select your variables: Dependent variables, then Factor or Group variables
• Click: Options
• Click OK

Descriptives: INSTRUCTORS’_EFFICACY

                   N    Mean     Std. Deviation   Std. Error   95% CI for Mean            Minimum   Maximum
                                                               Lower Bound   Upper Bound
Malay (MELAYU)     14   4.2704   .73282           .19586       3.8473        4.6935       3.07      5.36
Chinese (CINA)     40   3.7339   .96118           .15198       3.4265        4.0413       2.21      5.71
Indian (INDIA)      8   4.5804   .46673           .16501       4.1902        4.9706       3.86      5.07
Total              62   3.9643   .91443           .11613       3.7321        4.1965       2.21      5.71

ANOVA: INSTRUCTORS’ EFFICACY

                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   6.471            2    3.235         4.286   .018
Within Groups    44.537           59   .755
Total            51.008           61
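The ANOVA table can be checked by hand from the Descriptives table, because the between-groups and within-groups sums of squares depend only on each group's N, mean and standard deviation. The following is a sketch of that arithmetic (the function name is hypothetical); it does not compute the Sig. value, which needs the F distribution.

```python
def one_way_anova_from_summary(groups):
    """Compute the one-way ANOVA table from (n, mean, sd) for each group."""
    k = len(groups)
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total

    # Between groups: how far each group mean lies from the grand mean
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within groups: spread of scores around their own group mean
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "F": f_ratio,
    }


# (N, Mean, Std. Deviation) for MELAYU, CINA and INDIA from the Descriptives table
result = one_way_anova_from_summary([
    (14, 4.2704, 0.73282),
    (40, 3.7339, 0.96118),
    (8, 4.5804, 0.46673),
])
print(result)
```

The values should agree with the SPSS table up to rounding: sums of squares of about 6.471 and 44.537, and F(2, 59) of about 4.29.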

TEST OF DIFFERENCES BETWEEN GROUPS – BY RACE


NULL HYPOTHESIS: There is no significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by race

ALTERNATIVE HYPOTHESIS: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by race

ALPHA VALUE: 0.05

SIGNIFICANT VALUE (FROM THE SPSS OUTPUT): .018

EVALUATING: The Sig. value is smaller than α. This means the likelihood that the null hypothesis is true is small.

DECISION: Reject the null hypothesis, accept the alternative hypothesis

CONCLUSION: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by race, F(2,59) = 4.29, p < .05.

DECISION MATRIX

TEST OF DIFFERENCES BETWEEN GROUPS – BY RELIGION


NULL HYPOTHESIS: There is no significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by religion

ALTERNATIVE HYPOTHESIS: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by religion

ALPHA VALUE: 0.05

SIGNIFICANT VALUE (FROM THE SPSS OUTPUT): .018

EVALUATING: The Sig. value is smaller than α. This means the likelihood that the null hypothesis is true is small.

DECISION: Reject the null hypothesis, accept the alternative hypothesis

CONCLUSION: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by religion, F(2,58) = 11.98, p < .05.

DECISION MATRIX

Pearson Product-Moment Correlation

• A measure of the linear relationship between two variables.

• Correlation analysis produces Pearson Correlation Coefficient ( r ).

• It indicates the strength of the relation and the direction (+ve / -ve) of the relationship between the variables.

Significance of Relationship

• The significance of the relationship is expressed in probability levels p (e.g., significant at p =.05)

• The smaller the p-level, the more significant the relationship.

• The larger the correlation (r value), the stronger the relationship.

Example 1 Correlations

Correlations

                                                Total life satisfaction   Total Self esteem
Total life satisfaction   Pearson Correlation   1                         .488**
                          Sig. (2-tailed)                                 .000
                          N                     436                       434
Total Self esteem         Pearson Correlation   .488**                    1
                          Sig. (2-tailed)       .000
                          N                     434                       436

**. Correlation is significant at the 0.01 level (2-tailed).

Example 2 Correlations

Each cell shows the Pearson Correlation with its Sig. (2-tailed) value in parentheses; N = 80 for every pair.

                        Intimate Relationship   Friends         Common Sense    Academic Intelligence   General
Intimate Relationship   1                       .552** (.000)   .351** (.001)   .218 (.052)             .393** (.000)
Friends                 .552** (.000)           1               .462** (.000)   .244* (.029)            .546** (.000)
Common Sense            .351** (.001)           .462** (.000)   1               .400** (.000)           .525** (.000)
Academic Intelligence   .218 (.052)             .244* (.029)    .400** (.000)   1                       .261* (.019)
General                 .393** (.000)           .546** (.000)   .525** (.000)   .261* (.019)            1

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

Report the Output of a Pearson Product-Moment Correlation

• Report the value of the correlation coefficient, r, as well as the degrees of freedom (df)

• The degrees of freedom (df) is the number of data points minus 2 (N - 2).
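As a worked illustration of the reporting rule, the sketch below takes a correlation and its N and returns the degrees of freedom together with the t statistic commonly used to test whether r differs from zero, t = r * sqrt(N - 2) / sqrt(1 - r^2). That formula is standard but is not given in the slides, and the function name is hypothetical. The example values (r = .488, N = 434) come from Example 1 above.

```python
import math


def correlation_df_and_t(r, n):
    """Return (df, t) for testing a Pearson correlation against zero."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return df, t


# Example 1: life satisfaction and self esteem, r = .488 with N = 434 pairs
df, t = correlation_df_and_t(0.488, 434)
print(f"r({df}) = .488, t = {t:.2f}")
```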

Coefficient of Determination, r2

• How much of the variation in the DV - Y is due to change in the IV - X

• It is sometimes expressed as a percentage: the proportion of variance explained by the correlation.

• Example: r² = 0.36. Hence, 36% of the variation in Y is associated with the change in X; 64% of the variation in Y is due to other factors.

Regression Analysis

• Regression analysis procedures have as their primary purpose the development of an equation that can be used for predicting values on some DV for all members of a population.

• A secondary purpose is to use regression analysis as a means of explaining causal relationships among variables.

Regression Analysis

• The most basic application of regression analysis is the bivariate situation, which is referred to as simple linear regression, or just simple regression.

• Simple regression involves a single IV and a single DV.

• Goal: to obtain a linear equation so that we can predict the value of the DV if we have the value of the IV.

• Simple regression capitalizes on the correlation between the DV and IV in order to make specific predictions about the DV.

• The correlation tells us how much information about the DV is contained in the IV.

• If the correlation is perfect (i.e r = ±1.00), the IV contains everything we need to know about the DV, and we will be able to perfectly predict one from the other.

• Regression analysis is the means by which we determine the best-fitting line, called the regression line.

• Regression line is the straight line that lies closest to all points in a given scatterplot

• The least-squares regression line always passes through the centroid of the scatterplot (the point defined by the mean of X and the mean of Y).

• 3 important facts about the regression line must be known:
  – The extent to which points are scattered around the line
  – The slope of the regression line
  – The point at which the line crosses the Y-axis

• The extent to which the points are scattered around the line is typically indicated by the degree of relationship between the IV (X) and DV (Y).

• This relationship is measured by a correlation coefficient – the stronger the relationship, the higher the degree of predictability between X and Y.

• The degree of slope is determined by the amount of change in Y that accompanies a unit change in X.

• It is the slope that largely determines the predicted values of Y from known values for X.

• It is important to determine exactly where the regression line crosses the Y-axis (this value is known as the Y-intercept).


Descriptive Statistics

                   Mean     Std. Deviation   N
Grade - PMR MATH   2.53     1.468            62
TEACHER_FACTOR     3.9643   .91443           62

Correlations

                                         Grade - PMR MATH   TEACHER_FACTOR
Pearson Correlation   Grade - PMR MATH   1.000              .571
                      TEACHER_EFF        .571               1.000
Sig. (1-tailed)       Grade - PMR MATH   .                  .000
                      TEACHER_EFF        .000               .
N                     Grade - PMR MATH   62                 62
                      TEACHER_EFF        62                 62

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .571(a)   .326       .315                1.215

a. Predictors: (Constant), TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

ANOVA(b)

Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   42.848           1    42.848        29.021   .000(a)
    Residual     88.588           60   1.476
    Total        131.435          61

a. Predictors: (Constant), TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

Coefficients(a)

                     Unstandardized Coefficients   Standardized Coefficients
Model                B        Std. Error           Beta                        t        Sig.
1   (Constant)       -1.101   .692                                             -1.591   .117
    TEACHER_FACTOR   .917     .170                 .571                        5.387    .000

a. Dependent Variable: Grade - PMR MATH
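In simple regression the slope and intercept follow directly from the correlation and the descriptive statistics: slope = r × (SD of Y / SD of X) and intercept = mean of Y - slope × mean of X. The sketch below applies these standard formulas to the values printed above; the function names are hypothetical, and the intercept differs slightly from the SPSS value because the mean of the DV is only shown to two decimals.

```python
def simple_regression_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Return (slope, intercept, r_squared) for predicting Y from X."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept, r ** 2


def predict(x, slope, intercept):
    """Predicted DV score for a given IV score."""
    return intercept + slope * x


# X = TEACHER_FACTOR (mean 3.9643, SD .91443); Y = Grade - PMR MATH (mean 2.53, SD 1.468)
slope, intercept, r_squared = simple_regression_from_summary(
    r=0.571, mean_x=3.9643, sd_x=0.91443, mean_y=2.53, sd_y=1.468
)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R square = {r_squared:.3f}")

# Predicted grade for a hypothetical student with TEACHER_FACTOR = 4.0
print(f"predicted grade = {predict(4.0, slope, intercept):.2f}")
```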


Descriptive Statistics

                   Mean     Std. Deviation   N
Grade - PMR MATH   2.53     1.468            62
TEACHER_EFF        3.9643   .91443           62
Race               1.90     .593             62

Correlations

                                         Grade - PMR MATH   TEACHER_FACTOR   Race
Pearson Correlation   Grade - PMR MATH   1.000              .571             -.015
                      TEACHER_EFF        .571               1.000            .019
                      Race               -.015              .019             1.000
Sig. (1-tailed)       Grade - PMR MATH   .                  .000             .453
                      TEACHER_EFF        .000               .                .440
                      Race               .453               .440             .
N                     Grade - PMR MATH   62                 62               62
                      TEACHER_EFF        62                 62               62
                      Race               62                 62               62

Model Summary(b)

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .572(a)   .327       .304                1.225

a. Predictors: (Constant), Race, TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

ANOVA(b)

Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   42.939           2    21.469        14.313   .000(a)
    Residual     88.497           59   1.500
    Total        131.435          61

a. Predictors: (Constant), Race, TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

Coefficients(a)

                     Unstandardized Coefficients   Standardized Coefficients
Model                B        Std. Error           Beta                        t        Sig.
1   (Constant)       -.980    .853                                             -1.150   .255
    TEACHER_FACTOR   .917     .172                 .571                        5.349    .000
    Race             -.065    .265                 -.026                       -.246    .806

a. Dependent Variable: Grade - PMR MATH

Performing the paired t-test

Use: Analyze > Compare Means > Paired Samples T-Test

This opens up the dialogue box

The paired samples t-test dialogue box

Transfer the two levels of the IV to the ‘Paired Variables’ box

Both need to be highlighted

Variables shown in box as paired

Click OK

Output (1)

Mean for each condition

Number of paired scores

SD for each condition

Means suggest difference, but need to look at output of t-test to see if significant

Output (2)

t-value

p value

df

Mean difference score

Reporting

There was a significant effect of the statistics lecture on depression, t(18) = 5.86, p < .05. Findings indicated that depression scores recorded after the lecture were lower (mean = 13.0, SD = 2.33) than those recorded before the lecture (mean = 13.95, SD = 2.48).

Independent samples t-test

Used when different participants take part in each experimental condition.

Hypothesis: males can eat more chillies than females.

Eight males & eight females were tested on their chilli tolerance in a chilli eating competition.

Use the arrow key to put the DV in the Test Variable(s) box.

Use the arrow key to put the IV in the Grouping Variable box, then define the levels (groups) of the IV.

Examine descriptive statistics first.

Group Statistics: CHILLIES

GENDER   N   Mean     Std. Deviation   Std. Error Mean
male     8   5.6250   1.4079           .4978
female   8   4.1250   1.1260           .3981

[Bar chart: mean CHILLIES (y-axis, 3.5 to 6.0) by GENDER (x-axis: female, male)]

Results suggest that males could eat more chillies than females. But need to conduct t-test to determine if this difference is significant.


Independent Samples Test: CHILLIES

                              Levene's Test for         t-test for Equality of Means
                              Equality of Variances
                              F      Sig.               t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       .443   .517               2.353   14       .034              1.5000            .6374                   .1330          2.8670
Equal variances not assumed                             2.353   13.355   .035              1.5000            .6374                   .1267          2.8733

(95% CI = 95% Confidence Interval of the Difference)

Levene’s test - scores must have equal variance to use standard t-test techniques. Variances equal if p > 0.05

t-value, df & p shown here. Difference is significant if p < 0.05.

Results section

We examined chilli tolerance in males and females. Eight males and eight females were tested on their ability to consume chillies. Males ate a mean of 5.63 chillies (s = 1.41) and females a mean of 4.13 (s = 1.13). Findings showed that males ate significantly more chillies than females, t(14) = 2.35, p < 0.05.

The results suggest that males have greater chilli tolerance than females (or that males are foolish enough to try to win chilli eating contests).
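The t values in both rows of the Independent Samples Test can be reproduced from the Group Statistics alone. The sketch below (hypothetical function name) computes the pooled-variance t used in the "Equal variances assumed" row and the Welch t with its adjusted df used in the "Equal variances not assumed" row. It does not compute the Sig. values, which need the t distribution.

```python
import math


def independent_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return t and df for both rows of the SPSS independent samples table."""
    diff = mean1 - mean2

    # Equal variances assumed: pool the two variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Equal variances not assumed (Welch): separate variances, adjusted df
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se_welch = math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    return {
        "equal_variances_assumed": (diff / se_pooled, n1 + n2 - 2),
        "equal_variances_not_assumed": (diff / se_welch, df_welch),
    }


# Chilli example: males (mean 5.6250, SD 1.4079, N 8) vs females (4.1250, 1.1260, 8)
rows = independent_t_from_summary(5.6250, 1.4079, 8, 4.1250, 1.1260, 8)
for name, (t, df) in rows.items():
    print(f"{name}: t = {t:.3f}, df = {df:.3f}")
```

With equal group sizes the two t values coincide, which is why both rows of the table show 2.353 and differ only in df.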

Paired samples t-test

Used when same or matched pairs of participants take part in experimental conditions.

Hypothesis: chilli tolerance is greater on cold days than on warm days.

Ten participants ate chillies on a warm day and then on a cold day.

Use arrow key to select variables that are to be compared.

Paired Samples Test

                       Paired Differences
                       Mean      Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t        df   Sig. (2-tailed)
Pair 1   WARM - COLD   -2.3000   2.9078           .9195             -4.3801        -.2199         -2.501   9    .034

(95% CI = 95% Confidence Interval of the Difference)

Mean difference between pairs of scores shown here.

T-value, df & p shown here. Difference is significant if p < 0.05.

Results section

We examined chilli tolerance on warm and cold days. Ten participants were tested on their ability to consume chillies. The mean difference was 2.30, with more chillies consumed on cold days than on warm days. Findings showed that chilli tolerance was significantly greater on cold days than on warm days, t(9) = -2.501, p < 0.05.

The results suggest that individuals can consume more chillies on cold days than on warm days.
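The paired t statistic is the mean of the difference scores divided by its standard error, with df = number of pairs - 1. A minimal sketch using the values from the Paired Samples Test above (the function name is hypothetical, and the Sig. value is not computed):

```python
import math


def paired_t_from_differences(mean_diff, sd_diff, n_pairs):
    """Return (t, df, standard error) from summary statistics of the differences."""
    std_error = sd_diff / math.sqrt(n_pairs)
    t = mean_diff / std_error
    return t, n_pairs - 1, std_error


# WARM - COLD: mean difference -2.3000, SD 2.9078, 10 participants
t, df, std_error = paired_t_from_differences(-2.3000, 2.9078, 10)
print(f"t({df}) = {t:.3f}, Std. Error Mean = {std_error:.4f}")
```

The negative sign only reflects the order of subtraction (WARM minus COLD); it means the cold-day scores were higher.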

Paired Sample T-test

Paired Samples Statistics

                   Mean    N    Std. Deviation   Std. Error Mean
Pair 1   Doppler   .4714   21   .24276           .05297
         Cath      .5019   21   .25522           .05569

Paired Samples Correlations

                          N    Correlation   Sig.
Pair 1   Doppler & Cath   21   .888          .000

Paired Samples Test

                          Paired Differences
                          Mean      Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t        df   Sig. (2-tailed)
Pair 1   Doppler - Cath   -.03048   .11864           .02589            -.08448        .02353         -1.177   20   .253

(95% CI = 95% Confidence Interval of the Difference)

Results section

We examined chilli tolerance based on two types of chillies. 21 participants were tested on their ability to consume both types of chillies. The mean difference is 0.030. Findings showed that there is no significant difference in chilli tolerance between the two types of chillies, t(20) = -1.177, p > 0.05 (p = .253).

The results suggest that there is no difference in chilli tolerance between the two types of chillies.

Group Statistics: DV

group   N    Mean      Std. Deviation   Std. Error Mean
1.00    12   25.5673   5.04689          1.45691
2.00    12   31.1920   7.79554          2.25038

Independent Samples Test: DV

                              Levene's Test for         t-test for Equality of Means
                              Equality of Variances
                              F       Sig.              t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       7.236   .013              -2.098   22       .048              -5.62476          2.68082                 -11.18443      -.06508
Equal variances not assumed                             -2.098   18.843   .050              -5.62476          2.68082                 -11.23894      -.01057

(95% CI = 95% Confidence Interval of the Difference)

variances are 25.4 and 60.7

Results section

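We examined chilli tolerance between two groups of participants. Twelve participants per group were tested on their ability to consume chillies. Group 1 scored a mean of 25.57 (s = 5.05) and group 2 scored a mean of 31.19 (s = 7.80). Using the equal variances assumed row, the two groups differ significantly in their chilli consumption, t(22) = -2.10, p < 0.05. Note, however, that Levene’s test is significant here (Sig. = .013), so by the rule given earlier the equal variances not assumed row applies: t(18.84) = -2.10, p = .050.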

The results suggest that group 2 have greater chilli tolerance than group 1.


STATISTICAL DECISION

                                Reality: H0 (no difference)   Reality: HA (difference)
Decision (fail to reject H0)    1 – α                         β error (Type II error)
Decision (reject H0)            α error (Type I error)        1 – β (Power)

Date post:	05-Jan-2016
Category:	Documents
Upload:	earl-carpenter