FACTOR ANALYSIS LECTURE 11 EPSY 625. PURPOSES SUPPORT VALIDITY OF TEST SCALE WITH RESPECT TO...

FACTOR ANALYSIS

LECTURE 11

EPSY 625

PURPOSES

SUPPORT VALIDITY OF TEST SCALE WITH RESPECT TO UNDERLYING TRAITS (FACTORS)

• EFA- EXPLORE/UNDERSTAND UNDERLYING FACTORS FOR A TEST

• CFA- CONFIRM THEORETICAL STRUCTURE IN A TEST

HISTORICAL DEVELOPMENT

PEARSON (1901): eigenvalue/eigenvector problem (dimensional reduction), “method of principal axes”

SPEARMAN (1904) “General Intelligence, Objectively Measured and Determined”

Others: Burt, Thomson, Garnett, Holzinger, Harman, Thurstone

FACTOR MODELS

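[Table: factor models classified by whether SUBJECTS and VARIABLES are each treated as FIXED or as a SAMPLE. Models placed in the table: principal components and common factors; image analysis; ALPHA factor analysis; canonical factor analysis.]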

EXPLORATORY FACTOR ANALYSIS

USE PRINCIPAL AXIS METHOD:

• ASSUMES THERE ARE 3 VARIANCE COMPONENTS IN EACH ITEM:

• COMMUNALITY ($h^2$)

• UNIQUENESS:

• SPECIFICITY ($s^2$)

• ERROR ($e^2$)

SINGLE FACTOR

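REQUIRES AT LEAST 3 ITEMS OR MEASUREMENTS TO UNIQUELY DETERMINE

[Path diagram: one FACTOR with arrows to ITEM1, ITEM2, and ITEM3. The path from the factor to each item is CALLED THE FACTOR LOADING (CORRELATION BETWEEN ITEM AND FACTOR): .7, .8, .6. Each item also has an error term e, with paths .714, .6, .8, and a SPECIFICITY term ASSUMED = 0 FOR PARALLEL ITEMS.]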

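[Path diagram repeated: FACTOR with loadings .7, .8, .6 on ITEM1, ITEM2, ITEM3; error paths .714, .6, .8; SPECIFICITY ASSUMED = 0 FOR PARALLEL ITEMS. The error path for ITEM1 is $\sqrt{1 - .7^2} = .714$.]

ALPHA = SPEARMAN-BROWN STEPPED-UP AVERAGE INTER-ITEM CORRELATION:

(.56 + .42 + .48)/3 = .49

ALPHA = 3(.49)/[1 + 2(.49)]

= .74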
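The alpha arithmetic above can be checked with a minimal sketch. It assumes, as in the diagram, a single factor with uncorrelated errors, so each inter-item correlation is the product of the two loadings. The variable names are illustrative and do not come from the lecture.

```python
import numpy as np

# Loadings read from the single-factor diagram (ITEM1, ITEM2, ITEM3).
loadings = np.array([0.7, 0.8, 0.6])
k = len(loadings)

# One factor, uncorrelated errors: r_ij = a_i * a_j for i != j.
implied = np.outer(loadings, loadings)
mean_r = implied[~np.eye(k, dtype=bool)].mean()

# Spearman-Brown step-up of the average inter-item correlation.
alpha = k * mean_r / (1 + (k - 1) * mean_r)

# Error path for each item: sqrt(1 - loading^2).
error_paths = np.sqrt(1 - loadings ** 2)

print(round(float(mean_r), 2), round(float(alpha), 2), np.round(error_paths, 3))
```

The three products are .56, .42, and .48, the same correlations that are averaged on the slide.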

TWO FACTORS

NEED AT LEAST 2 ITEMS OR MEASUREMENTS PER FACTOR, ASSUMING FACTORS ARE CORRELATED

[Path diagram: FACTOR 1 with arrows to ITEM 1 (.7) and ITEM 2 (.8); FACTOR 2 with arrows to ITEM 3 (.6) and ITEM 4 (.7); each item has an error term e; CORRELATION BETWEEN FACTORS = .5.]

CORRELATION BETWEEN ANY TWO ITEMS = PRODUCT OF ALL PATH COEFFICIENTS ALONG THE ROUTE BETWEEN THEM;

EX. R(ITEM 1, ITEM 4) = .7 x .5 x .7 = .245
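The path-tracing rule can be sketched in matrix form. The assignment of items to factors below (items 1 and 2 on Factor 1, items 3 and 4 on Factor 2) is an assumption read off the diagram, and the names `pattern` and `phi` are illustrative.

```python
import numpy as np

# Assumed layout: rows are ITEM 1..4, columns are FACTOR 1 and FACTOR 2.
pattern = np.array([[0.7, 0.0],
                    [0.8, 0.0],
                    [0.0, 0.6],
                    [0.0, 0.7]])

# Factor correlation matrix with the .5 correlation between factors.
phi = np.array([[1.0, 0.5],
                [0.5, 1.0]])

# Model-implied correlations among items: pattern * phi * pattern'.
implied = pattern @ phi @ pattern.T
np.fill_diagonal(implied, 1.0)  # observed variables are standardized

r14 = implied[0, 3]  # ITEM 1 with ITEM 4, across factors
r12 = implied[0, 1]  # ITEM 1 with ITEM 2, within Factor 1
print(round(float(r14), 3), round(float(r12), 2))
```

Items on the same factor multiply only their two loadings; items on different factors also pass through the factor correlation.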

SIMPLE STRUCTURE

TRY TO CREATE SCALE IN WHICH EACH ITEM CORRELATES WITH ONLY ONE FACTOR:

            FACTOR
ITEM        1    2    3
ITEM 1      1    0    0
ITEM 2      1    0    0
ITEM 3      0    1    0
ETC

CRITERIA FOR SIMPLE STRUCTURE

Structural equation modeling provides chi square test of fit

Compares observed covariance (correlation) matrix with predicted/fitted matrix

Alternatively, look at RMSEA (Root mean square error of approximation) of deviations from fitted matrix

MATHEMATICAL MODEL

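$Z$ = persons by variables matrix ($N \times k$) of standardized variables (mean = 0, SD = 1)

$Z'Z = NR$, where $R$ is the $k \times k$ correlation matrix (the covariance matrix of the standardized variables)

$Z_i = a_i F_i + e_i$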

MATHEMATICAL MODEL

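$Z = AF + U = C + U$, where $C = AF$ is the common part (here $Z$ is arranged variables by persons)

$ZZ'/N = R = AFF'A'/N + U^2$

$S = ZF'/N$ (structure matrix: correlations between $Z$ and $F$)

$= AFF'/N = A\Phi$, where $\Phi = FF'/N$ (correlations among factors) and $A$ = pattern matrix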

MATHEMATICAL MODEL

$S = A\Phi$, so $A = S\Phi^{-1}$ (if factors are uncorrelated, $\Phi = I$ and $A = S$)

Pattern matrix = Structure matrix

$R = ZZ'/N = CC'/N + U^2$

MATHEMATICAL MODEL

If we take the covariance matrix of F to be diagonal, and the metric of variances of Fi to be 1.0,

$R = AFF'A'/N = AA' = SA' = AS'$

MATHEMATICAL MODEL

Now let $Z_i = a_i F_i + s_i + e_i$

Let $\acute{R} = R - D^2$, where $D^2$ is a diagonal matrix of specificities and error: $s_i^2 + e_i^2$

Then $\acute{R} = AFF'A'/N = A\Phi A' = SA' = AS'$; with $\Phi = I$, $\acute{R} = AA'$

MATHEMATICAL MODEL

How do we estimate $s_i^2$?

Instead, estimate $[R - U^2]_{ii} = [I - s_i^2 - e_i^2]_{ii}$

Consider for each $z_i$ that it is predictable from the rest:

$z_i = b_1 z_1 + b_2 z_2 + \dots + b_{i-1} z_{i-1} + b_{i+1} z_{i+1} + \dots$

Then $R_i^2$ = variance common to all other variables (squared multiple correlation or SMC), used as the estimate of $h_i^2$ = communality for item $i$

Due to Dwyer (1939)

MATHEMATICAL MODEL

SMC is estimable from the observed data, so that Ŕ = R - [1-SMCi]

where [SMCi] = diagonal matrix with SMCs for each variable on the diagonals and zeros off-diagonal

Theorem states: “SMCs guarantee that the number of factors ≥ # eigenvalues > 1.0”

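MATHEMATICAL MODEL

Diagonal of $\acute{R}$ (the matrix $[SMC_i]$):

$$
\begin{bmatrix}
R^2_{1.234\ldots} & 0 & 0 & 0 & \cdots \\
0 & R^2_{2.134\ldots} & 0 & 0 & \cdots \\
0 & 0 & R^2_{3.124\ldots} & 0 & \cdots \\
0 & 0 & 0 & R^2_{4.123\ldots} & \cdots
\end{bmatrix}
$$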
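A minimal sketch of the SMC step, assuming the standard identity $SMC_i = 1 - 1/r^{ii}$, where $r^{ii}$ is the $i$th diagonal element of $R^{-1}$. The slides do not state this identity. The function names and the small correlation matrix (taken from the three-item example) are illustrative choices.

```python
import numpy as np

def smc(R):
    """Squared multiple correlation of each variable with all the others."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

def reduced_matrix(R):
    """Copy of R with the unit diagonal replaced by the SMCs."""
    reduced = R.copy()
    np.fill_diagonal(reduced, smc(R))
    return reduced

# Correlations among the three items of the single-factor example.
R = np.array([[1.00, 0.56, 0.42],
              [0.56, 1.00, 0.48],
              [0.42, 0.48, 1.00]])

R_reduced = reduced_matrix(R)
print(np.round(np.diag(R_reduced), 3))
```

Only the diagonal changes; the off-diagonal correlations are carried over untouched, which is what subtracting the diagonal matrix $[1 - SMC_i]$ does.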

MATHEMATICAL MODEL

SOLUTIONS: PRINCIPAL COMPONENTS ($R = \acute{R}$)

$Rq = \lambda q$

$RQ = Q\Lambda$, $\Lambda = \mathrm{diagonal}[\lambda_i]$

$Q^{-1}RQ = \Lambda$; $QQ' = I$, so $Q^{-1} = Q'$

$Q'RQ = \Lambda$ (Spectral Theorem)
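The spectral decomposition can be verified numerically. This is a sketch using `numpy.linalg.eigh` on the three-item correlation matrix from the single-factor example. Scaling the eigenvectors by the square roots of the eigenvalues to obtain component loadings is a common convention assumed here, not something the slide states.

```python
import numpy as np

R = np.array([[1.00, 0.56, 0.42],
              [0.56, 1.00, 0.48],
              [0.42, 0.48, 1.00]])

# eigh is intended for symmetric matrices; it returns ascending eigenvalues.
eigvals, Q = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # reorder from largest to smallest
eigvals, Q = eigvals[order], Q[:, order]
Lambda = np.diag(eigvals)

# Q is orthogonal, so Q'RQ is the diagonal matrix of eigenvalues.
diagonalized = Q.T @ R @ Q

# Assumed convention: loadings A = Q * sqrt(lambda), so that AA' reproduces R.
component_loadings = Q * np.sqrt(eigvals)
print(np.round(eigvals, 3))
```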

MATHEMATICAL MODEL

SOLUTIONS: PRINCIPAL AXIS: $(\acute{R} - \lambda I)q = 0$

That is, solve for the first eigenvalue from $|\acute{R} - \lambda I| = 0$, solved by $R^m q = \lambda^m q$

Begin with $m = 2$: $R^2 q = \lambda^2 q$, then put the solution in $R(Rq_1) = \lambda^2 q_1$, and iterate for $m = 4$

MATHEMATICAL MODEL

Now compute residual correlation matrix: $R^2_1 = R^2 - \acute{R}$, iterate
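The iterative idea can be illustrated with plain power iteration, which multiplies a trial vector by the matrix repeatedly. This is a simplification of the powering scheme on the slides, not the same procedure. The deflation step (subtracting $\lambda_1 q_1 q_1'$) is a standard choice assumed here. The reduced matrix uses the true communalities of the three-item example on the diagonal instead of SMCs, so the answer is known in advance.

```python
import numpy as np

def first_eigenpair(M, max_iter=1000, tol=1e-12):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix M."""
    q = np.ones(M.shape[0]) / np.sqrt(M.shape[0])
    for _ in range(max_iter):
        new_q = M @ q
        new_q = new_q / np.linalg.norm(new_q)
        converged = np.linalg.norm(new_q - q) < tol
        q = new_q
        if converged:
            break
    return float(q @ M @ q), q

# Reduced matrix: off-diagonals .56, .42, .48; diagonal = .7^2, .8^2, .6^2.
reduced = np.array([[0.49, 0.56, 0.42],
                    [0.56, 0.64, 0.48],
                    [0.42, 0.48, 0.36]])

lam1, q1 = first_eigenpair(reduced)
loadings1 = np.sqrt(lam1) * q1                  # first-factor loadings
residual = reduced - lam1 * np.outer(q1, q1)    # deflated (residual) matrix
print(np.round(loadings1, 3), np.round(residual, 6))
```

Because this reduced matrix has rank one, the first factor reproduces it and the residual matrix is zero. With real data, the same step would be repeated on the residual.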

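EIGENVALUES

$\lambda_i$ = variance of $i$th factor

$\lambda_i / \sum_i \lambda_i$ = proportion of total variance accounted for by the $i$th factor

$\lambda_i < 1$: chance factor

Scree plot (value x factor: eigenvalues ordered from greatest to lowest)

[Figure: SCREE PLOT. Vertical axis: eigenvalue, scaled from 0 to K with 1.0 marked; horizontal axis: factor number, 1 to K.]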
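A short sketch of the eigenvalue bookkeeping, using the three-item correlation matrix from the single-factor example (an illustrative choice). It computes the ordered eigenvalues that a scree plot would display, the proportion of variance for each, and the count of eigenvalues above 1.0.

```python
import numpy as np

R = np.array([[1.00, 0.56, 0.42],
              [0.56, 1.00, 0.48],
              [0.42, 0.48, 1.00]])

# Eigenvalues ordered from greatest to lowest, as on a scree plot.
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]

# For a correlation matrix the eigenvalues sum to K, the number of variables.
proportion = eigvals / eigvals.sum()

# Number of eigenvalues above the 1.0 cutoff.
n_retained = int((eigvals > 1.0).sum())
print(np.round(eigvals, 3), np.round(proportion, 3), n_retained)
```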

ROTATION

MEANING CRITERION: SIMPLE STRUCTURE, POSITIVE MANIFOLD

B = AT

A = INITIAL FACTOR MATRIX

T = TRANSFORMATION MATRIX

B = FINAL FACTOR MATRIX

TT' = I FOR AN ORTHOGONAL ROTATION

VARIMAX ROTATION (uncorrelated Factors)

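ORTHOGONAL (RIGID) ROTATION

Maximize $V = \sum_p \left[ n \sum_j (b_{jp}/h_j)^4 - \left( \sum_j b_{jp}^2/h_j^2 \right)^2 \right]$

Geometric problem: $(X, Y) = (x, y) \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

$u_j = x_j^2 - y_j^2$

$v_j = 2 x_j y_j$

$A = \sum_j u_j$

$B = \sum_j v_j$

$C = \sum_j (u_j^2 - v_j^2)$

$D = 2 \sum_j u_j v_j$

solve $\tan 4\theta = [D - 2AB/n] / [C - (A^2 - B^2)/n]$, with $\theta$ between $-45^\circ$ and $45^\circ$

[Figure: Orthogonal (perpendicular) Rotation of Axes. Axes: Unrotated Factor 1 loading values, Unrotated Factor 2 loading values.]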
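For two factors, the varimax angle can also be found by direct search, which avoids the quadrant bookkeeping of the closed-form $\tan 4\theta$ solution. The sketch below swaps in a grid search over $-45^\circ$ to $45^\circ$ for that closed form. The loading matrix `A` is hypothetical, and the function names are illustrative.

```python
import numpy as np

def varimax_criterion(B):
    """Normalized varimax criterion V, summed over factors."""
    n = B.shape[0]
    h2 = (B ** 2).sum(axis=1)          # communalities h_j^2
    w = (B ** 2) / h2[:, None]         # (b_jp / h_j)^2
    return float(np.sum(n * (w ** 2).sum(axis=0) - w.sum(axis=0) ** 2))

def rotate(A, theta):
    """Rigid rotation B = AT of a two-column loading matrix."""
    T = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return A @ T

# Hypothetical unrotated loadings for four variables on two factors.
A = np.array([[0.7,  0.4],
              [0.8,  0.3],
              [0.6, -0.5],
              [0.7, -0.4]])

# V repeats every 90 degrees, so -45 to 45 degrees covers all rotations.
thetas = np.linspace(-np.pi / 4, np.pi / 4, 1801)
best_theta = max(thetas, key=lambda t: varimax_criterion(rotate(A, t)))
B = rotate(A, best_theta)
print(round(float(np.degrees(best_theta)), 1), np.round(B, 3))
```

A rigid rotation leaves the communalities and the reproduced correlations $AA'$ unchanged; only the distribution of each variable's loadings across the two factors changes.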

OBLIQUE SOLUTION (correlated Factors)

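MINIMIZE S (OBLIMIN)

$S = \sum_{p<g} \left[ n \sum_j (v_{jp}^2/h_j^2)(v_{jg}^2/h_j^2) - \left( \sum_j v_{jp}^2/h_j^2 \right) \left( \sum_j v_{jg}^2/h_j^2 \right) \right]$

PROMAX:

• Start with VARIMAX, B = AT, transform with $v_{jp} = (b_{jp}^4)/b_{jp}$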

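FACTOR CORRELATION

$\Phi = T'T$

$T_{ij} = \begin{bmatrix} \cos\theta_{ij} & -\sin\theta_{ij} \\ \sin\theta_{ij} & \cos\theta_{ij} \end{bmatrix}$

$r_{ij} = [\cos\theta_{ij}(-\sin\theta_{ij})] + [\sin\theta_{ij}\cos\theta_{ij}] = T_{11}T_{12} + T_{21}T_{22}$

FACTOR CORRELATION

$S = P\Phi$ (Structure matrix = Pattern matrix x factor correlation matrix)

$P = A(T')^{-1}$

$A = PT'$

[Figure: Oblique Rotation of Axes, with angle $\theta_{ij}$ between the factor axes.]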

ALPHA FACTOR ANALYSIS

Estimates population $h_i^2$ for each variable

Little different from common factors

Canonical Factor Analysis

Uses canonical analysis to maximize R between factors and variables, iterative Maximum Likelihood analysis

Image Analysis

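$h_i^2 = R^2_{i.1,2,\ldots,K}$

$p_j = \sum_k w_{jk} z_k$ (standard regression)

$e_j = z_j - p_j$, called the anti-image

$\mathrm{Var}(e_j) > \mathrm{Var}(\epsilon_j)$, where $\mathrm{Var}(\epsilon_j)$ = anti-image for the regression of $z_j$ on the factors $F_1, F_2, \ldots, F_K$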

FACTOR CONGRUENCE

Alternative to Confirmatory Analysis for two groups hypothesized to have the same factor structure:

$S_{pq} = \sum_j a_{jp} b_{jq} / \sqrt{\sum_j a_{jp}^2 \sum_j b_{jq}^2}$

This is basically the correlation between factor loadings on the comparable factors for two groups
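A minimal sketch of the congruence coefficient for one pair of comparable factors. The two loading vectors are hypothetical values for the same three variables in two groups, and the function name is illustrative.

```python
import numpy as np

def congruence(a, b):
    """Congruence between two columns of loadings on the same variables."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Hypothetical loadings on a comparable factor in two groups.
group1 = [0.70, 0.80, 0.60]
group2 = [0.65, 0.75, 0.70]

s_pq = congruence(group1, group2)
print(round(s_pq, 3))
```

Unlike a Pearson correlation, the loadings are not centered at their means before the cross-products are taken. The coefficient equals 1 whenever one set of loadings is a positive multiple of the other.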

Example of 2 factor structure

Achievement (reading, math) and IQ (verbal, nonverbal)

quasi-multitrait multimethod analysis:• reading is verbal

• math is “nonverbal”

[Path diagram: Ach with arrows to Reading (.9) and Arithmetic (.8); Apt with arrows to Verbal (.9) and Nonverbal (.8); correlation between Ach and Apt = .7; error paths e of .43 and .6.]

Factor Structure

        F1     F2
R       .9     .63
A       .8     .56
V       .63    .9
NV      .56    .8

Reduced Correlation Matrix

        R      A      V      NV
R       .81    .72    .57    .50
A              .64    .50    .45
V                     .81    .72
NV                           .64
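The two tables follow from the path diagram by matrix multiplication. A sketch, assuming Reading and Arithmetic load only on Ach (F1) and Verbal and Nonverbal only on Apt (F2), with the factor correlation of .7. The names `P`, `phi`, and `S` mirror the pattern, factor correlation, and structure matrices of the lecture.

```python
import numpy as np

# Pattern matrix; rows are R, A, V, NV and columns are F1 (Ach), F2 (Apt).
P = np.array([[0.9, 0.0],
              [0.8, 0.0],
              [0.0, 0.9],
              [0.0, 0.8]])

# Factor correlation matrix.
phi = np.array([[1.0, 0.7],
                [0.7, 1.0]])

S = P @ phi                  # structure matrix: S = P * phi
reduced = P @ phi @ P.T      # reduced correlation matrix, communalities on the diagonal

print(np.round(S, 2))
print(np.round(reduced, 3))
```

Each cross-factor correlation is a product of two loadings and .7, for example Reading with Verbal: .9 x .7 x .9.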

[Path diagram repeated with added paths labeled .32, .40, and .37: Revised Model with additional specificities.]

Factor Structure

        F1     F2
R       .9     .63
A       .8     .56
V       .63    .9
NV      .56    .8

Reduced Correlation Matrix

        R      A      V      NV
R       .91    .72    .63    .51
A              .92    .65    .61
V                     .99    .72
NV                           .80

CONFIRMATORY FACTOR ANALYSIS

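BASIC PRINCIPLES

$\Sigma = E(xx')$

$$
\Sigma_{xx} =
\begin{bmatrix}
\sigma^2_{x_1} & & \\
\sigma_{x_1 x_2} & \sigma^2_{x_2} & \\
\sigma_{x_1 x_3} & \sigma_{x_2 x_3} & \sigma^2_{x_3}
\end{bmatrix}
$$

BASIC PRINCIPLES

$\sigma^2_{x_1} = \lambda^2_{11}\phi_{11} + \sigma^2_{\delta_1}$

$\sigma^2_{x_k} = \lambda^2_{k1}\phi_{11} + \sigma^2_{\delta_k}$

$\sigma_{x_i x_k} = \lambda_{i1}\phi_{11}\lambda_{k1}$

[Path diagram: factor $\xi_1$ with arrows to $x_1, \ldots, x_k$, each with an error term $\delta_1, \ldots, \delta_k$.]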

IDENTIFICATION RULES

t-rule: $t \le \frac{1}{2} q(q+1)$, q = # manifest variables, t = # free parameters

• necessary but not sufficient

3-indicator rule: 1 factor, $\ge 3$ indicators

• sufficient but not necessary

2-indicator rule: 2+ factors, $\ge 2$ indicators each

local vs. global identification:

• local: sample estimates of parameters independent; necessary but not sufficient

• global: population parameters independent; necessary and sufficient
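The t-rule is a counting check, sketched below. The parameter count in the example (three loadings and three error variances, with the factor variance fixed at 1) is an assumed parameterization for a one-factor, three-indicator model, and the function name is illustrative.

```python
def t_rule_satisfied(n_free_parameters, n_manifest):
    """True when t <= q(q+1)/2, a necessary but not sufficient condition."""
    n_moments = n_manifest * (n_manifest + 1) // 2  # distinct variances and covariances
    return n_free_parameters <= n_moments

# One factor, three indicators: 3 loadings + 3 error variances = 6 free parameters,
# assuming the factor variance is fixed at 1 to set the scale.
q = 3
t = 3 + 3
print(t_rule_satisfied(t, q), q * (q + 1) // 2 - t)  # rule met, degrees of freedom left
```

With three indicators the count is exactly even (six parameters against six distinct moments), so the model has no degrees of freedom left for testing fit.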

ESTIMATION

MODEL EVALUATION

• FIT: $F_{ML}$ used to evaluate $\hat{\Sigma}$, $S$

• Residuals: $E = S - \hat{\Sigma}$

• RMR = SD($s_{ij} - \hat{\sigma}_{ij}$)

• RMSEA = $\sqrt{(\chi^2/df - 1)/(N - 1)}$

• note: factor analyze $E$, should be 0
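A small sketch of the RMSEA formula. The chi-square, degrees of freedom, and sample size are hypothetical. Truncating at zero when the chi-square is smaller than its degrees of freedom is a common convention assumed here, not something the slide states.

```python
import math

def rmsea(chi_square, df, n):
    """RMSEA = sqrt((chi_square/df - 1) / (N - 1)), truncated at zero."""
    return math.sqrt(max(chi_square / df - 1.0, 0.0) / (n - 1))

# Hypothetical fit results: chi-square = 30 on 10 df with N = 201.
value = rmsea(30.0, 10, 201)
print(round(value, 3))
```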

Hancock’s Formula- reliability for a given factor

$H_j = 1 / \left[ 1 + 1 / \sum_i \frac{l_{ij}^2}{1 - l_{ij}^2} \right]$

Ex. $l_1 = .7$, $l_2 = .8$, $l_3 = .6$

H = 1 / [1 + 1/(.49/.51 + .64/.36 + .36/.64)]

= 1 / [1 + 1/(.96 + 1.78 + .56)]

= 1 / (1 + 1/3.30)

= .77

Hancock’s Formula Explained

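$H_j = 1 / \left[ 1 + 1 / \sum_i \frac{l_{ij}^2}{1 - l_{ij}^2} \right]$

now assume strict parallelism: then $l_{ij}^2 = \rho^2_{xt}$ for each of the $k$ items

thus $H_j = 1 / \left[ 1 + 1 / \sum_i \frac{\rho^2_{xt}}{1 - \rho^2_{xt}} \right] = 1 / \left[ 1 + \frac{1 - \rho^2_{xt}}{k\rho^2_{xt}} \right]$

$= k\rho^2_{xt} / [1 + (k-1)\rho^2_{xt}]$

= Spearman-Brown formula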
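Both the worked example and the reduction to Spearman-Brown can be checked numerically. A sketch with illustrative function names; the parallel case uses three items that all have loading .7.

```python
import numpy as np

def hancock_h(loadings):
    """Hancock's H for one factor from its standardized loadings."""
    l2 = np.asarray(loadings, dtype=float) ** 2
    return float(1.0 / (1.0 + 1.0 / np.sum(l2 / (1.0 - l2))))

def spearman_brown(rho, k):
    """Stepped-up reliability of k parallel items with reliability rho."""
    return k * rho / (1.0 + (k - 1) * rho)

h_example = hancock_h([0.7, 0.8, 0.6])     # loadings from the worked example
h_parallel = hancock_h([0.7, 0.7, 0.7])    # strictly parallel items
sb_parallel = spearman_brown(0.49, 3)      # rho = .7 squared
print(round(h_example, 4), round(h_parallel, 4), round(sb_parallel, 4))
```

With equal loadings the two functions return the same value, which is the algebraic identity shown above.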

$\chi^2$ TEST

$(N-1)F_{ML} \sim \chi^2$

used for nested model: model with one or more restrictions from original

restriction = known parameter, equality of two or more parameters

Proof: Bollen shows $-2\log(L_0/L_1) = (N-1)F_{ML}$

where $L_0$ is the likelihood of the restricted model and $L_1$ that of the unrestricted model

INCREMENTAL FIT

Bentler and Bonett: $\Delta_1 = \frac{F_b - F_m}{F_b} = \frac{\chi^2_b - \chi^2_m}{\chi^2_b}$

can be used to compare improvements over original model or against a standard or baseline

Bentler & Bonett baseline convention: $\Sigma_b = \mathrm{diag}(S)$

Alternatives: $\Sigma_b = [.5]$ or $\Sigma_b$ from a previous study

Example: Willson & Rupley (1997) was used in the Nichols (1997) dissertation

Bollen's fit index: $\Delta_2 = \frac{F_b - F_m}{F_b - df_m/(N-1)} = \frac{\chi^2_b - \chi^2_m}{\chi^2_b - df_m}$

Logic: the difference in the numerator has expected value equal to the denominator
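The two incremental fit indices can be compared on the same numbers. A sketch in the chi-square form; the baseline and model chi-squares and the model degrees of freedom are hypothetical, and the function names are illustrative.

```python
def bentler_bonett_delta1(chi2_baseline, chi2_model):
    """Delta_1 = (chi2_b - chi2_m) / chi2_b."""
    return (chi2_baseline - chi2_model) / chi2_baseline

def bollen_delta2(chi2_baseline, chi2_model, df_model):
    """Delta_2 = (chi2_b - chi2_m) / (chi2_b - df_m)."""
    return (chi2_baseline - chi2_model) / (chi2_baseline - df_model)

# Hypothetical results: baseline chi-square 500, fitted model chi-square 50 on 20 df.
delta1 = bentler_bonett_delta1(500.0, 50.0)
delta2 = bollen_delta2(500.0, 50.0, 20)
print(round(delta1, 4), round(delta2, 4))
```

Subtracting the model degrees of freedom in the denominator makes the second index somewhat larger than the first whenever the model improves on the baseline.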

AMOS SEM PROGRAM

Uses SPSS to input data: select SPSS file

Draw factor model

• Circles for factors, boxes for observed variables

• Arrows from circles to boxes to indicate loadings

• Errors for each box (special drawing character)

• Label all circles and boxes with names- SPSS variable names for boxes, your own name for factors and circles

• Correlate factors with curved arrows as needed

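AMOS Drawing

[Figure: AMOS drawing of a one-factor model. Factor F1 (circle) with arrows to the observed variables ANX, DEP, and SE (boxes), each with an error term (e1, e2, e3).]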

Date post:	22-Dec-2015