Happiness comes not from material wealth but from less desire.
Applied Statistics Using SAS
Topic: Factor Analysis
By Prof Kelly Fan, Cal State Univ, East Bay
Outline
Introduction
Principal component analysis
Rotations
Introduction
Reduce data
Summarize many ordinal categorical variables by a few combinations of them (the new factors)
Example. 6 Questions
Goal: a measure of depression and a measure of paranoia (how pleasant)
Six questions, each answered with a number from 1 to 7. The smaller the number, the more strongly the subject agrees; 4 means no opinion.
Example. 6 Questions
1. I usually feel blue.
2. People often stare at me.
3. I think that people are following me.
4. I am usually happy.
5. Someone is trying to hurt me.
6. I enjoy going to parties.
Q. Which questions will a depressed person likely agree with? A happy person?
Data Set:
              Subj
Question   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   1       7  6  3  2  3  6  1  3  2  6  3  6  5  2  1
   2       2  3  6  2  4  3  2  3  1  2  5  7  1  1  2
   3       3  2  7  2  2  4  3  2  1  3  4  6  1  1  1
   4       4  1  3  5  4  2  7  3  6  2  2  2  2  6  7
   5       5  3  6  3  2  3  2  4  2  2  3  6  6  1  1
   6       6  2  3  4  3  2  2  3  5  2  3  2  2  5  7
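The factor analysis works from the correlations between the six questions, so it can help to look at those first. The sketch below is a plain-Python illustration, not part of the SAS program: it enters the 15 responses from the data set and computes Pearson correlations. The `responses` layout and the `pearson` helper are names chosen here for illustration.

```python
from math import sqrt

# Responses of subjects 1-15, one list per question (copied from the data set).
responses = {
    "QUES1": [7, 6, 3, 2, 3, 6, 1, 3, 2, 6, 3, 6, 5, 2, 1],
    "QUES2": [2, 3, 6, 2, 4, 3, 2, 3, 1, 2, 5, 7, 1, 1, 2],
    "QUES3": [3, 2, 7, 2, 2, 4, 3, 2, 1, 3, 4, 6, 1, 1, 1],
    "QUES4": [4, 1, 3, 5, 4, 2, 7, 3, 6, 2, 2, 2, 2, 6, 7],
    "QUES5": [5, 3, 6, 3, 2, 3, 2, 4, 2, 2, 3, 6, 6, 1, 1],
    "QUES6": [6, 2, 3, 4, 3, 2, 2, 3, 5, 2, 3, 2, 2, 5, 7],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

names = list(responses)
corr = {(a, b): pearson(responses[a], responses[b]) for a in names for b in names}

# Print the 6 x 6 correlation matrix, one row per question.
for a in names:
    print(a, " ".join(f"{corr[(a, b)]:6.2f}" for b in names))
```

In this data QUES4 and QUES6 are positively correlated with each other, while QUES1 is negatively correlated with QUES4, which is the kind of grouping the factors are meant to summarize.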
Principal Component Analysis
The larger an eigenvalue, the more of the total variance (information) its factor (component) carries.
Eigenvalues of the Correlation Matrix: Total = 6 Average = 1
     Eigenvalue   Difference   Proportion   Cumulative
1    3.66827888   2.42830320       0.6114       0.6114
2    1.23997569   0.70866159       0.2067       0.8180
3    0.53131410   0.18729333       0.0886       0.9066
4    0.34402077   0.18927986       0.0573       0.9639
5    0.15474091   0.09307124       0.0258       0.9897
6    0.06166967                    0.0103       1.0000
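The Proportion and Cumulative columns follow directly from the eigenvalues: each proportion is an eigenvalue divided by the total (6, the number of questions), and the cumulative column is a running sum. A minimal Python sketch of that arithmetic, using the eigenvalues printed in the SAS output (the variable names are illustrative):

```python
# Eigenvalues of the correlation matrix, as printed in the SAS output.
eigenvalues = [3.66827888, 1.23997569, 0.53131410,
               0.34402077, 0.15474091, 0.06166967]

total = sum(eigenvalues)            # about 6, one unit of variance per question
average = total / len(eigenvalues)  # about 1

# Share of the total variance carried by each component.
proportions = [ev / total for ev in eigenvalues]

# Running total of the proportions.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

# Number of components whose eigenvalue exceeds the average.
n_above_average = sum(1 for ev in eigenvalues if ev > average)

for i, (ev, p, c) in enumerate(zip(eigenvalues, proportions, cumulative), start=1):
    print(f"{i}  {ev:.8f}  {p:.4f}  {c:.4f}")
print("components above the average eigenvalue:", n_above_average)
```

Only the first two eigenvalues exceed the average of 1, and together they carry about 82% of the variance, which is consistent with keeping two factors (NFACTORS=2 in the SAS code).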
A Visual Tool: Scree Plot
[Figure: Scree Plot of Eigenvalues. Vertical axis: eigenvalue (0.0 to 4.0); horizontal axis: factor number (1 to 6). Points 1 through 6 are plotted at their eigenvalues.]
Variance Explained by Each Factor
  Factor1     Factor2
3.6682789   1.2399757
Two Summary Factors
Factor Pattern
                                  Factor1    Factor2
QUES1  Feel Blue                  0.76843   -0.54767
QUES2  People Stare at Me         0.72985    0.59840
QUES3  People Follow Me           0.77904    0.50692
QUES4  Basically Happy           -0.87354    0.36879
QUES5  People Want to Hurt Me     0.72583    0.26237
QUES6  Enjoy Going to Parties    -0.80519    0.34660
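The factor pattern ties back to the eigenvalues: for each factor, the sum of squared loadings down its column is the variance that factor explains. A short Python sketch of that check, using the loadings printed in the factor pattern (names are illustrative; small differences come from the five-decimal rounding of the loadings):

```python
# Unrotated factor pattern: question -> (Factor1 loading, Factor2 loading).
pattern = {
    "QUES1": (0.76843, -0.54767),
    "QUES2": (0.72985, 0.59840),
    "QUES3": (0.77904, 0.50692),
    "QUES4": (-0.87354, 0.36879),
    "QUES5": (0.72583, 0.26237),
    "QUES6": (-0.80519, 0.34660),
}

# Variance explained by a factor = sum of its squared loadings over all questions.
variance_factor1 = sum(f1 ** 2 for f1, _ in pattern.values())
variance_factor2 = sum(f2 ** 2 for _, f2 in pattern.values())

print(f"Factor1: {variance_factor1:.4f}")
print(f"Factor2: {variance_factor2:.4f}")
```

The two sums agree, up to rounding, with the first two eigenvalues reported under Variance Explained by Each Factor.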
Communalities
Communalities represent how much of the variance in each original variable is explained by all of the factors kept in the analysis (here, the two factors).
Final Communality Estimates: Total = 4.908255
     QUES1        QUES2        QUES3        QUES4        QUES5        QUES6
0.89042545   0.89075763   0.86386768   0.89907825   0.59566231   0.76846325
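Each communality can be recovered from the factor pattern as the sum of a question's squared loadings on the retained factors. The Python sketch below redoes that arithmetic with the printed loadings; the names are illustrative, and agreement is only up to the rounding of the loadings.

```python
# Unrotated factor pattern: question -> (Factor1 loading, Factor2 loading).
pattern = {
    "QUES1": (0.76843, -0.54767),
    "QUES2": (0.72985, 0.59840),
    "QUES3": (0.77904, 0.50692),
    "QUES4": (-0.87354, 0.36879),
    "QUES5": (0.72583, 0.26237),
    "QUES6": (-0.80519, 0.34660),
}

# Communality of a question = sum of its squared loadings across the kept factors.
communalities = {q: f1 ** 2 + f2 ** 2 for q, (f1, f2) in pattern.items()}
total_communality = sum(communalities.values())

for q, h2 in communalities.items():
    print(f"{q}: {h2:.5f}")
print(f"Total: {total_communality:.5f}")
```

QUES5 has the lowest communality (about 0.60), so the two factors describe it less well than the other five questions.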
Discussion
Q4 and Q6 should point in the same direction on factors 1 and 2 (components 1 and 2).
The other questions should likewise point in the same direction on factors 1 and 2 (components 1 and 2).
A rotation is needed!
Rotation: Varimax Rotation
Orthogonal Transformation Matrix
            1          2
1    -0.73625    0.67671
2     0.67671    0.73625
Rotated Factor Pattern
                                  Factor1    Factor2
QUES1  Feel Blue                 -0.93637    0.11677
QUES2  People Stare at Me        -0.13241    0.93446
QUES3  People Follow Me          -0.23053    0.90040
QUES4  Basically Happy            0.89271   -0.31960
QUES5  People Want to Hurt Me    -0.35684    0.68434
QUES6  Enjoy Going to Parties     0.82737   -0.28969
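The rotated pattern is the unrotated factor pattern multiplied by the orthogonal transformation matrix. The following Python sketch reproduces that product from the printed values and checks that the transformation matrix is orthogonal. The `rotate` helper and the variable names are illustrative, and results match the SAS output only up to rounding.

```python
# Unrotated factor pattern: question -> (Factor1 loading, Factor2 loading).
pattern = {
    "QUES1": (0.76843, -0.54767),
    "QUES2": (0.72985, 0.59840),
    "QUES3": (0.77904, 0.50692),
    "QUES4": (-0.87354, 0.36879),
    "QUES5": (0.72583, 0.26237),
    "QUES6": (-0.80519, 0.34660),
}

# Orthogonal transformation matrix reported for the varimax rotation.
transform = [[-0.73625, 0.67671],
             [0.67671, 0.73625]]

def rotate(loadings, t):
    """Multiply a row of loadings (f1, f2) by the 2 x 2 matrix t."""
    f1, f2 = loadings
    return (f1 * t[0][0] + f2 * t[1][0],
            f1 * t[0][1] + f2 * t[1][1])

rotated = {q: rotate(row, transform) for q, row in pattern.items()}

# Orthogonality check: transpose(T) times T should be the identity matrix.
gram = [[sum(transform[k][i] * transform[k][j] for k in range(2))
         for j in range(2)] for i in range(2)]

for q, (f1, f2) in rotated.items():
    print(f"{q}: {f1:8.5f} {f2:8.5f}")
```

After rotation QUES1, QUES4 and QUES6 load mainly on Factor1, while QUES2, QUES3 and QUES5 load mainly on Factor2.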
Component Plot after Rotation
[Figure: Plot of Factor Pattern for Factor1 and Factor2. Vertical axis: Factor1 (-1 to 1); horizontal axis: Factor2 (-1 to 1). Point labels: QUES1=A, QUES2=B, QUES3=C, QUES4=D, QUES5=E, QUES6=F.]
Variance Explained by Each Factor
  Factor1     Factor2
2.5562772   2.3519773
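An orthogonal rotation redistributes the explained variance between the factors without changing the total. The Python sketch below recomputes the variance explained by each rotated factor from the rotated loadings and compares the total with the unrotated one; the names are illustrative and the match is up to rounding.

```python
# Rotated factor pattern: question -> (Factor1 loading, Factor2 loading).
rotated = {
    "QUES1": (-0.93637, 0.11677),
    "QUES2": (-0.13241, 0.93446),
    "QUES3": (-0.23053, 0.90040),
    "QUES4": (0.89271, -0.31960),
    "QUES5": (-0.35684, 0.68434),
    "QUES6": (0.82737, -0.28969),
}

# Variance explained by each rotated factor = sum of its squared loadings.
rotated_var1 = sum(f1 ** 2 for f1, _ in rotated.values())
rotated_var2 = sum(f2 ** 2 for _, f2 in rotated.values())

# Variance explained before rotation, as reported in the SAS output.
unrotated_total = 3.6682789 + 1.2399757
rotated_total = rotated_var1 + rotated_var2

print(f"Factor1: {rotated_var1:.4f}  Factor2: {rotated_var2:.4f}")
print(f"Total after rotation: {rotated_total:.4f}  before: {unrotated_total:.4f}")
```

Before rotation the split was about 3.67 versus 1.24; after rotation it is about 2.56 versus 2.35, with the same total of about 4.91.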
Using Communalities Other Than One
Used when the original variables are not equally important
Different methods of “extraction”
SAS Code
* Principal component extraction (the default), two factors kept, varimax rotation;
PROC FACTOR DATA=FACTOR PREPLOT PLOT ROTATE=VARIMAX
            NFACTORS=2 OUT=FACT SCREE;
   TITLE "Example of Factor Analysis";
   VAR QUES1-QUES6;
RUN;