Canonical Correlation Analysis - Home Page | Jonathan ... · PDF fileLecture #11 - 8/4/2011...

Lecture #11 - 8/4/2011 Slide 1 of 39

Canonical Correlation Analysis

Lecture 11
August 4, 2011

Advanced Multivariate Statistical Methods
ICPSR Summer Session #2

Overview

● Today’s Lecture

Canonical Correlations

Computation

Interpretation

Another Example

Other Analyses

Wrapping Up


Today’s Lecture

■ Canonical Correlation Analysis

◆ What it is

◆ How it works

◆ How to do such an analysis

■ Examples of uses of canonical correlations


Purpose

■ In general, when we have univariate data there are times when we would like to measure the linear relationship between things

◆ The simplest case is when we have 2 variables and all we are interested in is measuring their linear relationship. Here we would just use bivariate correlation

◆ Another case is in multiple regression when we have several independent variables and one dependent variable. In this case we would use the multiple correlation coefficient $R$ (usually reported as $R^2$)

■ So, it would be nice if we could expand the idea used in these to a situation where we have several y variables and several x variables


Concept

■ From Webster’s Dictionary: canonical: reduced to the simplest or clearest schema possible.

■ What do we mean by basic ideas?

■ In describing canonical correlation, we will start with the basic cases where we only have two variables and build on them until we get to canonical correlations

1. First we will look at the bivariate correlation

2. Then we will see what was done to generalize bivariate correlation to the multiple correlation coefficient

3. Finally, these discussions will lead us right to what happens in canonical correlation analysis


Bivariate Correlation

■ Begin by thinking of just two variables y and x

■ In this case the correlation describes the extent to which one variable relates to (can predict) the other

■ That is...the stronger the correlation, the more we will know about y by just knowing x

[Figure: two scatterplots, one showing no relationship and one showing a strong positive relationship]
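A minimal sketch of the two situations just described, written in Python with NumPy rather than the SAS used elsewhere in the lecture. The data are simulated, and the sample size, slope, noise level, and the helper name `pearson_r` are illustrative assumptions, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed so the sketch is repeatable
n = 200
x = rng.normal(size=n)
y_none = rng.normal(size=n)                 # generated independently of x
y_strong = 2.0 * x + 0.3 * rng.normal(size=n)   # mostly determined by x


def pearson_r(u, v):
    """Sample correlation: cov(u, v) / (sd(u) * sd(v))."""
    du, dv = u - u.mean(), v - v.mean()
    return float(du @ dv / np.sqrt((du @ du) * (dv @ dv)))


r_none = pearson_r(x, y_none)
r_strong = pearson_r(x, y_strong)
print(f"no relationship:     r = {r_none:.3f}")
print(f"strong relationship: r = {r_strong:.3f}")
```

The first correlation should land near zero and the second near one, which is the sense in which knowing x tells us more about y when the correlation is strong.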


Multiple Correlation

■ On the other hand, if we have one y and multiple x variables we can no longer look at a simple relationship between the two variables

■ But, we can look at how well the set of x variables can predict the y by just computing the regression line

■ Using the regression line we can compute our predicted $\hat{y}$ and we can compare it to the y variable.

◆ Specifically, we now have only two variables, y and $\hat{y} = x'b$, so we can compute a simple correlation

■ Note: we started with something that was more complicated (many x variables) and changed it into something for which we could compute a simple correlation (between y and $\hat{y}$)
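The reduction described above can be sketched in Python with NumPy. This is an illustration on simulated data, not the lecture's SAS analysis: the coefficients, sample size, and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 4
X = rng.normal(size=(n, p))                         # hypothetical predictors
y = X @ np.array([1.0, 0.5, -0.5, 0.0]) + rng.normal(size=n)

# Least-squares fit with an intercept column.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ coef

# Multiple correlation R: the simple correlation between y and y-hat.
R = float(np.corrcoef(y, y_hat)[0, 1])

# Cross-check against the usual R^2 = 1 - SS_residual / SS_total.
ss_res = float(np.sum((y - y_hat) ** 2))
ss_tot = float(np.sum((y - y.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot
print(f"R = {R:.4f}, R^2 = {R**2:.4f}, 1 - SSres/SStot = {r_squared:.4f}")
```

The squared correlation between y and its prediction agrees with the regression $R^2$, which is why the many-x problem collapses to a single bivariate correlation.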


Multiple Correlation Example

From Weisberg (1985, p. 240).

“Property taxes on a house are supposedly dependent on the current market value of the house. Since houses actually sell only rarely, the sale price of each house must be estimated every year when property taxes are set. Regression methods are sometimes used to make up a prediction function.”

We have data for 27 houses sold in the mid 1970’s in Erie, Pennsylvania:

■ x1: Current taxes (local, school, and county) ÷ 100 (dollars)
■ x2: Number of bathrooms
■ x3: Living space ÷ 1000 (square feet)
■ x4: Age of house (years)
■ y: Actual sale price ÷ 1000 (dollars)



To compute the multiple correlation of x1, x2, x3, and x4 with y, first compute the multiple regression for all x variables and y:

proc reg data=house;
  model y=x1-x4;
  output out=newdata p=yhat;
run;

Then, take the predicted values given by the model, $\hat{y}$, and correlate them with y:

proc corr data=newdata;
  var yhat y;
run;



[SAS output not reproduced.] The correlation between yhat and y reported by PROC CORR is the multiple correlation between x1, x2, x3, x4 and y


Canonical Correlation

■ Canonical correlation seeks to find the correlation between multiple x variables and multiple y variables

■ Now we have several y variables and several x variables so neither of our previous two examples can directly apply, BUT we can take the points from the previous cases and use them for this new case

■ So we could look at how well the set of x variables can predict the set of y variables, but in doing this we still will not be able to compute a simple correlation

■ On the other hand, in the multiple regression we found a linear combination of the variables $b'x$ to get a single variable

◆ In our case we have two sets of variables so it makes sense that we can define two linear combinations...one for the x variables ($b_1$) and one for the y variables ($a_1$)



■ In the simple case where we only have a single linear combination for each set of variables we can compute the simple correlation between these two linear combinations

■ The first canonical correlation describes the correlation between these two new variables ($b_1'x$ and $a_1'y$)

■ So how do we pick the linear transformations?

◆ These linear transformations ($b_1$ and $a_1$) are picked such that the correlation between these two new variables is maximized

◆ Notice that this idea is really no different from what we did in multiple regression

◆ This also sounds similar to something we have done in PCA
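Written out, the maximization can be stated as follows. This is the standard textbook formulation rather than a formula taken from the slides; it uses the covariance blocks $S_{xx}$, $S_{yy}$, and $S_{yx} = S_{xy}'$ that the Computation slides define.

```latex
r_1 \;=\; \max_{a,\,b}\ \operatorname{corr}\!\left(a'y,\; b'x\right)
    \;=\; \max_{a,\,b}\ \frac{a' S_{yx}\, b}
                             {\sqrt{\left(a' S_{yy}\, a\right)\left(b' S_{xx}\, b\right)}},
\qquad (a_1, b_1) = \text{the maximizing pair}
```

The ratio does not change when $a$ or $b$ is rescaled, so the weights are only determined up to a constant; a common convention is to fix $a' S_{yy}\, a = b' S_{xx}\, b = 1$. When there is a single y variable, $a$ is a scalar and the problem reduces to the multiple correlation.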



■ ONE LAST THING

■ Think back to PCA when we said that a single linear combination did not account for all of the information present in a data set...

◆ Then we could determine how many linear combinations were needed to capture more information (where the linear combinations were all uncorrelated)

■ We can do the same thing here...

◆ We can define more sets of linear combinations ($b_i$ and $a_i$, $i = 1, \ldots, s$, where $s = \min(p, q)$, p is the number of variables in the group of x and q is the number of variables in y)

◆ Each pair of linear combinations maximizes the correlation between the new variables under the constraint that they are uncorrelated with all previous linear combinations


Computation

■ To show how to compute canonical correlations, first consider our original covariance matrix from our example:

         x1         x2         x3         x4          y
x1     8.3100     1.0700     1.3400   −15.0300    37.7400
x2     1.0700     0.1800     0.2100    −1.2500     5.6000
x3     1.3400     0.2100     0.3100    −1.4000     7.4200
x4   −15.0300    −1.2500    −1.4000   197.4900   −62.3900
y     37.7400     5.6000     7.4200   −62.3900   204.7000


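■ From this matrix, we will define four new sub-matrices, from which we will calculate our correlations:

$$
S =
\begin{pmatrix}
S_{xx} & S_{xy} \\
S_{xy}' & S_{yy}
\end{pmatrix}
$$

where $S_{xx}$ holds the rows and columns for x1, x2, x3, x4; $S_{xy}$ holds the x rows and the y column; $S_{yy}$ holds the row and column for y; and $S_{yx} = S_{xy}'$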


■ So how do we compute the canonical correlations?

■ To begin, note that we could define the Squared Multiple Correlation $R^2_M$ as

$$
R^2_M = \frac{\left| S_{xy}'\, S_{xx}^{-1}\, S_{xy} \right|}{\left| S_{yy} \right|}
$$

which can be rewritten as:

$$
R^2_M = \left| S_{yy}^{-1}\, S_{yx}\, S_{xx}^{-1}\, S_{xy} \right|
$$

■ For canonical correlations, however, we will focus on the matrix formed by the part of the equation within the $|\cdot|$ (note this was just a scalar when y only has one variable)
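As a rough check of the formula, the house-price covariance matrix given earlier in the lecture can be plugged in directly. The sketch below uses Python with NumPy instead of SAS; the array slicing and variable names are assumptions made for the example, and the covariances are the rounded values printed on the slide, so the result is only approximate.

```python
import numpy as np

# Covariance matrix from the slides, ordered x1, x2, x3, x4, y.
S = np.array([
    [  8.31,  1.07,  1.34, -15.03,  37.74],
    [  1.07,  0.18,  0.21,  -1.25,   5.60],
    [  1.34,  0.21,  0.31,  -1.40,   7.42],
    [-15.03, -1.25, -1.40, 197.49, -62.39],
    [ 37.74,  5.60,  7.42, -62.39, 204.70],
])

S_xx = S[:4, :4]        # 4 x 4 covariances among the x variables
S_xy = S[:4, 4:]        # 4 x 1 covariances between x and y
S_yy = S[4:, 4:]        # 1 x 1 variance of y

# R^2_M = |S_xy' S_xx^{-1} S_xy| / |S_yy|; solve() avoids forming the inverse.
numerator = S_xy.T @ np.linalg.solve(S_xx, S_xy)
r2_m = float(np.linalg.det(numerator) / np.linalg.det(S_yy))
print(f"R^2_M = {r2_m:.3f}, R_M = {np.sqrt(r2_m):.3f}")
```

With these rounded inputs the squared multiple correlation comes out at roughly 0.94. Because y is a single variable here, both determinants are taken of 1 x 1 matrices and the expression is just a ratio of scalars.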


■ We first compute the square root of the eigenvalues ($r_1, r_2, \ldots, r_s$) and the eigenvectors ($a_1, a_2, \ldots, a_s$) of:

$$
S_{yy}^{-1}\, S_{yx}\, S_{xx}^{-1}\, S_{xy}
$$

■ Then we compute the square root of the eigenvalues ($r_1, r_2, \ldots, r_s$) and the eigenvectors ($b_1, b_2, \ldots, b_s$) of:

$$
S_{xx}^{-1}\, S_{xy}\, S_{yy}^{-1}\, S_{yx}
$$

■ Conveniently, the $s$ largest eigenvalues of both matrices are equal (and are between zero and one); any remaining eigenvalues are zero!

◆ The square root of the eigenvalues represents each successive canonical correlation between the successive pairs of linear combinations

■ From the eigenvectors we have determined the linear transformations for the new linear combinations
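The eigenvalue recipe above can be sketched in Python with NumPy. Everything about the data is an assumption made for illustration (simulated x and y sets, the mixing matrix `W`, the seed); the helper `sorted_eig` is a hypothetical convenience function, not a library routine.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 500, 3, 2
x = rng.normal(size=(n, p))
W = np.array([[1.0, 0.0], [0.5, 0.3], [0.0, 0.2]])   # hypothetical x-to-y mixing
y = x @ W + rng.normal(size=(n, q))

S = np.cov(np.hstack([x, y]), rowvar=False)           # x block first, then y
S_xx, S_xy = S[:p, :p], S[:p, p:]
S_yx, S_yy = S[p:, :p], S[p:, p:]

# q x q matrix (eigenvectors a_i) and p x p matrix (eigenvectors b_i).
M_y = np.linalg.solve(S_yy, S_yx) @ np.linalg.solve(S_xx, S_xy)
M_x = np.linalg.solve(S_xx, S_xy) @ np.linalg.solve(S_yy, S_yx)


def sorted_eig(M):
    """Eigen-decomposition with eigenvalues sorted largest first."""
    vals, vecs = np.linalg.eig(M)
    order = np.argsort(vals.real)[::-1]
    return vals.real[order], vecs.real[:, order]


s = min(p, q)
eig_y, A = sorted_eig(M_y)
eig_x, B = sorted_eig(M_x)
r = np.sqrt(eig_y[:s])                                 # canonical correlations

# Correlation between the first pair of canonical variates (sign is arbitrary).
u1, v1 = y @ A[:, 0], x @ B[:, 0]
r1_direct = abs(float(np.corrcoef(u1, v1)[0, 1]))
print("canonical correlations:", np.round(r, 4))
print("corr of first variates:", round(r1_direct, 4))
```

The leading eigenvalues of the two matrices agree, and correlating the first pair of variates directly reproduces the first canonical correlation. `np.linalg.solve` is used in place of explicit inverses, which is a numerical choice rather than something the slides prescribe.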


Example #2

■ To illustrate canonical correlations, consider the following analysis:

Three physiological and three exercise variables are measured on 27 middle-aged men in a fitness club

■ The variables collected are:

◆ Weight (in pounds - x1)

◆ Waist size (in inches - x2)

◆ Pulse rate (in beats-per-minute - x3)

◆ Number of chin-ups performed (y1)

◆ Number of sit-ups performed (y2)

◆ Number of jumping-jacks performed (y3)

■ The goal of the analysis is to determine the relationship between the physiological measurements and the exercises


■ To run a canonical correlation analysis, use the following code:

proc cancorr data=Fit all
     vprefix=Physiological vname='Physiological Measurements'
     wprefix=Exercises wname='Exercises';
  var Weight Waist Pulse;
  with Chins Situps Jumps;
run;


Standardized Weights

■ Just like in PCA and Factor Analysis, we are interested in interpreting the weights of the linear combination

■ However, if our variables are on different scales, the weights are difficult to interpret

■ So, we can standardize them, which is the same as computing the canonical correlations and linear combinations from the correlation matrix instead of the variance/covariance matrix

■ We can also compute the standardized coefficients (c and d) directly:

$$
c = \operatorname{diag}(S_{yy})^{1/2}\, a
$$

and

$$
d = \operatorname{diag}(S_{xx})^{1/2}\, b
$$
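The equivalence between the two routes to standardized weights can be checked numerically. The Python/NumPy sketch below is an illustration under assumed, simulated data; `first_y_weights` is a hypothetical helper, and both weight vectors are rescaled to unit length because eigenvectors are only defined up to a constant and a sign.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 400, 2
x = rng.normal(size=(n, p))
y = x @ np.array([[1.0, 0.4], [0.2, 1.0]]) + rng.normal(size=(n, 2))
y[:, 1] *= 10.0                       # put the second y variable on a larger scale


def first_y_weights(C, p):
    """Leading eigenvector of C_yy^{-1} C_yx C_xx^{-1} C_xy (x block first)."""
    C_xx, C_xy = C[:p, :p], C[:p, p:]
    C_yx, C_yy = C[p:, :p], C[p:, p:]
    M = np.linalg.solve(C_yy, C_yx) @ np.linalg.solve(C_xx, C_xy)
    vals, vecs = np.linalg.eig(M)
    a = vecs.real[:, np.argmax(vals.real)]
    return a / np.linalg.norm(a)      # fix the arbitrary length


data = np.hstack([x, y])
S = np.cov(data, rowvar=False)        # covariance matrix
R = np.corrcoef(data, rowvar=False)   # correlation matrix

a = first_y_weights(S, p)             # raw weights from the covariance matrix
c = np.sqrt(np.diag(S[p:, p:])) * a   # c = diag(S_yy)^{1/2} a
c = c / np.linalg.norm(c)

c_from_R = first_y_weights(R, p)      # weights from the correlation matrix
if c @ c_from_R < 0:                  # eigenvector sign is arbitrary
    c_from_R = -c_from_R

print("standardized from a:", np.round(c, 4))
print("from corr. matrix:  ", np.round(c_from_R, 4))
```

Both routes give the same direction for the standardized weights, while the raw weights `a` differ because they absorb the units of each y variable.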


Canonical Corr. Properties

1. Canonical correlations are invariant to changes of scale.

■ This means that, like any correlation, scale changes (such as standardizing) will not change the correlation.

■ However, they will change the eigenvectors...

2. The first canonical correlation is the best we can do with associations.

■ Which means it is at least as large as any of the simple correlations or any multiple correlation with the variables under study
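Both properties can be illustrated numerically. The sketch below uses Python with NumPy on simulated data; the rescaling factors, the seed, and the helper `canonical_correlations` are assumptions made for this example only.

```python
import numpy as np


def canonical_correlations(x, y):
    """Square roots of the leading eigenvalues of S_yy^{-1} S_yx S_xx^{-1} S_xy."""
    p = x.shape[1]
    S = np.cov(np.hstack([x, y]), rowvar=False)
    S_xx, S_xy = S[:p, :p], S[:p, p:]
    S_yx, S_yy = S[p:, :p], S[p:, p:]
    M = np.linalg.solve(S_yy, S_yx) @ np.linalg.solve(S_xx, S_xy)
    eigvals = np.sort(np.linalg.eigvals(M).real)[::-1]
    s = min(x.shape[1], y.shape[1])
    return np.sqrt(np.clip(eigvals[:s], 0.0, 1.0))


rng = np.random.default_rng(4)
n, p, q = 300, 3, 2
x = rng.normal(size=(n, p))
y = x @ np.array([[0.7, 0.1], [0.3, 0.6], [0.0, 0.2]]) + rng.normal(size=(n, q))

# Property 1: changing units (and shifting origins) leaves the correlations alone.
x_rescaled = x * np.array([2.2, 0.39, 60.0]) + 5.0
y_rescaled = y * np.array([10.0, 0.1])
r_raw = canonical_correlations(x, y)
r_rescaled = canonical_correlations(x_rescaled, y_rescaled)

# Property 2: r_1 is at least as large as every simple x-y correlation.
simple = np.corrcoef(np.hstack([x, y]), rowvar=False)[:p, p:]
max_simple = float(np.max(np.abs(simple)))

print("raw:      ", np.round(r_raw, 4))
print("rescaled: ", np.round(r_rescaled, 4))
print("largest simple |r|:", round(max_simple, 4))
```

The canonical correlations match before and after rescaling, and the first one is no smaller than the largest simple correlation between any x and any y.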


Hypothesis Test for Corr.

■ We begin by testing that at least the first (the largest) correlation is significantly different from zero

■ Asking whether we can get a significant relationship out of the optimal linear combinations of variables is the same as testing $H_0 : \Sigma_{xy} = 0$ or $B_1 = 0$

◆ This is tested using Wilks’ Lambda:

$$
\Lambda_1 = \frac{|S|}{|S_{yy}|\,|S_{xx}|}
$$

■ Or, equivalently (where $r_i^2$ is the $i$th eigenvalue from the matrix term produced from the submatrices of the covariance matrix):

$$
\Lambda_1 = \prod_{i=1}^{s} \left(1 - r_i^2\right)
$$
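The equivalence of the determinant form and the product form can be verified on any data set. Here is a minimal Python/NumPy sketch on simulated data; the sample size, effect size, and seed are arbitrary assumptions, and no significance test is carried out.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, q = 300, 3, 2
x = rng.normal(size=(n, p))
y = 0.6 * x[:, :q] + rng.normal(size=(n, q))     # hypothetical dependence on x

S = np.cov(np.hstack([x, y]), rowvar=False)
S_xx, S_xy = S[:p, :p], S[:p, p:]
S_yx, S_yy = S[p:, :p], S[p:, p:]

# Squared canonical correlations: eigenvalues of S_yy^{-1} S_yx S_xx^{-1} S_xy.
M = np.linalg.solve(S_yy, S_yx) @ np.linalg.solve(S_xx, S_xy)
r2 = np.sort(np.linalg.eigvals(M).real)[::-1]

# Determinant form and product form of Wilks' Lambda.
lambda_det = float(np.linalg.det(S) / (np.linalg.det(S_yy) * np.linalg.det(S_xx)))
lambda_prod = float(np.prod(1.0 - r2))
print(f"|S| / (|S_yy||S_xx|) = {lambda_det:.6f}")
print(f"prod(1 - r_i^2)      = {lambda_prod:.6f}")
```

The two numbers agree up to floating-point error. Values of $\Lambda_1$ near zero indicate a strong relationship between the sets, and values near one indicate a weak one.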


The Rest

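■ In this case $\Lambda_1$ is computed as

$$
\Lambda_1 = \prod_{i=1}^{s} \left(1 - r_i^2\right)
$$

which can be compared to $\Lambda_{\alpha,\,p,\,q,\,n-1-q}$ (or to $\Lambda_{\alpha,\,q,\,p,\,n-1-p}$)

■ In general we can compute

$$
\Lambda_k = \prod_{i=k}^{s} \left(1 - r_i^2\right)
$$

which can be compared to $\Lambda_{\alpha,\,p-k+1,\,q-k+1,\,n-k-q}$ (or to $\Lambda_{\alpha,\,q-k+1,\,p-k+1,\,n-k-p}$)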
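The sequence of statistics is simple arithmetic once the canonical correlations are known. The Python/NumPy sketch below uses made-up canonical correlations (0.8, 0.2, 0.1) purely for illustration; it computes the statistics only and does not look up the Wilks critical values.

```python
import numpy as np

r = np.array([0.8, 0.2, 0.1])          # hypothetical canonical correlations, s = 3
one_minus_r2 = 1.0 - r**2              # 0.36, 0.96, 0.99

# Lambda_k = prod_{i=k}^{s} (1 - r_i^2): a cumulative product taken from the end.
lambdas = np.cumprod(one_minus_r2[::-1])[::-1]

for k, lam in enumerate(lambdas, start=1):
    print(f"Lambda_{k} = {lam:.4f}")
# Lambda_1 = 0.3421
# Lambda_2 = 0.9504
# Lambda_3 = 0.9900
```

Each later statistic drops the largest remaining correlation, so the values move toward one as k increases. In this made-up case only the first pair would look like a candidate for a meaningful relationship, subject to the formal comparison against the critical value.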


Interpretation

■ Because in many ways a canonical correlation analysis is similar to what we discussed in PCA, the interpretation methods are also similar

■ Specifically, we will discuss four methods that are used to interpret the results:

1. Standardized Coefficients

2. Correlation between Canonical Variates (the linear combination) and each variable

3. Rotation

4. Redundancy Analysis


Standardized

■ Because the standardized variables are on the same scale, their coefficients can be directly compared

■ Those variables that are most important to the association are the ones whose coefficients have the largest absolute values (i.e., magnitude determines importance)

■ To interpret what the linear combination is capturing we will also consider the sign of each weight


Correlation of Linear Combination with Variables

■ This was mentioned in PCA and EFA...

■ That is, we compute our linear combinations and then compute the correlation between each linear combination (canonical variate) and each of the actual variables

◆ The correlations are typically called the loadings or structure coefficients

■ As was the case in PCA this ignores the overall multidimensional structure and so it is not a recommended analysis to make interpretations from
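For reference, the loadings can be computed in two equivalent ways. The Python/NumPy sketch below is an illustration on simulated data; the mixing matrix, seed, and variable names are assumptions, and only the first canonical variate of the x set is shown.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, q = 300, 3, 2
x = rng.normal(size=(n, p))
y = x @ np.array([[0.8, 0.0], [0.4, 0.5], [0.0, 0.3]]) + rng.normal(size=(n, q))

S = np.cov(np.hstack([x, y]), rowvar=False)
S_xx, S_xy = S[:p, :p], S[:p, p:]
S_yx, S_yy = S[p:, :p], S[p:, p:]

# Weights b_1 for the x set: leading eigenvector of S_xx^{-1} S_xy S_yy^{-1} S_yx.
M_x = np.linalg.solve(S_xx, S_xy) @ np.linalg.solve(S_yy, S_yx)
vals, vecs = np.linalg.eig(M_x)
b1 = vecs.real[:, np.argmax(vals.real)]

v1 = x @ b1                                    # first canonical variate of the x set

# Loadings: correlation of the variate with each original x variable.
loadings = np.array([np.corrcoef(v1, x[:, j])[0, 1] for j in range(p)])

# Same quantity in matrix form: (S_xx b1)_j / sqrt(var(x_j) * b1' S_xx b1).
loadings_matrix = (S_xx @ b1) / np.sqrt(np.diag(S_xx) * (b1 @ S_xx @ b1))

print("loadings (direct):", np.round(loadings, 4))
print("loadings (matrix):", np.round(loadings_matrix, 4))
```

Each loading is an ordinary bivariate correlation, which is the reason given above for treating it with caution: it describes one variable at a time and ignores how the variables work together in the linear combination.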


Rotation

■ We could try rotating the weights of the analysis to provide an interpretable result...

■ For this we begin to rely on the spatial representation of what is going on with the data

■ Every linear combination is projecting our observations onto a different dimension

◆ Sometimes these dimensions are difficult to interpret (i.e., based on the sign and magnitude of the weights)

■ Sometimes we can rotate these dimensions so that the weights are easier to interpret

◆ Some are large and some are small

■ Rotations in CCA are not recommended, because we lose the optimal interpretation of the analysis


Redundancy

■ Another method for interpretation is a redundancy analysis (this, again, is often not liked by statisticians because it only summarizes univariate relationships)


Another Example

■ In a study of social support and mental health, measures of the following seven variables were taken on 405 subjects:

◆ Total Social Support

◆ Family Social Support

◆ Friend Social Support

◆ Significant Other Social Support

◆ Depression

◆ Loneliness

◆ Stress

■ The researchers were interested in determining the relationship between social support and mental health...how about using a canonical correlation analysis?

*SAS Example #3;
data depress (type=corr);
  _type_='corr';
  input _name_ $ v1-v7;
  label v1='total social support'
        v2='family social support'
        v3='friend social support'
        v4='significant other social support'
        v5='depression'
        v6='loneliness'
        v7='stress';
datalines;
v1 1.00 . . . . . .
v2 0.8280 1.0000 . . . . .
v3 0.8136 0.5192 1.0000 . . . .
v4 0.8569 0.5972 0.6109 1.0000 . . .
v5 -0.3691 -0.3218 -0.3150 -0.3044 1.0000 . .
v6 -0.6282 -0.4945 -0.5774 -0.5266 0.5368 1.0000 .
v7 -0.1849 -0.2049 -0.1132 -0.1291 0.4872 0.2846 1.000
;

* vprefix/vname label the VAR set (v1-v4, social support);
* wprefix/wname label the WITH set (v5-v7, mental health);
proc cancorr data=depress all corr edf=404
     vprefix=Social_Support vname='Social Support'
     wprefix=Mental_Health wname='Mental Health';
  var v1-v4;
  with v5-v7;
run;


Other Analyses

■ In general, the results from a canonical correlations routine are related to:

1. Regression

2. Discriminant Analysis (we will learn this next week)

3. MANOVA

■ However, the goals of canonical correlation overlap with the information provided by a confirmatory factor analysis or structural equation model...


Final Thought

■ The midterm was accomplished using MANOVA and MANCOVA.

■ Canonical correlation analysis is a complicated analysis that provides many results of interest to researchers.

■ Perhaps because of its complicated nature, canonical correlation analysis is not often used.

■ Last week: Nebraska...This week: Texas...After that: The world.

■ Tomorrow: Lab Day! Meet in Helen Newberry’s Michigan Lab
