Outer Product Analysis (OPA)
studying the relations among sets of variables measured on the same individuals
Douglas N. Rutledge
Some publications on Outer Product Analysis

Infrared spectroscopy and outer product analysis for quantification of fat, nitrogen, and moisture of cocoa powder
A. Vesela, A. S. Barros, A. Synytsya, I. Delgadillo, J. Čopíková, M. A. Coimbra
Analytica Chimica Acta 601 (2007) 77–86

Multi-way analysis of outer product arrays using PARAFAC
D. N. Rutledge, D. Jouan-Rimbaud Bouveresse
Chemometrics and Intelligent Laboratory Systems 85 (2007) 170–178

Image processing of outer-product matrices – a new way to classify samples. Examples using visible/NIR/MIR spectral data
B. Jaillais, V. Morrin, G. Downey
Chemometrics and Intelligent Laboratory Systems xx (2006) xxx–xxx

Variability of cork from Portuguese Quercus suber studied by solid state 13C-NMR and FTIR spectroscopies
M.H. Lopes, A.S. Barros, C. Pascoal Neto, D. Rutledge, I. Delgadillo, A. M. Gil
Biopolymers (Biospectroscopy) 62 (5) (2001) 268–277

Outer Product Analysis of electronic nose and visible spectra: application to the measurement of peach fruit characteristics
C. di Natale, M. Zude-Sasse, A. Macagnano, R. Paolesse, B. Herold, A. D'Amico
Analytica Chimica Acta 459 (2002) 107–117

Determination of the degree of methylesterification of pectic polysaccharides by FT-IR using an outer product PLS1 regression
A.S. Barros, I. Mafra, D. Ferreira, S. Cardoso, A. Reis, J.A. Lopes de Silva, I. Delgadillo, D.N. Rutledge, M.A. Coimbra
Carbohydrate Polymers 50 (2002) 85–94

Enhanced multivariate analysis by correlation scaling and fusion of LC/MS and 1H NMR data
J. Forshed, R. Stolt, H. Idborg, S. P. Jacobsson
Chemometrics and Intelligent Laboratory Systems 85 (2007) 179–185

Outer-product analysis (OPA) using PCA to study the influence of temperature on NIR spectra of water
B. Jaillais, R. Pinto, A.S. Barros, D.N. Rutledge
Vibrational Spectroscopy 39 (2005) 50–58

Outer-product analysis (OPA) using PLS regression to study the retrogradation of starch
B. Jaillais, M.A. Ottenhof, I.A. Farhat, D.N. Rutledge
Vibrational Spectroscopy 40 (2006) 10–19
Principal Components Analysis (PCA)
Calculate the covariance matrix, C, of the original data matrix, X
Covariance matrix: C (p x p), with Cij = cov(i,j)
$$\mathrm{Cov}(x_1, x_2) = s_{1,2} = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(x_{i2} - \bar{x}_2)}{n - 1}$$
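As a minimal NumPy sketch of the covariance calculation on this slide: the small data matrix `X` below is invented for illustration (n = 4 samples, p = 3 variables), and the variable names are not from the original slides.

```python
import numpy as np

# Hypothetical data matrix: n = 4 samples (rows), p = 3 variables (columns)
X = np.array([[1.0, 2.0, 0.5],
              [2.0, 1.0, 1.5],
              [3.0, 4.0, 2.5],
              [4.0, 3.0, 3.5]])
n, p = X.shape

# Centre each column, then C = Xc^T Xc / (n - 1), a p x p matrix
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (n - 1)

# One element computed term by term: covariance between variables 1 and 2
m1, m2 = X[:, 0].mean(), X[:, 1].mean()
s12 = sum((X[i, 0] - m1) * (X[i, 1] - m2) for i in range(n)) / (n - 1)
```

The term-by-term value `s12` is the same number as `C[0, 1]`, and `C` agrees with `np.cov(X, rowvar=False)`.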
Calculate the matrix of individual covariances between variables of one data set, X
[Diagram: the signal of sample i, the row vector $x_i$ (1 x p), is multiplied by its own transpose $x_i^T$ (p x 1) to give a p x p matrix]
$C_i = x_i^T \cdot x_i$
Mutual weighting of each signal by the other:
• if intensities are simultaneously high in the two domains, the product is higher;
• if intensities are simultaneously low in the two domains, the product is lower;
• if one intensity is high and the other low, the product tends to an intermediate value.
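The mutual weighting can be seen on a toy signal. This is only an illustrative sketch: the four intensities in `x_i` are invented, and `np.outer` stands in for the product of the signal with its own transpose.

```python
import numpy as np

# Hypothetical signal for one sample: p = 4 intensities (low, high, medium, low)
x_i = np.array([0.1, 0.9, 0.5, 0.1])

# Outer product of the signal with itself: a p x p matrix
C_i = np.outer(x_i, x_i)

high_high = C_i[1, 1]   # both intensities high   -> largest product
low_low = C_i[0, 3]     # both intensities low    -> smallest product
high_low = C_i[1, 0]    # one high, one low       -> intermediate product
```

The resulting matrix is symmetric, and the three selected elements are ordered `low_low < high_low < high_high`.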
For n samples, one gets n Outer Product matrices
Group them together one under the other in the form of a cube of individual matrices of covariances among variables
[Diagram: the n Outer Product matrices (p x p), one per sample, stacked into a cube (n, p, p)]
Calculate all the individual covariance matrices of a single matrix, X
A cube of symmetrical matrices
Calculate the mean of the individual covariance matrices to obtain a matrix of mean covariances
[Diagram: the cube (n, p, p) averaged over the n samples to give 1 mean matrix (p, p)]
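The cube and its mean can be sketched in a few lines of NumPy. The random data and the names `cube`, `mean_op` and `direct` are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 3
X = rng.random((n, p))                   # hypothetical data: n samples, p variables

# Cube of individual outer products: cube[i] is the p x p matrix of sample i
cube = np.einsum('ij,ik->ijk', X, X)     # shape (n, p, p)

# Averaging over the sample mode gives one p x p matrix
mean_op = cube.mean(axis=0)

# The same matrix obtained directly, without building the cube
direct = X.T @ X / n
```

Here `mean_op` and `direct` are equal, so when the columns of `X` are centred the mean Outer Product matrix is the covariance matrix (with divisor n).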
Decomposition of the « Mean » OP matrix (column-mean matrix) by SVD ≡ Principal Components Analysis
SVD applied to the initial data matrix, X
$$X_{(n,p)} = U_{(n,r)}\, S_{(r,r)}\, V^T_{(r,p)}$$
S : diagonal matrix of singular values
V : loadings matrix
U*S : scores matrix
[Figure: image of the data matrix X (n x p)]
SVD applied to the covariance matrix, $X^T X$, or column-means of the Outer Product cube
$$(X^T X)_{(p,p)} = V_{(p,r)}\, S^2_{(r,r)}\, V^T_{(r,p)}$$
S² : diagonal matrix of eigenvalues
V : loadings matrix
X*V : scores matrix
[Figure: image of the matrix $X^T X$ (p x p)]
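A short numerical check of the equivalence between the two decompositions, under the assumption of a small random, column-centred `X` (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 4))
X = X - X.mean(axis=0)                      # column-centred data

# SVD of the data matrix: X = U S V^T
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores_svd = U * s                          # U*S

# SVD of X^T X: X^T X = V S^2 V^T
V2, s2, _ = np.linalg.svd(X.T @ X)
scores_cov = X @ V2                         # X*V
```

The values in `s2` are the squares of those in `s`, and the two sets of scores agree up to the sign of each component, which is arbitrary in an SVD.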
Application of « Mean » Outer Product Analysis to real data (1)
Lignin-starch mixtures by TD-NMR
D.N. Rutledge, Food Control, (2001) 12(7), 437-445
[Figure: TD-NMR signals of the samples and the Outer Product matrices built from them]
Column-means of Outer Products
[Figure: matrix of the column-means of the Outer Products]
Decomposition of the matrix by SVD (Principal Components Analysis)
[Figure: X*V, scores on PC2, PC3 & PC4; V, loadings on PC2, PC3 & PC4]
Application of « Mean » Outer Product Analysis to real data (2)
Retrogradation of starch by X-ray diffraction
B. Jaillais, M.A. Ottenhof, I.A. Farhat, D.N. Rutledge, Vib. Spec. (2006), 40, 10–19.
[Figure: X-ray diffraction signals of the samples]
Column-means of Outer Products
[Figure: matrix of the column-means of the Outer Products]
Decomposition of the matrix by SVD (Principal Components Analysis)
[Figure: X*V, scores on PC2; V, loadings on PC2]
« Unfold » Outer Product Analysis
Unfold the cube to form a matrix
[Diagram: the cube of n Outer Product matrices (p x p) unfolded into a matrix with n rows and p x p columns]
Analyse the unfolded individual covariance matrices
n-PLS, n-PCA (ANOVA) …
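A minimal sketch of the unfold, decompose and refold sequence, assuming random data and a plain PCA by SVD on the column-centred unfolded matrix (the slides do not specify the pre-processing, so the centring is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 4
X = rng.random((n, p))                    # hypothetical data

# Cube of outer products, then unfolding: one row of length p*p per sample
cube = np.einsum('ij,ik->ijk', X, X)      # (n, p, p)
X1 = cube.reshape(n, p * p)               # (n, p*p)

# PCA of the unfolded matrix by SVD (column-centring is an assumption)
X1c = X1 - X1.mean(axis=0)
U, s, Vt = np.linalg.svd(X1c, full_matrices=False)
scores = U * s                            # one score per sample and component

# Each loading vector of length p*p is refolded into a p x p matrix
loading_pc1 = Vt[0].reshape(p, p)
```

Because every row of `X1` comes from a symmetric matrix, the refolded loading is symmetric as well, and it can be displayed as an image in the same way as an Outer Product matrix.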
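Different data unfolding schemes
[Diagram: the cube X, with modes n, p and q, can be unfolded in three ways to give the matrices X1 (n rows, p x q columns), X2 and X3; a PCA is applied to each unfolded matrix (3 x PCA)]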
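The three unfoldings can be sketched with `transpose` and `reshape`. The assignment of X2 to the p mode and X3 to the q mode is an assumption made here, since the diagram labels did not survive, and the small cube is invented.

```python
import numpy as np

rng = np.random.default_rng(9)
n, p, q = 3, 4, 2
cube = rng.random((n, p, q))                        # hypothetical OP cube

# X1: one row per sample, columns run over all (p, q) pairs
X1 = cube.reshape(n, p * q)

# X2 (assumed): one row per variable of the first signal
X2 = cube.transpose(1, 0, 2).reshape(p, n * q)

# X3 (assumed): one row per variable of the second signal
X3 = cube.transpose(2, 0, 1).reshape(q, n * p)

# Where element (i, j, k) of the cube ends up in each unfolding
i, j, k = 2, 1, 1
positions = (X1[i, j * q + k], X2[j, i * q + k], X3[k, i * p + j])
```

All three entries of `positions` are the same cube element, which shows that the unfoldings only rearrange the data.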
Application of unfolded OP to real data (1)
Lignin-starch mixtures by TD-NMR: unfolded OP matrix (X1), n rows by p x p columns
[Figure: unfolded OP matrix (X1)]
Decompose the unfolded OP matrix (X1) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC2, PC3 & PC4; V, refolded loadings (p x p) of X1 on PC2, PC3 & PC4]
Decompose the unfolded OP matrices (X1 & X2) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC2, PC3 & PC4; U*S, scores of X2 on PC2, PC3 & PC4]
Decomposition of the matrix by SVD (Column-mean PCA)
[Figure: X*V, scores on PC2, PC3 & PC4; V, loadings on PC2, PC3 & PC4]
Decompose the unfolded OP matrices (X1 & X2) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC4; U*S, scores of X2 on PC4]
Decompose the unfolded OP matrix (X2) by SVD (Unfold-PCA)
[Figure: U*S, scores of X2 (X3) on PC4; V, refolded loadings of X2 (X3) on PC4 (axes p and n)]
Application of unfolded OP to real data (2)
Starch retrogradation by XRD: unfolded OP matrix (X1), n rows by p x p columns
[Figure: unfolded OP matrix (X1)]
Decompose the unfolded OP matrix (X1) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC2; V, refolded loadings (p x p) of X1 on PC2]
Decompose the unfolded OP matrices (X1 & X2) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC2; U*S, scores of X2 on PC2]
Decompose the unfolded OP matrix (X2) by SVD (Unfold-PCA)
[Figure: U*S, scores of X2 on PC2; V, refolded loadings of X2 on PC2 (axes p and n)]
« Multi-way » Outer Product Analysis
Group them together, one under the other, in the form of a cube of individual matrices of covariances among variables
[Diagram: the n Outer Product matrices (p x p) stacked into a cube]
Decomposition of the cube: PARAFAC
PARAFAC – Parallel Factor Analysis
3-way data X (n,q,p) :
$$x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk}$$
F is the number of Factors used in the PARAFAC model.
This model minimises the sum of squared residuals.
[Diagram: the cube X (n, q, p) written as a sum of F rank-one terms, each the product of three loading vectors of lengths n, q and p: = + + …]
R. Bro, Chemometrics and Intelligent Laboratory Systems, (1997), 38, 149-171
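The PARAFAC model equation can be written out directly in NumPy. This sketch only builds a cube from given loadings and evaluates the residual sum of squares. It does not fit the model, which is usually done by alternating least squares. The loadings `A`, `B`, `C` and the noise `E` are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
n, q, p, F = 4, 3, 5, 2

# Hypothetical loadings for the three modes, F factors each
A = rng.random((n, F))      # a_if, sample mode
B = rng.random((q, F))      # b_jf, second mode
C = rng.random((p, F))      # c_kf, third mode

# Model part: sum over f of a_if * b_jf * c_kf
model = np.einsum('if,jf,kf->ijk', A, B, C)          # shape (n, q, p)

# Add small residuals e_ijk to simulate measured data
E = 0.01 * rng.standard_normal((n, q, p))
X = model + E

# The quantity minimised when fitting: sum of squared residuals
sse = np.sum((X - model) ** 2)
```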
PARAFAC applied to OP cube
Starch retrogradation by XRD
Cube of individual covariance matrices
[Figure: loadings on the 1° mode (samples), factors 1 and 2, against sample number, shown alongside time]
[Figure: loadings on the 2° mode (XRD) and loadings on the 3° mode (XRD), factors 1 and 2, against variable number]
Starch Data : PARAFAC Model
Comparison PARAFAC / SVD
[Figure: OP-PARAFAC loadings compared with SVD on $X^T X$ (= PCA), for the variable mode and for the sample mode, factors 1 and 2]
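Calculate the matrix of individual covariances between variables of 2 different matrices X & Y
$C_i = x_i^T \cdot y_i$
[Diagram: for sample i, the transposed signal $x_i^T$ (p x 1, Signal 1) multiplied by the signal $y_i$ (1 x q, Signal 2) gives a p x q matrix with elements (1,1) to (p,q); X is n x p and Y is n x q]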
Visualisation of the Outer Product cube
For n samples, one gets n Outer Product matrices
Group them together in the form of a “cube”
Calculate the column-mean of the individual covariance matrices to give the matrix of covariances between the 2 groups of variables
Apply SVD
[Diagram: the n Outer Product matrices (p x q) stacked into a cube, then averaged over the samples]
Calculate the matrix of covariances of 2 matrices X & Y
n OP (p, q) matrices → 1 “cube” (n, p, q) → 1 mean matrix (p, q)
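The same steps for two data tables can be sketched as follows, with random placeholders for `X` (Signal 1) and `Y` (Signal 2):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, q = 5, 4, 3
X = rng.random((n, p))      # hypothetical Signal 1: n samples, p variables
Y = rng.random((n, q))      # hypothetical Signal 2: same n samples, q variables

# n outer products (p x q), grouped into one cube (n, p, q)
cube = np.einsum('ij,ik->ijk', X, Y)

# Mean over the samples: one (p, q) matrix
mean_op = cube.mean(axis=0)

# Equivalent direct calculation
direct = X.T @ Y / n
```

`mean_op` equals `direct`, that is (1/n) X^T Y, which is the matrix of covariances between the two groups of variables when both tables are column-centred.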
Analyse the links between 2 tables of data, X & Y
Singular Value Decomposition of the matrix of covariances between the 2 groups of variables, $(1/n) X^T Y$
That decomposition of the matrix $(1/n) X^T Y$ corresponds to looking for successive pairs of variables ($t_h = X a_h$, $u_h = Y b_h$) where:
- the covariance between $t_h$ and $u_h$ is maximal,
- the axes $a_h$ are orthogonal,
- the axes $b_h$ are orthogonal.
Decompose the « Mean » OP matrix by SVD ≡ Tucker Analysis
L. Tucker, Psychometrika, (1958), 23, 111-136
SVD applied to the covariance matrix, $X^T Y$, or column-means of the Outer Product cube
$$X^T Y = V_X\, S\, V_Y^T$$
S : diagonal matrix of singular values
VX and VY : X & Y loadings matrices
X*VX : scores of X
Y*VY : scores of Y
[Figure: image of the matrix $X^T Y$]
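A small numerical illustration of this decomposition, assuming random column-centred tables; the names `T_scores` and `U_scores` are chosen here for the scores of X and of Y:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, q = 20, 5, 4
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X = X - X.mean(axis=0)                  # column-centre both tables
Y = Y - Y.mean(axis=0)

# Matrix of covariances between the two groups of variables
C = X.T @ Y / n                         # (p, q)

# SVD: C = VX S VY^T
VX, s, VYt = np.linalg.svd(C, full_matrices=False)
VY = VYt.T

# Scores of X and of Y on the successive pairs of axes
T_scores = X @ VX                       # columns t_h = X a_h
U_scores = Y @ VY                       # columns u_h = Y b_h

# Covariance of each pair (t_h, u_h), with divisor n
cov_pairs = np.sum(T_scores * U_scores, axis=0) / n
```

The covariance of each pair equals the corresponding singular value in `s`, and the columns of `VX` and of `VY` are orthonormal.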
Application of « Mean » Outer Product Analysis to real data (3)
Complexation between TPP & Cu
D.N. Rutledge, A.S. Barros, F. Gaudard, Mag. Res. in Chemistry, 35 (1997), 13–21
[Figure: TD-NMR signals and Vis spectra of the samples]
Column-means of Outer Products
[Figure: matrix of the column-means of the Outer Products of the TD-NMR and Vis signals]
SVD on matrix of column-means of Outer Product (Tucker Analysis)
% SVD of the mean Outer Product matrix: loadings for the NMR and Vis blocks
[uRMN, sRMN_Vis, vVis] = svd(meanRMN_Vis, 'econ');
[Figure: columns of uRMN and of vVis]
% Scores: project each data table onto its own loadings
sRMN = RMN * uRMN / (uRMN' * uRMN);
sVis = Vis * vVis / (vVis' * vVis);
[Figure: columns of sRMN and of sVis]
Application of unfolded OP to real data (3)
Complexation between TPP & Cu: unfolded OP matrix (X1), n rows by p x q columns
[Figure: unfolded OP matrix (X1)]
Decompose the unfolded OP matrix (X1) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC1 & PC2; V, loadings of X1 on PC1 & PC2, refolded (axes q and p)]
Decompose the unfolded OP matrices (X1 & X2) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC1 & PC2; U*S, scores of X2 on PC1 & PC2]
Decompose the unfolded OP matrix (X2) by SVD (Unfold-PCA)
[Figure: U*S, scores of X2 on PC1 & PC2; V, loadings of X2 on PC1 & PC2 (axes p and n)]
Decompose the unfolded OP matrices (X1 & X3) by SVD (Unfold-PCA)
[Figure: U*S, scores of X1 on PC1 & PC2; U*S, scores of X3 on PC1 & PC2]
Decompose the unfolded OP matrix (X3) by SVD (Unfold-PCA)
[Figure: U*S, scores of X3 on PC1 & PC2; V, loadings of X3 on PC1 & PC2 (axes q and n)]
« Multi-way » Outer Product Analysis
For n samples, one gets n Outer Product matrices
Group them together, one under the other, in the form of a cube of individual matrices of covariances among variables
[Diagram: the n Outer Product matrices (p x q) stacked into a cube]
Decomposition of the cube: PARAFAC
Cube of individual covariance matrices
PARAFAC applied to OP cube of real data (2)
Complexation between TPP & Cu
[Figure: « Loadings » on the 1° mode (samples), factors 1 and 2, against sample number, shown alongside the concentrations of Cu, TPP and Cu + TPP]
[Figure: loadings on the 2° mode (TD-NMR) and loadings on the 3° mode (Vis)]
PARAFAC applied to OP cube of real data (3)
Fructose solutions
D.N. Rutledge, A.S. Barros, R. Giangiacomo, Magnetic Resonance in Food Science: A View to the Future, RSC, 2001, pp. 179–192
[Figure: IR spectra and NMR signals of the fructose solutions]
Cube of individual covariance matrices
PARAFAC model
[Figure: loadings on the 1° mode, factors 1 and 2, against sample number, shown alongside the fructose concentrations]
[Figure: loadings on the 2° mode (TD-NMR) and loadings on the 3° mode (NIR), factors 1 and 2]
PARAFAC on 4-D OP hypercube MIR NMR XRD (1)
Starch retrogradation
(9 x 157 x 40 x 341)
[Figure: Log(TD-NMR), MIR and XRD signals of the samples]
[Figure: loadings on the 1° mode (Samples), factors 1 and 2, against sample number]
[Figure: loadings on the 2° mode (MIR)]
[Figure: loadings on the 3° mode (NMR)]
[Figure: loadings on the 4° mode (XRD), factors 1 and 2]
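A 4-way Outer Product array can be built by extending the product to three signals per sample. The sketch below uses tiny random tables in place of the MIR, NMR and XRD data. With the real dimensions (9 x 157 x 40 x 341) the hypercube would hold about 19 million values.

```python
import numpy as np

rng = np.random.default_rng(6)
n, a, b, c = 3, 4, 2, 5                 # toy sizes, not the real ones
MIR = rng.random((n, a))                # hypothetical stand-ins for the three tables
NMR = rng.random((n, b))
XRD = rng.random((n, c))

# For each sample i, the product of every MIR, NMR and XRD intensity
hypercube = np.einsum('ia,ib,ic->iabc', MIR, NMR, XRD)   # shape (n, a, b, c)
```

Each element is the product of one intensity from each table for the same sample, for example `hypercube[1, 2, 1, 3] == MIR[1, 2] * NMR[1, 1] * XRD[1, 3]`.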
Comparison of 2D-Correlation Spectroscopy and unfold OP PCA
38 NIR spectra of water; acquired in the region 1300-1600 nm; from 6 to 80 ºC
V.H. Segtnan et al., Anal. Chem. (2001), 73, 31-53
B. Jaillais et al., Vib. Spec., (2005), 39, 1, 50-58
[Figure: absorbance against wavelength (nm) for the spectra from 6 ºC to 80 ºC]
2D-COS Sync vs PC1 Loadings
[Figure: synchronous correlation against wavelength (nm), with features marked at 1412 and 1491-1493 nm]
2D-COS Async vs PC2 Loadings
[Figure: asynchronous correlation against wavelength (nm), with features marked at 1404, 1428 and 1446 nm]
ss 2D-COS vs. Loadings of PCA on transposed row-normalised, column-centred spectra
[Figure: synchronous correlation and PC1 against sample temperature (ºC), from 6 ºC to 80 ºC]
[Figure: asynchronous correlation and PC2 against sample temperature (ºC), from 6 ºC to 80 ºC, with points marked at (6,38), (54,80) and (80,6)]
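The link between the two approaches can be sketched numerically. The definitions used here are the standard ones of generalised 2D correlation spectroscopy (dynamic spectra and the Hilbert-Noda matrix); they are not given on the slides and are an assumption of this example, as is the random data.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 10, 6                            # hypothetical: n ordered spectra, p wavelengths
X = rng.random((n, p))
Xd = X - X.mean(axis=0)                 # dynamic spectra (mean spectrum removed)

# Synchronous spectrum: sum of the outer products of the centred spectra / (n - 1)
sync = Xd.T @ Xd / (n - 1)

# Hilbert-Noda matrix: N[j, k] = 1 / (pi * (k - j)) for j != k, 0 on the diagonal
idx = np.arange(n)
diff = idx[None, :] - idx[:, None]      # k - j
N = np.zeros((n, n))
N[diff != 0] = 1.0 / (np.pi * diff[diff != 0])

# Asynchronous spectrum: depends on the order of the samples through N
async_ = Xd.T @ N @ Xd / (n - 1)
```

The synchronous matrix is simply the covariance matrix of the spectra, so its SVD gives the PCA loadings. The asynchronous matrix is antisymmetric and changes if the samples are reordered, which is why 2D-COS needs sorted samples.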
Unfold PCT-OP-PCA (or PLS) algorithm for huge X & Y

Step  Computation                Comments
1     X, Y                       input of X and Y matrices
2     [TX, PX] = PCA(X)          full rank PCA on X
3     [TY, PY] = PCA(Y)          full rank PCA on Y
4     K = OP(TX, TY)             unfolded outer product between TX and TY
5     [T, PPCT] = PCA(K)         PCA (or PLS etc.) on K
                                 T = scores in the X-space & PC-space
                                 PPCT = loadings in PC-space
6     for a = 1:n                rebuild PC loadings in X-space
        refold PPCT(a)
        P(a) = PY PPCT(a) PX^T
      end for

A.S. Barros & D.N. Rutledge, Chemom. Intell. Lab. Syst., (2004), 73, 245–255
A.S. Barros & D.N. Rutledge, Chemom. Intell. Lab. Syst., (2005), 78, 125–137
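The six steps can be sketched in NumPy under some simplifying assumptions: PCA is done by a plain SVD without centring, the data are random, and the helper `full_rank_pca` is hypothetical. The direct decomposition of the full unfolded Outer Product matrix is included only to check the result on this small example.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p, q = 6, 30, 20
X = rng.standard_normal((n, p))             # step 1: input X and Y
Y = rng.standard_normal((n, q))

def full_rank_pca(M):
    """Full-rank PCA by SVD (no centring): returns scores and loadings."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U * s, Vt.T

TX, PX = full_rank_pca(X)                   # step 2: TX (n, rX), PX (p, rX)
TY, PY = full_rank_pca(Y)                   # step 3: TY (n, rY), PY (q, rY)
rX, rY = TX.shape[1], TY.shape[1]

# Step 4: unfolded outer product of the scores, n rows by rX*rY columns
K = np.einsum('ia,ib->iab', TX, TY).reshape(n, rX * rY)

# Step 5: PCA on K
U, s, Vt = np.linalg.svd(K, full_matrices=False)
T = U * s                                   # scores
# Step 6: refold each loading to (rY, rX) and rebuild it in the original space
P = [PY @ Vt[a].reshape(rX, rY).T @ PX.T for a in range(len(s))]   # each (q, p)

# Check only: PCA of the full unfolded OP matrix (n rows by p*q columns)
K_full = np.einsum('ij,ik->ijk', X, Y).reshape(n, p * q)
_, s_full, Vt_full = np.linalg.svd(K_full, full_matrices=False)
direct_pc1 = Vt_full[0].reshape(p, q).T     # (q, p), same layout as P[0]
```

In this example the compressed matrix `K` has 36 columns instead of 600, yet its singular values match those of `K_full` and the rebuilt loading `P[0]` matches `direct_pc1` up to sign.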
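Segmented PCT-PCA (or PLS)
[Diagram: X is split into segments X1, X2, …, Xq; a PCA of each segment gives scores T1, T2, …, Tq and loadings P1, P2, …, Pq (PX1, PX2, …, PXq); the concatenated scores [T1 T2 … Tq] are analysed by PCA/PLS to give TPCT and PPCT, with TPCT = TX]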
OPA vs. 2D-COS, PCA and Tucker Analysis
• Why use the mean? (PCA & TA are in a sense compromises)
• Not limited to two data sets
• Cube analysable by unfolding or by multi-way methods
• Multi-way methods extract Factors
• Unfold-OPA can reveal relations between variables
  - not limited to two matrices
  - no need to sort samples (unlike 2D-COS)
• No memory problem with (segmented) PCT-OPA