In the name of
GOD
Zeinab Mokhtari
1-Mar-2010
In data analysis, many situations arise where plotting and visualization are helpful or an absolute requirement for understanding.
Plotting = visualization = graphing
Cartesian plotting: plotting some measured result against some parameter in a Cartesian co-ordinate system.
Scatter plots: the entries from two vectors of the same size are plotted pairwise in the Cartesian co-ordinate system.
Geographical maps
Satellite images
Cartesian plotting
Contour plots
Scatter plots
Line plots
Images
Bar plots
Loading plots
Score plots
Biplots
Joint plots
Latent variable methods
Unit-free plots: the items plotted against each other (scores, loadings) are based on the same measured data but projected differently.
Principal component analysis: orthogonal scores, orthonormal loadings
the co-ordinate systems used for the score plots
PCA
PLS regression
factor analysis
PARAFAC…
scores and loadings
PLOTTING IN COMPONENT MODELS
Sign inversion
The sign of a PCA component is arbitrary: its scores and loadings can be mirrored (multiplied by -1) together without changing the model.
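The sign inversion can be checked numerically. The following is a minimal sketch, assuming PCA is computed by singular value decomposition of column mean-centered data; the data matrix is synthetic and the helper `flip_component` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Synthetic data for illustration only (not taken from the slides).
rng = np.random.default_rng(0)
D = rng.normal(size=(6, 3))
D_mc = D - D.mean(axis=0)              # column mean-centering

# PCA via SVD: D_mc = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(D_mc, full_matrices=False)

def flip_component(U, Vt, a):
    """Return copies of U and Vt with the sign of component `a` inverted in both."""
    U2, Vt2 = U.copy(), Vt.copy()
    U2[:, a] *= -1.0                   # mirror the scores of component a
    Vt2[a, :] *= -1.0                  # mirror the loadings of component a
    return U2, Vt2

U_f, Vt_f = flip_component(U, Vt, 0)

# Flipping scores and loadings together leaves the reconstruction unchanged.
same_model = np.allclose((U * s) @ Vt, (U_f * s) @ Vt_f)
# Flipping only the scores changes the reconstruction.
scores_only = np.allclose((U * s) @ Vt, (U_f * s) @ Vt)
```

Because both signs describe the same model, two programs (or two runs) may return mirrored score plots for the same data; the plots should be read as equivalent.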
Figure 1. The mean-centered data
Figure 2. The score plot after PCA
Figure 3. The normalized score plot
[Figure panels, axis ticks omitted: panels titled D, u, u*s, v and v*s, repeated for D-mc1, D-mc2 and D-mc]
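The panel titles u, u*s, v and v*s read like the parts of a singular value decomposition of the (mean-centered) data. As a rough sketch under that assumption, the quantities behind a score plot and a normalized score plot could be computed as follows; the data are synthetic and all variable names are illustrative.

```python
import numpy as np

# Synthetic two-variable data for illustration only.
rng = np.random.default_rng(1)
D = rng.uniform(0.0, 1.5, size=(20, 2))

D_mc = D - D.mean(axis=0)                      # mean-centered data
U, s, Vt = np.linalg.svd(D_mc, full_matrices=False)

normalized_scores = U                          # "u": unit-length score vectors
scores = U * s                                 # "u*s": scores in the units of the data
loadings = Vt.T                                # "v": orthonormal loadings
scaled_loadings = loadings * s                 # "v*s": loadings scaled by singular values

# The scores are the centered data projected onto the loadings.
projected = D_mc @ loadings
```

Plotting the first column of `scores` against the second gives a score plot; doing the same with `normalized_scores` gives a plot in which every component has unit length, so the relative importance of the components is no longer visible on the axes.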
PLOTTING IN PARTIAL LEAST SQUARES REGRESSION
a linear relationship with slope one
Outliers, non-linearity, grouping of data…
The singular value is distributed equally among the u and v parts (scores and loadings) for the purpose of forming new variables h and g to be plotted.
The special cases of c = 0 and c = 1:
c = 1: row metric-preserving version
Euclidean distances between objects and Mahalanobis distances between variables
c = 0: column metric-preserving version
Euclidean distances between variables and Mahalanobis distances between objects
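The equations for h and g did not survive on the slide. The sketch below assumes the commonly used biplot factorization h = u·s^c and g = v·s^(1-c), which is consistent with the special cases listed above; it does not cover the zoom factor z. The data are synthetic, and `biplot_coordinates` and `pairwise_distances` are hypothetical helper names.

```python
import numpy as np

def biplot_coordinates(X, c=0.5):
    """Split the singular values between scores and loadings.

    Assumed convention: H = U * s**c, G = V * s**(1 - c), so that H @ G.T == X.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    H = U * s**c
    G = Vt.T * s**(1.0 - c)
    return H, G

def pairwise_distances(M):
    """Euclidean distances between all pairs of rows of M."""
    diff = M[:, None, :] - M[None, :, :]
    return np.sqrt((diff**2).sum(axis=-1))

# Synthetic, column mean-centered data: 8 objects, 4 variables.
rng = np.random.default_rng(2)
X = rng.normal(size=(8, 4))
X = X - X.mean(axis=0)

# c = 0.5: singular values distributed equally; the product still rebuilds X.
H, G = biplot_coordinates(X, c=0.5)
rebuilds = np.allclose(H @ G.T, X)

# c = 1: Euclidean distances between objects (rows) are preserved in H.
H1, G1 = biplot_coordinates(X, c=1.0)
rows_preserved = np.allclose(pairwise_distances(H1), pairwise_distances(X))

# c = 0: Euclidean distances between variables (columns) are preserved in G.
H0, G0 = biplot_coordinates(X, c=0.0)
cols_preserved = np.allclose(pairwise_distances(G0), pairwise_distances(X.T))
```

The distance checks hold here because all components are kept; with only the first two components plotted, the distances are approximated rather than reproduced exactly.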
For almost equal number of objects and variables
Biplots
A compensation for number of objects (I) and variables (K) is made by introducing a fudge or zoom factor z:
Biplots can be expanded to the use of three-way loadings, especially for Tucker3 models. Then they get the name joint plots.
Principal component analysis (PCA), a versatile and easy-to-use multivariate mathematical-statistical method, is suitable for the elucidation of the similarities and dissimilarities among the columns and rows of two-dimensional data matrices, but it cannot be employed for the evaluation of arrays of higher dimensions.
Three-way analysis by PARAFAC
not orthogonal loadings
N-WAY TOOLBOX
[Diagram residue: loading vectors a, b and c, drawn for two components]
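As an illustration of the trilinear structure behind PARAFAC, each element of the array is modeled as x_ijk = Σ_f a_if · b_jf · c_kf. The sketch below only builds an array from given loadings (it does not fit a model, and it does not use the N-way Toolbox); the sizes and the random loadings are arbitrary assumptions.

```python
import numpy as np

# Arbitrary sizes and random loadings, for illustration only.
rng = np.random.default_rng(3)
I, J, K, F = 4, 3, 5, 2                  # array dimensions and number of components
A = rng.normal(size=(I, F))              # first-mode loadings (columns a_f)
B = rng.normal(size=(J, F))              # second-mode loadings (columns b_f)
C = rng.normal(size=(K, F))              # third-mode loadings (columns c_f)

# x_ijk = sum_f a_if * b_jf * c_kf
X = np.einsum("if,jf,kf->ijk", A, B, C)

# Equivalent view: a sum of F rank-one arrays, one triad (a_f, b_f, c_f) each.
X_sum = sum(
    np.multiply.outer(np.multiply.outer(A[:, f], B[:, f]), C[:, f])
    for f in range(F)
)

# The loading vectors are not constrained to be orthogonal.
a_inner_product = A[:, 0] @ A[:, 1]
```

Since nothing forces `a_inner_product` to be zero, the loading vectors do not form an orthogonal co-ordinate system, which matters when they are used as axes in a scatter plot.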
Tucker3 model or three-way PCA
analysis of the similarities and dissimilarities among N-dimensional data arrays
The Tucker3 model computes three orthogonal matrices with lower dimensions than the original data array, in such a manner that the variance explained by the reduced matrices is as high as possible.
N-WAY TOOLBOX
[Diagram residue: data array X, loading matrices A, B and C, and core array G]
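To make the roles of A, B, C and G concrete, here is a minimal sketch that uses a truncated higher-order SVD as a stand-in. This is not the alternating least squares fit that a Tucker3 routine would normally perform, so it is generally not the variance-optimal solution, but it yields orthogonal loading matrices and a core of the same shape. The array contents, array shape and component numbers are arbitrary assumptions, and the function names are hypothetical.

```python
import numpy as np

def unfold(X, mode):
    """Matricize a three-way array so that `mode` indexes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd_tucker3(X, ranks):
    """Truncated higher-order SVD: orthogonal A, B, C and core G (not an ALS fit)."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(U[:, :r])             # leading r left singular vectors
    A, B, C = factors
    # Core: project X onto the three orthogonal bases.
    G = np.einsum("ijk,ip,jq,kr->pqr", X, A, B, C)
    return G, A, B, C

def reconstruct(G, A, B, C):
    """x_ijk = sum_pqr g_pqr * a_ip * b_jq * c_kr"""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

# Random three-way array, for illustration only.
rng = np.random.default_rng(4)
X = rng.normal(size=(7, 5, 15))

G, A, B, C = hosvd_tucker3(X, ranks=(3, 2, 3))
X_hat = reconstruct(G, A, B, C)

# Fraction of the total sum of squares reproduced by the reduced model.
explained = 1.0 - np.sum((X - X_hat) ** 2) / np.sum(X ** 2)

# Keeping all components in every mode reproduces the array exactly.
G_full, A_full, B_full, C_full = hosvd_tucker3(X, ranks=X.shape)
exact = np.allclose(reconstruct(G_full, A_full, B_full, C_full), X)
```

The columns of A, B and C can be plotted pairwise as scatter plots for the respective mode, while G describes how the components of the three modes interact.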
Cluster analysis
The reduction of the dimensionality of multidimensional arrays
Projection of the points scattered in the multidimensional space onto a plane, in such a manner that the distances among the points on the plane are as similar as possible to their distances in the multidimensional space
The objectives of the study were to measure the microbiological effect of benzimidazolium salts containing various anions, and to apply a combination of the Tucker3 model and cluster analysis to evaluate how the microbiological effect depends on the type of test organism, the chemical structure of the free benzimidazolium base and the type of cation.
The free base and the salts formed with Cl⁻, SO₄²⁻, PO₄³⁻ and NO₃⁻
Species tested for the microbiological activity (altogether 15 species)
The Tucker3 model has been employed for the three-dimensional data matrix consisting of the inhibitory activity of seven benzimidazole derivatives (factor I), the presence and type of anion (factor II) and the 15 test organisms (factor III) (3-way array with dimensions 7, 5, 15).
Arrays of the largest possible dimensions (6, 4, 14): total variance explained 99.75%
Arrays explaining more than 0.28% of the total variance (in this case 3, 2, 3): total variance explained 98.05%
Fig. 1. Plot of the first two elements of component matrix I
Fig. 2. Cluster dendrogram of component matrix I
The distribution of benzimidazole derivatives is highly similar in both figures.
Similarity and dissimilarity of microbiological activity
Fig. 3. Plot of component matrix II
The presence of sulfate anion may have a considerable impact on the biological efficacy of benzimidazole derivatives.
Fig. 4. Plot of the first two elements of component matrix III
Fig. 5. Cluster dendrogram of component matrix III
It can be concluded from the results that a Tucker3 model combined with cluster analysis can be successfully used for the study of the microbiological activity of benzimidazolium salts and for separating the effects of the type of benzimidazole derivative and of the salt-forming anion.
Five different breads were baked in replicates giving a total of ten samples. Eight different judges assessed the breads with respect to eleven different attributes. The data can be regarded as a three-way array (10 × 11 × 8) or alternatively as an ordinary two-way matrix (10 × 88).
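The two views of the bread data differ only in how the same numbers are arranged. A small sketch of the unfolding, assuming the axis order samples × attributes × judges and a column ordering in which the judges vary fastest (both are assumptions; the slide gives only the dimensions), with placeholder values instead of real sensory scores:

```python
import numpy as np

n_samples, n_attributes, n_judges = 10, 11, 8

# Placeholder values standing in for the sensory scores (not real data).
X3 = np.arange(n_samples * n_attributes * n_judges, dtype=float).reshape(
    n_samples, n_attributes, n_judges
)

# Unfold: keep the sample mode as rows, combine attributes and judges into columns.
X2 = X3.reshape(n_samples, n_attributes * n_judges)

# With this ordering, column index = attribute * n_judges + judge.
sample, attribute, judge = 3, 4, 5
same_entry = X2[sample, attribute * n_judges + judge] == X3[sample, attribute, judge]

# Refolding restores the three-way array.
X_back = X2.reshape(n_samples, n_attributes, n_judges)
```

The 10 × 88 matrix can be analysed with ordinary PCA, whereas the 10 × 11 × 8 array keeps attributes and judges as separate modes for PARAFAC or Tucker3.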
Always enjoy life, no matter how hard it seems!
When life gives you a thousand reasons to cry, show the world that you have a million reasons to SMILE!