In the name of GOD. Zeinab Mokhtari 1-Mar-2010 In data analysis, many situations arise where...

Date post: 14-Jan-2016

In the name of GOD


Zeinab Mokhtari

1-Mar-2010


In data analysis, many situations arise where plotting and visualization are helpful or an absolute requirement for understanding.

Plotting = visualization = graphing

Cartesian plotting: plotting some measured result against some parameter in a Cartesian co-ordinate system.

Scatter plots: the entries from two vectors of the same size are plotted pairwise in the Cartesian co-ordinate system.

Geographical maps
Satellite images
Cartesian plotting
Contour plots
Scatter plots
Line plots
Images
Bar plots
Loading plots
Score plots
Biplots
Joint plots


Latent variable methods: PCA, PLS regression, factor analysis, PARAFAC…

Unit-free plots: the items plotted against each other (scores, loadings) are based on the same measured data but projected differently.

Principal component analysis: orthogonal scores and orthonormal loadings. The orthonormal loadings define the co-ordinate systems used for the score plots.
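The orthogonality properties named on this slide can be checked numerically. The following is a minimal sketch, assuming PCA is computed by a singular value decomposition of column-centered data; the random data matrix and the names `T` (scores) and `P` (loadings) are illustrative and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))      # hypothetical data: 8 objects, 4 variables
Xc = X - X.mean(axis=0)          # column mean-centering

# PCA via SVD: Xc = U S V^T, scores T = U S, loadings P = V
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = U * s
P = Vt.T

loadings_gram = P.T @ P          # identity matrix: loadings are orthonormal
scores_gram = T.T @ T            # diagonal matrix: scores are orthogonal
```

Here `loadings_gram` is the identity, while `scores_gram` is diagonal with the squared singular values on its diagonal, so the scores are orthogonal but not of unit length.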


PLOTTING IN COMPONENT MODELS

Sign inversion

In PCA, the scores and loadings of a component are mirrored together: reversing the sign of both leaves the model unchanged.
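The sign inversion can be illustrated with a short sketch. It assumes an SVD-based PCA on hypothetical random data; the variable names are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))      # hypothetical data: 6 objects, 3 variables
Xc = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T, P = U * s, Vt.T               # scores and loadings

# Mirror the first component: flip its score column and its loading column
T_flipped, P_flipped = T.copy(), P.copy()
T_flipped[:, 0] *= -1
P_flipped[:, 0] *= -1

# Flipping both leaves the reconstructed data unchanged
same_model = np.allclose(T @ P.T, T_flipped @ P_flipped.T)
# Flipping only the scores changes the reconstructed data
only_scores = np.allclose(T @ P.T, T_flipped @ P.T)
```

Because only the product of scores and loadings is fixed by the data, two programs can return mirrored plots for the same data set and both be correct.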


Figure 1. The mean-centered data

Figure 2. The score plot after PCA

Figure 3. The normalized score plot
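The three figures correspond to three quantities that can be computed in a few lines. This is a sketch under assumptions: the small data matrix `D` is invented, and "normalized scores" is taken to mean the unit-length columns u of the SVD, as opposed to the scaled scores u*s.

```python
import numpy as np

# Hypothetical data: 4 objects measured on 2 variables
D = np.array([[1.0, 2.0],
              [2.0, 3.5],
              [3.0, 6.5],
              [4.0, 8.0]])

D_mc = D - D.mean(axis=0)        # mean-centered data (column means removed)

U, s, Vt = np.linalg.svd(D_mc, full_matrices=False)
scores = U * s                   # u*s: scores carrying the singular values
normalized_scores = U            # u: every column has unit length

column_means = D_mc.mean(axis=0)
score_lengths = np.linalg.norm(scores, axis=0)
normalized_lengths = np.linalg.norm(normalized_scores, axis=0)
```

In the scaled score plot the first axis is stretched by the first singular value, whereas in the normalized plot all axes have the same length, which changes the visual impression of the spread.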


Figure (image not recovered): panels D, u, u*s, v and v*s for the raw data, and the same panels for the centered data D-mc1, D-mc2 and D-mc.


PLOTTING IN PARTIAL LEAST SQUARES REGRESSION

a linear relationship with slope one

Outliers, non-linearity, grouping of data…


Biplots

The singular value is distributed equally among the u and v parts (scores and loadings) for the purpose of forming new variables h and g to be plotted.

The special cases of c = 0 and c = 1:

c = 1: row metric-preserving version. Euclidean distances between objects and Mahalanobis distances between variables.

c = 0: column metric-preserving version. Euclidean distances between variables and Mahalanobis distances between objects.

For an almost equal number of objects and variables
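The role of the exponent c can be sketched as follows. The convention used here (row markers U·S^c, column markers V·S^(1−c)) is an assumption, since the slides do not give the formula; the outputs are named `row_markers` and `col_markers` rather than h and g because the slides do not say which is which. The data are random and purely illustrative.

```python
import numpy as np

def biplot_coordinates(X, c, n_components=2):
    """Split the singular values between row and column markers.

    Assumed convention: rows get s**c, columns get s**(1 - c).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, V = U[:, :n_components], s[:n_components], Vt[:n_components].T
    return U * s**c, V * s**(1.0 - c)

def pairwise_distances(M):
    """Euclidean distances between all pairs of rows of M."""
    diff = M[:, None, :] - M[None, :, :]
    return np.sqrt((diff**2).sum(axis=-1))

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 2))      # hypothetical: 5 objects, 2 variables
X -= X.mean(axis=0)

rows_1, cols_1 = biplot_coordinates(X, c=1.0)        # row metric-preserving
rows_half, cols_half = biplot_coordinates(X, c=0.5)  # singular values split equally
```

For any c the product of row and column markers reproduces the (rank-reduced) data. With c = 1 the Euclidean distances between the row markers equal those between the objects in this two-variable example.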


A compensation for the number of objects (I) and variables (K) is made by introducing a fudge or zoom factor z:

Biplots can be expanded to the use of three-way loadings, especially for Tucker3 models. Then they get the name joint plots.


Principal component analysis (PCA), a versatile and easy-to-use multivariate mathematical–statistical method:

suitable for the elucidation of the similarities and dissimilarities among the columns and rows of two-dimensional data matrices

cannot be employed for the evaluation of arrays of higher dimensions


Three-way analysis by PARAFAC

not orthogonal loadings

N-WAY TOOLBOX

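Figure (image not recovered): PARAFAC decomposition diagram with loading vectors a, b and c for each component.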
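The trilinear structure behind the PARAFAC diagram can be written out in a few lines. This sketch only builds an array from given loadings, x_ijk = Σ_f a_if b_jf c_kf; it does not fit a model and does not use the N-way Toolbox. The sizes and the random loading matrices `A`, `B`, `C` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K, F = 4, 3, 2, 2          # hypothetical sizes and number of components
A = rng.normal(size=(I, F))      # loadings in the first mode
B = rng.normal(size=(J, F))      # loadings in the second mode
C = rng.normal(size=(K, F))      # loadings in the third mode

# x_ijk = sum over f of a_if * b_jf * c_kf
X = np.einsum("if,jf,kf->ijk", A, B, C)

# The same array as a sum of F rank-one outer products a_f o b_f o c_f
X_sum = sum(
    np.multiply.outer(np.multiply.outer(A[:, f], B[:, f]), C[:, f])
    for f in range(F)
)
```

Each component contributes one triple of vectors (a, b, c), which is why every mode has its own loading plot.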


Tucker3 model or three-way PCA

analysis of the similarities and dissimilarities among N-dimensional data arrays

The Tucker3 model computes three orthogonal matrices with lower dimensions than the original data array, in such a manner that the variance explained by the reduced matrices is as high as possible.

N-WAY TOOLBOX

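Figure (image not recovered): Tucker3 decomposition diagram with data array X, component matrices A, B and C, and core array G.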
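The relation between the data array, the three component matrices and the core array can be sketched as below. This only composes an array from given parts, x_ijk = Σ_pqr a_ip b_jq c_kr g_pqr, and does not fit a Tucker3 model or call the N-way Toolbox. The sizes 7 × 5 × 15 and 3 × 2 × 3 are borrowed from the benzimidazolium example later in these slides; all numerical values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(4)
I, J, K = 7, 5, 15               # data array dimensions
P, Q, R = 3, 2, 3                # core array dimensions

def orthonormal_columns(n_rows, n_cols):
    """Random matrix with orthonormal columns (via QR), for illustration."""
    q, _ = np.linalg.qr(rng.normal(size=(n_rows, n_cols)))
    return q

A = orthonormal_columns(I, P)
B = orthonormal_columns(J, Q)
C = orthonormal_columns(K, R)
G = rng.normal(size=(P, Q, R))   # core array

# x_ijk = sum over p, q, r of a_ip * b_jq * c_kr * g_pqr
X = np.einsum("ip,jq,kr,pqr->ijk", A, B, C, G)

# With orthonormal component matrices, projecting X back recovers the core
G_back = np.einsum("ip,jq,kr,ijk->pqr", A, B, C, X)
```

The core array G holds the interactions between the components of the three modes, which is what distinguishes Tucker3 from PARAFAC.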


Cluster analysis

The reduction of the dimensionality of multidimensional arrays

Projection of the points scattered in the multidimensional space onto a plane, in such a manner that the distances among the points on the plane are as similar as possible to those in the multidimensional space
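One way to obtain such a distance-preserving projection is classical multidimensional scaling. It is shown here only as an illustration of the idea; the slides do not state which projection method was used in the study, and the test points are invented.

```python
import numpy as np

def pairwise_distances(M):
    """Euclidean distances between all pairs of rows of M."""
    diff = M[:, None, :] - M[None, :, :]
    return np.sqrt((diff**2).sum(axis=-1))

def classical_mds(points, n_dims=2):
    """Map points to n_dims dimensions from their distance matrix."""
    sq = pairwise_distances(points) ** 2
    n = sq.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ sq @ centering      # double-centered matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]    # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

rng = np.random.default_rng(5)
flat = rng.normal(size=(6, 2))                    # 6 hypothetical planar points
# Embed the planar points in 5-D with a rotation, so a perfect 2-D map exists
rotation, _ = np.linalg.qr(rng.normal(size=(5, 5)))
high_dim = np.hstack([flat, np.zeros((6, 3))]) @ rotation

plane = classical_mds(high_dim)
```

In this constructed case the distances on the plane match the five-dimensional distances exactly; with real data they can only be approximated.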

The objectives of the study were to measure the microbiological effect of benzimidazolium salts containing various anions, and to apply the combination of the Tucker3 model and cluster analysis to evaluate how the microbiological effect depends on the type of test organism, the chemical structure of the free benzimidazolium base and the type of cation.


The free base and the salts formed with Cl⁻, SO₄²⁻, PO₄³⁻ and NO₃⁻


Species tested for the microbiological activity (altogether 15 species)


The Tucker3 model has been employed for the three-dimensional data matrix consisting of the inhibitory activity of seven benzimidazole derivatives (factor I), the presence and type of anion (factor II) and the 15 test organisms (factor III) (3-way array with dimensions 7, 5, 15).

Arrays of the largest possible dimensions (6, 4, 14)

The arrays explaining more than 0.28% of the total variance (in this case 3, 2, 3)


Total variance explained: 99.75%


Total variance explained: 98.05%


Fig. 1. Plot of the first two elements of component matrix I

Fig. 2. Cluster dendrogram of component matrix I

The distribution of benzimidazole derivatives is highly similar in both figures.

Similarity and dissimilarity of microbiological activity


Fig. 3. Plot of component matrix II

The presence of sulfate anion may have a considerable impact on the biological efficacy of benzimidazole derivatives.


Fig. 4. Plot of the first two elements of component matrix III

Fig. 5. Cluster dendrogram of component matrix III


It can be concluded from the results that a Tucker3 model combined with cluster analysis can be successfully used to study the microbiological activity of benzimidazolium salts, and that it separates the effects of the type of benzimidazole derivative and of the salt-forming anion.


Five different breads were baked in replicates giving a total of ten samples. Eight different judges assessed the breads with respect to eleven different attributes. The data can be regarded as a three-way array (10 × 11 × 8) or alternatively as an ordinary two-way matrix (10 × 88).
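The two views of the bread data differ only in how the same numbers are arranged. The sketch below unfolds a 10 × 11 × 8 array into a 10 × 88 matrix; the mode order (samples, attributes, judges) follows the sizes given above, the column ordering (attributes grouped by judge) is one possible choice, and the values are placeholders because the sensory scores themselves are not in the slides.

```python
import numpy as np

n_samples, n_attributes, n_judges = 10, 11, 8

# Placeholder values only; the real sensory scores are not available here
X3 = np.arange(n_samples * n_attributes * n_judges, dtype=float).reshape(
    n_samples, n_attributes, n_judges
)

# Unfold: samples stay as rows, every (judge, attribute) pair becomes a column
X2 = X3.transpose(0, 2, 1).reshape(n_samples, n_judges * n_attributes)

# Column index of a given judge and attribute in the unfolded matrix
judge, attribute = 3, 5
column = judge * n_attributes + attribute

# Folding back restores the original three-way array
X3_back = X2.reshape(n_samples, n_judges, n_attributes).transpose(0, 2, 1)
```

The 10 × 88 matrix can be analysed with ordinary PCA, while the 10 × 11 × 8 array keeps judges and attributes as separate modes for PARAFAC or Tucker3.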


Always enjoy life, no matter how hard it seems!

When life gives you a thousand reasons to cry, show the world that you have a million reasons to SMILE!

