Optimization of Chromatogram Alignment Using a Class Separability Criterion
Gopal Yalla
Department of Mathematics and Computer Science
Department of Chemistry
College of the Holy Cross
April 28, 2015
Gopal Yalla (Holy Cross) Analysis of Biofuels April 28, 2015 1 / 38
Outline
1 Introduction to Chromatography
2 Theory and Techniques
3 Experimental Data
4 Data Preprocessing
5 Results
6 Extended Results
7 Acknowledgements
Gas Chromatography
The gas chromatograph (GC) is the main instrument used for separating the components of a mixture.
Two phases: mobile phase and stationary phase
Mass Spectrometry
The mass spectrometer (MS) identifies the amount and type of chemicals present in a sample.
Components are ionized and separated according to mass.
The mass spectrum is a definite pattern of the number of ions present at each mass level.
Chromatograms
GC + MS produces chromatograms.
x-axis displays retention time in the GC column
y-axis displays molecular abundance in sample
Chromatographic Data Analysis
Peak Area Extraction
→ Judgement of number and type of chemical components must be made by the user.
→ Straightforward, but time consuming.
→ Sacrifices interesting trends.
→ Difficult with complex data...
Peak Area Extraction (Con’t)
Alignment Issue
When dealing with multiple samples, fluctuations in peak height and peak location occur.
Without peak location alignment, trends determined by chemometric methods will be skewed or meaningless.
Alignment Techniques
Correlation Optimized Warping (COW): Given two parameters, segment size ($m$) and max warp ($t$), a chromatogram $P$ is aligned to a target chromatogram $T$.
Dynamic Programming: Solves combinatorial optimization problems.
COW uses two matrices, $F$ and $U$, of size $(S + 1) \times (L + 1)$.
COW Algorithm
Correlation Optimized Warping (COW): Given two parameters, segment size ($m$) and max warp ($t$), a chromatogram $P$ is aligned to a target chromatogram $T$.
Choice of target chromatogram is based on the similarity index,
$$SI_j = \prod_{n=1}^{N} |r(\mathbf{x}_j, \mathbf{x}_n)|,$$
where $r(\cdot, \cdot)$ represents Pearson's correlation coefficient.
What is the optimal choice of COW parameters?
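The similarity index is straightforward to compute once the chromatograms are stacked as rows of a matrix. The sketch below assumes the target is the chromatogram with the largest index, which the slide does not state explicitly; `similarity_index` and `choose_target` are hypothetical names.

```python
import numpy as np

def similarity_index(X):
    """SI_j = product over n of |r(x_j, x_n)| for each row x_j of X."""
    R = np.corrcoef(X)                      # N x N Pearson correlations between rows
    return np.prod(np.abs(R), axis=1)

def choose_target(X):
    """Index of the chromatogram most similar to all others (assumed selection rule)."""
    return int(np.argmax(similarity_index(X)))
```

Because the product includes the term n = j, which equals 1, it does not affect the ranking.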
Nomenclature and Terminology
$a$ = scalars
$\mathbf{a}$ = column vectors
$\mathbf{A}$ = data matrices
Row index $n$ corresponds to sample chromatogram
Column index $m$ corresponds to retention time
$M$ total retention times
$N$ total chromatograms
$N_k$ total chromatograms in the $k$th class
$K$ total classes
$\mathbf{x}^{(Q)}_{kn}$ is the $n$th chromatogram in the $k$th class processed with correction method $Q$.
Alignment Metrics: Warping Effect
Warping Effect = Simplicity + Peak Factor
Simplicity ([0, 1]): How close is data to rank 1 matrix
$$\text{simplicity} = \sum_{r=1}^{R} \left( \mathrm{SVD}\left( \mathbf{X} \Big/ \sqrt{\sum_{k=1}^{K} \sum_{n=1}^{N_k} \sum_{m=1}^{M} x_{knm}^2} \right) \right)^4$$
Peak Factor ([0, 1]): How much the shape and peak area of chromatograms have been changed by warping
$$\text{peak factor} = \frac{1}{N} \sum_{k=1}^{K} \sum_{n=1}^{N_k} \left(1 - \min(c_{kn}, 1)^2\right)$$
where $c_{kn} = \left| \dfrac{\|\mathbf{x}^{(COW)}_{kn}\| - \|\mathbf{x}_{kn}\|}{\|\mathbf{x}_{kn}\|} \right|$ represents a relative error between aligned and unaligned chromatogram.
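Both parts of the warping effect can be computed directly from the data matrices. The sketch below reads the simplicity formula as the sum of the fourth powers of the singular values of the matrix scaled to unit Frobenius norm, and it assumes simplicity is evaluated on the aligned data; the function names are hypothetical.

```python
import numpy as np

def simplicity(X):
    """Sum of fourth powers of the singular values of X scaled to unit Frobenius norm."""
    Xn = X / np.sqrt(np.sum(X ** 2))
    s = np.linalg.svd(Xn, compute_uv=False)
    return float(np.sum(s ** 4))

def peak_factor(X_aligned, X_raw):
    """Mean of 1 - min(c, 1)^2, with c the relative change in each row's norm."""
    norm_raw = np.linalg.norm(X_raw, axis=1)
    c = np.abs((np.linalg.norm(X_aligned, axis=1) - norm_raw) / norm_raw)
    return float(np.mean(1.0 - np.minimum(c, 1.0) ** 2))

def warping_effect(X_aligned, X_raw):
    """Simplicity of the aligned data plus the peak factor (rows are chromatograms)."""
    return simplicity(X_aligned) + peak_factor(X_aligned, X_raw)
```

After the scaling, the squared singular values sum to 1, so the simplicity equals 1 only when a single singular value carries all of the variation, that is, when the matrix has rank 1.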
Alignment Metric: Hotelling Trace Criterion
HTC incorporates both within-class and between-class variation in the data set.
↑ HTC
↓ HTC
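Hotelling Trace Criterion
• Define the sample mean vector and sample covariance matrix for the $k$th class as:
$$\bar{\mathbf{x}}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} \mathbf{x}_{kn}, \qquad \mathbf{S}_k = \frac{1}{N_k - 1} \sum_{n=1}^{N_k} (\mathbf{x}_{kn} - \bar{\mathbf{x}}_k)(\mathbf{x}_{kn} - \bar{\mathbf{x}}_k)^t.$$
• Let $P_k = N_k/N$ be the probability of occurrence of class $k$. The grand mean vector is given by:
$$\bar{\bar{\mathbf{x}}} = \sum_{k=1}^{K} P_k \bar{\mathbf{x}}_k.$$
• The within-class scatter matrix and between-class scatter matrix are defined as:
$$\mathbf{S}_{wc} = \sum_{k=1}^{K} P_k \mathbf{S}_k, \qquad \mathbf{S}_{bc} = \sum_{k=1}^{K} P_k (\bar{\mathbf{x}}_k - \bar{\bar{\mathbf{x}}})(\bar{\mathbf{x}}_k - \bar{\bar{\mathbf{x}}})^t.$$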
Hotelling Trace Criterion (Con’t)
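The HTC is defined as:
$$J = \mathrm{tr}\left(\mathbf{S}_{wc}^{-1} \mathbf{S}_{bc}\right)$$
When $K = 2$, HTC reduces to the Mahalanobis distance
$$J = (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^t \mathbf{S}^{-1} (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)$$
When $K = 2$ and $M = 1$, HTC reduces to the square of a t-statistic
$$J = t^2 \left(\frac{2}{N}\right)$$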
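The two-class reduction follows from the scatter-matrix definitions in a few lines. The derivation below is a sketch: writing $\mathbf{d}$ for the difference of class means is a shorthand introduced here, as is $n$ for a common per-class sample size, and taking the pooled matrix to be $\mathbf{S}_{wc}$ is an assumption about what the slide's $\mathbf{S}$ denotes.

```latex
\begin{align*}
\mathbf{d} &= \bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2, \qquad
\bar{\bar{\mathbf{x}}} = P_1 \bar{\mathbf{x}}_1 + P_2 \bar{\mathbf{x}}_2, \qquad P_1 + P_2 = 1 \\
\bar{\mathbf{x}}_1 - \bar{\bar{\mathbf{x}}} &= P_2 \mathbf{d}, \qquad
\bar{\mathbf{x}}_2 - \bar{\bar{\mathbf{x}}} = -P_1 \mathbf{d} \\
\mathbf{S}_{bc} &= P_1 P_2^2 \, \mathbf{d}\mathbf{d}^t + P_2 P_1^2 \, \mathbf{d}\mathbf{d}^t
  = P_1 P_2 \, \mathbf{d}\mathbf{d}^t \\
J &= \mathrm{tr}\left(\mathbf{S}_{wc}^{-1} \mathbf{S}_{bc}\right)
  = P_1 P_2 \, \mathbf{d}^t \mathbf{S}_{wc}^{-1} \mathbf{d} \\
\text{for } M = 1,\ N_1 = N_2 = n: \quad
t^2 &= \frac{d^2}{s_p^2 \, (2/n)}
  \;\Longrightarrow\; \frac{d^2}{s_p^2} = t^2 \left(\frac{2}{n}\right)
\end{align*}
```

With these definitions the trace equals the Mahalanobis-type quadratic form up to the constant factor $P_1 P_2$, so maximizing one maximizes the other. In the one-dimensional, equal-size case the pooled variance $s_p^2$ equals $\mathbf{S}_{wc}$, and the quadratic form is the squared two-sample t-statistic scaled by $2/n$.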
Experimental Data
5 Classes of Biodiesel:
Soy (6 different samples)
Canola (3 different samples)
Tallow (3 different samples)
Waste Grease (2 different samples)
Hybrid (1 sample)
Each sample tested in 3 different runs
45 Total Chromatograms
Chemical Structure:
FAMEs (fatty acid methyl esters)
Variable length of carbon chain and number of double bonds.
Sample Chromatogram:
Reaction Process:
Data Preprocessing: Timeline
1 Baseline Correction
2 COW Alignment
3 Normalization & Mean Centering
4 Principal Component Transformation
5 Computed Metrics
Baseline Problem
Need to correct for non-linear increase in baseline caused by:
Gradual increase in oven temperature
Column bleeding
Baseline Correction
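Use asymmetric least squares smoothing to determine the baseline vector $\mathbf{b}'$ that minimizes
$$f(\mathbf{b}') = \|\mathbf{w}^t(\mathbf{b}' - \mathbf{x}_{kn})\|^2 + \lambda \|\mathbf{D}\mathbf{b}'\|^2$$
$\mathbf{w}$ is a vector of weights
$\lambda$ is a relaxation parameter
$\mathbf{D}$ is a second difference matrix
$\|\cdot\|$ is the Euclidean norm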
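For fixed weights, the baseline has a closed-form solution. The sketch below reads the first term as the weighted residual sum of squares, sum_i w_i (b_i - x_i)^2, which is the usual form of this smoother and an assumption about the slide's notation. Setting the gradient to zero gives the linear system (W + lambda D^t D) b = W x with W = diag(w). Dense matrices are used for brevity, and the function names are hypothetical.

```python
import numpy as np

def second_difference_matrix(M):
    """(M-2) x M matrix whose rows compute b[i] - 2*b[i+1] + b[i+2]."""
    return np.diff(np.eye(M), n=2, axis=0)

def weighted_baseline(x, w, lam):
    """Minimize sum_i w_i (b_i - x_i)^2 + lam * ||D b||^2 for fixed weights w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    D = second_difference_matrix(len(x))
    A = np.diag(w) + lam * (D.T @ D)
    return np.linalg.solve(A, w * x)
```

Points with weight 0 (peaks) do not pull on the fit, so the smoothness penalty interpolates the baseline underneath them. For chromatograms with thousands of retention times a sparse solver would be the practical choice.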
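Baseline Correction: Finding Peaks
• Let
$$\mathbf{x}_{kn} = \mathbf{s} + \mathbf{b} + \boldsymbol{\epsilon}$$
where $\mathbf{s}$ is true peak height, $\mathbf{b}$ is true smooth baseline, and $\boldsymbol{\epsilon}$ is normal random error with small deviation $\sigma_\epsilon$.
• Let $m_i$ be the median of the points in $\mathbf{x}_{kn}$ over an appropriate window centered at time index $i$.
→ $\mathbf{m} \approx \mathbf{b}$
→ $\mathbf{x}_{kn} \approx \mathbf{b} + \boldsymbol{\epsilon}$
→ $\sigma_\epsilon \approx 1.4826 \times \mathrm{median}(|\mathbf{x}_{kn} - \mathbf{m}|)$.
$$w_i = \begin{cases} 0 & \text{if } |x_{kni}| > m_i \pm 2\sigma_\epsilon \\ 1 & \text{if } |x_{kni}| \le m_i \pm 2\sigma_\epsilon \end{cases}$$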
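The weight rule can be sketched with a running median and a robust noise estimate. This reading treats a point as part of a peak when it lies more than 2 sigma from the local median, that is, |x_i - m_i| > 2 sigma, which is an interpretation of the slide's condition. The window length of 51 points and the edge padding are illustrative assumptions, and `baseline_weights` is a hypothetical name.

```python
import numpy as np

def baseline_weights(x, window=51, n_sigma=2.0):
    """Return (w, m, sigma): 0/1 baseline weights, running median, noise estimate."""
    x = np.asarray(x, dtype=float)
    half = window // 2                       # window is assumed to be odd
    padded = np.pad(x, half, mode="edge")    # repeat end values so every window is full
    m = np.array([np.median(padded[i:i + window]) for i in range(len(x))])
    # 1.4826 * MAD estimates the standard deviation of normally distributed noise
    sigma = 1.4826 * np.median(np.abs(x - m))
    w = (np.abs(x - m) <= n_sigma * sigma).astype(float)
    return w, m, sigma
```

The window has to be wider than the broadest peak, otherwise the running median follows the peak instead of the baseline.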
Baseline Correction: Results
Using $\mathbf{b}'$ to estimate $\mathbf{b}$ gives
$$\mathbf{x}^{(BC)}_{kn} = \mathbf{x}_{kn} - \mathbf{b}' \approx \mathbf{s} + \boldsymbol{\epsilon}$$
Normalization and Mean Centering
• Each chromatogram $\mathbf{x}^{(BC,COW)}_{kn}$ should be normalized to account for variations in injection volume.
$$\mathbf{x}^{(BC,COW,NORM)}_{kn} = \frac{\bar{A}}{A_{kn}} \cdot \mathbf{x}^{(BC,COW)}_{kn}$$
where $A_{kn}$ represents the total area of each chromatogram, and $\bar{A}$ is the average total area of all chromatograms.
• Each chromatogram should be mean centered to the origin.
$$\mathbf{x}^{(BC,COW,NORM,MC)}_{kn} = \mathbf{x}^{(BC,COW,NORM)}_{kn} - \bar{\mathbf{x}}^{(BC,COW,NORM)}$$
where $\bar{\mathbf{x}}^{(BC,COW,NORM)}$ is the sample mean chromatogram.
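Both steps are one-liners on a matrix whose rows are chromatograms. The sketch below approximates the total area by the sum of intensities, which assumes evenly spaced retention times; `normalize_and_center` is a hypothetical name.

```python
import numpy as np

def normalize_and_center(X):
    """Scale each row to the average total area, then subtract the mean chromatogram."""
    X = np.asarray(X, dtype=float)
    areas = X.sum(axis=1)                          # total area of each chromatogram
    X_norm = X * (areas.mean() / areas)[:, None]   # every row now has area mean(areas)
    return X_norm - X_norm.mean(axis=0)            # mean centering over samples
```

For example, two chromatograms that differ only by a factor of 2 in injection volume become identical after normalization, so both rows are zero after centering.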
Principal Component Analysis
HTC was evaluated on the principal component transformed data.
Let $\mathbf{S}$ represent the sample covariance matrix of the entire set of preprocessed data, with eigenvalue decomposition:
$$\mathbf{S} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^t$$
Then $\mathbf{y}_{kn}$, the vector of PCs, is computed via the transformation
$$\mathbf{y}_{kn} = \mathbf{U}^t \mathbf{x}^{(BC,COW,NORM,MC)}_{kn}$$
Eigenvalues correspond to how much variation is explained by each PC.
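The transformation can be sketched with a symmetric eigendecomposition of the sample covariance matrix. The input is assumed to be the mean-centered data with one chromatogram per row, and components are ordered by decreasing eigenvalue; `pca_scores` is a hypothetical name.

```python
import numpy as np

def pca_scores(Xc):
    """Return (Y, eigvals, U) where row n of Y is (U^t x_n)^t."""
    S = np.cov(Xc, rowvar=False)            # M x M sample covariance matrix
    eigvals, U = np.linalg.eigh(S)          # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending
    eigvals, U = eigvals[order], U[:, order]
    Y = Xc @ U                              # scores on every principal component
    return Y, eigvals, U
```

The fraction of variation explained by each PC is `eigvals / eigvals.sum()`. When there are far fewer chromatograms than retention times, an SVD of the data matrix gives the same scores without forming the full covariance matrix.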
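HTC Evaluated on PCs
• Let $\mathbf{z}_{kn} = (y_{kn1}, y_{kn2}, \cdots, y_{knL})^t$ denote the $L \times 1$ vector corresponding to the first $L$ PCs of $\mathbf{y}_{kn}$. The sample mean vector and sample covariance matrix for the $k$th class are given respectively by
$$\bar{\mathbf{z}}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} \mathbf{z}_{kn}, \qquad \mathbf{S}_k = \frac{1}{N_k - 1} \sum_{n=1}^{N_k} (\mathbf{z}_{kn} - \bar{\mathbf{z}}_k)(\mathbf{z}_{kn} - \bar{\mathbf{z}}_k)^t.$$
• The grand mean vector is given by
$$\bar{\bar{\mathbf{z}}} = \sum_{k=1}^{K} P_k \bar{\mathbf{z}}_k.$$
• The within-class scatter matrix and between-class scatter matrix are defined as:
$$\mathbf{S}_{wc} = \sum_{k=1}^{K} P_k \mathbf{S}_k, \qquad \mathbf{S}_{bc} = \sum_{k=1}^{K} P_k (\bar{\mathbf{z}}_k - \bar{\bar{\mathbf{z}}})(\bar{\mathbf{z}}_k - \bar{\bar{\mathbf{z}}})^t.$$
• HTC is given by
$$J = \mathrm{tr}\left(\mathbf{S}_{wc}^{-1} \mathbf{S}_{bc}\right)$$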
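The definitions above translate almost line for line into code. The sketch below takes the scores on the first L PCs as rows of `Z` together with a class label per row; it assumes every class has at least two chromatograms so that each class covariance is defined. `hotelling_trace` is a hypothetical name.

```python
import numpy as np

def hotelling_trace(Z, labels):
    """J = tr(S_wc^{-1} S_bc) for the rows of Z grouped by labels."""
    Z = np.asarray(Z, dtype=float)
    if Z.ndim == 1:
        Z = Z[:, None]                      # treat a single PC as an N x 1 matrix
    labels = np.asarray(labels)
    N, L = Z.shape
    class_means, priors = [], []
    S_wc = np.zeros((L, L))
    for k in np.unique(labels):
        Zk = Z[labels == k]
        P_k = len(Zk) / N                   # P_k = N_k / N
        S_k = np.atleast_2d(np.cov(Zk, rowvar=False))   # divides by N_k - 1
        S_wc += P_k * S_k
        class_means.append(Zk.mean(axis=0))
        priors.append(P_k)
    class_means = np.array(class_means)
    priors = np.array(priors)
    grand_mean = priors @ class_means       # sum_k P_k * zbar_k
    diffs = class_means - grand_mean
    S_bc = (priors[:, None] * diffs).T @ diffs
    return float(np.trace(np.linalg.solve(S_wc, S_bc)))
```

As a quick check, take one PC and two classes with scores (0, 2) and (4, 6): each class variance is 2, so S_wc = 2, the class means 1 and 5 lie 2 units from the grand mean 3, so S_bc = 4, and J = 2. With scores from a PCA step stored in a matrix `Y`, the call would be `hotelling_trace(Y[:, :L], labels)`.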
Computed Metrics
Density Plots for Warp Effect & HTC:
Results: PC1 vs. PC2, PC1 vs. PC3, PC2 vs. PC3
Max Warp Effect: (26,15)
Max HTC (1 PC): (64,3)
Max HTC (2 PC): (55,8)
Max HTC (3 PC): (70,6)
soy (∘), canola (⋄), tallow (□), waste grease (∗), hybrid (+).
Summary of Results
Based on our data, HTC leads to better alignment than warping effect
→ Greater Euclidean distance between class means
Ratios for Segment Length/Max Warp (55,8) to (26,15)
Class          Soy    Canola  Tallow  Waste Grease
Soy            0      -       -       -
Canola         1.18   0       -       -
Tallow         1.13   1.09    0       -
Waste Grease   1.22   1.16    1.12    0
→ Smaller within-class variation.
Ratios for Segment Length/Max Warp (55,8) to (26,15)
Class          1st Major Axis  2nd Major Axis
Soy            0.94            0.92
Canola         1.06            0.80
Tallow         0.86            1.30
Waste Grease   0.68            0.68
Clear parametric distinction.
Project Milestone!
1 Published Work in Journal of Chemometrics
Soares, Edward J., Yalla, Gopal R., O’Connor, John B., Walsh, Kevin A., and Hupp, Amber M. (2015), Hotelling trace criterion as a figure of merit for the optimization of chromatogram alignment, J. Chemometrics, 29, pages 200-212.
More Complex Data: Biodiesel-Diesel Blends
210 chromatograms with three different attributes
→ Feedstock: Pure Diesel, Soy, Canola, IRE Tallow, Texas Tallow, Waste Grease
→ Diesel Type: Flynn, Hess, Shell, Sunoco
→ Blend Ratio: B2, B5, B10, B20
Diesel Results
Before Alignment:
After Alignment and Optimization:
Classification
B10 Biodiesel Samples
                     Shell    Sunoco
Texas Tallow   12    5 (∘)    5 (*)
IRE Tallow     12    5 (∘)    5 (*)
Broader Impact
1 Determine chemical components that contribute the most to the energy content of fuel
→ Create synthetic biomaterial with energy content?
2 Forensic / Environmental Concerns
→ Determine origins and consequences of an oil spill
Future Work
1 Algorithmic Development
→ COW has very long computation time.
→ No parametric pattern
2 Larger Sample Size for HTC Results
Acknowledgements
Thank you for listening!
Professor Amber Hupp
Professor Kevin Walsh
Colette Houssan
Mike Comiskey
Department of Mathematics & Computer Science
Department of Chemistry
Acknowledgements
Journal of Chemometrics
University Syringe Program Grant from Hamilton Company (AMH).
Robert L. Ardizzone Fund for Junior Faculty Excellence (AMH).
College of the Holy Cross.
National Institute of Standards and Technology (NIST, Gaithersburg, MD)
Western Dubuque Biodiesel
ADM Company
Keystone Biofuels
TMT Biofuels
Texas Green Manufacturing
Iowa Renewable Energy
Thank you Professor Soares!
Couldn’t have done it without you Sauce!