Optimization of Chromatogram Alignment Using a Class Separability Criterion

Gopal Yalla

Department of Mathematics and Computer Science
Department of Chemistry

College of the Holy Cross

April 28, 2015

Gopal Yalla (Holy Cross) Analysis of Biofuels April 28, 2015 1 / 38

Outline

1 Introduction to Chromatography
2 Theory and Techniques
3 Experimental Data
4 Data Preprocessing
5 Results
6 Extended Results
7 Acknowledgements

Gas Chromatography

The gas chromatograph (GC) is the main instrument used for separating the components of a mixture.

Two phases: mobile phase and stationary phase

Mass Spectrometry

The mass spectrometer (MS) identifies the amount and type of chemicals present in a sample.

Components are ionized and separated according to mass.

The mass spectrum is a definite pattern of the number of ions present at each mass level.

Chromatograms

GC + MS produces chromatograms.

x-axis displays retention time in the GC column

y-axis displays molecular abundance in sample

Chromatographic Data Analysis

Peak Area Extraction

→ Judgement of the number and type of chemical components must be made by the user.

→ Straightforward, but time consuming.

→ Sacrifices interesting trends.

→ Difficult with complex data...

Peak Area Extraction (Cont'd)

Alignment Issue

When dealing with multiple samples, fluctuations in peak height and peak location occur.

Without peak location alignment, trends determined by chemometric methods will be skewed or meaningless.

Alignment Techniques

Correlation Optimized Warping (COW): Given two parameters, segment size (m) and max warp (t), a chromatogram P is aligned to a target chromatogram T.

Dynamic programming: solves combinatorial optimization problems.

COW uses two matrices, F and U, of size (S + 1) × (L + 1).

COW Algorithm

Correlation Optimized Warping (COW): Given two parameters, segment size (m) and max warp (t), a chromatogram P is aligned to a target chromatogram T.

The choice of target chromatogram is based on the similarity index,

$$ SI_j = \prod_{n=1}^{N} \left| r(x_j, x_n) \right| , $$

where r(·, ·) represents Pearson's correlation coefficient.

What is the optimal choice of COW parameters?
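The similarity index can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used for the talk. It assumes the chromatogram with the largest SI is taken as the target, which the slide does not state explicitly. The function names (`pearson`, `similarity_index`, `choose_target`) and the toy data are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def similarity_index(chromatograms, j):
    """SI_j: product over all n of |r(x_j, x_n)|."""
    return math.prod(abs(pearson(chromatograms[j], x)) for x in chromatograms)

def choose_target(chromatograms):
    """Index of the chromatogram with the largest similarity index
    (assumption: the most representative chromatogram is the target)."""
    scores = [similarity_index(chromatograms, j) for j in range(len(chromatograms))]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data (hypothetical): one peak shifted by one time index per sample.
toy = [
    [1, 2, 3, 2, 1, 0, 0],
    [0, 1, 2, 3, 2, 1, 0],
    [0, 0, 1, 2, 3, 2, 1],
]
target = choose_target(toy)
```

In this toy set the middle chromatogram correlates best with both of its neighbours, so it is the one selected as the target.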

Nomenclature and Terminology

a (lowercase italic) = scalar

a (lowercase bold) = column vector

A (uppercase bold) = data matrix

Row index n corresponds to sample chromatogram

Column index m corresponds to retention time

M = total number of retention times

N = total number of chromatograms

N_k = total number of chromatograms in the kth class

K = total number of classes

$x_{kn}^{(Q)}$ is the nth chromatogram in the kth class processed with correction method Q.

Alignment Metrics: Warping Effect

Warping Effect = Simplicity + Peak Factor

Simplicity (range [0, 1]): how close the data matrix is to a rank-1 matrix

$$ \text{simplicity} = \sum_{r=1}^{R} \left( \mathrm{SVD}\left( X \Big/ \sqrt{\sum_{k=1}^{K} \sum_{n=1}^{N_k} \sum_{m=1}^{M} x_{knm}^{2}} \right) \right)^{4} $$

Here SVD(·) returns the singular values of the scaled data matrix, so simplicity is the sum of their fourth powers.

Peak Factor (range [0, 1]): how much the shape and peak area of the chromatograms have been changed by warping

$$ \text{peak factor} = \frac{1}{N} \sum_{k=1}^{K} \sum_{n=1}^{N_k} \left( 1 - \min(c_{kn}, 1)^{2} \right) $$

where

$$ c_{kn} = \left| \frac{\| x_{kn}^{(COW)} \| - \| x_{kn} \|}{\| x_{kn} \|} \right| $$

represents a relative error between the aligned and unaligned chromatogram.
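A minimal sketch of the peak factor, assuming the chromatograms of all classes are passed as one flat list (so the double sum over k and n becomes a single loop). The helper names and toy values are hypothetical.

```python
import math

def norm(x):
    """Euclidean norm of a sequence of intensities."""
    return math.sqrt(sum(v * v for v in x))

def peak_factor(unaligned, aligned):
    """Mean of 1 - min(c, 1)^2 over all chromatograms, where c is the
    relative change in Euclidean norm caused by warping."""
    total = 0.0
    for x, x_cow in zip(unaligned, aligned):
        c = abs((norm(x_cow) - norm(x)) / norm(x))
        total += 1.0 - min(c, 1.0) ** 2
    return total / len(unaligned)

# Hypothetical toy data: the first chromatogram is only shifted (norm kept),
# the second is heavily distorted by the warp (norm doubled).
raw = [[0.0, 3.0, 4.0, 0.0], [0.0, 3.0, 4.0, 0.0]]
warped = [[3.0, 4.0, 0.0, 0.0], [0.0, 6.0, 8.0, 0.0]]
pf = peak_factor(raw, warped)
```

A pure shift leaves the norm unchanged and contributes 1 to the average, while a warp that changes the norm by 100% or more contributes 0.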

Alignment Metric: Hotelling Trace Criterion

The Hotelling trace criterion (HTC) incorporates both within-class and between-class variation in the data set.

[Figure: example class distributions with high (↑) HTC and low (↓) HTC]

Hotelling Trace Criterion

• Define the sample mean vector and sample covariance matrix for the kth class as:

$$ \bar{x}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} x_{kn} , \qquad S_k = \frac{1}{N_k - 1} \sum_{n=1}^{N_k} (x_{kn} - \bar{x}_k)(x_{kn} - \bar{x}_k)^t . $$

• Let P_k = N_k/N be the probability of occurrence of class k. The grand mean vector is given by:

$$ \bar{\bar{x}} = \sum_{k=1}^{K} P_k \bar{x}_k . $$

• The within-class scatter matrix and between-class scatter matrix are defined as:

$$ S_{wc} = \sum_{k=1}^{K} P_k S_k , \qquad S_{bc} = \sum_{k=1}^{K} P_k (\bar{x}_k - \bar{\bar{x}})(\bar{x}_k - \bar{\bar{x}})^t . $$

Hotelling Trace Criterion (Cont'd)

The HTC is defined as:

$$ J = \mathrm{tr}\left( S_{wc}^{-1} S_{bc} \right) $$

When K = 2, HTC reduces to the Mahalanobis distance

$$ J = (\bar{x}_1 - \bar{x}_2)^t S^{-1} (\bar{x}_1 - \bar{x}_2) $$

When K = 2 and M = 1, HTC reduces to the square of a t-statistic, up to a constant factor that depends on the sample size N:

$$ J \propto t^2 $$
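The K = 2, M = 1 case can be checked numerically. The sketch below assumes scalar features and two equal-sized classes. Under the scatter definitions given on the previous slide, J then works out to t²/N, where t is the pooled two-sample t-statistic and N the total number of samples. The function names and data values are hypothetical.

```python
from statistics import mean, variance

def htc_1d(classes):
    """HTC J = S_bc / S_wc for scalar features (M = 1).

    `classes` is a list of lists, one list of values per class.
    """
    n_total = sum(len(c) for c in classes)
    priors = [len(c) / n_total for c in classes]        # P_k = N_k / N
    means = [mean(c) for c in classes]                  # class means
    grand = sum(p * m for p, m in zip(priors, means))   # grand mean
    # statistics.variance uses the N_k - 1 denominator, matching S_k.
    s_wc = sum(p * variance(c) for p, c in zip(priors, classes))
    s_bc = sum(p * (m - grand) ** 2 for p, m in zip(priors, means))
    return s_bc / s_wc

def pooled_t_squared(a, b):
    """Square of the pooled two-sample t-statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) ** 2 / (pooled * (1 / na + 1 / nb))

# Hypothetical values for two equal-sized classes.
a = [1.0, 2.0, 3.0, 4.0]
b = [4.0, 6.0, 8.0, 10.0]
J = htc_1d([a, b])
t2 = pooled_t_squared(a, b)
```

For these values, `J` and `t2 / 8` agree, which illustrates why maximizing HTC in this special case is equivalent to maximizing the t-statistic between the two classes.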

Experimental Data

5 Classes of Biodiesel:

Soy (6 different samples)
Canola (3 different samples)
Tallow (3 different samples)
Waste Grease (2 different samples)
Hybrid (1 sample)

Each sample was tested in 3 different runs.

45 Total Chromatograms

Chemical Structure:

FAMEs (fatty acid methyl esters)

Variable length of carbon chain and number of double bonds.

Reaction Process: [figure]

Sample Chromatogram: [figure]

Data Preprocessing: Timeline

1 Baseline Correction
2 COW Alignment
3 Normalization & Mean Centering
4 Principal Component Transformation
5 Computed Metrics

Baseline Problem

Need to correct for the non-linear increase in baseline caused by:

Gradual increase in oven temperature

Column bleeding

Baseline Correction

Use asymmetric least squares smoothing to determine the baseline vector b′ that minimizes

$$ f(b') = \| w^t (b' - x_{kn}) \|^2 + \lambda \| D b' \|^2 $$

w is a vector of weights
λ is a relaxation parameter
D is a second difference matrix
‖·‖ is the Euclidean norm

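Baseline Correction: Finding Peaks

• Let

$$ x_{kn} = s + b + \epsilon $$

where s is the true peak height, b is the true smooth baseline, and ε is normal random error with small standard deviation σ_ε.

• Let m_i be the median of the points in x_kn over an appropriate window centered at time index i, and let m be the vector of these medians.

→ $m \approx b$

→ $x_{kn} \approx b + \epsilon$

→ $\sigma_\epsilon \approx 1.4826 \times \mathrm{median}(|x_{kn} - m|)$.

$$ w_i = \begin{cases} 0 & \text{if } |x_{kni}| > m_i \pm 2\sigma_\epsilon \\ 1 & \text{if } |x_{kni}| \le m_i \pm 2\sigma_\epsilon \end{cases} $$

Points outside the band m_i ± 2σ_ε are treated as peaks and receive zero weight; points inside it are treated as baseline.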
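The weight selection can be sketched as follows. This is an illustration under stated assumptions, not the implementation from the talk. It reads the band condition as |x_i − m_i| > 2σ_ε, and it truncates the median window at the edges of the signal. The function names, window size and toy signal are hypothetical.

```python
from statistics import median

def running_median(x, half_window):
    """Median of x over a window of up to 2*half_window + 1 points
    centered at each index (the window is truncated at the edges)."""
    n = len(x)
    return [median(x[max(0, i - half_window):min(n, i + half_window + 1)])
            for i in range(n)]

def baseline_weights(x, half_window):
    """Return (weights, sigma): weight 0 marks a peak point, 1 a baseline point."""
    m = running_median(x, half_window)
    # Robust noise estimate: 1.4826 * median absolute deviation from m.
    sigma = 1.4826 * median(abs(xi - mi) for xi, mi in zip(x, m))
    # Assumed reading of the band condition: outside m_i +/- 2*sigma means peak.
    weights = [0 if abs(xi - mi) > 2 * sigma else 1 for xi, mi in zip(x, m)]
    return weights, sigma

# Hypothetical toy signal: flat noisy baseline near 1.0 with one peak at index 5.
signal = [1.0, 1.1, 0.9, 1.0, 1.2, 9.0, 1.0, 0.8, 1.1, 1.0, 0.9]
weights, sigma = baseline_weights(signal, half_window=2)
```

Only the point at index 5 falls outside the band, so it is the only one excluded from the baseline fit. The factor 1.4826 makes the median absolute deviation a consistent estimate of the standard deviation for normally distributed noise.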

Baseline Correction: Results

Using b′ to estimate b gives

$$ x_{kn}^{(BC)} = x_{kn} - b' \approx s + \epsilon $$

Normalization and Mean Centering

• Each chromatogram $x_{kn}^{(BC,COW)}$ should be normalized to account for variations in injection volume:

$$ x_{kn}^{(BC,COW,NORM)} = \frac{\bar{A}}{A_{kn}} \cdot x_{kn}^{(BC,COW)} $$

where $A_{kn}$ represents the total area of each chromatogram, and $\bar{A}$ is the average total area of all chromatograms.

• Each chromatogram should be mean centered to the origin:

$$ x_{kn}^{(BC,COW,NORM,MC)} = x_{kn}^{(BC,COW,NORM)} - \bar{x}^{(BC,COW,NORM)} $$

where $\bar{x}^{(BC,COW,NORM)}$ is the sample mean chromatogram.
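Both steps are short enough to sketch directly. The example assumes equally spaced retention times, so that the total area can be approximated by the sum of intensities. The function names and toy data are hypothetical.

```python
def total_area(x):
    """Total area of a chromatogram, approximated as the sum of intensities
    (assumes equally spaced retention times)."""
    return sum(x)

def normalize(chromatograms):
    """Scale each chromatogram by (average total area) / (its own total area)."""
    areas = [total_area(x) for x in chromatograms]
    mean_area = sum(areas) / len(areas)
    return [[mean_area / a * v for v in x] for x, a in zip(chromatograms, areas)]

def mean_center(chromatograms):
    """Subtract the sample mean chromatogram from every chromatogram."""
    n = len(chromatograms)
    mean_chrom = [sum(col) / n for col in zip(*chromatograms)]
    return [[v - m for v, m in zip(x, mean_chrom)] for x in chromatograms]

# Hypothetical data: the second run is the first with twice the injection volume.
runs = [[0.0, 2.0, 4.0, 2.0], [0.0, 4.0, 8.0, 4.0]]
normalized = normalize(runs)
centered = mean_center(normalized)
```

After normalization the two runs coincide, which is the intended effect: a difference caused only by injection volume disappears before the data reach PCA.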

Principal Component Analysis

HTC was evaluated on the principal component transformed data.

Let S represent the sample covariance matrix of the entire set of preprocessed data, with eigenvalue decomposition:

$$ S = U \Lambda U^t $$

Then y_kn, the vector of PCs, is computed via the transformation

$$ y_{kn} = U^t x_{kn}^{(BC,COW,NORM,MC)} $$

Each eigenvalue corresponds to how much variation is explained by its PC.

HTC Evaluated on PCs

• Let $z_{kn} = (y_{kn1}, y_{kn2}, \dots, y_{knL})^t$ denote the L × 1 vector corresponding to the first L PCs of y_kn. The sample mean vector and sample covariance matrix for the kth class are given respectively by

$$ \bar{z}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} z_{kn} , \qquad S_k = \frac{1}{N_k - 1} \sum_{n=1}^{N_k} (z_{kn} - \bar{z}_k)(z_{kn} - \bar{z}_k)^t . $$

• The grand mean vector is given by

$$ \bar{\bar{z}} = \sum_{k=1}^{K} P_k \bar{z}_k . $$

• The within-class scatter matrix and between-class scatter matrix are defined as:

$$ S_{wc} = \sum_{k=1}^{K} P_k S_k , \qquad S_{bc} = \sum_{k=1}^{K} P_k (\bar{z}_k - \bar{\bar{z}})(\bar{z}_k - \bar{\bar{z}})^t . $$

• HTC is given by

$$ J = \mathrm{tr}\left( S_{wc}^{-1} S_{bc} \right) $$
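For L = 2 retained PCs the whole computation fits in plain Python, because a 2 × 2 matrix can be inverted in closed form. The sketch below is illustrative only. The function names and the PC scores are hypothetical, and a general implementation would use a linear algebra library for arbitrary L.

```python
def mean_vec(points):
    """Mean of a list of 2-D score vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def cov2(points):
    """2x2 sample covariance matrix (N_k - 1 denominator)."""
    m = mean_vec(points)
    n = len(points)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j] / (n - 1)
    return s

def htc_2pc(classes):
    """J = tr(S_wc^{-1} S_bc) for class score vectors z_kn with L = 2."""
    n_total = sum(len(c) for c in classes)
    priors = [len(c) / n_total for c in classes]          # P_k = N_k / N
    means = [mean_vec(c) for c in classes]
    grand = [sum(p * m[i] for p, m in zip(priors, means)) for i in range(2)]
    s_wc = [[0.0, 0.0], [0.0, 0.0]]
    s_bc = [[0.0, 0.0], [0.0, 0.0]]
    for p, c, m in zip(priors, classes, means):
        s_k = cov2(c)
        d = [m[0] - grand[0], m[1] - grand[1]]
        for i in range(2):
            for j in range(2):
                s_wc[i][j] += p * s_k[i][j]
                s_bc[i][j] += p * d[i] * d[j]
    # Closed-form inverse of the 2x2 within-class scatter matrix.
    det = s_wc[0][0] * s_wc[1][1] - s_wc[0][1] * s_wc[1][0]
    inv = [[s_wc[1][1] / det, -s_wc[0][1] / det],
           [-s_wc[1][0] / det, s_wc[0][0] / det]]
    # trace(inv @ s_bc) = sum over i, j of inv[i][j] * s_bc[j][i]
    return sum(inv[i][j] * s_bc[j][i] for i in range(2) for j in range(2))

# Hypothetical PC scores for two classes with identical spread.
class_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
class_b = [(4.0, 0.0), (6.0, 0.0), (4.0, 2.0), (6.0, 2.0)]
J = htc_2pc([class_a, class_b])
```

Moving the class means further apart, or tightening the spread within each class, increases J. That is the property the alignment optimization relies on when it compares COW parameter choices.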

Computed Metrics

Density Plots for Warp Effect & HTC: [figures]

Results: PC Score Plots

[Figures: score plots of PC1 vs. PC2, PC1 vs. PC3, and PC2 vs. PC3, comparing the alignment that maximizes warping effect with the alignments that maximize HTC; classes plotted: soy, canola, tallow, waste grease, hybrid]

COW parameters (segment length, max warp) that maximize each metric:

Max Warp Effect: (26,15)
Max HTC (1 PC): (64,3)
Max HTC (2 PCs): (55,8)
Max HTC (3 PCs): (70,6)

Summary of Results

Based on our data, HTC leads to better alignment than warping effect.

→ Greater Euclidean distance between class means.
Ratios for Segment Length/Max Warp (55,8) to (26,15):

Class          Soy    Canola   Tallow   Waste Grease
Soy            0      -        -        -
Canola         1.18   0        -        -
Tallow         1.13   1.09     0        -
Waste Grease   1.22   1.16     1.12     0

→ Smaller within-class variation.
Ratios for Segment Length/Max Warp (55,8) to (26,15):

Class          1st Major Axis   2nd Major Axis
Soy            0.94             0.92
Canola         1.06             0.80
Tallow         0.86             1.30
Waste Grease   0.68             0.68

Clear parametric distinction.

Project Milestone!

1 Published Work in Journal of Chemometrics

Soares Edward J., Yalla Gopal R., O’Connor John B., Walsh Kevin A., and Hupp Amber M. (2015), Hotelling trace criterion as a figure of merit for the optimization of chromatogram alignment, J. Chemometrics, 29, pages 200-212.

More Complex Data: Biodiesel-Diesel Blends

210 chromatograms with three different attributes

→ Feedstock: Pure Diesel, Soy, Canola, IRE Tallow, Texas Tallow, Waste Grease

→ Diesel Type: Flynn, Hess, Shell, Sunoco

→ Blend Ratio: B2, B5, B10, B20

Diesel Results

Before Alignment: [figure]

After Alignment and Optimization: [figure]

Classification

B10 Biodiesel Samples

Shell Sunoco

Texas Tallow 12 5 (¶) 5 (*)

IRE Tallow 12 5 (¶) 5 (*)

Broader Impact

1 Determine chemical components that contribute the most to the energy content of fuel

→ Create synthetic biomaterial with energy content?

2 Forensic / Environmental Concerns

→ Determine the origins and consequences of an oil spill

Future Work

1 Algorithmic Development

→ COW has very long computation time.

→ No parametric pattern

2 Larger Sample Size for HTC Results

Acknowledgements

Thank you for listening!

Professor Amber Hupp

Professor Kevin Walsh

Colette Houssan

Mike Comiskey

Department of Mathematics & Computer Science

Department of Chemistry

Acknowledgements

Journal of Chemometrics

University Syringe Program Grant from Hamilton Company (AMH)

Robert L. Ardizzone Fund for Junior Faculty Excellence (AMH)

College of the Holy Cross

National Institute of Standards and Technology (NIST, Gaithersburg, MD)

Western Dubuque Biodiesel

ADM Company

Keystone Biofuels

TMT Biofuels

Texas Green Manufacturing

Iowa Renewable Energy

Thank you Professor Soares!

Couldn’t have done it without you Sauce!
