
Lidwine Grosmaire, Pedro Maldonado-Alvarado, Christelle Reynès, Robert Sabatier, Dominique Dufour, Thierry Tran and Jean-Louis Delarbre

Joint Selection of Wavenumber Regions for MidIR and Raman Spectra and Variables in PLS Regression using Genetic Algorithms

The Context

Empirical and small-scale processing

Good breadmaking ability

Interest

Increased production and consumption of cassava

From the crop to the starch

2012 EFFoST Annual Meeting, 20-23 November 2012 • Montpellier, France

Varietal and process impacts on breadmaking ability

Standardize and scale-up the process

Physicochemical parameters

Improve product quality

Industrial development of new gluten-free bread products

The Aim

The Data

Physicochemical parameters Spectral data

Predicting the expansion ability

[Diagram: PLS1 regression of the response Y (52 × 1) on three explanatory blocks: X1, physicochemical parameters including RVA (52 × 13); X2, mid-infrared spectra (52 × 3351); X3, Raman spectra (52 × 4562). selGAmPLS selects the relevant parameters.]

[Figure: Raman spectrum. x-axis: Raman shift (cm-1), 500 to 3500; y-axis: intensity (counts).]

Breadmaking ability

Amylose content

RVA parameters (12 variables)

Mid-infrared spectra (3351 variables)

Raman spectra (4562 variables)
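
As a minimal sketch of how these blocks could be assembled into a single multitable, the snippet below concatenates three blocks of the sizes quoted above (13 physicochemical variables, i.e. amylose content plus 12 RVA parameters, 3351 mid-infrared variables and 4562 Raman variables). The sample count of 52 is read from the data diagram, the block names are hypothetical, and the values are random placeholders rather than the authors' data.

```python
import numpy as np

# Block sizes quoted on the poster; 52 samples is an assumption read from the
# data diagram. The block names are hypothetical and the values are random.
n_samples = 52
block_sizes = {"physicochemical": 13, "mid_ir": 3351, "raman": 4562}

rng = np.random.default_rng(0)
blocks = {name: rng.normal(size=(n_samples, p)) for name, p in block_sizes.items()}

# Concatenate the blocks column-wise into one multitable X.
X = np.hstack([blocks[name] for name in block_sizes])

# Keep the column range of each block so that a selection made on X
# can be mapped back to the block it came from.
offsets = {}
start = 0
for name, p in block_sizes.items():
    offsets[name] = (start, start + p)
    start += p

print(X.shape)
print(offsets)
```

The total number of columns, 13 + 3351 + 4562 = 7926, matches the number of variables used by the plain PLS model in Table 1.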

How can the breadmaking ability be explained from our data with a statistical regression method while selecting variables of different types: individual variables and intervals?

Explanatory variables are organized in a multitable in which intervals and individual variables are selected in order to predict one variable of interest: the breadmaking capacity.

A Genetic Algorithm (GA) was developed in a context of discrimination, jointly with the PLS1 method: this method is called selGAmPLS.
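
The poster does not describe the chromosome encoding, the genetic operators or the fitness function of selGAmPLS, so the following is only a sketch of the general idea and not the authors' method. It assumes a chromosome made of one bit per individual variable plus a fixed number of (start, width) intervals on a single spectral block, a fitness equal to the 5-fold cross-validated R² of a two-component PLS1 model, truncation selection with elitism, a simple crossover and small mutations. All names, sizes and settings are hypothetical, and the data are synthetic.

```python
import numpy as np

P_IND, P_SPEC = 5, 60          # toy sizes: individual variables, spectral points
N_INT, MAX_W = 3, 6            # intervals per chromosome, maximum interval width
POP, N_GEN = 20, 15            # population size, number of generations

def pls1_predict(Xtr, ytr, Xte, n_comp):
    """Fit PLS1 (NIPALS) on the training set and predict the test set."""
    x_mean, y_mean = Xtr.mean(axis=0), ytr.mean()
    E, f = Xtr - x_mean, ytr - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = E.T @ f
        if np.linalg.norm(w) < 1e-12:          # nothing left to explain
            break
        w = w / np.linalg.norm(w)
        t = E @ w
        p, c = E.T @ t / (t @ t), f @ t / (t @ t)
        E, f = E - np.outer(t, p), f - c * t   # deflation
        W.append(w); P.append(p); q.append(c)
    if not W:
        return np.full(len(Xte), y_mean)
    W, P = np.array(W).T, np.array(P).T
    beta = W @ np.linalg.solve(P.T @ W, np.array(q))
    return (Xte - x_mean) @ beta + y_mean

def cv_r2(X, y, n_comp=2, n_folds=5):
    """Cross-validated R2 with contiguous folds."""
    idx = np.arange(len(y))
    pred = np.empty(len(y))
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        pred[test] = pls1_predict(X[train], y[train], X[test], min(n_comp, X.shape[1]))
    return 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

def decode(ind):
    """Turn (bits, intervals) into a boolean column mask over [individual | spectral]."""
    bits, intervals = ind
    mask = np.zeros(P_IND + P_SPEC, dtype=bool)
    mask[:P_IND] = bits
    for start, width in intervals:
        mask[P_IND + start : P_IND + min(start + width, P_SPEC)] = True
    return mask

def random_individual(rng):
    bits = rng.random(P_IND) < 0.5
    starts = rng.integers(0, P_SPEC, size=N_INT)
    widths = rng.integers(1, MAX_W + 1, size=N_INT)
    return bits, np.column_stack([starts, widths])

def mutate(ind, rng):
    bits, intervals = ind[0].copy(), ind[1].copy()
    if rng.random() < 0.5:                     # flip one individual variable
        j = rng.integers(P_IND)
        bits[j] = not bits[j]
    else:                                      # shift and resize one interval
        k = rng.integers(N_INT)
        intervals[k, 0] = np.clip(intervals[k, 0] + rng.integers(-3, 4), 0, P_SPEC - 1)
        intervals[k, 1] = np.clip(intervals[k, 1] + rng.integers(-1, 2), 1, MAX_W)
    return bits, intervals

def run_ga(X, y, rng):
    fitness = lambda ind: cv_r2(X[:, decode(ind)], y)
    pop = [random_individual(rng) for _ in range(POP)]
    history = []                               # best fitness of each generation
    for _ in range(N_GEN):
        scores = np.array([fitness(ind) for ind in pop])
        history.append(scores.max())
        elite = [pop[i] for i in np.argsort(scores)[::-1][: POP // 2]]
        children = []
        while len(elite) + len(children) < POP:
            a, b = rng.choice(len(elite), size=2, replace=False)
            child = (elite[a][0].copy(), elite[b][1].copy())   # crossover
            children.append(mutate(child, rng))
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best), history

# Synthetic data: y depends on individual variable 0 and spectral points 20..24.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, P_IND + P_SPEC))
y = X[:, 0] + X[:, P_IND + 20 : P_IND + 25].sum(axis=1) + 0.1 * rng.normal(size=40)

best, best_score, history = run_ga(X, y, rng)
print("selected columns:", np.flatnonzero(decode(best)))
print("cross-validated R2:", round(best_score, 3))
```

Encoding the spectral part as intervals rather than as one bit per wavenumber keeps the selected regions contiguous, which is what makes them interpretable as spectral bands. Because the better half of each generation is carried over unchanged, the best fitness in this sketch cannot decrease from one generation to the next.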

The Results

Fig 2: Characteristics of the final GA populations: selected variables are indicated by black points

Table 1: Comparison with the results of other methods (number of selected variables, number of retained PLS components, R² and cross-validation R²).

The 10 final populations are quite close, indicating a global convergence of the GA.

Individual populations seem to have converged.

Method       # var   # comp   R²       R²CV
PLS          7926    7        0.7836   0.6605
PLS + VIP    4       3        0.7210   0.6650
selGAmPLS    311     12       0.9936   0.8273
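
For reference, the two quality measures in Table 1 are usually defined as follows. The poster does not state which cross-validation scheme was used, so the leave-one-out form shown here is an assumption; with K-fold cross-validation, the prediction for sample i comes from the model fitted without the fold that contains it.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2_{CV} = 1 - \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_{(-i)}\bigr)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Here y_i is the observed breadmaking ability of sample i, \bar{y} is its mean over the n samples, \hat{y}_i is the prediction of the model fitted on all samples, and \hat{y}_{(-i)} is the prediction of a model fitted without sample i.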

Physicochemical parameters

4 RVA parameters

Peak Viscosity
Holding Strength
Breakdown
Relative Breakdown

Spectroscopic variables

4 spectral regions

Crystalline order of starch

Selected variables

Conclusion

Genetic Algorithms provide a very adaptable and efficient solution when dealing with both several kinds of variable selection (individual variables vs. intervals) and multiway tables.

The results are promising for predictive use: selGAmPLS reaches a cross-validation R² of 0.8273, against 0.6605 for PLS on all 7926 variables (Table 1). In terms of interpretation, the method highlighted the importance of some physicochemical variables and selected a small number of short intervals in the spectroscopic data.

The selected data are related to the water absorptivity and the crystalline state of starch, and play a key role in breadmaking ability.

7926 variables

The Problem

Fig 1: Selection of relevant intervals and variables in multitables with a GA, for modelling Y with PLS
