Date post: 22-Dec-2015

Model-based Classification in Food Authenticity Studies

D. Toher 1,2, G. Downey 1 and T.B. Murphy 2

Presented by: Deirdre Toher

1 Ashtown Food Research Centre, Teagasc (formerly The National Food Centre), Dublin 15

2 Dept of Statistics, School of Computer Science and Statistics, Trinity College Dublin, Dublin 2


Outline

• Food authenticity

• Spectroscopic data

• Current mathematical methods

• Proposed alternative
  – Dimension reduction
  – Model-based clustering
  – Updating

• Example near-infrared data with results


Food Authenticity – what and why?

• Detecting when foods are not what they are claimed to be

• Tampering/adulteration, mislabelling

• Economic fraud worth millions of US dollars globally

• Promote quality products

• Build consumer trust


Food Authenticity – how?

• Near infrared spectroscopy
  – Non-invasive
  – Relatively inexpensive

• Multivariate Mathematics
  – Partial Least Squares Regression
  – Factorial Discriminant Analysis
  – Model-based Clustering

• Other methods available (sp..)


Spectroscopic Data

• Near infrared transflectance spectroscopy
  – High dimensional data
  – Range 1100-2498 nm, reading every 2 nm
  – 700 values for each sample
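The 700 values per sample follow directly from the stated range and step. A minimal check, with variable names chosen here purely for illustration:

```python
# Wavelength grid implied by the slide: 1100 nm to 2498 nm in 2 nm steps.
START_NM, STOP_NM, STEP_NM = 1100, 2498, 2

# range() excludes its end point, so extend it by one step to include 2498 nm.
wavelengths = list(range(START_NM, STOP_NM + STEP_NM, STEP_NM))

# (2498 - 1100) / 2 + 1 = 700 readings per spectrum
n_values = len(wavelengths)
print(n_values, wavelengths[0], wavelengths[-1])
```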


Current Mathematical Methods

• Discriminant Partial Least Squares Regression

• Factorial Discriminant Analysis

Problem?
  – Limited to “two-group” classification problems
  – No quantification of certainty


Proposed Alternative

Model-based clustering

– Expansion of discriminant analysis
– Allows clusters to vary in shape and size
– Gives probability of a sample being in each cluster/group
– Can classify situations with more than two groupings
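The points above can be illustrated with a small sketch. It is not the presenters' implementation: it assumes one Gaussian per group, works in a single dimension (so groups can differ in spread but not in shape), and uses made-up numbers and group names ("A", "B", "C"). It shows how fitting a separate mean and variance per group yields a membership probability for each of more than two groups.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def fit_groups(values, labels):
    """Estimate (proportion, mean, variance) for each labelled group."""
    params = {}
    for g in sorted(set(labels)):
        xs = [x for x, lab in zip(values, labels) if lab == g]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        params[g] = (len(xs) / len(values), mean, var)
    return params

def membership_probabilities(x, params):
    """Posterior probability that x belongs to each group (Bayes' rule)."""
    weighted = {g: p * normal_pdf(x, m, v) for g, (p, m, v) in params.items()}
    total = sum(weighted.values())
    return {g: w / total for g, w in weighted.items()}

# Made-up one-dimensional training data for three hypothetical groups.
values = [1.0, 1.4, 0.6, 1.2, 3.0, 3.5, 2.5, 3.2, 6.0, 6.8, 5.4, 6.2]
labels = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

params = fit_groups(values, labels)
probs = membership_probabilities(3.1, params)
best = max(probs, key=probs.get)
print(best, probs)
```

The returned dictionary sums to one, so each entry can be read as the estimated certainty of assigning the sample to that group.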


Possible Cluster Shapes


The Dimensionality Problem

• Model-based clustering requires dimension reduction
  – for efficient computation
  – to prevent singular covariance matrices

• Use wavelet analysis with thresholding
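As a rough illustration of wavelet analysis with thresholding, the sketch below uses the Haar wavelet and hard thresholding. Both are assumptions made here; the slides do not say which wavelet family or thresholding rule was used. The toy signal has a power-of-two length, so a 700-point spectrum would first need padding or truncation, which is also left as an assumption.

```python
import math

def haar_transform(signal):
    """Full orthonormal Haar decomposition; length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2.0) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2.0) for a, b in pairs]
        details = detail + details  # coarser levels are placed in front
    return approx + details

def hard_threshold(coeffs, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

# Toy "spectrum": two flat regions with small wiggles (made-up values).
signal = [4.0, 4.1, 3.9, 4.0, 8.0, 8.2, 7.9, 8.1]
coeffs = haar_transform(signal)
kept = hard_threshold(coeffs, 0.5)
n_retained = sum(1 for c in kept if c != 0.0)
print(n_retained, "of", len(signal), "coefficients retained")
```

Only the coefficients that survive the threshold would be passed on to the clustering step, which is what reduces the dimension.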


EM Algorithm & Updating

• EM algorithm
  – computes the expected value of the likelihood function
  – maximises the expected value
  – commonly used in statistics for estimating missing values

• Updating
  – uses previous estimates of labels as a starting point for iteration
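One possible reading of EM with updating is sketched below. It is not the presenters' code: it assumes one univariate Gaussian per group, treats the unknown test labels as the missing values, and all function names and data are invented for illustration. The model is first fitted to the labelled training data, and the resulting label estimates for the test samples then seed further E and M steps over all samples.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def m_step(values, resp):
    """Re-estimate (proportion, mean, variance) per group from soft memberships."""
    params = []
    for g in range(len(resp[0])):
        weight = sum(r[g] for r in resp)
        mean = sum(r[g] * x for r, x in zip(resp, values)) / weight
        var = sum(r[g] * (x - mean) ** 2 for r, x in zip(resp, values)) / weight
        params.append((weight / len(values), mean, max(var, 1e-6)))
    return params

def e_step(values, params):
    """Expected group memberships given the current parameter estimates."""
    resp = []
    for x in values:
        dens = [p * normal_pdf(x, m, v) for p, m, v in params]
        total = sum(dens)
        resp.append([d / total for d in dens])
    return resp

def em_with_updating(train_x, train_y, test_x, n_groups=2, n_iter=50):
    """Fit on labelled data, then iterate using the test-label estimates."""
    known = [[1.0 if y == g else 0.0 for g in range(n_groups)] for y in train_y]
    params = m_step(train_x, known)      # initial fit: labelled data only
    test_resp = e_step(test_x, params)   # first label estimates for test data
    for _ in range(n_iter):
        params = m_step(train_x + test_x, known + test_resp)  # all samples
        test_resp = e_step(test_x, params)                    # update estimates
    return params, test_resp

# Made-up data: group 0 near 1, group 1 near 4.
train_x, train_y = [1.0, 1.3, 4.0, 4.4], [0, 0, 1, 1]
test_x = [0.8, 1.1, 1.5, 3.8, 4.2, 4.6]
params, test_resp = em_with_updating(train_x, train_y, test_x)
predicted = [max(range(2), key=lambda g: r[g]) for r in test_resp]
print(predicted)
```

Because the test samples contribute to the parameter estimates, this kind of scheme has most room to help when the labelled training set is small.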


Example: Honey Adulteration

• Irish honey extended with
  – fructose:glucose mixtures
  – fully inverted beet syrup
  – high fructose corn syrup

• Total of 478 spectra: 157 pure and 321 adulterated
  – 225 with fructose:glucose mixtures
  – 56 with fully inverted beet syrup
  – 40 with high fructose corn syrup


Classification Achieved

Classification rates on test set data achieved with correct proportions of each type of adulterant in the training set for “pure or adulterated” question.

Training / Test    EM               EM & Updating
50% / 50%          94.72% (1.12)    94.43% (1.10)
25% / 75%          93.22% (1.08)    93.05% (1.03)
10% / 90%          90.82% (1.76)    92.22% (1.11)
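A training set with the correct proportions of each type of adulterant can be drawn by sampling each group separately. The sketch below uses the group sizes from the honey example. The label strings, the rounding rule and the random seed are assumptions made here, not details taken from the study.

```python
import random

# Group sizes from the honey example; the label strings are illustrative only.
GROUP_SIZES = {
    "pure": 157,
    "fructose:glucose": 225,
    "beet syrup": 56,
    "corn syrup": 40,
}

def stratified_split(labels, train_fraction, rng):
    """Return (train_idx, test_idx), sampling each group at train_fraction."""
    by_group = {}
    for i, lab in enumerate(labels):
        by_group.setdefault(lab, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_group.values():
        rng.shuffle(idx)
        n_train = round(train_fraction * len(idx))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return sorted(train_idx), sorted(test_idx)

labels = [g for g, n in GROUP_SIZES.items() for _ in range(n)]
train_idx, test_idx = stratified_split(labels, 0.25, random.Random(0))
train_counts = {g: sum(1 for i in train_idx if labels[i] == g) for g in GROUP_SIZES}
print(len(train_idx), len(test_idx), train_counts)
```

Repeating the split with different seeds gives a set of random training/test partitions over which classification rates can be averaged.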


Classification Achieved

Classification rates on test set data achieved with correct proportions of pure / adulterated in the training set for “pure or adulterated” question.

Training / Test    EM               EM & Updating
50% / 50%          94.38% (1.16)    94.11% (0.89)
25% / 75%          93.50% (1.08)    93.03% (1.02)
10% / 90%          90.54% (1.80)    92.05% (1.09)


Classification Achieved

Classification rates on test set data achieved using 50% training, 50% test data with correct proportion of pure / adulterated in the training data set for “type of adulteration” question.

Question                EM               EM & Updating
Pure or adulterated?    91.09% (1.40)    90.64% (1.36)
Type of adulteration    86.23% (1.20)    84.12% (1.67)


Classification Achieved

Classification rates on test set data achieved using 50% training, 50% test data with correct proportions of each type of adulterant in the training set for “type of adulteration” question.

Question                EM               EM & Updating
Pure or adulterated?    89.41% (1.76)    88.61% (1.82)
Type of adulteration    85.70% (1.96)    83.57% (2.23)


Probability v Accurate Classification

Probability of group membership - by colour (black being pure, red being adulterated)


Conclusions

• EM algorithm gives a method of predicting group membership

• Updating procedures effective with small training sets

• Quantifying certainty

• Allows cost of misclassification to be easily incorporated into modelling
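The last point can be made concrete with a short sketch: once each sample has group membership probabilities, a misclassification cost table turns them into a minimum expected cost decision. The cost values and probabilities below are hypothetical and only serve to show the mechanics.

```python
def min_expected_cost_label(probs, cost):
    """Pick the decision with the lowest expected misclassification cost.

    probs[t] is the estimated probability that the true group is t;
    cost[t][d] is the cost of deciding d when the truth is t.
    """
    decisions = list(cost)
    expected = {d: sum(probs[t] * cost[t][d] for t in probs) for d in decisions}
    return min(expected, key=expected.get), expected

# Hypothetical costs: passing an adulterated sample as pure is five times
# worse than flagging a pure sample as adulterated.
COST = {
    "pure": {"pure": 0.0, "adulterated": 1.0},
    "adulterated": {"pure": 5.0, "adulterated": 0.0},
}

probs = {"pure": 0.7, "adulterated": 0.3}
label, expected = min_expected_cost_label(probs, COST)
print(label, expected)
```

With these made-up costs the sample is flagged as adulterated even though "pure" is the more probable group, because the expected cost of wrongly passing it (0.3 × 5 = 1.5) exceeds the expected cost of wrongly flagging it (0.7 × 1 = 0.7).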


Questions?

Funded by:

• Teagasc under the Walsh Fellowship Scheme

• Irish Department of Agriculture & Food (FIRM programme)

• Science Foundation of Ireland Basic Research Grant scheme (Grant 04/BR/M0057)

