Date post: 16-Dec-2015
A Multi-PCA Approach to Glycan Biomarker Discovery using Mass Spectrometry Profile Data

Anoop Mayampurath, Chuan-Yih Yu
Info-690 (Glycoinformatics) Final Project Presentation

Background

[1] Kyselova et al. “Alterations in the Serum Glycome Due to Metastatic Prostate Cancer”, Journal of Proteome Research, 2007, 6:1822-1832

[2] Tang et al. “Identification of N-Glycan Serum Markers Associated with Hepatocellular Carcinoma from Mass Spectrometry Data”, Journal of Proteome Research, 2009, Article ASAP

[3] Ressom et al. “Analysis of MALDI-TOF Mass Spectrometry Data for Discovery of Peptide and Glycan Biomarkers of Hepatocellular Carcinoma”, Journal of Proteome Research, 2008, 7:603

Objective

• Given a set of N mass spectra (disease and healthy), develop an algorithm that identifies “significant” spectra and glycan peaks
▫ From the significant glycan peaks
  - Nature of regulation between disease and healthy
  - Study of effects such as fucosylation and linkage
▫ From the significant spectra
  - A smaller set of spectra m << N that help in analysis
  - Glycan annotation
  - Check for overlapping glycans
• What is meant by “significant”?
▫ Elements that exhibit coherent patterns and large variation between disease and healthy
• Datasets
▫ 151 MALDI TOF mass spectra: 73 cancer, 78 normal

Data Processing - MultiNGlycan

• Details
▫ Background subtraction
▫ Peak picking
▫ Identification of common glycans across all 151 spectra
▫ Filtering using a fit coefficient cutoff > 0.5
  - If 30% of the spectra have a glycan fit coefficient greater than 0.5, the glycan is retained
• An N×p matrix X is obtained (N: number of glycans, p: number of spectra)
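The retention rule above can be sketched in a few lines of NumPy. This is only an illustration, not the MultiNGlycan implementation: the function name `filter_glycans`, the `fit` array layout (glycans in rows, spectra in columns), and the reading of "30% of spectra" as "at least 30%" are all assumptions.

```python
import numpy as np

def filter_glycans(fit, cutoff=0.5, min_fraction=0.30):
    """Return a boolean mask of glycans to retain.

    fit: hypothetical (n_glycans, n_spectra) array of fit coefficients.
    A glycan is kept when the fraction of spectra whose fit coefficient
    exceeds `cutoff` is at least `min_fraction` (assumed interpretation).
    """
    fit = np.asarray(fit, dtype=float)
    # Fraction of spectra, per glycan, that pass the fit-coefficient cutoff.
    fraction_passing = (fit > cutoff).mean(axis=1)
    return fraction_passing >= min_fraction

# Toy example: 3 glycans measured in 4 spectra.
fit = np.array([
    [0.9, 0.8, 0.1, 0.2],  # passes in 2 of 4 spectra (50%)
    [0.6, 0.1, 0.1, 0.1],  # passes in 1 of 4 spectra (25%)
    [0.1, 0.1, 0.1, 0.1],  # passes in none
])
mask = filter_glycans(fit)
```

Applying `mask` to the rows of an intensity matrix of the same shape would give the filtered N×p matrix X.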

Multi-PCA algorithm

• Perform PCA
• Perform inner product
• Sort glycans by inner product (which measures correlation)
• Shave off 10% of glycans with the lowest inner product score
• Repeat

[Equation: PCA of X yields the principal components 1, 2, ..., N]

[4] Hastie et al. “‘Gene shaving’ as a method for identifying distinct sets of genes with similar expression patterns”, Genome Biology, 2000, 1(2):1-21

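Multi-PCA Algorithm

[Diagram: X → first principal component → inner product of the first principal component with X → sort by inner product, shave off 10% of glycans → X' s.t. |X'| < |X| → repeat]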

- The algorithm was iterated until 10 glycans remained. The retained glycans are expected to change coherently in intensity while showing high variance between cancer and normal samples.
- We also switched dimensions to shave off spectra, iterating until 6 spectra remained.
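The shaving loop described on these slides might look roughly like the sketch below. It is not the presenters' code: the function name `shave`, the row-centering step, the use of the absolute inner product as the score, and the rule of dropping at least one row per iteration are assumptions made so the example runs end to end.

```python
import numpy as np

def shave(X, target_rows, shave_fraction=0.10):
    """Iteratively drop the rows least aligned with the first principal component.

    X: (rows, columns) matrix, e.g. glycans x spectra.
    Returns the sorted indices of the rows that survive.
    """
    X = np.asarray(X, dtype=float)
    keep = np.arange(X.shape[0])
    while keep.size > target_rows:
        sub = X[keep]
        # Center each row so the inner product reflects its pattern, not its mean level.
        centered = sub - sub.mean(axis=1, keepdims=True)
        # Leading right singular vector = first principal component across columns.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        pc1 = vt[0]
        # Score each row by the magnitude of its inner product with the component.
        scores = np.abs(centered @ pc1)
        # Shave about 10% of the rows, at least one, without overshooting the target.
        n_drop = max(1, int(np.floor(shave_fraction * keep.size)))
        n_drop = min(n_drop, keep.size - target_rows)
        order = np.argsort(scores)  # ascending: lowest scores first
        keep = np.sort(keep[order[n_drop:]])
    return keep

# Toy data: 5 rows follow a strong two-group pattern, 15 rows are low-level noise.
rng = np.random.default_rng(0)
pattern = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
X = np.vstack([10.0 * np.outer(np.ones(5), pattern),
               0.1 * rng.standard_normal((15, 6))])

glycans_kept = shave(X, target_rows=5)    # shave rows (glycans)
spectra_kept = shave(X.T, target_rows=3)  # switch dimensions to shave columns (spectra)
```

Calling `shave(X, 10)` and `shave(X.T, 6)` on the real glycan-by-spectra matrix would mirror the two stopping points mentioned above.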

Results

[Figure: total intensity vs. mass value. Annotations: “Filtered out”; “Not present in original composition file”]

[Figure: total intensity vs. mass value]

Significant Spectra

• No overlapping glycans were found

Future Directions

• Fragmentation of glycans to study the effect of linkage among glycans
• Glycan microarray
• More detail on overlapping glycans (substitute single score by combined score)
• Orthogonalize the data to see other patterns

Acknowledgements

•Prof. Haixu Tang, School of Informatics & Computing

•Prof. Yehia Mechref, Dept of Chemistry

