Post on 17-Jan-2016
Nuria Lopez-Bigas
Methods and tools in functional genomics
(microarrays)
BCO17
What are microarrays?
Microarray data analysis is the step that allows us to extract biological meaning from the high-throughput data generated by the experiment.
Microarray data analysis
Microarray data → Data preprocessing and normalization → Normalized data
Normalization and Noise:
Normalization
• Some kind of normalization is usually required when comparing more
than one microarray experiment.
• Adjust to account for differences in overall brightness of slides
• Normalize relative to housekeeping genes
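The two adjustments listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the intensity values are invented toy data, the function names `scale_to_common_median` and `normalize_to_housekeeping` are hypothetical, and ACTB and GAPDH are used as housekeeping genes purely as an example. Real pipelines typically use more robust methods.

```python
from statistics import median

def scale_to_common_median(arrays):
    """Rescale each array so that its median intensity equals the mean of all
    array medians (corrects for differences in overall slide brightness)."""
    medians = [median(a.values()) for a in arrays]
    target = sum(medians) / len(medians)
    return [{gene: value * target / m for gene, value in a.items()}
            for a, m in zip(arrays, medians)]

def normalize_to_housekeeping(array, housekeeping):
    """Divide every intensity by the mean intensity of the housekeeping genes."""
    reference = sum(array[g] for g in housekeeping) / len(housekeeping)
    return {gene: value / reference for gene, value in array.items()}

# Toy intensities (invented): slide_b is roughly twice as bright as slide_a.
slide_a = {"ACTB": 1000.0, "GAPDH": 1200.0, "TP53": 300.0, "MYC": 800.0}
slide_b = {"ACTB": 2000.0, "GAPDH": 2400.0, "TP53": 600.0, "MYC": 3200.0}

scaled_a, scaled_b = scale_to_common_median([slide_a, slide_b])
hk_a = normalize_to_housekeeping(slide_a, ["ACTB", "GAPDH"])
hk_b = normalize_to_housekeeping(slide_b, ["ACTB", "GAPDH"])
```

After housekeeping normalization, TP53 has the same relative level on both slides, while MYC remains higher on the second slide, which is the kind of difference the normalization is meant to preserve.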
Noise
• Refers to variability and reproducibility of microarray experiments
• Intra- and inter-microarray variations can significantly skew
interpretation of data
• Sample collection is very important. If comparing two conditions, you
must control for all variables other than the one you are trying to measure
• Technical noise can result from imperfections in the chip.
• Both biological and technical replicates are required to measure and
control these sources of noise
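As a rough illustration of why both kinds of replicates are needed, the sketch below separates the two sources of variation for a single gene using the coefficient of variation (standard deviation divided by mean). The sample names and measurements are invented, and the coefficient of variation is only one of several possible measures of noise.

```python
from statistics import mean, stdev

# Toy data (invented): one gene measured in three biological samples,
# each hybridized three times (technical replicates).
replicates = {
    "patient_1": [100.0, 110.0, 90.0],
    "patient_2": [200.0, 190.0, 210.0],
    "patient_3": [150.0, 160.0, 140.0],
}

def cv(values):
    """Coefficient of variation: sample standard deviation over the mean."""
    return stdev(values) / mean(values)

# Technical noise: spread among repeated measurements of the same sample.
technical_cv = {sample: cv(values) for sample, values in replicates.items()}

# Biological variability: spread among the per-sample averages.
sample_means = [mean(values) for values in replicates.values()]
biological_cv = cv(sample_means)
```

In this toy case the spread between samples is several times larger than the spread within any one sample, so the differences between patients are unlikely to be explained by technical noise alone.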
Microarray data analysis
Microarray data → Data preprocessing and normalization → Normalized data → Data analysis
• Differential expression
Microarray data analysis
Microarray data → Data preprocessing and normalization → Normalized data → Data analysis
• Differential expression
• GO, KEGG… analysis
http://www.geneontology.org
The Gene Ontology project provides a controlled vocabulary to describe gene and gene product attributes in any organism.
The Ontologies
• Cellular component
• Biological process
• Molecular function
BROWSER::AMIGO
TOOLS
Gene Ontology
Gene Ontology::Tools
http://www.geneontology.org/GO.tools.shtml
http://www.fatigo.org/
http://www.barleybase.org/funcexpression.php
http://discover.nci.nih.gov/gominer/htgm.jsp
FUNC-EXPRESSION
KEGG http://www.genome.jp/kegg/
Microarray data analysis
Microarray data → Data preprocessing and normalization → Normalized data → Data analysis
• Differential expression
• GO, KEGG… analysis
• Classification
Classification
Support vector machines
Decision trees
Microarray data analysis
Microarray data → Data preprocessing and normalization → Normalized data → Data analysis
• Differential expression
• GO, KEGG… analysis
• Classification
• Clustering
Supervised versus Unsupervised:
Supervised
• Analysis to determine genes that fit a predetermined pattern
• Usually used to find genes whose expression levels differ significantly between
groups of samples, or genes that accurately predict a characteristic of the sample
• Two popular supervised techniques are nearest-neighbour analysis and support
vector machines.
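A minimal sketch of the nearest-neighbour idea, assuming each sample is represented by a short vector of expression values and a known label. The profiles, the labels and the function name `nearest_neighbour` are invented for illustration, and Euclidean distance is one choice among several.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

def nearest_neighbour(training_set, query):
    """Return the label of the training profile closest to the query profile."""
    _, label = min(training_set, key=lambda item: dist(item[0], query))
    return label

# Toy training set (invented): expression of three genes per sample,
# with the class of each sample already known.
training_set = [
    ([2.1, 0.3, 1.8], "tumour"),
    ([1.9, 0.5, 2.0], "tumour"),
    ([0.2, 1.7, 0.4], "normal"),
    ([0.4, 1.9, 0.1], "normal"),
]

# A new, unlabelled sample is assigned the label of its closest neighbour.
predicted = nearest_neighbour(training_set, [1.8, 0.4, 1.7])
```

The method is supervised because the labels of the training samples are the 'correct answer' the prediction is trying to reproduce.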
Unsupervised
• Analysis to characterize the components of a data set without a priori input or
knowledge of a training signal
• Try to find internal structure or relationships in data without trying to predict some
‘correct answer’.
• Three classes:
1. Feature determination: look for genes with interesting patterns,
e.g. principal-components analysis
2. Cluster determination: determine groups of genes with similar expression patterns,
e.g. nearest-neighbour clustering, self-organizing maps, k-means clustering, 2D
hierarchical clustering
3. Network determination: determine graphs representing gene-gene or gene-phenotype
interactions,
e.g. Boolean networks, Bayesian networks, relevance networks
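Of the cluster-determination methods named above, k-means is the easiest to sketch. The example below is a bare-bones version under stated assumptions: the gene names and expression profiles are invented, the starting centroids are chosen by hand rather than at random, and the function name `k_means` is hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            [sum(column) / len(members) for column in zip(*members)] if members else old
            for members, old in zip(clusters, centroids)
        ]
    labels = [min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
              for p in points]
    return labels, centroids

# Toy expression profiles (invented): four genes measured in three conditions.
profiles = {
    "g1": [1.0, 1.1, 0.9],
    "g2": [0.9, 1.0, 1.1],
    "g3": [5.0, 5.2, 4.8],
    "g4": [5.1, 4.9, 5.0],
}
points = list(profiles.values())

# Start from two of the profiles as initial centroids (an arbitrary choice).
labels, centroids = k_means(points, [points[0], points[2]])
clusters_by_gene = dict(zip(profiles, labels))
```

No labels are supplied here: the grouping of g1 with g2 and of g3 with g4 comes only from the similarity of their profiles, which is what makes the method unsupervised.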
Clustering & Classification
Cooper Breast Cancer Res 2001 3:158
Microarray data analysis
Microarray data → Data preprocessing and normalization → Normalized data → Data analysis
• Differential expression
• GO, KEGG… analysis
• Clustering
• Classification
• Promoter analysis
Promoter analysis::TFBS
TRANSFAC
Promoter analysis::Tools
http://www.cisreg.ca/
Microarray data analysis
Microarray data → Data preprocessing and normalization → Normalized data → Data analysis
• Differential expression
• GO, KEGG… analysis
• Clustering
• Classification
• Promoter analysis
• Reverse engineering
Reverse engineering