Gene function analysis Stem Cell Network Microarray Course, Unit 5 May 2007.


Gene function analysis

Stem Cell Network

Microarray Course, Unit 5

May 2007

Sections

• Introduction to Gene Ontology

• GOstat

• Example

Gene Ontology

Michael Ashburner

Annotate genes or proteins

http://www.geneontology.org

Started for Drosophila melanogaster (fly). Now expanded for all taxa

Gene Ontology

Biological process: A phenomenon marked by changes that lead to a particular result, mediated by one or more gene products.

Molecular function: Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.

Cellular component: The part of a cell of which a gene product is a component. For the purposes of GO this includes the extracellular environment of cells. A gene product may be a component of one or more parts of a cell. The term includes gene products that are parts of macromolecular complexes, by the definition that all members of a complex normally copurify under all except extreme conditions.

Gene Ontology

http://www.geneontology.org/GO_nature_genetics_2000.pdf

Biological process


Gene Ontology

Evidence codes http://www.geneontology.org/GO.evidence.shtml

IC: Inferred by Curator
IDA: Inferred from Direct Assay
IEA: Inferred from Electronic Annotation
IEP: Inferred from Expression Pattern (2006)
IGC: Inferred from Genomic Context (2007)
IGI: Inferred from Genetic Interaction
IMP: Inferred from Mutant Phenotype
IPI: Inferred from Physical Interaction
ISS: Inferred from Sequence or Structural Similarity
NAS: Non-traceable Author Statement (2006)
ND: No biological Data available
RCA: Inferred from Reviewed Computational Analysis
TAS: Traceable Author Statement
NR: Not Recorded (2006)

Gene Ontology

Stats. May 29th 2007.

biological_process: 13,553 terms (10,894 in 2006; 9,277 in 2005)

cellular_component: 1,966 terms (1,815; 1,512)

molecular_function: 7,609 terms (7,927; 6,957)

Total: 23,128 terms (20,636; 17,746)

Gene Ontology

Stats. May 29th 2007.

Mouse Genome Informatics (The Jackson Laboratory, http://www.informatics.jax.org/)

• biological_process: 14,200 genes, 42,675 annotations (3.0 kw/gene) [13,329 genes, 33,783 annotations (2.5 kw/gene) in 2006]

• cellular_component: 14,713 genes, 31,330 annotations (2.1 kw/gene) [13,547 genes, 26,515 annotations (2.0 kw/gene)]

• molecular_function: 15,553 genes, 50,343 annotations (3.2 kw/gene) [14,056 genes, 40,806 annotations (2.9 kw/gene)]

8.3 terms per gene [7.5 in 2006]

Databases using Gene Ontology

NetAffx (Affymetrix probe annotations)
FlyBase (sequences) was the first
SGD (yeast)
MGI (mouse)
InterPro (protein sequences)
ProDom (protein domains)
Entrez Gene (gene information)

GOstat

Find statistically overrepresented properties within a group of genes as selected by...

...typically, analysis of a DNA microarray experiment

http://gostat.wehi.edu.au/

Beissbarth & Speed (2004) Bioinformatics, 20: 1464-1465.

gene A, gene B, gene C, gene D, gene E

X: in the total set of genes, 2,000 of 5,000 are X. Not significant.

Y: in the total set of genes, 4 of 5,000 are Y. Very significant.

• Do it for all Gene Ontology terms

• Take into account the structure of the ontology

• Sort by p-values
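As a minimal sketch of the X versus Y comparison above: the chance of seeing at least k annotated genes in a group of n, drawn from N genes of which K carry the annotation, is a hypergeometric tail, which is the quantity behind a one-sided Fisher's exact test. The function name `enrichment_p_value` and the count of 2 annotated genes in the group are assumptions made for illustration. The slide does not state them, and GOstat's own implementation may differ.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """P(at least k annotated genes in a random group of n),
    given that K of the N genes in the total set are annotated."""
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible combinations drop out of the sum automatically.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


TOTAL_GENES = 5000      # from the slide
GROUP_SIZE = 5          # gene A .. gene E
ANNOTATED_IN_GROUP = 2  # hypothetical: not stated on the slide

# Number of genes in the total set carrying each annotation (from the slide).
annotated_in_total = {"X": 2000, "Y": 4}

p_values = {
    term: enrichment_p_value(ANNOTATED_IN_GROUP, GROUP_SIZE, count, TOTAL_GENES)
    for term, count in annotated_in_total.items()
}

# "Sort by p-values": most significant term first.
ranked = sorted(p_values.items(), key=lambda item: item[1])
for term, p in ranked:
    print(f"{term}: p = {p:.3g}")
```

Under these assumed counts, X comes out far from significant and Y highly significant. Repeating the calculation for every term and sorting gives the kind of ranked list the bullets describe.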

GOstat

Contingency Table

genes with GO in group

total genes in group

selected genes (e.g. differentially expressed)

reference group (e.g. all genes on array)

51, 176, 467, 9180

p-value: 8e-52

Chi-square Test (Fisher's Exact Test for small values)

Probability of obtaining those values from a random distribution.
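The chi-square step can be sketched as below, under a stated reading of the table: 51 of the 176 selected genes carry the GO term, 467 of the 9,180 reference genes carry it, and the selected genes are a subset of the reference group. That reading is an assumption, since the transcript does not preserve the table layout. Swapping the roles of 176 and 467 only transposes the 2x2 table and leaves the statistic unchanged. The function name `chi_square_2x2` and the rule-of-thumb cutoff of 5 are illustrative and are not taken from GOstat. The sketch is not guaranteed to reproduce the p-value quoted on the slide, which may rest on a different layout or test.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value, smallest_expected_count)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    smallest_expected = min(row1, row2) * min(col1, col2) / n
    return statistic, p_value, smallest_expected


# Assumed reading of the slide's four numbers.
with_term_selected, selected_total = 51, 176
with_term_reference, reference_total = 467, 9180

a = with_term_selected                            # selected, with term
b = selected_total - with_term_selected           # selected, without term
c = with_term_reference - with_term_selected      # not selected, with term
d = reference_total - selected_total - c          # not selected, without term

statistic, p_value, smallest_expected = chi_square_2x2(a, b, c, d)

# Common rule of thumb (an assumption here): prefer an exact test when
# any expected cell count falls below about 5.
use_exact_test = smallest_expected < 5

print(f"chi-square = {statistic:.1f}, p = {p_value:.2g}, "
      f"smallest expected count = {smallest_expected:.2f}, "
      f"exact test advised: {use_exact_test}")
```

With expected counts this large the chi-square approximation is usually considered acceptable. For small counts, an exact test is the safer choice, as the slide notes.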

Web tool


Output

Example

We will study the function of a set of genes selected via StemBase http://www.stembase.ca/

(see corresponding Unit for more info on using StemBase)

http://gostat.wehi.edu.au/

1. Select a set of genes

Objective: Genes correlated to Lgals3bp (lectin, galactoside-binding, soluble, 3 binding protein)

A galectin, a beta-galactoside-binding protein implicated in modulating cell-cell and cell-matrix interactions
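One way to picture step 1 is a Pearson correlation between each gene's expression profile and that of Lgals3bp, keeping the genes above a cutoff. This is only a sketch: the expression values, the gene names other than Lgals3bp, and the 0.9 cutoff are hypothetical, and StemBase may use a different measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values across four samples (not real data).
profiles = {
    "Lgals3bp": [1.0, 2.0, 3.0, 4.0],
    "gene A": [2.0, 4.1, 5.9, 8.0],   # rises with Lgals3bp
    "gene B": [4.0, 3.0, 2.0, 1.0],   # falls as Lgals3bp rises
    "gene C": [1.0, 3.0, 2.0, 2.5],   # weakly related
}

CUTOFF = 0.9  # hypothetical threshold
reference = profiles["Lgals3bp"]

selected = [
    gene
    for gene, values in profiles.items()
    if gene != "Lgals3bp" and pearson(reference, values) >= CUTOFF
]
print(selected)
```

The resulting list of gene identifiers is what would then be submitted to GOstat in step 2.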


2. Run in GOstat

Calcium ion binding

mannosyl-oligosaccharide mannosidase activity

http://www.geneontology.org/amigo

3. Examine expression

MAN2A1 1448647_at
MAN1A 1417111_at
Lgals3bp 1448380_at

To know more

• Gene Ontology. http://www.geneontology.org/GO.doc.shtm

• GOstat. http://gostat.wehi.edu.au Beissbarth & Speed (2004) Bioinformatics, 20: 1464-1465.

• StemBase. http://www.stembase.ca

See corresponding Unit in this course.

