CACAO - Remote training Gene Function and Gene Ontology Fall 2011 CACAO.

Post on 19-Dec-2015


CACAO - Remote training

Gene Function and Gene Ontology, Fall 2011

http://gowiki.tamu.edu/wiki/index.php/Category:CACAO

“Scientists find gene that ...”

An avalanche of genes

• High throughput sequencing is finding genes faster than we can understand them

• Goals for annotation:

– Where the genes are in the genome

– What their functions are

Function annotation

• Allows us to

– Infer the functions of genes

• Related by common descent

• Related by similar expression patterns

• Related by phylogenetic profiles

• ...

Function annotation

• Allows us to

– Understand the capabilities of organisms' genomes

– Understand patterns of gene expression

• In different environments

• In different tissues

• In disease states

– ...

Classic MODel

Literature

Datasets

Curators (rate limiting)

Database

Requirements

• Accurate functional annotation for as many genes as possible

• A system of assigning function that allows both humans and computers to compare, contrast, analyze, and predict gene function

• Curators to make and/or check these assignments

– For CACAO, we will teach you what biocurators do.

CACAO

• Community

• Assessment – How well can

• Community – you (with our coaching)

• Annotation with – assign gene functions

• Ontologies – using GO?

CACAO is competitive

• Teams get points for complete annotations

– GO term (right level of specificity)

– reference

– evidence code

– identify where in the paper the evidence comes from

• Teams can take away points from competitors by challenging annotations

– finding a problem

– suggesting a better alternative

What’s in it for you (besides credit)?

– We hope you will

• learn how we think about gene function

• gain skills that will help your future career

• enjoy contributing to a resource used by people all over the world

• have fun!

The gist of CACAO…

Finding evidence (in papers)

Making annotations

Using GO terms

GO = Gene Ontology

• Controlled vocabulary

– Everyone uses the same terms

– Terms have IDs that computers can understand

• Relationships between functions

Gene Ontology

A common system for describing gene function

GO

• 3 aspects (ontologies) for gene products

1. Biological Process

2. Molecular Function

3. Cellular Component

• Used to make annotations

– aka Gene associations

– Term + qualifiers + evidence code + reference etc.

Molecular Function

• activities or “jobs” of a gene product

glucose-6-phosphate isomerase activity

Figure from GO Consortium presentations

Biological Process

• a commonly recognized series of events

cell division

Figure from Nature Reviews Microbiology 6, 28-40 (January 2008)

Cellular Component

• where a gene product acts

Key elements of a GO annotation

Submitted to GO consortium

Viewable on GONUTS

**Don’t worry - I will cover this again (several times)!

GO Annotation

• To make an annotation, you need to

– Assign GO terms to genes (gene products)

• At appropriate level of specificity

• Sometimes with Qualifiers

– NOT

– Contributes_to

– Colocalizes_with

– Record the evidence

Record the evidence

• Where it came from:

– Reference (database accession)

• PMID:6987663

• Kind of evidence:

– Evidence codes

• IMP: Inferred from Mutant Phenotype

• IDA: Inferred from Direct Assay

• …
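The pieces listed so far (term, optional qualifier, reference, evidence code) can be pictured as one record. The following is only a rough sketch, not how GONUTS or the GO Consortium store annotations: the class name `GoAnnotation` and its field names are made up for illustration, and the qualifier and evidence-code sets contain only the entries named on these slides (the real lists are longer). The example values are the placeholder ones from Example #1 later in this deck.

```python
from dataclasses import dataclass
from typing import Optional

# Only the qualifiers and evidence codes named on the slides; the real
# lists are longer. Qualifiers are compared case-insensitively.
QUALIFIERS = {"not", "contributes_to", "colocalizes_with"}
EVIDENCE_CODES = {
    "IMP": "Inferred from Mutant Phenotype",
    "IDA": "Inferred from Direct Assay",
}


@dataclass(frozen=True)
class GoAnnotation:
    """Hypothetical record holding the key elements of one annotation."""

    gene_product: str                # e.g. a UniProt accession
    go_id: str                       # e.g. "GO:0004713"
    reference: str                   # e.g. "PMID:1111"
    evidence_code: str               # e.g. "IDA"
    qualifier: Optional[str] = None  # most annotations have none

    def __post_init__(self) -> None:
        # Reject evidence codes and qualifiers this sketch does not know.
        if self.evidence_code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence_code}")
        if self.qualifier is not None and self.qualifier.lower() not in QUALIFIERS:
            raise ValueError(f"unknown qualifier: {self.qualifier}")


# Placeholder values taken from Example #1 in this deck.
example = GoAnnotation("P012A9", "GO:0004713", "PMID:1111", "IDA")
print(example.evidence_code, "=", EVIDENCE_CODES[example.evidence_code])
```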

CACAO - the “Community Annotation” part

What I am going to tell you about next is:

1. How to choose proteins to annotate

2. Finding GO terms & navigating a GO term page

3. Finding UniProt accessions

4. Making gene pages on GONUTS & the anatomy of a gene page

5. How and where to add an annotation

6. Where to look for your annotations & other teams’ annotations … (& the challenges!)

http://gowiki.tamu.edu/wiki/index.php/

Deciding what to annotate

1. randomly

2. topics of interest (e.g. efflux pump proteins, biofilms)

3. papers you have come across while doing other stuff

4. methods you know or want to learn

5. phenotypes and mutants you are interested in

6. by author

7. by pathway or regulon

8. suggested by another (e.g. high IEA:manual annotation ratio)

9. current paper mentions another gene product

10. review papers (e.g. Annual Reviews are excellent sources)

EXAMPLE #1: let’s say you have a great paper (PMID:1111) that characterizes the tyrosine kinase activity of your favorite protein (human p53)…

Part I: Where do you search for GO terms? GONUTS

http://gowiki.tamu.edu

• CHICK - AgBase (Gallus gallus)

• dictyBase - dictyBase (Dictyostelium discoideum - slime mold)

• FB - FlyBase (Drosophila melanogaster)

• HUMAN - Reactome, BHF-UCL

• MGI - Mouse Genome Informatics (Mus musculus - house mouse)

• SGD - Saccharomyces Genome Database (Saccharomyces cerevisiae - yeast)

• TAIR - The Arabidopsis Information Resource (Arabidopsis thaliana)

• WB - WormBase (Caenorhabditis elegans)

• ZFIN - Zebrafish model organism database (Danio rerio)

What do you actually need once you have found the correct term?

GO:0004713
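What you need from the term page is the ID, which is the prefix `GO:` followed by seven digits. As a small sketch (the helper name `is_go_id` is made up here), a format check like the one below catches copy-and-paste slips before you type the ID into an annotation. It checks the shape only, not whether the term actually exists in GO.

```python
import re

# A GO ID is "GO:" followed by exactly seven digits, e.g. GO:0004713.
GO_ID = re.compile(r"GO:[0-9]{7}")


def is_go_id(text: str) -> bool:
    """Return True if text has the shape of a GO ID (format check only)."""
    return GO_ID.fullmatch(text.strip()) is not None


print(is_go_id("GO:0004713"))  # the term ID from the example slide
print(is_go_id("GO:4713"))     # leading zeros dropped, so not a valid ID
```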

Part II: You now have a paper, a protein & you found a suitable GO term… what next?

• UniProt accession - http://www.uniprot.org

- Search (“Query”) & find the correct UniProt accession for your protein

- Looks something like: P012A9
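To tell an accession apart from a gene name or an entry name, it can help to check its shape. The sketch below uses the accession pattern that UniProt documents in its help pages; the function name `looks_like_uniprot_accession` is made up for this example. Passing the check only means the text is shaped like an accession, so you still need to confirm on UniProt that it is the right protein from the right organism.

```python
import re

# Accession pattern as documented by UniProt: either the classic
# 6-character form (e.g. Q7BQ98) or the newer 6/10-character form.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def looks_like_uniprot_accession(text: str) -> bool:
    """Return True if text is shaped like a UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(text.strip()) is not None


for candidate in ("P012A9", "Q7BQ98", "p53", "GO:0004713"):
    print(candidate, looks_like_uniprot_accession(candidate))
```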

Part III: Where are you going to add your annotations? GONUTS

http://gowiki.tamu.edu

How do you make a new gene page in GONUTS?

• Use the UniProt accession to make a page that you will be able to add your own annotation to.

• GoPageMaker will:

1. Check if the page exists in GONUTS & take you there if it does.

2. Make a page & pull all of the annotations from UniProt into a table that you can edit.

Where do you add an annotation? Add a row in the table.

What you must fill in (for every annotation)

GO:0004713

PMID:1111

IDA: Inferred from direct assay

Figure 2a
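Before saving a row, it is worth confirming that none of the four required entries is blank. Here is a minimal sketch of that check, assuming a row is represented as a plain dictionary; the keys `go_id`, `reference`, `evidence_code` and `notes` are names invented for this example and are not the column names GONUTS uses.

```python
# The four things every annotation must have, under hypothetical key names.
REQUIRED_FIELDS = ("go_id", "reference", "evidence_code", "notes")


def missing_fields(row: dict) -> list:
    """Return the required field names that are absent or blank in row."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = row.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# The complete row uses the values shown on the slide above.
complete = {
    "go_id": "GO:0004713",
    "reference": "PMID:1111",
    "evidence_code": "IDA",
    "notes": "Figure 2a",
}
incomplete = {"go_id": "GO:0004713", "reference": "PMID:1111", "notes": " "}

print(missing_fields(complete))
print(missing_fields(incomplete))
```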

What you might also have to fill in

Not sure? Check the competition guidelines. Ask a coach (Jim, Debby, Adrienne or usually me)!

Where will your annotation now show up?

1. In the “Annotation” table on the gene page you just edited

2. In the table on your user page

http://gowiki.tamu.edu/wiki/index.php/User:Oherrera

3. In the table on your team page

http://gowiki.tamu.edu/wiki/index.php/Category:Team_That_Will_Beat_You!!!

4. As points on the scoreboard

http://gowiki.tamu.edu/wiki/index.php/Category:CACAO_UW_Parkside

Questions?

At this point, you should be able to:

1. Find GO terms on GONUTS

2. Find UniProt accessions on UniProt

3. Make a gene page on GONUTS

4. Add an annotation

CACAO - the “Community Assessment” part

http://gowiki.tamu.edu/wiki/index.php/Category:CACAO_UW_Parkside

Example starting from a topic – Shiga toxin

PMID:2677606

Make a page on GONUTS for Q7BQ98. Has PMID:2677606 been annotated?

If it has, search PubMed for a different article.
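On the gene page this check is done by eye: scan the reference column of the annotation table for the PMID. The same idea as a rough sketch, assuming the table rows have been copied into a list of dictionaries with a `reference` key (a made-up representation, and the rows below are invented purely for illustration):

```python
def normalize_reference(ref: str) -> str:
    """Strip spaces and upper-case so 'pmid: 1111' matches 'PMID:1111'."""
    return ref.replace(" ", "").upper()


def paper_already_annotated(rows: list, reference: str) -> bool:
    """Return True if any row in the table already cites this reference."""
    wanted = normalize_reference(reference)
    return any(
        normalize_reference(row.get("reference", "")) == wanted for row in rows
    )


# Invented rows standing in for a gene page's annotation table.
table_rows = [
    {"go_id": "GO:0004713", "reference": "PMID:1111", "evidence_code": "IDA"},
]

if paper_already_annotated(table_rows, "PMID:2677606"):
    print("Already annotated: search PubMed for a different article.")
else:
    print("Not yet annotated from this paper: go ahead.")
```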

http://www.ncbi.nlm.nih.gov/pubmed?term=2677606

GO:0009405 ?

What GO term? (hint: search GONUTS)