TAIR: Bringing together data for the global plant biology community

Kate Dreher, curator, TAIR/PMN

Post on 27-Mar-2015



Acquiring new gene/protein function data in TAIR
- TAIR curators
- Research community

Using gene function data
- Searching by function
- Working with large datasets
- Connecting to other species

Gene/protein function data at TAIR

Functional curation pipeline

~200 papers about Arabidopsis show up in PubMed each month

TAIR curators link papers to appropriate loci

Please help us and all researchers who read your paper . . .

Report the AGI locus code for every gene in the paper

ASA1 is ambiguous:
- ANTHRANILATE SYNTHASE ALPHA SUBUNIT 1?
- ATTENUATED SHADE AVOIDANCE 1?
- ATP SULFURYLASE ARABIDOPSIS 1?

AT3G02260 is unique
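Because AGI locus codes follow a fixed pattern, they are easy to pick out of text automatically. The snippet below is a minimal sketch of that idea, not a TAIR tool: the function name `find_agi_codes` is hypothetical, and the pattern assumes the usual nuclear, chloroplast (C) and mitochondrial (M) locus code layout.

```python
import re

# AGI locus codes look like AT3G02260: "AT", a chromosome (1-5, or C / M
# for the chloroplast / mitochondrial genomes), "G", then five digits.
AGI_PATTERN = re.compile(r"\bAT[1-5CM]G\d{5}\b", re.IGNORECASE)


def find_agi_codes(text):
    """Return the distinct AGI locus codes in text, in order of appearance."""
    seen = []
    for match in AGI_PATTERN.finditer(text):
        code = match.group(0).upper()  # normalize At2g01830 -> AT2G01830
        if code not in seen:
            seen.append(code)
    return seen


sentence = "ASA1 (AT3G02260) was compared with WOL (At2g01830)."
print(find_agi_codes(sentence))  # ['AT3G02260', 'AT2G01830']
```

A symbol such as ASA1 on its own returns nothing here, which is exactly why the locus code needs to appear in the paper.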

Papers are prioritized for in-depth literature curation

First priority – Papers with data about unannotated / novel genes

TAIR curators read primary literature to extract gene functional data

Capturing gene/protein function data at TAIR

TAIR curators capture gene/protein function data using:

Free text

Gene descriptions

Gene names / symbols
- Please help us by writing the full name of symbols, especially the first time that they are published
- GGT2 = Glutamate:Glyoxylate aminotransferase 2

Mutant phenotype descriptions
- We love to see ABRC stock numbers, SALK/SAIL/GABI-Kat IDs, etc.
- Please check to see what allele numbers have already been used

Textpresso is a big help!

Capturing gene/protein function data at TAIR

phyA-201

Free text functional data on TAIR locus pages

Gene description

Gene names / symbols

Free text functional data on TAIR locus pages

Mutant phenotypes

TAIR curators capture gene/protein function data using:

Controlled vocabularies

Gene Ontology terms
- Molecular function (e.g. transcription factor activity)
- Biological process (e.g. phosphate transport)
- Cellular component (e.g. chloroplast)

Plant Ontology terms
- Plant structure (e.g. endosperm)
- Plant growth and development stages (e.g. root primordium formation)

Capturing gene/protein function data at TAIR

GO and PO functional data on TAIR locus pages

GO terms

PO terms

Getting detailed functional data

It’s 2010 . . . Do we know what every Arabidopsis protein does?

9024 genes (~30%) are linked to experimental functional data (3/2010)

Is there more information out there?
- ~32% of the PubMed Arabidopsis papers from 2009 were curated
- ~68% were not curated
- Additional articles appear in plant journals not indexed by PubMed

How can we get closer to our 2010 goal?
- On-going TAIR curation
- Increased community annotation
- Journal/author collaborations
- NEW on-line gene functional data submission tool

Capturing gene/protein function data at TAIR

Journal collaborations

First started in March 2008 with Plant Physiology

Current collaborators and methods
- Submit at ASPB website: Plant Physiology
- Fill out a spreadsheet: The Plant Journal
- Use the NEW TAIR gene functional data submission tool: Journal of Integrative Plant Biology; Plant, Cell and Environment; Journal of Experimental Botany; Plant Science; Environmental Botany; Plant Physiology and Biochemistry

But you can use the tool TODAY! 

Contributing new functional data – as you publish

Contributing new functional data – anytime!

Given by publisher or found online

We welcome data from ALL your publications . . . but please add them one at a time

AT2G01830 WOL Wooden Leg

16753566

Adding gene function annotations

kinase

But I actually know that it is a histidine kinase . . .

Try entering a different search term

Adding gene function annotations

histidine

Is there an even more specific / appropriate term?

Check TAIR Keywords

Choosing the best term


Providing an experimental method

in vitro

But what if my term or method does not appear?

Entering new terms and methods

kinase sextuple mutant

Adding additional information

Do I have more molecular function data about THIS gene in THIS paper?

kinase

Yes!

Nope!


in vitro assay


What “Other” information can I add?

• Mutant phenotype information

• Identity of other loci in double / triple / quadruple mutants, etc.

• Description of gene

• Any other free text information

The wol-8 EMS mutant (CS07856) has a point mutation in the first exon that introduces a premature stop codon. The roots of mutant plants fail to respond to the exogenous application of cytokinin.

Covering all the data

Do I have any more information to add about OTHER genes in THIS paper?

Yes!

Entering the data into the database

Nope, no information about OTHER genes in THIS paper?

Please e-mail us with any questions or problems during or after submitting your data:

curator@arabidopsis.org


Something is better than nothing . . .
- If you don’t have time to hunt around for the perfect term or method, please just give us what you can

But, if possible, please try to be . . .
- As complete as possible, e.g. if it’s a kinase, also add that it’s involved in the biological process of phosphorylation
- As specific as possible, e.g. use potassium transporter instead of transporter

Benefits of good annotation
- Better understanding of individual gene functions
- Better categorization / analysis of large-scale data sets
- Better functional predictions for newly sequenced genomes

Tips for gene function data submission
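The "as complete as possible" tip can be pictured as a simple check over the annotations for one paper. This is only a sketch under stated assumptions: the `Annotation` fields and the `SUGGESTED_PROCESS` table are hypothetical and are not TAIR's schema or submission format, and the reference value is assumed to be the paper identifier shown in the submission example.

```python
from dataclasses import dataclass


# Hypothetical record layout for illustration; not TAIR's actual schema.
@dataclass
class Annotation:
    locus: str      # AGI locus code
    aspect: str     # "molecular function", "biological process", ...
    term: str       # the function, process, or component term
    method: str     # experimental method
    reference: str  # paper identifier (assumed here)


# Illustrative pairing from the tip: a kinase function implies phosphorylation.
SUGGESTED_PROCESS = {"kinase": "phosphorylation"}


def missing_process_terms(annotations):
    """Suggest process terms implied by function terms but not yet entered."""
    entered = {(a.locus, a.term) for a in annotations
               if a.aspect == "biological process"}
    suggestions = []
    for a in annotations:
        if a.aspect != "molecular function":
            continue
        for keyword, process in SUGGESTED_PROCESS.items():
            if keyword in a.term and (a.locus, process) not in entered:
                suggestions.append((a.locus, process))
    return suggestions


records = [
    Annotation("AT2G01830", "molecular function", "histidine kinase",
               "in vitro assay", "16753566"),
]
print(missing_process_terms(records))  # [('AT2G01830', 'phosphorylation')]
```

Once a matching biological process annotation is added for the same locus, the function returns an empty list.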

Vandepoele et al, 2009

BAR

TAIR

Contributing new functional data – anytime!

Many other data types still welcome!

How can all this gene/protein function information be put to good use?

Use Gene Search to find genes . . .
- involved in a specific biological process
- with a particular molecular function
- found in a specific compartment
- expressed in a particular place and/or during a specific developmental phase

Enter keywords for GO or PO terms

Can limit by evidence codes

Finding the gene(s) you want . . .

Use Gene Search to find genes . . .
- involved in the same biological process
- with the same molecular function
- found in the same compartment
- expressed in the same place and/or at the same time

Enter keywords for GO or PO terms

Can limit by evidence codes

Enter search terms / keywords for gene descriptions

Enter search terms / keywords for mutant phenotype

Finding the gene(s) you want . . .
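The keyword-plus-evidence-code search described above can be mimicked on a local list of annotations. This is a toy sketch, not the Gene Search implementation: the record layout, the `search_annotations` function, and the placeholder loci (GENE_A, GENE_B, GENE_C) are all made up for illustration. IDA and IEA are standard GO evidence codes (direct assay and electronic annotation).

```python
def search_annotations(annotations, keyword, evidence_codes=None):
    """Return loci whose term contains keyword, optionally limited by evidence code."""
    keyword = keyword.lower()
    hits = set()
    for record in annotations:
        if keyword not in record["term"].lower():
            continue
        # An evidence_codes of None means "do not limit by evidence".
        if evidence_codes is not None and record["evidence"] not in evidence_codes:
            continue
        hits.add(record["locus"])
    return sorted(hits)


# Toy records with placeholder loci; not real TAIR annotations.
records = [
    {"locus": "GENE_A", "term": "phosphate transport", "evidence": "IDA"},
    {"locus": "GENE_B", "term": "phosphate transport", "evidence": "IEA"},
    {"locus": "GENE_C", "term": "chloroplast", "evidence": "IDA"},
]

print(search_annotations(records, "phosphate"))           # ['GENE_A', 'GENE_B']
print(search_annotations(records, "phosphate", {"IDA"}))  # ['GENE_A']
```

Limiting to experimental codes such as IDA drops the electronically inferred hit, which is the effect of the evidence code filter.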


Putting gene/protein functional data to use

Adding value to community-generated gene families Over 150 gene families have been submitted by researchers

Working with large data sets


Attach data to your favorite protein family:

Adding information to data sets

Generate a .txt file of AGI locus codes
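One way to produce such a file is sketched below. It assumes a plain text file with one locus code per line; the function name `write_locus_file` and the file name `loci.txt` are hypothetical, and the exact format a given tool expects should be checked on its upload page.

```python
from pathlib import Path


def write_locus_file(codes, path):
    """Write one upper-cased AGI locus code per line, skipping blanks and repeats."""
    cleaned = []
    for code in codes:
        code = code.strip().upper()
        if code and code not in cleaned:
            cleaned.append(code)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


if __name__ == "__main__":
    # Mixed case, stray spaces, and a repeat are all tidied before writing.
    written = write_locus_file(["At2g01830", "AT3G02260 ", "at2g01830"], "loci.txt")
    print(written)  # ['AT2G01830', 'AT3G02260']
```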


Finding “related” genes in other species

Connecting to other species


save as text file


We are here to help: www.arabidopsis.org

Please use the data we provide

Please use the tools we provide

Please use TAIR to help improve your research!

Please contact us if we can be of any help!
- Make an appointment to meet with us during the conference
- Please come visit our exhibitor booth – 219 – Plant Genome Resources!
- Please stop by poster 14022 tomorrow night

curator@arabidopsis.org

www.arabidopsis.org

www.twitter.com/tair_news

http://www.facebook.com/tairnews

(432 fans so far . . . )

Acknowledgements

TAIR

Current Curators:

- Tanya Berardini (lead curator – functional annotation)

- David Swarbreck (lead curator – structural annotation)

- Peifen Zhang (Director and lead curator – metabolism)

- Philippe Lamesch (curator)

- Donghui Li (curator)

- Debbie Alexander (curator)

- A. S. Karthikeyan (curator)

- Marga Garcia (curator)

- Leonore Reiser (curator)

Eva Huala (Director)

Sue Rhee (Co-PI)

Tech Team Members:

- Bob Muller (Manager)

- Larry Ploetz (Sys. Administrator)

- Raymond Chetty

- Cynthia Lee

- Shanker Singh

- Chris Wilks

Recent Past Contributors:

- Anjo Chi (tech team)

- Vanessa Kirkup (tech team)

- Tom Meyer (tech team)

- Rajkumar Sasidharan (curator)

Department of Plant Biology
