Mary Ann Tuli Advisory Board Meeting, CSHL 2005
WormBase and the CGC
Mary Ann Tuli
Growth of genetic data
[Chart: counts (axis 0 to 14,000) of CGC gene names, gene classes, alleles, strains and multipoint data for releases WS120 (Mar 2004), WS130 (Aug 2004), WS140 (Mar 2005) and WS150 (Oct 2005)]
WBGene
- ?Gene model introduced in April 2004 (WS124)
- Name server – streamlining gene tracking
Gene Classes
- 200 classes with no members
- 1692 CGC names not connected to sequences
- let and seven TM receptors are the largest gene classes
[Chart: number of gene classes (axis 1,250 to 1,650)]
CGC Gene Names
[Chart: number of CGC gene names (axis 0 to 7,000) for releases WS100 (May 2003), WS110 (Oct 2003), WS120 (May 2004), WS130 (Aug 2004), WS140 (Mar 2005) and WS150 (Oct 2005)]
Gene Naming Pipeline
[Diagram with nodes: Web Form, Submitter, Geneace Curator, CGC]
Developments – Tracking gene names
Before:
  Gene_name: abu-1
  Gene_class: abu-1
  Other_name: pqn-1
  Remark: "pqn-1 is Other_name of abu-1 and has been merged into it"
After:
  Gene_name: abu-1
  Gene_class: pqn
  Other_name: pqn-1
  Old_member: pqn-1
  Gene_name: pqn-1
  Former_member_of: pqn
- Former_member_of and Old_member introduced in WS144
- WS150 = 663 CGC Other_names in 291 gene classes
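
The Before/After records can be read as a bookkeeping rule: when a CGC name is retired in favour of another, the surviving gene keeps it as an Other_name, and the gene class it came from records it as an Old_member. A minimal Python sketch of that rule follows. The dictionaries, `class_of` and `retire_name` are hypothetical illustrations, not the ACeDB models or any WormBase script. Only the tag names come from the slides, and storing Former_member_of on the surviving gene is an assumption made for this sketch.

```python
# Hypothetical in-memory stand-ins for Gene and Gene_class records.
# Only the tag names (Other_name, Old_member, Former_member_of) come from the slides.


def class_of(name):
    """Return the gene class prefix of a CGC-style name, e.g. 'pqn' for 'pqn-1'."""
    return name.rsplit("-", 1)[0]


def retire_name(genes, classes, old_name, kept_name):
    """Record that old_name now refers to the gene called kept_name."""
    gene = genes[kept_name]
    gene.setdefault("Other_name", []).append(old_name)

    old_class = class_of(old_name)
    # The retired name stays discoverable from its former class.
    classes.setdefault(old_class, {}).setdefault("Old_member", []).append(old_name)
    # Assumption: the link back to the former class is kept on the surviving gene.
    gene.setdefault("Former_member_of", {})[old_name] = old_class
    return gene


genes = {"abu-1": {"Gene_name": "abu-1"}}
classes = {"abu": {}, "pqn": {}}
retire_name(genes, classes, old_name="pqn-1", kept_name="abu-1")
print(genes["abu-1"])
print(classes["pqn"])
```

Keeping the history as structured tags rather than a free-text Remark is what makes a count such as "663 CGC Other_names in 291 gene classes" something that can be queried directly.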
Developments - Status
Before:
- Live tag only in ?Gene model
- Absence implied object was Dead
- Difficult to differentiate between different statuses
After:
- Status tag introduced in Gene and Variation models (WS144)
- Live, Dead or Suppressed
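
The difference between the two conventions can be illustrated with a small Python sketch. The record dictionaries and function names here are hypothetical and are not the ACeDB schema. Only the three status values are taken from the slide.

```python
from enum import Enum


class Status(Enum):
    LIVE = "Live"
    DEAD = "Dead"
    SUPPRESSED = "Suppressed"


def status_before(record):
    """Old convention: a Live tag or nothing; absence is read as Dead."""
    return Status.LIVE if record.get("Live") else Status.DEAD


def status_after(record):
    """New convention: an explicit Status value; a missing value raises KeyError."""
    return Status(record["Status"])


suppressed = {"Status": "Suppressed"}
# Under the explicit convention the record is recognised as Suppressed.
print(status_after(suppressed))
# Under the old convention the same record has no Live tag, so it looks Dead.
print(status_before(suppressed))
```

With only a Live flag, Dead and Suppressed objects collapse into the same "not Live" case, which is the ambiguity the explicit tag removes.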
The Variation Class
[Diagram relating the Locus Class and Allele Class to the Variation Class (WS140)]
- Type of Variation
  - Deletion
  - Insertion_and_deletion
  - Insertion
  - Substitution
  - Mos_insertion
  - Transposon_insertion
  - SNPs
Growth in Allele Data
[Chart: number of alleles (axis 0 to 14,000), Total and Knockouts, for WS120, WS130, WS140 and WS150]
- Nearly 10,000 manually curated alleles
- Most have at least a gene connection
- Many have details of the strain carrying the mutation
- 1500 have rich annotation
  - Description of lesion
  - Connection to sequence
- Submission of Plasterk high-throughput chemical mutagenesis/sequencing will result in many new alleles
Allele Submission Pipeline
[Diagram with nodes: Web Form, Submitter, NBP, Geneace Curator]
Knockout Alleles
[Chart: knockout alleles (axis 0 to 1,400) for WS120, WS130, WS140 and WS150]
- Mark Edgley
- Jeff Holmes
[Chart: NBP knockout alleles (axis 0 to 1,800)]
- Shohei Mitani
Knockout Alleles - plans
- Possible Web form for collaborators to upload data
- Advantages
  - Onus on user to provide accurate data
  - More efficient way for us to convey changes in database conventions
Strain Data
[Chart: number of strains (axis 0 to 10,000)]
- Sent periodically to WormBase by Theresa Stiernagle
- Leads to merges of gene names and sequences
- Leads to updates of tag- genes
Strain Data – tag gene class
- All genes with KO alleles should have a name that follows the nomenclature recommendations, e.g. unc-12, not R09B3.4
- tag- genes were assigned, but the list kept growing
- New tag- genes are no longer assigned
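
The distinction between a recommended name such as unc-12 and a sequence name such as R09B3.4 can be checked mechanically. The sketch below is a rough illustration only. Both regular expressions are simplified assumptions that cover the two examples on the slide, not the full nomenclature rules, and `name_kind` is a hypothetical helper.

```python
import re

# Simplified, assumed patterns; the real naming rules have more cases.
CGC_NAME = re.compile(r"^[a-z]{3,4}-\d+$")        # e.g. unc-12, tag-5
SEQUENCE_NAME = re.compile(r"^[A-Z0-9]+\.\d+$")   # e.g. R09B3.4


def name_kind(name):
    """Classify a gene identifier as a CGC name, a tag- placeholder or a sequence name."""
    if CGC_NAME.match(name):
        return "tag placeholder" if name.startswith("tag-") else "CGC name"
    if SEQUENCE_NAME.match(name):
        return "sequence name"
    return "unknown"


for example in ["unc-12", "R09B3.4", "tag-5"]:
    print(example, "->", name_kind(example))
```

A check like this could flag genes with KO alleles that are still known only by a sequence name.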
Laboratory Data
[Chart: number of labs (axis 460 to 550) for WS120, WS130, WS140 and WS150]
- Laboratory data sent from the CGC and Caltech
Multipoint Data
[Chart: multipoint data (axis 0 to 5,000), Total and Inferred, for WS120, WS130, WS140 and WS150]
- Process of adding inferred multi_pt_data continues
- Script in Jan 2004 to add inferred data
- 1996: ~1,300 genetic marker loci
- Mar 2004: 2,500 markers
- Oct 2005: 4,000 markers
The Genetic Map
- Recent transfer of knowledge from Jonathan Hodgkin and Richard Durbin is enabling WormBase to update the genetic map when new information becomes available.
The end of the CGC contract
- Subcontract between CGC and Oxford (Jonathan Hodgkin) runs until May 2007.
- WormBase needs to prepare for this.
Future Plans
- Continue to ensure timely incorporation of all data, including alleles!
- Streamline submission processing
  - Update Web forms
  - Improve scripts
- Improve models
Collaborators
- The CGC
  - Jonathan Hodgkin
  - Bob Herman & Theresa Stiernagle
- The Knockout Consortium
  - Mark Edgley
  - Jeff Holmes
- National BioResource Centre, Japan
  - Shohei Mitani