Date post: 04-Jun-2020
The i5k Workspace@NAL: a pan-Arthropoda Genome Database Chris Childers and Monica Poelchau USDA-ARS, National Agricultural Library
Transcript
Page 1: The i5k Workspace@NALi5k.github.io/webinar_slides/i5k_webinar-i5k_workspace_Oct04-2017.… · for genome sequencing and curation; and seek funding. •The i5k Workspace@NALis available

The i5k Workspace@NAL: a pan-Arthropoda Genome Database

Chris Childers and Monica Poelchau, USDA-ARS, National Agricultural Library

Page 2

Outline

• Background and overview
• Why join the i5k Workspace?
• What do we need for a project?
• What do we do with your data?
• What don’t we do with your data?
• Our new system for submitting projects and data

Page 3

Background

• The i5k initiative tasked itself with coordinating the sequencing and assembly of 5000 insect or related arthropod genomes
• International effort to prioritize insect genomes for sequencing; provide guidelines for genome sequencing and curation; and seek funding.
• The i5k Workspace@NAL is available to help any i5k (arthropod) project with genome hosting needs

Genome Project Trajectory

• Research plan
• Generate material for sequencing
• Genome sequencing
• Genome assembly
• Automated annotation of genome assembly
• Biological insights / Publication

• Manual curation
• Official gene set (OGS) generation
• Genome project maintenance

Page 4

Workspace Project Basics

• The i5k Workspace centers around projects.
  • A project is a collection of data based on the genome assembly of an arthropod
  • All data is used in the context of the genome assembly
• Each project has a project coordinator.
  • Serves as the point of contact for questions about the project
  • Main responsibility: approve or reject new Apollo users
• All of our data is user-submitted

Page 5

Why join the i5k Workspace?

• Gain access to a large diverse community
  • A diversity of organisms
    • 58 species and counting
    • 20% of the arthropods with genome assemblies at NCBI
  • Large user community with many different interests
    • People versed in the biology of specific systems
    • Experts in a species or group of species
• A common interface for accessing data, tools and search
• Detailed policies on data and project management
  • Helpful if you have data management requirements
  • Data management
    • https://i5k.nal.usda.gov/data-management-policy
  • Long-term project management
    • https://i5k.nal.usda.gov/long-term-i5k-workspace-project-management

Page 6

What do we need for a project?

• Your project metadata
  • Information about your organism
  • Metadata for submitted data files (the more the better)
    • What tools or methods were used
    • Software versions and options set
    • When and where the data were generated
    • Other information (location collected, life-stage, etc.)
• Your data files
  • Genome assembly needs to be in GenBank/ENA/DDBJ
  • Data should be open access (no private repositories)
  • Additional datasets need to be mapped to the same assembly

Page 7

What do we do with your data?

• Create resources
  • Organism and gene pages
  • Data downloads
• Integrate your data with our tools
  • Genome browser
  • BLAST, Clustal, HMMer
  • Apollo for gene curation
• Offer post-curation services
  • Annotation QC and Official Gene Set (OGS) creation
  • Update gene pages, Apollo, BLAST with OGS

Page 8

Workspace@NAL: https://i5k.nal.usda.gov/

Submission

• ‘Frozen’ genome assembly
• Automated annotations
• Ancillary data files (e.g. RNA-Seq alignments)

Resources

• Organism Information Page
• Bulk data downloads
• Tutorials

Tools

• Custom BLAST interface
• Apollo manual curation tool
• JBrowse genome browser
• HMMer
• Clustal

Services

• Manual annotation quality control
• Official gene set generation

Challenges

• Non-standard data formatting
• Failure to submit all metadata (ex: sample origin; analysis methods)

Page 9

What don’t we do with your data?

• Computationally intense analyses such as
  • Gene prediction
  • Raw RNAseq mapping
• We are not a long-term archive or repository
  • NCBI
  • Ag Data Commons
  • Dryad Digital Repository
  • CyVerse Data Commons
  • Many other options available

Page 10

Criteria for starting a project

• You need to have an arthropod genome assembly, accessioned by NCBI (or another INSDC member)
  • Using GenBank's accession numbers avoids confusion about assembly version
  • The GenBank contamination screen improves the assembly quality
  • Using a stable assembly is beneficial for the labor-intensive community annotation process

Page 11

Other things to consider before submitting

• All data submitted to the i5k Workspace is public.
  • However, we do state whether Ft. Lauderdale/Toronto agreements of data sharing should apply
• Is your genome an ‘orphan’, or is there another suitable database?
  • We can host genomes that are already hosted elsewhere, and actively communicate with other database providers
  • All manual annotation efforts need to be at one database

Page 12

Getting an account

• Apply for a dataset submission account: https://i5k.nal.usda.gov/register/project-dataset/account
• Once your account is approved, you can submit projects, assemblies or other datasets

Page 13

Start an i5k Workspace Project

• Log in
  • https://i5k.nal.usda.gov/user
• From menu, select ‘Data -> Submit data -> Request a new i5k Workspace Project’
  • https://i5k.nal.usda.gov/datasets/request-project
• We’ll review your submission and will get in touch with you

Page 14

Submit your genome assembly

• All information submitted through this form will be re-formatted for display at the i5k Workspace (except for email address and file checksum)
• From menu, select ‘Data -> Submit data -> Submit a genome assembly’
  • https://i5k.nal.usda.gov/datasets/assembly-data
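
The submission forms mention a file checksum, but the slides do not say which algorithm is expected. The following is a minimal sketch, assuming an MD5 digest is acceptable (an assumption; the form itself should be checked), of how a checksum could be computed before upload. The function name `checksum_stream` is hypothetical and not part of any i5k tool.

```python
import hashlib
import io


def checksum_stream(stream, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a binary stream, read in chunks.

    The default algorithm is an assumption; the slides only say
    "file checksum" without naming one.
    """
    digest = hashlib.new(algorithm)
    # Read fixed-size chunks so large assemblies never sit fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Usage with an in-memory stand-in for an assembly FASTA file.
demo = io.BytesIO(b">scaffold1\nACGTACGT\n")
print(checksum_stream(demo))
```

For a real file, the same function can be called on a handle opened in binary mode, e.g. `with open(path, "rb") as handle: checksum_stream(handle)`.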

Page 15

Submit gene predictions

• All information submitted through this form will be re-formatted for display at the i5k Workspace (except for email address and file checksum)
• Under menu bar, select ‘Data -> Submit data -> Submit Gene Predictions’
  • https://i5k.nal.usda.gov/datasets/gene-prediction

Page 16

Submit mapped datasets

• All information submitted through this form will be re-formatted for display at the i5k Workspace (except for email address and file checksum)
• Under menu bar, select ‘Data -> Submit data -> Submit a Mapped Dataset’
  • https://i5k.nal.usda.gov/datasets/mapped

Page 17

Send us your files

• There are currently five ways to share files with us:
  1. Use our data submission forms
  2. Transmit the file via ftp (only for files < 2 Gb)
  3. Email it to us (for files < 25 Mb only)
  4. Provide us with a URL, if available
  5. Upload the file to CyVerse and share with our organization, “NALBioinformatics”
• We prefer that you share your files with us via our data submission forms.
• For more information, see https://i5k.nal.usda.gov/content/sharing-files-us
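
The size limits above can be turned into a small helper that lists which sharing routes are open for a given file. This is only a sketch: it assumes "Gb" and "Mb" on the slide mean gigabytes and megabytes (decimal), and the function name and route labels are hypothetical.

```python
# Size limits quoted on the slide; the decimal byte interpretation is an assumption.
FTP_LIMIT_BYTES = 2 * 10**9
EMAIL_LIMIT_BYTES = 25 * 10**6


def sharing_options(size_bytes, has_public_url=False):
    """List the sharing routes from the slide that fit a file of this size."""
    options = ["data submission form"]  # the preferred route, listed first
    if size_bytes < FTP_LIMIT_BYTES:
        options.append("ftp")
    if size_bytes < EMAIL_LIMIT_BYTES:
        options.append("email")
    if has_public_url:
        options.append("URL")
    options.append("CyVerse")
    return options


# A 300 MB alignment file: too large for email, small enough for ftp.
print(sharing_options(300 * 10**6))
```

The slide gives no size limit for the submission forms or CyVerse, so the sketch always includes them.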

Page 18

Other resources at the NAL: the Ag Data Commons

• Hosts any dataset funded by the USDA
  • Landing page
  • Citable DOI
  • https://data.nal.usda.gov/
  • 9 i5k datasets already available

Page 19

Need more information?

i5k Workspace@NAL:
• https://i5k.nal.usda.gov/
• https://github.com/NAL-i5K/

The i5k initiative:
• New website: http://i5k.github.io/

Page 20

Official Gene Set creation at the i5k Workspace

Page 21

Official Gene Set creation at the i5k Workspace

• Official Gene Set definition
• Our OGS generation process
  • Manual and community annotation
  • Quality control
  • Merge
  • Release
• Examples and future directions of the OGS generation process

Page 22

The Official Gene Set – what is it?

• Loose definition: The best known representation of gene models for a genome assembly
• When the i5k Workspace generates an OGS, this is a merge between one gene set (usually computationally predicted), and a set of manually validated annotations (usually from the Apollo software)

Page 23

Why generate an Official Gene Set?

• This depends on your genome community’s needs.
• If several groups want to perform downstream analyses, it helps to have an authoritative ‘reference gene set’ for your community, rather than multiple competing gene sets

Page 24

Our OGS generation process

• New public version of program is available: https://github.com/NAL-i5K/GFF3toolkit (Mei-Ju Chen, Li-Mei Chiang)
• The full process is time-consuming, but we are generally available to perform OGS generation for i5k Workspace projects

1. Manual annotation (via Apollo)
2. Error checking Curator fixes
3. Merge with one designated gene set
4. Release Official Gene Set

Manual annotation freeze

Page 25

1. Manual and community annotation

What is manual annotation?

• Manual review and improvement of an existing gene prediction
• Often, but not always: drawing on external evidence (e.g. RNA-Seq, cDNA, genes from other species) to improve a computationally predicted gene model

Structural annotation – e.g. modify exons

Functional annotation – e.g. add name

Page 26

1. Manual and community annotation

Why manually annotate?

• “Incorrect annotations poison every experiment that makes use of them… Worse still, the poison spreads because incorrect annotations from one organism are often unknowingly used by other projects to help annotate their own genomes.”
  • Yandell and Ence 2012, doi:10.1038/nrg3174
• Link gene models to existing literature and ontologies, providing richer data
• One current ‘model’ of the genome paper often draws heavily from insights confirmed by manual annotation

Page 27

1. Manual and community annotation

• What is community annotation?
  • Scientists collectively examine and improve gene models (usually computationally predicted)
• Community annotation at the i5k Workspace:
  • Access to a large community of curators
  • Tutorials, guidelines, webinars
  • Registration mechanism for new annotators
  • One-on-one support
  • Over 400 registered annotators have curated over 10,000 gene models using the Apollo software

Page 28

1. Manual and community annotation – i5k pilot example

Number of curators per organism. Community size varies among organisms.

Number of organisms per curator. 35% of curators worked on more than one organism

Page 29

1. Manual and community annotation – i5k pilot example

• Three organisms that completed the manual annotation process required similar amounts of structural changes to their computationally predicted gene annotations
• Computationally predicted genes often have inaccurate gene structures
• Community annotation can effectively improve gene sets

Organism                      Total number of manually    Proportion of manually annotated
                              annotated models            models with structural changes
Anoplophora glabripennis [6]  1144                        0.75
Cimex lectularius [7]         1354                        0.76
Oncopeltus fasciatus          1518                        0.76

Page 30

2. OGS generation – Quality Control

• Manual curation can introduce many errors, even using standard software packages (e.g. Apollo)
• QC program identifies common formatting errors from the manual curation process
  • GitHub repo: https://github.com/NAL-i5K/GFF3toolkit
  • Identifies over 50 error types
• Another in-house pipeline corrects many of these errors

Page 31

2. OGS generation – Quality Control

• Requires some manual review – can’t be completely automated
  • e.g. did you name your gene model ‘test’ or ‘Contig277’?
• Note that i5k Workspace staff aren’t ‘curators’ in the traditional sense – we do not review the biological validity of any of the community-annotated models.
• The degree of manual review of community annotations is higher if Official Gene Sets are to be submitted to NCBI
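
The naming problem mentioned above (‘test’, ‘Contig277’) is the kind of check that can be flagged automatically and then passed to a person for review. The sketch below is illustrative only and is not the GFF3toolkit implementation; the patterns, the function name `suspicious_names`, and the example IDs are all assumptions.

```python
import re

# Hypothetical patterns for placeholder names; a real QC list would be longer.
PLACEHOLDER_PATTERNS = [
    re.compile(r"^test\d*$", re.IGNORECASE),
    re.compile(r"^(contig|scaffold)[_-]?\d+$", re.IGNORECASE),
]


def suspicious_names(names_by_model_id):
    """Return the models whose name is empty or looks like a placeholder."""
    flagged = {}
    for model_id, name in names_by_model_id.items():
        cleaned = name.strip()
        if not cleaned or any(p.match(cleaned) for p in PLACEHOLDER_PATTERNS):
            flagged[model_id] = name
    return flagged


# Made-up annotations: two placeholder names, one real name, one empty name.
annotations = {
    "model_001": "test",
    "model_002": "Contig277",
    "model_003": "hexokinase",
    "model_004": "",
}
for model_id in sorted(suspicious_names(annotations)):
    print(model_id, "needs annotator feedback")
```

A check like this only flags candidates; deciding on the correct name still needs the annotator, which matches the point that QC cannot be completely automated.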

Page 32

2. OGS generation – Quality Control

• Diaphorina citri example (Database, doi:10.1093/database/bax032)
  • First round of corrections for community curation:
    • 513 errors in 587 manually annotated models
    • 397 of these errors needed curator feedback
  • Second round of corrections:
    • 15 errors needed annotator feedback

Error checking Curator fixes

Page 33

3. OGS generation – Merge

The GFF3toolkit Merge program can identify which gene models in the ‘reference’ gene set should be replaced by gene models in a second gene set (i.e. the manually annotated models) via ‘auto-assignment’.

Reference gene

Manually annotated gene

Page 34

3. OGS generation – Merge

• Auto-assignment uses both sequence similarity and coordinate overlap
  • Extract CDS and pre-mRNA sequences from mRNA features from both gene sets.
  • Use blastn to determine which sequences from the modified and reference gene set align to each other in their coding sequence.
    • These parameters are used: -evalue 1e-10 -penalty -15 -ungapped
  • If two models pass the alignment step, check that matched models also have coordinate overlap
  • Add a ‘Replace Tag’ with the ID of each overlapping model to the modified gene set.
  • If no reference model overlaps with a new model, then the program will add 'replace=NA'.
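
The auto-assignment steps above can be sketched as follows. This is a simplified illustration, not the GFF3toolkit code: the `Model` class, the function names, the FASTA file names, and the same-strand requirement in the overlap test are assumptions. Only the three blastn options come from the slide; `-query` and `-subject` are standard blastn arguments. The alignment step is represented by a precomputed set of ID pairs rather than an actual BLAST run.

```python
from dataclasses import dataclass

# blastn options quoted on the slide.
BLASTN_OPTIONS = ["-evalue", "1e-10", "-penalty", "-15", "-ungapped"]


def blastn_command(query_fasta, subject_fasta):
    """Build (but do not run) a blastn command line with the slide's options."""
    return ["blastn", "-query", query_fasta, "-subject", subject_fasta, *BLASTN_OPTIONS]


@dataclass(frozen=True)
class Model:
    model_id: str
    seqid: str
    strand: str
    start: int  # 1-based, inclusive (GFF3 convention)
    end: int


def overlaps(a, b):
    """True if two models share at least one base on the same sequence and strand."""
    return (a.seqid == b.seqid and a.strand == b.strand
            and a.start <= b.end and b.start <= a.end)


def assign_replace_tags(new_models, reference_models, aligned_pairs):
    """Map each new model ID to a replace value.

    aligned_pairs holds (new_id, reference_id) pairs that passed the
    alignment step; a pair counts only if the models also overlap.
    """
    tags = {}
    for new in new_models:
        hits = [ref.model_id for ref in reference_models
                if (new.model_id, ref.model_id) in aligned_pairs and overlaps(new, ref)]
        tags[new.model_id] = ",".join(hits) if hits else "NA"
    return tags


# Made-up coordinates: new_A overlaps ref_1; new_B aligns to ref_2 but sits elsewhere.
ref_1 = Model("ref_1", "scaffold1", "+", 100, 500)
ref_2 = Model("ref_2", "scaffold1", "+", 900, 1200)
new_a = Model("new_A", "scaffold1", "+", 150, 480)
new_b = Model("new_B", "scaffold2", "+", 10, 90)
aligned = {("new_A", "ref_1"), ("new_B", "ref_2")}
print(assign_replace_tags([new_a, new_b], [ref_1, ref_2], aligned))
print(" ".join(blastn_command("new_cds.fa", "ref_cds.fa")))
```

Requiring both an alignment hit and coordinate overlap keeps paralogs elsewhere in the genome from being marked as replacements.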

Page 35

3. OGS generation – Merge

• The program determines merge actions based on the Replace Tags:
  1. deletion
  2. simple replacement
  3. new addition
  4. split replacement
  5. merge replacement
• Models from modified manual annotations replace models from reference annotations based on merge actions in step 2.

Reference gene

Updated gene

Merge replacement
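
One way to read the five merge actions is as a classification over the Replace Tags. The sketch below is an interpretation, not the GFF3toolkit logic: it assumes that a model replacing several reference models is a merge, that several models replacing the same reference model are a split, and that deletions are marked separately (here through a hypothetical `deleted` set of model IDs).

```python
from collections import defaultdict


def classify_merge_actions(replace_tags, deleted=frozenset()):
    """Assign one of the five merge actions to each manually annotated model.

    replace_tags maps a new model ID to the list of reference model IDs it
    replaces; an empty list stands for replace=NA.
    """
    # Count how many non-deleted new models point at each reference model.
    new_per_reference = defaultdict(list)
    for new_id, refs in replace_tags.items():
        if new_id in deleted:
            continue
        for ref_id in refs:
            new_per_reference[ref_id].append(new_id)

    actions = {}
    for new_id, refs in replace_tags.items():
        if new_id in deleted:
            actions[new_id] = "deletion"
        elif not refs:
            actions[new_id] = "new addition"
        elif len(refs) > 1:
            # One new model spans several reference models.
            actions[new_id] = "merge replacement"
        elif len(new_per_reference[refs[0]]) > 1:
            # Several new models share one reference model.
            actions[new_id] = "split replacement"
        else:
            actions[new_id] = "simple replacement"
    return actions


# Made-up Replace Tags covering all five actions.
tags = {
    "new_1": ["ref_1"],
    "new_2": [],
    "new_3": ["ref_2", "ref_3"],
    "new_4": ["ref_4"],
    "new_5": ["ref_4"],
    "new_6": ["ref_5"],
}
for model_id, action in classify_merge_actions(tags, deleted={"new_6"}).items():
    print(model_id, action)
```

Mixed cases, such as a model that merges two reference genes while also sharing one of them with another new model, are labeled as merges here; a production tool would need an explicit rule for them.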

Page 36

3. OGS generation – Merge

• Diaphorina citri example (Database, doi:10.1093/database/bax032)
  1. # genes deleted: 1
  2. # genes with simple replacement: 437
  3. # genes added: 72
  4. # genes split: 38
  5. # genes merged: 31
  6. Total number of genes in OGS: 20,217

Page 37

3. OGS generation – Merge

• Other software tools can be used to merge gene sets
  • Combiner tools that use ‘weights’ for different input annotations, e.g.
    • EVidenceModeler (EVM, https://evidencemodeler.github.io/)
    • Glean (https://sourceforge.net/projects/glean-gene/)
  • Other overlap-based replacement tools, e.g. Bedtools intersect (http://bedtools.readthedocs.io/en/latest/)

Page 38

4. OGS generation – Release OGS

• Generate new or maintain old gene model IDs
• Establish release date with genome coordinator
• Generate fasta files
• Add to i5k Workspace@NAL database
• *Submit to NCBI if requested by genome coordinator*

Page 39

Completed OGS projects using i5k Workspace’s pipeline

• Diaphorina citri OGSv1.0
• Frankliniella occidentalis OGSv1.0
• Hyalella azteca OGSv1.0
• Oncopeltus fasciatus OGSv1.2
• Athalia rosae OGSv1.0
• Orussus abietinus OGSv1.0
• Leptinotarsa decemlineata OGSv1.0

Page 40

Future updates

• Current improvements:
  • GFF3toolkit support for QC and merge of non-coding transcripts (Li-Mei Chiang)
• Future work:
  • Improve methods for merging multi-isoform models
  • Improve QC process – how to improve communications about errors with annotators

Page 41

Questions?

i5k Workspace@NAL:
• https://i5k.nal.usda.gov/
• https://github.com/NAL-i5K/
• GFF3toolkit issue tracker: https://github.com/NAL-i5K/GFF3toolkit/issues
• Email: [email protected]

Page 42

Thank you!

The NAL Team

• Yu-yu Lin
• Chaitanya Gutta
• Li-Mei Chiang
• Yi Hsiao
• Gary Moore
• Susan McCarthy

i5k Workspace alumni

• Chien-Yueh Lee
• Han Lin
• Jun-Wei Lin
• Vijaya Tsavatapalli
• Mei-Ju Chen
• Chao-I Tuan

i5k Workspace@NAL advisory committee

• i5k Coordinating Committee
• i5k Pilot Project
• Apollo & JBrowse Development Teams
• GMOD/Tripal community
• All of our users and contributors!

Page 43

OGS generation – the GFF3toolkit

Page 44

The Replaced Models field

• We use the information in this field to generate a merged, non-redundant gene set from the manually curated models and the official or primary gene set
• Your official or primary gene set is listed in the category field of the track selector
• If you don’t know what your project’s gene set is, contact us!

https://i5k.nal.usda.gov/apollo-replaced-models-field-explanations-and-examples

Replaced Models field

Page 45

Community annotation lifecycle (end goal: OGS)

1. Genome sequencing, assembly and annotation
2. Community building: conference calls and training
3. Manual annotation via Apollo
4. Manual annotation ‘freeze’
5. General QC (NAL)
6. Official Gene Set generation (merge of manual annotations and reference gene set)

