EMBL-ABR HEADS OF NODES MEETING 2-2016 WEDNESDAY 16 NOVEMBER 2016 AT 1:00PM – 2:00PM AEDT VIA TELECONFERENCE MEETING AGENDA
EMBL-ABR Head of Nodes Group (HoN) Meeting 2-2016 via teleconference
Date: Wednesday 16 November 2016
Time: 1:00pm – 2:00pm (Melbourne, Australian Eastern Daylight Time, AEDT)
AGENDA
DIAL IN DETAILS
Zoom Conference Meeting Topic: EMBL-ABR Heads of Nodes Meeting Number 2-2016 Time: Nov 16, 2016 1:00 PM (GMT+11:00) Australia/Melbourne
Join from PC, Mac, iOS or Android: https://unimelb.zoom.us/j/337935164
Or join by phone: Dial: +61 2 8015 2088 Meeting ID: 337 935 164 International numbers available: https://unimelb.zoom.us/zoomconference?m=wm4mIZPNMt31oTi4wMyvtCnjlk_z3ovm
Or join from a H.323/SIP room system:
Dial: SIP:[email protected] or H323:[email protected] (From Cisco) or H323:182.255.112.21##337935164 (From LifeSize or Polycom) or 162.255.36.11 or 162.255.37.11 (U.S.) Meeting ID: 337935164
Location: Local Time
- Victoria: AEDT / 1:00PM
- NSW: AEDT / 1:00PM
- ACT: AEDT / 1:00PM
- Tasmania: AEDT / 1:00PM
- South Australia: ACDT / 12:30PM
- Queensland: AEST / 12:00PM
- Western Australia: AWST / 10:00AM
- Northern Territory: ACST / 11:30AM
DISCUSSION ITEMS
Agenda Item No.
Subject Time
Item 1 Register of presences and welcome (Chair: Vicky Schneider)
EMBL-ABR All Hands meeting agenda (https://www.embl-abr.org.au/all-hands-mtg-2016/)
Confirm attendance at the EMBL-ABR All Hands meeting by all Heads of Nodes, and gauge who is bringing people relevant to specific areas and satellite workshops:
• Registries in Bioinformatics: https://www.embl-abr.org.au/registries-workshop/
• Open Source and Software Development Best Practice: https://www.embl-abr.org.au/bioinformatics-sw-workshop/
• Open and Scalable Training: https://www.embl-abr.org.au/open-scalable-workshop/
Apologies received: Malcolm McConville, Sylvain Forêt, Jac Charlesworth
10 min
Item 2 Minutes from the Previous Meeting
Confirmation of minutes from the previous meeting 1-2016 on 5th October 2016
5 min
Item 3 EMBL-ABR Governance
Planning of a more formal structure in anticipation of funding and formal recognition: governance framework and operating model.
Proposed categories by Dom:
Structure
• Committees structure and charters (SIG, ISAG, Board?)
  o Node representation in decision making
• Network structure
• Control and support functions’ roles
Role
• Role of the Hub (e.g. coordinate and provide national leadership through a consultative governance structure)
• Role of the Nodes (e.g. delivering the activities)
Oversight responsibilities
• Committees authorities and responsibilities
• Hub’s Executives accountability and authority
• Head of Nodes accountability and authority
• Board oversight and responsibilities
• International Science Advisory Group oversight and responsibilities
• Funding bodies oversight and responsibilities?
• Reporting, escalation and veto rights
20 min
Item 4 EMBL-ABR Nodes Criteria
Nodes: what makes a node? What we have so far are the node description forms, which have criteria embedded, though these may not be listed explicitly. We also have feedback from the ISAG about this, which will be distributed shortly by the Executive to each Head of Node.
- Who can apply to become a node?
- What type of scalability do we want to have in place for the future?
Sylvain: How inclusive should the nodes be? Are we aiming for a small group with specific objectives or a wide dissemination in each institution?
Dave E: This is something we should discuss, but I see some nodes being small and focused and others being larger. It is also important to be able to welcome new nodes and retire old ones.
Dom: From comments received from the ISAG, what are the characteristics of a node, of a relevant activity?
Dom: Co-investments by nodes
• Is this a requirement?
• At researcher or institution level?
• How to ensure long-term commitments, sustainability?
• How do nodes secure commitments from institutions? Is this the responsibility of each node?
• Business case to take to DVC(R)s
20 min
Item 5 EMBL-ABR Training and Communications Update (Helen Gardiner)
Announcements of training provided by nodes
• Which training should be promoted as part of EMBL-ABR?
• Sharing training resources
  o Coordination
  o Organisation (e.g. several nodes might have the same topic but with different tools)
  o Branding and copyrights
20 min
Item 6 EMBL-ABR Key Areas Update by Coordinators
• Open Data - Dr Philippa Griffin
• Training - Sonika Tyagi
• Standards - Saravanan Dayalan
• Tools - Nathan Watson Haigh
20 min
Item 7 Any Other Business
• Requirements for a Bioinformatics Infrastructure in Germany for future Research with bio-economic Relevance
• The German Network for Bioinformatics Infrastructure (de.NBI) – a short overview
• Confirmation of chair for next call
• Confirmation of next meeting of EMBL-ABR All Hands - December meeting to be replaced by the face to face and next call in January 2017
5 min
Head of Nodes Group, 2016
Chair – Vicky Schneider, EMBL-ABR Executive
Dave Edwards, UWA Node
Andrew Lonie, EMBL-ABR Executive
Sonika Tyagi, AGRF Node
Saravanan Dayalan, MA Node (nominated alternate)
Ira Cooke, JCU Node
Rob Cook, QCIF Node
Steve Androulakis, Monash Node
Marc Wilkins, UNSW Node
Nathan Watson Haigh, Tools Coordination, EMBL-ABR Key Area Lead
Apologies
Malcolm McConville, MA Node
Sylvain Forêt, ANU Node
Jac Charlesworth, UTAS Node
Observers
Ms Helen Gardiner, Communications Manager, EMBL-ABR Hub
Dr Philippa Griffin, Open Data Coordinator, EMBL-ABR Hub
Alexandra Bennion, Administrative Assistant & Secretariat, EMBL-ABR Hub
https://www.embl-abr.org.au/
Agenda Item 2
EMBL-ABR HEAD OF NODES GROUP (HoN) MEETING 1-2016 ON WEDNESDAY 5 OCTOBER 2016 AT 2:00PM – 3:00PM AEDT VIA TELECONFERENCE CONFIDENTIAL DRAFT MINUTES
EMBL-ABR Head of Nodes Group (HoN) Meeting 1-2016
Wednesday 5 October 2016
Time: 2:00pm – 3:00pm AEDT
Via teleconference
DRAFT MINUTES
Members present:
A/Prof Vicky Schneider (Deputy Director, EMBL-ABR) Chair
A/Prof Andrew Lonie (Director, EMBL-ABR)
Sonika Tyagi (Australian Genome Research Facility)
Sylvain Forêt (Australian National University)
Steve Androulakis (Monash University)
Rob Cook (the Queensland Cyber Infrastructure Foundation)
Saravanan Dayalan (Metabolomics Australia, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne; nominated alternate)
Observers:
Dr Philippa Griffin, Open Data Coordinator, EMBL-ABR Hub
Ms Fiona Kerr, Executive Management & Secretariat, EMBL-ABR Hub
Apologies:
Dr Jac Charlesworth (University of Tasmania)
Prof Dave Edwards (The University of Western Australia)
Prof Marc Wilkins (University of New South Wales)
Prof Malcolm McConville (Metabolomics Australia, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne)
DISCUSSION ITEMS
1. Register of presences, welcome and scope of the calls
A) VS welcomed all to the first HoN group call and encouraged everyone to get involved in activities with the EMBL-ABR group. It would be easier to have one group meeting call per month.
ACTION: Hub to set up regular monthly HoN calls and HoN to ensure they can make it.
B) Quick round of introductions by everyone present.
2. EMBL-ABR Update and Where to Find Information
C) VS reported that the Exec sent the node descriptions to the ISAG to look at as examples of work. The Hub is currently waiting for feedback from the ISAG, which will be relayed to HoN by the Executive.
ACTION: Exec to circulate specific feedback to the individual nodes once it’s received from ISAG
3. EMBL-ABR Communications Update
D) HT (Comms) is working on a 2-page flyer for each node. HT is also working on the web presence of each node on the EMBL-ABR website, and ideally each node will also have a dedicated page on their website that explains their node and links back to the EMBL-ABR website.
ACTION: Exec will circulate 2-pagers and webpage about their nodes for sign off in the upcoming weeks.
4. EMBL-ABR Nodes Update – Head of Nodes
E) Activity leads concept: the aim is to bring in those who are experts in a biosciences domain but are not a node, to help create resources and work with EMBL-ABR, to build relationships, and to build the bioinformatics activities close to the actual users.
ACTION: Hub to distribute a report with quick update on the “people” page of website as we get more activity leads and HoN taking Coordination roles.
All to ensure they are informed about the key documents area, and specifically read this one please: https://www.embl-abr.org.au/wp-content/uploads/2016/06/The-Australia-Bioinformatics-Resource_September2016rev.1.pdf
F) AL gave an update on our submission to the NCRIS roadmap, which made a strong case for expertise and research infrastructure for bioinformatics. We are expecting to learn more about the NCRIS funding by the end of this year.
ACTION: VS is leading a position paper on bioinformatics infrastructure at national level in Australia which will be circulated for comments to HoN group, by November.
G) EMBL-ABR now has an EMBL-ABR Diversity and Gender Policy & Action Plan v1 and a Self Assessment team with members from across the nodes.
ACTION: Fiona to set first Self assessment team meeting (November 2016) and discuss the policy document, which specific items are applicable to a national level network and what and how we want to present and liaise with ISAG and EMBL-ABR as a whole. HoN to have a look and bring any comments to EMBL-ABR Executive please.
H) VS summarized some pragmatic items she discussed with EMBL-EBI, based on needs she received from some of the HoN: https://www.embl-abr.org.au/wp-content/uploads/2016/10/EMBL-EBI-_EMBL-ABR-August2016.pdf
ACTION: all HoN to read this and bring any wishes to follow up, priorities to VS.
I) All presented their slides and activities from each node; find these here: https://www.embl-abr.org.au/wp-content/uploads/2016/10/EMBL-EBI-_EMBL-ABR-August2016.pdf
ACTION: all agreed this is a good mechanism to bring updates on nodes in a homogenous manner; however, for the next ones, all to please:
- map the activity to the key area: Data, Tools, Platforms, Standards, Compute or Training
- add links to where the materials, tools, data, and information about the mentioned activities/events can be found, and share with the network, so we bring it to the national level, if nothing else as awareness
5. Any Other Business
J) HT mentioned that recent visits to Heidelberg, at ECCB and ICCB, showed a general interest in hearing what Australia is doing. It became apparent that effort in communication is needed to raise the profile and connections internationally and nationally. HT also mentioned the Hub is working on anticipating how we might be assessed for impact and capability. VS and HT are already drafting an IMPACT FRAMEWORK FOR EMBL-ABR (IF4EMBL_ABR).
ACTION: VS/HT to circulate the IF4EMBL_ABR to HoN group, by latest 20 November
K) HT mentioned our current communication vehicles, also listed here (see page 14). VS asked if all had received and read the EMBL-ABR newsletter, and there was a general yes.
ACTION: All to pro-actively help engagement with EMBL-ABR by distributing the newsletter. HT will start including the activities of nodes, as we discuss these, through the monthly newsletter; however, if you have an item you want prioritized, please contact the Executive so we can take this into account.
L) For Node Training events that are also part of their efforts as an EMBL-ABR Node, please see this guideline and do let the Hub know so we can also promote and keep track for impact monitoring. This also means the Hub can add events in a timely manner to iann and maximize promotion, as well as create a log of the events for future analysis.
ACTION: HoN to confirm they have seen this and notify the people in their nodes to whom this is relevant, by next HoN call.
6. Next Meeting
The next HoN Meeting is scheduled in Melbourne for 16th November at 1pm AEDT.
Requirements for a Bioinformatics Infrastructure in Germany for future
Research with bio-economic Relevance
Recommendations of the BioEconomyCouncil
Agenda Item 7.1
Table of Contents
Summary
Introduction
Bio-economic potential of modern biosciences
Bioinformatics topics
Recommendations
a) Infrastructure
b) Optimisation of the use of computing capacities
c) Development of long-term strategies for research, action, and funding
Attachment
Current bioinformatics facilities – Examples of potential expertise centres
Glossary
Summary 1
In the coming years, the application of biological knowledge and methods gleaned from the biosciences will be of increasing economic relevance. In this regard, bioinformatics will play an important role. The bioinformatics infrastructure needs to be expanded in order to enable further research, as well as the use of research findings, to meet bio-economic requirements. The central topics in the area of bioinformatics are the development of flexible pipelines running in parallel for input, provision, and analysis (data management), the enhancement of statistical methods of analysis (data analysis), and optimisation of predictive models (data processing). In order to shape these topics as best as possible, the following action is needed:
• Establishment of a bioinformatics infrastructure consisting of a number of local, well-equipped, and specialised centres of expertise and a comprehensive coordinating body with the following responsibilities:
  o Networking and funding local centres of expertise for the purposes of ensuring the development of technology
  o Increasing knowledge transfer between biology research and bioinformatics
  o Establishing standards for storing and analysing data
  o Making the necessary software tools freely available and standardising interfaces
• Development of long-term strategies for research, action, and funding, in order to
  o improve the conditions for joint public-private funding of collaborative projects
  o promote the sustainability of available data resources
• Optimisation of the use of computing capacities, in order to
  o improve the utilisation of local resources through comprehensive resource planning
  o improve the conditions for transferring data via cloud computing (costs, security, labour)
  o provide supercomputers for specific applications
1 The BioEconomyCouncil would like to thank the members of the Steering Committee Alfred Pühler, Frank Oliver Glöckner, Alexander Goesmann, Thomas Hartsch, Eric von Lieres, Klaus Mayer, Norbert Reinsch, Chris-Carolin Schön, Wolfgang Wiechert, and Ralf Zimmer, as well as all those who participated in the workshop “Bioinformatics”, all of whom made a vital contribution to developing these Recommendations.
Introduction
Bio-economic potential of modern biosciences
Biology in flux
With the emergence of a broad spectrum of new methods and technologies, biology research has in recent years become a science generating massive amounts of data. The simultaneous development of bioinformatics constitutes the prerequisite for the storage, global exchange, and analysis of this data volume. The combination of new research technologies – such as next-generation sequencing, high-throughput precision phenotyping, and so-called OMICS technologies – and bioinformatics tools for linking and analysing the generated data enable a deep understanding of biological interrelationships. This ranges from detailed knowledge of the genetic make-up of individual species or individual organisms, to mechanisms for expressing their phenotypic characteristics, to complex interactions that take place within an ecosystem.
From basic research to applied science
The growing understanding of the mechanisms underlying the expression of characteristics in organisms is generating new possibilities for sustainable, economic use of biological resources. This includes, inter alia, the development of new and useful biotechnology processes, the targeted improvement in the breeding of agricultural crops and farm animals, and more accurate orientation of crop protection and veterinary medicine. In addition, a deeper understanding of evolutionary interrelationships also will contribute to the discovery and use of new biological potentials with the aid of biodiversity research. This permits the development of, for instance, systems biology approaches for the targeted supplementation and optimisation of current breeding processes. Sustainable economic concepts for extracting biocatalysts and bioactive agents from various organisms can also be developed by exploiting the recently created possibilities for directly accessing the genetic material of microorganisms not able to be cultivated on a laboratory scale. The data available today and in the future increasingly permit a comprehensive modelling of both the process of central metabolism and select individual synthesis pathways. In so-called “synthetic biology”, such models can serve as the basis for the targeted redesign of entire metabolic pathways in technologically useful organisms.
The bioinformatics infrastructure, which is currently insufficient for meeting the needs of research, is increasingly emerging as the limiting factor for the optimal future use of the entire bio-economic potential of modern biosciences. The various fields of biology research – from basic research to applied research – show a similar need for action in the area of bioinformatics.
The trend-setting significance of bioinformatics has already been recognised in many European and non-European countries. Some countries, such as the Netherlands (NBIC), Switzerland (SIB), and France (ReNaBi), already have comprehensive, well-organised bioinformatics structural programmes in place; other countries, such as Sweden (BILS), are currently developing them. At the international level as well, efforts are already underway to establish overarching infrastructure programmes in order to achieve better networking and data exchange (e.g. ELIXIR in Europe). With respect also to international exchange, but in particular to the competitiveness of German research, this makes it all the more urgent to push forward the development of a German bioinformatics infrastructure.
Bioinformatics topics
The bioinformatics spectrum ranges from the fundamental problematic of data management, which comprises, in particular, data maintenance and structuring, to data analysis. Examples of this include statistical and quantitative genetics, population genetics, overarching meta-analyses, biometric analysis of data, integration of disparate data types, and the transformation of findings and techniques from basic research into applied research and development. In particular, modelling and simulation of complex systems are playing an increasingly important role.
Data management
In the area of data management, it is essential that systems be created not just for storing and structuring the generated data but also for making such data available for analysis and interpretation. In this regard, the challenge is not only to manage the exponentially growing amount of data but also to take into account the great heterogeneity of primary data. In order to meet these challenges, it is necessary to develop flexible pipelines running in parallel for input, provision, and analysis. Intuitive tools are needed for visualising and examining the data. In view of the enormous amount of data, the main challenge is to develop efficient strategies for reducing the complexity and volume of, e.g., primary data from sequencing analysis by using suitable methods for data reduction and compression.
Data analysis
New, efficient, statistical approaches for data analysis have to be developed in parallel with data management. The genetic analysis of complex features, such as yield and resource efficiency, using genome analysis and precision phenotyping produces high-dimensional data volumes, whose optimal use requires the continual refinement of statistical methods of analysis. The integration and comparative analysis of data and results from various research and application areas (e.g. molecular biology, physiology, biodiversity research, biotechnology, and breeding, as well as sensor, process, and analysis data from plant and animal research) form the basis for interdisciplinary and translational research. Today, there are already a number of individual datasets containing molecular and phenotypic information, as well as biological resources and analysis tools, whose optimal use can be ensured only by bringing together all existing information. This requires innovative bioinformatics concepts designed to establish knowledge bases that ensure the linking of individual databases and thus the integration of heterogeneous data as well. They are essential for the creation of functional models and simulation approaches, which constitute a fundamental building block for future bio-economic action.
Data processing
The rational and data-driven selection, identification, and validation of suitable models similarly require the availability of local and central computer resources, as well as the development of customised, scalable software for their efficient use. Both aspects are of fundamental significance for the combination of two especially relevant optimisation strategies: Necessary for the optimal planning of new experiments or the targeted performance optimisation of, e.g., biotechnological processes governing conversion of materials, are, on the one hand, statistically valid estimates about propagated data uncertainties and, on the other, robust predictions about new measurements.
Existing expertise centres
Some of the problems discussed here are currently being worked on, and in other areas, sophisticated solutions and systems are already in place. In the area of microbiology, such research initiatives as GenoMik/PathoGenoMik and the European excellence network “Marine Genomics Europe” have in recent years made an important contribution to the successful development of microbial genome research in Germany and Europe. In the area of plant research, the initiative “GABI/Plant Biotechnology of the Future” has already resulted in the development of outstanding, internationally recognised expertise in a number of areas in green bioinformatics. This has created excellent conditions for establishing a multi-tiered green bioinformatics platform. Worthy of mention in the area of animal research are the projects of the FUGATO (Functional Genome Analysis in Animal Organisms) initiative. In the competence networks for agricultural research (CROP.SENSe.net, PHENOMICS, and Synbreed), bioinformatics has already been successfully integrated in agricultural and biosciences research collaborations. Moreover, Germany has implemented a recognised, exemplary university system for training young bioinformatics researchers.
Therefore, with regard to the development of expertise centres, it seems appropriate that, first, existing bioinformatics facilities be strengthened and then, where needed, the development of new centres be promoted.
In the field of plant research, there are already five potential expertise centres: the Munich Bioinformatics Centre, the institutes of the so-called “ABCD/J” region (Aachen, Bonn, Cologne, Düsseldorf, Jülich), Plant Bioinformatics at Gatersleben-Halle, Tübingen-Hohenheim region, and the Max Planck Institute of Molecular Plant Physiology in Golm. The Center for Biotechnology (CeBiTec) at the University of Bielefeld would make a suitable expertise centre in the field of biotechnology. In the field of the modelling of biochemical networks and supercomputing applications in systems biology, the Jülich Research Centre likewise has an international reputation. The Bremen region – with the Max Planck Institute for Marine Microbiology, the Jacobs University, the Center for Marine Environmental Sciences at the University of Bremen, and the Alfred Wegener Institute for Polar and Marine Research in Bremerhaven – constitutes a potential expertise centre for the field of environmental microbiology and biodiversity research. In the field of animal research, mention should be made of the data centres Vit Verden, the Bavarian State Research Center for Agriculture in Grub, the Leibniz Institute for Farm Animal Biology in Dummerstorf, and the Animal Breeding and Genetics Department at the University of Göttingen.
Recommendations
a) Infrastructure
In order to ensure that biological data has the most efficient, long-term, and sustainable use for research and commercial application, a coordinated, networked bioinformatics infrastructure should be developed, which also takes into account the aspect of translating research results from basic research to application. Close collaboration between experimental and data-generating structures, along with development of bioinformatics competence, is already proving to be an important structural component.
The key to developing a modern, evolving, and efficient bioinformatics infrastructure lies in the establishment of a two-track organisational structure, which, on the one hand, has a number of local, well-equipped, and specialised expertise centres and, on the other, provides for a comprehensive body for networking and coordinating these centres. It should be noted here that the planned bioinformatics infrastructure is not to be reserved strictly for tasks with bio-economic relevance. Rather, it needs to be investigated to what extent cross-networking with other fields of life sciences might be expedient.
Local expertise centres
By bundling know-how and technical facilities, local expertise centres ensure the development of bioinformatics approaches to solving specific problems. They provide the necessary computing capacities, locally where possible. In addition, supercomputing capacities should be set up for special applications and issues.
Potential locations for expertise centres would be those that (i) already have an established reputation in their field, (ii) have available, in addition to bioinformatics tools, such as software and databases, sufficient computing resources and professionals, (iii) are firmly embedded in the research environment through national and international collaborations, and (iv) have permanent structures for educating young researchers and training users (see Attachment).
Through long-term support for and networking of local expertise centres, bioinformatics technologies used on a wide basis are to be maintained, enhanced, and made available for broad use in research projects. In addition, with regard to technical equipment, it must be taken into account that specialised individual solutions are often required in light of the fact that highly specialised analyses are sometimes undertaken. In particular, through the centralised provision of bioinformatics knowledge and services, smaller research groups and newcomers to the field can immediately be put into a position to generate new biological knowledge from the data, without themselves first needing to create their own bioinformatics infrastructure.
The tasks of the expertise centres would furthermore include the broad creation of competences in the bioinformatics analysis of genomic and postgenomic data. In various networks, joint training units have proved to be a superb means of transferring knowledge between the institutions involved. In addition, they strengthen the network between the institutions. This leads to a long-lasting strengthening of genome research in Germany.
In addition, the training and fostering of young researchers should be expanded, for instance, in graduate schools. This will make it easier to create ties between experts in genome research and bioinformatics.
The comprehensive coordinating body
The comprehensive body acts as the coordination, contact, and information interface between expertise centres, biology and bioinformatics research institutions, and other users and interest groups. By promoting the exchange of information between the various centres, as well as between these and other national and international points of contact from research and industry, a common foundation is created for addressing the variety of bioinformatics issues and tasks.
By networking local bioinformatics expertise centres, it is possible to promote in a targeted manner the development of technology in the various fields of applied biology, as well as in basic research, to develop standards for storing and analysing data, and to develop concepts for the sustainability of available data resources.
While the main task of the expertise centres is providing specific tools for analysing research data from various fields of the biosciences, the overarching body promotes the development and use of jointly needed tools and standards. As coordinated by the comprehensive body, all expertise centres take part in the development of mutual foundations of bioinformatics and moreover serve as a point of contact for specific issues.
The development of standard operating procedures, uniform interfaces, and conscientious data documentation by the excellence centres are to be coordinated by the comprehensive body. This will speed up comparative analyses and ensure the quality of analyses on a lasting basis. The same applies for the provision of specialised databases and tools for genome and biodiversity research. Access to reference datasets that have been verified by experts (biocuration) is increasingly proving to be the key technology for high-quality analysis of biological data, as well as the search for new enzymes and processes for biotechnological applications.
In addition, the comprehensive body will act as the initial, intermediary point of contact for bioinformatics issues. In this way, in the field of science, it can contribute to strengthening the transfer of knowledge between biology research and bioinformatics. A single point of contact such as this would also be capable of promoting exchange and collaboration with companies and public research institutions.
In establishing the comprehensive body, the first step should be to develop a “lean” coordination structure, which initially establishes the necessary network between the existing centres. This could be achieved by setting up an oversight group, whose members would coordinate the activities of the various centres and oversee further development. The activity of this group should initially focus on the development of a network structure and the coordination of development and standardisation projects, as well as the necessary resources for the expertise centres. Similar to the way this is currently being practiced in the Netherlands (NBIC) and Sweden (BILS), close ties between the overarching authority and the bioinformatics centres could be achieved by having the members from the expertise centres form the core of the oversight group. In the long term, transfer to a broader-based institution with a permanent staff should be sought, in order to be able to ensure support for researchers, a high standard of education in bioinformatics, and communication to the public at a high level.
In order to set up a coordinated structure, it is first necessary at the organisational level to have a small circle of persons declare themselves willing to take on the responsibility of developing the concept of the comprehensive body and to create the requisite networking with the research institutions that are potential candidates for expertise centres, with companies, and with international institutions. Moreover, a funding model has to be found that permits an institution such as this to be promoted in the long term. Initial funding is to be sought from BMBF.
b) Optimisation of the use of computing capacities
The rapidly accelerating generation of data in recent years and the resulting requirements in the area of data analysis, particularly for sequencing analyses and for statistical comparisons of genotypes and phenotypes, cannot be managed with standalone computers. Analogous developments are to be observed with simulation methods for model-based data analysis and experiment planning in statistical genetics and systems biology. In addition, the typical computing needs of individual working groups are not continuous but rather characterised by peak loads with regard to time.
Use of local resources
Local servers or computer clusters make sense for covering basic needs and are very common. The use of local resources is especially advantageous where the data is to be processed interactively. The capacity of local computer clusters can essentially be improved through alternating or simultaneous performance of tasks from other fields that require significant computing power. However, this means that all involved working groups need to carry out comprehensive resource planning.
Use of external resources – cloud computing
In order to be able to manage peak loads that occur irregularly, it often is advantageous to use the capacities of central computing centres. Today, large amounts of external computing capacity can be leased (cloud computing). However, in comparison to local servers and clusters, this triggers higher costs, and the security of confidential data normally cannot be ensured, or only at great effort. The lack of data security is considered to be highly problematic, in particular, where collaborations with private companies are involved. In addition, the necessity of having to repeatedly transfer large amounts of data to the cloud creates considerable added efforts. From the standpoint of bioinformatics, the continual enhancement of cloud technologies with respect to security and performance has a high priority.
Use of external resources – supercomputers
In the research environment, memory and computing capacities for complex calculations are widely available through time slots on supercomputers. However, supercomputers are only partially suitable for unplanned peak loads, since slots have to be accessed during specific time windows. Therefore, supercomputer centres can be seen as a very good complement to the necessary management and expansion of bioinformatics hardware capacities, but not as a technical solution in and of themselves.
In order to be able to use not only local computers and computer clusters but also central supercomputers, the needed software tools must be freely available and the interfaces must be standardised.
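The distinction drawn above between basic needs (covered locally) and irregular peak loads (moved to external capacity) can be illustrated with a small calculation. The following is a minimal sketch only: the function name `split_load`, the hourly demand figures, and the local capacity of 16 cores are hypothetical and are not taken from the report.

```python
from typing import List, Tuple


def split_load(demand: List[int], local_cores: int) -> Tuple[int, int]:
    """Split hourly core demand into locally served and overflow core-hours.

    demand: cores requested in each hour (hypothetical figures).
    local_cores: size of the local cluster (assumption for illustration).
    """
    # Each hour, the local cluster serves at most `local_cores` cores.
    local = sum(min(d, local_cores) for d in demand)
    # Anything above local capacity would have to go to external resources.
    overflow = sum(max(d - local_cores, 0) for d in demand)
    return local, overflow


# Hypothetical working group: low base load with two peak hours.
hourly_demand = [8, 8, 10, 64, 120, 12, 8, 8]
local_core_hours, overflow_core_hours = split_load(hourly_demand, local_cores=16)
print(local_core_hours, overflow_core_hours)
```

In this made-up profile most hours fit on the local cluster, while the two peak hours account for nearly all of the overflow. This is the pattern for which the text suggests central computing centres or leased capacity.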
c) Development of long-term strategies for research, action, and funding
Bringing academic research together with private companies
With respect to the future role that bioinformatics will play in the bio-economy, efforts must be made to secure increased joint public-private funding of collaborative projects. Currently, however, there are still numerous obstacles standing in the way of collaboration between academic research and private companies. These range from the problem of publishing research data from public-private partnerships to patent issues.
Sustainability of data resources
The concepts regarding the sustainability of available data resources are of fundamental importance for all downstream analytical and knowledge-generating processes. In addition to data generated through experiments, these include their metadata, such as the description of the experiment, origin and nature of the biological material used, and the analytical methods used. In order to make such a description uniform and more easily accessible for analytical methods, it is possible to employ “controlled vocabularies” or ontologies yet to be created. For instance, in addition to ensuring researcher access to state-of-the-art equipment and the technologies needed to generate data, such as for high-throughput sequencing, transcriptomics, proteomics, and metabolomics, it must also be ensured that the acquired data are available for a wide spectrum of applications and for long periods and are retrievable.
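How a controlled vocabulary makes experiment descriptions uniform can be sketched as a simple validation step. This is an illustrative sketch under stated assumptions: the field names, the permitted terms in `VOCABULARY`, and the function `validate` are hypothetical and do not correspond to any existing ontology or standard named in the report.

```python
from typing import Dict, List, Set

# Hypothetical controlled vocabulary: permitted terms per metadata field.
VOCABULARY: Dict[str, Set[str]] = {
    "organism": {"Arabidopsis thaliana", "Hordeum vulgare", "Zea mays"},
    "method": {"RNA-Seq", "ChIP-Seq", "GC-MS"},
}


def validate(record: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, allowed in VOCABULARY.items():
        value = record.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
        elif value not in allowed:
            problems.append(f"unknown {field} term: {value}")
    return problems


# A conforming record and a record using a free-text synonym.
print(validate({"organism": "Zea mays", "method": "RNA-Seq"}))
print(validate({"organism": "maize"}))
```

Rejecting free-text synonyms such as "maize" at submission time is what keeps the metadata searchable by downstream analytical methods over long periods.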
Attachment
Current bioinformatics facilities – Examples of potential expertise centres
a) Plants
Munich Bioinformatics Centre
Involved institutions Institute of Bioinformatics and Systems Biology at the Helmholtz Centre in Munich (HMGU)
Technical University of Munich (TUM)
Specialisation Inter alia, comprehensive expertise in analysing, depicting, and providing genomes of classic plant model organisms (arabidopsis, medicago, brachypodium), as well as cultivated plants (rice, tomato, barley, wheat, rye, corn, sunflower). Analysis of next-generation sequencing data and their correlation with genomic, evolutionary, and biologically functional issues with regard to model organisms and agricultural crops and farm animals. Development of statistical methods for the analysis of quantitative characteristics, for the functional analysis of biodiversity, and for the resolution of important yield-determining mechanisms.
Extensive expertise in the fields of statistical genetics, plant and animal breeding, and molecular biology, ongoing appointment procedure for population genetics and biostatistics.
Networking Coordination/participation in two long-term research collaborations: AgroClustEr Synbreed, sponsored by BMBF, and the DFG special research field “Molecular mechanisms regulating yield and yield stability in plants”.
The participating groups are closely networked with national and international consortiums and initiatives in Europe and the US, such as the International Wheat Genome Sequencing Consortium (IWGSC), the International Barley Genome Sequencing Consortium (IBSC), and the European plant genome infrastructure platform transPlant. Comprehensive participation in projects of GABI / Plant Biotechnology of the Future initiative. Collaboration with companies under public-private partnerships.
Education The two institutions together ensure the education of talented young scientists through the TUM courses of study in bioinformatics, agricultural sciences, biology, and molecular biotechnology. In order to develop interface competence between undergraduate and graduate students and ensure interdisciplinary networking between various departments, both institutions take part jointly in seminar series and summer schools.
Staff About 45 individuals in the fields of bioinformatics, quantitative genetics, animal and plant breeding, expanded by ongoing appointment procedures for population genetics (W2) and biostatistics (W3).
Software R package synbreed for genomic prediction of complex phenotypes
Databases No information
Computer infrastructure Powerful computer cluster at HMGU, Leibniz computing centre, and Munich Center of Advanced Computing
Plant Bioinformatics at Gatersleben-Halle
Involved institutions Institute of Computer Science at the Martin Luther University (MLU) in Halle-Wittenberg
Leibniz Institute of Plant Biochemistry (IPB) in Halle
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) in Gatersleben
Specialisation Analysis of next-generation sequencing data (assembly, diversity studies, RNA-Seq, and ChIP-Seq); development of databases, data integration, and information retrieval; analysis of biological networks; image analysis (microscopy images, DNA signals, high-throughput phenotyping of plants); applied informatics in the fields of metabolomics and mass spectrometry; systems biology (modelling of metabolism and flow analyses); visualisation and visual data analysis of biological data; molecular phylogeny.
Networking Numerous national and international collaborations with countries such as Greece, the UK, Turkey, Sweden, Switzerland, Spain, Finland, France, Austria, the Netherlands, Japan, Australia, the US, Israel, Canada, Russia, Iran. There are also collaborations with industry (BASF Plant Sciences, Bayer Crop Sciences, Boehringer, KWS, and many other companies involved in plant breeding).
Education In addition to the Diplom course of study in bioinformatics (MLU Halle), which has existed since 1999 and is now being phased out, there are bachelor's and master's courses of study in bioinformatics (MLU Halle), doctorates in bioinformatics (MLU Halle), as well as extensive teaching for bioinformatics modules (University of Kiel), the bachelor's and master's courses of study in biotechnology (Anhalt University of Applied Sciences, Köthen campus), and the bachelor's course of study in computer science (Harz University of Applied Sciences, Wernigerode campus).
Staff MLU: 3 professorships, with an additional 4 budgeted (HH) and 3 outside-funded (DM) researcher positions
IPB: a group with four researchers (group head, 2 HH and 1 DM position)
IPK: working groups (of which 2 are purely DM-funded) in the field of plant bioinformatics, with a total of 29 researchers (7 HH and 22 DM)
Software The following tools, inter alia, have been developed:
Alida (documentation of data analyses); MiToBo (Microscope Image Analysis Toolbox); Vanted (analysis of OMICS data in the network context); SBGN-ED (Systems Biology Graphical Notation Editor); LAILAPS (search engine for information retrieval for user-specific relevance analysis); databases and data integration of laboratory data management, including cross-domain data analysis; Bioconductor packages (xcms, CAMERA, Rdisop, mzR); tools for metabolite identification; IAP (image analysis of high-throughput phenotyping data); HIVE (integrative analysis of multimode data); FBASimVis (flux balance analysis); KGML-ED (KEGG pathway editor); CentiBin/CentiLin (centrality analyses in networks); Jstacs (library for statistical analyses and sequence classification); MotifAdjuster/MiMB/Dispom (transcription factor binding sites annotation and prediction); and PHHMM (analysis of array-CGH data).
Databases Participation in the development of public databases in the field of plant bioinformatics (selection):
GBIS (Federal central gene bank information system); MetaCrop (information system for metabolism in plants) and MassBank (mass spectrometry reference data). In addition, active collaborative work is continuing for standards in systems biology (SBML, SBGN); the citability of (primary) research data (DataCite-DOI); the proteomics initiative (mzML, TraML) and the representation of biological and experimental meta- and primary data (ISA-TAB).
Computer infrastructure High-performance cluster (90 nodes/200 gigabyte main memory); SMP computer (8x4-core Opteron/256 gigabyte main memory); hierarchical storage management (HSM) system (~65 terabyte/9 terabyte online access); high-performance cluster (1840 cores/~2.17 TB RAM); 3D visualisation station; and IPB computing cloud (650 CPU cores, central SAN memory network)
Bioinformatics activities in Tübingen and Hohenheim
Involved institutions Interdisciplinary Center for Bioinformatics Tübingen (ZBIT): Eberhard Karls University in Tübingen, University Hospital in Tübingen, Max Planck Institute for Developmental Biology, Max Planck Institute for Intelligent Systems, Friedrich Miescher Laboratory
Hohenheim: University of Hohenheim (Agricultural Sciences faculty), State Research Centre for Plant Breeding
Specialisation ZBIT: Various areas of bioinformatics
Hohenheim: Statistical genomics
Networking Hohenheim: Various international collaborations in connection with GABI/PLANT2030 and SYNBREED. There is a research collaboration between the Institute of Plant Breeding (Schmid) and the Max Planck Institute in Tübingen (Weigel).
Education ZBIT: In 1998 the University of Tübingen established the first course of study in Germany for bioinformatics. Today, education in bioinformatics consists of BSc/MSc/PhD programmes. There are currently about 220 undergraduate and graduate students and 50 PhD candidates studying at various schools.
Hohenheim: In Hohenheim, education in bioinformatics and statistical genomics consists of the BSc and MSc programmes “Agricultural Science”, with emphasis on plant and animal breeding, the MSc programme “Crop Sciences”, with emphasis on plant breeding, and PhD programmes. There are currently about 100 undergraduate and graduate students and 20 PhD candidates in this field.
Staff ZBIT: 14 working groups in various areas of bioinformatics
Software ZBIT: Numerous software packages for use in green bioinformatics are being developed in Tübingen, including:
Metagenomics (MEGAN package), phylogenies (SplitsTree), Galaxy server (gene prediction, cis-elements, etc.), short-read assembly (LOCAS), molecular modelling (BALL), proteomics (OpenMS), systems biology (BN++ [BioMiner]), integrated next-generation sequencing analysis (SHORE package), NGS aligner (QPALMA, PALMapper, GenomeMapper), and NGS transcriptome analysis (rQUANT).
Hohenheim: A number of software packages for statistical genomics were developed in Hohenheim. Examples include:
Genetic mapping (PLABQTL), simulation of geno- and phenotypes (phenosim, R-hypred), and analysis of breeding programmes (PLABSTAT, R-selectiongain, R-mvngGrAd)
Databases No information
Computer infrastructure ZBIT: In Tübingen, there is one cluster at the University and one at the Max Planck campus, which is maintained by staff with permanent positions.
Hohenheim: In Hohenheim, the departments within the Institute of Plant Breeding operate a cluster with more than 100 nodes, which, beginning in 2012, will be maintained by an employee with a permanent position. Additional external computing capacities are used via collaborations.
Max Planck Institute of Molecular Plant Physiology in Golm
Involved institutions Max Planck Institute of Molecular Plant Physiology (MPIMP, Golm, central bioinformatics infrastructure group),
University of Potsdam (Golm, bioinformatics group, chair Prof. Selbig)
Specialisation OMICS data management and analysis: Development of databases to manage OMICS data, specifically, metabolomics data and next-generation sequencing data, marker identification, e. g., in the context of breeding, development and application of statistical methods of OMICS data analysis;
Systems biology: Analysis of OMICS data against the backdrop of signalling and metabolic pathways, network reconstruction from OMICS data;
Genome-wide association studies: Development of tools for detecting genotype-phenotype associations;
RNA: Studies of sequence-structure-function relationships of RNA molecules, in particular, non-coding RNA (miRNA), development and provision of methods for functional classification of RNA
Networking With working groups of MPIMP and the University of Potsdam that are conducting experiments;
Exchange with the region’s bioinformatics groups (Leibniz Institute of Plant Biochemistry in Halle, Humboldt University in Berlin, MPIMB in Berlin-Dahlem);
Numerous national and international contacts via projects: MPIMP: inter alia, University of Erlangen, the IMB in Aachen, University of Vienna, Aberdeen University;
University of Potsdam: inter alia, the IPK in Gatersleben, Ludwig Maximilians University in Munich, the National Institute of Biology in Ljubljana, 8 European partners of an EU-MC-ITN
Education
Staff MPIMP: About 300 employees, including: MPIMP bioinformatics: 9 employees (1 group leader, 4 postdocs, 2 programmers, 2 PhD candidates), as well as bioinformatics-oriented employees in numerous working groups;
University of Potsdam: Bioinformatics working group (Prof. Selbig): 10+ employees (1 chair, 6 postdocs, 2 PhD candidates, 1 system administrator, student employees); bioinformatics-oriented employees in the Mathematical Modelling and Systems Biology working group (Prof. Huisinga); several bioinformatics-oriented employees in other working groups
Software A variety of self-developed standalone and Web-based software tools for OMICS data analysis (MetaGeneAlyse and pcaMethods for statistical data analysis);
Specialised software for analysing proteomics (IOMACS) and metabolomics (inter alia, TagFinder, GoBioSpace) experiments;
Commercial software (CLC, Statistica, Matlab, Mathematica), public domain software (R, MeV, etc.);
Operating systems: Linux, Windows; programming languages/environments: Python, Perl, Java, C, .NET, C#, R, MATLAB; Web programming, databases: SQL, MySQL, Postgres
Databases Golm metabolome database (GMD, GC/MS data), ChlamyCyc (Chlamydomonas metabolic pathways, genes, and proteins), RLooM (RNA loop structures), NGS Small-Reads-DB, TROST (potato water-stress data), ChlExDa (Chlamydomonas experimental data), AraNet (expression correlation networks in model plants);
GABI-PD, GABI primary database;
Computer infrastructure MPIMP: 12 servers, 40TB disk space, number of cores: 88
University of Potsdam: 10 host computers with 96 computing cores, 10 workstation computers, 48TB central hard-drive memory
b) Animals
Vit Verden
Involved institutions VIT is a service computer centre for organised animal breeding (e. g. state inspection associations, breeding organisations for cattle, horses, sheep, and pigs, genotyping laboratories)
Specialisation Computer applications and genetically statistical analyses in the area of animal husbandry and animal breeding
Networking German university institutes and research facilities in the area of animal breeding
Other national and international computing centres and research institutions in the area of animal breeding
Education Engineering degrees (specialisation in animal breeding), Programmers, database experts, Bioinformatics researchers
Staff About 120 employees (departments, programmers, system maintenance)
Software Proprietary software development (JAVA, Fortran, SAS)
Databases Oracle
Computer infrastructure Dispersed server systems, Linux clusters
Georg August University in Göttingen, Animal Breeding and Genetics Department
Involved institutions Georg August University in Göttingen, Research Data Processing Company (Gesellschaft für wissenschaftliche Datenverarbeitung mbH, GWDG) in Göttingen
Specialisation Scientific computing with diverse data structures, processing of animal breeding-specific data, processing of data from high-throughput genotyping, high-throughput phenotyping, and next-generation sequencing, genomic models, association mapping, population genetics
Networking Synbreed, FUGATO projects, DFG-GRK 1664 “Scaling Problems in Statistics”, Centre for Statistics at the University of Göttingen
Education PhD researchers in agricultural science, graduate researchers in agricultural science, mathematics, and physics
Staff Department head, 2 research associates, 2 postdocs, 10 PhD candidates, 2 programmers
Software Mathematical software (Mathematica, Maple)
Common statistics software (Statistica, SPSS, SAS, R, etc.)
Animal breeding-specific software (e. g. ASReml, VCE, PEST, ZPLAN+, …)
Software for sequencing data: processing of raw data (e. g. PHRAP, PHRED), sequencing comparisons and multi-alignment (e. g. BLAST, bwa, Phylip, SAMtools)
Sequencing databases (e. g. EMBL)
Programming languages: R, C++, Python, Fortran
Databases MySQL and Oracle databases at GWDG
Computer infrastructure GWDG: Cluster with several parallel computers (Intel Xeon and AMD Opteron systems) with batch systems, a total of more than 5,000 cores and more than 18,000GB RAM
GWDG systems for data security and server administration
Various department-based servers, e. g. AMD Opteron with 24 cores, 2.1 GHz clocking, 128GB RAM
Leibniz Institute for Farm Animal Biology
Involved institutions Leibniz Institute for Farm Animal Biology (FBN), Wilhelm-Stahl-Allee 2, 18196 Dummerstorf
Specialisation Integrative bioinformatics with farm animals (cow, pig) specifically for performance characteristics, functional characteristics, and behavioural characteristics
Statistical genomics, genetic statistics, evaluation of genetic parameters, evaluation of breeding potential, breeding planning, population genetics in breeding
Ontology for behavioural characteristics
Networking Phenomics competence network
University of Rostock, Chair for Systems Biology and Bioinformatics
Institute for Neuro- and Bioinformatics, University of Lübeck and Institute for Animal Breeding and Husbandry, Christian Albrechts University in Kiel
Institute for Bioengineering and Food Science, Biostatistics Group, University of Life Sciences, Aas, Norway
Vit Verden, breeding associations, performance testing organisations, and other partners in the field
Education R courses for the International Leibniz Graduate School DiVa and FBN PhD candidates
Biomarker lab for students in molecular biotechnology and computer science at the University of Rostock
Gene set enrichment as part of the course Molecular Bioinformatics II under the master’s courses of study Molecular Life Sciences and Computer Science at the University of Lübeck
Linear models and mixed linear models in the master’s course of study Animal Sciences at the University of Rostock
Staff Working group Biomathematics and Bioinformatics with four researchers
Working group Animal Breeding and Genetics with four researchers
Junior working group Integrative Bioinformatics for Cattle (2 DM positions for five years)
Post-doctoral position for ontology development (1 DM position for five years)
Software Simulation of genotype distributions and phenotypes typical for farm animals
Algorithm development for integrative bioinformatics for farm animals
Ontology development with emphasis on animal behaviour
Databases Project database for the phenomics competence network
Project database for integrative bioinformatics for cattle
Computer infrastructure FBN currently has five computer servers with a total of 124 nodes. In addition, it is possible to use external computer capacities (2 clusters with 30 and 10 nodes, respectively) at the University of Rostock (collaboration agreement)
Bavarian State Research Center for Agriculture in Grub
Involved institutions Bavarian State Research Center for Agriculture in Grub
Specialisation Genomic evaluation of the breeding potential of cattle and pigs
Genome-wide association studies involving cattle and pigs
Networking Technical University of Munich
Christian Albrechts University in Kiel
University of Hohenheim
ZuchtData GmbH, Vienna
State Office for GeoInformation and Land Development in Kornwestheim
Bavarian State Board of Trustees for Animal Processing Producers
Education No information
Staff 6 researchers, 2 PhD candidates, 2 programmers
Software Software used: R, SAS; Beagle; findhap V2; ASReml, DMU, MiX99, ...
Software developed in-house: under Fortran and Perl
Databases Genotypes: All genotypes of the cattle breeds Fleckvieh and Braunvieh in Germany, approx. 15,000 Fleckvieh genotypes, approx. 5,000 Braunvieh genotypes, mainly Illumina 54K Bead-Chip, in part Illumina 777K Bead-Chip (approx. 1,500), genotypes of 2,000 pigs of the breeds Deutsche Landrasse (German Landrace) and Deutsches Edelschwein (German Large White), Illumina 60K Bead-Chip
Phenotypes: Performance data for the above-mentioned breeds in 44 characteristics since 1990, Heritage data for the above-mentioned breeds since 1950, Breeding value of the above-mentioned breeds in 44 characteristics
Database systems: Oracle, MySQL
Computer infrastructure Windows workstation computers (standard 4 GB of RAM)
5 Linux workstations under Debian, Dual-Xeon 4- and 6-core, 16, 19 or 64 GB of RAM per computer
Connection to 4TB hard-drive server
1 Linux workstation with Oracle-DB
1 IBM 550Q with 2 Power5 quad-core processors and 64GB of RAM under AIX5L
c) Microbiology and biotechnology
The bioinformatics technology platform at the Center for Biotechnology (CeBiTec) at the University of Bielefeld
Involved institutions The “Bioinformatics Resource Facility” (BRF) – a research-oriented service and development institution resulting from the DFG’s bioinformatics initiative (2001) – administers the computers used by CeBiTec units, and it supports various large-scale projects in genome research, in particular, through the integration and new development of database applications for efficient storage of generated primary data, as well as through the implementation of software for analysing larger amounts of genome-based data.
Specialisation The spectrum of research projects, most of which are focused on biotechnology, ranges from the analysis of microbial genomes and metagenomes, to the processing of fungi (in particular, yeasts), algae, and plants (wall cress, sugar beets, grapevine, rapeseed), to animal cell cultures (e. g. Chinese hamster ovary cells).
Networking In addition to the deployment of developed systems in various genome and post-genome projects at CeBiTec, numerous external partners also make use of the Bielefeld infrastructure under national and international collaborations. Currently, a total of approximately 500 internal and more than 2,700 external users are registered, with approximately 55% of them coming from Germany. Select projects: GenoMik, GenoMik-Plus, GenoMik-Transfer, PathoGenoMik, Marine Genomics Europe, Grain Legumes, GABI-Kat, NuGGET, AnnoBeet, SysMap, GK Bioinformatik, CLIB Graduate Cluster.
Education In the area of teaching, bioinformatics education at the University of Bielefeld consists of bachelor’s and master’s courses of study in “Bioinformatics and Genome Research” and “Natural Sciences Informatics”, as well as the master’s course of study “Genome-Based Systems Biology”. Bioinformatics education is under the auspices of 4 chairs (E. Baake – Biomathematics & Theoretical Bioinformatics; J. Stoye – Genome Informatics; R. Giegerich – Practical Informatics; R. Hofestädt – Bioinformatics & Medical Informatics) and other working groups (T. Nattkemper – Biodata Mining & Neuroinformatics; A. Sczyrba – Computational Metagenomics & Single Cell Genomics; A. Goesmann – Computational Genomics).
Staff In addition to the director position, the BRF has 6 permanent employee positions for system administration, which are all funded by the University. In addition, 3 student assistants are continually deployed for routine activities, with this being funded from CeBiTec’s budgeted resources. For the purposes of joint representation of the interests of the various CeBiTec working groups, a BRF coordinating committee was formed for planning and shaping the enhancement of the CeBiTec computer infrastructure, working closely with the system administrators.
Software In addition to administration and enhancement of the technical infrastructure, the BRF works in the field of applied bioinformatics by actively developing software solutions for high-throughput analyses in the field of genome and postgenome research. The emphasis is on DNA sequencing analysis and genome annotation (SAMS, GenDB, EDGAR, SARUMAN, Conveyor), including the reconstruction of metabolic pathways (CARMEN), high-throughput analysis in the area of transcriptomics (EMMA), proteomics (QuPE), and metabolomics (MeltDB), as well as general data management and visualisation (ProMeTra). These systems are connected with one another through an integration layer called “BRIDGE”. Other software tools used at the Institute for Bioinformatics at CeBiTec: CARMA, CPA, r2cat, Gecko, GISMO, REGANOR, QAlign, PASSTA, BACCardI, Genlight, RNAcast, RNAshapes, Genalyzer, PathFinder, RNAhybrid, VANESA, PathAligner, TACOA, BIIGLE, etc.
In addition to the above-described Web-based programme packages and the virtual work environment, the Bielefeld Bioinformatics Server (BiBiServ) provides further bioinformatics applications for anonymous users. With its Web services-based programming interfaces, it offers an additional possibility for providing external users with newly developed tools on an established platform.
Databases DAWIS-M.D., CardioVINEdb, BioDWH, RAMEDIS, STCDB, CoryneRegNet, MycoRegNet, BIOIMAX, etc.
Computer infrastructure CeBiTec’s hardware park today comprises a computing output of approximately 25 TeraFLOPS (796 CPUs, i. e. 4,024 CPU cores), an online storage capacity of 433TB and a gross back-up capacity of approximately 1.4PB. For the purposes of long-term archiving of raw data and daily data security, BRF deploys three tape systems with a possible final capacity of 16PB. A computer cluster is available for processing primary data and for additional high-throughput analysis. This is used, inter alia, for diverse DNA sequencing analyses, such as the annotation of genomes or metagenome analyses.
For special bioinformatics applications with, e. g., very high memory needs, various application servers are available, which have been furnished with up to 96 CPU cores and a maximum of 1024GB of RAM. Also worthy of mention here are special computers that have been tailored specially to the specific needs of bioinformatics. For instance, 4 servers with a total of 12 TimeLogic DeCypher-FPGA cards are deployed to accelerate BLAST analyses. For GPU-based approaches, such as read mapping with the software SARUMAN developed in Bielefeld, three IBM iDataPlex servers, each with two NVIDIA Tesla M2070-GPU cards, were procured. For the independent development of FPGA-based algorithms, a Convey HC-1ex system with a full complement of RAM is also available. This equipment corresponds to an investment of more than EUR 6 million over the past 10 years from DFG, BMBF, and EU projects, as well as from special grants and the University’s own budgeted funds.
Another important component of the Bielefeld bioinformatics infrastructure is the virtual work environment based on Sun Ray thin clients. Through the use of energy-saving and inexpensive terminals, more than 350 bioinformatics workstations are provided today with minimal administrative effort. This has enabled BRF to operate virtually without interruption and in an extremely stable manner for 12 years. In particular, the suitability of this type of virtual workstation for worldwide use was proved through successful operation of the terminal in WAN in, inter alia, Europe, South America, and Asia.
d) Plants and microbiology
Bioinformatics in the ABCD/J Region
Involved institutions Technical University (RWTH) of Aachen
University of Bonn
University of Düsseldorf
Jülich Research Centre
Max Planck Institute for Plant Breeding Research in Cologne
Specialisation RWTH Aachen: RWTH Aachen has bolstered itself in the field of white systems biology (Prof. Blank), and the medical department has called for the formation of a bioinformatics group.
University of Bonn: With the Institute for Crop Sciences and Protection of Resources and Prof. Léon, the University of Bonn has expertise in the field of breeding potential. Prof. Ewert heads the modelling work for the Agricultural Faculty at the University of Bonn (especially yield modelling). An additional professorship for statistical genetics is planned.
Jülich Research Centre: European Plant Phenotyping Network (EPPN). In addition, at IBG1 (Prof. Wiechert), there is great expertise in white biotechnology and also in modelling of networks and biotechnological processes.
Max Planck Institute for Plant Breeding Research in Cologne: Expertise in the field of plant breeding and genetics.
University of Düsseldorf: Metagenomics, metabolic networks
Networking Existing collaborations with, inter alia, INRA France, International Arabidopsis Informatics Consortium, International Plant Phenotyping Network, International Tomato Annotation Group, International Medicago Genome Annotation Group, IPK, iPlant, MIPS Munich, Max Planck Institutes in Golm and Tübingen, University of Bielefeld, University of Toronto: BAR Viewer, Perth Plant Energy Biology Center SUBA, PlantsDB, Tomato/potato trait, phenotype and mapping database, Wageningen University, Fraunhofer Institute (Aachen Fraunhofer FIT: Life Science Informatics, Bonn: Fraunhofer SCAI-Bioinformatics)
Education RWTH Aachen: To date, bioinformatics modules in the biotechnology course of study; bioinformatics is now slated to be included in biology.
University of Bonn: MSc Life Science Informatics, B-IT Center, Bonn; bioinformatics in MSc Crop Science.
University of Düsseldorf: An extensive bioinformatics curriculum is available for biologists (http://www.molevol.de/~bioinf/).
Staff RWTH Aachen: Beginning in mid-2012, a position as research associate/professor is to be filled
University of Bonn: Junior researcher group leader as research associate, system administrator, permanent
University of Düsseldorf: Two bioinformatics chairs within the Computer Sciences course of study
Jülich Research Centre: For the planned successor to the Gabi primary database: follow-up funding of 2 FTE through the research centre in connection with the appointment commitment to Prof. Usadel. The working group of Prof. Usadel at FZJ will likewise comprise additional staff from basic funding.
In connection with the DPPN (German Plant Phenotyping Network): access to the IT structure at IBG2 (Prof. Schurr), capacities at the Jülich Plant Phenotyping Center (JPPC) are to be expanded further.
At IBG1: a modelling department (biochemical networks and biotechnology processes)
At IBG2: a modelling group (structure-function models root and sprout)
Associated with BioSC: The Max Planck Institute for Plant Breeding Research in Cologne with three groups with direct access to green bioinformatics (Dr. Schneeberger (NGS mapping), Dr. Jimenez-Gomez (adaptive genomics and genetics), and Dr. Stich (quantitative crop genetics)), as well as with Prof. Koornneef.
Software RWTH Aachen: Mercator (MapMan annotation), Robin (microarray analysis), R-Robin (RNA seq analyses), MapMan, PageMan (visualisation of OMICS data), Corto
University of Bonn: Function annotation: PhyloFun, AHRD, R packages for ChIP-chip/ChIP-Seq (ChipR), aggregators/workflow tools for Web services
University of Düsseldorf: PhyloPythiaS
Jülich Research Centre: OMIX (network editor), 13CFLUX (substance flow analysis), CADET (chromatography)
Databases RWTH Aachen: MapMan (functional classes), CSB.DB (correlation databases)
University of Cologne: Aramemnon database (group of Prof. Flügge)
University of Bonn: AFAWE, function predictions http://afawe.mpipz.mpg.de
Jülich Research Centre: Gabi primary database successor; phenotype databases, coupling of phenotype-genotype databases
Computer infrastructure
Jülich Research Centre: Supercomputing center, as well as clusters at IBG1 and IBG2
e) Environmental microbiology and biodiversity research
Bremen infrastructure for environmental microbiology and biodiversity research
Involved institutions
Max Planck Institute for Marine Microbiology in Bremen
Jacobs University in Bremen
Center for Marine Environmental Sciences (MARUM) at the University of Bremen
Alfred Wegener Institute for Polar and Marine Research (AWI) in Bremerhaven
Specialisation
Max Planck Institute for Marine Microbiology/Jacobs University: The bioinformatics focus of the Max Planck Institute for Marine Microbiology/Jacobs University is microbial diversity and genome research. In addition to sequencing analysis and classification (binning), the emphasis is particularly on the development and operation of reference databases (SILVA project) and the integration of diversity and function data with environmental parameters (Megx project). In addition, the Max Planck Institute for Marine Microbiology/Jacobs University plays an active role in the development of metadata standards, exchange formats, and ontologies, in order to improve the exchange of data and the interoperability of data and databases.
AWI / MARUM: Long-term archiving and publication of biological environment data. In addition, PANGAEA has played an active role in recent years in developing geodata infrastructures and relevant standards. Data are typically made available via central portal services, with PANGAEA® acting, on the one hand, as “data and metadata distributor” and, on the other, as central network architect, portal operator, and broker between various e-infrastructures. Using various metadata standards and protocols (OGC-CS, OAI-PMH, DiGIR, ABCD), various portals and search engines are provided with content from PANGAEA®. PANGAEA played a critical role in the development of the citability of data and the creation of DataCite. Since 2009 services have been provided that enable dynamic cross-referencing of data and article, including from Science Direct (e. g. http://dx.doi.org/10.1016/j.biocon.2010.04.009).
AWI: Current bioinformatics applications at AWI include the modelling of ecological niches for diatoms (project of the Hustedt diatom collection), the recording of time variations of microplankton algae communities in the face of global changes, and the breakdown of toxic pathways in dinoflagellates (shellfish poisoning, e. g. Azadinium spinosum), as well as molecular characterisation of ecosystems in sea ice (MacSeaIce project) and biotechnology applications of cold-adapted in-situ oil-degrading marine bacteria. Also deserving of mention: participation in genome sequencing projects for key organisms, and transcriptomics studies of adaptation/acclimation in higher organisms, as well as at the ecosystem level.
Networking
Max Planck Institute for Marine Microbiology/Jacobs University: With its bioinformatics expertise, the Max Planck Institute for Marine Microbiology/Jacobs University is involved in a number of current research projects: BMBF project MIMAS (Microbial Interactions in MArine Systems), SAW-Leibniz project ATKIM (degradability of arctic, terrigenous carbon in the sea), EU projects MAMBA (Marine Metagenomics for New Biotechnological Applications), EuroMarine (Integration of European Marine Research Networks of Excellence), and BioVeL (Biodiversity Virtual e-Laboratory). It also coordinates the EU project Micro B3 (Biodiversity, Bioinformatics, Biotechnology). In addition, the Max Planck Institute for Marine Microbiology/Jacobs University is involved in various national and European infrastructure projects: DFG project CIBAS (Center for integrative Biodiversity Analysis and Synthesis), EU projects EuroFleets (Towards an alliance of European research fleets), EMBRC (European Marine Biological Resource Centres), and MIRRI (Microbial Resource Research Infrastructure). In addition, close contacts have been established with the European Bioinformatics Institute (EBI) and, in particular, with the ELIXIR project (European Life Sciences Infrastructure for Biological Information).
AWI / MARUM: PANGAEA is an accredited world data centre in both the ICSU World Data System (WDS) and the WMO Information System (WIS), and in the past 15 years it has been actively involved in more than 140 national, European, and international projects (currently IODP (NSF), EUR-OCEANS and EUROMARIN, ESONET (NoE), EMSO (CP), EPOCA (CP), CoralFish (CP), EUROBASIN (IP), HYPOX (CP), EMODNET Bio and Tara-Oceans, as well as, on a national level, BIOACID, INTERDYNAMIK and SOPRAN – for a complete list, see www.pangaea.de/projects). In addition, PANGAEA maintains broad collaborations with scientific publishers (Elsevier, Springer, Wiley, AGU, and others).
AWI: In recent years, AWI has developed and expanded an extensive research and applications profile in modern OMICS methods (today, primarily next-generation sequencing and microarrays), inter alia, through research collaborations (e. g. Marine Genomics Europe Network of Excellence), genome sequencing consortia (Micromonas, Th. pseudonana, E. siliculosus, F. cylindrus, E. huxleyi, Ch. crispus, Glaciecola), and transcriptome sequencing projects (Krill, P. brachycara, Hyas, S. latissima, sea ice meta-transcriptome), as well as through programme research (e. g. coastline research, harmful algal blooms, ecological chemistry) and junior researcher groups (PLANKTOSENS).
Education
Bachelor’s course of study in applied computational mathematics with specialisation in bioinformatics at Jacobs University. Bioinformatics course and internship at the University of Bremen, and a master’s course of study in marine microbiology at the International Max Planck Research School. Periodic bioinformatics workshops and on-site training of users, online tutorials. Collaboration by students on projects in connection with internships, guided research modules, and student assistants.
Staff
Max Planck Institute for Marine Microbiology/Jacobs University: 9 postdocs, 9 PhD candidates, 2 master’s students, 2 technicians, 1 team assistant, 1 group leader
AWI / MARUM: 5 postdocs, 2 technicians, 3 data managers, 1 group leader
AWI Computing Centre/Bio/Bioinformatics: 4 postdocs
Software
Diversity and phylogeny (ARB/SILVA), classification, binning (TETRA, TaxSOM, TaxoMeter), standardisation (MetaBar, CDinFusion), annotation (JCoast), data integration (Megx.net)
AWI / MARUM: In recent years, the PANGAEA® group has developed open-source software (Schindler & Diepenbroek 2008) for building portals and connecting data providers, and because of its modularity, it can support any number of metadata standards (ISO19xxx, DIF, Dublin Core, Darwin Core etc.). The software is used for various projects (inter alia, IODP, CARBOCHANGE, EPOCA, ESONET/EMSO, HYPOX, C3-GRID). In addition, PANGAEA employs the data warehouse software from Sybase (IQ), which is primarily used as a preliminary step in compiling data products.
AWI: Comparative genomics (Phylogena), microalgae communities (Pyloassigner), comparative metagenomics (MGMCMC), micro-satellite marker design (STAMP)
Databases
Max Planck Institute for Marine Microbiology/Jacobs University: SILVA: The European database for ribosomal RNA sequences (www.arb-silva.de)
Microbial biodiversity research is principally based on the analysis of marker genes. In this regard, ribosomal RNA has become the gold standard, and for years the number of publicly available rDNA sequences has been growing exponentially, doubling every 12 – 18 months (currently, roughly 2.7 million sequences as of January 2012). In order to be able to analyse this flood of data, specialised reference databases and software tools are of critical importance. The ARB and SILVA database project was established more than 20 years ago in order to meet this challenge. ARB and SILVA are internationally recognised tools for processing, curating, and analysing rDNA sequences in biodiversity research and for industrial quality control and medical diagnostics.
Megx.net (www.megx.net): Megx.net was developed in 2005 as the first integrated database in the field of environmental microbiology, and it permits concentrated access to microbial genome information and biodiversity in the context of the environment. In this regard, global environmental parameters are generated on the fly from oceanographic data sources. The close networking of Megx.net with public sequencing and environmental data repositories, such as EMBL-EBI/ENA and PANGAEA, in combination with intuitive visualisation of results, provides users with a dynamic look at biodiversity and function in the context of the environment.
AWI / MARUM: PANGAEA® – Publisher for Earth & Environmental Science (ICSU World Data Center) (www.pangaea.de)
The broad spectrum of WDC-MARE databases, which are distributed across the entire gamut of geo-, bio- and environmental sciences, aid in the research of global environmental changes. The focus is on geo-referenceable data from the fields of oceanography, marine geology, paleoceanography, and marine biology. The operational platform is the information system PANGAEA. The system currently contains approximately 450,000 data sets with more than 6.5 billion data points on approximately 40,000 different parameters from all of the world’s seas and continents, receiving the majority of its funding from project data management and the development of geodata infrastructures.
AWI: PLANKTONNET biodiversity platform, Hustedt Diatom Research Centre (collection data)
Standardisation and ontologies
MPI-Bremen / Jacobs University: Genomic Standards Consortium (www.gensc.org)
Founded in 2005 in Oxford, the Genomic Standards Consortium (GSC), composed of international researchers, took it upon itself to draft guidelines for a compact yet representative number of desirable additional data for sequencing information. This resulted first in the MIGS (Minimum Information about a Genome Sequence) and the MIMS (Minimum Information about a Metagenome Sequence) standards for genome and metagenome information. After several more years of development, the Consortium was recently able to publish the MIMARKS (Minimum Information about a MArker gene Sequence) standard and the MIxS (Minimum Information about any (x) Sequence) specifications. The GSC also oversees the development of ontologies, e. g. for habitat classification, through its Environment Ontology. The Max Planck Institute for Marine Microbiology/Jacobs University is in charge of the GSC and administers the central databases for the standards and specifications issued to date and a GSC reference implementation in XML.
Computer infrastructure
Max Planck Institute for Marine Microbiology: 500-core cluster with 60TB of storage (fail-safe, permanently monitored), web server, and archive storage for sequence analysis, phylogeny, annotation, databases, and services.
AWI: 12-node vector computer NEC SX8R, 3.3 TFlop/s, 56TB GFS file system, ocean/sea ice/paleoclimate models. 24-core dual-Opteron cluster, genome annotation, phylogeny, and transcriptomics. 1 SMP node, 16-core Opteron, 32GB RAM, assembly/mapping 454-ILLUMINA genomics/transcriptomics, phylogenetic placement of 454 sequencing data, large-scale niche modelling, metagenomic Markov chain Monte Carlo Bayesian statistics. 1 SGI UV100 20-blade, 160-core Intel E7-883, 2.56TB RAM, 96TB file system with InfiniteStorage, ocean modelling, data assimilation, transcriptomics annotation and mapping, genome assembly, high-throughput phylogenetic placement of 454 sequencing data, metagenomic annotations. 2PB archive storage SL8500 (LTO/3), NetApp scalable storage systems.
Use of services: Internationally via Web pages and Web services. Sun (Oracle) Secure Global Desktop (Web-based) and Sun-Ray thin clients for distributed work on virtual workstations. Collaboration with companies via Bremen-based Ribocon GmbH (spun off in 2005 by the Max Planck Institute). In addition, Galaxy workflows are used at AWI.
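As a rough illustration of the growth rate quoted for rDNA sequences in the SILVA entry (doubling every 12 – 18 months, starting from roughly 2.7 million sequences in January 2012), the following Python sketch projects the count forward. It assumes pure exponential growth, N(t) = N0 · 2^(t/T); the function name and the 36-month horizon are illustrative choices and are not taken from the source.

```python
def projected_count(initial: float, months_elapsed: float, doubling_months: float) -> float:
    """Project a count forward assuming N(t) = N0 * 2 ** (t / T)."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return initial * 2 ** (months_elapsed / doubling_months)


# Baseline quoted in the text: roughly 2.7 million sequences in January 2012.
BASELINE = 2.7e6

# Illustrative horizon (an assumption): 36 months after the baseline.
fast = projected_count(BASELINE, 36, 12)  # 12-month doubling: three doublings
slow = projected_count(BASELINE, 36, 18)  # 18-month doubling: two doublings

print(f"Projected range after 36 months: {slow:,.0f} to {fast:,.0f} sequences")
```

Under these assumptions the archive would grow four- to eightfold within three years, which is the scale of increase that the reference databases and computing resources described above have to absorb.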
Glossary
BILS: Bioinformatics Infrastructure for Life Sciences. Decentralised national research infrastructure for bioinformatics in Sweden, which is supported by the Swedish Research Council.
Biocatalysts: Biocatalysts are polymer biomolecules that accelerate biochemical reactions in organisms by lowering or (less frequently) raising the activation energy in reactions.
Biocuration: Comprises the translation and integration of biological data in a database, enabling the data to be linked with scientific literature and other data sets.
Biodiversity: Concept that describes the diversity of life on the three levels of ecosystems, species, and genes. A fourth level is considered to be the diversity of interrelationships within and between the other three levels, which is termed functional biodiversity.
Cloud computing: Cloud computing describes the approach of making abstracted IT infrastructures (e.g. computing capacity, data storage, network capacities, or even finalised software) available via a network in a manner that is dynamically adapted to needs.
Computer cluster: A number of networked computers. The objective of “clustering” is usually to increase computing capacity or availability as compared with individual computers.
Cropsense: Network for complex sensor technology for crop research, breeding, and data management.
Data integration: Bringing together of data from a variety of different sources.
ELIXIR: European Life Sciences Infrastructure for Biological Information, a pan-European initiative to develop a permanent European bioinformatics infrastructure.
FUGATO: Research programme on functional genome analysis in animal organisms sponsored by the German Federal Ministry of Education and Research (BMBF).
GABI / Plant Biotechnology of the Future: Research programme in the field of future-oriented plant biotechnology sponsored by the German Federal Ministry of Education and Research (BMBF) and private companies.
Genome: The entirety of an organism’s genetic information.
Genomics: Field of research that looks at organisms at the level of their genome data.
GenoMik: Research and sponsorship initiative “Genome Research on Microorganisms – GenoMik” launched in 2001 by the German Federal Ministry of Education and Research (BMBF) in order to create the structural and substantive conditions for the use of the potential of microorganisms by way of global, genome-based research approaches.
High-throughput precision phenotyping: Automated methods by which a large number of phenotypings are performed with high throughput.
Knowledge bases: Special databases for knowledge management.
Metabolome: The entirety of an organism’s metabolites.
Metabolomics: Field of research that looks at organisms at the level of their metabolites.
Metadata: Data that contain information about other data.
Model-based data analysis: Statistical data analysis using models that are tailored to the respective problem and that seek to identify possible mechanisms of the underlying processes.
NBIC: Netherlands Bioinformatics Centre, a Dutch bioinformatics network with expertise in the areas of research, teaching, and support.
Next-generation sequencing: New methods for DNA sequencing, which make increased throughput possible.
OMICS technologies: All-encompassing description of technologies used to analyse the entirety of an organism’s particular system level, e.g. all genes (genomics), all transcripts (transcriptomics), all proteins (proteomics), or all metabolites (metabolomics).
Ontology: Formally structured, linguistic depictions of a set of terms and the relationships between them in a given subject matter. They are used to exchange “knowledge” in digital and formal form between application software and services.
PathoGenoMik: Guideline of the German Federal Ministry of Education and Research for the funding of research projects within the ERA-NET PathoGenoMics “Transnational Pathogenomics: Prevention, Diagnosis, Treatment, and Monitoring of Human Infectious Diseases” as part of the framework programme “Biotechnology – Using and Shaping Opportunities”.
Phenomics: Competence network for agricultural and nutrition research sponsored by the German Federal Ministry of Education and Research. It represents a systems-biological approach to the genotype-phenotype depiction of the farm animals cattle and pigs.
Phenotyping: Quantitative analysis of key functions and structures of organisms and biological systems and the underlying physiological, molecular, and genetic mechanisms.
Postgenomic data: Biological data that analyse cellular activities in their entirety and thus go beyond the purely genetic level of data collection.
Primary data: Sequencing data of DNA, RNA, and protein molecules.
Proteome: The entirety of all proteins expressed in an organism at a certain time.
Proteomics: Field of research that looks at organisms at the level of their proteins.
ReNaBi: Réseau National des plates-formes Bioinformatiques, a French bioinformatics network structure.
SIB: Swiss Institute of Bioinformatics, a federation of bioinformatics research groups of leading Swiss universities and the Swiss Federal Institute of Technology.
Standard operating procedures: Procedures describing what happens during a process.
Supercomputer: The fastest computer of its time. A typical feature of a modern supercomputer is its large number of processors, which can access shared periphery equipment and a partially shared main memory. Supercomputers are often employed for computer simulations in the area of high-performance calculations.
Synbreed: Competence network sponsored by the German Federal Ministry of Education and Research for establishing an interdisciplinary centre for genome-based breeding research involving crops and farm animals. Group of researchers from plant and animal breeding, molecular biology, bioinformatics, and human medicine, together with the involvement of university, institutional and industrial collaboration partners.
Synthetic Biology: Field bordering on molecular biology, organic chemistry, engineering, nano-biotechnology, and information technology, with the objective of constructing biological systems and microorganisms with the aid of standardised building blocks.
Systems biology: Bioscience whose objective is understanding the complex and dynamic biological processes of cells and organisms in their entirety.
Transcriptome: The entirety of all transcripts expressed in an organism at a certain time.
Transcriptomics: Field of research that looks at organisms at the level of their transcripts.
Members of the “Bioinformatics Workshop” Steering Committee
Prof. Dr. Frank Oliver Glöckner, Max Planck Institute for Marine Microbiology/Jacobs University in Bremen
Dr. Alexander Goesmann, CeBiTec/University of Bielefeld
Dr. Thomas Hartsch, GeneData AG
Dr. Eric von Lieres, Jülich Research Centre
Dr. Klaus Mayer, Munich Information Centre for Protein Sequences (MIPS)/Helmholtz Centre in Munich
Prof. Dr. Alfred Pühler (Chairman), CeBiTec/University of Bielefeld
Prof. Dr. Norbert Reinsch, Leibniz Institute for Farm Animal Biology (FBN) in Dummerstorf
Prof. Dr. Chris-Carolin Schön, Technical University of Munich
Prof. Dr. Wolfgang Wiechert, Jülich Research Centre
Prof. Dr. Ralf Zimmer, Ludwig Maximilians University in Munich
Members of the Bio-economy Research and Technology Council
Prof. Dr. Dr. h.c. Reinhard F. Hüttl (chairman), Scientific Executive Director of the German Research Centre for Geosciences at the Helmholtz Centre in Potsdam; President of the National Academy of Science and Engineering (acatech); Professor of Soil Protection and Recultivation at the Brandenburg University of Technology in Cottbus
Dr. Dr. h.c. mult. Andreas J. Büchting (deputy chairman), Chairman of the Supervisory Board of KWS SAAT AG
Prof. Dr. Bernd Müller-Röber (deputy chairman), Professor of Molecular Biology, Max Planck Institute of Molecular Plant Physiology and the University of Potsdam
Prof. Dr. Dr. h.c. Joachim von Braun (deputy chairman), Director at the Center for Development Research (ZEF) at the University of Bonn
Prof. Dr. Achim Bachem, Chairman of the Executive Board of the Jülich Research Centre
Dr. Helmut Born, Secretary General of the Deutscher Bauernverband e.V. (German Farmers Association)
Prof. Dr. Hannelore Daniel, Technical University of Munich, Chair for Nutrition Physiology
Prof. Dr. Utz-Hellmuth Felcht, Managing Director, One Equity Partners Europe, Munich; member of the Senate of the National Academy of Science and Engineering (acatech)
Prof. Dr. Thomas Hirth, Head of the Fraunhofer Institute for Interfacial Engineering and Biotechnology and the Institute for Interfacial Engineering at the University of Stuttgart
Prof. Dr. Folkhard Isermeyer, President of the Johann Heinrich von Thünen Institute, Federal Research Institute for Rural Areas, Forestry, and Fisheries in Braunschweig
Dr. Stefan Marcinowski, Member of the Board of Executive Directors of BASF SE; Chairman of the Management Board of the German Biotechnology Industry Association (Deutsche Industrievereinigung Biotechnologie) (DIB)
Prof. Dr. Dr. h.c. Thomas C. Mettenleiter, President of the Friedrich Loeffler Institute, Federal Research Institute for Animal Health, on Riems Island
Dr. Dr. h.c. Christian Patermann, Advisor on knowledge-based bio-economics to the State of North Rhine-Westphalia
Prof. Dr. Alfred Pühler, Centre for Biotechnology/University of Bielefeld
Prof. Dr. Manfred Schwerin, Professor of Animal Breeding at the University of Rostock; Chairman of the Leibniz Institute for Farm Animal Biology (FBN) in Dummerstorf
Prof. Dr. Wiltrud Treffenfeldt, R&D Director for Europe, Middle East and Africa, Dow Europe, Horgen, Switzerland
Prof. Dr. Fritz Vahrenholt, CEO of RWE Innogy GmbH
Dr. Holger Zinke, Chairman of BRAIN AG
Prof. Dr. Alexander Zehnder (permanent guest), Director of the Water Research Institute at the University of Alberta in Edmonton, Canada
The BioEconomyCouncil would like to thank the German Federal Ministry of Education and Research for its funding, as well as the National Academy of Science and Engineering (acatech) for administrative support.
Special thanks are owed to the outside experts who provided valuable information for this paper. The BioEconomyCouncil is solely responsible for the content of the recommendations.
The BioEconomyCouncil’s work is supported by an administrative office:
Dr. Claus Gerhard Bannick (Head)
Dr. Andrea George (academic research assistant)
Dr. Katja Leicht (academic research assistant)
Petra Ortiz Arrebato (assistant)
Ulrike von Schlippenbach (academic research assistant)
Dr. Elke Witt (academic research assistant)
Dr. Eva Wendt (academic research assistant)
Julian Braun, Martin Schmidt (student research assistants)
PUBLICATION DETAILS
Publisher: Published by the BioEconomy Research and Technology Council (BÖR), © BÖR, Berlin (2012)
Registered address: Charlottenstraße 35–36, 10117 Berlin
Design and layout by: Oswald + Martin Werbeagentur, Berlin
Printed by: Brandenburgische Universitätsdruckerei
ISSN 1869-1404, ISBN 978-3-942044-66-0 (print edition), ISBN 978-3-942044-67-7 (online version)
The German National Library lists this publication in the National Bibliography. Detailed bibliographic data can be found at http://dnb.d-nb.de.
Publisher: Forschungs- und Technologierat Bioökonomie (BÖR), © BÖR, Berlin (2012)
Contact: Geschäftsstelle des BioÖkonomieRats, Charlottenstraße 35–36, 10117 Berlin, Tel.: 030 767718911, Fax: 030 767718912, E-mail: [email protected], Web: www.biooekonomierat.de
The German Network for
Bioinformatics Infrastructure
(de.NBI) – a short overview
Prof. Dr. A. Pühler
de.NBI coordinator
Bielefeld, April 2015
Agenda Item 7.2
Content
1 Mission statement of the de.NBI project
2 Composition of the de.NBI consortium
3 The development of the de.NBI initiative
4 The organization of the de.NBI project
5 The eight de.NBI service centers
6 The Central Coordination Unit (CCU)
7 The five Special Interest Groups (SIGs)
7.1 The Special Interest Group SIG 1 “Web Presence”
7.2 The Special Interest Group SIG 2 “Service and Service Monitoring”
7.3 The Special Interest Group SIG 3 “Training and Education”
7.4 The Special Interest Group SIG 4 “Infrastructure and Data Management”
7.5 The Special Interest Group SIG 5 “de.NBI Development”
8 The de.NBI Coordination and Administration Unit (CAU)
8.1 The de.NBI Coordinator
8.2 The de.NBI Administration Office
9 The de.NBI Scientific Advisory Board (SAB)
10 References
1 Mission statement of the de.NBI project
i. The ‘German Network for Bioinformatics Infrastructure’ provides comprehensive first-class
bioinformatics services to users in basic and applied life sciences research.
ii. The de.NBI program coordinates bioinformatics training and education in Germany.
iii. The de.NBI program coordinates the cooperation of the German bioinformatics
community with international bioinformatics network structures.
2 Composition of the de.NBI consortium
The de.NBI consortium consists of 23 project partners
which are organized in 8 service centers. The
locations of the project partners and service centers
are shown in Fig. 2. The internal structure of each
individual service center is presented in Table 1.
Figure 2: Locations of project partners and service centers of the de.NBI initiative
Figure 1: The mission statement of the de.NBI project
Table 1: Internal structure of the de.NBI service centers
(A) ‘Heidelberg Center for Human Bioinformatics (HD-HuB)’
Coordinator: R. Eils, Heidelberg
Partners:
- DKFZ Heidelberg
- EMBL Heidelberg
- Universität Heidelberg
(B) ‘Bielefeld-Gießen Center for Microbial Bioinformatics (BiGi)’
Coordinator: J. Stoye, Bielefeld
Partners:
- Universität Bielefeld
- Universität Gießen
(C) ‘Bioinformatics for Proteomics (BioInfra.Prot)’
Coordinator: M. Eisenacher, Bochum
Partners:
- Medizinisches Proteom-Center der Universität Bochum
- Leibniz-Institut für Analytische Wissenschaften – ISAS – e.V., Dortmund
(D) ‘Center for Integrative Bioinformatics (CIBI)’
Coordinator: O. Kohlbacher, Tübingen
Partners:
- Freie Universität Berlin
- Universität Konstanz
- Universität Tübingen
(E) ‘RNA Bioinformatics Center (RBC)’
Coordinator: R. Backofen, Freiburg
Partners:
- Universität Freiburg
- Universität Leipzig
- Max-Delbrück-Zentrum für Molekulare Medizin Berlin
(F) ‘German Crop BioGreenformatics Network (GCBN)’
Coordinator: U. Scholz, Gatersleben
Partners:
- Leibniz-Institut für Pflanzengenetik und Kulturpflanzenforschung Gatersleben
- Helmholtz-Zentrum München
- Forschungszentrum Jülich
(G) ‘Database Node’
Coordinator: F. O. Glöckner, Bremen
Partners:
- Jacobs University Bremen (SILVA)
- Universität Bremen (PANGAEA)
- Leibniz-Institut DSMZ Braunschweig (BacDive)
- TU Braunschweig (BRENDA)
(H) ‘Data Management Node (NBI-SysBio)’
Coordinator: W. Müller, Heidelberg
Partners:
- HITS Heidelberger Institut für Theoretische Studien
- Universität Rostock
3 The development of the de.NBI initiative
The first step in establishing a German Network for Bioinformatics Infrastructure was initiated
by the Bioeconomy Council publishing a statement in April 2012 entitled “Requirements for a
Bioinformatics Infrastructure in Germany for future research with bioeconomic relevance” [1].
This statement was presented to the Federal Ministry of Education and Research (BMBF). In
May 2013 the BMBF published an announcement [2] for the selection of service centers
that were to form the basis for a bioinformatics infrastructure in Germany. In October 2013,
an international evaluation panel selected eight service centers, which were asked to
form a German Network for Bioinformatics Infrastructure (de.NBI) by developing a joint
proposal during a conceptual phase. To structure the conceptual phase, a
coordinator for the de.NBI project was appointed by the BMBF in January 2014 and an
administration office was established in February 2014. During the first months of
2014 the joint application was completed and approved again by the evaluation panel. The
de.NBI project finally started on March 1, 2015.
4 The organization of the de.NBI project
The organizational chart of the de.NBI project is presented in Figure 3. The Central Coordination Unit
(CCU) is the decision-making body of de.NBI. The CCU consists of nine delegates: eight
delegates are nominated by the service centers, and one additional seat is reserved for the
de.NBI coordinator. The CCU can set up so-called Special Interest Groups (SIGs), which
are responsible for providing solutions to specific questions to be addressed in CCU
meetings. The organization of CCU meetings is in the hands of the Coordination and
Administration Unit (CAU). There is also a Scientific Advisory Board (SAB), which interacts
with the coordinator, the Central Coordination Unit and the service centers.
Figure 3: Organizational chart of the de.NBI project
5 The eight de.NBI service centers
A brief overview of the research topics and the related services offered by the de.NBI
service centers is presented in Fig. 4. Three service centers deal with medical, microbial
and plant genomics, respectively. Two service centers are involved in
proteomics and RNA bioinformatics. One service center is specialized in integrative
bioinformatics. The last two service centers concentrate on data management and data collection.
Figure 4: Overview of the research topics and the related services offered by the units of the ‘German Network for Bioinformatics Infrastructure (de.NBI)’. Each unit of the network, including six service centers (red) and two local data resource nodes (green, blue), provides bioinformatics services in a defined field of scientific expertise, thereby covering many areas of life sciences research.
A more detailed description of the eight de.NBI service centers is given below.
(A) The ‘Heidelberg Center for Human Bioinformatics (HD-HuB)’ provides access to
state-of-the-art bioinformatics infrastructure and know-how in the key application
areas of human genetics and genomics, human microbiomics and systematic phenotyping
of human cells. A special field of expertise of the HD-HuB partners DKFZ,
EMBL and Heidelberg University lies in the area of analysis of next-generation
sequencing data with its tremendous potential for biomedical research.
(B) The ‘Bielefeld-Gießen Center for Microbial Bioinformatics (BiGi)’ combines the
bioinformatics expertise and resource facilities at Bielefeld University and Gießen
University as required in the field of microbial genome and post-genome research.
The center builds on the collective expertise available in the areas of terrestrial, plant,
food, animal, and clinical microbiology and includes recent developments in the field
of synthetic microbiology.
(C) The Service Center ‘Bioinformatics for Proteomics (BioInfra.Prot)’ is composed of
the Bioinformatics/Biostatistics work group of the Medizinisches Proteom-Center
Bochum and the Department of Bioanalytics of the Leibniz-Institut für Analytische
Wissenschaften - ISAS - e.V. The BioInfra.Prot center focuses on bioinformatics for
proteomics and on bioinformatics for human and medical/clinical data. The unit has
deep experience with proteomics tools and proteomics standards and analyses in the
field of clinical and human proteomics.
(D) The ‘Center for Integrative Bioinformatics (CIBI)’ joins the expertise from research
groups of the Freie Universität Berlin, Universität Konstanz and Eberhard-Karls-
Universität Tübingen that have developed valuable bioinformatics resources for next-
generation sequencing analysis, proteomics, metabolomics, and scientific workflows.
CIBI will be an important resource for integrative bioinformatics and its application
fields, offering sustainable software solutions in the form of libraries, tools and
workflows.
(E) The ‘RNA Bioinformatics Center (RBC)’ with research partners from the universities
of Freiburg and Leipzig and the Max Delbrück Center for Molecular Medicine serves
as a contact point for RNA bioinformatic enquiries. The center provides specialized
curated RNA-related information resources and services or expertise in topics like
RNA structure analysis, prediction of ncRNA targets, definition and classification of
RNA transcripts, and the analysis of protein-RNA interactions.
(F) The ‘German Crop BioGreenformatics Network (GCBN)’ provides tailored plant-
specific data and infrastructure to the plant research and plant breeding community.
The partners from Crop Plant Research at Gatersleben, the Helmholtz Center Munich
and the Forschungszentrum Jülich are international leaders in the implementation
and development of plant-genome-oriented bioinformatics and provide expertise in
the statistical analysis and the interpretation of quantitative transcriptomic data. The
‘Leibniz Institute of Plant Genetics and Crop Plant Research’ maintains the Federal
Genebank of Agricultural and Horticultural Crop Species.
(G) The ‘Database Node’ consists of the four databases SILVA, PANGAEA, BacDive,
and BRENDA with quality-controlled reference datasets. These local data resources
provide access to and services for ribosomal RNA genes from all three domains of life
(SILVA), georeferenced data from earth system research (PANGAEA), strain-linked
information on the different aspects of bacterial and archaeal biodiversity (BacDive),
and comprehensive enzyme information (BRENDA). Serving users in academia and
industry is the core mission of the database node. All databases reflect the core
expertise of their home institutes in Bremen and Braunschweig.
(H) The ‘Data Management Node (NBI-SysBio)’ provides bioinformatics support and
standards-based data management for systems biology projects, with a focus on the
provenance of experimental results and on the reproducibility of modeling
experiments, as well as high-quality curated biochemical data for modelers and
experimentalists. The node concentrates on two tools for data management in life
sciences research: SEEK, a catalogue for storage, registration and exchange of data
and models; and SABIO-RK, a data repository for biochemical reaction kinetics.
6 The Central Coordination Unit (CCU)
The management of scientific, technical and administrative aspects of the de.NBI consortium
is the mandate of the Central Coordination Unit (CCU). Hence, the CCU is the main
decision-making body of de.NBI and is responsible for the effective operation of the scientific,
technical and administrative management structure (see Table 2). The principal mission of the
CCU includes: (i) defining the scientific strategy of de.NBI and the internal procedures to achieve
the de.NBI goals, with particular focus on recent scientific advances and breakthroughs; (ii)
measuring the progress and success of services and outputs defined in the work packages
of the de.NBI partners, with particular focus on avoiding overlapping activities; (iii)
evaluating user numbers and user acceptance of the de.NBI services; (iv)
assessing end users’ needs and feedback to continuously improve the de.NBI services; (v)
establishing and monitoring measures for quality control and quality assurance of the
network structures and services; (vi) developing concepts for national and international
networking strategies of de.NBI; (vii) developing concepts for connecting de.NBI to
industry; (viii) developing concepts for the sustainability of the de.NBI services; (ix) developing
scientific concepts and recommendations for the strategic amendment of the de.NBI
consortium in the course of the second call by the BMBF/Pt Jülich. Each year the CCU convenes a
plenary meeting of the de.NBI staff, preferably as a satellite meeting to the de.NBI workshop,
to discuss all matters related to the de.NBI consortium as a whole.
The board of the CCU consists of nine de.NBI partners, namely one elected representative
from each de.NBI unit (i.e. the unit coordinator) and the de.NBI coordinator. It is planned to
meet quarterly for regular CCU meetings that are chaired by the de.NBI coordinator and
organized at the institutes of the de.NBI partners in collaboration with the administration
office. The staff members of the administration office take part in the CCU meetings as
guests. The efficient functioning of the CCU is governed by general rules of internal procedure
to be drawn up by the CCU board at its initial meeting. This compilation of general
rules is a fundamental basis for governing meetings and other scientific and administrative
operations of the CCU and the de.NBI consortium. The rules of internal procedure are
supplemented by a consortium agreement between the institutes of all de.NBI partners,
providing a legal business document outlining the basic terms of scientific collaboration
within de.NBI, each party’s responsibilities and any additional warranties or promises.
Moreover, the consortium agreement is a legal document for all de.NBI partners to confirm
the acceptance of the organization and management structure of de.NBI and, in particular, all
items related to scientific and financial reports to be provided to the BMBF/Pt Jülich.
The work of the CCU is supported by subcommittees called Special Interest
Groups (SIGs), which focus on scientific topics relevant to operational and strategic
decisions by the de.NBI management. SIGs are small discussion groups of de.NBI experts.
SIG meetings are chaired by a de.NBI partner nominated by the CCU. Minutes of the
meetings and compiled recommendations of the SIGs are transmitted to the CCU for further
discussion and deliberation. The CCU can establish either temporary SIGs on specific issues
or permanent SIGs on long-standing issues of de.NBI. It is currently planned to
establish five SIGs: SIG Web presence; SIG Service and Service Monitoring; SIG Training
and Education; SIG Infrastructure and Data Management; SIG de.NBI Development.
Table 2: Tasks of the Central Coordination Unit (CCU) – In brief
Establishing procedural rules for the CCU and the de.NBI consortium
Developing a consortium agreement for all de.NBI partners
Discussing and deciding on strategic goals of de.NBI, like
o development of training and education concepts
o development of guidelines for the cooperation with industry
o development of a concept for sustainability
Discussing the development of contacts to existing national and international bioinformatics networks
Controlling the project aims of the de.NBI partners as defined in the work packages
Approving the periodical reports for the BMBF/PtJ Jülich and the SAB
7 The five Special Interest Groups (SIGs)
SIGs are small discussion groups of de.NBI experts or representatives from all de.NBI units
and are established by the CCU. Meetings of the SIGs are generally chaired by a member of
the CCU and are attended by a member of the Coordination and Administration Unit.
Minutes of the meetings and compiled recommendations of the SIGs are presented to the
CCU for further discussion and deliberation. It is planned that SIG meetings will take place
four times per year, at least in the early stages of the establishment phase of de.NBI. SIGs
are the CCU’s instrument for addressing the open questions of the review panel
related to operational and strategic tasks of the de.NBI management. This
instrument is strengthened by allowing each SIG to recommend how work connected to
the key de.NBI tasks of service and education is allocated among the different de.NBI
units.
7.1 The Special Interest Group SIG 1 “Web Presence”
The SIG Web presence takes care of developing a concept for one common website. This
SIG will continue the work already started during the design phase of de.NBI by members of
the writing team. It is currently planned to commission a professional website design service to
create a modern web presence, including the website design, corporate design, and logo
design. This common website will include general information on the de.NBI consortium and
detailed information for users on the de.NBI services, the training and education program,
other de.NBI activities, user feedback and discussion fora, and contact details (“hotlines”).
The common website of the de.NBI consortium will also include detailed information on how
to get access to and how to use the de.NBI services, and on how to participate in the de.NBI
training and education program. As part of the central web presence, social media will be
explored as a dissemination tool. Twitter and Facebook can be effective channels for pushing
out short updates on the status of the services and on new releases. What will be
available for the user is clearly outlined by each de.NBI partner in the concept of the project.
It is the explicit mandate of the de.NBI personnel to provide the services at the individual
service centers and to organize and conduct training and education activities. The expert
representative for services at the AO will take care of the maintenance and future
development of the website in collaboration with the SIG Web presence and the de.NBI
partners.
7.2 The Special Interest Group SIG 2 “Service and Service Monitoring”
The central de.NBI website will include overviews of the software tools, databases, web
servers, computing facilities, and other services offered by de.NBI. The SIG Service and
service monitoring will ensure that all services are clearly presented and documented
on the website. The SIG will develop a general workflow and guidelines for handling user
requests for the various services offered by de.NBI.
As outlined in the de.NBI project concept, user numbers and user acceptance of the
services can only be measured with a set of different parameters, given the various
user profiles and the variety of services provided by the de.NBI units. The concept of the
project presents an allocation of selected parameters to five alternative types of services and
a monitoring scheme for user numbers and user acceptance. Each de.NBI unit therefore
selects the best-fitting parameters for each of the provided services, starts continuous
monitoring at the beginning of the establishment phase, and in this way defines the
baseline of user numbers. The continuous monitoring of user numbers or usage statistics is
complemented by additional measures to assess customer satisfaction. User feedback
is collected locally at the service centers after the delivery of the services and centrally in
the user feedback forum on the common website. User meetings arranged by the
service units, for instance in the course of training and education, are additional effective
means to collect user feedback on specific services. All these items are handled by the
SIG Service and service monitoring. It also takes care of establishing effective procedures to
record and evaluate the feedback of users. Detailed usage statistics and end-user
statements about the service quality and ease of use are provided for the mid-term review.
7.3 The Special Interest Group SIG 3 “Training and Education”
The de.NBI consortium will develop an integrated educational concept to coordinate the
different activities of the de.NBI partners and to ensure the harmonious functioning of the
services in the field of education and training, as an integrated concept has clear advantages
over non-coordinated teaching of related subjects. Detailed concepts for integrated training
and education will be developed by the SIG Training and Education. The resulting training and
education plan of the de.NBI consortium will be displayed on the common website. However, integrated
training remains a form of traditional teaching and is sometimes less conducive to efficient
learning than educational activities designed to help users achieve the
necessary integration through their own efforts. Therefore, the de.NBI concept will contain
additional activities that go beyond the standard training courses and will make use of the wide
range of teaching and service activities available via the internet.
7.4 The Special Interest Group SIG 4 “Infrastructure and Data Management”
The de.NBI consortium will take care of developing and implementing technical standards to
maximize compatibility, interoperability, safety, repeatability, and quality of the services. The
SIG Infrastructure and data management will delineate a catalogue of harmonized standards
(e.g. standard data formats) and guidelines (e.g. SOPs or minimal information documents)
for the de.NBI units, including rules for the handling and the release of data and the hand-off
of datasets to public data repositories. As de.NBI offers services for users in life sciences
research, these users are in general the owners of the data and hence only the users/owners
can authorize a data release. The members of the de.NBI consortium
follow all national and international regulations, in particular laws governing data protection
and data security when handling clinical data, where such handling is legally permitted at all. Moreover,
the de.NBI consortium will adhere to all requirements as detailed in the ethics approval of the
particular project.
7.5 The Special Interest Group SIG 5 “de.NBI Development”
The SIG de.NBI development supports the management of administrative, scientific and
technical aspects of the de.NBI consortium. It will also assess the prospects for the national
and international cooperation of de.NBI with other bioinformatics network structures and
research networks, subject to agreement by the BMBF. The SIG de.NBI development will
also support the process of communicating the value of de.NBI services to customers for the
purpose of selling these services, in particular to customers from industry. It will moreover
develop marketing concepts for customer relationship management. Furthermore, the need
for integration of new services into de.NBI will be assessed in the future.
8 The de.NBI Coordination and Administration Unit (CAU)
8.1 The de.NBI Coordinator
The principal mission of the de.NBI coordinator, who is appointed by the BMBF, is to build
the overall de.NBI service network and to integrate it scientifically into prominent national and
international bioinformatics network structures. The de.NBI coordinator is the chairman and a
voting member of the Central Coordination Unit (see 6) and is responsible for the overall
organization, the management structure and the shaping of future scientific strategies of
de.NBI. The collaboration between the de.NBI coordinator and the Central Coordination Unit
(CCU) will be defined by the CCU in internal rules of procedure. The de.NBI coordinator is
the highest authority of the de.NBI consortium and represents the interests of the de.NBI
partners to third parties. The de.NBI coordinator is the contact person for the chairman and
members of the scientific advisory board (see 9), the BMBF/Pt Jülich, and other official
bodies in the course of national and international networking. It is the responsibility of the
de.NBI coordinator to oversee the scientific workflow within the de.NBI consortium and to
ensure accurate delivery of reports to the scientific advisory board and the BMBF/Pt Jülich.
8.2 The de.NBI Administration Office
The de.NBI Administration Office (AO) is part of the Coordination and Administration Unit (CAU) and
the central support entity for the de.NBI coordinator and the Central Coordination Unit (Fig. 3). It
provides a range of administrative and management services to the consortium and serves
as a connecting point between the de.NBI coordinator and the Central Coordination Unit. The
AO facilitates the communication within the de.NBI consortium, with users of de.NBI services
and the public on behalf of the de.NBI coordinator and the Central Coordination Unit. The AO
works to build and sustain relationships with users of de.NBI services, industrial partners and
offices that may coordinate related regional, national or international activities. The staff of
the AO, in collaboration with the Special Interest Group Web Presence (see 7.1), is
responsible for maintaining a de.NBI web page to bring together in a uniform way the mission
and organization of de.NBI, the scientific information related to services by the de.NBI
partners, schedules regarding bioinformatics training, education and scientific activities, and
guidelines for academic and industrial users of de.NBI services.
9 The de.NBI Scientific Advisory Board (SAB)
The Scientific Advisory Board (SAB) of de.NBI is independently established by the BMBF
and provided with a broad mandate to advise the de.NBI partners and in particular the
Central Coordination Unit on technical, organizational and strategic matters related to the
de.NBI goals. The principal mission of the SAB is summarized in Table 3. Evaluation of the
de.NBI consortium by the SAB is scheduled as an annual process that is associated with the
internal de.NBI workshop. It is planned that the Central Coordination Unit provides the BMBF
with a list of national and international candidate experts to serve as committee members of
the SAB. The scientific members of the SAB (up to 6 independent experts) are nominated by
the BMBF. The elected chairman of the SAB reports directly to the BMBF. The board should
be appointed at the beginning of the establishment phase of de.NBI.
Table 3: Tasks of the Scientific Advisory Board (SAB) – In brief
Advising the CCU on science, technology and economic issues
Advising the CCU on technical, organizational and strategic matters
Reviewing the scientific and technical development of the de.NBI services
Reviewing the technical basis and infrastructure of the de.NBI partners
Identifying and reallocating scientific needs and services
Reporting decisions and recommendations to the CCU and the BMBF/PtJ Jülich
10 References
[1] Anforderungen an eine Bioinformatik-Infrastruktur in Deutschland zur Durchführung von
Bioökonomierelevanter Forschung, Empfehlungen des Bioökonomierats 06 (2012)
http://www.biooekonomierat.de/publikationen.html?tx_rsmpublications_pi1[publication]=7&tx_rsmpublications_pi1[action]=show&tx_rsmpublications_pi1[controller]=Publication&cHash=d4769cfcee403d5ac8f14b01c3b3930d
English Version: Requirements for a Bioinformatics Infrastructure in Germany for future
Research with bio-economic Relevance (PDF, 590 kb)
[2] Bekanntmachung: Deutsches Netzwerk für Bioinformatik-Infrastruktur
English Version: Deutsches Netzwerk für Bioinformatik-Infrastruktur (PDF, 166 KB)
https://www.ptj.de/nbi
Contact:
Prof. Dr. Alfred Pühler
Coordinator de.NBI - German Network for Bioinformatics Infrastructure
Center for Biotechnology (CeBiTec)
Bielefeld University
33594 Bielefeld
Phone: +49 521 106 8750
Fax: +49 521 106 89046
e-mail: [email protected]
Funding reference numbers (Förderkennzeichen): 031A532 - 031A540