BeeSpace: An Interactive Environment
for Functional Analysis of Social Behavior
Bruce Schatz
Institute for Genomic Biology
University of Illinois at Urbana-Champaign
www.beespace.uiuc.edu
First Annual BeeSpace Workshop, University of Illinois, June 6, 2005
BeeSpace FIBR Project
The BeeSpace project is an NSF FIBR flagship (Frontiers in Integrative Biological Research): $5M for 5 years at the University of Illinois
Analyzing Nature and Nurture in Societal Roles using honey bee as model
(Functional Analysis of Social Behavior)
Genomic technologies in wet lab and dry lab
Bee [Biology]: gene expressions
Space [Informatics]: concept navigations
for Social Beehavior
Complex Systems I
Understanding Social Behavior
Honey Bees have only 1 million neurons
Yet… A Worker Bee exhibits Social Behavior!
She forages when she is not hungry but the Hive is
She fights when she is not threatened but the Hive is
for Functional Analysis
Complex Systems II
Understanding Functional Analysis
Molecular Mechanisms of Social Behavior Can only be Discovered via the Interactive Navigations of Distributed Systems
The Interspace is the next generation of the Net (beyond the Web), where Concept Navigation across Distributed Communities is routine
System Architecture
Post-Genome Informatics
Classical Organisms have extensive Genetic Descriptions!
There will be NO more classical organisms beyond Mice and Men other than Worms and Flies, Yeasts and Weeds.
So we must use comparative genomics to classical organisms, via sequence homologies and literature analysis.
Automatic annotation of genes to standard classifications, such as Gene Ontology via sequence homology.
Automatic analysis of functions to scientific literature, such as concept spaces via text mining.
Descriptions in Literature MUST be used for future interactive environments for functional analysis!
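The idea of automatic annotation via sequence homology can be illustrated with a minimal sketch. It assumes homology search has already been run and produced scored hits; the `Hit` record, the `transfer_annotations` function, the E-value cutoff, and all gene and term names are hypothetical and are not taken from BeeSpace or BeeBase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query_gene: str    # gene in the newly sequenced organism (hypothetical name)
    subject_gene: str  # homolog in a classical model organism (hypothetical name)
    e_value: float     # smaller means a stronger match


def transfer_annotations(hits, known_terms, max_e_value=1e-10):
    """Copy classification terms from the best-scoring homolog of each query gene.

    The cutoff is an illustrative assumption, not a recommended value.
    """
    best = {}
    for hit in hits:
        if hit.e_value > max_e_value:
            continue  # too weak to trust
        current = best.get(hit.query_gene)
        if current is None or hit.e_value < current.e_value:
            best[hit.query_gene] = hit
    return {
        gene: sorted(known_terms.get(hit.subject_gene, ()))
        for gene, hit in best.items()
    }


# Toy data: two hits for one bee gene, one weak hit for another.
hits = [
    Hit("bee_gene_1", "fly_gene_A", 1e-50),
    Hit("bee_gene_1", "worm_gene_B", 1e-20),
    Hit("bee_gene_2", "fly_gene_C", 1e-3),  # above the cutoff, ignored
]
known_terms = {
    "fly_gene_A": {"term_foraging", "term_kinase"},
    "worm_gene_B": {"term_kinase"},
}
annotations = transfer_annotations(hits, known_terms)
print(annotations)
```

Genes without a sufficiently strong homolog stay unannotated in this sketch, which is where literature analysis would have to take over.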
Informational Science
Computational Science is the Third Branch of Science
(beyond Experimental and Theoretical)
Genes are Computed, Proteins are Computed, Sequence “equivalences” are Computed.
Informational Science is coming to be accepted as The Fourth Branch of Science
Based on Information Science technologies for Functional Mining of Information Sources
Comparative Analysis within the Dry Lab of Biological Knowledge
Biology: The Model Organism
The Western Honey Bee, Apis mellifera, has become a primary model for social behavior
Complex social behavior in controllable urban environment
Normal Behavior – honey bees live in the wild
Controllable Environment – hives can be modified
Small size manageable with current genomic technology
Capture bees on-the-fly during normal behavior
Record gene expressions for whole-brain or brain-region
(Note logistical limitations with bees and expressions)
Informatics: From Bases to Spaces
data Bases support genome data
e.g. FlyBase has sequences and maps
Genes annotated by Gene Ontology and linked to biological literature
BeeBase (Christine Elsik, Texas A&M)
Uses computed homologies to annotate genes
information Spaces support biological literature
e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions
Project Investigators
The BeeSpace project is an NSF FIBR flagship (Frontiers in Integrative Biological Research): $5M for 5 years at the University of Illinois
Biology
Gene Robinson, Entomology (behavioral expression)
Susan Fahrbach, Wake Forest (anatomical localization)
Sandra Rodriguez-Zas, Animal Sciences (data analysis)
Informatics
Bruce Schatz, Library & Information Science (systems)
ChengXiang Zhai, Computer Science (text analysis)
Chip Bruce, Library & Information Science (users)
Education and Outreach
Explaining Social Behavior at all Levels
Graduate Students and Postdocs as System Users
5 early adopter labs then 15 international labs
Undergraduates to plan Bioinformatics Course through Susan Fahrbach at Wake Forest
Run Workshop for Middle School Minorities through UIUC SummerMath (George Reese)
University High School Biology Courses (David Stone)
Home Hi Middle School for Girls Science (Jim Buell)
BeeSpace GOALS
Analyze the relative contributions of Nature and Nurture in Societal Roles in Honey Bees
Experimentally measure differential gene expression for important societal roles during normal behavior
varying heredity (nature) and environment (nurture)
Interactively annotate gene functions for important gene clusters using concept navigation across biological literature representing community knowledge
Concept Navigation in BeeSpace
[Figure: concept navigation diagram. Labels: Neuroscience Literature; Molecular Biology Literature; Bee Literature; FlyBase, WormBase; Bee Genome; Brain Region Localization; Brain Gene Expression Profiles; Behavioral Biologist; Molecular Biologist; Neuroscientist]
BeeSpace Software Environment
Will build a Concept Space of Biomedical Literature for Functional Analysis of Bee Genes
- Partition Literature into Community Collections
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Links from Documents into Databases
Locate Candidate Genes in Related Literatures then follow links into Genome Databases
BeeSpace Software Implementation
Natural Language Processing
- Identify noun phrases
- Recognize biological entities
Statistical Information Retrieval
- Compute statistical contexts
- Support conceptual navigation
Network Information System
- Concept switch across community collections
- Semantic Links into biological databases
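One way to picture "statistical contexts" supporting conceptual navigation is a co-occurrence table over noun phrases. The sketch below is an assumption about the general technique, not the BeeSpace implementation: it takes the noun phrases of each document as already extracted (the natural language processing step is skipped), and the function names and toy phrases are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_concept_space(documents):
    """Count how often two phrases appear in the same document."""
    cooccur = defaultdict(Counter)
    for phrases in documents:
        for a, b in combinations(sorted(set(phrases)), 2):
            cooccur[a][b] += 1
            cooccur[b][a] += 1
    return cooccur


def related_concepts(space, concept, top_n=3):
    """Rank neighbors by co-occurrence count, breaking ties alphabetically."""
    neighbors = space.get(concept, Counter())
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Toy collection: each document is a list of pre-extracted noun phrases.
docs = [
    ["foraging", "gene expression", "honey bee"],
    ["foraging", "honey bee", "division of labor"],
    ["gene expression", "brain", "honey bee"],
]
space = build_concept_space(docs)
print(related_concepts(space, "foraging", top_n=2))
```

Navigation then amounts to moving from a concept to its highest-ranked neighbors; a real system would likely weight counts rather than use them raw.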
BeeSpace Information Sources
Biomedical Literature
- Medline (medicine)
- Biosis (biology)
- Agricola, CAB Abstracts, Agris (agriculture)
Model Organisms (heredity)
- Gene Descriptions (FlyBase, WormBase)
Natural Histories (environment)
- BeeKeeping Books (Cornell Library, Harvard Press)
Worm Community System (1991)
WCS Information Sources
Literature: Biosis, Medline, newsletters, meetings
Data: Genes, Maps, Sequences, strains, cells
WCS Interactive Environment
Browsing: search, navigation
Filtering: selection, analysis
Sharing: linking, publishing
WCS: 250 users at 50 labs across Internet (1991)
NSF National Collaboratories Flagship
[Figure: WCS Molecular]
[Figure: WCS Cellular]
Medical Concept Spaces (1998)
Medical Literature (Medline, 10M abstracts)
Partition with Medical Subject Headings (MeSH)
Community is all abstracts classified by core term
40M abstracts containing 280M concepts
computation is 2 days on NCSA Origin 2000
Simulating World of Medical Communities
10K repositories with > 1K abstracts (1K with > 10K)
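Partitioning by core term can be sketched as grouping abstracts under each heading they carry and keeping only the groups large enough to count as a repository. This is an illustrative assumption about the approach: the function name, the size threshold, and the toy records are hypothetical, and the headings are used only as sample labels.

```python
from collections import defaultdict


def partition_by_heading(abstracts, min_size=1):
    """Group abstract ids under every heading assigned to them."""
    communities = defaultdict(list)
    for abstract_id, headings in abstracts:
        for heading in headings:
            communities[heading].append(abstract_id)
    # Keep only communities with at least min_size abstracts.
    return {h: ids for h, ids in communities.items() if len(ids) >= min_size}


# Toy records: (abstract id, assigned headings).
abstracts = [
    ("a1", ["Arthritis, Rheumatoid", "Analgesics"]),
    ("a2", ["Analgesics"]),
    ("a3", ["Gastrointestinal Hemorrhage", "Analgesics"]),
]
all_communities = partition_by_heading(abstracts)
large_communities = partition_by_heading(abstracts, min_size=2)
memberships = sum(len(ids) for ids in all_communities.values())
print(large_communities, memberships)
```

In this toy case three abstracts yield five community memberships, because an abstract with several headings is placed in several communities.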
Navigation in MedSpace
For a patient with Rheumatoid Arthritis
Find a drug that reduces the pain (analgesic) but does not cause stomach (gastrointestinal) bleeding
Choose Domain
Concept Search
Concept Navigation
Retrieve Document
CONCEPT SWITCHING
“Concept” versus “Term”: a concept is a set of “semantically” equivalent terms
Concept switching: region to region (set to set) match
[Figure: a term within a semantic region]
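A set-to-set match can be sketched by treating each concept as a set of terms and picking the region in the target space that overlaps most with the source region. Jaccard overlap is used here purely as an assumed scoring rule; the slides do not say which measure is used, and the function names and toy regions are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two term sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def switch_concept(source_region, target_space):
    """Return the best-overlapping target region and its score."""
    best_name, best_score = None, 0.0
    for name, region in sorted(target_space.items()):
        score = jaccard(source_region, region)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Toy regions: a source concept and two candidate regions in another space.
source = {"pain relief", "analgesic", "painkiller"}
target_space = {
    "analgesia": {"analgesic", "pain relief", "antinociception"},
    "bleeding": {"hemorrhage", "gastrointestinal bleeding"},
}
match = switch_concept(source, target_space)
print(match)
```

Matching regions rather than single terms lets a switch succeed even when the two communities share only part of their vocabulary.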
Concept Space
Biomedical Session
Categories and Concepts
Concept Switching
Document Retrieval
Biological Concept Spaces (2006)
Compute concept spaces for All of Biology
BioSpace across entire biomedical literature
50M abstracts across 50K repositories
Use Gene Ontology to partition literature into biological communities for functional analysis
GO same scale as MeSH but adequate coverage?
GO light on social behavior (biological process)
Interactive Functional Analysis
BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system that goes beyond a searchable database, using statistical literature analyses to discover functional relationships between genes and behavior.
Genes to Behaviors
Behaviors to Genes
Concepts to Concepts
Clusters to Clusters
Navigation across Sources
BeeSpace Information Sources
General for All Spaces:
Scientific Literature
- Medline, Biosis, Agricola, Agris, CAB Abstracts
- partitioned by organisms and by functions
Model Organisms
- Gene Descriptions (FlyBase, WormBase, MGI, OMIM, SGD, TAIR)
Special Sources for BeeSpace:
- Natural History Books (Cornell Library, Harvard Press)
XSpace Information Sources
Organize Genome Databases (XBase)
Compute Gene Descriptions from Model Organisms
Partition Scientific Literature for Organism X
Compute XSpace using Semantic Indexing
Boost the Functional Analysis from Special Sources
Collecting Useful Data about Natural Histories
e.g. CowSpace Leverage in AIPL Databases
Towards the Interspace
The Analysis Environment technology is GENERAL!
BirdSpace? BeeSpace? PigSpace? CowSpace? BehaviorSpace? BrainSpace?
BioSpace… Interspace