Systems Biology:
A Path to the Future
Nat Goodman
Institute for Systems Biology
February 14, 2005
Systems biology is hot!
Slide 3, CBW Keynote, Feb 14, 2005, Nat Goodman
But what is it?
System: collection of interacting parts
  Everything is a system – even elementary particles (if you believe string theory)!!
Systems biology not study of biological systems
  That’s all of biology!
Use of systems thinking to do biology
  Like molecular biology
Most people mean systems molecular biology
Examples (1)
Process: system whose interesting properties change over time
Examples (2)
Structure: system whose interesting properties change over space
Examples (3)
Many systems have interesting properties that vary with time and space
Can often separate dimensions – study time & space independently
Examples (4)
Systems biology is…
Use of systems thinking to do biology
  System: collection of interacting parts
  Interesting system
    Whole greater than sum of parts
    Properties of system nontrivial combination of properties of parts plus interactions
Explain whole in terms of parts + interactions
  Rigorously describe how parts + interactions generate system properties
  Study properties that parts + interactions can induce
  Emergent properties: “surprising” system properties
Explain how molecules and molecular interactions generate properties of cells or sub-cellular phenomena
Origins of systems biology
Systems biology
Pathway models of biological systems
Mathematical models of biological systems
High throughput laboratory technology
Fully exploit big datasets
Don’t cherry pick!
Understand 1000s of genes
Less bias from hypotheses
The basic recipe
Aitchison & Galitski. Inventories to insights. J Cell Biol. 2003. PMID: 12743099.
1. Select experimentally tractable biological model
2. Devise predictive mathematical model for phenomenon of interest (well… you’re supposed to do this)
3. Generate / assemble global datasets under baseline conditions
4. Perturb biological model
5. Generate global datasets under perturbed conditions
6. Compare predictions with reality and revise model (again… that’s the idea)
7. Repeat until done
Digression: homeostasis
How does cell maintain target levels of molecules?
If I knock down expression with RNAi, what happens to expression level?
Standard problems in control theory!
Presumably some kind of feedback
Hmm… why should this work at all??
Control theory
[Diagram: feedback loop with a target, a comparator, a sensor, an effector, and the system being controlled]
A familiar illustration
Home heating model
temp_{t+1} = temp_t + temp.gain - temp.loss
temp.loss(temp) = k_{loss} (temp - temp.outside)
[if heat on] temp.gain(temp) = k_{gain} (temp.radiator - temp)
[if heat off] temp.gain = 0
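A minimal Python sketch of the home heating model above, assuming a simple thermostat that switches the heat on whenever temp is below a target. The numeric values (outside 0°, radiator 60°, target 20°, k_loss = 0.02) and the helper names are illustrative assumptions, not values from the slides. The gain/loss ratio is defined at the target temperature, so ratio = 1 corresponds to gain(20°) = loss(20°).

```python
# Assumed, illustrative constants (not from the slides).
TEMP_OUTSIDE = 0.0
TEMP_RADIATOR = 60.0
TARGET = 20.0
K_LOSS = 0.02


def k_gain_for_ratio(ratio):
    """Return k_gain such that gain(TARGET) = ratio * loss(TARGET)."""
    loss_at_target = K_LOSS * (TARGET - TEMP_OUTSIDE)
    return ratio * loss_at_target / (TEMP_RADIATOR - TARGET)


def simulate(ratio, steps=200, temp=10.0):
    """Iterate temp[t+1] = temp[t] + temp.gain - temp.loss under a thermostat."""
    k_gain = k_gain_for_ratio(ratio)
    history = [temp]
    for _ in range(steps):
        heat_on = temp < TARGET  # comparator: sensed temp vs. target
        gain = k_gain * (TEMP_RADIATOR - temp) if heat_on else 0.0
        loss = K_LOSS * (temp - TEMP_OUTSIDE)
        temp = temp + gain - loss
        history.append(temp)
    return history


if __name__ == "__main__":
    for ratio in (10, 2, 1, 0.5, 0.1):
        final = simulate(ratio)[-1]
        print(f"gain = {ratio} x loss at target: final temp {final:.2f}")
```

With these assumed constants, ratios above 1 hover around the target, a ratio of 1 only creeps up toward it, and ratios below 1 settle well below it.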
Behavior: gain(20°) = loss(20°)
Behavior: balanced (gain = 2 × loss)
Behavior: fast! (gain = 10 × loss)
Behavior: cold! (gain ≤ loss)
[Plot: curves labeled 1, 0.5, 0.10]
Behavior: noise (50%)
[Plot: curves labeled 1, 25, 10]
Much less stable!
System out-of-control except in narrow range
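The slides do not say how the 50% noise was applied. The sketch below is one possible reading, assuming each step's gain and loss terms are scaled by independent uniform random factors in [1 - noise, 1 + noise]. All constants and names are illustrative assumptions.

```python
import random
import statistics


def simulate_noisy(k_gain, k_loss=0.02, noise=0.5, steps=2000, seed=0,
                   temp=10.0, target=20.0, temp_outside=0.0,
                   temp_radiator=60.0):
    """Home heating model with multiplicative noise on gain and loss."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    history = []
    for _ in range(steps):
        heat_on = temp < target
        gain = k_gain * (temp_radiator - temp) if heat_on else 0.0
        loss = k_loss * (temp - temp_outside)
        # Assumed noise model: scale each term by a factor in [1-noise, 1+noise].
        gain *= 1 + rng.uniform(-noise, noise)
        loss *= 1 + rng.uniform(-noise, noise)
        temp = temp + gain - loss
        history.append(temp)
    return history


def summarize(history, burn_in=100):
    """Mean and standard deviation after discarding the warm-up steps."""
    tail = history[burn_in:]
    return statistics.mean(tail), statistics.pstdev(tail)


if __name__ == "__main__":
    for noise in (0.0, 0.5):
        mean, spread = summarize(simulate_noisy(k_gain=0.1, noise=noise))
        print(f"noise={noise}: mean {mean:.2f}, spread {spread:.2f}")
```

Comparing the mean and spread across gain values and seeds is a quick way to explore how noise shifts the average level in this toy model.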
Take home message
Feedback works if high enough gain
  Knockdown (e.g., RNAi) works if gain lowered enough to break feedback
  Overexpression may have little effect or a lot (by driving system into feedback)
Noise complicates picture immensely
  Increasing noise can increase or decrease mean level
Simple systems can have complex behavior
  When you see complex behavior, don’t assume complex system
  Complex behavior can be modeled and understood
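The first point can be checked against the home heating model from the earlier slides. As a sketch, assume the heat stays on and write temp* for the resulting steady state and target for the thermostat setting (both symbols are introduced here, not in the slides). Setting temp_{t+1} = temp_t in the model gives:

```latex
\begin{aligned}
\text{temp}^{*} &= \text{temp}^{*} + k_{gain}\,(\text{temp.radiator} - \text{temp}^{*}) - k_{loss}\,(\text{temp}^{*} - \text{temp.outside}) \\
\Rightarrow\quad \text{temp}^{*} &= \frac{k_{gain}\,\text{temp.radiator} + k_{loss}\,\text{temp.outside}}{k_{gain} + k_{loss}} \\
\text{temp}^{*} \ge \text{target} &\iff k_{gain}\,(\text{temp.radiator} - \text{target}) \ge k_{loss}\,(\text{target} - \text{temp.outside})
\end{aligned}
```

In words: the thermostat can hold the target only if gain at the target is at least as large as loss at the target. Lowering k_gain far enough (the knockdown analogy) breaks that condition, and the system then settles at temp*, below the target.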
Models
Abstract or theoretical representation of phenomenon
  Represents some aspects of reality and ignores others
Simple biological example
  Elements: genes, proteins (assumed identical)
  Protein-protein interactions from yeast two hybrid
  Protein-DNA interactions from literature
  mRNA abundance from microarrays
  protein abundance assumed identical
Informal vs. formal (mathematical) models
  Informal communicate ideas among scientists
  Formal can be analyzed, simulated rigorously
Static vs. dynamic models
Informal models (apoptosis)
Biocarta
Informal models (apoptosis)
Creagh et al. Caspase-activation pathways in apoptosis and immunity. Immunol Rev. 2003 Jun;193:10-21.
Static models (graph)
Parts and relationships
Graphs commonly used formalism in systems biology
Graph properties
Small world
  Nodes closer to each other than expected
Scale free
  Power law distribution of neighbors
  More highly connected nodes (hubs) than expected
Self-similar
  Graph properties preserved when clustered by distance
Or maybe not
  Recent analysis suggests properties arise from errors
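Two of the graph properties above can be measured directly: the degree distribution (relevant to "scale free" and hubs) and the mean shortest-path length (relevant to "small world"). The sketch below uses only the Python standard library on a toy interaction graph. The node names and edges are made up for illustration, and a real analysis would compare these numbers against a random-graph baseline.

```python
from collections import Counter, deque


def build_graph(edges):
    """Undirected graph as a dict mapping node -> set of neighbors."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_distribution(graph):
    """Counter mapping degree -> number of nodes with that degree."""
    return Counter(len(neighbors) for neighbors in graph.values())


def mean_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs."""
    total = pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical toy network: one hub (H) with four partners, plus one extra edge.
toy = build_graph([("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")])
print(degree_distribution(toy))
print(mean_path_length(toy))
```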
Self-similarity
Song et al. Self-similarity of complex networks. Nature. 2005 Jan 27;433(7024):392-5.
Molecular interaction map
Aladjem et al. Molecular interaction maps--a diagrammatic graphical language for bioregulatory networks. Sci STKE. 2004 Feb 24;2004(222):pe8.
Dynamic models (biological types)
Metabolic pathways
  Produce substances
Signal transduction pathways
  Transmit and transform information
Gene regulatory networks
  Control gene expression
  Maintain steady state or guide cell to new steady state
  Major focus of systems biology
General regulatory networks
  Combine all of above
Dynamic models (mathematical types -1)
Mechanistic
  protein A binds B activating C which travels to nucleus and promotes transcription of D
Functional
  proteins A & B required for expression of D
Quantitative
  a units of A and b units of B lead to d units of D
Many are graphical
Dynamic models (mathematical types -2)
Basic definitions
  State: ensemble of properties at one point
  State space: set of all possible states
  Transition rules: function that maps state into next state
Allowable states (for abundance or concentration)
  Boolean: on / off
  Qualitative: e.g., high, medium, low
  Stochastic: integers, e.g., number of molecules
  Continuous: real numbers
Transition rules
  Deterministic or probabilistic
  Mathematical framework
    differential equations, difference equations
    boolean logic, general mathematical or computational logic
    Bayesian or other probabilistic networks
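The definitions above can be illustrated with the smallest case: Boolean states and a deterministic transition rule. The sketch below reuses the A, B, C, D example from the previous slide. The specific update rules (C turns on when A and B are both on, D follows C one step later, A and B are held fixed as inputs) are an assumed simplification for illustration.

```python
from itertools import product

GENES = ("A", "B", "C", "D")


def step(state):
    """Transition rule: map a state (a, b, c, d) to the next state."""
    a, b, c, d = state
    # Assumed rules: A and B are inputs; C needs A and B; D follows C.
    return (a, b, a and b, c)


def state_space():
    """All possible Boolean states, one on/off value per gene."""
    return list(product((False, True), repeat=len(GENES)))


def trajectory(state, max_steps=10):
    """Apply the transition rule until the state stops changing."""
    seen = [state]
    for _ in range(max_steps):
        state = step(state)
        if state == seen[-1]:
            break
        seen.append(state)
    return seen


def steady_states():
    """States that the transition rule maps to themselves."""
    return [s for s in state_space() if step(s) == s]


if __name__ == "__main__":
    for s in trajectory((True, True, False, False)):
        print(dict(zip(GENES, s)))
    print(len(state_space()), "states,", len(steady_states()), "steady states")
```

Swapping the Boolean tuple for integers or real numbers, and `step` for a stochastic or differential update, gives the other model classes listed on the slide.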
Dynamic models (usage)
Model construction
  Automatic inference from data – central topic in systems biology
  Manual, hand crafted by experts
Prediction
  Simulation
  Mathematical analysis possible sometimes
endo16 cis-regulatory system
Davidson et al. A genomic regulatory network for development. Science. 2002 Mar 1;295(5560):1669-78.
Data types (1)
Systems biology has voracious appetite for data
  Omnivorous!
Large scale laboratory datasets
  Genome and gene sequences
  mRNA abundance (aka gene expression profiles)
  Protein abundance and identification
  Protein-protein interactions
  Protein-DNA interactions (aka transcription factor binding sites)
  Gene-phenotype and gene-gene relationships
  Sub-cellular protein localization
Data types (2)
Lab data augmented by computational predictions
  Protein-protein interactions inferred from other species
  Protein-DNA interactions (aka transcription factor binding site prediction)
  Identification of protein binding domains from sequence or structure
  Functional clustering through data and text mining
Biological interpretation requires connecting novel data to biological “truth”
  Manually curated datasets produced by experts
  Ontologies, e.g., GO
  Some curated datasets quite large and greatly expand the data available from large scale experiments
Data fusion
Large scale data often has high error rates
  Protein-protein interactions studied extensively
  50% false positives
  Unknown but probably higher false negative rate
Garbage in, garbage out
Garbage in, dinner out
Data fusion
  Combine data from multiple sources to reduce error
  Central topic in systems biology
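One of the simplest forms of data fusion is voting: keep an interaction only if enough independent sources report it. The sketch below is an assumed, minimal illustration of that idea, not the method of any particular study. The source lists and protein names are hypothetical.

```python
from collections import Counter


def fuse_interactions(sources, min_support=2):
    """Keep unordered protein pairs reported by at least min_support sources."""
    support = Counter()
    for pairs in sources:
        # Normalize (A, B) and (B, A) to one key; count once per source.
        unique_pairs = {tuple(sorted(pair)) for pair in pairs}
        support.update(unique_pairs)
    return {pair for pair, count in support.items() if count >= min_support}


# Hypothetical datasets from three experiments.
y2h_a = [("P1", "P2"), ("P3", "P4")]
y2h_b = [("P2", "P1"), ("P5", "P6")]
mass_spec = [("P1", "P2"), ("P4", "P3"), ("P7", "P8")]

fused = fuse_interactions([y2h_a, y2h_b, mass_spec])
print(sorted(fused))
```

Raising `min_support` trades false positives for false negatives; weighting each source by its estimated reliability is a common refinement.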
Protein-protein interactions
Lack of concordance among four large Y2H projects
Numbers in parentheses from small studies
Deane et al. Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol Cell Proteomics 2002.
Biomodules
Cluster protein-protein interactions (both axes)
Recapitulates known pathways
Rives, Galitski. Modular organization of cellular networks. PNAS 2003.
Gene co-expression networks
Meta-analysis of ~3,000 microarray gene expression experiments across human, fly, worm, yeast
Yielded ~3,000 meta-genes
Recapitulates known conserved processes
Stuart et al. A gene-coexpression network for global discovery of conserved genetic modules. Science 2003.
Date vs. party hubs
Combines protein-protein interactions & microarray
Party hubs: interaction partners have correlated expression
Date hubs: others
Date hubs more important for graph connectivity
Party hubs have more spatial localization
Han et al. Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature. 2004 Jul 1;430(6995):88-9.
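The party/date distinction can be sketched as a calculation: average the expression correlation between a hub and its interaction partners, then call the hub a party hub if the average is high. The code below is a simplified illustration. The 0.5 threshold, the function names, and the toy expression profiles are assumptions, and the published analysis is more involved.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def classify_hub(hub, partners, expression, threshold=0.5):
    """Return ("party" or "date", mean correlation with partners)."""
    mean_r = sum(pearson(expression[hub], expression[p])
                 for p in partners) / len(partners)
    # threshold is an assumed, illustrative cutoff
    return ("party" if mean_r >= threshold else "date"), mean_r


# Hypothetical expression profiles across four conditions.
expression = {
    "hubP": [1, 2, 3, 4], "p1": [2, 3, 4, 5], "p2": [1, 2, 3, 5],
    "hubD": [1, 2, 3, 4], "d1": [4, 3, 2, 1], "d2": [2, 3, 4, 5],
}

print(classify_hub("hubP", ["p1", "p2"], expression))
print(classify_hub("hubD", ["d1", "d2"], expression))
```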
Active subnetworks
Ideker et al. Discovering regulatory and signaling circuits in molecular interaction networks. Bioinformatics 2002.
Combines protein-protein & protein-DNA interactions & microarray
Found subgraphs with correlated expression
Condition-specific models
Luscombe et al. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature. 2004 Sep 16;431(7006):308-12.
Combines regulatory interactions from genetic, biochemical, ChIP-chip
Found active subgraphs under various conditions: endogenous vs. exogenous
Permanent vs. transient hubs
Exogenous subgraphs simpler.
  Smaller hubs
  Fewer transient hubs
Interferon response in liver cells
Yan et al. System-based proteomic analysis of the interferon response in human liver cells. Genome Biol. 2004;5(8):R54.
Combines protein-protein interactions & protein abundance
Found many known IFN-regulated proteins, pathways and some new ones
Note: sparse graph compared to yeast examples
Systems biology: A…
Systems biology
Pathway models of biological systems
Mathematical models of biological systems
High throughput laboratory technology
Path to the future
Biotech has given us an embarrassment of riches
More data than we can eat!
Don’t pig-out on old-style research!
Aim for deep, rigorous understanding of biological systems
Path to the future
Convert all this progress into real riches for science, society, our patients