
Bioinformatics in the Department of Computer Science

Page 1: Bioinformatics in the Department of Computer Science

Bioinformatics in the Department of Computer Science

Lenwood S. Heath
Department of Computer Science
Blacksburg, VA 24061

College of Engineering
Northern Virginia Engineering Showcase
March 5, 2004


Bioinformatics Faculty

Cliff Shaffer

Adrian Sandu

Alexey Onufriev

Lenny Heath

T. M. Murali

Naren Ramakrishnan

Eunice Santos

Layne Watson

Roger Ehrich

Chris North

Joao Setubal, CS and VBI


3/5/2004 Bioinformatics in Computer Science


Relevant Expertise

• Algorithms: Heath, Santos, Setubal, Shaffer, Watson
• Computational structural biology: Onufriev, Sandu
• Computational systems biology: Murali
• Data mining: Ramakrishnan
• Genomics: Heath, Murali, Ramakrishnan
• Human-computer interaction, visualization: North
• Image processing: Ehrich, Watson
• High performance computing: Sandu, Santos, Watson
• Numerical analysis: Onufriev, Watson
• Optimization: Watson
• Problem solving environments: Ramakrishnan, Shaffer


Selected Collaborations

• Virginia Tech: Biochemistry, Biology, Fralin Biotechnology Center, Plant Physiology, Veterinary Medicine, Virginia Bioinformatics Institute (VBI), Wood Science

• North Carolina State University: Forest Biotechnology Center

• Duke: Biology

• University of Illinois: Plant Biology


Selected Funding

• NSF IBN 0219322: ITR: Understanding Stress Resistance Mechanisms in Plants: Multimodal Models Integrating Experimental Data, Databases, and the Literature. L. S. Heath; R. Grene, B. I. Chevone, N. Ramakrishnan, L. T. Watson. $499,973.

• NSF EIA-01903660: A Microarray Experiment Management System. N. Ramakrishnan, L. S. Heath, L. T. Watson, R. Grene, J. W. Weller (VBI). $600,000.

• DARPA N00014-01-1-0852: Dryophile Genes to Engineer Stasis-Recovery of Human Cells. M. Potts, L. S. Heath, R. F. Helm, N. Ramakrishnan, T. O. Sitz, F. Bloom, P. Price (Life Technologies), J. Battista (LSU). $4,532,622.

• NSF MCB-0083315: Biocomplexity - Incubation Activity: A Collaborative Problem Solving Environment for Computational Modeling of Eukaryotic Cell Cycle Controls. J. J. Tyson, L. T. Watson, N. Ramakrishnan, C. A. Shaffer, J. C. Sible. $99,965.

• NIH 1 R01 GM64339-01: Problem Solving Environment for Modeling the Cell Cycle. J. J. Tyson, J. Sible, K. Chen, L. T. Watson, C. A. Shaffer, N. Ramakrishnan, P. Mendes (VBI). $211,038.

• Air Force Research Laboratory F30602-01-2-0572: The Eukaryotic Cell Cycle as a Test Case for Modeling Cellular Regulation in a Collaborative Problem Solving Environment. J. J. Tyson, J. C. Sible, K. C. Chen, L. T. Watson, C. A. Shaffer, N. Ramakrishnan. $1,650,000.


Research Resources

System X
• Third fastest computer on the planet

Laboratory for Advanced Scientific Computing & Applications (LASCA)
• Parallel algorithms & math software
• Anantham Cluster
• Grid computing

Bioinformatics Research LAN
• Linux, Mac OS X, Windows
• Bioinformatics databases and analysis


JigCell: A PSE for Eukaryotic Cell Cycle Controls

Marc Vass, Nick Allen, Jason Zwolak, Dan Moisa, Clifford A. Shaffer, Layne T. Watson, Naren Ramakrishnan, and John J. Tyson

Departments of Computer Science and Biology


Computational Molecular Biology

[Figure: levels of description from DNA to cell physiology]

DNA: …TACCCGATGGCGAAATGC...

mRNA: …AUGGGCUACCGCUUUACG...

Protein: …Met - Gly - Tyr - Arg - Phe - Thr...

Enzyme: ATP, ADP, -P

Reaction Network: species X, Y, Z; enzymes E1, E2, E3, E4

Cell Physiology
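
The DNA, mRNA, and protein sequences on this slide are related by transcription and translation. A minimal Python sketch of that relationship follows. It assumes the DNA string is the template strand, read in the direction shown. The function names are illustrative, and the codon table is deliberately partial: it holds only the six codons that appear in this example.

```python
# Complement map for transcribing a DNA template strand into mRNA.
DNA_TO_MRNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Partial codon table: only the codons that occur in the slide's example.
CODON_TABLE = {
    "AUG": "Met", "GGC": "Gly", "UAC": "Tyr",
    "CGC": "Arg", "UUU": "Phe", "ACG": "Thr",
}


def transcribe(template_dna: str) -> str:
    """Return the mRNA complementary to a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template_dna)


def translate(mrna: str) -> list[str]:
    """Translate mRNA codon by codon, ignoring a trailing partial codon.

    Raises KeyError for any codon outside the partial table above.
    """
    usable = len(mrna) - len(mrna) % 3
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3)]


mrna = transcribe("TACCCGATGGCGAAATGC")
print(mrna)                          # AUGGGCUACCGCUUUACG
print(" - ".join(translate(mrna)))   # Met - Gly - Tyr - Arg - Phe - Thr
```

A real tool would use the full 64-codon table and handle stop codons; the partial table keeps the sketch short.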


Cell Cycle of Budding Yeast

[Figure: wiring diagram of the budding yeast cell cycle. Labeled components: Cln2, Cln3, Clb2, Clb5, Bck2, SBF, MBF, Mcm1, Swi5, Sic1, SCF, Cdc20, Cdh1, APC, PPX, Esp1, Pds1, Net1, Net1P, RENT, Cdc14, Cdc15, Tem1, Bub2, Lte1, Mad2, CDKs. Labeled events: growth, budding, DNA synthesis, unaligned chromosomes, sister chromatid separation.]


JigCell Problem-Solving Environment

[Figure: JigCell components: Experimental Database, Wiring Diagram, Differential Equations, Parameter Values, Analysis, Simulation, Visualization, Automatic Parameter Estimation]
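
The step from differential equations and parameter values to a simulation can be illustrated with a toy model. The Python sketch below is not JigCell code and not a JigCell model. It assumes a single hypothetical species X that is synthesized at a constant rate k_s and degraded at rate k_d * X, and it integrates dX/dt = k_s - k_d * X with a hand-written fourth-order Runge-Kutta step. The parameter values are invented for illustration.

```python
from typing import Callable


def rk4_step(f: Callable[[float, float], float], t: float, x: float, h: float) -> float:
    """Advance x(t) by one classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h * k1 / 2)
    k3 = f(t + h / 2, x + h * k2 / 2)
    k4 = f(t + h, x + h * k3)
    return x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def simulate(k_s: float, k_d: float, x0: float, t_end: float,
             h: float = 0.01) -> list[tuple[float, float]]:
    """Integrate dX/dt = k_s - k_d * X from t = 0 to t_end; return (t, X) samples."""
    def rhs(t: float, x: float) -> float:
        return k_s - k_d * x

    t, x = 0.0, x0
    trajectory = [(t, x)]
    for _ in range(round(t_end / h)):
        x = rk4_step(rhs, t, x, h)
        t += h
        trajectory.append((t, x))
    return trajectory


# Hypothetical parameter values, chosen only for illustration.
trajectory = simulate(k_s=2.0, k_d=0.5, x0=0.0, t_end=40.0)
final_x = trajectory[-1][1]
print(f"X(40) = {final_x:.4f}; analytic steady state k_s / k_d = {2.0 / 0.5:.4f}")
```

For this linear model the exact solution is X(t) = (k_s / k_d)(1 - exp(-k_d t)), so the simulated trajectory can be checked directly. A realistic cell cycle model couples many such equations, which is why a stiff-capable solver library would normally replace the hand-written step.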


Why do these calculations?

• Is the model “yeast-shaped”?

• Bioinformatics role: the model organizes experimental information.

• New science: prediction, insight

JigCell is part of the DARPA BioSPICE suite of software tools for computational cell biology.


Expresso: A Next Generation Software System for Microarray Experiment Management and Data Analysis


• Integration of design, experimentation, and analysis

• Data mining; inductive logic programming (ILP)

• Closing the loop

• Drought stress experiments with pine trees and Arabidopsis

Expresso: A Problem Solving Environment (PSE) for Microarray Experiment Design and Analysis


Scenarios for Effects of Abiotic Stress on Gene Expression in Plants


Data Mining with ILP

• ILP (inductive logic programming) is a data mining technique for inferring relationships or rules.

• ILP groups related data and favors relationships that have short descriptions.

• ILP can also flexibly incorporate a priori biological knowledge (e.g., categories and alternate classifications).

• Hybrid reasoning: Information Integration

“Is there a relationship between genes in a given functional category and genes in a particular expression cluster?”

ILP mines this information in a single step.


Rule Inference in ILP

• Infers rules relating gene expression levels to categories, both within a probe pair and across probe pairs, without explicit direction

• Example Rule:

[Rule 142] [Pos cover = 69 Neg cover = 3]
level(A,moist_vs_severe,not positive) :- level(A,moist_vs_mild,positive).

• Interpretation:

“If the moist versus mild stress comparison was positive for some clone named A, it was negative or unchanged in the moist versus severe comparison for A, with a confidence of 95.8%.”
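
The 95.8% figure matches the rule's coverage counts if confidence is taken to be positive cover divided by total cover, 69 / (69 + 3). That definition is an assumption here, since the slide does not state the formula. The Python sketch below checks the arithmetic and applies the same rule to a few made-up clones; the clone names and their levels are hypothetical.

```python
def rule_confidence(pos_cover: int, neg_cover: int) -> float:
    """Fraction of covered examples that satisfy the rule head (assumed definition)."""
    return pos_cover / (pos_cover + neg_cover)


def cover(levels: dict[str, dict[str, str]]) -> tuple[int, int]:
    """Count (pos, neg) cover of the rule
    level(A, moist_vs_severe, not positive) :- level(A, moist_vs_mild, positive).
    """
    pos = neg = 0
    for by_comparison in levels.values():
        if by_comparison.get("moist_vs_mild") != "positive":
            continue  # rule body does not hold, so the clone is not covered
        severe = by_comparison.get("moist_vs_severe")
        if severe is None:
            continue  # no measurement to test the rule head against
        if severe != "positive":
            pos += 1  # head holds: negative or unchanged
        else:
            neg += 1  # head fails: counterexample
    return pos, neg


print(f"{rule_confidence(69, 3):.1%}")  # 95.8%

# Hypothetical clone names and levels, for illustration only.
toy = {
    "clone_a": {"moist_vs_mild": "positive", "moist_vs_severe": "negative"},
    "clone_b": {"moist_vs_mild": "positive", "moist_vs_severe": "unchanged"},
    "clone_c": {"moist_vs_mild": "positive", "moist_vs_severe": "positive"},
    "clone_d": {"moist_vs_mild": "negative", "moist_vs_severe": "positive"},
}
print(cover(toy))  # (2, 1)
```

This only evaluates a rule that is already given. Searching the space of candidate rules, which is the hard part of ILP, is not shown.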


ILP in the Expresso Pipeline

Expresso is a next generation software system for microarray experiments that provides a database interface to ILP functionality.


Status of Expresso

• Capabilities

– Data capture and storage

– Statistical analysis

– Data mining by ILP

– Microarray experiment design — GeneSieve

– Expresso-assisted experiment composition

– Closing the experimental loop

• Successful microarray experiment analysis

– Pine, Norway spruce, yeast, Deinococcus radiodurans (an extremophile microorganism), human cell lines

• Planned microarray experiment analysis

– Potato, Arabidopsis thaliana, tomato, rice, corn


Networks in Bioinformatics

• Mathematical Model(s) for Biological Networks

• Representation: What biological entities and parameters to represent and at what level of granularity?

• Operations and Computations: What manipulations and transformations are supported?

• Presentation: How can biologists visualize and explore networks?


Reconciling Networks

Munnik and Meijer, FEBS Letters, 2001

Shinozaki and Yamaguchi-Shinozaki, Current Opinion in Plant Biology, 2000


Multimodal Networks

• Nodes and edges have flexible semantics to represent:

- Time

- Uncertainty

- Cellular decision making; process regulation

- Cell topology and compartmentalization

- Rate constants

- Phylogeny

• Hierarchical
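
One way to picture nodes and edges with flexible semantics is an attributed graph in which every node and edge carries an open-ended dictionary of annotations, and a node may expand into a nested subnetwork. The Python sketch below is a hypothetical illustration of that idea, not the data model used in this project. The class names, attribute keys, and example entities are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Node:
    name: str
    attrs: dict[str, Any] = field(default_factory=dict)  # e.g. compartment, phylogeny
    subnetwork: Optional["Network"] = None                # hierarchy: node expands into a network


@dataclass
class Edge:
    source: str
    target: str
    attrs: dict[str, Any] = field(default_factory=dict)  # e.g. time, uncertainty, rate constant


@dataclass
class Network:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_node(self, name: str, subnetwork: Optional["Network"] = None, **attrs: Any) -> Node:
        node = Node(name, dict(attrs), subnetwork)
        self.nodes[name] = node
        return node

    def add_edge(self, source: str, target: str, **attrs: Any) -> Edge:
        for endpoint in (source, target):
            if endpoint not in self.nodes:
                raise KeyError(f"unknown node: {endpoint}")
        edge = Edge(source, target, dict(attrs))
        self.edges.append(edge)
        return edge

    def edges_where(self, key: str, predicate: Callable[[Any], bool]) -> list[Edge]:
        """Edges that carry attribute `key` with a value accepted by `predicate`."""
        return [e for e in self.edges if key in e.attrs and predicate(e.attrs[key])]

    def depth(self) -> int:
        """Number of hierarchy levels, counting this network as level 1."""
        nested = [n.subnetwork.depth() for n in self.nodes.values() if n.subnetwork is not None]
        return 1 + max(nested, default=0)


# Hypothetical example: a signal passes through a cascade that is itself a network.
inner = Network()
inner.add_node("kinase_1")
inner.add_node("kinase_2")
inner.add_edge("kinase_1", "kinase_2", rate_constant=0.3)

net = Network()
net.add_node("drought_signal", compartment="membrane")
net.add_node("kinase_cascade", subnetwork=inner, compartment="cytoplasm")
net.add_node("response_gene", compartment="nucleus")
net.add_edge("drought_signal", "kinase_cascade", time="early", uncertainty=0.2)
net.add_edge("kinase_cascade", "response_gene", time="late", uncertainty=0.7)

uncertain = net.edges_where("uncertainty", lambda u: u > 0.5)
print([(e.source, e.target) for e in uncertain])
print(net.depth())
```

Keeping annotations in dictionaries rather than fixed fields means a new kind of semantics, such as phylogeny, can be attached without changing the classes. The trade-off is that nothing checks attribute names or types.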


Using Multimodal Networks

• Help biologists find new biological knowledge

• Visualize and explore

• Generate hypotheses and experiments

• Predict regulatory phenomena

• Predict responses to stress

• Incorporate into Expresso as part of closing the loop


Conclusions

• Engaged faculty with the right expertise

• Numerous life science collaborations

• Federal research funding

• First-class computational resources

• A variety of cutting-edge bioinformatics research projects


Bioinformatics Education

• Courses in Computer Science
• Courses in the Life Sciences
• Bioinformatics Option
• Doctoral Program in Genetics, Bioinformatics, and Computational Biology


Doctoral Program in Genetics, Bioinformatics, and Computational Biology

Multidisciplinary: biology, biochemistry, crop science, plant physiology, computer science, mathematics, statistics, veterinary medicine


Anantham Cluster

Previous cluster specs

• 200 AMD 1 GHz processors

• 1 GB RAM per processor

• 2 TB disk space

• 2.56 Gb/s Myrinet network

Previous 200 processor cluster

