Page 1: Chi-Cheng Lin, Ph.D. Associate Professor Department of Computer Science Winona State University – Rochester Center clin@winona.edu Introduction to Bioinformatics.

Chi-Cheng Lin, Ph.D.

Associate Professor

Department of Computer Science

Winona State University – Rochester Center

clin@winona.edu

Introduction to Bioinformatics

Outline

• What is Bioinformatics

• Human Genome Project

• Application of Bioinformatics

• References

Acknowledgement: The presentation includes adaptations from DOE’s “Human Genome Project and Beyond Primer” and Dr. Yan Asmann’s (Mayo Clinic) lecture notes

Bioinformatics

• Living things have the ability to store, utilize, and pass on information

• Bioinformatics strives to

– determine what information is biologically important

– decipher how it is used to precisely control the chemical environment within living organisms

What is Bioinformatics

• The combination of Biology and Informatics

• Originally referred to the use of computational tools to organize and analyze genetic and protein sequence data (term coined by Dr. Hwa Lim in 1988)

NCBI’s Definition of Bioinformatics

• NCBI (National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/)

– “Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline.”

– “The ultimate goal of the field is to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned.”

Human Genome Project

• Goals include

– Identify genes in human DNA

– Determine the sequence making up human DNA

– Store this information in databases

– Improve tools for data analysis

– Etc.

• Milestone

– April 2003: HGP sequencing is completed and the project is declared finished two years ahead of schedule

Interesting Numbers of Human Genome

• 3 billion:

– Number of chemical nucleotide bases (A, C, T, and G) the human genome contains

• 3 million:

– Locations where single-base DNA differences occur in humans

• 2.4 million:

– Number of bases the largest known human gene consists of (the average gene consists of 3,000 bases)

• 30,000:

– The total number of genes estimated (much lower than previous estimates of 80,000 to 140,000)

Interesting Numbers of Human Genome

• 99.9%

– Fraction of nucleotide bases that are exactly the same in all people

• Over 50%

– Fraction of discovered genes whose functions are unknown

• 2%

– Fraction of the genome that codes for proteins (the rest: “junk” DNA)

• 9%, 11%, 26%, 28%, 45%, 83%, 89%, and 95%

– The percentages of genes that E. coli, rice, roundworm, yeast, fruit fly, zebrafish, mouse, and chimpanzee share with humans, respectively.
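The figures quoted on these slides can be cross-checked with a little arithmetic. The following is a minimal sketch in Python; the constant and function names are illustrative assumptions, and the "gene span" figure is only a rough product of the average gene length and the estimated gene count, not a measured quantity.

```python
# Approximate figures as quoted on the slides.
GENOME_BASES = 3_000_000_000   # nucleotide bases in the human genome
SNP_SITES = 3_000_000          # locations of single-base differences
ESTIMATED_GENES = 30_000       # estimated number of genes
AVERAGE_GENE_BASES = 3_000     # average gene length in bases


def variable_fraction(snp_sites: int, genome_bases: int) -> float:
    """Fraction of genome positions where a single-base difference occurs."""
    return snp_sites / genome_bases


def gene_span_fraction(genes: int, average_gene_bases: int, genome_bases: int) -> float:
    """Rough fraction of the genome spanned by genes of average length."""
    return genes * average_gene_bases / genome_bases


if __name__ == "__main__":
    varying = variable_fraction(SNP_SITES, GENOME_BASES)
    span = gene_span_fraction(ESTIMATED_GENES, AVERAGE_GENE_BASES, GENOME_BASES)
    print(f"Positions that vary between people: {varying:.1%}")
    print(f"Positions identical in all people:  {1 - varying:.1%}")
    print(f"Genome spanned by average-length genes: {span:.1%}")
```

Under these rounded inputs, 3 million variable sites out of 3 billion bases is 0.1%, which agrees with the 99.9% identity figure, and the rough gene span comes out in the same low single-digit range as the quoted protein-coding fraction.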

How does the human genome stack up?

Organism                              Genome Size (Bases)   Estimated Genes
Human (Homo sapiens)                  3 billion             30,000
Laboratory mouse (M. musculus)        2.6 billion           30,000
Mustard weed (A. thaliana)            100 million           25,000
Roundworm (C. elegans)                97 million            19,000
Fruit fly (D. melanogaster)           137 million           13,000
Yeast (S. cerevisiae)                 12.1 million          6,000
Bacterium (E. coli)                   4.6 million           3,200
Human immunodeficiency virus (HIV)    9,700                 9

Humans share most of the same protein families with worms, flies, and plants!
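One way to read the table above is to divide genome size by gene count, which gives a rough "bases per gene" figure for each organism. The sketch below is illustrative only: the values are copied from the table, while the names `GENOMES`, `bases_per_gene`, and `rank_by_density` are assumptions made for this example.

```python
# (organism, genome size in bases, estimated genes), copied from the table.
GENOMES = [
    ("Human (Homo sapiens)", 3_000_000_000, 30_000),
    ("Laboratory mouse (M. musculus)", 2_600_000_000, 30_000),
    ("Mustard weed (A. thaliana)", 100_000_000, 25_000),
    ("Roundworm (C. elegans)", 97_000_000, 19_000),
    ("Fruit fly (D. melanogaster)", 137_000_000, 13_000),
    ("Yeast (S. cerevisiae)", 12_100_000, 6_000),
    ("Bacterium (E. coli)", 4_600_000, 3_200),
    ("Human immunodeficiency virus (HIV)", 9_700, 9),
]


def bases_per_gene(genome_bases: int, genes: int) -> float:
    """Average number of genome bases per estimated gene."""
    return genome_bases / genes


def rank_by_density(genomes):
    """Return (organism, bases per gene) pairs, most gene-dense first."""
    rows = [(name, bases_per_gene(size, genes)) for name, size, genes in genomes]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, density in rank_by_density(GENOMES):
        print(f"{name:<36} {density:>12,.0f} bases per gene")
```

With these numbers the human genome has by far the most bases per gene, which fits the slides' point that only a small fraction of the human genome codes for proteins.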

Anticipated Benefits of Genome Research

• Molecular Medicine

• Microbial Genomics

• Bioarchaeology, Anthropology, Evolution, and Human Migration

• DNA Identification (Forensics)

• Agriculture, Livestock Breeding, and Bioprocessing

ELSI: Ethical, Legal, and Social Issues

• Privacy and confidentiality of genetic information.

• Fairness in the use of genetic information by insurers, employers, courts, schools, adoption agencies, and the military, among others.

• Psychological impact, stigmatization, and discrimination due to an individual’s genetic differences.

• Reproductive issues including adequate and informed consent and use of genetic information in reproductive decision making.

• Clinical issues including the education of doctors and other health-service providers, people identified with genetic conditions, and the general public about capabilities, limitations, and social risks; and implementation of standards and quality‑control measures.

U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003

ELSI Issues (cont.)

• Uncertainties associated with gene tests for susceptibilities and complex conditions (e.g., heart disease, diabetes, and Alzheimer’s disease).

• Fairness in access to advanced genomic technologies.

• Conceptual and philosophical implications regarding human responsibility, free will vs genetic determinism, and concepts of health and disease.

• Health and environmental issues concerning genetically modified (GM) foods and microbes.

• Commercialization of products including property rights (patents, copyrights, and trade secrets) and accessibility of data and materials.

U.S. Department of Energy Genome Programs, Genomics and Its Impact on Science and Society, 2003

[Cartoon: Mike Thompson, Detroit, Michigan, from The Detroit Free Press. Source: http://cagle.msnbc.com/news/gene/gene5.asp]

Future Challenges: What We Still Don’t Know

• Gene number, exact locations, and functions

• Gene regulation

• DNA sequence organization

• Chromosomal structure and organization

• Noncoding DNA types, amount, distribution, information content, and functions

• Coordination of gene expression, protein synthesis, and post-translational events

• Interaction of proteins in complex molecular machines

• Predicted vs experimentally determined gene function

• Evolutionary conservation among organisms

• Protein conservation (structure and function)

• Proteomes (total protein content and function) in organisms

• Correlation of SNPs (single-base DNA variations among individuals) with health and disease

• Disease-susceptibility prediction based on gene sequence variation

• Genes involved in complex traits and multigene diseases

• Complex systems biology including microbial consortia useful for environmental restoration

• Developmental genetics, genomics

Tackle Future Challenges: Bioinformatics

• High volume of data to store, compute, and analyze

• Huge amount of information to retrieve, interpret, and visualize

• Complex system to study, model, and simulate

THAT’S WHY BIOINFORMATICS IS ESSENTIAL!!

The ‘omic’ Revolution

• Bioinformatics has been split into various subjects:

– Genomics – the sequencing and annotation of genomes

– Comparative genomics – the comparison and characterisation of genomes of different species to identify genes and their functions and to investigate evolutionary history

– Functional and structural genomics – the study of gene expression and protein structure and function

– Proteomics – the description of the complete set of proteins a particular genome codes for

– Others – Computational Biology (protein structure, protein folding), pharmacogenomics, microarray and proteomics data management and analysis, mutation and diseases, human migration patterns, medical informatics…

Genome Sequencing

[Cartoon: Drew Sheneman, New Jersey, The Newark Star Ledger. Source: http://cagle.msnbc.com/news/gene/gene14.asp]

Human Migration Patterns using Mitochondrial DNA Sequences

• Higher mutation rates (2.5/site/Myr), more than 10 times higher than the nuclear DNA rate

• Mitochondrial sequence variations determine the human family tree

[Figure: example sequences that differ at single bases (e.g., G vs. A, T vs. G)]

Sequence Collection

Sequence Alignment

Profiles of Mutations

Phylogenetic analysis
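The four steps listed above can be illustrated with a toy example. This is a minimal sketch, not the method used in real migration studies: the sequences are hypothetical and assumed to be already aligned, the "profile of mutations" is simply the set of alignment columns that vary, and the pairwise difference counts stand in for the distances a phylogenetic method would start from. All names (`ALIGNED`, `variable_sites`, `hamming`, `pairwise_distances`) are assumptions made for this example.

```python
from itertools import combinations

# Steps 1-2: hypothetical toy sequences, assumed to be collected and aligned
# already (equal length, no gaps). These are NOT real mitochondrial data.
ALIGNED = {
    "sample_A": "GATTACAGGT",
    "sample_B": "GATTACAGAT",
    "sample_C": "GGTTACAGAT",
    "sample_D": "GGTTGCAGAT",
}


def variable_sites(aligned):
    """Step 3: map each varying column index to the bases seen there."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("sequences must be aligned to the same length")
    profile = {}
    for i in range(length):
        column = {seq[i] for seq in seqs}
        if len(column) > 1:
            profile[i] = sorted(column)
    return profile


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(aligned):
    """Step 4 (input): difference counts for every pair of samples."""
    return {
        (n1, n2): hamming(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


if __name__ == "__main__":
    print("Variable sites:", variable_sites(ALIGNED))
    for pair, dist in pairwise_distances(ALIGNED).items():
        print(pair, dist)
```

In this toy data the samples form a chain in which each neighbor differs by one base, so a tree-building method fed these distances would place sample_A and sample_D farthest apart. Real analyses would use a proper alignment tool and a substitution model rather than raw difference counts.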

Medicine and the New Genetics

• Anticipated Benefits:

– Improved diagnosis of disease

– Earlier detection of genetic predispositions to disease

– Rational drug design

– Gene therapy and control systems for drugs

– Personalized, custom drugs

Gene Testing Pharmacogenomics Gene Therapy

Pharmacogenomics: the study of how an individual's genetic inheritance affects the body's response to drugs

Benefits of pharmacogenomics:

• Genetic testing before prescribing drugs

• Dose selection based on genetic variations

• Drugs tailor-made for each patient

However, the application of pharmacogenomics in medical practice is very limited today, because genetic information from large populations is still limited.

References

• NCBI (National Center for Biotechnology Information) homepage http://www.ncbi.nlm.nih.gov/

• NCBI Science Primer http://www.ncbi.nlm.nih.gov/About/primer/

• Human Genome Project Information http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml (esp. link to the Education module)

• The Human Genome Project and Beyond Primer http://www.ornl.gov/sci/techresources/Human_Genome/publicat/primer2001/primer.ppt

