
BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches to ribonucleic acid

Date post: 11-Jan-2016
Page 1: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

1

BIOL6900 Bioinformatics Chapter 8

Bioinformatics approaches to ribonucleic acid

Page 2: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

2

Outline of upcoming lectures

The first part of the course covered sequence analysis, including BLAST (Chapters 1-7).

We begin the next part of the course: functional genomics (Chapters 8-12).

We will study how DNA is transcribed to RNA (i.e. gene expression), and we will discuss microarrays. Then we will study proteins.

We will conclude with a survey of genomes (Ch. 13-20).

Page 3: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

3

2e Fig. 8.1

Page 4: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

4

Types of RNAs

• tRNA and rRNA: together 95% of total RNA

• mRNA

• Other non-coding RNA:

    small nuclear RNA (snRNA)

    small nucleolar RNA (snoRNA)

    microRNA (~22 nt)

    short interfering RNA (siRNA)

Page 5: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

5

2e Fig. 8.3

Rfam

The Rfam database includes alignments and descriptions of RNA families

http://rfam.sanger.ac.uk/

Page 6: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

6

2e Fig. 8.4

Summary of non-coding RNA families in Rfam database that are assigned to the long arm of human chromosome 21.

Page 7: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

7

2e Fig. 8.5

Figure 8.5 Identification of tRNAs using tRNAscan-SE server

Page 8: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

8

2e Fig. 8.5

Page 9: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

9

2e Fig. 8.6

Vienna RNA package

Page 10: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

10

2e Fig. 8.7

Figure 8.7 Structure of a eukaryotic ribosomal DNA repeat unit

Page 11: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

11

2e Fig. 8.8

Page 12: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

12

2e Fig. 8.8

Page 13: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

13

2e Fig. 8.9

Page 14: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

14

Gene expression is regulated in several basic ways

• by region (e.g. brain versus kidney)

• in development (e.g. fetal versus adult tissue)

• in dynamic response to environmental signals (e.g. immediate-early response genes)

• in disease states

• by gene activity

Page 157

Page 15: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

15

Organism: virus, bacteria, fungi, invertebrates, rodents, human

Gene expression changes measured: in mutant or wild type cells; in development; in cell types; in disease; in virus, bacteria, and/or host; in response to stimuli

Fig. 6.1, Page 158

Page 16: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

16

[Diagram: DNA → RNA → protein → phenotype, with cDNA derived from RNA]

Page 159

Page 17: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

17

[Diagram, shown twice: DNA → RNA → protein, with cDNA derived from RNA]

Labeled resources: UniGene; SAGE (Serial Analysis of Gene Expression); microarray

Fig. 6.2, Page 159

Page 18: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

18

[Diagram: DNA → RNA → protein → phenotype, with cDNA derived from RNA]

[1] Transcription

[2] RNA processing (splicing)

[3] RNA export

[4] RNA surveillance

Page 160

Page 19: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

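19

[Figure: processing of a eukaryotic transcript. A gene with exon 1, intron, exon 2, intron, exon 3 undergoes transcription, RNA splicing (remove introns), polyadenylation (AAAAA at the 3’ end), and export to cytoplasm. Strands are labeled 5’ and 3’.]

Fig. 6.3, Page 161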
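The processing steps in this figure can be sketched in a few lines of code. This is only an illustration: the pre-mRNA sequence and the exon coordinates are made up, and real splicing depends on splice-site recognition, which is not modeled here.

```python
def splice(pre_mrna, exons):
    """Join exon segments (0-based, end-exclusive) and drop the introns between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def polyadenylate(mrna, tail_length=5):
    """Append a poly(A) tail to the 3' end (the tail length here is illustrative)."""
    return mrna + "A" * tail_length


# Hypothetical pre-mRNA laid out as: exon 1, intron, exon 2, intron, exon 3
pre_mrna = "AUGGCC" + "GUAAGUUUCAG" + "UUCGAA" + "GUGAGUCCCAG" + "GGAUAA"
exons = [(0, 6), (17, 23), (34, 40)]  # assumed coordinates for this toy sequence

mature = polyadenylate(splice(pre_mrna, exons))
print(mature)
```

The exon coordinates play the role of the exon/intron boundaries in the figure; the mature transcript contains only the three exons followed by the poly(A) tail.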

Page 20: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

20

2e Fig. 8.10

Page 21: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

21

Relationship of mRNA to genomic DNA for RBP4

~2e Fig. 8.11

Page 22: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

22

2e Fig. 8.11

Page 23: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

23

2e Fig. 8.12

Page 24: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

24

2e Fig. 8.12

exon 2

exon 3

exon 1

Page 25: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

25

2e Fig. 8.12

[Figure: alignment of query 1 against query 2; labeled features are exon 1, exon 2, exon 3, and two introns.]

query 1: genomic contig NT_037887, nucleotides 162875-163708

query 2: cDNA NM_000517

Page 26: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

26

Analysis of gene expression in cDNA libraries

A fundamental approach to studying gene expression is through cDNA libraries.

• Isolate RNA (always from a specific organism, region, and time point)

• Convert RNA to complementary DNA

• Subclone into a vector

• Sequence the cDNA inserts. These are expressed sequence tags (ESTs)

2e ~Fig. 8.13

vector

insert

Page 27: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

27

UniGene: unique genes via ESTs

• Find UniGene at NCBI: from the home page click All databases (on the top bar) then UniGene, or go to: www.ncbi.nlm.nih.gov/UniGene

• UniGene clusters contain many ESTs

• UniGene data come from many cDNA libraries. Thus, when you look up a gene in UniGene you get information on its abundance and its regional distribution.

Page 164

Page 28: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

28

Cluster sizes in UniGene

This is a gene with 1 EST associated; the cluster size is 1

Page 164 & Fig. 2.3, Page 23

Page 29: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

29

Cluster sizes in UniGene

This is a gene with 10 ESTs associated; the cluster size is 10

Page 164

Page 30: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

30

Cluster sizes in UniGene (human)

Cluster size (ESTs)    Number of clusters
1                      42,800
2                      6,500
3-4                    6,500
5-8                    5,400
9-16                   4,100
17-32                  3,300

500-1000               2,128
2000-4000              233
8000-16,000            21
16,000-30,000          8

UniGene build 194, 8/06

Page 31: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

31

Ten largest human UniGene clusters

Cluster size    Gene
22,925          eukary. translation EF (Hs.522463)
22,320          eukary. translation EF (Hs.4395522)
16,562          actin, gamma 1 (Hs.514581)
16,309          GAPDH (Hs.169476)
16,231          actin, beta (Hs.520640)
11,076          ribosomal prot. L3 (Hs.119598)
10,517          dehydrin (Hs.524390)
10,087          enolase 1 (alpha) (Hs.517145)
9,973           ferritin (Hs.433670)
8,966           metastasis associated (Hs.187199)

UniGene build 186, 9/05

Table 6.2, Page 165

Page 32: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

Why ribosomal transcripts are abundant in UniGene

The major types of RNA are (approximate share of total RNA):

ribosomal RNA        rRNA     (~85%)
transfer RNA         tRNA     (~15%)
messenger RNA        mRNA     (~3%)
noncoding RNA        ncRNA    (<1%)
  small nuclear RNA       snRNA
  small nucleolar RNA     snoRNA
  small interfering RNA   siRNA

Page 33: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

33

UniGene clusters are often “similar to” a known gene

UniGene distinguishes three levels of similarity:

1. "Highly similar to" means >90% similar in the aligned region.

2. "Moderately similar to" means 70-90% similar in the aligned region.

3. "Weakly similar to" means <70% similar in the aligned region.

Page 164

Page 34: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

34

UniGene includes 74 species (as of Aug. 2006), all with many ESTs available. Recent entries include:

Species
Canis familiaris (dog)
Helianthus annuus (sunflower)
Salmo salar (Atlantic salmon)
Bombyx mori (domestic silkworm)
Apis mellifera (honey bee)
Lotus corniculatus (birdsfoot trefoil)
Physcomitrella patens (physco. moss)
Lactuca sativa (garden lettuce)
Malus x domestica (apple)
Hydra magnipapillata
Populus tremula x Populus tremuloides (aspen)
Ovis aries (sheep)

Currently: ~130 species

Page 35: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

35

Page 36: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

36

The significance of UniGene’s continued growth

Identifying protein-coding genes in genomic DNA remains a tremendous challenge. Genes can be predicted “ab initio” (by analyzing genomic DNA for features such as start and stop sites, exon/intron structures, and regulatory regions). When EST data are coupled with gene prediction, accuracy improves substantially.

Thus all ongoing genome sequencing projects include a major component of large-scale EST sequencing. Typically, this is done at different developmental stages (e.g. embryo versus adult), regions (e.g. brain versus gut), and physiological states (e.g. mosquitoes having fed on blood versus sucrose). EST data are deposited in dbEST and clustered in UniGene.

Page 37: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

37

Digital Differential Display (DDD) in UniGene

• UniGene clusters contain many ESTs

• UniGene data come from many cDNA libraries

• Libraries can be compared electronically

Page 165

Page 38: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

38

Fig. 6.6, Page 166

Page 39: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

39

Fig. 6.6, Page 166

Page 40: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

40

Fig. 6.6, Page 166

Page 41: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

41

UniGene brain libraries

Page 42: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

42

UniGene lung libraries

Page 43: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

43

Fig. 6.7, Page 167

Page 44: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

44

Fig. 6.7, Page 167

Page 45: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

45

n-sec1 up-regulated in brain

CamKII up-regulated in brain

surfactant up-regulated in lung

Page 167

Page 46: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

46

fraction of sequences within the pool that mapped to the cluster shown
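The quantity described here is a simple ratio. As a minimal sketch (the function name and the EST counts below are hypothetical, not values from UniGene), it could be computed for two pools like this:

```python
def pool_fraction(ests_in_cluster, total_ests_in_pool):
    """Fraction of sequences in a pool that map to one UniGene cluster."""
    if total_ests_in_pool <= 0:
        raise ValueError("pool must contain at least one sequence")
    return ests_in_cluster / total_ests_in_pool


# Hypothetical counts for one cluster in two pools of libraries
brain_fraction = pool_fraction(12, 10_000)
lung_fraction = pool_fraction(2, 8_000)

print(f"brain: {brain_fraction:.4f}  lung: {lung_fraction:.5f}")
```

Comparing such fractions between pools gives a first impression of differential representation; whether the difference is significant is a separate statistical question.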

Page 47: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

47

DDD at UniGene

Question: are there individual RNA transcripts that are differentially present in a comparison of EST libraries?

Approach to estimating statistical significance: Fisher’s exact test.

Pages 165

Page 48: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

48

2e Fig. 8.14

Page 49: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

49

2e Fig. 8.14

Page 50: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

50

DDD at UniGene

Fisher’s exact test is a nonparametric method.

• It does not assume a normal distribution of the observations

• It is easy to calculate

• It often has less statistical power than parametric tests (such as a t-test)

• For nonparametric methods, observations are typically arranged in an array with ranks assigned from 1 to n.

Page 51: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

51

DDD at UniGene

Fisher’s exact test is related to a chi-square (χ²) test, but is appropriate for small sample sizes.

A χ² test is applied to row × column (r × c) contingency tables.

Determine whether the observed (O) frequencies of occurrence of a categorical value differ significantly from the expected (E) frequencies of occurrence. Is O – E larger than expected by chance?

$$\chi^2 = \sum_{i=1}^{rc} \frac{(O_i - E_i)^2}{E_i}$$
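The χ² statistic is straightforward to compute once expected counts are available. The sketch below assumes the usual choice of expected counts under independence (row total × column total / grand total); the 2 × 2 table is made up for illustration.

```python
def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (O - E)^2 / E over all r x c cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 contingency table of observed counts
observed = [[10, 20],
            [20, 10]]
print(chi_square(observed))
```

With small expected counts this approximation becomes unreliable, which is the situation in which Fisher’s exact test is preferred.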

Page 52: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

52

Fisher’s exact test provides a p value

Digital differential display (DDD) results in UniGene are assessed for significance using Fisher’s exact test to generate a p value.

$$p = \frac{N_A!\; N_B!\; c!\; C!}{(N_A + N_B)!\; g_{1A}!\; g_{1B}!\; (N_A - g_{1A})!\; (N_B - g_{1B})!}$$

The null hypothesis (that gene 1 is not differentially regulated in a comparison of two libraries) is rejected when p < 0.05/G (where G = the number of UniGene clusters analyzed).

Page 165

Page 53: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

53

Fisher’s Exact Test: deriving a p value

          Gene 1             All other genes               total
Pool A    g1A                NA - g1A                      NA
Pool B    g1B                NB - g1B                      NB
total     c = g1A + g1B      C = (NA-g1A) + (NB-g1B)       NA + NB

Table 6-3, Page 167
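The factorial formula and the 2 × 2 table can be turned into code directly. The sketch below uses exact integer arithmetic; the function names are hypothetical. Note one assumption: the factorial expression gives the probability of a single table, and the two-sided p value here is obtained by summing the probabilities of all tables with the same margins that are no more likely than the observed one, which is a common convention but is not spelled out on the slide.

```python
from fractions import Fraction
from math import factorial


def table_probability(g1a, g1b, n_a, n_b):
    """Probability of one 2x2 table with fixed margins (the factorial formula)."""
    c = g1a + g1b                      # ESTs for gene 1 in both pools
    big_c = (n_a - g1a) + (n_b - g1b)  # ESTs for all other genes
    numerator = factorial(n_a) * factorial(n_b) * factorial(c) * factorial(big_c)
    denominator = (factorial(n_a + n_b) * factorial(g1a) * factorial(g1b)
                   * factorial(n_a - g1a) * factorial(n_b - g1b))
    return Fraction(numerator, denominator)


def fisher_two_sided(g1a, g1b, n_a, n_b):
    """Sum the probabilities of all tables at most as likely as the observed one."""
    observed = table_probability(g1a, g1b, n_a, n_b)
    c = g1a + g1b
    lowest, highest = max(0, c - n_b), min(c, n_a)
    total = Fraction(0)
    for k in range(lowest, highest + 1):
        p = table_probability(k, c - k, n_a, n_b)
        if p <= observed:
            total += p
    return total


# Hypothetical DDD comparison: 12 of 1000 ESTs in pool A, 2 of 1200 in pool B
p_value = float(fisher_two_sided(12, 2, 1000, 1200))
clusters_analyzed = 5000               # assumed value of G
print(p_value, p_value < 0.05 / clusters_analyzed)
```

Using `Fraction` avoids floating-point trouble when comparing table probabilities; the final comparison applies the 0.05/G threshold from the preceding slide.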

Page 54: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

54

Pitfalls in interpreting cDNA library data

• bias in library construction

• variable depth of sequencing

• library normalization

• error rate in sequencing

• contamination (chimeric sequences)

Pages 166-168

Page 55: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

55

Fig. 6.8, p. 168-169

http://mgc.nci.nih.gov

Updated 8/06

Page 56: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

56

Serial analysis of gene expression (SAGE)

• 9 to 11 base “tags” correspond to genes

• measure of gene expression in different biological samples

• SAGE tags can be compared electronically

Page 169

Page 57: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

57

SAGE tags are mapped to UniGene clusters

[Figure: tags (Tag 1, Tag 2, ... Tag n) linked to clusters (Cluster 1, Cluster 2, Cluster 3)]

Page 169
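Electronic comparison of SAGE data amounts to counting tags and looking up the cluster each tag belongs to. The following sketch assumes a hypothetical tag-to-cluster lookup table and made-up 10-base tags; real mappings come from curated SAGE resources.

```python
from collections import Counter

# Hypothetical lookup from 10-base tags to UniGene clusters
TAG_TO_CLUSTER = {
    "AAACCCGGGT": "Cluster 1",
    "TTTGGGCCCA": "Cluster 1",   # more than one tag may point to one cluster
    "ACGTACGTAC": "Cluster 2",
}


def count_by_cluster(tags, tag_to_cluster):
    """Tally observed tags per cluster, skipping tags with no known mapping."""
    counts = Counter()
    for tag in tags:
        cluster = tag_to_cluster.get(tag)
        if cluster is not None:
            counts[cluster] += 1
    return counts


# Made-up tags observed in one sample; the last one is unmapped
sample_tags = ["AAACCCGGGT", "AAACCCGGGT", "TTTGGGCCCA", "ACGTACGTAC", "GGGGGGGGGG"]
print(count_by_cluster(sample_tags, TAG_TO_CLUSTER))
```

Running the same tally on a second sample yields per-cluster counts that can be compared between biological samples.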

Page 58: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

58

Fig. 6.10, Page 171

Page 59: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

59

Fig. 6.11, Page 171

Page 60: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

60

Page 61: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

61

Fig. 6.12, Page 171

Page 62: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

62

Fig. 6.13, Page 173

Page 63: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

63

Fig. 6.14, Page 174

Page 64: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

64

Fig. 6.15, Page 175

Page 65: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

65

Fig. 6.15, Page 175

Page 66: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

66

Microarrays: tools for gene expression

A microarray is a solid support (such as a membrane or glass microscope slide) on which DNA of known sequence is deposited in a grid-like array.

The most common form of microarray is used to measure gene expression. RNA is isolated from matched samples of interest. The RNA is typically converted to cDNA, labeled with fluorescence (or radioactivity), then hybridized to microarrays in order to measure the expression levels of thousands of genes.

Page 173

Page 67: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

67

Questions addressed using microarrays

• Wildtype versus mutant

• Cultured cells +/- drug

• Physiological states (hibernation, cell polarity formation)

• Normal versus diseased tissue (cancer, autism)

Page 173

Page 68: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

68

Organisms represented on microarrays

• metazoans: human, mouse, rat, worm, insect

• fungi: yeast

• plants: Arabidopsis

• many others: e.g. bacteria, viruses

Page 69: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

69

Advantages of microarray experiments

Fast             Data on >20,000 genes in several weeks

Comprehensive    Entire yeast or mouse genome on a chip

Flexible         • As more genomes are sequenced, more arrays can be made.
                 • Custom arrays can be made to represent genes of interest

Easy             Submit RNA samples to a core facility

Cheap?           Chip representing 20,000 genes for $350; robotic spotter/scanner cost $100,000

Table 6-4, Page 175

Page 70: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

70

Disadvantages of microarray experiments

Cost                Some researchers can’t afford to do appropriate controls, replicates

RNA significance    The final product of gene expression is protein (see pages 174-176 for references)

Quality control     Impossible to assess elements on array surface
                    Artifacts with image analysis
                    Artifacts with data analysis

Table 6-5, Page 176

Page 71: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

71

[Figure: overview of a microarray experiment. Stages shown: experimental design; sample acquisition (purify RNA, label); data acquisition (hybridize, wash, image); data analysis; data confirmation; biological insight. Data storage is shown alongside.]

Fig. 6.16, Page 176

Page 72: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

72

Stage 1: Experimental design

[1] Biological samples: technical and biological replicates

[2] RNA extraction, conversion, labeling, hybridization

[3] Arrangement of array elements on a surface

Page 177

Page 73: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

73

Sample 1 Sample 2 Sample 3

Fig. 6.17, Page 177

Page 74: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

74

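[Figure: designs for comparing samples on arrays]

Pairwise comparisons: Samples 1,2; Samples 1,3; Samples 2,3

Comparisons to a pool: Sample 1, pool; Sample 2, pool

Dye swap: Samples 2,1 (switch dyes)

2e Fig. 8.18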

Page 75: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

75

Stage 2: RNA and probe preparation

Page 178

For Affymetrix chips, need total RNA (about 10 µg)

Confirm purity and integrity by running an agarose gel

Measure A260/A280 to confirm purity and A260 to confirm quantity
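The spectrophotometer readings translate into a concentration and a purity check with simple arithmetic. The sketch below relies on two widely used rules of thumb that are not stated on the slide: an A260 of 1.0 corresponds to roughly 40 µg/mL of RNA, and an A260/A280 ratio near 2.0 indicates reasonably pure RNA. The readings and dilution factor are made up.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # rule-of-thumb conversion for RNA (assumption)


def rna_concentration(a260, dilution_factor=1.0):
    """Estimated RNA concentration in µg/mL of the undiluted sample."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 2.0 are usually taken as pure RNA."""
    return a260 / a280


# Hypothetical readings from a 1:50 dilution
concentration = rna_concentration(0.5, dilution_factor=50)  # µg/mL
ratio = purity_ratio(0.5, 0.25)
volume_for_10_ug = 10 / concentration * 1000                # µL needed for 10 µg

print(concentration, ratio, volume_for_10_ug)
```

The last line shows how the estimate feeds back into the protocol: it gives the volume that contains the roughly 10 µg of total RNA mentioned above.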

Page 76: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

76

Stage 3: hybridization to DNA arrays

Page 178-179

The array consists of cDNA or oligonucleotides

Oligonucleotides can be synthesized directly on the array surface by photolithography

The sample is converted to cRNA or cDNA

Page 77: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

77

Microarrays: array surface

Southern et al. (1999) Nature Genetics, microarray supplement

2e Fig. 8.19

Page 78: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

78

Stage 4: Image analysis

Page 180

RNA expression levels are quantitated

Fluorescence intensity is measured with a scanner, or radioactivity with a phosphorimager

Page 79: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

79

Differential Gene Expression on a cDNA Microarray

αB-crystallin is over-expressed in Rett syndrome

[Figure panels: Rett; Control]

2e Fig. 8.20

Page 80: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

80

2e Fig. 8.21

Page 81: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

81

Fig. 8.21

Page 82: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

82

Fig. 8.21

Page 83: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

83

Fig. 8.21

Page 84: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

84

Stage 5: Microarray data analysis

• How can arrays be compared?

• Which genes are regulated?

• Are differences authentic?

• What are the criteria for statistical significance?

• Are there meaningful patterns in the data (such as groups)?

Page 85: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

85

Stage 5: Microarray data analysis

• preprocessing

• inferential statistics

• exploratory statistics

Page 86: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

86

Stage 5: Microarray data analysis

• preprocessing: global normalization, local normalization, scatter plots

• inferential statistics: t-tests

• exploratory statistics: clustering
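As an illustration of the preprocessing step, the sketch below performs one simple form of global normalization: every array is rescaled so that its median intensity matches a common target. This is only one of several possible schemes, chosen here for brevity, and the intensity values are made up.

```python
from statistics import median


def global_normalize(arrays, target=None):
    """Rescale each array so that its median intensity equals a shared target."""
    medians = [median(values) for values in arrays]
    if target is None:
        target = sum(medians) / len(medians)  # default: mean of the array medians
    return [
        [value * target / array_median for value in values]
        for values, array_median in zip(arrays, medians)
    ]


# Hypothetical raw intensities for three genes on two arrays;
# the second array is uniformly twice as bright as the first
raw = [[100, 200, 300],
       [200, 400, 600]]
normalized = global_normalize(raw)
print(normalized)
```

After rescaling, the overall brightness difference between the two arrays is removed, so remaining differences are more likely to reflect the samples than the hybridization or scanning.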

Page 87: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

87

Matrix of genes versus samples

T-test:

• For each gene, calculate the mean expression value in control (C) and experimental (E) samples

• Null hypothesis: the mean C and E values are the same

• Use a t-test to see whether the null hypothesis can be rejected with a particular cutoff value (e.g. p < 0.05)

• Correct the p value for multiple comparisons (e.g. if you measure expression values in 10,000 genes, then 5% (500 genes) might vary by chance alone).
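The per-gene procedure can be sketched with the standard library alone. The code below assumes an equal-variance two-sample t-test and a Bonferroni correction (cutoff divided by the number of genes); the p value is obtained by numerically integrating the Student t density, which is adequate for small illustrative data sets. Gene names and expression values are made up.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def t_statistic(control, experimental):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n_c, n_e = len(control), len(experimental)
    df = n_c + n_e - 2
    pooled = ((n_c - 1) * variance(control) + (n_e - 1) * variance(experimental)) / df
    standard_error = sqrt(pooled * (1 / n_c + 1 / n_e))
    return (mean(experimental) - mean(control)) / standard_error, df


def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson integration of the t density from 0 to |t|."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def density(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps                      # steps must be even for Simpson's rule
    area = density(0) + density(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * density(i * h)
    area *= h / 3
    return max(0.0, 1 - 2 * area)


# Hypothetical expression values: (control samples, experimental samples)
genes = {
    "gene_a": ([10.1, 9.8, 10.3], [14.9, 15.2, 15.4]),
    "gene_b": ([10.0, 10.4, 9.7], [10.2, 10.1, 10.5]),
}
cutoff = 0.05 / len(genes)                 # Bonferroni-corrected cutoff
for name, (control, experimental) in genes.items():
    t, df = t_statistic(control, experimental)
    p = two_sided_p(t, df)
    print(name, round(t, 2), p, p < cutoff)
```

With 10,000 genes the same correction would lower the cutoff from 0.05 to 0.05/10,000, which is why the uncorrected test would flag about 500 genes by chance alone.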

Page 88: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

88

small p value; ratio large

small p value; ratio unimpressive

Perform a t-test in Excel to compare the means of two groups, and to compare fold change to probability values

Page 89: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

89

t-test to determine statistical significance

[Figure: total variability divided into disease vs normal and error]

$$t\ \text{statistic} = \frac{\text{difference between mean of disease and normal}}{\text{variation due to error}}$$

Page 90: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

90

ANOVA partitions total data variability

Before partitioning: disease vs normal; Error

After partitioning: disease vs normal; Subject; Tissue type; Error

$$F\ \text{ratio} = \frac{\text{variation between DS and normal}}{\text{variation due to error}}$$
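The F ratio can be illustrated with the simplest case, a one-way ANOVA with a single factor (disease vs normal). This is a deliberate simplification: the slide partitions variability among several factors (subject, tissue type), which the sketch does not attempt. The expression values are made up.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA: between-group mean square over within-group mean square."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    ss_within = sum((value - mean(group)) ** 2 for group in groups for value in group)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical expression values for one gene
normal = [1, 2, 3]
disease = [2, 3, 4]
print(f_ratio([normal, disease]))
```

With exactly two groups the F ratio equals the square of the equal-variance t statistic, which links this slide to the preceding one.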

Page 91: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

91

Matrix of genes versus samples

Metric (define distance)

Supervised and unsupervised analyses:

• clustering trees (hierarchical, k-means)

• self-organizing maps

• principal components analysis
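Defining a distance between expression profiles is the step that precedes any clustering. The sketch below shows two commonly used choices, Euclidean distance and a correlation-based distance (1 minus the Pearson correlation); which metric a given analysis uses is an assumption here, and the profiles are made up.

```python
from math import sqrt
from statistics import mean


def euclidean(a, b):
    """Straight-line distance between two expression profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_distance(a, b):
    """1 - Pearson correlation: 0 for identical shapes, 2 for opposite shapes."""
    mean_a, mean_b = mean(a), mean(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1 - covariance / (spread_a * spread_b)


# Hypothetical profiles of three genes across three samples
gene_1 = [1, 2, 3]
gene_2 = [2, 4, 6]   # same shape as gene_1, larger magnitude
gene_3 = [3, 2, 1]   # opposite trend

print(euclidean(gene_1, gene_2), pearson_distance(gene_1, gene_2))
print(euclidean(gene_1, gene_3), pearson_distance(gene_1, gene_3))
```

The two metrics can disagree: gene_1 and gene_2 are far apart in Euclidean terms but have the same shape, so a correlation-based metric would place them in the same cluster.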

Page 92: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

92

Stage 5: MIAME

In an effort to standardize microarray data presentation and analysis, Alvis Brazma and colleagues at 17 institutions introduced Minimum Information About a Microarray Experiment (MIAME). The MIAME framework standardizes six areas of information:

• experimental design

• microarray design

• sample preparation

• hybridization procedures

• image analysis

• controls for normalization

Visit http://www.mged.org

Page 93: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

93

Stage 6: Biological confirmation

Microarray experiments can be thought of as “hypothesis-generating” experiments.

The differential up- or down-regulation of specific genes can be measured using independent assays such as

-- Northern blots

-- reverse transcription polymerase chain reaction (RT-PCR)

-- in situ hybridization

Page 94: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

94

Stage 7: Microarray databases

There are two main repositories:

Gene Expression Omnibus (GEO) at NCBI

ArrayExpress at the European Bioinformatics Institute (EBI)

See the URLs on page 184

Page 95: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

95

Gene Expression Omnibus (GEO)

NCBI repository for gene expression data

Page 96: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

96

Page 97: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

97

Page 98: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

98

Page 99: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

99

http://www.dnachip.org

Page 100: BIOL6900 Bioinformatics Chapter 8 Bioinformatics approaches  to ribonucleic acid

100

Microarrays: web resources

• Stanford Microarray Database http://www.dnachip.org

• links at http://pevsnerlab.kennedykrieger.org/

