
Error Statistics of Hidden Markov Model and Hidden Boltzmann Model Results

Date post: 17-Jun-2015
Upload: yaser-sulaiman
Description:
A presentation about the paper titled "Error statistics of hidden Markov model and hidden Boltzmann model results" by Lee A Newberg. The paper is available at http://www.biomedcentral.com/1471-2105/10/212
Transcript
Page 1: Error Statistics of Hidden Markov Model and Hidden Boltzmann Model Results

Error Statistics of Hidden Markov Model and Hidden Boltzmann Model Results

A paper by Lee A Newberg
Presented by Yaser Sulaiman

I’m a computer scientist who recently got interested in bioinformatics

a different “flavor” of probability theory & stochastic processes

HMMs in computer science

temporal pattern recognition

speech recognition

handwriting recognition

bioinformatics

bioinformatics in 5 minutes

Molecular Biology, Computer Science, Statistics

biological sequences

DNA

[image stolen from Iowa State University]

RNA

proteins

sequence comparison

@ the heart of bioinformatics

why?

sequence similarity
structural similarity
functional similarity

not to mention evolution

sequence alignment

find optimal alignment

according to a scoring function

align AACGT and AACT to max. identities

AACGT
||  |
AA-CT

AACGT
||| |
AAC-T

it’s not always that easy!
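The two candidate alignments above can be compared mechanically. The following is a minimal sketch, assuming the scoring function simply counts identical columns and ignores gap and mismatch penalties; the function names are illustrative and do not come from the presentation.

```python
def count_identities(row_a: str, row_b: str) -> int:
    """Count columns where two equal-length alignment rows hold the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    return sum(x == y and x != "-" for x, y in zip(row_a, row_b))


def max_identities(a: str, b: str) -> int:
    """Best identity count over all gapped alignments of a and b."""
    # best[i][j] holds the best count for the prefixes a[:i] and b[:j].
    best = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            best[i][j] = max(
                best[i - 1][j],  # a[i-1] aligned to a gap
                best[i][j - 1],  # b[j-1] aligned to a gap
                best[i - 1][j - 1] + (a[i - 1] == b[j - 1]),  # aligned column
            )
    return best[len(a)][len(b)]


print(count_identities("AACGT", "AA-CT"))  # first alignment: 3 identities
print(count_identities("AACGT", "AAC-T"))  # second alignment: 4 identities
print(max_identities("AACGT", "AACT"))     # optimum under this scoring: 4
```

Under this identity-only scoring the second alignment is optimal. Realistic scoring functions add substitution scores and gap penalties, which is where the problem stops being this easy.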

there’s more to bioinformatics than can fit into this presentation

back to the paper

Error statistics of HMM & hidden Boltzmann model results

how to interpret a score

1. is it strong enough to indicate signal?

2. is it weak enough to indicate noise?

false positive & true positive rates

false positive rate (fpr) for

true positive rate (tpr) for

a faster, more general approach to estimating fpr/tpr

we assume that we’re given:

a hidden Boltzmann model

a simple background model describing noise

a computable foreground model describing signal

Error statistics of HMM & hidden Boltzmann model results

a Markov process with unobserved states

transition probabilities + emission probabilities
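To make "transition probabilities + emission probabilities" concrete, here is a small sketch of an HMM that generates a DNA string. The two states, their names, and every number are hypothetical and chosen only for illustration; an observer would see the emitted string but not the state path.

```python
import random

# Hypothetical two-state HMM over the DNA alphabet; all numbers are made up.
TRANSITIONS = {
    "start": {"AT_rich": 0.5, "GC_rich": 0.5},
    "AT_rich": {"AT_rich": 0.8, "GC_rich": 0.1, "end": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.8, "end": 0.1},
}
EMISSIONS = {
    "AT_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "GC_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def draw(dist, rng):
    """Draw one key of dist with probability equal to its value."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys])[0]


def sample(rng):
    """Return (hidden state path, emitted sequence) for one run of the HMM."""
    states, symbols = [], []
    state = draw(TRANSITIONS["start"], rng)
    while state != "end":
        states.append(state)
        symbols.append(draw(EMISSIONS[state], rng))
        state = draw(TRANSITIONS[state], rng)
    return states, "".join(symbols)


hidden_path, observed = sample(random.Random(0))
print(observed)      # what is observed
print(hidden_path)   # what stays hidden
```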

Error statistics of HMM & hidden Boltzmann model results

generalization of HMM

scores rather than probabilities

states (including start & terminal)

transitions

emitters

emissions

alphabet

each state, transition, & emission has a real-valued score

emission path

sequence

score of emission path

hidden?

an emission path can’t be uniquely determined from its sequence

a sequence can be emitted by any of several emission paths

how to score a given sequence

maximum score

forward score
an HMM interpretation of the hidden Boltzmann model

for any emission path $\pi$, $\exp(s(\pi))$ is treated as if it were an HMM probability

$$\exp\bigl(s_{\mathrm{fw}}(D)\bigr) = \sum_{\pi \in \pi_D} \exp\bigl(s(\pi)\bigr)$$
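The maximum score and the forward score differ only in how the scores of competing paths are combined: a maximum in one case, a log-sum-exp in the other. The sketch below illustrates this on a hypothetical two-state model in which every state emits exactly one symbol and the terminal state contributes no score; the model, its numbers, and the function names are assumptions for illustration, not taken from the paper.

```python
import itertools
import math

# Hypothetical toy model: real-valued scores, not probabilities.
STATES = ["X", "Y"]
START = {"X": 0.0, "Y": -1.0}  # score of entering each state from start
TRANS = {("X", "X"): 0.5, ("X", "Y"): -0.5, ("Y", "X"): -0.5, ("Y", "Y"): 0.5}
EMIT = {"X": {"A": 1.0, "B": -1.0}, "Y": {"A": -1.0, "B": 1.0}}


def path_score(path, seq):
    """s(pi): sum of the transition and emission scores along one path."""
    s = START[path[0]] + EMIT[path[0]][seq[0]]
    for prev, cur, sym in zip(path, path[1:], seq[1:]):
        s += TRANS[(prev, cur)] + EMIT[cur][sym]
    return s


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    top = max(xs)
    return top + math.log(sum(math.exp(x - top) for x in xs))


def dp_score(seq, combine):
    """Dynamic program over positions; combine=max or combine=logsumexp."""
    col = {k: START[k] + EMIT[k][seq[0]] for k in STATES}
    for sym in seq[1:]:
        col = {
            k: EMIT[k][sym] + combine([col[j] + TRANS[(j, k)] for j in STATES])
            for k in STATES
        }
    return combine(list(col.values()))


seq = "ABBA"
all_scores = [path_score(p, seq) for p in itertools.product(STATES, repeat=len(seq))]
print(max(all_scores), dp_score(seq, max))              # maximum score, two ways
print(logsumexp(all_scores), dp_score(seq, logsumexp))  # forward score, two ways
```

The brute-force enumeration is exponential in the sequence length and is included only as a check on the dynamic program.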

free score
definition of free energy from thermodynamics

temperature

$$Z(D, T) = \exp\bigl(s_{\mathrm{free}}(D, T) / T\bigr)$$
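A short sketch of how temperature interpolates between the two earlier scores, assuming $Z(D, T)$ is the partition function $\sum_{\pi \in \pi_D} \exp(s(\pi)/T)$ (the transcript gives only its relation to the free score). Under that assumption the free score equals the forward score at $T = 1$ and approaches the maximum score as $T \to 0$. The toy model and its numbers are hypothetical.

```python
import itertools
import math

# Hypothetical toy model; each state emits exactly one symbol.
STATES = ["X", "Y"]
START = {"X": 0.0, "Y": -1.0}
TRANS = {("X", "X"): 0.5, ("X", "Y"): -0.5, ("Y", "X"): -0.5, ("Y", "Y"): 0.5}
EMIT = {"X": {"A": 1.0, "B": -1.0}, "Y": {"A": -1.0, "B": 1.0}}


def path_scores(seq):
    """Scores s(pi) of every emission path that emits seq (brute force)."""
    scores = []
    for path in itertools.product(STATES, repeat=len(seq)):
        s = START[path[0]] + EMIT[path[0]][seq[0]]
        for prev, cur, sym in zip(path, path[1:], seq[1:]):
            s += TRANS[(prev, cur)] + EMIT[cur][sym]
        scores.append(s)
    return scores


def free_score(seq, temperature):
    """s_free(D, T) = T * log Z(D, T), with Z(D, T) = sum_pi exp(s(pi) / T)."""
    scaled = [s / temperature for s in path_scores(seq)]
    top = max(scaled)  # subtract the maximum for numerical stability
    log_z = top + math.log(sum(math.exp(x - top) for x in scaled))
    return temperature * log_z


for t in (1.0, 0.5, 0.05):
    print(t, free_score("ABBA", t))
```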

background model

simple model: sequence positions are i.i.d.

$$\Pr(D \mid B) = \prod_{i=1}^{L} \Pr(d_i \mid B)$$
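The product above translates directly into code. This is a minimal sketch with hypothetical residue frequencies; working in log space avoids underflow for long sequences.

```python
import math

# Hypothetical background frequencies Pr(d | B) for the DNA alphabet.
BACKGROUND = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}


def background_log_prob(seq, freqs=BACKGROUND):
    """log Pr(D | B) for an i.i.d. background: the sum of per-position logs."""
    return sum(math.log(freqs[sym]) for sym in seq)


def background_prob(seq, freqs=BACKGROUND):
    """Pr(D | B), the product of the per-position background frequencies."""
    return math.exp(background_log_prob(seq, freqs))


print(background_prob("AACT"))  # equals 0.3 * 0.3 * 0.2 * 0.3 up to rounding
```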

mathematical problem statement

$$\mathrm{fpr}(s_0) = \sum_{D \in \mathcal{D}_L} \Pr(D \mid B)\, \Theta\bigl(s(D) \ge s_0\bigr)$$
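Read literally, the sum runs over every sequence of length $L$, so a direct evaluation is only feasible for tiny cases. The sketch below does exactly that for a two-letter alphabet; the background frequencies and the stand-in score (the number of A's) are hypothetical placeholders for a real model score.

```python
import itertools
import math

# Hypothetical i.i.d. background over a two-letter alphabet.
BACKGROUND = {"A": 0.25, "B": 0.75}


def score(seq):
    """Stand-in for a model score s(D): here, simply the number of A's."""
    return seq.count("A")


def exact_fpr(s0, length):
    """Sum Pr(D | B) over all sequences of the given length with s(D) >= s0."""
    total = 0.0
    for symbols in itertools.product(BACKGROUND, repeat=length):
        seq = "".join(symbols)
        if score(seq) >= s0:  # the indicator Theta(s(D) >= s0)
            total += math.prod(BACKGROUND[sym] for sym in seq)
    return total


print(exact_fpr(3, 4))  # 4 * 0.25**3 * 0.75 + 0.25**4
```

The loop visits `len(BACKGROUND) ** length` sequences, which is why the exact sum is replaced by sampling for realistic lengths and alphabets.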

algorithm

$\mathrm{fpr}(s_0)$ can be estimated via naïve sampling
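A minimal sketch of naïve sampling, assuming an i.i.d. background and a stand-in score (the number of A's) for which the exact answer is a binomial tail. All names and numbers are hypothetical.

```python
import math
import random

ALPHABET = "AB"
WEIGHTS = [0.25, 0.75]  # hypothetical background Pr(A), Pr(B)


def score(seq):
    """Stand-in for a model score s(D): the number of A's."""
    return seq.count("A")


def naive_fpr(s0, length, n_samples, rng):
    """Fraction of background samples whose score reaches the threshold."""
    hits = 0
    for _ in range(n_samples):
        seq = "".join(rng.choices(ALPHABET, weights=WEIGHTS, k=length))
        hits += score(seq) >= s0
    return hits / n_samples


def exact_fpr(s0, length, p_a=0.25):
    """Binomial tail, available only because the stand-in score is so simple."""
    return sum(
        math.comb(length, k) * p_a**k * (1 - p_a) ** (length - k)
        for k in range(s0, length + 1)
    )


print(naive_fpr(6, 12, 20000, random.Random(0)), exact_fpr(6, 12))
```

The estimate is reliable only when a fair number of samples reach the threshold. For a false positive rate near 1e-5, most runs of 20000 samples would contain no hit at all, so the required sample count grows roughly as the inverse of the rate being estimated.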

alternatively, $\mathrm{fpr}(s_0)$ can be estimated via importance sampling

where

importance sampling is more efficient

importance sampling distribution

$$\Pr(D \mid T) = \frac{\Pr(D \mid B)\, Z(D, T)}{Z(T)}$$

$$f(D, s_0) = \frac{Z(T)\, \Theta\bigl(s(D) \ge s_0\bigr)}{Z(D, T)}$$

sampling of sequences in a nutshell

draw sample sequences according to $\Pr(D \mid T)$

compute $f(D, s_0)$ for each sample

use the average as an estimate for $\mathrm{fpr}(s_0)$
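The three steps above can be sketched on a deliberately simple case. Assume, purely for illustration, that each sequence has a single emission path whose score is the number of A's, so that $Z(D, T) = \exp(s(D)/T)$ and $\Pr(D \mid T)$ is again i.i.d. with a tilted frequency for A. A real hidden Boltzmann model would need a dynamic program to compute $Z(D, T)$ and to draw samples; none of the names or numbers below come from the paper.

```python
import math
import random

P_A = 0.25  # hypothetical background Pr(A); Pr(B) = 1 - P_A


def score(seq):
    """Stand-in single-path score s(D): the number of A's."""
    return seq.count("A")


def importance_fpr(s0, length, temperature, n_samples, rng):
    """Average f(D, s0) over samples drawn from Pr(D | T)."""
    w = math.exp(1.0 / temperature)   # Boltzmann weight of one A
    z_pos = P_A * w + (1.0 - P_A)     # one position's factor of Z(T)
    z_total = z_pos**length           # Z(T) = sum_D Pr(D|B) Z(D,T)
    p_a_tilted = P_A * w / z_pos      # Pr(A) under Pr(D | T)
    total = 0.0
    for _ in range(n_samples):
        seq = "".join(
            rng.choices("AB", weights=[p_a_tilted, 1.0 - p_a_tilted], k=length)
        )
        s = score(seq)
        if s >= s0:  # Theta(s(D) >= s0)
            total += z_total / math.exp(s / temperature)  # Z(T) / Z(D, T)
    return total / n_samples


def exact_fpr(s0, length, p_a=P_A):
    """Binomial tail, used only to check the estimate."""
    return sum(
        math.comb(length, k) * p_a**k * (1 - p_a) ** (length - k)
        for k in range(s0, length + 1)
    )


print(importance_fpr(10, 12, 0.45, 20000, random.Random(0)), exact_fpr(10, 12))
```

With these toy settings the exact rate is a few times 1e-5, yet a sizeable fraction of the tilted samples reach the threshold, so the average stabilises with far fewer samples than naïve sampling would need.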

estimation of fpr

$$\widehat{\mathrm{fpr}}_3(s_0) = \begin{cases} \widehat{\mathrm{fpr}}_1(s_0), & \text{if } \widehat{\mathrm{fpr}}_1(s_0) \le \widehat{\mathrm{tnr}}_2(s_0) \\ \widehat{\mathrm{fpr}}_2(s_0), & \text{otherwise} \end{cases}$$

which estimator is the best?

based on the results,

choice depends on efficiency of the estimators

estimation of tpr

by extending the technique for estimating fpr

choice of $T$

which $T$ will be efficient for a given $s_0$?

the relation between $T$ and $s_0$ isn’t straightforward

build a calibration curve

“we have empirically observed lower variances for error statistic estimation when the fraction of sampled sequences exceeding the given score threshold is 20-60%.”

results

HMMER 3.0

randomly generated a length , Plan7 profile-HMM

estimated its error statistics using polypeptide sequences of length

time to calculate error statistics for is 4.2-6.3 seconds

runtime for naïve sampling would be much larger

“an error statistic less than would require a runtime longer than the present age of the universe.”

a quick check using Wolfram|Alpha

discussion

future directions

real problem instances

scaling to different problem instances

re-use of simulations

other scoring functions

complex background models

stochastic context-free grammars

to summarize

error statistic estimation for hidden Boltzmann models

applied to HMM

faster than naïve sampling

more general than other approaches

…</presentation><questions>…

