
NP - Positive set Negative Set Full length ORFs Genome Annotated Candidate NPs Top ranked NPs Input...

Date post: 01-Apr-2015
Upload: alfonso-wynter
Transcript
Page 1: NP - Positive set Negative Set Full length ORFs Genome Annotated Candidate NPs Top ranked NPs Input Training NP catalogue Negative Set Negative set NP.

[Figure: workflow diagram. Labels: Input; Training; NP catalogue; NP - Positive set; Negative set; Genome; Translated proteome; Full length ORFs; Annotated; NP processing tools; ML quality: Cross validation; NeuroPID prediction; Candidate NPs; Top ranked NPs.]

Page 4:

[Figure, panels A and B. One panel lists the amino acids Q, Y, C, N, H, L, D, R, W, M, T, S, G, A, V, P, F, I, E, K with axis ticks from 0 to 100. The other lists GRAVY, Instability, Molecular Weight, pI and Aromaticity with axis ticks from 0 to 30. Axis label in both panels: 1-log(p-value, t-test).]
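
The figure above names sequence-level features (amino acid composition, GRAVY, aromaticity) but the slides do not show how they were computed. The following is a minimal sketch under stated assumptions: GRAVY is taken as the mean Kyte-Doolittle hydropathy per residue, aromaticity as the fraction of F, W and Y residues, and the function names are hypothetical, not taken from NeuroPID.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (assumed here; the slides do not name a scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def composition(seq):
    """Return the fraction of each residue present in seq."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def gravy(seq):
    """Mean hydropathy per residue (grand average of hydropathy)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aromaticity(seq):
    """Fraction of residues that are F, W or Y."""
    return sum(1 for aa in seq if aa in AROMATIC) / len(seq)


if __name__ == "__main__":
    peptide = "YSFGLG"  # fragment taken from the precursor shown in the slides
    print(composition(peptide))
    print(gravy(peptide), aromaticity(peptide))
```

Features like these could then be compared between a positive and a negative set, for example with a t-test per feature as the axis label suggests.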

Page 6:

MRSRTSVLTSSLAFLYFFGIVGRSALAMEETPASSMNLQHYNN
MLNPMVFDDTMPEKRAYTYVSEYKRLPVYNFGIGKRWIDTNDN
KRGRDYSFGLGKRRQYSFGLGKRNDNADYPLRLNLDYLPVDNP
AFHSQENTDDFLEEKRGRQPYSFGLGKRAVHYSGGQPLGSKRP
NDMLSQRYHFGLGKRMSEDEEESSQR
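
The precursor above contains repeated basic-residue pairs such as KR. The slides mention "NP processing tools" without stating a cleavage rule, so the following is only an illustrative sketch: it assumes cleavage at any run of two or more K/R residues, a common simplification of prohormone processing. The function name `split_at_basic_runs` is hypothetical.

```python
import re

# Precursor sequence as shown on the slide, joined into one string.
PRECURSOR = (
    "MRSRTSVLTSSLAFLYFFGIVGRSALAMEETPASSMNLQHYNN"
    "MLNPMVFDDTMPEKRAYTYVSEYKRLPVYNFGIGKRWIDTNDN"
    "KRGRDYSFGLGKRRQYSFGLGKRNDNADYPLRLNLDYLPVDNP"
    "AFHSQENTDDFLEEKRGRQPYSFGLGKRAVHYSGGQPLGSKRP"
    "NDMLSQRYHFGLGKRMSEDEEESSQR"
)

# Assumption: a cleavage site is any run of two or more basic residues (K or R).
BASIC_RUN = re.compile(r"[KR]{2,}")


def split_at_basic_runs(seq):
    """Split seq at runs of basic residues and drop empty fragments."""
    return [frag for frag in BASIC_RUN.split(seq) if frag]


if __name__ == "__main__":
    for fragment in split_at_basic_runs(PRECURSOR):
        print(fragment)
```

Real processing also depends on signal peptide removal and sequence context, which this sketch ignores.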


Page 8:

[Figure: cross validation performance per classifier, axis from 0 to 1. Classifier labels: Random Forest Classifier, RBF, Linear SVC, Gradient Boosting, SVC Sigmoid, Polynomial SVM. Legend: 'accuracy', 'precision', 'recall', Area under ROC curve.]
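
The figure reports accuracy, precision and recall averaged over cross-validation folds. As a reminder of what these numbers mean, here is a minimal sketch with hypothetical helper names (`binary_metrics`, `mean_over_folds`). It assumes labels are 1 for NP and 0 for non-NP, and it does not reproduce the classifiers or the AUC computation used in the slides.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against folds with no predicted or no true positives.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def mean_over_folds(fold_metrics):
    """Average each metric over a list of per-fold metric dicts."""
    keys = fold_metrics[0].keys()
    return {k: sum(m[k] for m in fold_metrics) / len(fold_metrics) for k in keys}


if __name__ == "__main__":
    # Two toy folds with made-up labels, for illustration only.
    fold1 = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
    fold2 = binary_metrics([1, 0, 1, 0], [1, 0, 1, 1])
    print(mean_over_folds([fold1, fold2]))
```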

Page 12:

[Figure: axis label "Cross validation performance".]

Page 16:

[Figure, panels A, B and C. Species labels with their numbers:]
S. frugiperda (Fall armyworm) 5
H. armigera (Cotton bollworm) 6
S. gregaria (Desert locust) 4
A. florea (Little honeybee) 0
M. rotundata (Alfalfa leafcutter bee) 1
C. floridanus (Florida carpenter ant) 2
A. echinatior (Leafcutter ant) 3


Page 21:

                SW Arthropods                                   UniProt Arthropods
                Random Forest  Gradient Boosting  Linear SVC    Random Forest  Gradient Boosting  Linear SVC
Mean Accuracy   0.94           0.95               0.94          0.92           0.92               0.86
Mean Precision  0.94           0.95               0.93          0.93           0.94               0.95
Mean Recall     0.92           0.92               0.92          0.95           0.95               0.85
Mean AUC        0.94           0.95               0.94          0.89           0.90               0.87

                SW Chordata                                     UniProt Chordata
                Random Forest  Gradient Boosting  Linear SVC    Random Forest  Gradient Boosting  Linear SVC
Mean Accuracy   0.96           0.97               0.95          0.90           0.91               0.85
Mean Precision  0.94           0.94               0.88          0.91           0.92               0.89
Mean Recall     0.91           0.92               0.93          0.91           0.91               0.83
Mean AUC        0.95           0.95               0.94          0.90           0.91               0.85

Page 22:

Organism         # sequences  # of full length  # of SP  # of NP & SP  # NeuroPID   Functional annotation enrichment
                 UniProtKB    UniProtKB                                All methods
B. mori          17908        17069             138      6             69           Innate immunity; Insulin-like; Chorion, Hormone (NP)
S. invicta       14356        84                12       2             4            Innate immunity
D. melanogaster  39961        31091             475      21            120          Innate immunity; Developmental; Channel ligand; Receptor, Hormone (NP)
C. elegans       26005        25534             464      21            89           Hormone (NP), Channel ligand; Receptor, Protease


Page 26:

Columns: organism / taxa; # of UniProt (UniRef90); # of NPP in SW (UniRef90); # of NPP in UniProt (UniRef90); Prediction NeuroPID RBF (a)

Rows, values in extracted order:
Apis mellifera: 10394, 6, 19, 7
SP in Apis mellifera (b): 2139, 5, 7
Gallus gallus: 20760, 5, 5, 5
SP in Gallus gallus: 701, 1, 1
Bombyx mori: 15250, 5, 17, 9
SP in Bombyx mori: 112, 5, 5, 9
Octopoda: 224, 4, 4, 4
SP in Octopoda: 76, 3, 3

Page 27:

Updates 5 7 2013

Page 29:

[Figure: method labels and counts, in extracted order: RF 7; ExtraTree 8; SVM-SVC 16; GBR 79; 4; 11; 2; 2; 1.]

Page 30:

[Figure: two panels, captioned Apis mellifera and Thaumeledone gunteri. Method labels in both panels: RF, Ext-Tree, SVM-SVC, GBR. First panel values, in extracted order: 79, 16, 87, 86%, 42%, 100%, 100%, 60%, 75%. Second panel values, in extracted order: 18, 10, 95, 100%, 60%, 100%, 100%, 33%, 80%.]

Page 40:

[Figure: % of annotated NPs in taxonomy, axis from 0 to 1.5. Categories: Mammalia, Insecta, Caenorhabditis, others. Values shown: 113, 77, 25, 10.]


Page 42:

 66- 97 KRL....YDFG.........LG..............KRA..YsyvSEYKRL.............................pvYN..FGLGKR
 98-120 SKM....YGFG.........LG..............KR.......DG..RM...............................YS..FGLGKR
121-164 DYD....Y.YGeededdqqaIGdedieesdvgdlmdKR..........DRL...............................YS..FGLGKR
165-191 ARP....YSFG.........LG..............KRA..P...SGAQRL...............................YG..FGLGKR
192-216 GGS...lYSFG.........LG..............KR........GDGRL...............................YA..FGLGKRPVN
222-253 GRSsgsrFNFG.........LG..............KRS..D...DIDFRE...............................LEekFAEDKR
254-316 .YPqehrFSFG.........LG..............KREveP...SELEAVrneekdnssvhdkknntndmhsgerikrslhYP..FGIRKL
347-367 RRP....FNFG.........LG..............KRI..P........M...............................YD..FGIGKR

 66- 97 KRL....YDFG.........LG..............KRA..YsyvSEYKRL........pvYN..FGLGKR
 98-120 SKM....YGFG.........LG..............KR.......DG..RM..........YS..FGLGKR
121-164 DYD....Y.YGeededdqqaIGdedieesdvgdlmdKR..........DRL..........YS..FGLGKR
165-191 ARP....YSFG.........LG..............KRA..P...SGAQRL..........YG..FGLGKR
192-220 GGS...lYSFG.........LG..............KR........GDGRL..........YA..FGLGKRPVNS
221-253 GRSsgsrFNFG.........LG..............KRS..D...DIDFRE..........LEekFAEDKR
254-316 YPqehrFSFG.........LG..............KREveP...SELEAVrne(25)slhYP..FGIRKL
346-367 RRP....FNFG.........LG..............KRI..P........M..........YD..FGIGKR

