Post on 13-Apr-2018
transcript
7/26/2019 cancer detection - formal version.pptx
Cancer detection
By: gierminsahagun
11337710
This example demonstrates using a neural network to detect cancer from mass spectrometry data on protein profiles.
What is cancer?
Cancer is a term used for diseases in which abnormal cells divide without control and are able to invade other tissues.
Cancer cells can spread to other parts of the body through the blood and lymph systems.
Random facts about cancer
The majority of cancer survivors (64%) were diagnosed 5 or more years ago.
Nearly half (46%) of cancer survivors are 70 years of age or older.
Tobacco use is the cause of about 22% of cancer deaths.
Another 10% is due to obesity, a poor diet, lack of physical activity, and drinking alcohol.
Other factors include certain infections, exposure to ionizing radiation, and environmental pollutants.
Introduction
Serum proteomic pattern diagnostics can be used to differentiate samples from patients with and without disease.
Profile patterns are generated using surface-enhanced laser desorption and ionization (SELDI) protein mass spectrometry.
This technology has the potential to improve clinical diagnostics tests for cancer pathologies.
The Problem: Cancer Detection
The goal is to build a classifier that can distinguish between cancer and control patients from the mass spectrometry data.
The methodology followed in this example is to select a reduced set of measurements or features that can be used to distinguish between cancer and control patients using a classifier.
These features will be ion intensity levels at specific mass/charge values.
Formatting the Data
To recreate the data in ovarian_dataset.mat used in this example, download and uncompress the raw mass-spectrometry data from the FDA-NCI web site.
Create the data file OvarianCancerQAQCdataset.mat by either running the script msseqprocessing in Bioinformatics Toolbox or by following the steps in the example biodistcompdemo (Batch processing with parallel computing).
The new file contains the variables Y, MZ and grp.
Each column in Y represents measurements taken from a patient.
There are 216 columns in Y representing 216 patients, out of which 121 are ovarian cancer patients and 95 are normal patients.
Each row in Y represents the ion intensity level at a specific mass-charge value indicated in MZ. There are 15...
The variable grp holds the index information as to which of these samples represent cancer patients and which ones represent normal patients.
Ranking Key Features
Ranking Key Features
This is a typical classification problem in which the number of features is much larger than the number of observations, but in which no single feature achieves a correct classification.
We therefore need to find a classifier that learns how to weight multiple features appropriately and at the same time produces a generalized mapping.
A simple approach for finding significant features is to assume that each M/Z value is independent and compute a two-way t-test.
rankfeatures returns an index to the most significant M/Z values, for instance 100 indices ranked by the absolute value of the test statistic.
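The ranking idea described above can be sketched in base MATLAB without any toolbox. This is only an illustration on made-up numbers, not the internals of rankfeatures: the data, the labels in isCancer, the choice of a pooled-variance t statistic and the value of k are all assumptions.

```matlab
% Synthetic stand-in for Y: 4 features (rows) by 6 samples (columns).
Y = [1.0 1.1 0.9 1.0 1.2 0.8;    % no class difference
     5.0 5.2 4.8 1.0 1.1 0.9;    % strong class difference
     2.0 2.5 1.5 2.2 1.8 2.0;    % no class difference
     3.0 3.4 2.6 2.0 2.3 1.7];   % moderate class difference
isCancer = logical([1 1 1 0 0 0]);   % hypothetical class labels

A = Y(:, isCancer);   nA = size(A, 2);
B = Y(:, ~isCancer);  nB = size(B, 2);

% Two-sample t statistic with a pooled variance estimate, one value per row.
pooledVar = ((nA-1)*var(A, 0, 2) + (nB-1)*var(B, 0, 2)) / (nA + nB - 2);
tStat = (mean(A, 2) - mean(B, 2)) ./ sqrt(pooledVar * (1/nA + 1/nB));

% Rank features by |t|, largest first, and keep the top k.
[~, order] = sort(abs(tStat), 'descend');
k = 2;                 % assumed number of features to keep
ind = order(1:k);      % row indices of the selected features
```

With these numbers the second and fourth rows are selected, because they are the only ones whose class means differ by much more than their spread.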
To finish recreating the data from ovarian_dataset.mat, load OvarianCancerQAQCdataset.mat and use rankfeatures from Bioinformatics Toolbox to choose the 100 highest ranked measurements as inputs x.
ind = rankfeatures(Y,grp,'CRITERION','ttest','NUMBER',100);
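The training call later in the slides also needs a target matrix t. A minimal sketch of one way to build it is shown here, assuming grp is a cell array of labels containing the strings 'Cancer' and 'Normal'; the label values and the small example array are assumptions, not taken from the dataset.

```matlab
% Hypothetical stand-in for grp: one label per patient.
grp = {'Cancer', 'Normal', 'Cancer', 'Normal', 'Normal'};

isCancer = strcmp('Cancer', grp);    % logical row vector, true for cancer
t = double([isCancer; ~isCancer]);   % row 1: cancer, row 2: normal
```

Each column of t then contains a single 1 that marks the class of that patient, which matches the convention that class 1 is cancer and class 2 is normal.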
The preprocessing steps from the script and example listed above are intended to demonstrate a representative set of possible pre-processing and feature selection procedures.
Using different steps or parameters may lead to different and possibly improved results of this example.
Classification Using a Feed Forward Neural Network
Now that you have identified some significant features, you can use this information to classify the cancer and normal samples.
setdemorandstream(672880951)
Since the neural network is initialized with random initial weights, the results after training the network vary slightly every time the example is run.
To avoid this randomness, the random seed is set to reproduce the same results every time.
However, this is not necessary for your own applications.
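The effect of fixing the seed can be seen with the general-purpose rng function. This is a small sketch only; the seed value 0 is arbitrary, and the random vectors merely stand in for randomly initialized weights.

```matlab
rng(0);            % fix the seed (the value 0 is arbitrary)
w1 = rand(1, 3);   % stand-in for randomly initialized weights
rng(0);            % reset to the same seed
w2 = rand(1, 3);   % drawn again from the same starting state
sameWeights = isequal(w1, w2);   % true: the same seed reproduces the draw
```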
A 1-hidden layer feed forward neural network with 5 hidden layer neurons is created and trained.
The input and target samples are automatically divided into training, validation and test sets.
The training set is used to teach the network.
Training continues as long as the network continues improving on the validation set.
The test set provides a completely independent measure of network accuracy.
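The automatic division into three sets can be pictured as a random partition of the sample indices. The sketch below does this by hand in base MATLAB; the sample count, the seed and the 70/15/15 fractions are assumptions chosen for illustration, not values stated in the slides.

```matlab
nSamples = 216;                 % illustrative number of samples
rng(0);                         % arbitrary seed so the split is repeatable
perm = randperm(nSamples);      % random ordering of the sample indices

nTrain = round(0.70 * nSamples);   % assumed training fraction
nVal   = round(0.15 * nSamples);   % assumed validation fraction

trainInd = perm(1:nTrain);
valInd   = perm(nTrain+1 : nTrain+nVal);
testInd  = perm(nTrain+nVal+1 : end);   % whatever remains is the test set
```

Every sample lands in exactly one of the three sets, so the test indices are never seen during training.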
net = patternnet(5);
view(net)
The input and output have sizes of 0 because the network has not yet been configured to match our input and target data.
This will happen when the network is trained.
Now the network is ready to be trained.
The samples are automatically divided into training, validation and test sets.
The training set is used to teach the network.
Training continues as long as the network continues improving on the validation set.
The test set provides a completely independent measure of network accuracy.
The Training Tool shows the network being trained and the algorithms used to train it.
It also displays the training state during training, and the criteria which stopped training will be highlighted in green.
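Stopping when the validation set no longer improves can be sketched as a simple loop. The validation errors below are made up, and the patience value maxFail is an assumed setting; the sketch shows the stopping logic only, not the toolbox's training code.

```matlab
% Made-up validation errors per epoch, for illustration only.
valErr = [0.50 0.40 0.33 0.30 0.31 0.32 0.29 0.35 0.36 0.37 0.38 0.39 0.40];
maxFail = 6;     % assumed patience: stop after this many non-improving epochs

bestErr = Inf;  bestEpoch = 0;  failCount = 0;  stopEpoch = numel(valErr);
for epoch = 1:numel(valErr)
    if valErr(epoch) < bestErr
        % New best validation error: remember it and reset the counter.
        bestErr = valErr(epoch);  bestEpoch = epoch;  failCount = 0;
    else
        failCount = failCount + 1;
        if failCount >= maxFail
            stopEpoch = epoch;    % validation stopped improving
            break
        end
    end
end
```

Here training would stop at epoch 13 and the weights from epoch 7, the best validation epoch, would be the ones to keep.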
[net,tr] = train(net,x,t);
The buttons at the bottom open useful plots which can be opened during and after training.
Links next to the algorithm names and plot buttons open documentation on those subjects.
The trained neural network can now be tested with the testing samples we partitioned from the main dataset.
The testing data was not used in training in any way and hence provides an out-of-sample dataset to test the network on.
This will give us a sense of how well the network will do when tested with data from the real world.
testX = x(:,tr.testInd);
testT = t(:,tr.testInd);
testY = net(testX);
testClasses = testY > 0.5
The network outputs will be in the range 0 to 1, so we threshold them to get 1s and 0s.
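The thresholding step can be tried on a few made-up outputs. The scores and labels below are invented for illustration, and only one output row (the cancer score) is shown to keep the sketch short.

```matlab
% Made-up network outputs (cancer score) and true labels for 6 samples.
testY = [0.92 0.15 0.60 0.45 0.08 0.71];
testT = [1    0    1    1    0    0   ];

testClasses = testY > 0.5;               % logical 1s and 0s
accuracy = mean(testClasses == testT);   % fraction classified correctly
```

In this toy case four of the six samples are classified correctly.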
One measure of how well the neural network has fit the data is the confusion plot. Here the confusion matrix is plotted across all samples.
The confusion matrix shows the percentages of correct and incorrect classifications. Correct classifications are the green squares on the matrix diagonal.
Incorrect classifications form the red squares.
If the network has learned to classify properly, the percentages in the red squares should be very small, indicating few misclassifications.
If this is not the case, then further training, or training a network with more hidden neurons, would be advisable.
plotconfusion(testT,testY)
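The numbers behind a two-class confusion plot can be counted by hand. The labels and predictions below are made up, and the matrix layout (rows for predicted class, columns for actual class) is a choice made for this sketch.

```matlab
actual    = logical([1 0 1 1 0 0 1 0]);   % 1 = cancer, 0 = normal (made up)
predicted = logical([1 0 1 0 0 1 1 0]);   % made-up thresholded outputs

TP = sum( predicted &  actual);   % cancer correctly flagged
FP = sum( predicted & ~actual);   % normal flagged as cancer
FN = sum(~predicted &  actual);   % cancer missed
TN = sum(~predicted & ~actual);   % normal correctly cleared

confMat = [TP FP; FN TN];         % rows: predicted, columns: actual
pctCorrect = 100 * (TP + TN) / numel(actual);
```

The diagonal entries are the correct classifications and the off-diagonal entries are the misclassifications that should stay small.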
Another measure of how well the neural network has fit the data is the receiver operating characteristic plot.
This shows how the false positive and true positive rates relate as the thresholding of outputs is varied from 0 to 1.
The farther left and up the line is, the fewer false positives need to be accepted in order to get a high true positive rate.
The best classifiers will have a line going from the bottom left corner, to the top left corner, to the top right corner, or close to that.
Class 1 indicates cancer patients, class 2 normal patients.
plotroc(testT,testY)
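The points of a receiver operating characteristic curve can be computed by sweeping the threshold over the outputs. This is a hand-rolled sketch on made-up scores, not what plotroc does internally; the area under the curve is added as a common summary that the slides do not mention.

```matlab
scores = [0.9 0.8 0.7 0.6 0.4 0.3];      % made-up outputs for class 1
labels = logical([1 1 0 1 0 0]);         % 1 = cancer, 0 = normal

thresholds = [Inf, sort(scores, 'descend')];   % strictest to most lenient
tpr = zeros(size(thresholds));
fpr = zeros(size(thresholds));
for i = 1:numel(thresholds)
    predicted = scores >= thresholds(i);
    tpr(i) = sum(predicted &  labels) / sum(labels);    % true positive rate
    fpr(i) = sum(predicted & ~labels) / sum(~labels);   % false positive rate
end

auc = trapz(fpr, tpr);   % area under the curve; 1 is a perfect classifier
```

Plotting tpr against fpr gives the curve; the closer it hugs the top left corner, the closer auc gets to 1.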
This example illustrated how neural networks can be used as classifiers for cancer detection.
One can also experiment using techniques like principal component analysis to reduce the dimensionality of the data to be used for building neural networks to improve classifier performance.
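As a rough sketch of the dimensionality reduction mentioned above, principal component analysis can be carried out with a singular value decomposition in base MATLAB. The data matrix and the number of kept components k are assumptions made for illustration; rows are features and columns are samples, following the layout of Y.

```matlab
% Made-up data: 3 features (rows) by 5 samples (columns).
X = [2 4 6 8 10;
     1 2 3 4 5;
     5 5 5 5 5];

Xc = X - repmat(mean(X, 2), 1, size(X, 2));   % center each feature
[U, S, ~] = svd(Xc, 'econ');                  % columns of U: principal directions

k = 1;                                        % assumed number of components
scores = U(:, 1:k)' * Xc;                     % k-by-samples reduced inputs
explained = diag(S).^2 / sum(diag(S).^2);     % variance share per component
```

In this toy data the features are perfectly correlated, so a single component carries essentially all of the variance and scores could replace X as the network input.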