GRID ENABLED SYSTEM FOR MEDICAL IMAGE GATHERING,
ANALYZING, RETRIEVAL AND PROCESSING
Gorgi Kakasevski†, Aneta Buckovska*, Suzana Loskovska*, Ivica Dimitrovski*
†EU, Faculty of Informatics, Skopje, Macedonia
*UKIM, Faculty of Electrotechnics and Information Technology, Skopje, Macedonia
†[email protected], *[email protected]
Web image gathering
Feature extraction, classifying, clustering
Keyword-based and content-based image retrieval
Large-scale image processing
Images on the Web
Commercial Web image search engines
• Google Image Search
• Yahoo Image Search
• Altavista Image Search
• Picsearch
Content-based image retrieval
• Image Rover
• WebSEEK
• ImageScape
• Mirror
• PicToSeek
• Diogenes
Disadvantages
• Large number of Web sites with images
• Large number of Web image search engines
• Many keywords and keyword combinations
• A Web crawler is needed to gather images in DICOM format or from a specific Web site
• No support for searching by query image
• For some criteria, searching is difficult or impossible
• Images must be saved and organized on the local computer manually
• Once images are organized locally, searching by specific criteria is difficult (e.g. the site where the image is hosted, notes from the users…)
• Different programs must be used to classify (or cluster) images
System architecture
Web image gathering
Dictionary of keywords
Group 1, Group 2, Group 3, ...
kw1, kw2, kw3, ..., kwn
• Keyword permutations (r = 1 to r = 3), where kw1 is combined with r further keywords of the group:
  r = 1: (kw1, kw2), (kw1, kw3), ..., (kw1, kw10)
  r = 2: (kw1, kw2, kw3), (kw1, kw2, kw4), ..., (kw1, kw9, kw10)
  r = 3: (kw1, kw2, kw3, kw4), (kw1, kw2, kw3, kw5), ..., (kw1, kw8, kw9, kw10)
• Other parameters: image size, file type, color or black-and-white, number of images, type of safe search, filter, and location of search
Selecting parameters
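The keyword enumeration on this slide can be sketched in a few lines of Python. This is a minimal sketch, assuming (as the listed tuples suggest) that the first keyword of a group stays fixed and is combined with r of the remaining keywords in dictionary order. The function name `keyword_queries` and the ten-keyword sample group are hypothetical.

```python
from itertools import combinations

def keyword_queries(group, r_max=3):
    """Yield query tuples: the first keyword plus r of the remaining ones."""
    base, rest = group[0], group[1:]
    for r in range(1, r_max + 1):
        # combinations() keeps dictionary order and never repeats a keyword.
        for extra in combinations(rest, r):
            yield (base, *extra)

# Hypothetical group of ten keywords, kw1 .. kw10.
group = [f"kw{i}" for i in range(1, 11)]
queries = list(keyword_queries(group))
```

For a group of ten keywords this produces 9 + 36 + 84 = 129 queries, each of which would then be sent to the search engines together with the other parameters.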
Feature extraction
• MPEG-7 descriptors, a standard (ISO/IEC) set of descriptors that can be used to describe multimedia information:
  - DominantColor
  - ScalableColor
  - ColorStructure
  - ColorLayout
  - EdgeHistogram
• Skin type recognition
• Color moments
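Of the features listed above, color moments are simple enough to sketch directly. The example below assumes the common definition (per-channel mean, standard deviation, and signed cube root of the third central moment); the slides do not say which variant the system uses, and the function name `color_moments` is hypothetical.

```python
def color_moments(pixels):
    """Return (mean, std, skewness) for each channel of a list of RGB tuples."""
    n = len(pixels)
    moments = []
    for channel in zip(*pixels):  # one tuple of values per channel
        mean = sum(channel) / n
        var = sum((v - mean) ** 2 for v in channel) / n
        third = sum((v - mean) ** 3 for v in channel) / n
        # Signed cube root keeps the sign of the third central moment.
        skew = abs(third) ** (1 / 3) * (1 if third >= 0 else -1)
        moments.append((mean, var ** 0.5, skew))
    return moments

# Three made-up pixels: constant red and blue, varying green.
pixels = [(0, 10, 255), (0, 20, 255), (0, 60, 255)]
features = color_moments(pixels)
```

The result is a nine-value feature vector (three moments for each of three channels) that can be compared between images like any other descriptor.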
Feature extraction
• Download images
• Feature extraction (eXperimentation Model - XM)

DominantColor

<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001"
       xmlns:xml="http://www.w3.org/XML/1998/namespace"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:mpeg7="urn:mpeg:mpeg7:schema:2001"
       xsi:schemaLocation="urn:mpeg:mpeg7:schema:2001 Mpeg7-2001.xsd">
  <DescriptionUnit xsi:type="DescriptorCollectionType">
    <Descriptor xsi:type="DominantColorType">
      <SpatialCoherency>0</SpatialCoherency>
      <Value><Percentage>2</Percentage><Index>1 1 1 </Index></Value>
      <Value><Percentage>3</Percentage><Index>28 27 28 </Index></Value>
      <Value><Percentage>2</Percentage><Index>26 14 13 </Index></Value>
      <Value><Percentage>17</Percentage><Index>11 9 7 </Index></Value>
      <Value><Percentage>5</Percentage><Index>26 22 21 </Index></Value>
    </Descriptor>
  </DescriptionUnit>
</Mpeg7>

EdgeHistogram

<?xml version='1.0' encoding='ISO-8859-1' ?>
<Mpeg7 xmlns="http://www.mpeg7.org/2001/MPEG-7_Schema"
       xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
  <DescriptionUnit xsi:type="DescriptorCollectionType">
    <Descriptor xsi:type="EdgeHistogramType">
      <BinCounts>0 4 7 1 2 3 5 5 3 4 2 5 2 5 6 1 2 0 7 6 5 2 7 3 4 4 3 5 4 4 4 3 5 5 3 4 1 2 7 5 4 3 4 7 4 3 3 4 6 4 2 2 6 5 5 5 2 6 4 4 0 2 2 7 4 2 4 5 7 5 2 4 6 2 6 2 2 7 2 4 </BinCounts>
    </Descriptor>
  </DescriptionUnit>
</Mpeg7>
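Descriptions like the ones on this slide have to be turned into numeric vectors before clustering or classification. The following is a minimal sketch of that step using Python's standard `xml.etree.ElementTree`; the helper names `local_name` and `edge_histogram_bins` are hypothetical, and the sample document is shortened to eight bins and omits the XML declaration.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def edge_histogram_bins(xml_text):
    """Return the BinCounts of an EdgeHistogram description as a list of ints."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Matching on the local name works for either namespace seen on the slide.
        if local_name(elem.tag) == "BinCounts":
            return [int(v) for v in elem.text.split()]
    return []

# Shortened sample; a real description carries 80 bin values.
sample = """<Mpeg7 xmlns="http://www.mpeg7.org/2001/MPEG-7_Schema"
 xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance">
 <DescriptionUnit xsi:type="DescriptorCollectionType">
  <Descriptor xsi:type="EdgeHistogramType">
   <BinCounts>0 4 7 1 2 3 5 5 </BinCounts>
  </Descriptor>
 </DescriptionUnit>
</Mpeg7>"""
bins = edge_histogram_bins(sample)
```

Matching on the local tag name rather than a fixed namespace is a deliberate choice here, since the two descriptions on the slide use different namespace URIs.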
Image mining
• Clustering – images are grouped into several clusters:
  - Take a keyword permutation;
  - Gather images from the Web;
  - Extract image features;
  - Cluster the first 100 images into m clusters;
  - Assign each of the remaining images to one of these clusters;
  - For each cluster that exceeds the threshold of l% of the total number of images, create relations between the images in the cluster, the cluster, and the keywords;
  - Images from clusters that do not exceed the threshold are placed into a single cluster.
• EM method, WEKA (open-source data mining software)
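The last two steps of the clustering procedure (the l% threshold) can be illustrated independently of the clustering method itself. This sketch does not reproduce WEKA's EM; it assumes cluster assignments are already available as a dictionary, and the names `apply_threshold` and `leftover` are hypothetical.

```python
from collections import Counter

def apply_threshold(assignments, threshold_pct, leftover="other"):
    """Keep clusters above threshold_pct of all images; merge the rest into one."""
    total = len(assignments)
    sizes = Counter(assignments.values())
    kept = {c for c, n in sizes.items() if 100 * n / total > threshold_pct}
    return {img: (c if c in kept else leftover) for img, c in assignments.items()}

# Made-up assignments: cluster 0 holds 60% of the images, clusters 1 and 2 hold 20% each.
assignments = {"a.jpg": 0, "b.jpg": 0, "c.jpg": 0, "d.jpg": 1, "e.jpg": 2}
result = apply_threshold(assignments, threshold_pct=25)
```

With a 25% threshold only cluster 0 survives; the images of clusters 1 and 2 end up together in the single leftover cluster, and only the surviving cluster would be related to the keywords.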
Image mining
• Classifying – each image is assigned to a class or discarded:
  - The users define a class;
  - The users assign keywords to the class;
  - The users assign certain images to classes;
  - The users select the descriptors that will describe the images in the class;
  - Build a classification model (k-nearest neighbor, IBk implementation in WEKA);
  - Take a keyword permutation for a certain class;
  - Gather images from the Web;
  - Extract image features;
  - Assign a class to each image (once for each descriptor);
  - Place the image into the class that appears most often.
• IBk method, WEKA (open-source data mining software)
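The per-descriptor classification followed by a vote can be sketched as below. This is a plain-Python stand-in for the idea, not WEKA's IBk; it assumes Euclidean distance and k = 1, and all feature values, class labels, and function names are made up for illustration.

```python
from collections import Counter
from math import dist

def knn_label(train, query, k=1):
    """train: list of (vector, label). Return the majority label of the k nearest."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def classify_image(models, features, k=1):
    """One k-NN prediction per descriptor, then the most frequent class wins."""
    votes = [knn_label(models[d], features[d], k) for d in features]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical training examples for three descriptors and two classes.
models = {
    "EdgeHistogram": [([0, 4, 7], "infected"), ([5, 5, 3], "healthy")],
    "ColorLayout": [([10, 2], "infected"), ([1, 9], "healthy")],
    "DominantColor": [([28, 27, 28], "infected"), ([1, 1, 1], "healthy")],
}
features = {"EdgeHistogram": [1, 4, 6], "ColorLayout": [9, 3], "DominantColor": [2, 2, 2]}
label = classify_image(models, features)
```

Here two of the three descriptors vote for the same class, so that class is assigned even though the third descriptor disagrees.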
Building relations
• All information is returned to the server, and the relations between images, keywords, descriptors, classes and clusters are saved in the local database
• Once all relations are saved in the local database, content-based and keyword-based image retrieval are enabled
Content-based image retrieval
• Upload a query image
• Extract image features
• Place the image into a class/cluster (using the classification model)
• Retrieve the images from the class to which the query image belongs, sorted by similarity to the query image
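The final retrieval step above amounts to filtering by class and sorting by distance. The sketch below assumes Euclidean distance between feature vectors as the similarity measure, which the slides do not specify; the function `retrieve_similar` and the in-memory `database` list are hypothetical.

```python
from math import dist

def retrieve_similar(query_vec, query_class, database, top_n=10):
    """database: list of (image_id, class_label, feature_vector)."""
    candidates = [(img, vec) for img, cls, vec in database if cls == query_class]
    # Smaller distance means more similar, so sort ascending.
    ranked = sorted(candidates, key=lambda item: dist(item[1], query_vec))
    return [img for img, _ in ranked[:top_n]]

# Made-up stored images with two-dimensional feature vectors.
database = [
    ("img1", "A", [0.0, 0.0]),
    ("img2", "A", [3.0, 4.0]),
    ("img3", "B", [0.1, 0.1]),
    ("img4", "A", [1.0, 1.0]),
]
hits = retrieve_similar([0.9, 1.0], "A", database)
```

Restricting the search to the predicted class keeps the number of distance computations small, at the cost of missing similar images that were placed in a different class.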
Keyword-based image retrieval
• Enter keywords
• Retrieve the clusters and classes that correspond to the entered keywords
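Keyword-based retrieval over the stored relations can be pictured as a join between keywords, classes/clusters, and images. The schema below is purely an assumption for illustration (the slides do not describe the actual database layout); table names, column names, and sample rows are hypothetical, and SQLite stands in for the local database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, kind TEXT, name TEXT);
    CREATE TABLE keyword_category (keyword TEXT, category_id INTEGER);
    CREATE TABLE image_category (image_url TEXT, category_id INTEGER);
""")
# A category is either a user-defined class or an automatically built cluster.
conn.executemany("INSERT INTO category VALUES (?, ?, ?)",
                 [(1, "class", "malaria cells"), (2, "cluster", "cluster 7")])
conn.executemany("INSERT INTO keyword_category VALUES (?, ?)",
                 [("malaria", 1), ("blood", 1), ("blood", 2)])
conn.executemany("INSERT INTO image_category VALUES (?, ?)",
                 [("http://example.org/a.jpg", 1), ("http://example.org/b.jpg", 2)])

def search(keyword):
    """Return (kind, name, image_url) for every class/cluster tied to the keyword."""
    return conn.execute("""
        SELECT c.kind, c.name, i.image_url
        FROM keyword_category k
        JOIN category c ON c.id = k.category_id
        JOIN image_category i ON i.category_id = c.id
        WHERE k.keyword = ?
        ORDER BY c.id, i.image_url
    """, (keyword,)).fetchall()

rows = search("blood")
```

In this toy data the keyword "blood" is related to both a class and a cluster, so images from both are returned, while "malaria" would return only the class.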
Experimental results
• The users must set the system parameters: create a dictionary with groups of keywords, define classes, enter Web sites, enter search engines, select the descriptors that describe the classes, set Grid parameters, etc.
Experimental results
• Classification (e.g. malaria cells)
Experimental results
• Grid execution
  - SEEGRID infrastructure
  - Sites: c01.grid.etfbl.net; ce.ulakbim.gov.tr; ce01.info.uvt.ro; cluster1.csk.kg.ac.yu; tbit01.nipne.ro; ce01.isabella.grnet.gr; seegrid2.fie.upt.al; grid01.elfak.ni.ac.yu; rti29.etf.bg.ac.yu; ce001.imbm.bas.bg.

Job (100000 images)    Local     Grid     Grid with G.A.
Web image gathering    42h50m    4h42m    3h25m
Feature extraction     58h22m    4h15m    3h18m
Clustering             43h05m    4h07m    3h10m
Classification         42h38m    3h40m    3h08m
Total                  ≈187h     ≈17h     ≈13h
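The totals and the resulting speedups can be checked from the per-job timings with a few lines of arithmetic. The timing strings below are copied from the table; the helper `minutes` and the variable names are hypothetical.

```python
import re

# Per-job timings from the table: (Local, Grid, Grid with G.A.).
TIMES = {
    "Web image gathering": ("42h50m", "4h42m", "3h25m"),
    "Feature extraction": ("58h22m", "4h15m", "3h18m"),
    "Clustering": ("43h05m", "4h07m", "3h10m"),
    "Classification": ("42h38m", "3h40m", "3h08m"),
}

def minutes(text):
    """Convert a string such as '42h50m' to minutes."""
    hours, mins = re.fullmatch(r"(\d+)h(\d+)m", text).groups()
    return int(hours) * 60 + int(mins)

totals = [sum(minutes(row[i]) for row in TIMES.values()) for i in range(3)]
local, grid, grid_ga = totals
print(f"totals (min): {totals}")
print(f"speedup Grid: {local / grid:.1f}x, Grid with G.A.: {local / grid_ga:.1f}x")
```

The exact sums are 186h55m, 16h44m and 13h01m, which agree with the rounded Total row; they correspond to a speedup of roughly 11 times on the Grid and roughly 14 times on the Grid with G.A. relative to local execution.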
Conclusion and future work
Conclusion:
• Our system yields much better results than the commercial search engines
• It gathers Web images through search engines or from specific sites
• It clusters and classifies images
• It has a search service with CBIR
• Searching for images by different criteria is very easy
• Images and their thumbnails are saved in the local database
• It is easily upgradable
• It tests the Grid infrastructure with multimedia data
Future work:
• Complex testing of the system
• Modification of the algorithms and descriptors used for feature extraction
• Creating a database of Web sites where medical images can be found
• Job optimization
• Improvement in the usage of replicas
THANKS!
QUESTIONS?