An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Date post: 23-Jan-2016
Description:
An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC. Grace Dasovich and Robert Kim. Midterm Presentation, August 21, 2009.
Transcript
Page 1: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Grace Dasovich

Robert Kim

Midterm Presentation

August 21 2009

Page 2: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Outline

• Related Work

• Data

• Modeling Approach and Results
  – Similarity Measures
  – Artificial Neural Network
  – Multivariate Linear Regression

• Conclusions

• Future Work

Page 3: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Related Work

• Computer-Aided Diagnosis (CADx) based on low-level image features
  – Armato et al. developed a linear discriminant classifier using features of lung nodules
  – Need to find the relationship between the image features and radiologists’ ratings

Page 4: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Related Work

• Image features and the semantic ratings
  – Lung Interpretations
    • Barb et al. developed Evolutionary System for Semantic Exchange of Information in Collaborative Environments (ESSENCE)
    • Raicu et al. used ensemble classifiers and decision trees to predict semantic ratings
    • Samala et al. used several combinations of image features and the radiologists’ ratings to classify nodules

Page 5: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Related Work

– Similarity
  • Li et al. investigated four different methods to compute similarity measures for lung nodules
    – Feature-based
    – Pixel-value-difference
    – Cross correlation
    – ANN

Page 6: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Data

Materials

• LIDC Dataset

• 149 Unique Nodules
  – One slice per nodule, largest nodule area

• 9 Semantic Characteristics
  – Calcification and Internal Structure had little variation, thus were not used

• 64 Content Features
  – Shape, size, intensity, and texture

Page 8: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Similarity Measures

• Cosine Similarity

• Jeffrey Divergence

• Euclidean Distance
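The slides name these three measures but do not give their formulas, so the following is only a minimal sketch of the standard definitions. The function names are illustrative, and the Jeffrey divergence here uses the symmetric, mean-based form common in image retrieval work, with inputs normalized to sum to one; whether the presenters used exactly this form is an assumption.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1 means same direction)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))

def jeffrey_divergence(p, q, eps=1e-12):
    """Symmetric divergence between two non-negative vectors.

    Assumed form: sum(p*log(p/m) + q*log(q/m)) with m = (p + q) / 2,
    after normalizing p and q to sum to one. eps avoids log(0).
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = (p + q) / 2.0
    return float(np.sum(p * np.log(p / m) + q * np.log(q / m)))
```

Cosine similarity grows as two vectors become more alike, while the Jeffrey divergence and the Euclidean distance shrink, so the measures are not interchangeable without a sign flip or rescaling.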

Page 9: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Similarity Measures

[Figure: scatter plot with Euclidean Distance on the x-axis and Cosine Similarity on the y-axis]

Page 10: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Similarity Measures

[Figure: scatter plot with Euclidean Distance on the x-axis and Jeffrey Divergence on the y-axis]

Page 11: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Similarity Measures

• Computed feature distance measures

Page 13: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• Two three-layer ANNs
  – Input (64 neurons), hidden layer (5 neurons), output (1)
  – Input (64 neurons), hidden layer (5 neurons), output (7)

• Input = 64 feature distances

• Output = Semantic similarity or difference in semantic ratings

• Hyperbolic tangent function, backpropagation algorithm, 200 iterations
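A network of this shape can be sketched in a few lines of NumPy. This is not the presenters' implementation: the learning rate, weight initialization, mean-squared-error loss, full-batch updates, and the linear output layer are all assumptions, and `train_ann` and `predict_ann` are hypothetical names. Only the 64-5-1 (or 64-5-7) layout, the tanh activation, backpropagation, and the 200 iterations come from the slide.

```python
import numpy as np

def train_ann(X, y, n_hidden=5, n_iter=200, lr=0.01, seed=0):
    """Train a three-layer network (input, tanh hidden layer, linear output)
    with full-batch backpropagation on a mean-squared-error loss."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], y.shape[1]
    W1 = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out))
    b2 = np.zeros(n_out)
    losses = []
    for _ in range(n_iter):
        # Forward pass
        H = np.tanh(X @ W1 + b1)
        out = H @ W2 + b2
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error
        g_out = 2.0 * err / err.size
        gW2, gb2 = H.T @ g_out, g_out.sum(axis=0)
        g_hid = (g_out @ W2.T) * (1.0 - H ** 2)  # derivative of tanh
        gW1, gb1 = X.T @ g_hid, g_hid.sum(axis=0)
        # Gradient descent step
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1, W2, b2), losses

def predict_ann(params, X):
    """Run the trained network on new feature-distance vectors."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2
```

Passing a target array with one column gives the single-output network, and a target with seven columns gives the seven-output network, with no other change to the code.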

Page 14: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• ANN with a single output
  – 640 random pairs from all 109 nodules
  – 231 pairs from nodules with malignancy > 3
  – 496 pairs from nodules with area > 122 mm²

Page 15: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• ANN with seven outputs
  – 640 random pairs from all 109 nodules

Page 16: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• Leave-one-out method
  – Cosine similarity, Jeffrey divergence, or difference in semantic ratings used as teaching data
  – An ANN trained with the entire dataset minus one image pair
  – The pair left out used for testing
  – Correlation calculated between the radiologists’ computed similarity and the ANN output
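The leave-one-out procedure can be sketched independently of the model being evaluated. To keep the example short and fast, a least-squares linear fit stands in for the ANN here; that substitution is an assumption for illustration only, and `fit_linear`, `predict_linear`, and `leave_one_out_correlation` are hypothetical names. Any `fit`/`predict` pair with the same signatures could be passed in instead.

```python
import numpy as np

def fit_linear(X, y):
    """Stand-in model: least-squares linear fit with an intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predict with the stand-in linear model."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

def leave_one_out_correlation(X, y, fit=fit_linear, predict=predict_linear):
    """Hold out each pair in turn, train on the rest, predict the held-out
    pair, then correlate all held-out predictions with the targets."""
    n = len(X)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # every pair except pair i
        model = fit(X[keep], y[keep])
        preds[i] = predict(model, X[i:i + 1])[0]
    r = np.corrcoef(preds, y)[0, 1]       # Pearson correlation
    return float(r), preds
```

Because every prediction comes from a model that never saw the held-out pair, the resulting correlation estimates performance on unseen data, at the cost of training one model per pair.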

Page 17: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• ANN with a single output
  – 640 random pairs from all 109 nodules
  – 231 pairs from nodules with malignancy > 3
  – 496 pairs from nodules with area > 122 mm²

• ANN with seven outputs
  – 640 random pairs from all 109 nodules

Page 18: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

• ANN using 640 random pairs

Page 19: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

• ANN using 231 pairs with malignancy rating > 3

Page 20: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

• ANN using 496 pairs with area > 122 mm²

Page 21: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

• ANN output vs. target values using Jeffrey divergence for the 640 pairs (r = 0.438)

[Figure: scatter plot with Output on the x-axis and Target on the y-axis]

Page 22: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

• ANN using random 640 pairs and the Jeffrey divergence with seven semantic ratings

Page 24: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• Normalization of Features
  – Min-Max Technique
  – Z-Score Technique

• Pair Selection
  – Looked for matches between the k most similar images by semantic similarity and the k most similar images by content similarity
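Both normalization techniques and one plausible reading of the pair-selection step are sketched below. The slide does not spell out how matches were determined, so `matched_pairs` assumes that a pair is selected when one nodule appears among the other's k nearest neighbors under both the semantic and the content distance; all function names are hypothetical.

```python
import numpy as np

def min_max_normalize(X):
    """Rescale each feature column to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return (X - lo) / span

def z_score_normalize(X):
    """Shift each feature column to zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std = np.where(std > 0, std, 1.0)        # avoid division by zero
    return (X - X.mean(axis=0)) / std

def top_k_neighbors(dist, k):
    """Indices of the k nearest items for each row of a distance matrix."""
    d = np.array(dist, dtype=float)          # copy so the input is untouched
    np.fill_diagonal(d, np.inf)              # an item is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def matched_pairs(sem_dist, con_dist, k):
    """Pairs (i, j) where j is among the k nearest neighbors of i under
    both the semantic and the content distance (assumed matching rule)."""
    sem_nn = top_k_neighbors(sem_dist, k)
    con_nn = top_k_neighbors(con_dist, k)
    pairs = set()
    for i in range(len(sem_nn)):
        for j in set(sem_nn[i]) & set(con_nn[i]):
            pairs.add((min(i, int(j)), max(i, int(j))))
    return sorted(pairs)
```

Under this reading, raising k admits more pairs but makes the agreement between semantic and content neighbors less strict.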

Page 25: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• Multivariate Regression Analysis
  – Select features with highest correlation coefficients
  – Feature distance measures
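A minimal sketch of these two steps follows, assuming `D` holds one row of feature distances per nodule pair and `s` holds the corresponding semantic similarity. Ranking by absolute Pearson correlation and fitting ordinary least squares with an intercept are assumptions about details the slide leaves open, and the function names are hypothetical.

```python
import numpy as np

def select_by_correlation(D, s, n_select=10):
    """Rank feature-distance columns by |Pearson r| with the target and
    return the indices and correlations of the top n_select columns."""
    D, s = np.asarray(D, dtype=float), np.asarray(s, dtype=float)
    Dc, sc = D - D.mean(axis=0), s - s.mean()
    denom = np.sqrt((Dc ** 2).sum(axis=0) * (sc ** 2).sum())
    r = np.divide(Dc.T @ sc, denom, out=np.zeros(D.shape[1]), where=denom > 0)
    order = np.argsort(-np.abs(r))[:n_select]
    return order, r[order]

def fit_multivariate(D, s):
    """Ordinary least-squares fit of s on the columns of D (with intercept).
    Returns the coefficients (intercept first) and the R^2 of the fit."""
    D, s = np.asarray(D, dtype=float), np.asarray(s, dtype=float)
    A = np.column_stack([np.ones(len(D)), D])
    beta, *_ = np.linalg.lstsq(A, s, rcond=None)
    resid = s - A @ beta
    r2 = 1.0 - (resid ** 2).sum() / ((s - s.mean()) ** 2).sum()
    return beta, float(r2)
```

A typical use would call `select_by_correlation` first and then pass only the selected columns, `D[:, order]`, to `fit_multivariate`.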

Page 26: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Methods

• Nodule Analysis
  – Determine differences between selected and non-selected nodules
  – Define requirements for our model

Page 27: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

[Figure: two panels plotted against Threshold (0 to 20): Correlation (0 to 1) and Number of Pairs (0 to 2000)]

Page 28: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

            d(i, j)   d²(i, j)   exp(d(i, j))
Cosine      0.871     0.849      0.866
Jeffrey     0.647     0.633      0.608

Page 29: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

Correlation Coefficient   Feature
0.1175                    Equivalent Diameter
0.1085                    Energy (Haralick)
0.0823                    Gabor Mean 135_05
0.0647                    Convex Area
0.0467                    Gabor STD 135_04
0.0322                    Min Intensity BG
0.0295                    Markov 4
0.0280                    Variance (Haralick)
0.0265                    Gabor STD 45_05
0.0238                    SD Intensity

R² = 0.871

Page 30: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

[Figure: scatter plot with Content on the x-axis and Semantic on the y-axis]

Page 31: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

[Figure: seven panels showing the distribution of ratings (1 to 5) for Lobulation, Malignancy, Margin, Sphericity, Spiculation, Subtlety, and Texture; legend: 79 Nodules, 70 Nodules]

Page 32: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

[Figure: ten panels showing the distribution of Equivalent Diameter, Energy, Gabor Mean 135 5, Convex Area, Gabor SD 135 4, Min Intensity BG, Markov4, Variance, Gabor SD 45 5, and SD Intensity; legend: 79 nodules, 70 nodules]

Page 33: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Results

[Figure: four panels, A to D; legend in each panel: 79 Nodules, 70 Nodules]

A. Equivalent Diameter, B. Standard Deviation of Intensity, C. Malignancy, D. Subtlety

Page 34: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Conclusions

Preliminary Issues

• The ANN is also not yet sufficient to predict semantic similarity from content
  – Best correlation 0.438
  – Malignancy correlation 0.521
  – Jeffrey divergence performed better, unlike in the linear model

• A semantic gap still exists

Page 35: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Conclusions

• Our linear model applies to a specific type of nodule
  – Characteristics: High malignancy, high texture, low lobulation, and low spiculation
  – Features: Larger diameter, greater intensity

• Linear models are not sufficient for determination of similarities
  – R² of 0.871 with chosen nodules

Page 36: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Future Work

• Reduce variability among radiologists
  – Use only nodules with radiologists’ agreement

• Find best combination of content features
  – 64 may be too many
  – Currently only using 2D

Page 37: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Future Work

• Different semantic distance measures
  – Some ratings are ordinal, while Jeffrey divergence is for categorical data

• Different methods of machine learning
  – Incorporate radiologists’ feedback into training
  – Ensemble of classifiers

Page 38: An Investigation into the Relationship between Semantic and Content Based Similarity Using LIDC

Thanks for Listening

Any Questions?


