Region-based Semi-supervised Clustering Image Segmentation
2011 Seventh International Conference on Natural Computation (ICNC)
Tongfeng Sun, Zihui Ren, Shifei Ding
School of Information and Electrical Engineering
China University of Mining and Technology, China
(DOI: 10.1109/ICNC.2011.6022385)
Presentation by Onur Yılmaz
Outline
Introduction
Theoretical Analysis
Image Segmentation on Real Data
Experimental Comparison
Conclusion
Introduction: Image Segmentation
Clustering is often used to segment images.
Image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as superpixels).
Possible goals:
Simplify the representation
Change the representation
Introduction: Problem
K-means and fuzzy C-means (FCM) are unsupervised methods.
They produce «natural separations of data» that may not represent the user's preferences.
Introduction: What is presented in this paper?
Improvement and application of semi-supervised clustering in image segmentation, based on spatial information
Theoretical Analysis
Semi-supervised clustering image segmentation is divided into four steps:
Step 1: Obtain labeled data according to manual guides
Step 2: Extract features and form feature vectors
Step 3: Improve the semi-supervised clustering algorithm based on spatial information, design an appropriate clustering objective function, and segment the image
Step 4: Merge and annotate regions
Theoretical Analysis: Manual Guides
Manual guides directly reflect the user's intentions and affect the results of the segmentation.
The user provides the number of clusters and labels some pixels.
Labeled pixels carry spatial information features.
Theoretical Analysis: Attribute Extraction
Pixel characteristics fall into two categories:
color features
neighbouring texture features
YUV color space:
Y → luminance
U → chrominance (blue-difference component)
V → chrominance (red-difference component)
Gray matrix
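A minimal sketch of how the two feature categories could be computed. The slides do not state which RGB-to-YUV conversion the paper uses, so the common BT.601 weights are assumed here. The sketch also assumes that the "gray matrix" refers to a gray-level co-occurrence matrix. The function names `rgb_to_yuv` and `cooccurrence` are illustrative and not taken from the paper.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in 0..255) to YUV.

    Assumption: BT.601 weights; the paper may use a different variant.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v


def cooccurrence(gray, levels):
    """Count horizontally adjacent gray-level pairs in a 2-D list image.

    counts[i][j] is the number of times level j appears directly to the
    right of level i. Gray values must be integers in 0..levels-1.
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in gray:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
    return counts
```

A feature vector for a pixel could then combine its (Y, U, V) values with texture statistics derived from the co-occurrence counts of its neighbourhood.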
Theoretical Analysis: Semi-supervised Clustering
K-means
is a «natural» segmentation based on the data themselves
does not represent the user's preferences.
To overcome this problem, semi-supervised clustering takes the user's preferences into account.
Theoretical Analysis: Semi-supervised Clustering
In this paper, the constraint-based semi-supervised clustering algorithm is improved with an objective function.
Theoretical Analysis: Semi-supervised Clustering
Objective Function
Jobj = Total Weighted Euclidean Distance + Total Penalty for Incorrectly Labeled Segments
Theoretical Analysis: Semi-supervised Clustering
Objective Function, first term (Total Weighted Euclidean Distance)
Sum of distances from segments to cluster centers, over all clusters
Annotated symbols:
assignment indicator: the m-th element is assumed to belong to the i-th cluster
cluster center
number of clusters
Theoretical Analysis: Semi-supervised Clustering
Objective Function, second term (Total Penalty for Incorrectly Labeled Segments)
Penalty function applied when labeled data are incorrectly segmented
Annotated symbols:
incorrect-assignment function (1 or 0)
penalty coefficient
difference between the distance to the assumed cluster center and the distance to the original cluster center
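The slides give the objective only in words, so the following is a sketch under stated assumptions rather than the paper's formula: squared Euclidean distance, unit weights in the distance term, and a penalty equal to the coefficient times the absolute difference between the two distances. All names (`objective`, `penalty_coeff`, `labels`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(points, assign, centers, labels, penalty_coeff=1.0):
    """Distance term plus penalty term, as described on the slides.

    points  : list of feature vectors
    assign  : assign[m] is the cluster index currently given to point m
    centers : list of cluster centers
    labels  : dict {m: cluster index chosen by the user} for labeled points
    """
    distance_term = sum(sq_dist(p, centers[assign[m]])
                        for m, p in enumerate(points))
    penalty_term = 0.0
    for m, labeled_i in labels.items():
        if assign[m] != labeled_i:  # incorrect-assignment function equals 1
            diff = (sq_dist(points[m], centers[assign[m]])
                    - sq_dist(points[m], centers[labeled_i]))
            penalty_term += penalty_coeff * abs(diff)
    return distance_term + penalty_term
```

With this form, a labeled point that lands in its user-given cluster contributes no penalty, so the objective reduces to the ordinary k-means cost when nothing is labeled.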
Theoretical Analysis: Implementation of Semi-supervised Clustering
INPUTS
Data X = { xm }
Labeled data X’
Number of clusters k
Maximum number of iterations s
Iteration termination condition d
OUTPUTS
Subsets X1, X2, X3, X4, …, Xk
Theoretical Analysis: Implementation of Semi-supervised Clustering
Step 1: Initialize cluster centers
Step 2: Classify data
Step 3: Recalculate cluster centers
Step 4: Check termination condition
Satisfied: stop
Not satisfied: return to Step 2
Theoretical Analysis: Implementation of Semi-supervised Clustering
Step 1: Initialize cluster centers
Extend the «labeled data» to adjacent data
Divide the extended data into r subsets, where each subset has the same label
Compare r (found above) with k (the number of clusters given as input)
Theoretical Analysis: Implementation of Semi-supervised Clustering
Step 1: Initialize cluster centers (continued)
If r = k: calculate the cluster centers
If r > k: report an error and start over
If r < k: (k-r) more clusters are needed
• Use k-means clustering to find k clusters
• Remove the r cluster centers that are nearest to those already found
• Add the remaining (k-r) cluster centers
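The three cases of the initialization step can be sketched as follows. This is an illustration only: the extension of labeled data to adjacent pixels is assumed to have happened already, and the k centers from an ordinary k-means run are passed in as `kmeans_centers` instead of being computed here. All names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[d] for v in vectors) / n
                 for d in range(len(vectors[0])))


def init_centers(labeled_subsets, k, kmeans_centers=None):
    """Return k initial centers from r labeled subsets.

    labeled_subsets : list of r lists of feature vectors, one per label
    kmeans_centers  : k centers from a plain k-means run (needed if r < k)
    """
    r = len(labeled_subsets)
    if r > k:
        raise ValueError("r=%d labeled subsets but only k=%d clusters" % (r, k))
    centers = [mean(subset) for subset in labeled_subsets]
    if r < k:
        candidates = list(kmeans_centers)
        for c in centers:
            # Drop the k-means center nearest to each labeled center.
            nearest = min(candidates, key=lambda cand: sq_dist(cand, c))
            candidates.remove(nearest)
        centers.extend(candidates)  # the remaining k - r centers
    return centers
```

For example, with one labeled subset and k = 2, the labeled mean becomes the first center and the k-means center farther from it becomes the second.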
Theoretical Analysis: Implementation of Semi-supervised Clustering
Step 2: Classify data
Place the data into different clusters so as to minimize the objective function
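One way to read this step is a greedy assignment: each point goes to the cluster with the lowest contribution to the objective. The sketch below assumes squared Euclidean distance and an absolute-difference penalty for labeled points placed outside their labeled cluster. The names `classify`, `labels` and `penalty_coeff` are illustrative and not from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(points, centers, labels, penalty_coeff=1.0):
    """Assign every point to the cluster with the lowest cost.

    labels is a dict {point index: user-given cluster index}.
    Unlabeled points simply go to the nearest center.
    """
    assign = []
    for m, p in enumerate(points):
        def cost(i):
            d = sq_dist(p, centers[i])
            if m in labels and i != labels[m]:
                # Penalize moving a labeled point away from its label.
                d += penalty_coeff * abs(d - sq_dist(p, centers[labels[m]]))
            return d
        assign.append(min(range(len(centers)), key=cost))
    return assign
```

With a large enough penalty coefficient, a labeled point stays in its user-given cluster even when another center is closer.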
Theoretical Analysis: Implementation of Semi-supervised Clustering
Step 3: Recalculate cluster centers
Recalculate the cluster centers, giving more weight to the labeled data
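A weighted mean is one straightforward way to give labeled data more influence. The sketch below assumes a single weight shared by all labeled points and weight 1 for unlabeled points. The slides do not give the exact update rule, and the names are hypothetical.

```python
def recalc_centers(points, assign, k, labeled, labeled_weight=10.0):
    """Weighted mean of the points in each cluster.

    labeled        : set of indices of labeled points
    labeled_weight : weight of a labeled point (unlabeled points weigh 1)
    Returns None for a cluster that received no points.
    """
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    totals = [0.0] * k
    for m, p in enumerate(points):
        w = labeled_weight if m in labeled else 1.0
        totals[assign[m]] += w
        for d in range(dim):
            sums[assign[m]][d] += w * p[d]
    return [tuple(s / totals[i] for s in sums[i]) if totals[i] else None
            for i in range(k)]
```

Raising `labeled_weight` pulls each center toward the labeled points of its cluster.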
Theoretical Analysis: Implementation of Semi-supervised Clustering
Step 4: Check termination condition
Check for
• Maximum number of iterations
• Change in cluster centers
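The two checks can be combined in a small predicate. This sketch assumes that "change in cluster centers" means the largest Euclidean shift of any center, compared against the threshold d from the inputs. The function and parameter names are illustrative.

```python
import math


def should_stop(old_centers, new_centers, iteration, max_iter, tol):
    """Return True when iteration should end.

    Stops when the iteration budget (s in the inputs) is used up, or when
    no center moved by more than tol (d in the inputs).
    """
    if iteration >= max_iter:
        return True
    largest_shift = max(math.dist(a, b)
                        for a, b in zip(old_centers, new_centers))
    return largest_shift <= tol
```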
Theoretical Analysis: Merging and Annotation
To obtain the desired objects:
Region merging:
A complex object may be split into several independent segments, which should finally be merged.
Region annotation:
Annotate regions with different meanings based on the labeled data, such as segmentation objects, background, etc.
Morphological dilation and erosion
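The slides do not say how dilation and erosion are applied. A common use is closing (dilation followed by erosion), which fills small holes in a region mask. The sketch below assumes a binary mask stored as a list of lists and a 3x3 square structuring element; both are illustrative choices.

```python
def _window(mask, r, c):
    """Values in the 3x3 window around (r, c); outside the image counts as 0."""
    rows, cols = len(mask), len(mask[0])
    return [mask[i][j] if 0 <= i < rows and 0 <= j < cols else 0
            for i in range(r - 1, r + 2)
            for j in range(c - 1, c + 2)]


def dilate(mask):
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return [[int(any(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def erode(mask):
    """A pixel stays 1 only if every pixel in its 3x3 window is 1."""
    return [[int(all(_window(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def close_holes(mask):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(mask))
```

Applied to a 3x3 region with a one-pixel hole in the middle, `close_holes` returns the same region with the hole filled.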
Image Segmentation on Real Data: Data Preprocessing Step
200 images:
animals, plants, landscapes and other subjects
Image Segmentation on Real Data: Manual Guides Step
Setting some parameters:
number of cluster classes, initial cluster centers, etc.
Parameters should be chosen according to the user's preferences.
[Figure: example images with user labels river, snow, field, cats, background, field, and 7 more, plus unlabeled regions. Callout: FLEXIBLE]
Image Segmentation on Real Data: Image Segmentation Step
[Figure: labeled data points (cluster centers from 1 to k)]
Image Segmentation on Real Data: Image Segmentation Step
[Figure: image segmentation results]
Image Segmentation on Real Data: Merging and Annotation Step
[Figure: results after merging and annotation]
Experimental Comparison: Experiment Results
Method                       Number of iterations   Time    Accuracy rate
Semi-supervised clustering   7                      1.5 s   92 %
K-means clustering           27                     5.3 s   74 %
FCM clustering               30                     6.5 s   78 %
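The figures in the table can be turned into ratios to make the comparison concrete. The snippet below only restates the reported numbers. The dictionary layout and the `compare` helper are illustrative.

```python
# Values copied from the results table.
results = {
    "semi-supervised": {"iterations": 7, "time_s": 1.5, "accuracy": 92},
    "k-means": {"iterations": 27, "time_s": 5.3, "accuracy": 74},
    "fcm": {"iterations": 30, "time_s": 6.5, "accuracy": 78},
}


def compare(baseline, reference="semi-supervised"):
    """Ratios of a baseline method to the semi-supervised method."""
    b, s = results[baseline], results[reference]
    return {
        "iteration_ratio": round(b["iterations"] / s["iterations"], 2),
        "time_ratio": round(b["time_s"] / s["time_s"], 2),
        "accuracy_gain_points": s["accuracy"] - b["accuracy"],
    }


print(compare("k-means"))
print(compare("fcm"))
```

On the reported numbers, K-means needs about 3.9 times as many iterations and 3.5 times as much time, and the semi-supervised method is 18 percentage points more accurate. Against FCM the ratios are about 4.3 for both iterations and time, with a gain of 14 points.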
Experimental Comparison: Experiment Results
Semi-supervised clustering greatly
reduces the number of iterations and
improves segmentation accuracy
Experimental Comparison: Experiment Results
The weight of the labeled data, varied from 1 to 200, has a strong impact on convergence speed.
Experimental Comparison: Experiment Results
Semi-supervised clustering has good reliability.
The more clusters are labeled, the more consistent the results.
Experimental Comparison: Experiment Results
If only a few data are labeled, semi-supervised clustering is nearly equivalent to K-means.
In addition, the segmentation results are insensitive to noise.
Conclusion
Image segmentation experiments under different conditions show that region-based semi-supervised clustering can improve the accuracy and speed of segmentation.
Conclusion
Giving different weights to labeled and unlabeled data when computing the cluster centers, together with the penalty function, effectively increases the influence of the manual guides.
The segmentation may therefore be more in line with the user's requirements.
What is presented?
Introduction
Theoretical Analysis
Image Segmentation on Real Data
Experimental Comparison
Thank you for your interest!