Semi-Supervised Emotional Classification of Color Images By Learning From Cloud

Na Li, Yong Xia*
Shaanxi Key Lab of Speech & Image Information Processing (SAIIP)
School of Computer Science and Technology, Northwestern Polytechnical University
Xi’an, China
Email: [email protected]

Yuwei Xia
Computer Science College
Xi’an Polytechnic University
Xi’an, China
Email: [email protected]

Abstract—Classification of images based on the feelings they generate in their viewers is becoming increasingly popular. Due to the difficulty of gathering training data, this task is intrinsically a small-sample learning problem, and the results produced by most existing solutions are therefore not very accurate. In this paper, we propose the semi-supervised hierarchical classification (SSHC) algorithm for emotional classification of color images. We extract three groups of features for each classification task and use those features in a two-level classification model that is based on the support vector machine (SVM) and the Adaboost technique. To enlarge the training dataset, we employ each training image to retrieve similar images from the Internet cloud and jointly use the manually labeled small dataset and the retrieved large but unlabeled dataset to train a classifier via semi-supervised learning. We have evaluated the proposed algorithm against the fuzzy similarity-based emotional classification (FSBEC) algorithm and another supervised hierarchical classification algorithm that does not learn from online images in three bi-class classification tasks, namely “warm vs. cool”, “light vs. heavy” and “static vs. dynamic”. Our pilot results suggest that, by learning from the similar images archived in the Internet cloud, the proposed SSHC algorithm can produce more accurate emotional classification of color images.

Keywords—Image emotional classification; semi-supervised learning; content-based image retrieval (CBIR)

I. INTRODUCTION

With the rapid advance of information technologies, digital images have become mainstream, notably in HDTV, cell phones, personal cameras, and many Internet and biomedical applications. A lot of research effort has been devoted to analyzing and understanding the contents and semantics of those images. However, it was not until recently that the human emotions invoked by viewing a digital image began to draw research attention from the multimedia community. Nowadays, researchers are particularly interested in developing automated techniques to evaluate and classify images according to the feelings they may bring, since such techniques have potential and immediate applications in healthcare, such as filtering the multimedia contents viewed by people with mental illness or designing emotional human-computer interfaces and serious games for patients with neurodegenerative disorders.

Automated emotional classification of images generally consists of two major steps: feature extraction and classification. The first step aims to identify numerical descriptors that can objectively characterize the emotional information embedded in images. Since Soen et al. [1],[2] revealed the potential correlation between the average color and texture components of color images and 13 psychological image scales expressed in linguistic terms, many color and texture features have been attempted. Lee and Park [3] used six MPEG-7 visual descriptors, which were originally recommended for content-based image retrieval [4],[5], as image features to determine emotions. Inspired by psychology and art theory, Machajdik and Hanbury [6] developed a series of features to represent color, texture, composition and content, respectively. Wang et al. [7] proposed a set of semantic visual features, including figure-ground relationships, color patterns, shapes and compositions. Lee and Hsiao [8] extracted novel spatial-frequency features and applied them to image classification in a two-dimensional affective space. Low-level texture descriptors, such as the Wiccest features [9] and Gabor features [10], have also been applied to emotional valence categorization of images [11]-[13].

Besides discriminative features, an effective machine learning method also plays a critical role in bridging the semantic gap between images and emotional human perceptions [14]. In fact, many commonly used techniques, such as the support vector machine (SVM) [15], artificial neural network (ANN) [16], clustering algorithms [17],[18] and case-based reasoning [3], have been adopted for emotional classification of images [19]. Wang and Liu [20] simulated human emotions based on the images used in Furnham’s shape-color test [21] and adopted the back-propagation neural network (BPNN) to emotionally group images into two categories: “like” and “dislike”. Zhao et al. [22] extended this work by adopting the fuzzy SVM algorithm to achieve image classification in an expanded emotion space, which consists of “pleasure”, “arousal”, and “dominance”. Wang et al. [23] identified an orthogonal three-dimensional emotional factor space of images through 12 pairs of emotional words and employed support vector regression to automatically annotate the emotional semantics in images. To improve the performance of human-computer interfaces in terms of emotion, intuition and inspiration, Cho [24] proposed the interactive genetic algorithm (IGA) technique, which carries out optimization with human evaluation and with which the user can obtain what he has in mind through repeated interaction, and applied it to the problem of image retrieval by emotional semantics. Li et al. [25] developed an emotion-based mapping scheme using the analytical hierarchy process (AHP) on the basis of the HSI color space, which provides a systematic way to evaluate the fitness of emotional labels for interpreting the emotional messages conveyed by a color image. Huang [26] analyzed the hierarchical characteristics of affective features and proposed the affective features hierarchical model (AFHM) to classify affective features into three groups for information retrieval.

In spite of these achievements, automated emotional classification of color images remains an intriguing challenge. The difficulties can be largely ascribed to the fact that the feelings generated by digital images are highly subjective and can hardly be modeled unless a huge amount of training data is available. However, due to the time-consuming nature of collecting and annotating images, emotional image classification is intrinsically a small-sample learning problem. Nevertheless, with the explosive increase of multimedia contents on the Internet, a myriad of images is now stored in the Internet cloud. For instance, Yahoo’s Flickr hosts over 8 billion photos on its servers, Facebook has an estimated 140 billion photos, with users uploading 6 billion photos per month, and the Chinese Internet giant Tencent claims that its Qzone social network hosts 150 billion photos. It is widely acknowledged that, when you would like to share an image with your friends, you can easily find similar images, even of the same scene in the case of landmark images, by using sophisticated image retrieval techniques [25]. Since those similar images retrieved from the cloud can be used to largely enhance the efficiency of image coding and inpainting [25], we believe they also open up great opportunities for improving the accuracy of emotional image classification.

Motivated by this idea and the work done by Lee and Park [3], we propose a semi-supervised hierarchical classification (SSHC) algorithm for emotional classification of color images. In this paper, we define the emotional space using three pairs of adjectives with opposite meanings: “warm vs. cool”, “light vs. heavy” and “static vs. dynamic”. We extract three types of color features and three types of texture features to represent the emotional semantics in each image, and then apply three of them to each bi-class emotional image classification task. The major contributions of this paper are twofold. First, to use different types of features in a combined and efficient way, we construct a hierarchical classification model based on the SVM and Adaboost [27] techniques. Second, to alleviate the small-sample-size problem, we retrieve images similar to each training sample from the archives in the cloud and jointly use the manually labeled images and the retrieved unlabeled images to train a classifier in a semi-supervised way. We have evaluated the proposed SSHC algorithm against the fuzzy similarity-based emotional classification (FSBEC) algorithm [3] and supervised hierarchical classification (SHC) without learning from the cloud on color images. Our pilot results suggest that supplementing the training image set with similar images retrieved from the cloud does improve the performance of emotional classification of images.

II. METHOD

A. Overview

The proposed SSHC algorithm consists of three major steps: (1) extracting image features, (2) retrieving a large number of unlabeled image samples from the cloud, and (3) learning image classifiers in a semi-supervised way using both labeled and unlabeled images. The detailed diagram of this algorithm is shown in Fig. 1.

B. Feature Extraction

We adopt both color and texture features for the different image classification tasks [28]. Three types of color features, namely the color layout descriptor (CLD), scalable color descriptor (SCD) and color structure descriptor (CSD), are used either to separate “warm” images from “cool” ones or to differentiate “light” images from “heavy” ones. Three types of texture features, namely the edge histogram descriptor (EHD), histogram of oriented gradients (HOG) and edge direction descriptor (EDD), are employed to distinguish “static” images from “dynamic” ones. The first four features are part of the MPEG-7 visual descriptors, which have been widely used in image retrieval [29],[30] and classification [31],[32]. The other two texture features, i.e. the HOG and EDD, are also commonly used in image analysis and interpretation [33]. The details of these features are briefly described as follows.

• CLD is based on the discrete cosine transform (DCT) and can be calculated in three steps. First, an image is partitioned into 64 sub-images and the dominant color in each sub-image is extracted. Second, 64 DCT coefficients are calculated by applying the DCT to the extracted dominant colors. Third, a 12-dimensional CLD vector is obtained via quantization of the first 12 zigzag-scanned coefficients.

• SCD provides a histogram of the colors in the HSV color space, encoded by a Haar transform. Its binary representation is scalable with respect to the number of bins and the bit representation accuracy. We use 64 bins and 4 bits, and thus get a 64-dimensional SCD.

• CSD describes both the colors present in an image and their spatial structure, and is hence able to distinguish between two images with identical color histograms but with colors located in different areas of the image. It is generated by moving a structuring element around the image and counting the presence of each color within the element. In this study, a 32-dimensional CSD is employed.

• EHD describes the local edge distribution in an image. We partition each image into 16 sub-images and use five bins to extract, in each sub-image, the histogram of non-directional edges and edges along four directions, i.e. horizontal, 45-degree, vertical and 135-degree. As a result, we have an 80-dimensional EHD.

• HOG is the histogram of oriented gradients, which can be computed in four steps: (1) calculating the gradients of the image on a pixel-by-pixel basis, (2) dividing the image into small spatial regions, namely cells, and accumulating a local histogram of gradient directions in each cell, (3) grouping cells into blocks and normalizing the histograms of all cells in each block, and (4) collecting the features of all blocks [34]. In this study, each block contains 2 × 2 cells, each cell contains 8 × 8 pixels, the step length is 8 pixels, and gradients are counted along 9 directions. Consequently, we have a 34596-dimensional HOG feature for each 256 × 256 image.

Fig. 1: Diagram of proposed algorithm

• EDD gives the global histogram of the directions of gradients and can be computed in three steps: (1) performing edge detection and labelling edge pixels, (2) calculating the gradient of each edge pixel, and (3) producing a 4-dimensional EDD by counting the number of edge pixels whose gradient direction is horizontal, 45-degree, vertical and 135-degree, respectively.

As a result, each image is described by a 108-dimensional feature vector in the “warm vs. cool” and “light vs. heavy” classification tasks and by a 34680-dimensional feature vector in the “static vs. dynamic” classification task.
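These dimensionalities can be verified with a few lines of arithmetic. The Python snippet below is only a sanity check of the feature-vector sizes quoted above; it does not implement the descriptors themselves, and the variable names are ours.

```python
# Sanity check of the feature dimensions used in this section (sketch only;
# the MPEG-7 and gradient descriptors themselves are not implemented here).
cld, scd, csd = 12, 64, 32      # color layout / scalable color / color structure
ehd, edd = 80, 4                # edge histogram (16 sub-images x 5 bins) / edge direction

# HOG on a 256 x 256 image: 8 x 8-pixel cells, 2 x 2-cell blocks with an
# 8-pixel (one-cell) step, and 9 gradient directions per cell.
cells_per_side = 256 // 8                   # 32 cells along each side
blocks_per_side = cells_per_side - 1        # 31 block positions along each side
hog = blocks_per_side ** 2 * (2 * 2 * 9)    # 961 blocks x 36 values = 34596

print(cld + scd + csd)   # 108   -> "warm vs. cool" and "light vs. heavy" tasks
print(ehd + hog + edd)   # 34680 -> "static vs. dynamic" task
```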

C. Hierarchical Classifier

We use three groups of color or texture features, whose dimensionality ranges from 4 to 34596, to solve each of the three bi-class emotional image classification problems. The proposed hierarchical classifier consists of two levels. At the lower level, we train an SVM as a classifier on each group of features and use the Adaboost algorithm to enhance the classification performance [27]. The core idea of Adaboost is to train different weak classifiers using data sampled from the same training set and to combine these weak classifiers into a strong one. Let the training dataset be denoted by $D = \{(x_i, y_i);\ i = 1, 2, \cdots, N_T\}$, where $x_i \in \mathbb{R}^D$ is the $i$-th sample's $D$-dimensional feature vector and $y_i \in \{0, 1\}$ is the corresponding class label. In the $k$-th iteration, we first sample $N_S$ training data to form a subset $D_S$ according to the distribution over training samples $\{w_k(i);\ i = 1, 2, \cdots, N_T\}$, which is initialized as a uniform distribution. Then, we train an SVM on the subset $D_S$ as a weak classifier $h_k(x)$. Next, we test the classifier $h_k(x)$ on the entire training dataset $D$ and obtain an error rate $\varepsilon_k$. If $\varepsilon_k > 0.5$, the weak classifier is unqualified and we repeat the data sampling and SVM training until a qualified $h_k(x)$ is obtained. The weight of this qualified weak classifier is defined as

$$\alpha_k = \frac{1}{2}\ln\left(\frac{1 - \varepsilon_k}{\varepsilon_k}\right), \qquad (1)$$

which ensures that a classifier with a lower error rate receives a higher weight. Before performing the next round of sampling and training, we adjust the distribution over training samples as follows, so that misclassified samples have a higher probability of being sampled:

$$w_{k+1}(i) = \frac{w_k(i)}{z_k} \times \begin{cases} e^{-\alpha_k}, & \text{if } h_k(x_i) = y_i \\ e^{\alpha_k}, & \text{if } h_k(x_i) \neq y_i \end{cases} \qquad (2)$$

where $z_k$ is a normalization constant that ensures $\sum_i w_{k+1}(i) = 1$. Thus, the next classifier will focus mainly on misclassified samples. After performing the above process $T$ times, we obtain $T$ weak classifiers and combine them into a strong one as follows:

$$H(x) = \operatorname{sign}\left(\sum_{k=1}^{T} \alpha_k h_k(x)\right). \qquad (3)$$

For the $g$-th feature group, we can thus generate a classifier $H_g(x)$. At the higher level, we combine those three classifiers into an even stronger one by applying the Adaboost algorithm again:

$$\tilde{H}(x) = \operatorname{sign}\left(\sum_{g=1}^{3} \tilde{\alpha}_g H_g(x)\right), \qquad (4)$$

where the weight $\tilde{\alpha}_g$ is calculated from the error rate of the classifier $H_g(x)$, similarly to the definition given in Eq. (1). We then apply the classifier $\tilde{H}(x)$ to each bi-class classification task. This hierarchical classification process is illustrated in Fig. 2.
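To make the lower-level boosting loop concrete, the following Python sketch trains SVM weak classifiers on weighted resamples of the training set and combines them according to Eqs. (1)-(3). It is a minimal sketch under our own assumptions, not the authors' implementation: the scikit-learn SVC learner, the number of rounds T, and the use of the weighted training error as the error rate (the paper only states that an error rate is computed on the whole training set) are illustrative choices.

```python
# Minimal sketch of the lower-level Adaboost-of-SVMs loop (Eqs. (1)-(3)).
# Assumptions: scikit-learn SVC weak learners, weighted training error as eps_k,
# labels y in {0, 1}; parameter values are illustrative only.
import numpy as np
from sklearn.svm import SVC

def boost_svm(X, y, T=10, n_samples=None, seed=0):
    """Train T qualified SVM weak classifiers h_k on resamples of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_samples = n_samples or n
    w = np.full(n, 1.0 / n)                          # w_1(i): uniform distribution
    classifiers, alphas = [], []
    while len(classifiers) < T:
        idx = rng.choice(n, size=n_samples, p=w)     # sample the subset D_S from w_k
        h = SVC(kernel="rbf").fit(X[idx], y[idx])    # weak classifier h_k
        pred = h.predict(X)                          # test h_k on the whole set D
        eps = float(np.sum(w[pred != y]))            # (weighted) error rate eps_k
        if eps > 0.5:                                # unqualified: resample and retrain
            continue
        alpha = 0.5 * np.log((1.0 - eps) / max(eps, 1e-12))          # Eq. (1)
        w = w * np.where(pred == y, np.exp(-alpha), np.exp(alpha))   # Eq. (2)
        w = w / w.sum()                              # normalize by z_k
        classifiers.append(h)
        alphas.append(alpha)
    return classifiers, np.array(alphas)

def strong_predict(classifiers, alphas, X):
    """H(x) = sign(sum_k alpha_k h_k(x)), Eq. (3), with votes mapped to {-1, +1}."""
    votes = np.array([2 * h.predict(X) - 1 for h in classifiers])
    return (np.sign(alphas @ votes) > 0).astype(int)  # back to {0, 1}
```

The higher level of Eq. (4) can reuse the same weighting rule, treating the three group classifiers H_g(x) as the classifiers to be combined.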


Fig. 2: Flow chart of hierarchical classification process

D. Semi-Supervised Classification

To train an accurate classifier, we need as many image samples as possible. Since there is an astounding number of images available in the Internet cloud, there must be some images that are similar to each training sample and can facilitate the training process. For instance, one training image and the similar images retrieved by the content-based image retrieval (CBIR) technology developed by Baidu [35] are shown in Fig. 3.

In this study, we input each training image sample into the Baidu CBIR system and then download the first five images the system retrieves. Thus we obtain another training image set which is five times larger than the original one. Since the Baidu CBIR system is not designed specifically for emotional classification, the retrieved images are similar to the training samples in appearance, but may not necessarily convey the same emotional semantics. Hence, we treat all retrieved images in the larger set as unlabeled.

To jointly use both labeled and unlabeled images to solve each bi-class image classification problem, we construct a semi-supervised classifier in three steps. First, we extract features from the small, manually labeled training dataset and use those features to train a hierarchical classifier in a supervised way. Then, we apply this classifier to the same features extracted from each retrieved image and predict the class label of that image, thereby converting the retrieved images from unlabeled samples to labeled ones. Finally, we use the retrieved images with predicted labels together with the originally labeled images to retrain the hierarchical classifier, which yields the final classifier we use for emotional classification of color images.
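These three steps amount to a single round of self-training. The sketch below illustrates the idea under our own assumptions: the hierarchical classifier is abstracted as a fit_hierarchical callable that returns an object with a predict() method, and X_labeled, y_labeled and X_retrieved are pre-computed feature matrices; none of these names come from the paper.

```python
# Minimal self-training sketch of the three-step semi-supervised procedure.
# `fit_hierarchical` stands in for the two-level SVM/Adaboost training routine;
# X_labeled, y_labeled, X_retrieved are pre-computed feature matrices (assumed).
import numpy as np

def semi_supervised_train(fit_hierarchical, X_labeled, y_labeled, X_retrieved):
    # Step 1: train a hierarchical classifier on the small labeled set.
    clf = fit_hierarchical(X_labeled, y_labeled)
    # Step 2: predict pseudo-labels for the images retrieved from the cloud.
    y_pseudo = clf.predict(X_retrieved)
    # Step 3: retrain on the union of labeled and pseudo-labeled samples.
    X_all = np.vstack([X_labeled, X_retrieved])
    y_all = np.concatenate([y_labeled, y_pseudo])
    return fit_hierarchical(X_all, y_all)
```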

III. RESULTS

We have applied the proposed SSHC algorithm to three emotional image classification tasks: differentiation between “static” images and “dynamic” ones, between “light” images and “heavy” ones, and between “warm” images and “cool” ones. We collected 592 images with diversified contents, including human bodies, faces, buildings, natural scenes, indoor scenes, flowers, sports and painting galleries. These images were labeled by 10 persons based on majority voting. As a result, 96 images were labeled as “static”, 96 images as “dynamic”, 100 images as “light”, 100 images as “heavy”, 100 images as “warm” and 100 images as “cool”. Some training samples are shown in Fig. 4.

We evaluated our algorithm against the FSBEC algorithm [3] and the SHC algorithm using two-fold cross-validation. The FSBEC algorithm is based on rough approximation and fuzzy inter- and intra-similarities, whereas the SHC algorithm is almost identical to the proposed algorithm except that it does not learn from the cloud. The classification accuracy was measured as the percentage of correctly classified testing samples.
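For reference, this evaluation protocol can be written down directly; the sketch below assumes a train_fn callable that returns a classifier with a predict() method and a shuffling seed, neither of which is specified in the paper.

```python
# Sketch of the two-fold cross-validation protocol used for evaluation.
# `train_fn(X, y)` is assumed to return a classifier with a predict() method.
import numpy as np
from sklearn.model_selection import KFold

def two_fold_accuracy(X, y, train_fn, seed=0):
    accuracies = []
    for train_idx, test_idx in KFold(n_splits=2, shuffle=True, random_state=seed).split(X):
        clf = train_fn(X[train_idx], y[train_idx])
        correct = clf.predict(X[test_idx]) == y[test_idx]
        accuracies.append(np.mean(correct))   # fraction of correctly classified samples
    return float(np.mean(accuracies))         # average over the two folds
```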

Fig. 5 depicts the accuracy of the three algorithms in solving each bi-class emotional image classification problem, together with the average accuracy over the three experiments. It reveals that the proposed algorithm outperforms the state-of-the-art FSBEC algorithm and the SHC algorithm in all three classification tasks. In particular, compared to the SHC algorithm, which uses manually labeled training samples only, the proposed algorithm achieves substantially improved accuracy. This demonstrates that we can improve the performance of emotional image classification by learning from images retrieved from archives in the cloud, even though the information provided by those online images carries considerable uncertainty.

IV. DISCUSSIONS

Table I gives the computational time cost of the SHC and SSHC algorithms (Intel Core i5-4460 3.2 GHz CPU, 8 GB memory and Matlab R2014a). Applying the proposed SSHC algorithm to emotional classification of color images consists of two phases: offline training and online testing. It is clear from Table I that the training phase of the proposed approach is quite time-consuming, since it retrieves a large number of images from the cloud and uses them to train the classifier. Nevertheless, the testing phase of both methods is very efficient, costing around 0.2 seconds to classify each image. Thus one of the advantages of the proposed SSHC algorithm is that it can be applied to real-time emotional classification of color images, whereas traditional classification approaches based on case-based learning are generally inefficient [3].


Fig. 3: One training image (left) and similar images obtained by the Baidu CBIR (right)

TABLE I: AVERAGE TIME COST (IN SECONDS) IN THREE EMOTIONAL IMAGE CLASSIFICATION TASKS
(the first five columns are offline training steps; the last column is online testing)

| Method   | Feature Extraction | Retrieving Images from Cloud | Supervised Training of Classifier | Classification of Retrieved Images | Training Ultimate Classifier | Testing Each Image |
| SHC      | 90.39  | N/A         | 1.25  | N/A  | N/A   | 0.1232 |
| Proposed | 543.11 | Not counted | 119.6 | 6.92 | 237.9 | 0.2036 |

V. CONCLUSION

In this paper, we report our pilot study on emotional classification of color images using a large number of images retrieved from the cloud and a semi-supervised learning technique. Generally, due to the time-consuming nature of manual image annotation, emotional classification of images is a small-sample learning problem with intrinsic difficulties. In the proposed SSHC algorithm, the classifier is trained by learning not only from manually labeled images but also from images retrieved from the cloud. Although those online images are unlabeled and carry considerable uncertainty, our results suggest that, with a much larger set of unlabeled training images obtained online, we can substantially improve the accuracy of this image classification task by using an appropriately designed semi-supervised learning technique. This work may inspire innovative solutions to other image classification and understanding applications based on learning from the cloud.

In the meantime, this pilot work will be further improved by a series of studies, including investigating image features that are more suitable for each classification task, developing a web crawler that can automatically download many more similar images, studying mechanisms to exclude semantically less similar online images from the training process, and exploring deep models for learning from a tremendous number of online images.

ACKNOWLEDGMENT

This work was supported in part by the National Natural Science Foundation of China under Grant 61471297, in part by the Natural Science Foundation of Shaanxi Province, China, under Grant 2015JM6287, in part by the Fundamental Research Funds for the Central Universities under Grant 3102014JSJ0006, and in part by the Returned Overseas Scholar Project of Shaanxi Province, China.

REFERENCES

[1] T. Soen, T. Shimada, and M. Akita, “Objective evaluation of color design,” Color Research and Application, vol. 12, pp. 187-195, 1987.

[2] N. Kawamto and T. Soen, “Objective evaluation of color design II,” Color Research and Application, vol. 18, pp. 260-266, 1993.

[3] J. Lee and E. Park, “Fuzzy Similarity-Based Emotional Classification of Color Images,” IEEE Transactions on Multimedia, vol. 13, no. 5, pp. 1031-1039, 2011.

[4] A. Yamada, M. Pickering, S. Jeannin, L. Cieplinski, J. R. Ohm, and M. Kim, MPEG-7 Visual Part of Experimentation Model Version 10.0, ISO/IEC JTC1/SC29/WG11/N4063, 2001.

[5] L. Cieplinski, M. Kim, J. R. Ohm, M. Pickering, and A. Yamada, Text of ISO/IEC 15938-3/FCD Information Technology Multimedia Content Description Interface Part 3 Visual, ISO/IEC JTC1/SC29/WG11/N4062, 2001.


Fig. 4: Some training image samples from the classes “static” (1st row), “dynamic” (2nd row), “light” (3rd row), “heavy” (4th row), “warm” (5th row) and “cool” (6th row)

Fig. 5: Classification accuracy of the FSBEC, SHC and proposed SSHC algorithms in three emotional image classification tasks


[6] J. Machajdik and A. Hanbury, “Affective Image Classification using Features Inspired by Psychology and Art Theory,” Proceedings of the International ACM Conference on Multimedia, pp. 83-92, 2010.

[7] X. Wang, J. Jia, and J. Yin, “Interpretable aesthetic features for affective image classification,” Proceedings of the 20th IEEE International Conference on Image Processing, pp. 3230-3234, Sep. 2013.

[8] P. M. Lee and T. C. Hsiao, “Applying LCS to affective image classification in spatial-frequency domain,” Proceedings of the IEEE Congress on Evolutionary Computation, pp. 1690-1697, July 2014.

[9] J. M. Geusebroek, “Compact object descriptors from local colour invariant histograms,” Proceedings of the British Machine Vision Conference, pp. 1029-1038, Sep. 2006.

[10] A. C. Bovik, M. Clark, and W. S. Geisler, “Multichannel texture analysis using localized spatial filters,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 55-73, 1990.

[11] V. Yanulevskaya, J. C. van Gemert, and K. Roth, “Emotional valence categorization using holistic image features,” Proceedings of the 15th IEEE International Conference on Image Processing, pp. 101-104, Oct. 2008.

[12] W. N. Wang, Y. L. Yu, and J. C. Zhang, “Image emotional classification: static vs. dynamic,” Proceedings of the 2004 IEEE International Conference on Systems, Man and Cybernetics, pp. 6407-6411, Oct. 2004.

[13] M. Tkalcic, A. Odic, A. Kosir, and J. Tasic, “Affective Labeling in a Content-Based Recommender System for Images,” IEEE Transactions on Multimedia, vol. 15, no. 2, pp. 391-400, 2013.

[14] J. H. Tang, Z. J. Zha, D. Tao, and T. S. Chua, “Semantic-Gap-Oriented Active Learning for Multilabel Image Annotation,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2354-2360, 2012.

[15] Z. Liu and X. Yu, “Identification of Image Emotional Semantic Based on Feature Fusion,” Proceedings of the 2012 International Conference on Computer Science & Service System, pp. 1802-1806, Aug. 2012.

[16] H. F. Li and Q. Jin, “Research of Image Affective Semantic Rules Based on Neural Network,” Proceedings of the 2008 International Seminar on Future BioMedical Information Engineering, pp. 148-151, Dec. 2008.

[17] W. N. Wang and Y. L. Yu, “Image emotional semantic query based on color semantic description,” Proceedings of the 2005 International Conference on Machine Learning and Cybernetics, pp. 4571-4576, Aug. 2005.

[18] H. F. Li, X. Wen, and H. Jin, “The Clustering Algorithm Research of Image Emotional Characteristics Based on Ant Colony,” Proceedings of the 2008 International Symposium on Intelligent Information Technology Application Workshops, pp. 455-458, Dec. 2008.

[19] W. N. Wang and Q. H. He, “A survey on emotional semantic image retrieval,” Proceedings of the 15th IEEE International Conference on Image Processing, pp. 117-120, Oct. 2008.

[20] J. Wang and B. Liu, “A Study on Emotion Classification of Image Based on BP Neural Network,” Proceedings of the 2010 International Conference of Information Science and Management Engineering, pp. 100-104, Aug. 2010.

[21] J. R. Smith and S. F. Chang, “A fully automated content-based image query system,” Proceedings of ACM Multimedia 96, 1996.

[22] J. Zhao, H. Lu, Y. Li, and J. Chen, “A Kind of Fuzzy Decision Tree Based on the Image Emotion Classification,” Proceedings of the 2012 International Conference on Computing, Measurement, Control and Sensor Network, pp. 167-170, July 2012.

[23] W. N. Wang, Y. L. Yu, and S. M. Jiang, “Image Retrieval by Emotional Semantics: A Study of Emotional Space and Feature Extraction,” Proceedings of the 2006 IEEE International Conference on Systems, Man and Cybernetics, pp. 3534-3539, Oct. 2006.

[24] S.-B. Cho, “Emotional image and musical information retrieval with interactive genetic algorithm,” Proceedings of the IEEE, vol. 92, no. 4, pp. 702-711, 2004.

[25] H. F. Li, J. Li, and J. C. Song, “Extracting Emotional Semantics from Color Image using Analytical Hierarchy Process,” Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 612-618, Nov. 2007.

[26] K. Huang, “Colorful Natural Scenes Retrieval based on Affective Features Hierarchical Model,” Proceedings of the 2008 International Symposium on Information Science and Engineering, pp. 183-187, Dec. 2008.

[27] X. Zhang and F. Ren, “Improving SVM Learning Accuracy with Adaboost,” Proceedings of the 4th International Conference on Natural Computation, vol. 3, pp. 221-225, Oct. 2008.

[28] M. Bleschke, R. Madonski, and R. Rudnicki, “Image retrieval system based on combined MPEG-7 texture and colour descriptors,” Proceedings of the 16th International Conference on Mixed Design of Integrated Circuits & Systems, pp. 635-639, June 2009.

[29] R. Dorairaj and K. R. Namuduri, “Compact combination of MPEG-7 color and texture descriptors for image retrieval,” Proceedings of the 2004 Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, pp. 387-391, Nov. 2004.

[30] Y. Zhao, B. Wang, S. Ye, Y. Zheng, and L. Shi, “Image retrieval based on improved MCD and EHD of MPEG-7,” Proceedings of the 2009 International Conference on Test and Measurement, pp. 127-132, Dec. 2009.

[31] H. Chen, Z. Gao, G. Lu, and S. Li, “A Novel Support Vector Machine Fuzzy Network for Image Classification Using MPEG-7 Visual Descriptors,” Proceedings of the 2008 International Conference on MultiMedia and Information Technology, pp. 365-368, Dec. 2008.

[32] G. J. Tian, H. Fu, and D. D. Feng, “Automatic medical image categorization and annotation using LBP and MPEG-7 edge histograms,” Proceedings of the 2008 International Conference on Information Technology and Applications in Biomedicine, pp. 51-53, May 2008.

[33] T. H. N. Le, K. Luu, and M. Savvides, “SparCLeS: Dynamic L1 Sparse Classifiers With Level Sets for Robust Beard/Moustache Detection and Segmentation,” IEEE Transactions on Image Processing, vol. 22, pp. 3097-3107, Aug. 2013.

[34] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” Proceedings of the 2005 IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886-893, June 2005.

[35] Baidu CBIR system, Baidu Institute of Deep Learning, available at: http://idl.baidu.com/IDL-news-2.html.
