Deep Learning Framework based on Word2Vec and
CNN for Users Interests Classification
Abubakr H. Ombabi Computer Sciences.
Sudan University of
Science and Technology.
Khartoum, Sudan.
Onsa Lazzez REGIM-Lab.
University of Sfax.
National School of Engineers.
Sfax, Tunisia.
Wael Ouarda REGIM-Lab.
University of Sfax.
National School of Engineers.
Sfax, Tunisia.
Adel M. Alimi REGIM-Lab.
University of Sfax.
National School of Engineers.
Sfax, Tunisia.
Abstract— Social media has given internet users a venue for
sharing and expressing their interests and opinions on different
aspects of life. Every day, millions of users generate a huge volume
of reviews and comments on social media that reflect their opinions
on different issues. Analyzing these opinions manually is very hard,
which motivates opinion analysis, the task of computationally
analyzing opinions expressed in social data. However, few works
have considered both sentiment analysis and classification to
determine users’ topics of interest. In this study, an approach that
combines sentiment analysis and classification is proposed. The
main objective of this work is to design an effective method that
summarizes the interests of Twitter users from their social textual
data over five categories: sports, travel, fashion, food and religion.
This allows us to discover the topics in which users are interested.
Inspired by the successes of deep learning, the proposed system
uses pre-trained Word2Vec for text pre-processing and to obtain
vector representations of words, which are the input to a
Convolutional Neural Network (CNN) architecture for deep feature
extraction. Rectified Linear Unit (ReLU) activation and Dropout
were applied to improve accuracy. A Support Vector Machine (SVM)
classifier was used to predict the final classification. The system
was implemented with TensorFlow running on Python 2.7.12. It was
tested and validated on different publicly available corpora of
reviews and comments from Twitter. The proposed system achieved a
best accuracy of 97.3% for users’ interests classification.
Keywords—Sentiment Analysis; Word2Vec; CNN; Twitter.
I. INTRODUCTION
Recently, social networks (SN) such as Twitter, Google Plus and Facebook have become popular channels for communicating and expressing diverse attitudes and opinions. A vast volume of reviews and comments is generated on social networks. These social data reflect opinions and sentiments on issues relevant to applications such as recommender systems, government, media and other activities. This huge body of content has drawn researchers to social network analysis (SNA) as a way to obtain valuable information from SN users. Many recent studies have proposed semi-automated or automated approaches that help analyze and manage these huge amounts of social data, since analyzing them manually is costly.
Twitter is a widely used social network on which users post tweets about specific areas and events. More than 316 million active users generate more than 500 million tweets every day. These tweets can be used to discover users’ topics of interest. Diverse work has been done in the field of SNA, notably classification of personal attributes and sentiment analysis of Twitter users based on their tweets. A user who is interested in a topic or event tends to tweet about it positively or negatively, according to his or her sentiment or opinion. Sentiment analysis of these tweets is therefore necessary to obtain the user’s positive or negative opinion, while classification is used to determine which topic a particular tweet corresponds to, and thus to discover the user’s topic of interest. Sentiment analysis is the task of computationally obtaining and categorizing opinions expressed in textual or visual information in order to determine whether the writer’s opinion towards a specific service, topic, product, etc. is negative, positive or neutral [2]. Various techniques can be applied to predict users’ sentiments. As Duwairi [3] has confirmed, sentiment analysis requires implementing a set of algorithms to identify and exploit emotions and opinions on social network platforms. According to [4], sentiment analysis based on text classification relies essentially on NLP, machine learning, statistical and linguistic knowledge, and text mining methods to obtain subjective information from textual data. Sentiment analysis (SA) is used to determine emotions and orientations from large data to assist in making predictions [5].
In this study we focus on Twitter, a widely used social network on which users provide huge amounts of textual data (tags, reviews). We propose an approach for analyzing users’ interests based on their sentiments (positive/negative/neutral) and the topic to which their tweets are related, in order to obtain the correct positive or negative users’ interests. Discovering users’ sentiments from the social textual data they provide depends on recognizing deep features that can be extracted from these data. For this reason, we applied a well-known feed-forward CNN architecture.
This paper is organized as follows: Section 2 describes related work. Section 3 describes our method for analyzing tweets to discover users’ interests based on users’ sentiments. Section 4 presents the implementation details and the results obtained by our approach. Conclusions and future work are given in Section 5.
II. LITERATURE REVIEW
According to [5], sentiment analysis studies are classified
into supervised and unsupervised according to the approach
applied. In this work we propose a system that uses a supervised
approach to classify users’ sentiments and opinions on
particular topics of interest from user-generated tweets.
In the supervised (corpus-based) approach, machine learning
(ML) classifiers such as Support Vector Machine (SVM), Naïve
Bayes (NB), K-Nearest Neighbor (KNN) and Decision Tree
(D-Tree) are applied to a pre-annotated dataset split into a
training set and a testing set [5]. The classifiers are first
trained on the training data to build a model, which is then
used to classify the testing data. For SA, this approach has
achieved much higher accuracy than other approaches. However,
it involves creating and labeling large datasets manually,
which is a very difficult and time-consuming process even for
experts [7]. In addition, the resulting model may be
domain-biased.
On the other hand, in the unsupervised (lexicon-based)
approach, a dictionary is used to predict the semantic polarity
of words and sentences. Every word is assigned a polarity
(strength) value; for instance, a range from +1 to +5 may be
used for positive polarities, in which a word with value +5 is
much more positive than a word with polarity +1. The lexicon
can be built manually or automatically [5]. In the automatic
approach, a list of seed words is constructed, and the lexicon
is then expanded by applying word similarity measures. The
total polarity of a sentence is obtained by looking up the
polarity score of each word in the dictionary and adding these
scores into one score that gives the sentiment of the entire
text. This approach does not cope well with different domains,
and its accuracy is lower than that of the supervised
approach [8].
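As a minimal sketch of the lexicon-based scoring described above, the following Python snippet sums per-word polarity values into one sentence score. The lexicon entries, the function names and the rule that maps the sign of the score to a label are illustrative assumptions, not taken from any of the cited works.

```python
# Hypothetical toy lexicon (assumption): positive words get +1..+5,
# negative words get -1..-5. Real lexicons are far larger.
LEXICON = {"great": 4, "good": 2, "poor": -2, "terrible": -5}


def sentence_polarity(sentence, lexicon=LEXICON):
    """Sum the polarity of every known word; unknown words count as 0."""
    tokens = sentence.lower().split()
    return sum(lexicon.get(token, 0) for token in tokens)


def polarity_label(score):
    """Map the total score to a label by its sign (illustrative rule)."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


score = sentence_polarity("The food was great but the service was poor")
print(score, polarity_label(score))
```

Because the score depends only on dictionary lookups, a word that is missing from the lexicon, or whose polarity changes with the domain, contributes nothing or the wrong value, which is one way to see the domain sensitivity noted above.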
Recently, deep learning has been explored for natural language
processing (NLP) tasks [9], particularly for representing
textual data at the word, sentence or document level. In
sentiment analysis and text classification, many researchers
have proposed deep learning models to gain better performance.
A Convolutional Neural Network (CNN) is a kind of feed-forward
neural network. The basic CNN consists of convolution, pooling
and fully connected layers, together with relevance weights.
Compared to other deep network approaches, a CNN requires less
training data and is easier to train. Moreover, a CNN has fewer
parameters and connections [10]. In sentiment analysis and
opinion mining, CNNs have proved efficient and have delivered
strong performance on textual data [9]. Unlike other neural
networks such as RNNs, only the corpus as a whole needs to be
annotated manually. A key enabling factor of CNNs is that
convolutional filters automatically capture and learn features
suited to a particular task. A CNN performs feature extraction
by applying the convolution operation and is able to learn
local features automatically, thus reducing manual effort.
Because the same weights are applied to all neurons on the same
feature map, the network can learn in parallel [11]. In machine
learning, a single deep learning algorithm does not always
obtain the best results; combining a deep learning algorithm
with other pre-trained methods can therefore obtain higher
accuracy.
A. Word2vec
To apply machine learning algorithms to any NLP task, text
must first be transformed into a corresponding vector
representation. There are two vectorization approaches for
this. The first is the one-hot representation, in which a very
long vector is used to represent each word; the vector length
equals the size of the dictionary built from the corpus, and
only the weights 1 and 0 are used. According to [10], with
one-hot representations it is not easy to define the
relationship between words from the word vectors alone. The
other approach is the distributed representation, which has
recorded the best performance in the deep learning field. This
method maps each word to a fixed-length vector; together these
vectors form the vector space [12]. A word vector is a
low-dimensional vector representation that encodes semantic
features of a word, learned by unsupervised neural network
models on a very large text corpus. Word2vec is a neural
network used to process text before it is passed to deep
learning algorithms [13]. It takes a text corpus as input and
generates word vectors as output. The vector representation of
words is obtained after word2vec builds a vocabulary from the
training corpus. The resulting word vector file can be used as
features for deep learning algorithms. In this algorithm, the
words of a sentence are initially represented as a word matrix,
which is then transformed into vectors in an n-dimensional
vector space. Similar words are represented near each other in
the vector space [4]. Moreover, with Word2vec, features can be
obtained without human intervention. Word2Vec can also perform
effectively even when its input is an individual word. With
this tool, very accurate predictions about a word’s meaning can
be obtained and the semantic relationship between words can be
easily evaluated.
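The difference between the two representations can be illustrated with a small Python sketch. The three-word vocabulary and the 3-dimensional dense vectors below are hand-picked assumptions for illustration only; real word2vec vectors are learned from a corpus and have far more dimensions.

```python
import math

# Hypothetical three-word vocabulary (assumption).
vocab = ["football", "soccer", "pizza"]


def one_hot(word):
    """One-hot vector: length equals the vocabulary size, a single 1."""
    return [1.0 if w == word else 0.0 for w in vocab]


# Hand-made dense vectors standing in for learned embeddings (assumption).
dense = {
    "football": [0.9, 0.1, 0.0],
    "soccer": [0.8, 0.2, 0.1],
    "pizza": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Distinct one-hot vectors are always orthogonal, so similarity is 0.
print(cosine(one_hot("football"), one_hot("soccer")))
# Dense vectors can place related words close together.
print(cosine(dense["football"], dense["soccer"]))
print(cosine(dense["football"], dense["pizza"]))
```

With one-hot vectors every pair of different words looks equally unrelated, whereas a distributed representation lets related words score a higher similarity than unrelated ones.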
B. Social textual data analysis
Several recent studies have proposed approaches for mining
user attributes such as sentiments, opinions and personal
information from the textual data users generate on social
networks. In this section we present some of these studies.
Conneau et al. [14] proposed a new character-level model
(VD-CNN) for Natural Language Processing (NLP) tasks in the
social network analysis area. It was the first time that a very
deep Convolutional Neural Network (CNN) had been applied to an
NLP task. A deep stack of local operations, namely convolutions
and max-pooling of size 3, was applied to build a high-level
representation of a sentence. N-grams and n-grams TF-IDF were
used as features. The model was tested on eight public
large-scale datasets. An architecture of small temporal
convolution filters with different types of pooling was
examined, which showed that a significant improvement of the
CNN configurations can be reached when setting the depth to 29
convolutional layers.
In the medical field, the authors of [13] analyzed patient
(dis)satisfaction using reviews of doctors’ performance to
predict their ratings on different measures. A static word
vector model was used for word representation, and a CNN
structure containing a convolutional layer, a ReLU layer, a
pooling layer and a fully connected layer was then deployed.
The proposed model was validated on 35000 user reviews and
obtained an accuracy of 93% in predicting ratings on a 5-point
scale.
Still with reviews, Sahu et al. [2] proposed a model to
determine the polarity of movie reviews on a scale of 0 to 4.
A computational linguistic technique was applied for text
preprocessing. For feature extraction, an approach based on
structured N-grams was used, and the impact of feature
extraction was analyzed by computing the information gain of
each feature. Furthermore, Ouyang and Zhou [6] proposed a novel
deep framework for movie review evaluation using word2vec to
obtain word vector representations with a 7-layer CNN
architecture. The CNN contains 3 pairs of convolutional and
pooling layers to extract sentiments from texts. This model
incorporated ReLU, normalization and dropout techniques.
Different classifiers such as NB and SVM were examined. The
model achieved the highest accuracy, 45.4%, when compared with
RNN and MV-RNN models.
For user sentiment analysis, Joulin et al. [16] proposed a
baseline model for text classification. In this work fastText
was evaluated and compared with existing classifiers on the
sentiment analysis problem. Eight datasets and evaluation
protocols were used to evaluate the model. The best accuracy of
fastText was 98.6%. The experimental results showed that the
fastText classifier is often on par in terms of accuracy with
deep learning classifiers. In [17] the authors introduced a
novel combined method for sentiment analysis that uses
rule-based classification, supervised learning and machine
learning approaches. They proposed a semi-automatic,
complementary model that achieved a good level of
classification effectiveness. Pawar et al. [18] proposed a
hybrid approach for sentiment classification on the Sanders
Twitter dataset. After preprocessing, several features were
extracted, such as N-gram features, lexicon features and
positive lexicons. An opinion score is calculated for each
tweet and used to classify it: a tweet is considered positive
if its score is greater than 0, negative if it is less than 0,
and neutral if it is zero. Neural Network, QDA, SVM, LDA, Naive
Bayes and Random Forest classifiers were evaluated. SVM and
Random Forest recorded the highest accuracy, 88.65% for both.
Beyond user sentiment, the authors of [8] designed an opinion
mining model for Twitter, in which pre-processing steps are
applied after tweets are crawled from Twitter.
Recently, many works have focused on understanding users on
social media through the different types of social data they
generate. The authors of [23] proposed user ontology profiling
in social networks using a framework containing a Facebook
application. The model aimed to predict social network users’
age, gender, race and smile based on social textual and visual
data. The authors of [24] proposed a novel framework for
understanding both textual and visual data from social networks
in order to extract users’ soft biometric information from
posted pictures. The study in [25] investigated classifying lie
or truth from a speech signal. The model was based on Mel
Frequency Cepstral Coefficients and was evaluated on the
ReLiDDB dataset. Users’ interests can be applied in many
security contexts, as in [26], which aimed to use a facial
biometric modality. Gabor and LBP features for face
characterization and the Euclidean and Mahcosine distances for
classification were tested. The work in [27] presented an
experimental study of face recognition approaches by building
systems with different techniques for feature extraction and
classification. The authors of [28] proposed a
bag-of-geometrical-features face recognition approach using
SVM, GA and other algorithms. This model was evaluated on two
benchmarks, ORL and Caltech Faces. Finally, [29] proposed a
Smart Riding Club Biometric System with a new feature
extraction technique based on the fusion of two basic texture
descriptors, Gabor and Local Binary Pattern.
Motivated by these works, we examine whether the tweets shared
by Twitter users can be used to discover their topics of
interest, which is an important research area in social network
analysis. However, it is hard to predict the interests of
social users automatically from their shared tweets, because a
tweet does not contain all the features of each topic.
III. PROPOSED APPROACH
In this study, we propose a novel deep framework for
discovering users’ interests based on their sentiments and
opinions. This model, called the Deep Text Users Interests
System (DTUIS), relies on Word2vec, a CNN and the supervised
SVM classifier.
Fig. 1 shows the basic flow diagram of DTUIS. First, we used
standard datasets containing sets of tweets in order to
determine interests from the text they contain. Sentiment
analysis is performed on the tweets to determine the
inclination of each user, that is, whether he or she has
expressed a positive sentiment towards a particular topic or
not.
Fig. 1. Overview of the proposed approach.
We used word matching to classify each tweet under a certain
topic (Sport, Religion, Culture, Food and Fashion). Finally,
the user’s interest is obtained, which shows the user’s
positive inclination towards a specific topic.
In DTUIS we used word2vec to transform tweets into their
corresponding vectors to build up the sentence vectors, and
then used the word vector file generated by word2vec as the
input to the CNN, which performs feature extraction. Finally,
to classify the sentences into different sentiment labels, we
used a linear Support Vector Machine to predict whether the
user’s sentiment about various topics of interest is positive,
negative or neutral. Fig. 2 illustrates the overall process of
DTUIS.
In the following, we present each step in detail.
Fig. 2. Deep Text Users Interests System (DTUIS).
A. Data Collection
This architecture was trained and validated on two publicly
available pre-labeled corpora. The first is the Movie Reviews
(MR) corpus originally collected and published by Pang et al.
(2002). It contains 10662 sentences, balanced between positive
and negative. The second is the Sanders Twitter Sentiment
Corpus version 0.2, as used by Pawar et al. (2015) [18], which
contains about 5500 hand-classified tweets on 4 topics; the
tweets are labeled as positive, negative, neutral or
irrelevant. Table 1 gives the details of these datasets.
TABLE 1. Label distribution of the Sanders and MR datasets (number of reviews)
Label         Sanders    MR dataset
Positive      570        5331
Negative      654        5331
Neutral       2503       -
Irrelevant    1786       -
Total         5513       10662
B. Preprocessing
In DTUIS, word2vec is used to obtain vector representations
of words; these vectors are the input to the CNN. In this
paper, we use the publicly available pre-trained Word2vec model
(a static word vector model) trained on the Google News dataset
of about 100 billion words [6]. This model contains 3 million
words and phrases from Google News, each represented as a
300-dimensional vector. From this large corpus we can obtain
precise relations between words.
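The preprocessing step can be sketched as a lookup that turns a sentence into a fixed-size matrix of word vectors. In the sketch below the embedding table is a tiny hand-made stand-in for the pre-trained model, the dimension 4 and padding length 6 stand in for 300 and 64, and mapping out-of-vocabulary words to the zero vector is an assumption; the paper does not state how unknown words are handled.

```python
EMBED_DIM = 4  # stand-in for the 300-dimensional pre-trained vectors
PAD_LEN = 6    # stand-in for the padding length of 64

# Hypothetical embedding table (assumption); in practice the vectors
# would be read from the pre-trained Google News word2vec model.
embeddings = {
    "i": [0.1, 0.0, 0.2, 0.3],
    "love": [0.7, 0.5, 0.1, 0.0],
    "football": [0.2, 0.9, 0.4, 0.1],
}


def sentence_to_matrix(sentence, pad_len=PAD_LEN):
    """Return a pad_len x EMBED_DIM matrix of word vectors.

    Sentences longer than pad_len are truncated, shorter ones are
    padded with zero rows, and unknown words map to the zero vector
    (all three choices are assumptions of this sketch).
    """
    zero = [0.0] * EMBED_DIM
    tokens = sentence.lower().split()[:pad_len]
    rows = [list(embeddings.get(token, zero)) for token in tokens]
    rows += [list(zero) for _ in range(pad_len - len(rows))]
    return rows


matrix = sentence_to_matrix("I love football")
for row in matrix:
    print(row)
```

Every sentence then has the same shape, which is what allows a batch of sentences to be fed to the convolutional layer.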
C. Features Extraction
After obtaining the word vector representations from the
pre-trained word2vec model, we train the Convolutional Neural
Network. The CNN architecture used in this system is inspired
by the architecture used in [19]. The input to the network is a
sequence of words (the input sentence). Each word is
represented as a vector, and all vectors have the same length,
so a sentence is represented as a 2-dimensional matrix. Our CNN
architecture, illustrated in Fig. 3, consists of a
convolutional layer for automatic feature extraction using
three convolution kernels (convolution filters) of different
sizes, a ReLU layer, a pooling layer with a nonlinear sampling
method that decreases the number of characteristic parameters
and prevents overfitting, and a fully connected layer.
Fig. 3. CNN architecture [1].
At the input layer, a sentence of length n is treated as a
sequence of words, each represented as a 300-dimensional
vector, so the sentence becomes a 2-dimensional matrix formed
by concatenating its word vectors. Convolution can be defined
as a binary operation on two operands, both represented as
matrices: one is the text segment and the other is the
convolution filter (CF). The output of this operation is a
single real number. The input is the matrix of word vectors,
and a CF is a matrix of the same dimensions as the text segment
it is applied to. A given CF convolves over the input text
matrix using a sliding window and produces a sequence of real
numbers. This sequence is called the feature map, which
corresponds to the
particular CF being used. In this model, let $X_i \in \mathbb{R}^k$
be the k-dimensional word vector corresponding to the i-th word
in the input sentence. A sentence of length n is represented as
the concatenation of its word vectors, as illustrated in (1).
Sentences with length less than n are padded where necessary.
$X_{1:n} = X_1 \oplus X_2 \oplus \dots \oplus X_n$, (1)
where $\oplus$ denotes the concatenation operator.
In general, let $X_{i:i+j}$ be the concatenation of the words
$X_i, X_{i+1}, \dots, X_{i+j}$. A convolutional layer applies a
convolution filter $W \in \mathbb{R}^{h \times k}$ (h is the
filter height, or sliding window size) to each window of h
words of width k. In other words, W is a matrix of size h×k,
and $X_{i:i+j}$ is the basic element from the i-th to the
(i+j)-th word, which represents the local feature matrix from
the i-th line to the (i+j)-th line of the sentence matrix. For
instance, a feature $c_i$ (the i-th feature value) is produced
from a window of words $X_{i:i+h-1}$ using (2).
$c_i = f(W \cdot X_{i:i+h-1} + b)$, (2)
where f is a nonlinear activation function such as ReLU,
hyperbolic tangent or sigmoid [1], and $b \in \mathbb{R}$ is a
bias term which, like W, is a parameter learned during
training. To generate a feature map, the filter is applied to
each window of words in the sentence matrix
$\{X_{1:h}, X_{2:h+1}, \dots, X_{n-h+1:n}\}$, as shown in (3).
$C = [c_1, c_2, \dots, c_{n-h+1}]$, (3)
Note that $C \in \mathbb{R}^{n-h+1}$.
Pooling is applied using max-over-time pooling for feature
sampling [20] over the feature map, and the maximum value is
captured as the feature corresponding to the current filter
using (4). The idea is to select the highest-valued feature,
that is, the most important feature on each feature map.
$\hat{c} = \max\{C\}$. (4)
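Equations (2) to (4) can be traced on a small example. The Python sketch below computes one feature map with a single filter and then applies max-over-time pooling; the sentence matrix, the filter weights and the choice of ReLU for f are toy assumptions chosen so the arithmetic is easy to follow, not values from our experiments.

```python
def relu(x):
    """Rectified linear unit, used here as the activation f in (2)."""
    return max(0.0, x)


def feature_map(X, W, b):
    """Slide an h x k filter W over an n x k sentence matrix X.

    Each output is f(W . X[i:i+h] + b), where the dot is the sum of
    element-wise products; the result has n - h + 1 entries, as in (3).
    """
    n, h = len(X), len(W)
    features = []
    for i in range(n - h + 1):
        total = b
        for r in range(h):
            total += sum(w * x for w, x in zip(W[r], X[i + r]))
        features.append(relu(total))
    return features


def max_over_time(features):
    """Keep only the largest value of the feature map, as in (4)."""
    return max(features)


# Toy sentence matrix (assumption): n = 4 words, k = 2 dimensions.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Toy filter (assumption): window size h = 2.
W = [[1.0, -1.0], [0.5, 0.5]]
b = 0.0

C = feature_map(X, W, b)
print(C)                 # [1.5, 0.0, 0.0]
print(max_over_time(C))  # 1.5
```

The feature map has n - h + 1 = 3 entries, and pooling reduces it to a single number, so the pooled output size does not depend on the sentence length.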
In this system we used multiple filters with different window
sizes, as listed in Table 2, to obtain multiple features; the
process above was described for a single feature extracted with
one convolution filter. A fully connected layer receives the
selected feature vector as input, and a Support Vector Machine
(SVM) classifier is then applied to obtain the final
classification result. For regularization we applied dropout
with a constraint on the l2-norms of the weight vectors during
training [19]. Dropout addresses a significant issue in machine
learning, namely overfitting, by preventing co-adaptation of
hidden units [19]. Dropout sets the output of each hidden
neuron to zero with probability p = 0.5. The neurons that are
dropped out do not contribute to the forward pass or to
back-propagation. At test time, all of the neurons are used but
their outputs are multiplied by 0.5. In this model, dropout is
applied after the convolutional layer.
For all datasets and experiments in this paper the parameters
are set uniformly, as shown in Table 2. The CF weights W and
the softmax layer weights U were initialized uniformly from
[-0.1, 0.1]. 100 feature maps were used for each of the filter
sizes, giving a total of 300 feature maps. A learning rate
technique was applied for stochastic gradient descent with a
maximum of 100 training epochs. Batch size was set to 64, with
zero padding as needed.
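The dropout behaviour described above can be sketched in a few lines of Python. This is a minimal illustration of the standard technique with p = 0.5, assuming the variant that scales outputs at test time; the function names are hypothetical and the l2-norm constraint on the weights is not shown.

```python
import random


def dropout_train(activations, p=0.5, rng=random):
    """Training pass: zero each activation independently with probability p."""
    return [0.0 if rng.random() < p else a for a in activations]


def dropout_test(activations, p=0.5):
    """Test pass: keep every unit and scale its output by (1 - p)."""
    return [a * (1.0 - p) for a in activations]


random.seed(0)
hidden = [1.0] * 10
print(dropout_train(hidden))  # roughly half of the entries become 0.0
print(dropout_test(hidden))   # every entry becomes 0.5
```

Scaling by 1 - p at test time keeps the expected output of each unit equal to what the next layer saw on average during training.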
TABLE 2. CNN parameter settings
Parameter                          Value
Padding length                     64
Word vector dimension              300
Filter region sizes                3, 4, 5
Activation function                ReLU (rectified linear unit)
Dropout probability parameter p    0.5
Pooling                            1-max pooling
L2 regularization lambda           0.0
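As a quick check on the settings in Table 2, the short Python calculation below derives the length of each feature map and the size of the pooled feature vector. It assumes that the convolution is applied without extra padding beyond the padded sentence length, so that a feature map has n - h + 1 entries as in (3); the variable names are illustrative.

```python
PAD_LEN = 64              # padded sentence length n
FILTER_SIZES = (3, 4, 5)  # filter region sizes h
MAPS_PER_SIZE = 100       # feature maps per filter size

# Length of one feature map for each window size: n - h + 1.
feature_map_lengths = {h: PAD_LEN - h + 1 for h in FILTER_SIZES}

# 1-max pooling keeps one value per feature map, so the pooled
# vector has one entry per filter.
pooled_vector_size = len(FILTER_SIZES) * MAPS_PER_SIZE

print(feature_map_lengths)  # {3: 62, 4: 61, 5: 60}
print(pooled_vector_size)   # 300
```

The pooled vector of 300 values matches the total of 300 feature maps stated above and is the input passed on to the fully connected layer and the SVM.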
IV. EXPERIMENTAL RESULTS
In this framework, word2vec is first used to transform each word into a vector, from which the sentence vectors are constructed. These sentence vectors are then taken as input by the CNN for feature extraction, and a Support Vector Machine (SVM) is used to classify the sentences into positive/negative sentiment labels. We implemented our model with the open source framework TensorFlow running on Python 2.7.12.
A. Evaluation Measures
For each experiment, popular evaluation measures (accuracy,
recall, specificity, precision, etc.) were used to evaluate the
performance of DTUIS. Table 3 presents the values of these
measures on the Movie Reviews and Sanders datasets for binary
classification. These values indicate that our approach is
accurate and promising for opinion mining, particularly for
predicting users’ interests. We recorded the best recall (REC),
specificity, precision and F-score on the MR dataset, while the
values decreased on the Sanders dataset.
TABLE 3. Evaluation measures computed from the obtained confusion matrices
Evaluation metric            Movie Reviews    Sanders
Accuracy (ACC)               0.97             0.96
Recall (REC)                 1.0              0.96
Specificity (SP)             1.0              0.95
Precision (PRE)              0.99             0.95
False positive rate (FPR)    0.0              0.0
F-score (F1)                 0.97             0.96
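For reference, the measures in Table 3 follow from the four cells of a binary confusion matrix. The Python sketch below shows the standard definitions; the counts in the example call are made up for illustration and are not the confusion matrices obtained in our experiments, and the function assumes that no denominator is zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification measures from confusion-matrix counts."""
    recall = tp / (tp + fn)       # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "fpr": fp / (fp + tn),    # equals 1 - specificity
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts (assumption), chosen only to illustrate the formulas.
m = binary_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in m.items():
    print(name, round(value, 3))
```

Note that the false positive rate and the specificity always sum to 1, which is a useful consistency check when reading a table of these measures.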
B. Comparison with other approaches
We also compared our sentiment analysis results with previous
studies on the same datasets. As shown in Table 4 for the Movie
Reviews dataset, the study by Pang et al. [21], the first work
in this area, obtained an accuracy of 87.2%. Mullen [22]
obtained an accuracy of 86%, Prabowo [17] achieved a best
accuracy of 87.3% using 10-fold and 5-fold cross validation,
and Sahu [2] achieved a best accuracy of 88.95% with a Random
Forest classifier. This comparison shows that our approach
(DTUIS), which uses word2vec, a CNN and an SVM, outperformed
the other approaches and reached an accuracy of 97.3%, a
promising result compared with other state-of-the-art results
on movie reviews.
We also compared our results on the Sanders dataset with the
Twitter sentiment classification of Pawar et al. [18], in which
an opinion score is calculated for each tweet using feature
vectors. Neural Network, QDA, SVM, Naïve Bayes, LDA and Random
Forest classifiers were then evaluated; the best classification
accuracy was 88.65% with SVM, followed by 88.62% with the
Neural Network. As shown in Table 5, DTUIS outperformed this
model with a best accuracy of 96%.
TABLE 4. Comparison on the Movie Reviews dataset
Reference              Features extraction       Classification     Accuracy
Pang et al. [21]       Unigrams                  NB, ME, SVM        82.9%
Mullen et al. [22]     Part-of-speech            SVM                86%
Prabowo et al. [17]    Part-of-speech taggers    SVM                87.3%
Sahu [2]               Bi-grams, tri-grams       NB, KNN, RF, DT    88.95%
DTUIS                  CBOW                      SVM                97.3%
TABLE 5. Comparison on the Sanders dataset
Reference            Features extraction    Classification                    Best accuracy
Pawar et al. [18]    N-gram, POS            SVM, Naive Bayes, Random Forest   88.65% (SVM)
DTUIS                CBOW                   SVM                               96%
Fig. 4 and Fig. 5 show the confusion matrices obtained when
running our system on the Movie Reviews and Sanders datasets
respectively. TP = True Positive, FP = False Positive, FN =
False Negative, TN = True Negative. Labels 0 and 1 denote
positive and negative respectively; labels 2 and 3 denote
neutral and irrelevant respectively.
Fig. 4. Confusion matrix of films interest classification on the Movie Review dataset.
Fig. 5. Confusion matrix of sentiment classification on the Sanders dataset.
V. CONCLUSION
Sentiment mining is an attractive and challenging area. In this
work, we presented a solution to this problem in the form of a
system that classifies a user’s interest in a particular topic.
We described our framework, which combines a convolutional
neural network and word2vec, and presented our results on
publicly available datasets. The experimental results indicate
that a CNN with pre-trained word2vec can outperform other
classification algorithms and record new state-of-the-art
scores. As ever more large-scale user-generated information
becomes available in online environments, sentiment analysis
provides very important tools for managing this information and
making predictions. SA represents a rich source of valuable
information used in a wide range of applications such as public
opinion and product analysis. We believe that this approach to
sentiment analysis for mining users’ interests could give
further inspiration to other researchers. Future research can
extend this approach to other domains of opinion mining such as
political discussions and newspaper articles. It is also
recommended to extend this work to other languages and to
examine the architecture with other ML classifiers.
REFERENCES
[1] R. M. Duwairi, “Sentiment Analysis for Arabizi Text,” pp. 127–132, 2016.
[2] T. P. Sahu, “Sentiment Analysis of Movie Reviews : A study on Feature Selection & Classification Algorithms,” 2016.
[3] R. M. Duwairi, “Arabic Sentiment Analysis using Supervised Classification,” 2014.
[4] P. Gupta, R. Tiwari, and N. Robert, “Sentiment Analysis and Text Summarization of Online Reviews : A Survey,” pp. 241–245, 2016.
[5] N. A. Abdulla, N. A. Ahmed, M. A. Shehab, and M. Al-ayyoub, “Arabic Sentiment Analysis :,” 2013.
[6] X. Ouyang and P. Zhou, “Sentiment Analysis Using Convolutional Neural Network,” 2015.
[7] B. Pang and L. Lee, “Opinion mining and sentiment analysis,” vol. 2, no. 1, 2008.
[8] P. Liang, “Opinion Mining on Social Media Data,” pp. 91–96, 2013.
[9] A. Rios, “Convolutional Neural Networks for Biomedical Text Classification : Application in Indexing Biomedical Articles.”
[10] K. Xiao, Z. Zhang, and J. Wu, “Chinese Text Sentiment Analysis Based on Improved Convolutional Neural Networks,” no. 10, pp. 922–926, 2016.
[11] B. Hu and Z. Lu, “Convolutional Neural Network Architectures for Matching Natural Language Sentences,” pp. 1–9.
[12] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Distributed Representations of Words and Phrases and their Compositionality,” pp. 1–9.
[13] R. D. Sharma, S. Tripathi, S. K. Sahu, S. Mittal, and A. Anand, “Predicting Online Doctor Ratings from User Reviews Using Convolutional Neural Networks,” vol. 6, no. 2, 2016.
[14] A. Conneau, “Very Deep Convolutional Networks for Text Classification,” no. 2001, 2016.
[15] Y. Kim and A. M. Rush, “Character-Aware Neural Language Models.”
[16] A. Joulin, “Bag of Tricks for Efficient Text Classification,” 2015.
[17] R. Prabowo and M. Thelwall, “Sentiment Analysis : A Combined Approach.”
[18] K. K. Pawar and R. R. Deshmukh, “Twitter Sentiment Classification on Sanders Data using Hybrid Approach,” vol. 17, no. 4, pp. 118–123, 2015.
[19] Y. Kim, “Convolutional Neural Networks for Sentence Classification,” 2011.
[20] J. Weston and M. Karlen, “Natural Language Processing ( Almost ) from Scratch,” vol. 12, pp. 2493–2537, 2011.
[21] B. Pang and L. Lee, “A Sentimental Education : Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts,” 2002.
[22] T. Mullen and N. Collier, “Sentiment analysis using support vector machines with diverse information sources.”
[23] O. Lazzez, W. Ouarda, and A. M. Alimi, “Age, Gender, Race and Smile Prediction Based on Social Textual and Visual Data Analyzing,” vol. 557, 2017.
[24] O. Lazzez, W. Ouarda, and A. M. Alimi, “Understand Me if You Can! Global Soft Biometrics Recognition from Social Visual Data,” vol. 552, 2017.
[25] H. Nasri, W. Ouarda, and A. M. Alimi, “ReLiDSS : Novel Lie Detection system from speech signal,” 2016.
[26] I. Jarraya, W. Ouarda, and A. M. Alimi, “A Preliminary Investigation on Horses Recognition Using Facial Texture Features,” no. l, pp. 2803–2808, 2015.
[27] W. Ouarda, H. Trichili, A. M. Alimi and B. Solaiman, “MLP Neural Network For Face Recognition Based on Gabor Features and Dimensionality Reduction Techniques,” 2014.
[28] W. Ouarda, H. Trichili, A. M. Alimi, and B. Solaiman, “Face Recognition Based on Geometric Features Using Support Vector Machines,” pp. 89–95, 2014.
[29] W. Ouarda, H. Trichili, A. M. Alimi, B. Solaiman, “Towards A Novel Biometric System For Smart Riding Club,” J Inf Assurance Security. 2016;11(4):201–13.