Date post: | 09-Feb-2018 |
Category: |
Documents |
Upload: | hung-phan-dang |
7/22/2019 Text Sentiment analysis
Deep learning for Sentiment analysis
PRESENTER: HNG D PHAN
INSTITUTION OF INFORMATION TECHNOLOGY
Outline
1. Introduction
2. Sentiment analysis approaches
3. Overview of deep learning for applications.
4. Deep learning for sentiment detection.
5. Future research direction
1. Introduction
Each sentence and paragraph carries its own sentiment.
For example:
"This is a good movie": positive comment.
"This movie contains bad words, bad characters and unrelated scenes": negative comment.
1. Introduction
Purpose of sentiment detection:
Classify comments.
Extract relationships between sentences in a paragraph.
Judgment and evaluation.
Emotional state.
Intended emotional communication.
Outline
1. Introduction
2. Sentiment analysis approaches
3. Overview of deep learning for applications.
4. Deep learning for sentiment detection.
5. Future research direction
2. Sentiment analysis approaches
Issues:
Classifying the polarity of a given text at the document, sentence, or feature/aspect level.
Beyond polarity, sentiment classification also looks at emotional states: angry, happy, sad, etc.
Early work on polarity detection:
Peter D. Turney [1]: the classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs.
Bo Pang and Lillian Lee [2]: exploiting class relationships for sentiment categorization with respect to rating scales.
Benjamin Snyder and Regina Barzilay [3]: focus on restaurant reviews, analyzing specific aspects of a restaurant.
Peter D. Turney [1]
Purpose: classification of film reviews.
Provides a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down).
The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs.
In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor".
Peter D. Turney [1]
1) Identify phrases in the input text that contain adjectives or adverbs (part-of-speech tagging).
2) Estimate the semantic orientation of each extracted phrase (Pointwise Mutual Information (PMI) and Information Retrieval (IR)).
3) Assign the given review to a class, recommended or not recommended, based on the average semantic orientation of the extracted phrases.
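PMI-IR method
The Pointwise Mutual Information (PMI) between two words, word1 and word2, is defined as follows (Church & Hanks, 1989):
$$\mathrm{PMI}(word_1, word_2) = \log_2 \frac{p(word_1 \,\&\, word_2)}{p(word_1)\, p(word_2)}$$
The Semantic Orientation (SO) of a phrase is calculated as follows:
$$\mathrm{SO}(phrase) = \mathrm{PMI}(phrase, \text{"excellent"}) - \mathrm{PMI}(phrase, \text{"poor"})$$
Estimating the probabilities from hits (the number of documents matching a query), the SO becomes:
$$\mathrm{SO}(phrase) = \log_2 \frac{hits(phrase\ \mathrm{NEAR}\ \text{"excellent"}) \cdot hits(\text{"poor"})}{hits(phrase\ \mathrm{NEAR}\ \text{"poor"}) \cdot hits(\text{"excellent"})}$$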
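The hit-count form of the SO, log2 of hits(phrase NEAR "excellent") times hits("poor") over hits(phrase NEAR "poor") times hits("excellent"), can be sketched in a few lines. This is a minimal illustration, not Turney's code: the function names are hypothetical, the hit counts in the usage note are made up, and the small smoothing constant added to the NEAR counts is an assumption used here only to avoid division by zero.

```python
import math


def semantic_orientation(hits_near_excellent, hits_near_poor,
                         hits_excellent, hits_poor, smoothing=0.01):
    """SO(phrase) estimated from search-engine hit counts (PMI-IR).

    `smoothing` is an assumed small constant that keeps the ratio
    finite when a phrase never co-occurs with one of the two words.
    """
    numerator = (hits_near_excellent + smoothing) * hits_poor
    denominator = (hits_near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)


def classify_review(phrase_orientations):
    """Average the SO of all extracted phrases and threshold at zero."""
    average = sum(phrase_orientations) / len(phrase_orientations)
    return "recommended" if average > 0 else "not recommended"
```

For instance, a phrase seen 100 times near "excellent" and 10 times near "poor", with both reference words equally frequent, gets a positive SO, and a review whose phrases average above zero is labeled "recommended".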
Peter D. Turney [1]
Disadvantage: the average SO tends to err on the side of guessing that a review is not recommended when it is actually recommended.
Bo Pang and Lillian Lee [2]
Determine the author's evaluation with respect to a multi-point scale (e.g., one to five stars).
Two main steps:
Evaluating human performance at the task.
Applying a meta-algorithm, based on a metric-labeling formulation of the problem, that alters a given n-ary classifier's output in an explicit attempt to ensure that similar items receive similar labels.
Bo Pang and Lillian Lee [2]
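The idea of metric labeling was introduced by Jon Kleinberg and Éva Tardos ([28]).
Each labeling has a cost, which represents the error in the labeling. The total cost of a labeling f is:
$$Q(f) = \sum_{p \in P} c(p, f(p)) + \sum_{(p,q) \in E} w_{pq}\, d(f(p), f(q))$$
where c(p, f(p)) is the cost of assigning label f(p) to object p, w_{pq} is the weight of the relation between objects p and q, and d is a distance metric on labels.
Metric labeling: find the labeling that minimizes the cost.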
Bo Pang and Lillian Lee [2]
A way to explicitly incorporate information about item similarities together with label similarity information (for instance, "one star" is closer to "two stars" than to "four stars") is to think of the task as one of metric labeling (Kleinberg and Tardos, 2002), where label relations are encoded via a distance metric.
Three algorithms based on Support Vector Machines are studied:
1. One-vs-all
2. Regression
3. Metric labeling
The authors also consider what item similarity measure to apply, proposing one based on the positive-sentence percentage.
Bo Pang and Lillian Lee [2]
One-vs-all
Each training point belongs to one of K different classes. The goal is to find a function which, given a new data point, will correctly predict the class to which the new point belongs [5].
(i) Solve K different binary problems: classify "class k" versus "the rest" for k = 1, ..., K.
(ii) Assign a test sample to the class giving the largest f_k(x) (most positive) value, where f_k(x) is the solution from the k-th problem.
Purpose: classify reviews into output labels (rating scores) and evaluate the accuracy.
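The two one-vs-all steps (fit one binary scorer f_k per class, then pick the class with the largest score) can be sketched as follows. This is only an illustration under stated assumptions: a least-squares linear scorer stands in for the SVMs used in the paper, the function names are hypothetical, and the toy points in the usage note are invented.

```python
import numpy as np


def fit_one_vs_all(X, y, num_classes):
    """Step (i): fit one linear scorer f_k per class.

    Each scorer is trained on +1 targets for "class k" and -1 targets
    for "the rest". Least squares is an assumed stand-in for an SVM.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])      # append a bias column
    targets = -np.ones((X.shape[0], num_classes))
    targets[np.arange(X.shape[0]), y] = 1.0
    W, *_ = np.linalg.lstsq(Xb, targets, rcond=None)
    return W                                           # shape (features + 1, K)


def predict_one_vs_all(W, X):
    """Step (ii): assign each sample to the class with the largest f_k(x)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.argmax(Xb @ W, axis=1)
```

With one training point per class at (0, 0), (1, 0) and (0, 1), a new point such as (0.9, 0.1) receives its largest score from the scorer of the second class.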
Bo Pang and Lillian Lee [2]
Regression
The idea is to find the hyperplane that best fits the training data, but where training points whose labels are within distance ε of the hyperplane incur no loss.
The label preference function π_reg(x, l) is the negative of the distance between l and the value predicted for x by the fitted hyperplane function.
Koppel and Schler (2005) found that applying linear regression to classify documents (in a different corpus than ours) with respect to a three-point rating scale provided greater accuracy than OVA SVMs and other algorithms.
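The two ingredients of the regression approach, a loss that ignores errors smaller than ε and a label preference equal to the negative distance to the fitted value, can be written down directly. The sketch below assumes the regression value for a review has already been computed; the function names and the default ε are hypothetical.

```python
def epsilon_insensitive_loss(prediction, label, epsilon=0.5):
    """Loss is zero when the prediction is within epsilon of the label."""
    return max(0.0, abs(prediction - label) - epsilon)


def label_preference_reg(prediction, label):
    """pi_reg(x, l): negative distance between label l and the fitted value for x."""
    return -abs(label - prediction)


def predict_label(prediction, labels):
    """Choose the label with the highest preference, i.e. the closest one."""
    return max(labels, key=lambda l: label_preference_reg(prediction, l))
```

A fitted value of 2.3 on a 0 to 3 scale incurs no loss against the true label 2 and is mapped to label 2.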
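Bo Pang and Lillian Lee [2]
Metric labeling
Let d be a distance metric on labels, and let nn_k(x) denote the k nearest neighbors of item x according to some item-similarity function sim.
Then, it is quite natural to pose our problem as finding a mapping of instances x to labels l_x (respecting the original labels of the training instances) that minimizes
$$\sum_{x \in test} \Big[ -\pi(x, l_x) + \alpha \sum_{y \in nn_k(x)} f\big(d(l_x, l_y)\big)\, sim(x, y) \Big]$$
where π(x, l) is the initial label preference function, f is monotonically increasing and α is a trade-off parameter.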
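To see how the similarity term can override the base classifier, here is a small sketch of the cost for a single test item: the negated label preference plus a weighted penalty for disagreeing with similar neighbors. It assumes the neighbors are labeled training items (so each test item can be decided on its own), takes the penalty function to be the identity on the label distance, and uses hypothetical names and invented numbers.

```python
def metric_labeling_cost(label, preference, neighbors, alpha=1.0):
    """Cost of assigning `label` to one test item.

    preference: dict mapping each label to the base classifier's preference
    neighbors:  list of (neighbor_label, similarity) pairs for the k nearest
                neighbors, assumed here to be labeled training items
    """
    penalty = sum(abs(label - n_label) * sim for n_label, sim in neighbors)
    return -preference[label] + alpha * penalty


def best_label(preference, neighbors, alpha=1.0):
    """Pick the label with the lowest cost."""
    return min(preference,
               key=lambda l: metric_labeling_cost(l, preference, neighbors, alpha))
```

With preferences {1: 0.5, 2: 0.6, 3: 0.1} and two very similar neighbors labeled 1, setting alpha to 0 keeps the classifier's choice of 2, while alpha equal to 1 moves the decision to 1.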
Bo Pang and Lillian Lee [2]
To measure the similarity between items, a traditional choice is a term-overlap-based measure such as the cosine between term-frequency-based document vectors.
Ratings can be determined by the positive-sentence percentage (PSP) of a text, i.e., the number of positive sentences divided by the number of subjective sentences.
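Both similarity ingredients are simple to compute. The sketch below shows the cosine between term-frequency vectors and the positive-sentence percentage; whitespace tokenization and the 'pos' / 'neg' / 'obj' sentence tags are assumptions made for this example, and in practice the sentence tags would come from a separate subjectivity and polarity classifier.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine between the term-frequency vectors of two texts."""
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a)
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def positive_sentence_percentage(sentence_labels):
    """PSP: positive sentences divided by subjective (positive or negative) ones."""
    subjective = [s for s in sentence_labels if s in ("pos", "neg")]
    if not subjective:
        return 0.0
    return subjective.count("pos") / len(subjective)
```

A review with two positive sentences, one negative sentence and one objective sentence has a PSP of 2/3.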
Benjamin Snyder and Regina Barzilay
Input: in a restaurant review, such opinions may include food, ambience and service.
Algorithm: the Good Grief algorithm jointly learns ranking models for individual aspects by modeling the dependencies between assigned ranks.
Analyzes meta-relations between opinions, such as agreement and contrast.
Models the dependencies between different labels via the agreement relation.
Benjamin Snyder and Regina Barzilay
An m-aspect ranking model contains m+1 components: ((w[1], b[1]), ..., (w[m], b[m]), a). The first m components are individual ranking models, one for each aspect; the final component is the agreement model.
Predict a joint rank for the m aspects which satisfies the individual ranking models as well as the agreement model.
The decoder then predicts the m ranks which minimize the overall grief.
2. Sentiment analysis approaches
Objects of analysis:
Text content (adjectives, adverbs).
The accuracy of reviews.
Multiple features/aspects.
Methods:
Extensions of Support Vector Machines.
Unsupervised learning.
Disadvantage: the order of words is ignored and important information is lost.
Outline
1. Introduction
2. Sentiment analysis approaches
3. Overview of deep learning for applications.
4. Deep learning for sentiment detection.
5. Future research direction
3. Overview of deep learning for applications
Deep learning is a set of algorithms in machine learning that attempt to learn in multiple levels of representation, corresponding to different levels of abstraction. It typically uses artificial neural networks. [11]
Deep learning applications:
Handwriting recognition.
Speech processing.
Neural network
Artificial neural networks are models inspired by animal central nervous systems (in particular the brain) that are capable of machine learning and pattern recognition. They are usually presented as systems of interconnected "neurons" that can compute values from inputs by feeding information through the network.
Main components:
Input, output
Weight.
Activation function.
The simplest model: the Perceptron
Learning:
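A common form of perceptron learning adjusts each weight in proportion to the error between the target and the thresholded output: w ← w + η (t − y) x. The sketch below applies this rule to a toy problem; the function names, the learning rate and the choice of the logical AND as training data are assumptions made for illustration.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 when the weighted sum is positive, else 0."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Perceptron rule: w <- w + rate * (target - output) * x."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Toy data: the logical AND of two binary inputs (linearly separable).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
```

Training on `AND_SAMPLES` converges after a handful of passes because the two classes are linearly separable; a single perceptron cannot learn XOR, which is one motivation for hidden layers.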
Activation function
The activation function of a node behaves similarly to the linear perceptron in neural networks.
However, it is a nonlinear function, which allows such networks to compute nontrivial problems using only a small number of nodes.
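As a small illustration of why the nonlinearity matters, the sketch below builds XOR, which no single linear unit can compute, from three sigmoid nodes. The weights are picked by hand for this example (they are not learned and do not come from the slides), and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def xor_network(x1, x2):
    """Two hidden sigmoid units and one output unit, hand-set weights.

    h_or is close to 1 when at least one input is on, h_and only when
    both are on; the output fires for "or but not and".
    """
    h_or = sigmoid(20 * (x1 + x2) - 10)
    h_and = sigmoid(20 * (x1 + x2) - 30)
    return sigmoid(20 * h_or - 20 * h_and - 10)
```

Rounding the output of `xor_network` for the four binary input pairs reproduces the XOR truth table.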
Types of Artificial Neural Network:
The feed forward neural network was the first and arguably most simple type of artificial neural network devised. In this network the information moves in only one direction, forwards: from the input nodes data goes through the hidden nodes (if any) and to the output nodes.
Recurrent neural networks (RNNs) are models with bi-directional data flow. While a feed forward network propagates data linearly from input to output, RNNs also propagate data from later processing stages to earlier stages. RNNs can be used as general sequence processors.
The Boltzmann machine
A Boltzmann machine is a network of units with an "energy" defined for the network.
It also has binary units, but unlike Hopfield nets, Boltzmann machine units are stochastic. The global energy, E, in a Boltzmann machine is identical in form to that of a Hopfield network:
$$E = -\Big( \sum_{i<j} w_{ij}\, s_i\, s_j + \sum_i \theta_i\, s_i \Big)$$
where w_{ij} is the connection strength between units i and j, s_i ∈ {0, 1} is the state of unit i, and θ_i is the bias of unit i.
Problems:
The time the machine must be run in order to collect equilibrium statistics grows exponentially with the machine's size, and with the magnitude of the connection strengths.
Connection strengths are more plastic when the units being connected have activation probabilities intermediate between zero and one, leading to a so-called variance trap. The net effect is that noise causes the connection strengths to random walk until the activities saturate.
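Restricted Boltzmann Machines (RBM)
Boltzmann Machines (BMs) are a particular form of log-linear Markov Random Field (MRF), i.e., one for which the energy function is linear in its free parameters.
Advantage: RBMs do not allow intralayer connections, neither hidden-hidden nor visible-visible.
The energy function E(v,h) of an RBM is defined as:
$$E(v, h) = -b^\top v - c^\top h - h^\top W v$$
where W holds the weights connecting hidden and visible units, and b and c are the biases of the visible and hidden layers.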
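The usual RBM energy, E(v, h) = −b'v − c'h − h'Wv, and the factorized conditional it implies can be checked numerically. The sketch below is a minimal example assuming binary units, a weight matrix W of shape (hidden, visible), visible bias b and hidden bias c; the function names are hypothetical.

```python
import numpy as np


def rbm_energy(v, h, W, b, c):
    """E(v, h) = -b'v - c'h - h'Wv for a visible vector v and hidden vector h."""
    return -(b @ v) - (c @ h) - (h @ W @ v)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def p_hidden_given_visible(v, W, c):
    """Without hidden-hidden links, each p(h_j = 1 | v) is an independent sigmoid."""
    return sigmoid(c + W @ v)
```

Because there are no intralayer connections, the hidden units are conditionally independent given the visible layer, which is what makes sampling in an RBM cheap compared with a general Boltzmann machine.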
Deep learning steps
Two main steps:
1. Pre-training one layer at a time: treating each layer in turn as an unsupervised restricted Boltzmann machine (RBM).
2. Fine-tuning: using supervised backpropagation.
The resulting model is called a deep belief network, and may be built with other building blocks than RBMs.
Deep belief network training
1. Train the first layer as an RBM that models the raw input x = h(0) as its visible layer.
2. Use that first layer to obtain a representation of the input that will be used as data for the second layer. Two common solutions exist. This representation can be chosen as being the mean activations p(h(1) = 1 | h(0)) or samples of p(h(1) | h(0)).
3. Train the second layer as an RBM, taking the transformed data (samples or mean activations) as training examples (for the visible layer of that RBM).
4. Iterate (2 and 3) for the desired number of layers, each time propagating upward either samples or mean values.
5. Fine-tune all the parameters of this deep architecture with respect to a proxy for the DBN log-likelihood, or with respect to a supervised training criterion (after adding extra learning machinery to convert the learned representation into supervised predictions, e.g. a linear classifier).
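Steps 1 to 4 can be sketched as a loop that trains one RBM, pushes the data through it, and trains the next RBM on the result. The sketch below is an illustration under assumptions: each RBM is trained with one-step contrastive divergence (CD-1), which the slides do not specify; mean activations are propagated upward; all names and hyperparameters are hypothetical; and the fine-tuning of step 5 is left out.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_rbm(data, n_hidden, epochs=50, lr=0.1, seed=0):
    """Train one RBM on `data` (rows are examples) with CD-1."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_hidden, n_visible))
    b = np.zeros(n_visible)                     # visible bias
    c = np.zeros(n_hidden)                      # hidden bias
    for _ in range(epochs):
        # Positive phase: hidden probabilities and samples given the data.
        h_prob = sigmoid(data @ W.T + c)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v_recon = sigmoid(h_sample @ W + b)
        h_recon = sigmoid(v_recon @ W.T + c)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (h_prob.T @ data - h_recon.T @ v_recon) / n_samples
        b += lr * (data - v_recon).mean(axis=0)
        c += lr * (h_prob - h_recon).mean(axis=0)
    return W, b, c


def pretrain_dbn(data, layer_sizes, epochs=50, lr=0.1, seed=0):
    """Greedy layer-wise pre-training: each RBM models the layer below it."""
    layers, representation = [], data
    for n_hidden in layer_sizes:
        W, b, c = train_rbm(representation, n_hidden, epochs, lr, seed)
        layers.append((W, b, c))
        # Propagate mean activations upward as input for the next RBM.
        representation = sigmoid(representation @ W.T + c)
    return layers, representation
```

Calling `pretrain_dbn(data, [4, 3])` on binary data with 6 columns yields two weight matrices of shapes (4, 6) and (3, 4) plus a top-level representation with 3 columns, which a supervised layer could then consume during fine-tuning.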
3.2. Deep learning application
Handwriting recognition:
The MNIST dataset consists of handwritten digit images and is divided into 60,000 examples for the training set and 10,000 examples for testing.
In Dan Claudiu Cireşan and Ueli Meier [15]:
Multilayer perceptron (MLP).
Train 5 MLPs with 2 to 9 hidden layers and varying numbers of hidden units. Mostly but not always the number of hidden units per layer decreases towards the output layer.
3.2. Deep learning application
Speech recognition:
In Geoffrey Hinton et al. [17], deep neural networks are used for acoustic modeling in speech recognition.
Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input.
An alternative way to evaluate the fit: use a feed-forward neural network.
Input: frames of coefficients.
Output: posterior probabilities over HMM states.
Outline
1. Introduction
2. Sentiment analysis approaches
3. Overview of deep learning for applications.
4. Deep learning for sentiment detection.
5. Future research direction
4. Deep learning for sentiment analysis
General approach: use semantic word spaces.
Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way.
Solution: the Sentiment Treebank, with 215,154 phrases in the parse trees of 11,855 sentences.
Recursive Neural Tensor Network: predicts the compositional semantic effects present in this new corpus.
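4. Deep learning for sentiment analysis
Example of the Recursive Neural Tensor Network accurately predicting 5 sentiment classes, very negative to very positive (--, -, 0, +, ++), at every node of a parse tree and capturing the negation and its scope in this sentence.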
Recursive Neural Tensor Network (RNTN)
Represent a phrase through word vectors and a parse tree, and then compute vectors for higher nodes in the tree using the same tensor-based composition function.
Related research areas:
Semantic Vector Spaces.
Compositionality in Vector Spaces.
Logical Form
Deep Learning
Sentiment analysis
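Semantic Vector Spaces
The dominant approach in semantic vector spaces uses distributional similarities of single words.
Variants of this idea use more complex frequencies such as how often a word appears in a certain syntactic context (Pado and Lapata, 2007; Erk and Pado, 2008).
However, distributional vectors often do not capture the differences between antonyms, since antonyms tend to appear in similar contexts. To overcome this, neural word vectors (Bengio et al., 2003) have been used.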
Compositionality in Vector Spaces
Most compositionality algorithms and related datasets capture two-word compositions.
Mitchell and Lapata (2010) [24] use two-word phrases and analyze similarities computed by vector addition, multiplication and others.
Some related models:
Holographic reduced representations (Plate, 1995 [21]).
Compositional matrix space model (Rudolph and Giesbrecht, 2010).
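The simplest composition functions studied for two-word phrases, vector addition and element-wise multiplication, can be tried out directly. The sketch below uses hypothetical function names, and the two-dimensional "word vectors" in the usage note are made-up numbers rather than trained embeddings.

```python
import numpy as np


def compose_add(u, v):
    """Additive composition: the phrase vector is the sum of the word vectors."""
    return u + v


def compose_multiply(u, v):
    """Multiplicative composition: element-wise product of the word vectors."""
    return u * v


def cosine(u, v):
    """Cosine similarity, used to compare composed phrase vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

For u = (1, 2) and v = (3, 4), addition gives (4, 6) and multiplication gives (3, 8). Both operations are symmetric, so neither can distinguish "very good" from "good very", which is one reason order-sensitive models such as recursive networks are of interest.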
Compositionality in Vector Spaces
Compositional matrix space model:
Assigns ordinal sentiment scores to phrases.
Accounts for critical interactions among the words in each sentiment-bearing phrase.
The score of phrase i: [formula not recovered], where W_k is the matrix representing the k-th word of the phrase.
Compositionality in Vector Spaces
With the Stanford system:
Recursive neural network (RNN).
Matrix-vector RNNs.
New algorithm: Recursive Neural Tensor Network (RNTN).
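Recursive Neural Model
Translate the input text to vectors.
Compute parent vectors in a bottom-up fashion using different types of compositionality functions g.
Example: the tri-gram "not very good", with word vectors a = not, b = very, c = good (word-level labels 0, 0, +; root label -):
p1 = g(b, c)
p2 = g(a, p1)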
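Recursive Neural Network
Each parent vector is computed from its two children:
$$p_1 = f\left(W \begin{bmatrix} b \\ c \end{bmatrix}\right), \qquad p_2 = f\left(W \begin{bmatrix} a \\ p_1 \end{bmatrix}\right)$$
f: tanh function, a standard element-wise nonlinearity; W is the d × 2d composition matrix.
Compute the label of a node vector a with a softmax classifier:
$$y^a = \mathrm{softmax}(W_s\, a)$$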
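For a tri-gram with word vectors a, b, c, the standard recursive network composes bottom-up as p1 = tanh(W [b; c]) and p2 = tanh(W [a; p1]), then classifies each node with softmax(Ws · node). The sketch below shows these two operations; the function names are hypothetical, and in the usage note the parameters and word vectors are random placeholders, so the predicted distribution is meaningless until the model is trained.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


def compose(W, left, right):
    """Parent vector p = tanh(W [left; right]), with W of shape (d, 2d)."""
    return np.tanh(W @ np.concatenate([left, right]))


def classify(Ws, node):
    """Sentiment distribution at a node: softmax(Ws node), Ws of shape (classes, d)."""
    return softmax(Ws @ node)
```

With d = 4 and 5 sentiment classes, `p1 = compose(W, b, c)` followed by `p2 = compose(W, a, p1)` yields the root vector, and `classify(Ws, p2)` returns 5 probabilities that sum to 1. The same W and Ws are reused at every node of the tree.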
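Recursive Neural Tensor Network (RNTN)
Provide a more direct interaction that allows the model to have greater interactions between the input vectors.
RNTN: the main idea is to use the same tensor-based composition function for all nodes.
A single-layer tensor:
$$p_1 = f\left( \begin{bmatrix} b \\ c \end{bmatrix}^\top V^{[1:d]} \begin{bmatrix} b \\ c \end{bmatrix} + W \begin{bmatrix} b \\ c \end{bmatrix} \right)$$
where V^{[1:d]} is a tensor of d slices and each slice V^{[i]}, a 2d × 2d matrix, produces one dimension of the output.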
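The tensor layer adds one bilinear form per output dimension to the standard recursive layer: with x = [b; c], the parent is p = tanh(x' V[1:d] x + W x). The sketch below writes this with `numpy.einsum`; the function name is hypothetical and the storage layout of V as an array of shape (d, 2d, 2d), one slice per output dimension, is an assumption of this example.

```python
import numpy as np


def rntn_compose(V, W, left, right):
    """Tensor-based composition p = tanh(x' V[1:d] x + W x), x = [left; right].

    V has shape (d, 2d, 2d): slice V[k] defines the bilinear form that
    feeds output dimension k. W has shape (d, 2d).
    """
    x = np.concatenate([left, right])
    tensor_term = np.einsum("i,kij,j->k", x, V, x)   # one scalar x' V[k] x per slice
    return np.tanh(tensor_term + W @ x)
```

Setting V to all zeros recovers the standard recursive layer tanh(W x), so the tensor term can be read as an additive correction that lets the two children interact multiplicatively.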
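Tensor Backprop through Structure
The error as a function of the RNTN parameters θ = (V, W, Ws, L) for a sentence is:
$$E(\theta) = -\sum_i \sum_j t^i_j \log y^i_j + \lambda \lVert \theta \rVert^2$$
where t^i is the target distribution and y^i the predicted distribution at node i.
The full derivative for slice V[k] for this tri-gram tree is then the sum over the nodes:
$$\frac{\partial E}{\partial V^{[k]}} = \frac{\partial E^{p_2}}{\partial V^{[k]}} + \delta_k^{p_1,com} \begin{bmatrix} b \\ c \end{bmatrix} \begin{bmatrix} b \\ c \end{bmatrix}^\top$$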
Stanford Sentiment analysis source code
Libraries are available in Java, C# and Python.
Extracted from the input text:
POS, NER: CRF tagging.
Parsed sentiment tree.
Online demo:
nlp.stanford.edu:8080/sentiment/rntnDemo.html
http://nlp.stanford.edu/sentiment/treebank.html
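Stanford Sentiment analysis source code
Input text: Stanford University is located in California. It is a great university, founded in 1891.
Output for the first token: word Stanford, lemma Stanford, character offsets 0 to 8, POS NNP, NER ORGANIZATION, speaker PER0.
Parse tree of the second sentence (truncated in the source):
(ROOT (S (NP (PRP It)) (VP (VB... great) (NN university)) (, ,) (VP (VBN fo... (CD 1891)))))) (. .)))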
Outline
1. Introduction
2. Sentiment analysis approaches
3. Overview of deep learning for applications.
4. Deep learning for sentiment detection.
5. Future research direction
5. Future research direction
Overview of deep learning in sentiment detection.
Other sentiment analysis research:
Sentiment Treebank.
Paragraph positive/negative detection.
Research for the Vietnamese language: Vietnamese Treebank (VLSP).
Word and phrase processing.
THANK YOU FOR YOUR ATTENTION