Date posted: 13-Apr-2017
Category: Engineering
Uploaded by: yun-hao
Tutorial on Opinion Mining and Sentiment Analysis
by Rezvaneh Rezapour (rezapou2) and Yun Hao (yunhao2)
Prepared as an assignment for CS410: Text Information Systems in Spring 2016
Introduction
• How do you choose a movie to watch?
• How do you pick a restaurant or hotel?
• How do you decide which camera to buy?
Fig 1
Introduction
Fig 2
Fig 3 and Fig 4
Fig 5
Introduction
• People like to share their experiences or opinions about a place, event or product with others.
• Numerous Web sites and pages contain consumer opinions; for example, Amazon and IMDB are valuable sources of reviews for finding others' opinions.
• This online word-of-mouth behavior represents new and measurable sources of information. [10]
• But there is far too much of it to read!
Motivation
• What do we need?
  – Study and extract useful information from individuals' reviews.
• Why is it helpful?
  – Save time
  – Help to find good and bad features
  – Help to find positive and negative points
Opinion Mining
Sentiment Analysis
Opinion Mining
• Definition
  – Given a set of text documents (T) that contain opinions on an object, opinion mining aims to identify the attributes of the object on which opinions have been given in each document, and to find the orientation of the comments, i.e. whether they are positive or negative. [8]
Opinion Mining (cont.)
• Some terms are often used interchangeably for opinion mining.
Fig 6 Synonyms of Opinion Mining [8]
Components of Opinion Mining Model
• Question: What do we want to extract from a review?
  – Positive and negative opinions
  – Target of the opinions: entity
  – Related set of components: aspect
  – Related attributes: aspect
  – Sometimes the opinion holder: opinion source
• Example: entity iPhone; aspects Battery and Voice quality
Components of Opinion Mining Model (cont.)
• Diagram labels: Object, Features, Opinion Holder, Opinion Passage on a Feature
Opinions
• Regular: usually referred to simply as an opinion
  – Positive or negative sentiment, attitude or appraisal about an entity or aspect
• Comparative:
  – Expresses a relation of similarities or differences between two or more entities
  – Reflects the preference of the opinion holder based on shared aspects
  – Usually contains comparative or superlative adjectives or adverbs
  – Requires first identifying the objects being compared, the features being compared, and the preference expressed by the comparison [8]
Subjectivity and Emotion
• Objective sentences present factual information
• Subjective sentences present feelings and beliefs
• Emotions are subjective feelings and thoughts
• Some sentences express no emotion or opinion
Opinion Summary
• Aspect based:
  – Highlight important parts of the reviews
  – Produce a short text summary
Phone 1:
Aspect: General
Positive: 5 <Sentence>
Negative: 3 <Sentence>
Aspect: Battery
Positive: 20 <Sentence>
Negative: 5 <Sentence>
Pros: Easy to read and understand
Cons: Very qualitative
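A minimal sketch of how a summary in the layout above could be assembled, assuming every review sentence has already been labeled with an aspect and a polarity. The function names `summarize_by_aspect` and `format_summary` and the sample sentences are hypothetical illustrations, not part of the tutorial.

```python
from collections import defaultdict

def summarize_by_aspect(labeled_sentences):
    """Group (aspect, polarity, sentence) triples into per-aspect sentence lists."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for aspect, polarity, sentence in labeled_sentences:
        if polarity in ("positive", "negative"):  # neutral sentences are ignored
            summary[aspect][polarity].append(sentence)
    return dict(summary)

def format_summary(name, summary):
    """Render the counts plus one example sentence per aspect and polarity."""
    lines = [f"{name}:"]
    for aspect, groups in summary.items():
        lines.append(f"Aspect: {aspect}")
        for polarity in ("positive", "negative"):
            sentences = groups[polarity]
            example = sentences[0] if sentences else "-"
            lines.append(f"{polarity.capitalize()}: {len(sentences)} <{example}>")
    return "\n".join(lines)

# Hypothetical, already-labeled input.
labeled = [
    ("General", "positive", "Nice phone overall."),
    ("Battery", "positive", "The battery lasts two days."),
    ("Battery", "negative", "Charging is slow."),
    ("Battery", "positive", "Great battery life."),
]
summary = summarize_by_aspect(labeled)
print(format_summary("Phone 1", summary))
```

The hard part in practice is producing the aspect and polarity labels; this sketch only covers the final aggregation step.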
Challenges and Issues
• Challenges
  – Relevant objects vs irrelevant ones
  – Same feature expressed in different wordings
  – Words that can be positive or negative in different contexts
  – Long text that can contain both positive and negative opinions
  – Detecting opinion-oriented sentences
  – Integrating the tasks above
• Some other issues
  – Identifying comparison words
  – Dealing with the different writing styles of different people
  – Tracking changing opinions
  – Measuring the strength of opinions
  – Tackling sarcastic statements and mixed views
  – Spam opinions
Sentiment Classification
• Unit of analysis:
  – Sentence
  – Document
• Methods:
  – Supervised
  – Semi-supervised
  – Unsupervised
Getting Entity and Opinion
• Create structured text from reviews
  – Extract object features and opinions
  – Determine the sentiment polarity of every opinion
  – Determine the relevant opinions for each object feature
• Method:
  – Use Conditional Random Fields (CRFs)
    • Linear CRFs; compute the MAP (maximum a posteriori) labeling
  – Leverage conjunction structure and syntactic tree structure, and integrate them both.
Getting Entity and Opinion (cont.)
• What features?
  – Token, lemma, part of speech
  – Expand each word by getting synonyms and antonyms from WordNet
  – Use SentiWordNet to get the prior polarity
• Create your baseline
  – Rule-based methods
  – Lexicon-based methods
• Finding of the related paper [2]:
  – The framework proposed in the paper outperformed many state-of-the-art methods.
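The token-level features and the lexicon-based baseline listed above can be sketched as follows. This is an illustration under stated assumptions: the tiny `PRIOR_POLARITY` dictionary is a hypothetical stand-in for SentiWordNet scores, the POS tags are supplied by hand, lemmas and WordNet expansion are omitted, and the function names are made up for this example. It is not the framework of [2].

```python
# Toy prior-polarity lexicon; a hypothetical stand-in for SentiWordNet scores.
PRIOR_POLARITY = {"great": 1.0, "good": 0.5, "poor": -0.5, "terrible": -1.0}

def token_features(tokens, pos_tags, i):
    """Feature dict for token i, in the style usually fed to a linear-chain CRF."""
    word = tokens[i].lower()
    return {
        "token": word,
        "pos": pos_tags[i],
        "prior_polarity": PRIOR_POLARITY.get(word, 0.0),
        "prev_token": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_token": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

def lexicon_baseline(tokens):
    """Lexicon-based baseline: sum prior polarities and threshold at zero."""
    score = sum(PRIOR_POLARITY.get(t.lower(), 0.0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

tokens = ["The", "battery", "is", "great"]
pos_tags = ["DT", "NN", "VBZ", "JJ"]  # hand-assigned tags for the example
features = [token_features(tokens, pos_tags, i) for i in range(len(tokens))]
baseline_label = lexicon_baseline(tokens)
```

A list of such feature dicts per sentence is the usual input shape for CRF toolkits; the baseline gives a reference point against which a trained model can be compared.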
Using Ontology to Identify Feature
• How?
  – Use a seed set from the reviews
• Use ontology construction to:
  – Select relevant sentences that include conceptions
  – Extract the conceptions from those sentences
    • Sentences should contain conjunctions and at least one concept seed.
Using Ontology to Identify Feature (cont.)
• Feature identification:
  – Use ontology terminologies to extract features
    • Identify related sentences that contain ontology terminologies
• Polarity identification:
  – Use SentiWordNet
  – Calculate a score for positive, negative and neutral words
  – Generate an adjective lexicon with prior polarities
• Sentiment analysis:
  – Calculate the overall opinion
  – Consider negation words and conjunction words
• Finding of the related paper [3]:
  – The experiment was successful, with the following results:
    • Accuracy of feature detection: 76.9%
    • Accuracy of polarity analysis: positive 88.3%, negative 81.7%
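One simple way to take negation words and conjunctions into account when computing the overall opinion is sketched below. The rules are assumptions chosen for illustration (a negation flips the next adjective found in the lexicon, and the clause after the last "but" takes precedence when it carries sentiment); they are not the rules used in [3], and the small lexicon is hypothetical.

```python
# Hypothetical adjective lexicon with prior polarities (+1 / -1).
ADJ_POLARITY = {"good": 1, "great": 1, "funny": 1, "boring": -1, "bad": -1, "weak": -1}
NEGATIONS = {"not", "no", "never"}

def clause_score(tokens):
    """Sum adjective polarities; a negation flips the next lexicon adjective."""
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
        elif tok in ADJ_POLARITY:
            value = ADJ_POLARITY[tok]
            score += -value if negate else value
            negate = False
    return score

def sentence_polarity(sentence):
    """Overall polarity; the clause after the last 'but' wins if it is non-neutral."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    if "but" in tokens:
        last_but = len(tokens) - 1 - tokens[::-1].index("but")
        score = clause_score(tokens[last_but + 1:])
        if score == 0:
            score = clause_score(tokens[:last_but])
    else:
        score = clause_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

examples = [
    "The plot is not good.",
    "The acting is weak, but the story is great.",
    "It is a movie.",
]
labels = [sentence_polarity(s) for s in examples]
```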
Making Use of Other Features
• Hypothesis 1:
  – Users prefer reviews that satisfy their information need, that are credible, and that express mainstream opinions. [6]
• Features indicating…
  – Information need
    • Does the review satisfy users' information need?
  – Credibility
    • Is the review credible enough?
  – Bias
    • Is the review one of the mainstream ones?
• How to quantify these features?
Making Use of Other Features (cont.)
• How to quantify these features? [6]
  – Information need
    • Capture rate: the ratio of words describing product attributes and functions that are mentioned in the content of a review
  – Credibility
    • According to psychological theory, reliable writers often use the past and perfect tenses in their writing.
    • Measured by the percentage of volitive auxiliaries in a review and the percentage of past and perfect tenses in a review.
  – Bias
    • The most frequent star rating among the reviews for a product is considered the mainstream opinion (based on data from Amazon), and reviews that give the product the same number of stars are considered to carry the mainstream opinion.
    • The divergence of a review's rating from the mainstream opinion is calculated.
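The information-need and bias features can be approximated with a few lines of code. The sketch below rests on two assumptions: capture rate is read as the fraction of review tokens that are product attribute or function terms, and divergence is taken as the absolute difference in stars from the most frequent rating. The exact definitions in [6] may differ, and all names and data here are hypothetical.

```python
from collections import Counter

def capture_rate(review_tokens, attribute_terms):
    """Fraction of review tokens that are product attribute/function terms."""
    if not review_tokens:
        return 0.0
    hits = sum(1 for t in review_tokens if t.lower() in attribute_terms)
    return hits / len(review_tokens)

def mainstream_rating(star_ratings):
    """Most frequent star rating among all reviews of a product."""
    return Counter(star_ratings).most_common(1)[0][0]

def rating_divergence(review_stars, star_ratings):
    """Absolute distance of one review's stars from the mainstream rating."""
    return abs(review_stars - mainstream_rating(star_ratings))

# Hypothetical product data.
attribute_terms = {"battery", "screen", "camera"}
review_tokens = ["The", "battery", "and", "screen"]
all_ratings = [5, 5, 4, 1, 5, 3]

rate = capture_rate(review_tokens, attribute_terms)
mainstream = mainstream_rating(all_ratings)
divergence = rating_divergence(1, all_ratings)
```

The credibility features would additionally need a part-of-speech tagger to detect tense and auxiliaries, which is left out here.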
Making Use of Other Features (cont.)
• Hypothesis 2:
  – Reviews of reasonable length and lacking spelling and grammar errors are easy to read and thus more helpful. [7]
• Features indicating…
  – The average level of subjectivity and the range and mix of subjectivity and objectivity
  – Content readability
• How to quantify these features?
Making Use of Other Features (cont.)
• How to quantify these features? [7]
  – The average level of subjectivity and the range and mix of subjectivity and objectivity
    • The average probability of a review being subjective (objective information is information that also appears in the product description; everything else is considered subjective)
  – Content readability
    • Number of spelling mistakes within each review
    • Number of sentences, words, and characters in a review
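The readability counts above are straightforward to compute. The following is a minimal sketch that assumes a simple regex tokenizer and treats any word missing from a supplied vocabulary as a spelling mistake; the function name `readability_features` and the tiny vocabulary are hypothetical, and a real system would use a proper spell checker.

```python
import re

def readability_features(review, vocabulary):
    """Count sentences, words, characters and out-of-vocabulary words."""
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    words = re.findall(r"[A-Za-z']+", review)
    misspelled = [w for w in words if w.lower() not in vocabulary]
    return {
        "num_sentences": len(sentences),
        "num_words": len(words),
        "num_chars": len(review),
        "num_spelling_errors": len(misspelled),
    }

# Hypothetical vocabulary and review ("Battrey" is a deliberate typo).
vocabulary = {"great", "camera", "battery", "life", "is", "poor"}
review = "Great camera. Battrey life is poor!"
stats = readability_features(review, vocabulary)
```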
Making Use of Other Features (cont.)
• Hypothesis 3:
  – Customer opinions highly depend on the features of the product being reviewed. [9]
• How to learn useful features from the reviews? [9]
1. Identify the features that are relevant to consumers for a certain type of product, as well as their salience (the relative importance of the features)
  – Translate text into WordNet concepts and construct a graph with concepts as vertices and “is-a” relations as edges
  – Use semantic similarity to add new edges between similar vertices
2. Locate all the related mentions of the identified features in the reviews
3. Quantify the opinions mined from the reviews and create a corresponding numeric vector for each review
What If Opinions Are Hidden?
• Going beyond the overall rating to find the user's opinion about different aspects
• How?
  – Use Latent Aspect Rating Analysis (LARA)
• Approach:
  – Identify the major aspects and segment reviews
    • How? Bootstrapping-based algorithm guided by a few seed words describing the aspects
  – Infer aspect ratings and weights for each individual review based on the content and overall rating
    • How? A generative Latent Rating Regression (LRR) model
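The segmentation step can be illustrated with a single pass of seed-word matching. This sketch assigns each sentence to the aspect whose seed words it overlaps most; the iterations in which a bootstrapping algorithm would expand the seed sets with newly associated words are deliberately omitted. The seed words, sentences and function names are hypothetical.

```python
# Hypothetical seed words for three hotel aspects.
ASPECT_SEEDS = {
    "room": {"room", "bed", "bathroom"},
    "service": {"service", "staff", "reception"},
    "location": {"location", "downtown", "airport"},
}

def assign_aspect(sentence, seeds=ASPECT_SEEDS):
    """Return the aspect with the largest seed-word overlap, or None if no overlap."""
    tokens = set(sentence.lower().replace(".", " ").replace(",", " ").split())
    overlaps = {aspect: len(tokens & words) for aspect, words in seeds.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None

def segment_review(sentences, seeds=ASPECT_SEEDS):
    """Split a review's sentences into per-aspect segments."""
    segments = {aspect: [] for aspect in seeds}
    for sentence in sentences:
        aspect = assign_aspect(sentence, seeds)
        if aspect is not None:
            segments[aspect].append(sentence)
    return segments

review_sentences = [
    "The room was clean and the bed was comfortable.",
    "Staff at the reception were friendly.",
    "Breakfast was fine.",
]
segments = segment_review(review_sentences)
```

Sentences that match no seed word are left unassigned here; expanding the seed sets over several iterations is what lets a bootstrapping approach cover more of the text.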
What If Opinions Are Hidden? (cont.)
• LRR?
  – The overall rating is assumed to be generated from the individual aspects discussed in the review, which can be captured and weighted using a regression model.
  – After inferring the aspects and their weights, a maximum likelihood estimator (using the EM algorithm) is used to find the parameter values that maximize the probability of observing the overall ratings.
• Finding from the related paper [5]:
  – LRR worked better than the baseline algorithms in measuring aspect ratings.
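The regression idea can be written compactly. The block below is a sketch that roughly follows the formulation in [5]; the symbol names are chosen here for illustration, and details such as normalization and the estimation procedure are left out.

```latex
% s_{di}: latent rating of aspect i in review d
% W_{dij}: normalized count of word j in the segment of review d assigned to aspect i
% \beta_{ij}: sentiment weight of word j for aspect i
s_{di} = \sum_{j} \beta_{ij} W_{dij}

% r_d: observed overall rating of review d
% \alpha_{di}: weight that review d places on aspect i
r_d \sim \mathcal{N}\Big(\sum_{i=1}^{k} \alpha_{di}\, s_{di},\ \delta^2\Big),
\qquad
\alpha_d \sim \mathcal{N}(\mu, \Sigma)
```

Under this reading, the observed overall rating is a noisy weighted sum of latent aspect ratings, and the model parameters are the quantities estimated by maximum likelihood with EM.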
Can Social Context Help in Review Mining?
• What is social context?
  – The history of the reviewers and their social network interactions.
    • This information is specific to some social network websites and is not available on all of them.
    • Using textual context and social context information can be helpful in evaluating the quality of individual reviewers and reviews.
• How?
  – Construct a baseline using labeled reviews, where each review is paired with a quality label (its quality and helpfulness) that comes from manual labeling.
  – Improve the above-mentioned baseline by adding social context.
  – Use labeled data, unlabeled data and their social context information to create a semi-supervised model.
Can Social Context Help in Review Mining? (cont.)
• Features:
  – Text statistics, e.g. length of the review, average length of sentences
  – Syntactic features: number of POS tags
  – Conformity features: comparison of the review with other reviews using KL-divergence
  – Sentiment features: positive and negative words in the reviews
• Extract features and constraints from social context and add them to the model as regularization terms.
• Finding of the related paper [4]:
  – Using regularization on social context improved prediction accuracy when working with small training data.
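The conformity feature can be sketched as the KL-divergence between the word distribution of one review and that of the other reviews of the same product. The version below assumes unigram distributions with add-one smoothing over the joint vocabulary; the smoothing choice, the direction of the divergence and the function names are assumptions for illustration rather than the exact setup of [4].

```python
import math
from collections import Counter

def unigram_distribution(tokens, vocabulary, alpha=1.0):
    """Smoothed unigram distribution over a fixed vocabulary (add-alpha smoothing)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def conformity(review_tokens, other_reviews_tokens):
    """Divergence of a review from the pooled other reviews; lower means more conforming."""
    pooled = [t for review in other_reviews_tokens for t in review]
    vocabulary = set(review_tokens) | set(pooled)
    p = unigram_distribution(review_tokens, vocabulary)
    q = unigram_distribution(pooled, vocabulary)
    return kl_divergence(p, q)

# Hypothetical tokenized reviews of one product.
others = [["good", "battery"], ["good", "battery", "life"]]
typical = conformity(["good", "battery"], [["good", "battery"]])
outlier = conformity(["bad", "screen"], others)
```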
References
• Images:
Fig 1: http://www.necatidemir.com.tr/wp-content/uploads/2015/02/data-answer.jpg
Fig 2: http://image.slidesharecdn.com/fightclubnetworks-160308231653/95/fight-club-networks-18-638.jpg?cb=1457479147
Fig 3: https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTY7wEWFCPkJNq96bgRLSljZm_dOf3zY7-THpQ_315cMIF_0FwK
Fig 4: http://www.diplomatic.lv/sites/default/files/editor/why_choose_uk_insurance_direct_0.gif
Fig 5: http://www.5wconsulting.com/uploads//over-inflating-your-opinion.gif
• Papers:
[1] Liu, B. and L. Zhang (2012). A survey of opinion mining and sentiment analysis. Mining text data, Springer: 415-463.
[2] Li, F., C. Han, M. Huang, X. Zhu, Y.-J. Xia, S. Zhang and H. Yu (2010). Structure-aware review mining and summarization. Proceedings of the 23rd international conference on computational linguistics, Association for Computational Linguistics.
[3] Zhao, L. and C. Li (2009). Ontology-Based Opinion Mining for Movie. KSEM 2009, LNAI 5914, pp 204-214.
[4] Lu, Y., Tsaparas, P., Ntoulas, A., & Polanyi, L. (2010). Exploiting social context for review quality prediction. Paper presented at the Proceedings of the 19th international conference on World Wide Web.
[5] Wang, H., Lu, Y., & Zhai, C. (2010). Latent aspect rating analysis on review text data: a rating regression approach. Paper presented at the Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining.
References
• Papers (cont.):
[6] Hong, Y., Lu, J., Yao, J., Zhu, Q., & Zhou, G. (2012, August). What reviews are satisfactory: novel features for automatic helpfulness voting. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (pp. 495-504). ACM.
[7] Ghose, A., & Ipeirotis, P. G. (2011). Estimating the helpfulness and economic impact of product reviews: Mining text and reviewer characteristics. Knowledge and Data Engineering, IEEE Transactions on, 23(10), 1498-1512.
[8] Seerat, B., & Azam, F. (2012). Opinion mining: Issues and challenges (a survey). International Journal of Computer Applications, 49(9).
[9] de Albornoz, J. C., Plaza, L., Gervás, P., & Díaz, A. (2011). A joint model of feature mining and sentiment analysis for product review rating. In Advances in information retrieval (pp. 55-66). Springer Berlin Heidelberg.
[10] Liu, B., & Chen-Chuan-Chang, K. (2004). Editorial: special issue on web content mining. ACM SIGKDD Explorations Newsletter, 6(2), 1-4.