Resources for Sentiment Analysis
Seminar Presentation
Sagar Ahire
133050073
IIT Bombay
02 May, 2014
Sagar Ahire (IIT Bombay) Sentiment Resources 02 May, 2014 1 / 48
Roadmap
1 Introduction
2 Sentiwordnet
3 SO-CAL
4 Wordnet-Affect
5 Indian-Language Sentiwordnets
6 Conclusions
Introduction Overview
Overview
An overview of today’s presentation:
This presentation covers lexical resources for sentiment analysis.
Four resources are covered, each using a different approach for representation and creation:
Sentiwordnet, created automatically, with 3 graded scores per synset
SO-CAL, created manually, with a graded score per word
Wordnet-Affect, created semi-automatically, with affect information for each synset
Indian-Language Sentiwordnet, created by projecting the English Sentiwordnet
Introduction Sentiment Analysis
Sentiment Analysis
Sentiment Analysis: Determining the opinion expressed in a text
Approaches:
Classifier-based
Lexicon-based
Introduction Sentiment Analysis
Why Lexicon-based Approach?
The classifier-based approach has the following drawbacks:
Domain Specificity (Example: Movie reviews mentioning ‘writer’, ‘plot’, etc.) [Bro01]
Lack of Context (Example: ‘good’ vs ‘not good’ vs ‘not very good’)
The lexicon-based approach aims at solving these problems.
Introduction Sentiment Lexicons
Sentiment Lexicons
A sentiment lexicon is a sentiment database for language units of the form (lexical unit, sentiment).
Choices for lexical unit:
Word
Word sense
Phrase, etc.
Choices for sentiment:
Fixed categorization into ‘positive’ and ‘negative’
Graded sets like ‘strongly positive’, ‘mildly positive’, ‘neutral’, ‘mildly negative’, ‘strongly negative’
Score in an interval like [0, 1] or [−1,+1]
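The (lexical unit, sentiment) form can be illustrated with a small Python sketch. The class name `LexicalUnit`, the helper `lookup`, the sense identifiers, and all scores below are hypothetical and chosen only for illustration; they are not taken from any of the resources discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalUnit:
    """A lexicon key: a word or phrase, optionally tied to one word sense."""
    text: str
    sense: str = ""  # empty string means "no sense distinction"


# Word-level lexicon with scores in [-1, +1] (illustrative values only).
word_lexicon = {
    LexicalUnit("good"): 0.6,
    LexicalUnit("terrible"): -0.9,
}

# Sense-level lexicon: the same word can carry different sentiment per sense.
sense_lexicon = {
    LexicalUnit("cold", "low-temperature"): 0.0,
    LexicalUnit("cold", "unfriendly"): -0.5,
}


def lookup(lexicon, text, sense="", default=0.0):
    """Return the sentiment stored for a lexical unit, or a default if absent."""
    return lexicon.get(LexicalUnit(text, sense), default)


print(lookup(word_lexicon, "good"))
print(lookup(sense_lexicon, "cold", "unfriendly"))
```

Keying on the word sense rather than the bare word is what lets a lexicon separate the neutral and negative readings of a polysemous word such as ‘cold’.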
Introduction Sentiment Lexicons
Approaches for Creation
Manual
Automatic
Sentiwordnet
Introduction to Sentiwordnet
Sentiwordnet [ES06] is an automatically generated sentiment lexicon made using Wordnet. Its salient features are:
High coverage
Support for graded sentiment labels
Support for both sentiment classification and subjectivity detection
Sentiwordnet Structure
Structure of Sentiwordnet
Sentiwordnet = Wordnet + Sentiment Information.
Each synset s is given three sentiment scores:
Positive score Pos(s)
Negative score Neg(s)
Objective score Obj(s)
Pos(s) + Neg(s) + Obj(s) = 1
Example Synset
beautiful: Pos = 0.75, Neg = 0.00, Obj = 0.25
URL: http://sentiwordnet.isti.cnr.it/search.php?q=beautiful
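A minimal sketch of the three-score representation follows. The class name `SynsetScores` is hypothetical and this is not the Sentiwordnet file format or API. It stores only Pos and Neg and derives Obj from the constraint that the three scores sum to 1.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SynsetScores:
    """Positive and negative scores of one synset; the objective score is derived."""
    pos: float
    neg: float

    def __post_init__(self):
        # Both scores must lie in [0, 1] and leave a non-negative objective score.
        in_range = 0.0 <= self.pos <= 1.0 and 0.0 <= self.neg <= 1.0
        if not in_range or self.pos + self.neg > 1.0 + 1e-9:
            raise ValueError("scores must lie in [0, 1] with pos + neg <= 1")

    @property
    def obj(self):
        # Obj(s) = 1 - Pos(s) - Neg(s), so the three scores always sum to 1.
        return 1.0 - self.pos - self.neg


beautiful = SynsetScores(pos=0.75, neg=0.00)  # the example synset from the slide
print(beautiful.obj)
print(math.isclose(beautiful.pos + beautiful.neg + beautiful.obj, 1.0))
```

Deriving Obj instead of storing it makes the sum constraint hold by construction; a store-all-three design would need an explicit consistency check.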
Sentiwordnet Creation
Creation Steps
The top-level steps in the algorithm to create Sentiwordnet are as follows:
1 Selection of seed set
2 Expansion using Wordnet’s semantic relations
3 Training of a team of ternary classifiers
4 Classification of each Wordnet synset using the classifiers
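Steps 2 and 4 can be sketched on toy data. The relation tables below are hand-made stand-ins for Wordnet relations, the function names `expand` and `scores_from_votes` are hypothetical, and classifier training (step 3) is omitted entirely. The sketch assumes that the label votes of a classifier committee are turned into scores by taking the proportion of votes per label.

```python
from collections import Counter

# Toy relation tables (hypothetical data, not extracted from Wordnet).
SIMILAR = {"good": ["fine", "nice"], "bad": ["awful"]}
ANTONYM = {"good": ["bad"], "bad": ["good"]}


def expand(pos_seeds, neg_seeds, iterations=1):
    """Grow the seed sets: similar terms keep the label, antonyms flip it."""
    pos, neg = set(pos_seeds), set(neg_seeds)
    for _ in range(iterations):
        new_pos = {w for s in pos for w in SIMILAR.get(s, [])}
        new_pos |= {w for s in neg for w in ANTONYM.get(s, [])}
        new_neg = {w for s in neg for w in SIMILAR.get(s, [])}
        new_neg |= {w for s in pos for w in ANTONYM.get(s, [])}
        pos |= new_pos
        neg |= new_neg
    return pos, neg


def scores_from_votes(votes):
    """Turn the labels assigned by a committee of ternary classifiers into scores."""
    counts = Counter(votes)
    return {label: counts[label] / len(votes) for label in ("pos", "neg", "obj")}


print(expand({"good"}, {"bad"}))
# Six of eight hypothetical classifiers vote 'pos', two vote 'obj'.
print(scores_from_votes(["pos"] * 6 + ["obj"] * 2))
```

Because the scores are vote proportions, they sum to 1 for every synset, which matches the Pos + Neg + Obj = 1 constraint of the resource.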
SO-CAL
Introduction to SO-CAL
SO-CAL is a system that uses a manually-constructed lexicon. Its salient features are:
Highly detailed lexicon
Graded sentiment label
Low coverage, but high accuracy
SO-CAL Structure
Features Used
SO-CAL groups words into feature types and treats each type differently in the lexicon. The feature types are:
Adjectives
Nouns, Verbs, Adverbs and Multiwords
Intensifiers and Downtoners
Negation
Irrealis Blocking
SO-CAL Structure
Structure of SO-CAL
Sentiment scoring:
Words are scored in [−5,+5]
Intensifiers and negation further act upon these scores
Examples
good: +3
monstrosity: −5
masterpiece: +5
Wordnet-Affect
Introduction to Wordnet-Affect
Wordnet-Affect [SV04] is a semi-automatically generated sentiment lexicon made using Wordnet. It associates affective information with each synset. Its salient features are:
Highly detailed
Ability to handle sentiment differently depending on emotion
Wordnet-Affect Structure
Structure of Wordnet-Affect
Wordnet-Affect = Wordnet + Affect Information.
Affect is represented using the following:
An a-label which represents the emotion,
The valency which indicates the sentiment.
The a-labels form a tree of emotions starting at a root node, with each leaf node corresponding to a synset.
The valency can be any of positive, negative, neutral or ambiguous.
Wordnet-Affect Structure
root
  mental-state
    cognitive-state
    affective-state
      mood
      emotion
        positive-emotion
          joy
            elation
          love
            worship
        negative-emotion
          sadness
            melancholy
          shame
            embarrassment
      ...
    ...
  physical-state
  ...
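The hierarchy fragment above can be held as child-to-parent links, which makes it easy to recover the chain of emotions above any a-label. This is a sketch only: the nesting is one reading of the slide's diagram, the names `PARENT`, `path_from_root` and `valency` are hypothetical, and deriving valency from the `positive-emotion` and `negative-emotion` ancestors is an assumption made here rather than the resource's documented mechanism.

```python
# Child -> parent links for the fragment of the a-label tree shown on the slide.
PARENT = {
    "mental-state": "root",
    "physical-state": "root",
    "cognitive-state": "mental-state",
    "affective-state": "mental-state",
    "mood": "affective-state",
    "emotion": "affective-state",
    "positive-emotion": "emotion",
    "negative-emotion": "emotion",
    "joy": "positive-emotion",
    "love": "positive-emotion",
    "elation": "joy",
    "worship": "love",
    "sadness": "negative-emotion",
    "shame": "negative-emotion",
    "melancholy": "sadness",
    "embarrassment": "shame",
}


def path_from_root(label):
    """Return the a-labels from the root down to the given label."""
    path = [label]
    while path[-1] != "root":
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def valency(label):
    """Assumed rule: valency follows the positive/negative ancestor, else None."""
    path = path_from_root(label)
    if "positive-emotion" in path:
        return "positive"
    if "negative-emotion" in path:
        return "negative"
    return None


print(path_from_root("elation"))
print(valency("melancholy"))
```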
Wordnet-Affect Creation
Creation Steps
Wordnet-Affect was created using the following steps:
Manual creation of initial resource
Automatic expansion using Wordnet relations
Indian-Language Sentiwordnets
Introduction to Indian-Language Sentiwordnets
Indian-language Sentiwordnets can be created using Wordnet projection [JRB10]. This approach has the following salient features:
Easy to create once backing resources are available
No duplication of effort
Use of tried-and-tested representations
Indian-Language Sentiwordnets Creation
Creation Steps
The process of projecting a Sentiwordnet has the following steps:
Fetch a synset from the English Sentiwordnet.
Find the corresponding Hindi synset using Indowordnet.
Assign sentiment scores from English synset to Hindi synset.
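The three projection steps can be sketched as a dictionary walk. The synset identifiers, the mapping table and the function name `project_scores` are hypothetical placeholders; the real linkage between English and Hindi synsets comes from Indowordnet and is not reproduced here.

```python
def project_scores(english_scores, english_to_hindi):
    """Copy sentiment scores from English synsets to their linked Hindi synsets."""
    hindi_scores = {}
    for english_id, scores in english_scores.items():   # step 1: fetch a synset
        hindi_id = english_to_hindi.get(english_id)      # step 2: find its Hindi synset
        if hindi_id is None:
            continue                                     # no linked synset: skip it
        hindi_scores[hindi_id] = scores                  # step 3: assign the scores
    return hindi_scores


# Hypothetical identifiers and scores, for illustration only.
english_scores = {
    "en-001": {"pos": 0.75, "neg": 0.0, "obj": 0.25},
    "en-002": {"pos": 0.0, "neg": 0.5, "obj": 0.5},
}
english_to_hindi = {"en-001": "hi-101"}

print(project_scores(english_scores, english_to_hindi))
```

Synsets without a link are skipped, so the projected resource can only be as large as the set of linked synsets.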
Conclusions
A Comparison of the Resources
Criterion          SWN          SO-CAL      WN-Affect        IL-SWN
Sentiment          3 x [0, 1]   [−5, +5]    Affect           3 x [0, 1]
Lexical Unit       Synset       Word        Synset           Synset
Backing Resource   Wordnet      None        Wordnet          SWN + Indowordnet
Creation           Automatic    Manual      Semi-automatic   Projection
No. of Entries     117,000      5,000       900              16,000
Conclusions
Concluding Remarks
To conclude, there are three choices in making a sentiment lexicon:
Creation Approach: Manual, Automatic, Semi-Automatic or Projection
Lexical Unit: Word, Synset or Higher Representations
Sentiment: Labels, Graded Scores or Affect Information
Conclusions
Concluding Remarks: Creation Approach
Manual Approach            Automatic Approach
High annotation accuracy   Low annotation accuracy
High time investment       Low time investment
More details supported     Fewer details supported
Conclusions
Concluding Remarks: Lexical Unit
Word                                    Synset
Unreliable for polysemous words         Reliable for polysemous words
No pre-processing required              Requires word sense disambiguation (WSD)
Projection is comparatively difficult   Projection is comparatively easy
Conclusions
Concluding Remarks: Sentiment
Graded scores have been shown to be better than mere labels in general.
Moreover, a graded score resource can always be converted to a label-based resource.
Affect information can help in specialized circumstances.
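The conversion from graded scores to labels can be sketched as a simple thresholding step. The function name `to_label` and the `threshold` parameter are hypothetical; the scores reuse the SO-CAL examples from this presentation.

```python
def to_label(score, threshold=0.0):
    """Map a graded score to a coarse label; scores within the threshold are neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Graded lexicon using the SO-CAL example scores.
graded = {"good": 3, "monstrosity": -5, "masterpiece": 5}

# Label-based lexicon derived from it; the intensity information is discarded.
labels = {word: to_label(score) for word, score in graded.items()}
print(labels)
```

The conversion only goes one way: ‘good’ and ‘masterpiece’ receive the same label, so the original scores cannot be recovered from the labels.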
Conclusions
Future Work
Possible directions in the future:
Automatic resources for higher-level lexical units like phrases, trees, etc.
Manual resources for synsets
Manual lexicons for Indian languages
Techniques for building dynamic resources to incorporate ‘netspeak’ and other slang
Conclusions
References
[Bro01] Julian Brooke, A semantic approach to automatic text sentiment analysis, M.A. thesis, Stanford University, 2001.
[ES06] Andrea Esuli and Fabrizio Sebastiani, SentiWordNet: A publicly available lexical resource for opinion mining, Proceedings of the 5th Conference on Language Resources and Evaluation (LREC-06), 2006, pp. 417–422.
[Esu08] Andrea Esuli, Automatic generation of lexical resources for opinion mining: Models, algorithms and applications, Ph.D. thesis, Università di Pisa, 2008.
[Fel98] Christiane Fellbaum, Wordnet: An electronic lexical database, A Bradford Book, 1998.
[HM97] Vasileios Hatzivassiloglou and Kathleen R. McKeown, Predicting the semantic orientation of adjectives, Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, 1997, pp. 174–181.
[JRB10] Aditya Joshi, Balamurali A R, and Pushpak Bhattacharyya, A fall-back strategy for sentiment analysis in Hindi: a case study, Proceedings of ICON 2010: 8th International Conference on Natural Language Processing, Macmillan Publishers, India, 2010.
[KMMdR04] Jaap Kamps, Maarten Marx, Robert J. Mokken, and Maarten de Rijke, Using Wordnet to measure semantic orientations of adjectives, Proceedings of LREC-04, 4th International Conference on Language Resources and Evaluation, 2004, pp. 1115–1118.
[RW03] Ellen Riloff and Janyce Wiebe, Learning extraction patterns for subjective expressions, Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2003, pp. 105–112.
[SV04] Carlo Strapparava and Alessandro Valitutti, WordNet-Affect: an affective extension of WordNet, Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC-04), 2004, pp. 1083–1086.
[TL03] Peter D. Turney and Michael L. Littman, Measuring praise and criticism: Inference of semantic orientation from association, ACM Transactions on Information Systems 21 (2003), no. 4, 315–346.
Additional Slides Wordnet
Wordnet
Wordnet [Fel98] is a lexical database organized by word sense. The fundamental unit of storage is called a synset.
An Example Synset
brilliant, superb: of surpassing excellence
“a brilliant performance”; “a superb actor”
URL: http://wordnetweb.princeton.edu/perl/webwn?s=brilliant
Additional Slides Wordnet
Semantic Relations in Wordnet
Wordnet synsets are linked to each other by relations called semantic relations. Some of them are:
Antonymy
Meronymy
Hypernymy
Hyponymy
Similar to, etc.
These relations are helpful in creating the training set for classifying synsets to create Sentiwordnet.
Additional Slides Background
Sentiment Classification
Early work on automatically detecting the sentiment of a word led to today’s lexicons. This included:
Use of conjunction-separated adjectives [HM97]
PMI-based Extraction using Web Queries [TL03]
Graph Expansion using Wordnet [KMMdR04]
Classification using Wordnet Glosses [Esu08]
Additional Slides Background
Subjectivity Detection
Work that identifies whether a term is indeed subjective is necessary to filter out objective words from sentiment classification. This includes:
Adapting Wordnet Glosses to Subjectivity Detection [Esu08]
Bootstrapping Subjective Expressions from a Corpus [RW03]
Additional Slides Structure of SO-CAL
Adjectives
Adjectives were collected from a 500-document corpus and annotated with a sentiment score from −5 to +5.
Examples
good: +3
sleazy: −3
Nouns, Verbs, Adverbs, Multiwords
This was extended to other parts of speech and multiword expressions, for a total of about 5,000 words.
Examples
monstrosity: −5
masterpiece: +5
inspire: +2
funny: +2 vs. act funny: −1
Intensifiers and Downtoners
Intensifiers are words that increase sentiment intensity, while downtoners are words that reduce it. For example, extraordinarily is an intensifier and somewhat is a downtoner.
Intensifiers and downtoners are modeled as percentage modifiers.
Examples
slightly: −50%
extraordinarily: +50%
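The percentage-modifier idea can be illustrated with a minimal sketch. It assumes the modifier simply scales the base score of the word it modifies; the function name apply_modifier and the dictionary MODIFIERS are hypothetical names chosen for illustration and are not part of SO-CAL itself.

```python
# Percentage modifiers taken from the examples on the slide.
MODIFIERS = {"slightly": -50, "extraordinarily": 50}


def apply_modifier(score: float, percent: float) -> float:
    """Scale a sentiment score by a percentage modifier (e.g. +50 or -50)."""
    return score * (1 + percent / 100.0)


good = 3  # score of "good" in the lexicon
print(apply_modifier(good, MODIFIERS["extraordinarily"]))  # 4.5
print(apply_modifier(good, MODIFIERS["slightly"]))         # 1.5
```

Under this assumption, "extraordinarily good" scores 4.5 and "slightly good" scores 1.5, so the modifier changes intensity without changing polarity.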
Negation
Negation is modeled as a numeric shift of value 4 towards the opposite sentiment.
Examples
good: +3 ⇒ not good: −1
atrocious: −5 ⇒ not atrocious: −1
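The shift can be written out as a small sketch. The function name negate and the handling of a zero score (left unchanged) are assumptions made for illustration; only the shift value of 4 and the two examples come from the slide.

```python
NEGATION_SHIFT = 4  # shift value stated on the slide


def negate(score: float, shift: float = NEGATION_SHIFT) -> float:
    """Shift a score by a fixed amount towards the opposite polarity."""
    if score > 0:
        return score - shift
    if score < 0:
        return score + shift
    return score  # assumption: a neutral score stays neutral


print(negate(3))   # -1  (not good)
print(negate(-5))  # -1  (not atrocious)
```

Shifting instead of flipping the sign keeps "not atrocious" mildly negative, where a sign flip would have made it as positive as "masterpiece".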
Irrealis Blocking
An irrealis marker is a word or other cue indicating that the sentiment may not be reliable because the event hasn’t actually happened. Examples include ‘would’, ‘expect’, ‘if’, and quotation marks.
Sentences with irrealis markers are ignored for sentiment analysis.
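A rough sketch of such a filter is shown here. It assumes exact lowercase token matching against the three word markers named above plus a check for quotation characters; the names IRREALIS_MARKERS, QUOTE_CHARS, has_irrealis and filter_sentences are hypothetical, and a real marker list would be longer and would handle inflected forms.

```python
import re

# Word markers named on the slide; the full SO-CAL list is not given here.
IRREALIS_MARKERS = {"would", "expect", "if"}
# Straight and curly double quotes (an assumption about what counts as a quote).
QUOTE_CHARS = {'"', "\u201c", "\u201d"}


def has_irrealis(sentence: str) -> bool:
    """Return True if the sentence contains a marker word or a quotation mark."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if any(token in IRREALIS_MARKERS for token in tokens):
        return True
    return any(char in QUOTE_CHARS for char in sentence)


def filter_sentences(sentences):
    """Keep only the sentences that carry no irrealis marker."""
    return [s for s in sentences if not has_irrealis(s)]


reviews = [
    "The plot is good.",
    "I would have liked a better ending.",
    'He called it "brilliant".',
]
print(filter_sentences(reviews))  # ['The plot is good.']
```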
Additional Slides Sentiwordnet Creation
Seed Set
Two seed sets are created:
Lp for positive synsets
Ln for negative synsets
Each synset representation consists of:
The terms
The definition
The sample phrases
Explicit indication of negation
Wordnet Expansion
Relations of Wordnet used for expansion:
Direct antonymy
Similarity
Derived from
Pertains to
Attribute
Also see
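The expansion step can be sketched on a toy graph. This is only an illustration under stated assumptions: direct antonymy is assumed to move a synset to the opposite seed set, all other listed relations are assumed to preserve polarity, and plain words stand in for synsets. The edge list and all function names are hypothetical.

```python
# Assumption: these relations keep polarity, direct antonymy flips it.
PRESERVING = {"similarity", "derived_from", "pertains_to", "attribute", "also_see"}
FLIPPING = {"antonym"}


def expand_once(pos, neg, edges):
    """Run one iteration of expansion over (source, relation, target) edges."""
    new_pos, new_neg = set(pos), set(neg)
    for source, relation, target in edges:
        if relation in PRESERVING:
            if source in pos:
                new_pos.add(target)
            if source in neg:
                new_neg.add(target)
        elif relation in FLIPPING:
            if source in pos:
                new_neg.add(target)
            if source in neg:
                new_pos.add(target)
    return new_pos, new_neg


def expand(pos, neg, edges, iterations):
    """Apply expand_once the requested number of times."""
    for _ in range(iterations):
        pos, neg = expand_once(pos, neg, edges)
    return pos, neg


# Toy data, not taken from Wordnet.
toy_edges = [
    ("good", "similarity", "nice"),
    ("good", "antonym", "bad"),
    ("bad", "similarity", "awful"),
    ("nice", "also_see", "pleasant"),
]
lp, ln = expand({"good"}, set(), toy_edges, iterations=2)
print(sorted(lp))  # ['good', 'nice', 'pleasant']
print(sorted(ln))  # ['awful', 'bad']
```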
Classifiers
8 classifiers were created differing in:
Number of iterations of expansion (0, 2, 4, 6)
Learning algorithm (SVM, Rocchio)
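The count of 8 follows from crossing the 4 iteration settings with the 2 learning algorithms. A short sketch that enumerates the grid, with hypothetical variable names:

```python
from itertools import product

ITERATIONS = (0, 2, 4, 6)      # iterations of expansion
LEARNERS = ("SVM", "Rocchio")  # learning algorithms

# Every (iterations, learner) pair is one classifier configuration.
configs = list(product(ITERATIONS, LEARNERS))
for iterations, learner in configs:
    print(iterations, learner)
print(len(configs))  # 8
```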
Each ternary classifier is a sum of 2 binary classifiers:
Positive vs. Not Positive
Negative vs. Not Negative
The results are combined as:
               | Positive  | Not Positive
Negative       | Objective | Negative
Not Negative   | Positive  | Objective
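The combination rule can be expressed as a small function. This is a sketch that assumes each binary classifier returns a boolean decision; the name combine is hypothetical.

```python
def combine(is_positive: bool, is_negative: bool) -> str:
    """Merge two binary decisions into one ternary label."""
    if is_positive and not is_negative:
        return "Positive"
    if is_negative and not is_positive:
        return "Negative"
    # Both classifiers fire, or neither does: treat the synset as objective.
    return "Objective"


print(combine(True, False))   # Positive
print(combine(False, True))   # Negative
print(combine(True, True))    # Objective
print(combine(False, False))  # Objective
```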