Sentiment Analysis with NVivo 11 Plus

Date post: 15-Apr-2017
Upload: shalin-hai-jew
Sentiment Analysis with NVivo 11 Plus, Summer Institute on Distance Learning and Instructional Technology (SIDLIT 2016), August 4 - 5, 2016
Transcript
Page 1: Sentiment Analysis with NVivo 11 Plus

Sentiment Analysis with NVivo 11 Plus

Summer Institute on Distance Learning and Instructional Technology (SIDLIT 2016)

August 4 - 5, 2016

Overview

• Researchers have long known that the words of a text carry more information than appears on the surface. As such, texts have been studied for subtexts and other latent or hidden information. One approach involves the machine-enabled analysis of human sentiment, usually mapped on a positive-negative polarity. NVivo 11 Plus (a qualitative research tool released in late 2015) enables the automated sentiment analysis of texts (coded research, formal articles, text corpora, Tweetstream datasets, Facebook wall posts, websites, and other sources) based on four categories: very positive, moderately positive, moderately negative, and very negative. The feature compares the target text set against a sentiment dictionary and enables coding at different units of analysis: sentence, paragraph, or cell. Further, the sentiment capability extracts the coded text into respective text sets, which may be further analyzed using text frequency counts, text searches, automated theme and sub-theme extractions (topic modeling), and data visualizations.

Why Sentiment?

Quot homines, tot sententiae (“so many men, so many opinions”)

Sentiment and its Public / Private Expression

• Sentiment may be ephemeral in some cases but may harden into a stance (and an orientation and then a disposition) depending on the strength of the sentiment and whether the sentiment is reinforced or contradicted (by those in a social network and the larger community)

• Expression of sentiment may be reinforcing (strengthening that sentiment) or cathartic (dissipating that sentiment)

• Expression in particular venues may have particular effects

• Expressed sentiment may affect recipients (of the messages) differently based on their receptivity / susceptibility to the message

Sentiment and Action

• An assumed relationship between sentiment (+ or -) and behavior / action (in the aggregate)

• Not a simple cause-and-effect

• Not simple predictivity

• Positive and negative sentiments exist; both can inspire action, but different types of action

• Positive sentiment is not always desirable; negative sentiment is not always undesirable

• Positive view may lead to complacency on an issue about which one should not be complacent

Sentiment and Action (cont.)

• High emotional intensity, sympathy, and anger as sparks to (individual or mass) kinetic action (sometimes unheeding, sometimes not formally considered)

• Communications on the Social Web are often “calls to action”

• Fund-raising

• Boycotting

• Taking part in events

• Voting

• Taking on or maintaining a certain attitude

• Taking precautions

• Co-messaging, and others

Reasons for Measuring Sentiment

• Public sentiment metrics (from sentiment analysis or opinion mining) for indicators of success / failure (and degrees in between) for media professionals, publicists, and advertisers

• Public sentiment metrics as early indicators of potential individual or mass action

• Predictivity

• Public sentiment metrics as research tool to surface latent information

Sentiment and Social Media / Opinion Data

• All expressions on social media may feel ephemeral and invisible, but are actually permanent and highly visible and findable

• In social media, sentiment is studied to understand

• Strategic messaging and trends moving through social networks [through friend of a friend (FOAF) networks and word of mouth (WOM)]

• People’s reputations and how they are trending

• Evolving people-related events and various “potential futures”

Identification of Language-Based Markers

• The idea is to find language-based indicators (markers) that may serve as shorthand for particular insights about the state of the world

• Classic question: What is “a|b”, that is, what is the state of “a” given the observation of “b”?

• The next step is how these observations of the world may be used to inform decision-making and actions
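
The “a|b” question above can be read as a conditional probability. A minimal sketch, assuming a small set of invented observations in which a language marker is either present or absent and the state of the world is recorded alongside it (the data, the labels, and the function name are all hypothetical):

```python
from collections import Counter

# Hypothetical observations: (marker_present, observed_state) pairs,
# invented purely for illustration.
observations = [
    (True, "unrest"), (True, "unrest"), (True, "calm"),
    (False, "calm"), (False, "calm"), (False, "unrest"),
]

def conditional_probability(pairs, state, marker=True):
    """Estimate P(state | marker) as count(state and marker) / count(marker)."""
    states_with_marker = [s for m, s in pairs if m == marker]
    if not states_with_marker:
        return None  # the marker was never observed, so no estimate is possible
    return Counter(states_with_marker)[state] / len(states_with_marker)

p_unrest_given_marker = conditional_probability(observations, "unrest")
print(p_unrest_given_marker)
```

With so few observations the estimate is only a rough indicator; a real marker study would need far more data and external validation.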

Positive-Negative Polarity…or Categories along Positive-Negative Continuum

Computational Sentiment Analysis

• Conceptualized as a positive-negative polarity

• Binary conceptualization as positive, negative, or neutral

• Continuum conceptualization as degrees of sentiment

• In NVivo 11 Plus: Classifications of text as

• Very negative, moderately negative, moderately positive, very positive (and “neutral” implied by non-inclusion in the coded sentiment set)

• Understandings of general tendencies in a text set

• Access to the autocoded text set for each of the categories

• Understandings of granular features of the extracted text sets

• Spinoff research from extracted text sets possible, such as text counts, word searches, and others
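
The relationship between a continuum score, the four categories, and the binary view can be sketched as below. The score range, the thresholds, and the function names are assumptions made for illustration; how NVivo 11 Plus sets its own category boundaries is not described in this presentation.

```python
def categorize(score, strong=0.5, neutral_band=0.05):
    """Map a polarity score in [-1, 1] to one of four labels, or None for neutral.

    The thresholds are hypothetical and chosen only to illustrate the idea.
    """
    if abs(score) <= neutral_band:
        return None  # neutral: implied by non-inclusion in the coded set
    if score > 0:
        return "very positive" if score >= strong else "moderately positive"
    return "very negative" if score <= -strong else "moderately negative"

def to_binary(label):
    """Collapse a four-category label into the binary positive / negative view."""
    if label is None:
        return None
    return "positive" if label.endswith("positive") else "negative"

for score in (0.8, 0.2, 0.0, -0.2, -0.9):
    label = categorize(score)
    print(score, label, to_binary(label))
```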

Various Methods

• Pre-coding a word set from a target language -> comparing text sets against that sentiment set or sentiment dictionary

• Usually focused around semantic-bearing terms (and not so much function or syntax terms)

• Using a customized sentiment dictionary for specialized text sets (such as Tweets or posts or microblogging messages on social media)

• Translating text from its source language into a target language and then using that target language’s sentiment dictionary to code the text
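
The dictionary-comparison method in the first bullet can be sketched in a few lines. This is a toy illustration, not NVivo's implementation: the words, the weights, and the function names are invented, and a real sentiment dictionary would be far larger.

```python
import re

# Toy sentiment dictionary: hypothetical words and weights (sign = direction,
# magnitude = degree).
SENTIMENT_WEIGHTS = {"excellent": 2, "good": 1, "poor": -1, "terrible": -2}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def score_text(text, weights=SENTIMENT_WEIGHTS):
    """Return (total score, list of matched (word, weight) pairs)."""
    hits = [(word, weights[word]) for word in tokenize(text) if word in weights]
    return sum(weight for _, weight in hits), hits

total, hits = score_text("The service was good but the food was terrible.")
print(total, hits)
```

Only the semantic-bearing words in the dictionary contribute; function words such as “the” and “was” are ignored because they are simply absent from the dictionary.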

Various Methods (cont.)

• More sophisticated consideration of negation, irony, humor, sarcasm, and longer phrases (n-gram sequences, thought units vs. single words) for nuanced sentiment labeling

• Manual labeling with XML tagging and running queries based on the manual labeling

• Going with bag-of-words vs. structure-preserving sentiment approaches
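
To show why negation handling matters, the sketch below compares a plain single-word score with one that flips the sign of a sentiment word directly preceded by a negator. The word lists and the one-word look-back rule are simplifying assumptions; real classifiers use richer context.

```python
import re

WEIGHTS = {"good": 1, "bad": -1}    # hypothetical sentiment weights
NEGATORS = {"not", "never", "no"}   # hypothetical negation words

def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def unigram_score(text):
    """Sum word weights, ignoring context entirely."""
    return sum(WEIGHTS.get(token, 0) for token in tokens(text))

def negation_aware_score(text):
    """Sum word weights, flipping a word's sign when a negator precedes it."""
    words = tokens(text)
    total = 0
    for i, word in enumerate(words):
        weight = WEIGHTS.get(word, 0)
        if weight and i > 0 and words[i - 1] in NEGATORS:
            weight = -weight
        total += weight
    return total

sentence = "The plot was not good"
print(unigram_score(sentence), negation_aware_score(sentence))
```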

NVivo 11 Plus

• A qualitative and mixed methods data analysis tool

• Enables the curation of multimedia data in an unstructured / semi-structured dataset

• Enables manual coding of multimedia data

• Enables the running of data queries against text versions of all data in a project

• Enables autocoding of text for sentiment, theme and sub-theme extraction, and unique human coding (“autocoding by existing pattern”)

• Enables drawing of various types of data visualizations related to the data handled: word clouds, word trees, treemaps, dendrograms, cluster diagrams (2D and 3D), sociograms, geographical maps, and others

A Walk-through of the Sentiment Analysis Tool Use

• Collection of target texts

• Data pre-processing or data cleaning

• Ingestion into an NVivo 11 Plus project

• Single or combined text corpus (different results depending on treatment of the text)

• Preferable to have both versions, single texts and a combined corpus of those texts, for different types of questions and different types of processing

A Walk-through of the Sentiment Analysis Tool Use (cont.)

• May code at the level of sentences, paragraphs, or cells (level of granularity), depending also in part on how the textual data is structured (Tweets are not sentences and are coded as cells in the extracted data tables, for example; unpunctuated sentences will not be read as sentences by the software tool, etc.); other sentiment analysis approaches code at various levels of n-grams

• Documents coded at sentence level will result in more codes than those coded at paragraph level because of the finer granularity

• Documents coded at paragraph level will result in coarser coding

• Coding at cell level may only be applied to table data (which is how microblogging and social network post data is collected and ingested; also, online survey data may often be output as table data)
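
The effect of the unit of analysis can be illustrated with a naive splitter. The splitting rules below are assumptions for illustration only (sentence boundaries at ., !, or ? followed by whitespace; paragraphs at blank lines); they are not the rules NVivo applies.

```python
import re

def split_paragraphs(text):
    """Split on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(text):
    """Naive rule: a sentence ends at ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

document = "I loved the course. The textbook was dull.\n\nThe labs were great!"
paragraphs = split_paragraphs(document)
sentences = split_sentences(document)

# Table data (for example, one Tweet per row): each cell is its own unit.
cells = ["Loving the workshop so far", "Long lines at registration"]

print(len(paragraphs), len(sentences), len(cells))
print(split_sentences("no punctuation here so it stays one unit"))
```

The same document yields more units at sentence level than at paragraph level, and unpunctuated text is not split at all, which mirrors the points above.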

A Walk-through of the Sentiment Analysis Tool Use (cont.)

• Autocoding by sentiment classifier

• Coding of the text into four categories: very negative, moderately negative, moderately positive, and very positive; dividable into negative or positive categories (and neutral, which is left out)

• Core words may appear in all four sentiment categories (and even in the neutral category) but usually at differing frequencies and sometimes with different word senses

• Words are not coded in any sort of mutually exclusive way, so this can capture some of the complexity in the text (remember that the coding is at different levels: sentences, paragraphs, or cells)

A Walk-through of the Sentiment Analysis Tool Use (cont.)

• Data visualizations from the coded data outcomes: intensity matrices, bar charts, tree maps, and sunbursts

• May recode or un-code text from the labeled sentiment text in the respective nodes

A Walk-through of the Sentiment Analysis Tool Use (cont.)

• Analysis of the respective autocoded text sets

• Machine-enhanced approaches:

• Text frequency counts, text searches, matrix coding queries (as data queries);

• Theme and sub-theme extraction, sentiment analysis of extracted sentiment subsets (as autocoding);

• Exploration (as data visualizations), and others

• Human-enhanced approaches: Manual analysis through “close reading” (vs. machine-based “distant reading”) of the labeled texts

• Cross-comparisons

• External validations

Some Sentiment Tool Capabilities

• Comparison of documents and text corpora against a built-in pre-coded sentiment dictionary

• Inherency of intrinsic attractiveness (positive valence) or aversiveness (negative valence) embodied in language

• Dictionary words weighted based on degree and direction of sentiment

• Apparently focused on single words (unigrams) only

• Other, more complex sentiment classifiers are built on bigrams and some trigrams

• Not able to consider double-negatives (e.g. “not unheard of”)

• No confidence measure: p(y|x) or the “probability of y given x” where y is the sentiment classification and x is the input sentence, paragraph, or cell text

• An inferred confidence based on human oversight of the autocoded sentiments
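
One way to make the “inferred confidence” concrete is to have a human review a sample of autocoded units and compute, per category, the share of units the reviewer left unchanged. The sketch below assumes such a reviewed sample exists; the pairs and the function name are invented for illustration, and the result is an agreement rate, not a model probability p(y|x).

```python
from collections import defaultdict

# Hypothetical review sample: (autocoded label, label assigned by a human reviewer).
reviewed_sample = [
    ("very positive", "very positive"),
    ("very positive", "moderately positive"),
    ("moderately negative", "moderately negative"),
    ("moderately negative", "moderately negative"),
    ("very negative", "moderately negative"),
    ("very negative", "very negative"),
]

def agreement_by_category(pairs):
    """Share of autocoded units in each category that the reviewer left unchanged."""
    totals = defaultdict(int)
    agreed = defaultdict(int)
    for auto_label, human_label in pairs:
        totals[auto_label] += 1
        if auto_label == human_label:
            agreed[auto_label] += 1
    return {label: agreed[label] / totals[label] for label in totals}

agreement = agreement_by_category(reviewed_sample)
for label, rate in agreement.items():
    print(label, rate)
```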

Some Sentiment Tool Capabilities (cont.)

• Can set base content languages to one of the following: Chinese (simplified), UK English, US English, French, German, Japanese, Portuguese, and Spanish

• Sentiment analyses in other languages may be based on translating the text into English and scoring it against the English sentiment dictionary, or on native-language sentiment dictionaries (the first is more likely and more common in the field).

• Interface language is separate from the base content language.

• NVivo 11 Plus projects may include any range of languages expressible in Unicode (encoded as UTF-8), but only the base language is used for the various text-based analytics and automated analytics (like sentiment); non-base-language text will need to be translated ahead of time to ensure that all languages’ sentiments are analyzed.

Some Sentiment Tool Capabilities (cont.)

• Does not capture the finer points of humor, sarcasm, idioms, slang, or irony; no accommodations for social media-speak (#hashtags, FOMO / “fear of missing out,” #TBT / “throwback Thursday,” etc.)

• Also does not capture the nuances of polysemy (multi-meaninged words), denotative vs. connotative meanings, cultural references, and word-use context

• Quantitative counts of sentiment in four categories (coded to nodes); qualitative information of text coded to the respective four categories

• Extracted text sets available for further analyses

• Need to assess the actual coded text sets

• Ability to manually uncode and recode textual data

Some Sentiment Tool Capabilities (cont.)

• Can treat sentiment coding as a binary (negative or positive) or as a four-category set (very negative, moderately negative, moderately positive, and very positive)

• Inability to see or modify pre-coded sentiment dictionary against which a text set is compared (currently)

• Also inability to create a customized dictionary for sentiment analysis at this point

• Treats the text sets not in a structured, sequential way but more as a bag of words (without the original order)
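
What is lost in a bag-of-words treatment can be seen in a two-line comparison. This is a generic illustration with invented sentences, not a description of NVivo's internals.

```python
from collections import Counter

def bag_of_words(text):
    """Reduce a text to unordered word counts."""
    return Counter(text.lower().split())

first = "the sequel was good not bad"
second = "the sequel was bad not good"

same_bag = bag_of_words(first) == bag_of_words(second)  # identical word counts
same_text = first == second                             # different word order
print(same_bag, same_text)
```

The two sentences express opposite opinions, yet their bags of words are identical, so any scoring that relies on word counts alone treats them the same.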

Some Sources of Texts and Text Sets

Formal

• Processed data

• Edited articles and books

• Human and machine-created codes

• Raw data

• Data tables

• Survey data

Informal

• Social media platforms as sources of opinion-rich data

• Tweetstream datasets

• Facebook wall posts

• Crowd-sourced encyclopedia articles (from Wikipedia)

• Websites

Three Demos

Seeding Text Sets

• Article text set

• Social media text set

• Gray literature text set

Article Text Set

Social Media Text Set

If too many “https,” … a work-around

• Many social media data captures will result in a lot of “http” references because the site refers to many other Web pages

• Automated theme extraction will result in one or two high-level categories, with one of them being “http”

• This masks what the actual themes are…so it’s important to clean the data of “http” and output a different text set for theme extraction.

• At this point, there is no direct way to change the level of theme extraction (to enable an automated bypass of “http” at the top level and to go right to the more substantive contents).
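
The cleaning step described above can be done outside NVivo before the text set is re-ingested. A minimal sketch, assuming the captured posts have been exported as plain text strings (the pattern and the function name are illustrative, and the pattern only covers links that start with http:// or https://):

```python
import re

# Matches http:// or https:// followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")

def strip_urls(text):
    """Remove URLs and collapse the leftover whitespace."""
    without_urls = URL_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", without_urls).strip()

post = "Great talk today http://t.co/abc123 #SIDLIT"
print(strip_urls(post))
```

The cleaned strings can then be saved as a separate text set for theme extraction, leaving the original capture untouched.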

Gray Literature Text Set

Some Data Visualization Types

• Intensity Matrix

• Bar Chart

• 3D Cluster Chart

• Comparison Diagram

• Heat Map

• Spider / Radar Chart

• Hierarchy Chart: Treemap

• Hierarchy Chart: Sunburst

• Circular Project Sources and Coding Layout

Post-Sentiment Capture Analytics (with Related Data Visualizations)

• Analysis of the subsetted data

• Text frequency count

• Text search

• Matrix coding query

• More sentiment analysis

• Theme and sub-theme extraction

• Word relatedness clustering
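
A text frequency count over the extracted sets, and a simple term-by-category cross-tabulation in the spirit of a matrix coding query, can be sketched as below. The coded text, the stopword list, and the function names are invented for illustration; NVivo runs these queries through its own interface.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "was", "were", "and", "is"}  # hypothetical stopword list

# Hypothetical extracted text sets, keyed by sentiment category.
coded_sets = {
    "very positive": [
        "The workshop was wonderful",
        "Wonderful speakers and a wonderful venue",
    ],
    "moderately negative": [
        "The venue was crowded",
        "The workshop was slow",
    ],
}

def word_frequencies(texts):
    """Count non-stopword tokens across a list of texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts

frequencies = {label: word_frequencies(texts) for label, texts in coded_sets.items()}

def term_by_category(term):
    """Cross-tabulate one term against the sentiment categories."""
    return {label: counts[term] for label, counts in frequencies.items()}

print(frequencies["very positive"].most_common(1))
print(term_by_category("venue"))
```

In this toy data “venue” appears in both a positive and a negative set, which illustrates the earlier point that the same core words can occur across sentiment categories.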

Questions & Comments?

For Consideration

• What informs whether you have positive or negative sentiment about something?

• Is this based on your values? Your expectations of the world? Your culture? Your upbringing?

• Is this based on experiences (whether pain or pleasure)?

• Is this a fully conscious process or one that may occur in a subconscious or even unconscious way?

• Once you have formed a sentiment about something (or even a “pre-sentiment”), how committed are you to it?

• How hard is it for you to change your mind? Why?

For Consideration (cont.)

• Between positive and negative sentiment, which one is more likely to lead you to take action? What sort of action(s)?

• What emotions lead to a sensation of pleasure? Why?

• What emotions lead to a sensation of displeasure? Why?

• Or is it a matter of intensity of emotion that moves you to action? Or surprise? (Please share a directly experienced story or two.)

• Conversely, what sort of sentiment tends to make you passive? To dissuade you from action? Why?

For Consideration (cont.)

• When you express sentiment on social media (share), does it tend to have a reinforcement effect (make you more committed to your sentiment) or a cathartic effect (make you less committed to your sentiment)? Does public expression strengthen the sentiment or weaken it? Or neither? Why?

• Does reinforcement, catharsis, or no-effect occur from expression of sentiment on social media based on the particular issue and context?

• Is there a difference (in terms of action taken) if you express the sentiment to a friend face-to-face (privately)? To a family member? Online? To a larger community? To strangers? Why?

Emotions

• Study of sentiment evolved to the study of emotions, which are

• higher dimensional…

• somewhat linked to personality…

• linked to various psychological models…

• measured using various psychometrics…and

• observable in various ways (in lab settings)

Robert Plutchik’s Wheel of Emotions

Eight Primary Emotions:

• Anger

• Fear

• Sadness

• Disgust

• Surprise

• Anticipation

• Trust

• Joy

By Machine Elf 1735

For Consideration (cont.)

• How attuned are you to your emotions?

• Emotion as motivator: What sort of emotion drives you to action? How can you manipulate that emotion in order to motivate yourself to desirable action (and to demotivate yourself from undesirable action)?

• Emotion as de-motivator: What sort of emotion drives you to passivity and inertia (even for good behaviors)? How can you motivate yourself to get past such de-motivating emotions?

Some Exercise Ideas

• Think of text sets that you see often. Identify one set. If you were to guess (hypothesize), how would this text set rank in terms of sentiment in the four categories: very negative, moderately negative, moderately positive, and very positive (or more simply in a polarity of negative-positive)?

• Why would you assume this particular distribution? (If you have a chance, run a sentiment analysis, and see what you get.)

• Much of the world’s information is spoken and shared aloud. Identify a set of spoken data. Transcribe it into written text. What sentiment distributions do you expect to see? Why? Run the sentiment analysis. What do you find?

Research Applications? Problem-solving Applications?

• What are some practical research applications of sentiment analysis in your respective research domain(s)?

• What are some practical problem-solving applications of sentiment analysis in your respective research domain(s)?

Hidden States

• Given these sentiment-based observations from natural language, what is / are the hidden state(s)?

• Hidden state of the person?

• Hidden state of people or groups or populations (collectively)?

• Hidden state of the issue? The context?

Assertability?

Enablements / Affordances

• Standalone assertions (descriptive data): This text set uses language that falls on this particular sentiment distribution (whether polarity or category).

• The respective text sets (in each sentiment category) have the following topical focuses.

• Based on the expressed sentiments, the following actions may be predicted (with a certain level of confidence).

Limitations

• There are aspects of the text sets that are not addressed based on limits of the sentiment analysis tool.

• The text sets only contain a certain amount of information. The sets are not an N of all.

• The sentiment coding is / was not overseen by humans for correction and re-coding.

Assertability? (cont.)

Enablements / Affordances

• Remote profiling (inferential): This individual or group tends to go negative (and / or) positive on these particular topics.

• Based on the expressed communications, this person may be assumed to be of a certain psychological make-up.

• Based on the expressed communications, this person may take the following actions.

Limitations

• The sentiment analysis only addresses sentiment and not the wider research into emotion and valence.

• Sentiment analysis is often studied in isolation, without the benefit of other information streams.

Assertability? (cont.)

Enablements / Affordances

• Comparative assertions (analytical): These two or more text sets (text corpora) differ in terms of sentiment in these ways…and around these particular topics / concepts. Here are some reasons why.

• Based on these differences, the following observations may be made.

Limitations

• The sentiment analysis is only run over textual data, not image-, audio-, video- or other such data. For multimedia, there should be informational and sentiment equivalencies of the multimedia data in text form.

Ways to Strengthen Sentiment Analysis

• Select the text sets strategically.

• Capture a sufficient amount of text examples.

• Pre-process the data effectively, without losing information.

• Do not over-assert beyond where the data will go.

• Bring in contextual and cultural insights to add color to the data.

Conclusion and Contact

• Dr. Shalin Hai-Jew

• iTAC, Kansas State University

• 212 Hale / Farrell Library

• [email protected]

• 785-532-5262

• All the collected datasets and visualizations were created by the presenter.

• The visualizations were created inside NVivo 11 Plus.

• The presenter has no tie to QSR International.
