Page 1: Aspect and Sentiment Unification Model

Aspect and Sentiment Unification Model
ACM Web Search and Data Mining 2011

Yohan Jo & Alice Oh
[email protected]
Users & Information Lab
KAIST

December 2010

Wednesday, December 1, 2010

Page 2: Aspect and Sentiment Unification Model

Our Research

• KAIST: major research and undergrad/graduate education in Korea

• KAIST CS has 49 full-time tenure-track faculty

• Research at Users & Information Lab

• Topic modeling: LDA, HDP and their variants

• Sentiment analysis of reviews, Twitter, and other user-generated content

• We welcome collaborations and discussions: email [email protected]



Page 4: Aspect and Sentiment Unification Model

Problem: Unstructured reviews


Page 5: Aspect and Sentiment Unification Model

These aspects and aspect-specific sentiments are available on some Web sites for some of the products.


Page 6: Aspect and Sentiment Unification Model


Can we automatically find and analyze the relevant attributes and the aspect-specific sentiments?



Page 9: Aspect and Sentiment Unification Model

Overview of Talk

• Introduction to Topic Models

• LDA: Latent Dirichlet Allocation

• Aspect and sentiment in review data

• ASUM: Aspect and Sentiment Unification Model

• Experiments and results

• Review data

• Twitter data


Page 10: Aspect and Sentiment Unification Model

Topic Models

Slides from David Blei (Princeton University)
http://www.cs.princeton.edu/~blei/blei-meetup.pdf

A great tutorial by David Blei on videolectures.net
http://videolectures.net/mlss09uk_blei_tm/



Page 13: Aspect and Sentiment Unification Model

Latent Dirichlet Allocation (Blei, Ng, and Jordan, JMLR 2003)

1. Basic Assumption
2. Generative Process
3. Inference
4. Graphical Representation
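LDA's generative process (draw a topic distribution for the document, then for each word draw a topic and then a word from that topic) can be sketched with the standard library alone. This is an illustrative toy, not code from the talk: the two topics, their probabilities, and all parameter values are made up.

```python
import random

def sample_dirichlet(alphas, rng):
    # Draw from a Dirichlet by normalizing independent Gamma draws.
    xs = [rng.gammavariate(a, 1.0) for a in alphas]
    s = sum(xs)
    return [x / s for x in xs]

def sample_categorical(probs, items, rng):
    # Inverse-CDF sampling from a discrete distribution.
    r, acc = rng.random(), 0.0
    for p, item in zip(probs, items):
        acc += p
        if r <= acc:
            return item
    return items[-1]

def generate_document(topics, alpha, n_words, rng):
    # 1. Draw the document's topic distribution: theta ~ Dirichlet(alpha).
    theta = sample_dirichlet([alpha] * len(topics), rng)
    doc = []
    for _ in range(n_words):
        # 2a. Draw a topic assignment: z ~ Multinomial(theta).
        z = sample_categorical(theta, list(range(len(topics))), rng)
        # 2b. Draw a word from the chosen topic's word distribution.
        words = list(topics[z])
        probs = [topics[z][w] for w in words]
        doc.append(sample_categorical(probs, words, rng))
    return doc

# Toy topics: each topic is a multinomial over the vocabulary.
topics = [
    {"nascar": 0.5, "races": 0.3, "track": 0.2},
    {"economic": 0.4, "recession": 0.4, "sales": 0.2},
]
rng = random.Random(0)
print(generate_document(topics, alpha=0.1, n_words=8, rng=rng))
```

With a small alpha (0.1), theta concentrates on few topics, so a generated document tends to stay on one theme, which mirrors the slide's NASCAR example.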


Page 14: Aspect and Sentiment Unification Model

http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html?hp

nascar, races, track, raceway, race, cars, fuel, auto, racing

economic, slowdown, sales, recession, costs, spending, save

fans, spectators, sports, leagues, teams, competition


Page 15: Aspect and Sentiment Unification Model

Topics: multinomial over words

Topic Distributions

Page 20: Aspect and Sentiment Unification Model

Graphical Representation of LDA

(Figure: the example document's words, its topic distributions, and the topics, each a multinomial over words)

Page 21: Aspect and Sentiment Unification Model

Input to LDA

http://www.nytimes.com/2010/08/09/sports/autoracing/09nascar.html

Page 23: Aspect and Sentiment Unification Model

Topics Discovered by LDA

nascar   0.12     spending   0.09     sports    0.12
races    0.10     economic   0.07     team      0.11
cars     0.10     recession  0.06     game      0.10
racing   0.09     save       0.05     player    0.10
track    0.08     money      0.05     athlete   0.09
speed    0.06     cut        0.04     win       0.07
...               ...                 ...
money    0.002    speed      0.003    nascar    0.001

Topics: multinomial over vocabulary
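Since a topic is just a multinomial over the vocabulary, "reading" a topic amounts to sorting its word probabilities. A tiny illustrative sketch, using the toy probabilities from the racing topic above:

```python
def top_words(topic, k=3):
    # A topic is a word -> probability map; report the k most probable words.
    return sorted(topic, key=topic.get, reverse=True)[:k]

racing = {"nascar": 0.12, "races": 0.10, "cars": 0.10, "racing": 0.09,
          "track": 0.08, "speed": 0.06, "money": 0.002}
print(top_words(racing))  # ['nascar', 'races', 'cars']
```

Note that every topic assigns some (possibly tiny) probability to every word; "money" still appears in the racing topic, just with probability 0.002.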


Page 24: Aspect and Sentiment Unification Model

Topic Distributions of Documents in the Corpus

http://www.nytimes.com/2010/08/09/sports/


Page 25: Aspect and Sentiment Unification Model

Graphical View


Page 26: Aspect and Sentiment Unification Model

(Figure: the document's words are observed; the topics, each a multinomial over words, and the topic distributions are discovered)

Page 28: Aspect and Sentiment Unification Model

ASUM: Aspect Sentiment Unification Model

Goal: to uncover the intertwined semantic structure of aspects and sentiments in reviews

Yohan Jo and Alice Oh, WSDM 2011

Page 29: Aspect and Sentiment Unification Model

Problem


Page 30: Aspect and Sentiment Unification Model

Aspect

• This thing is small, and it's light, too.

• Start up and turn off time is fast.

• The low light performance is best in class, period

• The one thing I don't get is the 640X480 movie mode.


Page 31: Aspect and Sentiment Unification Model

Sentiment

• This thing is small, and it's light, too.

• Start up and turn off time is fast.

• The low light performance is best in class, period

• The one thing I don't get is the 640X480 movie mode.


Page 32: Aspect and Sentiment Unification Model

Sentiment Words

• affective words: love, satisfied, disappointed

• general evaluative words: best, excellent, bad

• aspect-specific evaluative words: small, cold, long


Page 33: Aspect and Sentiment Unification Model

Sentiment Words


This camera is small. Beer was cold. The wine list is long.

The LCD is small. Pizza was cold. The wait is long.

Page 34: Aspect and Sentiment Unification Model

Goal: automatically discover aspects and the corresponding sentiments in reviews

SLDA: Sentence LDA
ASUM: Aspect and Sentiment Unification Model

24,184 amazon reviews

7 product categories

27,458 yelp reviews

4 cities

320 restaurants

12 sentences per review (ave)


Page 35: Aspect and Sentiment Unification Model

Observation

• This thing is small, and it's light, too.

• Start up and turn off time is fast.

• The low light performance is best in class, period

• The one thing I don't get is the 640X480 movie mode.


Page 36: Aspect and Sentiment Unification Model

One sentence describes one aspect

Page 37: Aspect and Sentiment Unification Model

LDA assumption: each word represents one aspect

Page 38: Aspect and Sentiment Unification Model

LDA vs SLDA

(Figure 2(a): the SLDA plate diagram with nodes α, θ, z, w, φ, β over plates D, M, N, T. Figure 2(b): the ASUM plate diagram, which adds γ, π, s and the plate S.)

Figure 2: Graphical representation of SLDA and ASUM. A node represents a random variable, an edge represents dependency, and a plate represents replication. A shaded node is observable and an unshaded node is not observable.

and a sentiment. For ASUM, in contrast, a pair of a topic and a sentiment is represented as a single language model, where a word is more probable as it is more closely related to both the topic and the sentiment. This provides a sound explanation of how much a word is related to a certain topic and sentiment.

The Multi-Aspect Sentiment (MAS) model [19] differs from the other models in that it focuses on modeling topics to match a set of pre-defined aspects that are explicitly rated by users in reviews. Sentiment is modeled as a probability distribution over different sentiments for each pre-defined aspect, and this distribution is derived from a weighted combination of topics and words. To fit the topics and sentiment to the aspects and their ratings, it requires training data that are rated by users for each aspect. ASUM does not use any user-rated training data, which are often expensive to obtain.

The Joint Sentiment/Topic (JST) model [13] takes the most similar approach to ours. Sentiment is integrated with a topic in a single language model. JST differs from ASUM in that individual words may come from different language models. In contrast, ASUM constrains the words in a single sentence to come from the same language model, so that each of the inferred language models is more focused on the regional co-occurrences of the words in a document. Both JST and ASUM make use of a small seed set of sentiment words, but the exploitation is not explicitly modeled in JST. ASUM integrates the seed words into the generative process, and this provides ASUM with a more stable statistical foundation.

It is difficult to compare the aspects and sentiments found by the different models. Although sentiment classification is not the main goal of the models, we carried out sentiment classification for a quantitative comparison. The results, presented in Section 6.4, show that ASUM outperforms TSM and JST.

4. MODELS

We propose two generative models that extend traditional topic models. Our goal is to discover topics that match the aspects discussed in reviews.

4.1 Sentence-LDA

In LDA, the positions of the words are neglected for topic inference. As discussed in previous work [22], this property may not always be appropriate. In reviews, words from an aspect tend to co-occur within close proximity to one another. SLDA imposes a constraint that all words in a sentence should be generated from one topic. The plate notation of SLDA is shown in Figure 2(a), and the meanings of the notations are summarized in Table 1.

Table 1: Meanings of the notations used in the models

D                        the number of reviews
M                        the number of sentences
N                        the number of words
T                        the number of aspects
S                        the number of sentiments
V                        the vocabulary size
w                        word
z                        aspect
s                        sentiment
φ                        multinomial distribution over words
θ                        multinomial distribution over aspects
π                        multinomial distribution over sentiments
α(k)                     Dirichlet prior vector for θ
β(w), β_j(w)             Dirichlet prior vector for φ
γ(j)                     Dirichlet prior vector for π
z_i                      the aspect of sentence i
s_i                      the sentiment of sentence i
z_−i                     the aspect assignments for all sentences except sentence i
w_i                      the word list representation of sentence i
w                        the word list representation of the corpus
M_dk, M_dk^(−i)          the number of sentences (except i) that are assigned aspect k in review d
M_dj, M_dj^(−i)          the number of sentences (except i) that are assigned sentiment j in review d
N_kw, N_kw^(−i)          the number of words (except the words in sentence i) that are assigned topic k
N_kjw, N_kjw^(−i)        the number of words (except the words in sentence i) that are assigned topic k and sentiment j

In SLDA, a review is generated as follows:

1. The review's aspect distribution is drawn. (θ ∼ Dirichlet(α))
2. For each sentence,
   (a) an aspect is chosen (z ∼ Multinomial(θ)), and
   (b) every word is generated from the word distribution of the chosen aspect (w ∼ Multinomial(φ_z)).

We use Gibbs sampling [8] to estimate the latent variables θ and φ. At each transition step of the Markov chain, z is drawn from the conditional probability

P(z_i = k | z_−i, w) ∝ [(M_dk^(−i) + α_k) / (Σ_{k′=1..T} M_dk′^(−i) + α_k′)] · [∏_{w∈w_i} β̂_w (β̂_w + 1) ⋯ (β̂_w + m_w − 1)] / [β̂_0 (β̂_0 + 1) ⋯ (β̂_0 + m_0 − 1)]

where β̂_w = N_kw^(−i) + β_w, β̂_0 = Σ_w β̂_w, m_w is the number of occurrences of word w in sentence i, and m_0 is the total number of words in sentence i.
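The conditional above can be evaluated directly. Below is a sketch, not the authors' code, assuming symmetric α and β, with β̂_w = N_kw^(−i) + β and β̂_0 summing β̂_w over the vocabulary; all counts and words are illustrative toy values.

```python
from collections import Counter

def slda_conditional(k, sentence, M_dk, alpha, N_kw, beta, vocab):
    # First factor: how prevalent aspect k already is in this review.
    # M_dk[k'] = sentences in review d assigned aspect k' (excluding sentence i).
    topic_part = (M_dk[k] + alpha) / (sum(M_dk) + len(M_dk) * alpha)
    # beta_hat_0 sums beta_hat_w = N_kw + beta over the whole vocabulary.
    beta_hat_0 = sum(N_kw[k].get(w, 0) + beta for w in vocab)
    counts = Counter(sentence)          # m_w: occurrences of w in the sentence
    m0 = len(sentence)                  # m_0: total words in the sentence
    num = 1.0
    for w, m_w in counts.items():
        b = N_kw[k].get(w, 0) + beta
        for j in range(m_w):            # rising factorial b(b+1)...(b+m_w-1)
            num *= b + j
    den = 1.0
    for j in range(m0):                 # rising factorial of beta_hat_0
        den *= beta_hat_0 + j
    return topic_part * num / den       # unnormalized P(z_i = k | ...)

vocab = ["small", "light", "battery", "screen"]
N_kw = [{"small": 5, "light": 4}, {"battery": 6, "screen": 3}]  # per-aspect word counts
M_dk = [3, 1]                                                   # per-aspect sentence counts
sentence = ["small", "light", "light"]
p0 = slda_conditional(0, sentence, M_dk, 0.1, N_kw, 0.001, vocab)
p1 = slda_conditional(1, sentence, M_dk, 0.1, N_kw, 0.001, vocab)
print(p0 > p1)  # the size/weight aspect better explains this sentence
```

Because the whole sentence is scored against one aspect at a time, a single off-topic word barely moves the result; this is the sentence-level constraint that distinguishes SLDA from word-level LDA.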


Page 39: Aspect and Sentiment Unification Model

Aspects found by SLDA: product-specific details of reviews

α: 0.1
β: 0.001

Electronics (5 aspects)                              Restaurants (2 aspects)
camera    iso       window    keyboard   laptop     park      beer
hand      card      vista     pad        ram        street    wine
feel      raw       softwar   button     processor  valet     drink
grip      imag      mac       kei        graphic    cash      glass
weight    camera    instal    mous       netbook    lot       select
size      shoot     os        touch      drive      meter     bottl
fit       nois      xp        trackpad   core       across    martini
solid     file      run       finger     game       car       tap
small     print     program   touchpad   batteri    find      mojito
bodi      pictur    driver    scroll     hp         free      margarita


Page 40: Aspect and Sentiment Unification Model

Aspect-Sentiment Unification Model


topic (LDA)

aspect (SLDA)

{sentiment, aspect} (ASUM)
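The shift from LDA/SLDA to ASUM is visible in how the word distributions are indexed: one language model per topic becomes one language model per {sentiment, aspect} pair. A minimal sketch with toy sizes (the numbers and uniform initialization are illustrative only):

```python
T, S, V = 3, 2, 5  # aspects, sentiments, vocabulary size (toy values)

# LDA / SLDA: one word distribution (phi) per topic or aspect.
phi_lda = {t: [1.0 / V] * V for t in range(T)}

# ASUM: one word distribution per (sentiment, aspect) pair, so
# "positive about battery" and "negative about battery" are
# separate language models that can rank words differently.
phi_asum = {(s, t): [1.0 / V] * V for s in range(S) for t in range(T)}

print(len(phi_lda), len(phi_asum))  # 3 6
```

The cost is S times as many language models, but each one can concentrate on words that signal both the aspect and the sentiment at once (e.g. "small" under {positive, camera body}).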



Page 42: Aspect and Sentiment Unification Model

Sentiment Seed Words

Table 3: Full list of sentiment seed words in PARADIGM and PARADIGM+. For each word set, the first line is the positive words, and the second line is the negative words. The words' order does not mean anything.

Paradigm
good, nice, excellent, positive, fortunate, correct, superior
bad, nasty, poor, negative, unfortunate, wrong, inferior

Paradigm+
good, nice, excellent, positive, fortunate, correct, superior, amazing, attractive, awesome, best, comfortable, enjoy, fantastic, favorite, fun, glad, great, happy, impressive, love, perfect, recommend, satisfied, thank, worth
bad, nasty, poor, negative, unfortunate, wrong, inferior, annoying, complain, disappointed, hate, junk, mess, not good, not like, not recommend, not worth, problem, regret, sorry, terrible, trouble, unacceptable, upset, waste, worst, worthless

the negative sentiment from the sentence. Previous work has proposed several approaches for this problem, including flipping the sentiment of a word when the word is located closely behind "not" [7]. We use simple rules to express the negation by prefixing "not" to a word that is modified by negating words, as is done in [6].
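A minimal version of such a negation rule is sketched below. The negator list and the "attach to the next token" heuristic are simplifications of what the text describes (the paper attaches "not" to the word actually modified by the negator), so treat this as illustrative:

```python
NEGATORS = {"not", "no", "never", "n't", "don't", "didn't", "isn't", "wasn't"}

def mark_negation(tokens):
    # Prefix "not_" to the token immediately following a negator, so that
    # "not good" becomes the single sentiment token "not_good".
    out, negate = [], False
    for tok in tokens:
        if tok.lower() in NEGATORS:
            negate = True
            continue
        out.append("not_" + tok if negate else tok)
        negate = False
    return out

print(mark_negation("i do not like the battery".split()))
# ['i', 'do', 'not_like', 'the', 'battery']
```

This preprocessing is what makes seed entries such as "not good" and "not worth" in Table 3 matchable as single tokens.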

5.2 Sentiment Seed Words

The sentiment information of the seed words is incorporated into ASUM. We carefully chose seed words that are not aspect-specific evaluative words, because those are assumed to be unknown. We use two sets of seed words. The first set, Paradigm, is the set of sentiment orientation paradigm words from Turney's work [21], which contains seven positive words and seven negative words. The second set, Paradigm+, is Turney's paradigm words plus other affective words and general evaluative words. The full list of the words is in Table 3.
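One way to encode seed words, consistent with the asymmetric-prior idea mentioned later in the deck, is to give a seed word of one polarity a near-zero β in the language models of the opposite polarity, so a positive seed is effectively never generated by a negative senti-aspect. This is a hypothetical sketch of that scheme; the function name and prior values are illustrative, not from the paper:

```python
def build_beta_priors(vocab, pos_seeds, neg_seeds, base=0.001, zero=1e-7):
    # Returns one beta prior map per sentiment (index 0 = positive, 1 = negative).
    # A seed word of the opposite polarity gets a near-zero prior, which
    # discourages that sentiment's language models from generating it.
    betas = []
    for sentiment in (0, 1):
        blocked = neg_seeds if sentiment == 0 else pos_seeds
        betas.append({w: (zero if w in blocked else base) for w in vocab})
    return betas

vocab = ["good", "bad", "battery", "screen"]
betas = build_beta_priors(vocab, pos_seeds={"good"}, neg_seeds={"bad"})
print(betas[0]["bad"], betas[1]["good"])  # both near zero
```

Non-seed words like "battery" keep the ordinary symmetric prior in both polarities, so the model is still free to learn their sentiment association from the data.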

6. EXPERIMENTS

We performed four experiments to evaluate our models, SLDA and ASUM. In the first experiment, we evaluate the aspects discovered by SLDA, and in the second experiment, we evaluate the senti-aspects discovered by ASUM. In the third experiment, we evaluate the sentiment words found by ASUM, and in the last experiment, we test the sentiment classification performance of ASUM.

6.1 Aspect Discovery

The first experiment is to automatically discover aspects in reviews using SLDA. We define three criteria for measuring the quality of the aspects of reviews. First, the aspects discovered should be coherent. Second, the aspects should be specific enough to capture the details in the reviews. Third, the aspects should be those that are discussed the most in the reviews. We applied SLDA to the Electronics, Restaurants, and Photography data sets and evaluated the modeling power of SLDA based on these criteria. We also compared the results with LDA to see the effect of our assumption that one sentence represents one aspect. We varied the number of aspects and found that 50 gives the best results, which we used for all the experiments in this section. We also tried various values of α and β but found that they do not much affect the quality of the results, so we use symmetric α and β set to 0.1 and 0.001, respectively. Some examples of the discovered aspects are presented in Table 4.

From Electronics, SLDA discovered aspects that are specific to the seven product categories as well as general aspects such as design, orders, and service. SLDA discovered seven aspects about laptops (OS, MacBook, peripherals, battery life, hardware, graphics, and screen), as shown in Table 4(a). Each aspect represents a specific detail of the laptop. The aspects also cover most of the important parts and features of the laptop that users often point out and discuss in laptop reviews. These aspects are representative of the 50 aspects found, most of which are closely related to the product categories. The coherent, specific, and important aspects that SLDA found would be effective for potential applications such as aspect-level sentiment summarization and retrieval.

We compared the results of SLDA with the aspects found by LDA; Table 5 presents the aspects related to cameras found by SLDA and LDA. The aspects found by SLDA, such as "grip" and "lens", are specific details that people evaluate about a camera. LDA does not capture these fine-grained aspects; rather, it finds more general aspects such as "brands" and "components". This difference stems from our assumption built into SLDA that a single sentence represents one aspect, so that the words used in a single sentence tend to form one aspect. Accordingly, the aspects discovered by SLDA tend to account for the local positions of the words, which is an appropriate property for our goal. In contrast, LDA has a broader view in which an aspect can be composed of any words in a review regardless of intra-sentential word co-occurrences. The table shows all of the aspects discovered about cameras by the two models, and we can see that SLDA captured more aspects than LDA, which again shows that SLDA is more appropriate for finding aspects in reviews.

For Restaurants, many of the SLDA aspects are related to restaurant types such as Mexican, seafood, breakfast, and dessert. The rest include parking, waiting, evaluation, and other general aspects about restaurants. Examples are presented in Table 4(b). The aspects "parking" and "waiting" are two detailed points that people often describe in restaurant reviews. LDA discovers similar aspects for Restaurants except for the last two aspects in the table, "liquors" and "interjections". In the LDA result, the top words in "liquors" are spread out across different aspects. For example, "beer" appears in an aspect related to bars, and "wine" appears in an aspect related to desserts. This shows that LDA captures more global aspects from reviews. The interjections likewise appear across various aspects for LDA: without considering sentence boundaries, the interjections did not have enough evidence to be formed into one aspect. In SLDA, on the other hand, the co-occurrences of these words within sentence boundaries cause them to form an aspect. Discovering the aspect containing words such as "yum" and "wow" is meaningful for restaurant reviews because knowing the probability of that aspect in the corpus and in each of the reviews leads to a better understanding of the reviews.

6.2 Senti-Aspect Discovery

Seed words are built into the model by setting asymmetric priors and Gibbs sampling initialization.

Page 43: Aspect and Sentiment Unification Model

w

z

θ

α

D

φ

β

T N M

(a) SLDA

w

z

θ

α

D

φ

β

γ

π

s

TS

N M

(b) ASUM

Figure 2: Graphical representation of SLDA andASUM. A node represents a random variable, anedge represents dependency, and a plate representsreplication. A shaded node is observable and an un-shaded node is not observable.

and a sentiment. For ASUM, in contrast, a pair of topic and sentiment is represented as a single language model, where a word is more probable as it is closely related to both the topic and the sentiment. This provides a sound explanation of how much a word is related to a certain topic and sentiment.

Multi-Aspect Sentiment (MAS) model [19] differs from the other models in that it focuses on modeling topics to match a set of pre-defined aspects that are explicitly rated by users in reviews. Sentiment is modeled as a probability distribution over different sentiments for each pre-defined aspect, and this distribution is derived from a weighted combination of topics and words. To fit the topics and sentiment to the aspects and their ratings, it requires training data that are rated by users for each aspect. ASUM does not use any user-rated training data, which are often expensive to obtain.

Joint Sentiment/Topic (JST) model [13] takes the most similar approach to ours. Sentiment is integrated with a topic in a single language model. JST is different from ASUM in that individual words may come from different language models. In contrast, ASUM constrains the words in a single sentence to come from the same language model, so that each of the inferred language models is more focused on the regional co-occurrences of the words in a document. Both JST and ASUM make use of a small seed set of sentiment words, but the exploitation is not explicitly modeled in JST. ASUM integrates the seed words into the generative process, and this provides ASUM a more stable statistical foundation.

It is difficult to compare the aspects and sentiments found by the different models. Although sentiment classification is not the main goal of the models, we carried out sentiment classification for a quantitative comparison. The results, presented in Section 6.4, show that ASUM outperforms TSM and JST.

4. MODELS

We propose two generative models that extend traditional topic models. Our goal is to discover topics that match the aspects discussed in reviews.

4.1 Sentence-LDA

In LDA, the positions of the words are neglected for topic

Table 1: Meanings of the notations used in the models

D : the number of reviews
M : the number of sentences
N : the number of words
T : the number of aspects
S : the number of sentiments
V : the vocabulary size
w : word
z : aspect
s : sentiment
φ : multinomial distribution over words
θ : multinomial distribution over aspects
π : multinomial distribution over sentiments
α(k) : Dirichlet prior vector for θ
β(w), βj(w) : Dirichlet prior vector for φ
γ(j) : Dirichlet prior vector for π
zi : the aspect of sentence i
si : the sentiment of sentence i
z−i : the aspect assignments for all sentences except sentence i
wi : the word list representation of sentence i
w : the word list representation of the corpus
Mdk, M(−i)dk : the number of sentences (except i) that are assigned aspect k in review d
Mdj, M(−i)dj : the number of sentences (except i) that are assigned sentiment j in review d
Nkw, N(−i)kw : the number of words (except the words in sentence i) that are assigned topic k
Nkjw, N(−i)kjw : the number of words (except the words in sentence i) that are assigned topic k and sentiment j

inference. As discussed in previous work [22], this property may not always be appropriate. In reviews, words from an aspect tend to co-occur within close proximity to one another. SLDA imposes a constraint that all words in a sentence should be generated from one topic. The plate notation of SLDA is shown in Figure 2(a) and the meanings of the notations are summarized in Table 1.

In SLDA, a review is generated as follows:

1. The review's aspect distribution is drawn. (θ ∼ Dirichlet(α))
2. For a sentence,
   (a) an aspect is chosen, (z ∼ Multinomial(θ))
   (b) every word is generated from the word distribution of the chosen aspect. (w ∼ Multinomial(φz))
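As a concrete illustration, the two-step process above can be sketched in Python. This is a minimal sketch with hypothetical sizes T, V and symmetric priors, not the implementation used in the paper; `generate_review` is an illustrative name.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T aspects, V vocabulary words.
T, V = 5, 100
alpha = np.full(T, 0.1)            # symmetric Dirichlet prior on theta
beta = np.full(V, 0.001)           # symmetric Dirichlet prior on phi
phi = rng.dirichlet(beta, size=T)  # one word distribution per aspect

def generate_review(sentence_lengths):
    """Generate one review under SLDA: one aspect per sentence."""
    theta = rng.dirichlet(alpha)                 # 1. review's aspect distribution
    review = []
    for n in sentence_lengths:
        z = rng.choice(T, p=theta)               # 2(a). aspect for the sentence
        words = rng.choice(V, size=n, p=phi[z])  # 2(b). all words from phi_z
        review.append((z, words.tolist()))
    return review

review = generate_review([4, 6, 3])
```

The key difference from LDA is visible in step 2(a): the aspect is drawn once per sentence, not once per word.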

We use Gibbs sampling [8] to estimate latent variables θ and φ. At each transition step of the Markov chain, z is drawn from the conditional probability

P(z_i = k \mid \mathbf{z}_{-i}, \mathbf{w}) \propto \frac{M_{dk}^{(-i)} + \alpha_k}{\sum_{k'=1}^{T} M_{dk'}^{(-i)} + \alpha_{k'}} \cdot \frac{\prod_{w \in \mathbf{w}_i} \hat\beta_w (\hat\beta_w + 1) \cdots (\hat\beta_w + m_w - 1)}{\hat\beta_0 (\hat\beta_0 + 1) \cdots (\hat\beta_0 + m_0 - 1)}

where \hat\beta_w = N_{kw}^{(-i)} + \beta_w and \hat\beta_0 = \sum_w \hat\beta_w, with m_w the number of occurrences of word w in sentence i and m_0 the total number of words in sentence i.

β is different for positive φ and negative φ


α : 0.1
β : 0 for negative sentiment seed words in positive senti-aspects,
    0 for positive sentiment seed words in negative senti-aspects,
    0.001 for all other words
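These asymmetric priors can be built as two vectors over the vocabulary, one per sentiment. A minimal sketch (the function name and arguments are hypothetical; a β of exactly 0 keeps a seed word from ever being generated by the opposite sentiment):

```python
import numpy as np

def build_beta_priors(vocab, pos_seeds, neg_seeds, default=0.001):
    """Asymmetric Dirichlet priors on phi for the seed-word scheme above."""
    idx = {w: i for i, w in enumerate(vocab)}
    beta_pos = np.full(len(vocab), default)   # prior for positive senti-aspects
    beta_neg = np.full(len(vocab), default)   # prior for negative senti-aspects
    for w in neg_seeds:
        beta_pos[idx[w]] = 0.0  # negative seeds excluded from positive aspects
    for w in pos_seeds:
        beta_neg[idx[w]] = 0.0  # positive seeds excluded from negative aspects
    return beta_pos, beta_neg

vocab = ["good", "bad", "screen"]
bp, bn = build_beta_priors(vocab, pos_seeds={"good"}, neg_seeds={"bad"})
```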

Sentiment Seed Words in the Model


Page 44: Aspect and Sentiment Unification Model

Senti-Aspects discovered by ASUM

contain both aspect words and sentiment words


positive senti-aspects              | negative senti-aspects
worth    screen   easi             | monei     fingerprint
monei    color    light            | save      glossi
penni    bright   carri            | notwast   magnet
extra    clear    weight           | wast      screen
well     video    lightweight      | yourself  show
everi    displai  suction          | notbui    finger
price    crisp    small            | awai      finish
dollar   great    around           | spend     print
spend    resolut  vacuum           | notworth  smudg
pai      qualiti  power            | stai      easili


Page 45: Aspect and Sentiment Unification Model

Senti-Aspects discovered by ASUM

contain both aspect words and sentiment words


positive senti-aspects      | negative senti-aspects
flavor     music           | dry      loud     cash
tender     night           | bland    tabl     onli
crispi     group           | too      convers  card
sauc       crowd           | salti    hear     credit
meat       loud            | tast     music    downsid
juici      bar             | flavor   nois     park
soft       atmospher       | meat     talk     take
perfectli  peopl           | chicken  sit      accept
veri       dinner          | bit      close    bring
moist      fun             | littl    other    wait


Page 46: Aspect and Sentiment Unification Model

aspect-specific sentiment words

discovered without using sentiment labels


Table 7: Automatically detected sentiment words. The senti-aspects discovered by ASUM were utilized to illustrate different sentiment words for the same aspect.

Screen aspect:
  Common words: screen, color, bright, displai, crisp, qualiti, sharp, clear
  Positive sentiment words: great, pictur, sound, movi, beauti, good, hd, imag, size, watch, rai, nice, crystal
  Negative sentiment words: glossi, glare, light, reflect, matt, edg, macbook, kei, black, bit, peopl, notlik, minor

Music player aspect:
  Common words: music, song, player, video, download, itun, zune, file
  Positive sentiment words: radio, listen, fm, movi, record, easi, convert, podcast, album, audio, book, librari, watch
  Negative sentiment words: problem, updat, driver, vista, system, xp, firmwar, disk, mac, hard, run, microsoft, appl

Server aspect:
  Common words: our, us, server, waiter, tabl, she, he, waitress, ask, minut, seat
  Positive sentiment words: water, glass, refil, wine, attent, friendli, brought, sat, veri, arriv, plate, help, staff, nice
  Negative sentiment words: said, me, want, card, get, tell, if, would, gui, bad, could, rude, pai, becaus, walk, then

“crust”. To express negative sentiment, they use words such as “dry”, “bland”, and “disappointed”. These two aspects were discovered in ASUM but not in SLDA, and the reason is that people express their sentiment toward these aspects very clearly. In SLDA the words that convey a sentiment toward the quality of meat appear in various cuisine-type aspects such as steak, burger, and pizza. Because people often evaluate specifically on the quality of meat, however, these words become apparent in ASUM.

6.3 Aspect-Specific Sentiment Words

The joint modeling of aspect and sentiment means ASUM finds, as top probability words in each of the senti-aspects, both aspect words and sentiment words that are dependent on the aspect. Since we start with a set of general sentiment words, this yields the effect of bootstrapping the general sentiment words to discover aspect-specific sentiment words. This is one advantage of ASUM over TSM [16], in which all topics share one sentiment word distribution.

We introduce a simple method for employing the result of ASUM to automatically distinguish between positive and negative sentiment words for the same aspect. This increases the utility of ASUM by providing an organized result that shows why people express sentiment toward an aspect and what words they use. The process is as follows:

1. Calculate the cosine similarity between every pair of senti-aspects with different sentiments.
2. If the similarity exceeds a certain threshold, the two senti-aspects are considered to represent the same aspect.
3. If a word takes a high probability in both senti-aspects, then this word is a common word.
4. If a word takes a high probability in only one senti-aspect, then this word is a sentiment word whose sentiment follows the senti-aspect.
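The four steps above can be sketched directly from the inferred word distributions. This is a minimal sketch with hypothetical function and threshold names; the paper does not specify the threshold values.

```python
import numpy as np

def match_senti_aspects(phi_pos, phi_neg, sim_threshold=0.3, top_p=0.01):
    """Pair positive/negative senti-aspects and split their top words.

    phi_pos, phi_neg : (T, V) word distributions of pos/neg senti-aspects.
    Returns (k_pos, k_neg, common, pos_only, neg_only) tuples of word ids.
    """
    results = []
    for i, p in enumerate(phi_pos):
        for j, n in enumerate(phi_neg):
            # Step 1: cosine similarity between the two word distributions.
            cos = p @ n / (np.linalg.norm(p) * np.linalg.norm(n))
            if cos < sim_threshold:
                continue  # Step 2: not the same underlying aspect
            hi_p = set(np.flatnonzero(p > top_p))
            hi_n = set(np.flatnonzero(n > top_p))
            results.append((i, j,
                            sorted(hi_p & hi_n),    # Step 3: common (aspect) words
                            sorted(hi_p - hi_n),    # Step 4: positive sentiment words
                            sorted(hi_n - hi_p)))   # Step 4: negative sentiment words
    return results
```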

We applied this method to our data sets and present the results in Table 7. For a music player, people praised the converting process, but they did not like driver and firmware updates. In the restaurant reviews, people praised waiters and waitresses for being attentive and friendly, but they complained when the servers were rude. The overall results show that ASUM discovers aspect-specific sentiment words,

Table 8: Sentiment classification by the generative models and the supervised classifiers. The number of aspects is 70 for each sentiment.

Electronics Restaurants

Baseline 0.81 0.85

LingPipe-Uni 0.71 0.81

LingPipe-Bi 0.79 0.87

ASUM 0.78 0.79

ASUM+ 0.84 0.86

JST+ 0.65 0.60

TSM+ 0.48 0.52

which can be used in applications such as review summarization.

6.4 Sentiment Classification

In this section, we present the results of sentiment classification to quantitatively evaluate the quality of senti-aspects discovered by ASUM. To determine the sentiment of a review, we use π (Equation 1), the probabilistic sentiment distribution in a review, such that a review is set to be positive if positive sentiment has an equal or higher probability than negative sentiment, and set to be negative otherwise. Both Electronics and Restaurants use the 5-star rating system, and ratings of 1 or 2 stars are treated as negative and 4 or 5 stars as positive. We do not classify the reviews with 3 stars, but they are still used to fit ASUM to the data. The hyperparameters of ASUM are set to be the same as in the experiments in Section 6.2.

We compare the performance of ASUM with JST [15], TSM [16], LingPipe [1] (Unigrams & Bigrams), and the baseline. LingPipe first separates subjective sentences from objective sentences, and then finds sentiment using word features. The baseline classifies each review according to the numbers of sentences that contain the positive and negative sentiment seed words.

The classification results are presented in Figure 3 in terms of accuracy. The baseline and LingPipe are not shown in the figure because of space, but they are shown numerically in Table 8. In all settings, ASUM outperforms the other unsupervised models and even supervised LingPipe in the same condition of unigrams. The baseline with only the seed words performs quite well, but ASUM performs even better. In general, the accuracy increases as the number of aspects increases because the models better fit the data. JST had great performance on movie reviews in the original paper [15], but did not perform well on our data. TSM is not intended for sentiment classification, and sentiment words are not adapted to aspects. In the original paper [16], TSM was used to analyze topic life cycles and sentiment dynamics.

The visualization of the assignment of senti-aspects to each sentence would help to understand and analyze the reviews. The posterior probability of the senti-aspects for each sentence was used to visualize reviews. Two examples are shown in Figure 4. The visualization shows that the sentiments were found to be quite accurate. It is worth noting that sentences that are too short are difficult to assign correct senti-aspects because they may lack strong evidence for sentiment and aspects.
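The decision rule and the star-rating labeling described above fit in a few lines. This is a minimal sketch with hypothetical function names; the tie-breaking toward positive follows the rule stated in the text.

```python
def classify_review(pi):
    """Classify a review from pi = (p_positive, p_negative), the
    probabilistic sentiment distribution inferred for the review.
    Ties go to positive, matching the rule in the text."""
    p_pos, p_neg = pi
    return "positive" if p_pos >= p_neg else "negative"

def gold_label(stars):
    """Map 5-star ratings to gold labels; 3-star reviews are not classified."""
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return None  # 3 stars: used for fitting, excluded from evaluation
```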


Page 47: Aspect and Sentiment Unification Model

senti-aspects assigned to sentences

sentiments shown in green (p), pink (n)


I was so excited about this product. I'd tasted the coffee and it was pretty good and easy and quick to make. However, this machine makes the most awful, LOUD sound while heating water. It's disturbing to hear in the morning, while others are sleeping especially! Keurig's customer service is terrible too!

The restaurant is really pretty inside and everyone who works there looks like they like it. The food is really great. I would recommend any of their seafood dishes. Come during happy hour for some great deals. The reason they aren't getting five stars is because of their parking situation. They technically don't "make" you use the valet but there's only a half dozen spots available to the immediate left.


Page 48: Aspect and Sentiment Unification Model

senti-aspects assigned to sentences

same aspect from different reviews


Parking (A46, Negative)
park, street, valet, lot, there, free, can, find, onli, if, valid, car, get, meter, your, block, hour, spot

• Parking is only validated for 3 hours.

• This place is a lol hard to see coming from 10th street and parking is limited.

• They don’t have a lot/any designated parking/complimentary valet.

• Apparently since it’s Friday the valets charge $5 to park, which I found really annoying and just found a spot on the street.


Page 49: Aspect and Sentiment Unification Model

senti-aspects assigned to sentences

same aspect from different reviews


Coffeemaker Easy (A10, Positive)
coffee, hot, maker, brew, cup, great, caraf, pot, good, fast, keep, hour, love, like, machin, warm, time, thermal, easi

• Makes coffee fast and hot

• It took us several uses to understand how much coffee to use

• And easy to use programmer for morning coffee

• Very convenient

• Guests always comment on how nice it looks and how easy it is to use


Page 50: Aspect and Sentiment Unification Model

[Figure 3 plots accuracy against the number of topics (30, 50, 70, 100) for ASUM, ASUM+, JST+, and TSM+ on (a) Electronics and (b) Restaurants.]

Figure 3: Sentiment classification results. Three unified models (ASUM, JST, TSM) are compared in the figures with two seed word sets, Paradigm and Paradigm+ ("+" indicates Paradigm+). The error bars represent the standard deviation after multiple trials.

same condition of unigrams. The baseline with only seed words performs quite well, but ASUM performs even better. In general, the accuracy increases as the number of aspects increases because the models better fit the data. However, the increase slows down for ASUM, as the additional increase of the number of aspects becomes no longer effective. JST had great performance on movie reviews in the original paper, but did not perform well on our data. TSM is not intended for sentiment classification, and sentiment words are not specialized to each aspect, but rather TSM is used to analyze topic life cycles and sentiment dynamics.

The visualization of the assignment of a senti-aspect to each sentence would help to understand the reviews. The results of ASUM after 1,000 Gibbs sampling iterations were used to visualize reviews and the senti-aspects. Some examples are shown in Figure 4. The visualization shows that the sentiments were found to be quite accurate. It is difficult to see whether the aspects found are coherent from just one example per data set. It is worth noting that some sentences do not contain strong sentiment words, but the sentiments may have been correctly assigned because our model unifies aspect and sentiment.

7. CONCLUSION

In this paper, we proposed two generative models to discover aspects in reviews. SLDA constrains that all words in a single sentence be drawn from one aspect. ASUM unifies aspects and sentiment and discovers pairs of {aspect, sentiment}, which we call senti-aspects. The aspects and senti-aspects discovered from reviews of electronic devices and restaurants show that SLDA and ASUM capture important evaluative details of the reviews. ASUM is also capable of capturing aspects that are rarely mentioned without sentiment. We showed that the senti-aspects found by ASUM can be used to illustrate the opposite sentiments toward the same aspect, which would be utilized in applications such as review summarization. In the quantitative evaluation of sentiment classification, ASUM outperformed other generative models and came close to supervised classification methods.

For future work, our models may be utilized for aspect-based review summarization. We can apply the models to other types of data such as editorials and art critiques. We can use different seed words to capture dimensions other than sentiment.

1 I was so excited about this product. {A24, p}

2 I'd tasted the coffee and it was pretty good and easy and quick to make. {A24, p}

3 HOWEVER, this machine makes the most awful, LOUD sound while heating water... {A10, n}

4 It's disturbing to hear in the morning, while others are sleeping especially! {A7, n}

5 Keurig's customer service is terrible, too!! {A27, n}

(a) Electronics

1 The restaurant is really pretty inside and everyone who works there looks like they like it. {A41, p}

2 The food is really great. {A24, p}

3 I would recommend any of their seafood dishes. {A19, p}

4 Come during happy hour for some great deals. {A25, p}

5 The reason they aren't getting five stars is because of their parking situation. {A49, n}

6 They technically don't "make" you use the valet but there's only a half dozen spots available to the immediate left. {A46, n}

(b) Restaurants

Figure 4: Visualization of reviews. The senti-aspect discovered by ASUM is annotated at the end of each sentence. (blue (darker): positive, red (lighter): negative)


Sentiment Classification Comparison

among generative models, ASUM performs best


JST: Joint Sentiment Topic Model, Lin and He, CIKM09

TSM: Topic Sentiment Mixture, Mei et al., WWW07


Page 51: Aspect and Sentiment Unification Model

tral discussions, the positive opinions, and the negative opinions about the topic (subtopic) change over time. For this purpose, we introduce two additional concepts, "topic life cycle" and "sentiment dynamics", as follows.

Definition 4 (Topic Life Cycle) A topic life cycle, also known as a theme life cycle in [16], is a time series representing the strength distribution of the neutral contents of a topic over the time line. The strength can be measured based on either the amount of text which a topic can explain [16] or the relative strength of topics in a time period [15, 17]. In this paper, we follow [16] and model the topic life cycles with the amount of document content that is generated with each topic model in different time periods.

Definition 5 (Sentiment Dynamics) The sentiment dynamics for a topic θ is a time series representing the strength distribution of a sentiment s ∈ {P, N} associated with θ. The strength can indicate how much positive/negative opinion there is about the given topic in each time period. Being consistent with topic life cycles, we model the sentiment dynamics with the amount of text associated with topic θ that is generated with each sentiment model.

Based on the concepts above, we define the major tasks of Topic-Sentiment Analysis (TSA) on weblogs as: (1) Learning General Sentiment Models: Learn a sentiment model for positive opinions and a sentiment model for negative opinions, which are general enough to be used in new unlabeled collections. (2) Extracting Topic Models and Sentiment Coverages: Given a collection of Weblog articles and the general sentiment models learnt, customize the sentiment models to this collection, extract the topic models, and extract the sentiment coverages. (3) Modeling Topic Life Cycle and Sentiment Dynamics: Model the life cycles of each topic and the dynamics of each sentiment associated with that topic in the given collection.

This problem as defined above is more challenging than many existing topic extraction tasks and sentiment classification tasks for several reasons. First, it is not immediately clear how to model topics and sentiments simultaneously with a mixture model. No existing topic extraction work [9, 1, 16, 15, 17] could extract sentiment models from text, while no sentiment classification algorithm could model a mixture of topics simultaneously. Second, it is unclear how to obtain sentiment models that are independent of specific contents of topics and can be generally applicable to any collection representing a user's ad hoc information need. Most existing sentiment classification methods overfit to the specific training data provided. Finally, computing and distinguishing topic life cycles and sentiment dynamics is also a challenging task. In the next section, we will present a unified probabilistic approach to solve these challenges.

3. A MIXTURE MODEL FOR THEME AND SENTIMENT ANALYSIS

3.1 The Generation Process

A lot of previous work has shown the effectiveness of mixtures of multinomial distributions (mixture language models) in extracting topics (themes, subtopics) from either plain text collections or contextualized collections [9, 1, 16, 15, 17, 12]. However, none of this work models topics and sentiments simultaneously; if we apply an existing topic model on the weblog articles directly, none of the topics extracted with this model could capture the positive or negative sentiment well.

To model both topics and sentiments, we also use a mixture of multinomials, but extend the model structure to include two sentiment models to naturally capture sentiments.

In the previous work [15, 17], the words in a blog article are classified into two categories: (1) common English words (e.g., "the", "a", "of") and (2) words related to a topical theme (e.g., "nano", "price", "mini" in the documents about iPod). The common English words are captured with a background component model [28, 16, 15], and the topical words are captured with topic models. In our topic-sentiment model, we extend the categories for the topical words in existing approaches. Specifically, for the words related to a topic, we further categorize them into three sub-categories: (1) words about the topic with neutral opinions (e.g., "nano", "price"); (2) words representing the positive opinions of the topic (e.g., "awesome", "love"); and (3) words representing the negative opinions about the topic (e.g., "hate", "bad"). Correspondingly, we introduce four multinomial distributions: (1) θB is a background topic model to capture common English words; (2) Θ = {θ1, ..., θk} are k topic models to capture neutral descriptions about k global subtopics in the collection; (3) θP is a positive sentiment model to capture positive opinions; and (4) θN is a negative sentiment model to capture negative opinions for all the topics in the collection.

According to this mixture model, an author would "write" a Weblog article by making the following decisions stochastically and sampling each word from the component models: (1) The author would first decide whether the word will be a common English word. If so, the word would be sampled according to θB. (2) If not, the author would then decide which of the k subtopics the word should be used to describe. (3) Once the author decides which topic the word is about, the author will further decide whether the word is used to describe the topic neutrally, positively, or negatively. (4) Let the topic picked in step (2) be the j-th topic θj. The author would finally sample a word using θj, θP or θN, according to the decision in step (3). This generation process is illustrated in Figure 2.


Figure 2: The generation process of the topic-sentiment mixture model

We now formally present the Topic-Sentiment Mixture model and the estimation of parameters based on blog data.

WWW 2007 / Track: Data Mining Session: Predictive Modeling of Web Users


Language model p(w), like φ in LDA (document-independent)

Generation process of word w:
1. Decide whether w is generated from B or the themes. If B, then choose w according to p(w|B); else:
2. Choose a theme j from which w is generated.
3. Decide whether w is generated from θj, θP, or θN.
4. Choose w from the selected θ.
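The four stochastic choices above can be sketched as a sampling function. This is a minimal sketch; the parameter names `lambda_B`, `pi_d`, and `delta` are hypothetical labels for the background, theme, and neutral/positive/negative mixing weights, not TSM's original notation.

```python
import numpy as np

rng = np.random.default_rng(1)

def tsm_generate_word(theta_B, thetas, theta_P, theta_N,
                      lambda_B, pi_d, delta):
    """Sample one word under the TSM mixture, following the four steps above.

    theta_B, theta_P, theta_N : (V,) background / positive / negative models
    thetas   : list of k (V,) neutral theme models
    lambda_B : probability of drawing a background word
    pi_d     : (k,) theme mixing weights for the document
    delta    : (k, 3) per-theme weights over (neutral, positive, negative)
    """
    V = len(theta_B)
    if rng.random() < lambda_B:                 # 1. common English word?
        return rng.choice(V, p=theta_B)
    j = rng.choice(len(thetas), p=pi_d)         # 2. pick a subtopic
    c = rng.choice(3, p=delta[j])               # 3. neutral / positive / negative
    model = (thetas[j], theta_P, theta_N)[c]    # 4. sample from the chosen model
    return rng.choice(V, p=model)
```

Note that θP and θN are shared by all themes, which is exactly the limitation the bullet points below describe.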

Like topic z in LDA (document-specific). A theme itself is not a language model.

Language model p(w|B). B: background words (e.g., function words)


θP and θN are theme-independent (i.e., shared by all themes):
• They should cover as many sentiment words as possible to be applied to all themes.
• This is problematic because it requires special effort (unlike general sentiment words).
• This model can't find theme-specific sentiment words.

TSM


Page 52: Aspect and Sentiment Unification Model


Figure 1: (a) LDA model; (b) JST model; (c) Tying-JST model.

in Figure 1(a), is one of the most popular topic models based upon the assumption that documents are mixtures of topics, where a topic is a probability distribution over words [2, 18]. The LDA model is effectively a generative model from which a new document can be generated in a predefined probabilistic procedure. Compared to another commonly used generative model, Probabilistic Latent Semantic Indexing (pLSI) [8], LDA has a better statistical foundation by defining the topic-document distribution θ, which allows inferencing on a new document based on a previously estimated model and avoids the problem of overfitting, where both are known as the deficits of pLSI. Generally, the procedure of generating each word in a document under LDA can be broken down into two stages. One firstly chooses a distribution over a mixture of K topics. Following that, one picks up a topic randomly from the topic distribution, and draws a word from that topic according to the topic's word probability distribution.

The existing framework of LDA has three hierarchical layers, where topics are associated with documents, and words are associated with topics. In order to model document sentiments, we propose a joint sentiment/topic (JST) model by adding an additional sentiment layer between the document and the topic layer. Hence, JST is effectively a four-layer model, where sentiment labels are associated with documents, under which topics are associated with sentiment labels and words are associated with both sentiment labels and topics. A graphical model of JST is represented in Figure 1(b).

Assume that we have a corpus with a collection of D documents denoted by C = {d1, d2, ..., dD}; each document in the corpus is a sequence of Nd words denoted by d = (w1, w2, ..., wNd), and each word in the document is an item from a vocabulary index with V distinct terms denoted by {1, 2, ..., V}. Also, let S be the number of distinct sentiment labels, and T be the total number of topics. The procedure of generating a word wi in document d boils down to three stages. Firstly, one chooses a sentiment label l from the document-specific sentiment distribution πd. Following that, one chooses a topic randomly from the topic distribution θl,d, where θl,d is chosen conditioned on the sentiment label l. It is worth noting at this point that the topic-document distribution of JST is different from the one of LDA. In LDA, there is only one topic-document distribution θ for each individual document. In contrast, each document in JST is associated with S (number of sentiment labels) topic-document distributions, each of which corresponds to a sentiment label l with the same number of topics. This feature essentially provides means for the JST model to measure the sentiment of topics. Finally, one draws a word from the distribution over words defined by the topic and sentiment label, which is again different from LDA in that a word is sampled from the word distribution only defined by topic.

The formal definition of the generative process which corresponds to the hierarchical Bayesian model shown in Figure 1(b) is as follows:

• For each document d, choose a distribution πd ∼ Dir(γ).

• For each sentiment label l under document d, choose a distribution θd,l ∼ Dir(α).

• For each word wi in document d

  – choose a sentiment label li ∼ πd,

  – choose a topic zi ∼ θd,li,

  – choose a word wi from the distribution over words defined by the topic zi and sentiment label li, φli,zi.
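As a sketch, this generative story can be simulated in a few lines of NumPy. All sizes and hyperparameters below are illustrative, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (made up, not the paper's settings)
D, S, T, V, N_d = 4, 2, 3, 10, 20   # docs, sentiments, topics, vocab, words/doc
gamma, alpha, beta = 1.0, 1.0, 0.1

# One word distribution phi per (sentiment label, topic) pair
phi = rng.dirichlet(np.full(V, beta), size=(S, T))

corpus = []
for d in range(D):
    pi_d = rng.dirichlet(np.full(S, gamma))             # pi_d ~ Dir(gamma)
    theta_d = rng.dirichlet(np.full(T, alpha), size=S)  # theta_{d,l} ~ Dir(alpha)
    words = []
    for _ in range(N_d):
        l = rng.choice(S, p=pi_d)         # 1. choose a sentiment label l_i ~ pi_d
        z = rng.choice(T, p=theta_d[l])   # 2. choose a topic z_i ~ theta_{d,l_i}
        w = rng.choice(V, p=phi[l, z])    # 3. choose a word w_i ~ phi_{l_i, z_i}
        words.append(int(w))
    corpus.append(words)
```

Inference then runs this story in reverse: given only `corpus`, Gibbs sampling recovers π, θ, and φ.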

The hyperparameters α and β in JST can be treated as the prior observation counts for the number of times topic j is associated with sentiment label l sampled from a document, and the number of times words are sampled from topic j associated with sentiment label l, respectively, before having observed any actual words. Similarly, the hyperparameter γ can be interpreted as the prior observation count for the number of times sentiment label l is sampled from a document before any words from the corpus are observed. In JST, there are three sets of latent variables that we need to infer: the joint sentiment/topic-document distribution θ, the joint sentiment/topic-word distribution φ, and the sentiment-document distribution π. We will see later in the paper that the sentiment-document distribution π plays an important role in determining the document polarity.

In order to obtain the distributions of θ, φ and π, we first estimate the posterior distribution over z, i.e., the assignment of word tokens to topics and sentiment labels. The sampling distribution for a word given the remaining topics and sentiment labels is P(zt = j, lt = k | w, z−t, l−t, α, β, γ), where z−t and l−t are vectors of assignments of topics and sentiment labels for all words in the collection except for the word at position t.

Generation Process of word w
1. Choose a sentiment l.
2. Choose a topic label z based on l.
3. Choose w from φzl.

Same β for all φ's
• There is no difference in β for positive φ and negative φ
• Effect of Gibbs sampling initialization fades away

There is no sentence layer required for aspect discovery

JST

Wednesday, December 1, 2010

Page 53: Aspect and Sentiment Unification Model

As demonstrated in Titov and McDonald (2008), the topics produced by LDA do not correspond to ratable aspects of entities. In particular, these models tend to build topics that globally classify terms into product instances (e.g., Creative Labs Mp3 players versus iPods, or New York versus Paris Hotels). To combat this, MG-LDA models two distinct types of topics: global topics and local topics. As in LDA, the distribution of global topics is fixed for a document (a user review). However, the distribution of local topics is allowed to vary across the document.

A word in the document is sampled either from the mixture of global topics or from the mixture of local topics specific to the local context of the word. It was demonstrated in Titov and McDonald (2008) that ratable aspects will be captured by local topics and global topics will capture properties of reviewed items. For example, consider an extract from a review of a London hotel: “. . . public transport in London is straightforward, the tube station is about an 8 minute walk . . . or you can get a bus for £1.50”. It can be viewed as a mixture of topic London shared by the entire review (words: “London”, “tube”, “£”), and the ratable aspect location, specific for the local context of the sentence (words: “transport”, “walk”, “bus”). Local topics are reused between very different types of items, whereas global topics correspond only to particular types of items.

In MG-LDA a document is represented as a set of sliding windows, each covering T adjacent sentences within a document.4 Each window v in document d has an associated distribution over local topics θloc_{d,v} and a distribution defining preference for local topics versus global topics πd,v. A word can be sampled using any window covering its sentence s, where the window is chosen according to a categorical distribution ψd,s. Importantly, the fact that windows overlap permits the model to exploit a larger co-occurrence domain. These simple techniques are capable of modeling local topics without more expensive modeling of topic transitions used in (Griffiths et al., 2004; Wang and McCallum, 2005; Wallach, 2006; Gruber et al., 2007). Introduction of a symmetrical Dirichlet prior Dir(γ) for the distribution ψd,s can control the smoothness of transitions.

4 Our particular implementation is over sentences, but sliding windows in theory can be over any sized fragment of text.

Figure 3: (a) MG-LDA model. (b) An extension of MG-LDA to obtain MAS.

The formal definition of the model with K^gl global and K^loc local topics is as follows: First, draw K^gl word distributions for global topics ϕ^gl_z from a Dirichlet prior Dir(β^gl) and K^loc word distributions for local topics ϕ^loc_{z′} from Dir(β^loc). Then, for each document d:

• Choose a distribution of global topics θ^gl_d ∼ Dir(α^gl).

• For each sentence s choose a distribution over sliding windows ψd,s(v) ∼ Dir(γ).

• For each sliding window v

  – choose θ^loc_{d,v} ∼ Dir(α^loc),

  – choose πd,v ∼ Beta(α^mix).

• For each word i in sentence s of document d

  – choose window vd,i ∼ ψd,s,

  – choose rd,i ∼ πd,vd,i,

  – if rd,i = gl, choose global topic zd,i ∼ θ^gl_d,

  – if rd,i = loc, choose local topic zd,i ∼ θ^loc_{d,vd,i},

  – choose word wd,i from the word distribution ϕ^{rd,i}_{zd,i}.

Beta(α^mix) is a prior Beta distribution for choosing between local and global topics. In Figure 3a the corresponding graphical model is presented.
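The MG-LDA sampling scheme can be sketched as a small NumPy simulation. All sizes and hyperparameters are invented for illustration, and windows are indexed so that sentence s is covered by windows s .. s+T−1:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes (made up): global/local topics, vocab, sentences, window span
K_gl, K_loc, V = 3, 4, 12
n_sent, words_per_sent, T_win = 6, 5, 3
a_gl, a_loc, a_mix, gamma, b = 1.0, 1.0, (1.0, 1.0), 1.0, 0.1

phi_gl = rng.dirichlet(np.full(V, b), size=K_gl)    # phi^gl_z  ~ Dir(beta^gl)
phi_loc = rng.dirichlet(np.full(V, b), size=K_loc)  # phi^loc_z ~ Dir(beta^loc)

# Sentence s is covered by the T_win windows s .. s+T_win-1
n_win = n_sent + T_win - 1
theta_gl = rng.dirichlet(np.full(K_gl, a_gl))                 # theta^gl_d
theta_loc = rng.dirichlet(np.full(K_loc, a_loc), size=n_win)  # theta^loc_{d,v}
pi = rng.beta(*a_mix, size=n_win)                             # P(global) per window

doc = []
for s in range(n_sent):
    psi = rng.dirichlet(np.full(T_win, gamma))  # psi_{d,s} over covering windows
    sent = []
    for _ in range(words_per_sent):
        v = s + rng.choice(T_win, p=psi)        # choose a covering window
        if rng.random() < pi[v]:                # r = gl with probability pi[v]
            w = rng.choice(V, p=phi_gl[rng.choice(K_gl, p=theta_gl)])
        else:                                   # r = loc: topic tied to window v
            w = rng.choice(V, p=phi_loc[rng.choice(K_loc, p=theta_loc[v])])
        sent.append(int(w))
    doc.append(sent)
```

Because adjacent sentences share most of their covering windows, nearby words tend to draw from the same local topic distributions, which is what lets local topics capture ratable aspects.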

2.2 Multi-Aspect Sentiment Model

MG-LDA constructs a set of topics that ideally correspond to ratable aspects of an entity (often in a many-to-one relationship of topics to aspects). A major shortcoming of this model – and all other unsupervised models – is that this correspondence is not explicit, i.e., how does one say that topic X is really about aspect Y? However, we can observe that numeric aspect ratings are often included in our data by users who left the reviews. We then make the assumption that the text of the review discussing an aspect is predictive of its rating. Thus, if we model the prediction of aspect ratings jointly with the construction of explicitly associated topics, then such a


Assumptions
• A sentence is covered by several sliding windows
  (ψds is a window distribution of sentence s in document d)

Generation Process of word w in sentence s
• Choose a window v from ψds
• Decide whether w is chosen from global topics or local topics (πv = {p(gl), p(loc)})
• If r = gl, choose topic z from ϑgl; else if r = loc, choose topic z from ϑloc
• Choose w from z

MAS

ya = {p(1-star), p(2-stars), ..., p(5-stars)}

a = explicitly rated aspect
f = n-gram feature
J_{f,y} = common weight for f
J^a_{f,y} = aspect-specific weight for f
p^a_{f,r,z} = fraction of words in f assigned r = loc & z = a

model should benefit from both higher quality topics and a direct assignment from topics to aspects. This is the basic idea behind the Multi-Aspect Sentiment model (MAS).

In its simplest form, MAS introduces a classifier for each aspect, which is used to predict its rating. Each classifier is explicitly associated to a single topic in the model and only words assigned to that topic can participate in the prediction of the sentiment rating for the aspect. However, it has been observed that ratings for different aspects can be correlated (Snyder and Barzilay, 2007), e.g., very negative opinion about room cleanliness is likely to result not only in a low rating for the aspect rooms, but also is very predictive of low ratings for the aspects service and dining. This complicates discovery of the corresponding topics, as in many reviews the most predictive features for an aspect rating might correspond to another aspect. Another problem with this overly simplistic model is the presence of opinions about an item in general without referring to any particular aspect. For example, “this product is the worst I have ever purchased” is a good predictor of low ratings for every aspect. In such cases, non-aspect ‘background’ words will appear to be the most predictive. Therefore, the use of the aspect sentiment classifiers based only on the words assigned to the corresponding topics is problematic. Such a model will not be able to discover coherent topics associated with each aspect, because in many cases the most predictive fragments for each aspect rating will not be the ones where this aspect is discussed.

Our proposal is to estimate the distribution of possible values of an aspect rating on the basis of the overall sentiment rating and to use the words assigned to the corresponding topic to compute corrections for this aspect. An aspect rating is typically correlated to the overall sentiment rating5 and the fragments discussing this particular aspect will help to correct the overall sentiment in the appropriate direction. For example, if a review of a hotel is generally positive, but it includes a sentence “the neighborhood is somewhat seedy” then this sentence is predictive of the rating for the aspect location being below other ratings. This rectifies the aforementioned

5 In the dataset used in our experiments all three aspect ratings are equivalent for 5,250 reviews out of 10,000.

problems. First, aspect sentiment ratings can often be regarded as conditionally independent given the overall rating, therefore the model will not be forced to include in an aspect topic any words from other aspect topics. Secondly, the fragments discussing overall opinion will influence the aspect rating only through the overall sentiment rating. The overall sentiment is almost always present in the real data along with the aspect ratings, but it can be coarsely discretized and we preferred to use a latent overall sentiment.

The MAS model is presented in Figure 3b. Note that for simplicity we decided to omit in the figure the components of the MG-LDA model other than variables r, z and w, though they are present in the statistical model. MAS also allows for extra unassociated local topics in order to capture aspects not explicitly rated by the user. As in MG-LDA, MAS has global topics which are expected to capture topics corresponding to particular types of items, such as London hotels or seaside resorts for the hotel domain. In Figure 3b we shaded the aspect ratings ya, assuming that every aspect rating is present in the data (though in practice they might be available only for some reviews). In this model the distribution of the overall sentiment rating yov is based on all the n-gram features of a review text. Then the distribution of ya, for every rated aspect a, can be computed from the distribution of yov and from any n-gram feature where at least one word in the n-gram is assigned to the associated aspect topic (r = loc, z = a).

Instead of having a latent variable yov,6 we use a similar model which does not have an explicit notion of yov. The distribution of a sentiment rating ya for each rated aspect a is computed from two scores. The first score is computed on the basis of all the n-grams, but using a common set of weights independent of the aspect a. Another score is computed only using n-grams associated with the related topic, but an aspect-specific set of weights is used in this computation. More formally, we consider the log-linear distribution:

P(y^a = y | w, r, z) ∝ exp( b^a_y + Σ_{f∈w} J_{f,y} + Σ_{f∈w} p^a_{f,r,z} J^a_{f,y} ),   (1)

where w, r, z are vectors of all the words in a document,

6 Preliminary experiments suggested that this is also a feasible approach, but somewhat more computationally expensive.
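As a sketch, the log-linear aspect-rating distribution can be evaluated directly once the weights are known. Everything below (feature indices, random weights, `n_feat`) is made up for illustration; in MAS these weights are learned jointly with the topics:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical dimensions: 8 n-gram features, ratings y in {1..5}
n_feat, n_ratings = 8, 5
b_a = rng.normal(size=n_ratings)            # bias b^a_y for this aspect
J = rng.normal(size=(n_feat, n_ratings))    # common weights J_{f,y}
J_a = rng.normal(size=(n_feat, n_ratings))  # aspect-specific weights J^a_{f,y}
p_a = rng.random(n_feat)                    # p^a_{f,r,z}: fraction of f's words
                                            # assigned r = loc, z = a

feats = [0, 3, 5]                           # n-gram features present in the review

# score_y = b^a_y + sum_f J_{f,y} + sum_f p^a_f * J^a_{f,y}
score = b_a + J[feats].sum(axis=0) + (p_a[feats][:, None] * J_a[feats]).sum(axis=0)
prob = np.exp(score - score.max())          # softmax, shifted for numerical stability
prob /= prob.sum()                          # P(y^a = y | w, r, z) over the 5 ratings
```

Note how `p_a` gates the aspect-specific weights: an n-gram with no words assigned to aspect a contributes only through the common weights J, which is exactly the correction mechanism the text describes.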


There are no aspect-specific sentiment words
This model requires user-rated training data


Page 54: Aspect and Sentiment Unification Model

[Figure: plate diagrams. (a) SLDA: nodes α, θ, z, w, φ, β with plates T, N, M, D. (b) ASUM: adds sentiment nodes γ, π, s and plate S.]

Figure 2: Graphical representation of SLDA and ASUM. A node represents a random variable, an edge represents dependency, and a plate represents replication. A shaded node is observable and an unshaded node is not observable.

and a sentiment. For ASUM, in contrast, a pair of topic and sentiment is represented as a single language model, where a word is more probable as it is closely related to both the topic and the sentiment. This provides a sound explanation of how much a word is related to a certain topic and sentiment.

Multi-Aspect Sentiment (MAS) model [19] differs from the other models in that it focuses on modeling topics to match a set of pre-defined aspects that are explicitly rated by users in reviews. Sentiment is modeled as a probability distribution over different sentiments for each pre-defined aspect, and this distribution is derived from a weighted combination of topics and words. To fit the topics and sentiment to the aspects and their ratings, it requires training data that are rated by users for each aspect. ASUM does not use any user-rated training data, which are often expensive to obtain.

Joint Sentiment/Topic (JST) model [13] takes the most similar approach to ours. Sentiment is integrated with a topic in a single language model. JST is different from ASUM in that individual words may come from different language models. In contrast, ASUM constrains the words in a single sentence to come from the same language model, so that each of the inferred language models is more focused on the regional co-occurrences of the words in a document. Both JST and ASUM make use of a small seed set of sentiment words, but the exploitation is not explicitly modeled in JST. ASUM integrates the seed words into the generative process, and this provides ASUM with a more stable statistical foundation.

It is difficult to compare the aspects and sentiments found by the different models. Although sentiment classification is not the main goal of the models, we carried out sentiment classification for a quantitative comparison. The results, presented in Section 6.4, show that ASUM outperforms TSM and JST.

4. MODELS

We propose two generative models that extend traditional topic models. Our goal is to discover topics that match the aspects discussed in reviews.

4.1 Sentence-LDA

4.1 Sentence-LDA

In LDA the positions of the words are neglected for topic

Table 1: Meanings of the notations used in the models

D: the number of reviews
M: the number of sentences
N: the number of words
T: the number of aspects
S: the number of sentiments
V: the vocabulary size
w: word
z: aspect
s: sentiment
φ: multinomial distribution over words
θ: multinomial distribution over aspects
π: multinomial distribution over sentiments
α(k): Dirichlet prior vector for θ
β(w), βj(w): Dirichlet prior vector for φ
γ(j): Dirichlet prior vector for π
zi: the aspect of sentence i
si: the sentiment of sentence i
z−i: the aspect assignments for all sentences except sentence i
wi: the word list representation of sentence i
w: the word list representation of the corpus
Mdk, M(−i)dk: the number of sentences (except i) that are assigned aspect k in review d
Mdj, M(−i)dj: the number of sentences (except i) that are assigned sentiment j in review d
Nkw, N(−i)kw: the number of words (except the words in sentence i) that are assigned topic k
Nkjw, N(−i)kjw: the number of words (except the words in sentence i) that are assigned topic k and sentiment j

inference. As discussed in previous work [22], this property may not always be appropriate. In reviews, words from an aspect tend to co-occur within close proximity to one another. SLDA imposes a constraint that all words in a sentence should be generated from one topic. The plate notation of SLDA is shown in Figure 2(a) and the meanings of the notations are summarized in Table 1.

In SLDA, a review is generated as follows:

1. The review's aspect distribution is drawn. (θ ∼ Dirichlet(α))

2. For each sentence,

   (a) an aspect is chosen. (z ∼ Multinomial(θ))

   (b) every word is generated from the word distribution of the chosen aspect. (w ∼ Multinomial(φz))
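A minimal simulation of this two-level process (sizes and hyperparameters are illustrative, not the paper's settings) makes the sentence-level constraint explicit:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative sizes (made up)
D, T, V = 3, 4, 10     # reviews, aspects, vocabulary
M_d, N_s = 5, 6        # sentences per review, words per sentence
alpha, beta = 1.0, 0.1

phi = rng.dirichlet(np.full(V, beta), size=T)      # phi_z ~ Dir(beta)
reviews = []
for d in range(D):
    theta = rng.dirichlet(np.full(T, alpha))       # 1. review's aspect distribution
    review = []
    for _ in range(M_d):
        z = rng.choice(T, p=theta)                 # 2a. ONE aspect per sentence
        sent = rng.choice(V, p=phi[z], size=N_s)   # 2b. every word from phi_z
        review.append([int(w) for w in sent])
    reviews.append(review)
```

The only difference from LDA is the level at which `z` is drawn: per sentence here, per word in LDA.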

We use Gibbs sampling [8] to estimate the latent variables θ and φ. At each transition step of the Markov chain, z is drawn from the conditional probability

P(z_i = k | z_{−i}, w) ∝ [ (M^{(−i)}_{dk} + α_k) / Σ_{k′=1}^{T} (M^{(−i)}_{dk′} + α_{k′}) ] · [ Π_{w∈w_i} β̂_w(β̂_w + 1) ⋯ (β̂_w + m_w − 1) ] / [ β̂_0(β̂_0 + 1) ⋯ (β̂_0 + m_0 − 1) ],

where β̂_w = N^{(−i)}_{kw} + β_w, β̂_0 = Σ_w β̂_w, m_w is the number of occurrences of word w in sentence i, and m_0 is the total number of words in sentence i.
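The sentence-level conditional can be computed directly for small counts. The helper below is my own sketch, assuming a symmetric scalar α and that `M_dk` and `N_kw` are the count arrays of Table 1 with sentence i already excluded:

```python
import numpy as np
from collections import Counter

def sentence_topic_conditional(sent_words, M_dk, N_kw, alpha, beta):
    """P(z_i = k | z_-i, w) for one sentence, normalized over topics.

    M_dk: per-topic sentence counts in this review, excluding sentence i.
    N_kw: (T, V) word-topic counts, excluding the words of sentence i.
    """
    T = len(M_dk)
    counts = Counter(sent_words)          # m_w for each distinct word w
    m0 = len(sent_words)                  # total words in the sentence
    probs = np.zeros(T)
    for k in range(T):
        p = (M_dk[k] + alpha) / (M_dk.sum() + T * alpha)
        beta_hat = N_kw[k] + beta         # beta_hat_w = N_kw^(-i) + beta
        beta0 = beta_hat.sum()            # beta_hat_0
        for w, m_w in counts.items():     # rising factorial per word
            for j in range(m_w):
                p *= beta_hat[w] + j
        for j in range(m0):               # rising factorial of the normalizer
            p /= beta0 + j
        probs[k] = p
    return probs / probs.sum()

# Toy check: topic 0 already owns the sentence's words, so it should win
probs = sentence_topic_conditional(
    [0, 0], np.array([2.0, 1.0]),
    np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]), alpha=1.0, beta=0.1)
```

The rising factorials appear because all m_w copies of each word are assigned to the same topic at once; for long sentences a log-space version would be needed to avoid underflow.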

ASUM

Generation Process
For each sentence
• Choose a sentiment s
• Choose a topic label z
• Choose words from φzs

β is different for positive φ and negative φ
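One way to realize this asymmetry (my own illustration, not the authors' code) is to remove the prior mass of opposite-sentiment seed words from each φ's Dirichlet prior. The vocabulary and seed sets below are made up:

```python
import numpy as np

# Made-up vocabulary and seed sets for illustration
vocab = ["good", "great", "bad", "terrible", "battery", "screen"]
pos_seeds, neg_seeds = {"good", "great"}, {"bad", "terrible"}

base, eps = 0.1, 0.0   # eps: prior mass for opposite-sentiment seed words
                       # (use a tiny positive value in practice)

def asum_beta(sentiment):
    """beta_j(w): asymmetric Dirichlet prior for phi under sentiment j (0=pos, 1=neg)."""
    blocked = neg_seeds if sentiment == 0 else pos_seeds
    return np.array([eps if w in blocked else base for w in vocab])

beta_pos, beta_neg = asum_beta(0), asum_beta(1)
```

With this prior, "bad" essentially cannot be drawn into a positive senti-aspect, which is how the seed words steer sentiment assignments without any user-rated training data.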

Current Limitations
• θ would be different for different sentiments as in JST
• If a sentence is too short (1 or 2 words), the topic assigned is almost random (because there is no clue)
• It does not model well the sentences that have multiple aspects


Page 55: Aspect and Sentiment Unification Model

Results on Twitter Data

• 1.3 million tweets

• 50k words in vocabulary

• What happens when we apply this model to Twitter?

• Many more aspects (topics), and a wider variety of them

• Different notions of “sentiment”

• Review data: polarity (like vs. dislike)

• Twitter: feelings (happy vs. unhappy)


Page 56: Aspect and Sentiment Unification Model

Seed Words

Positive: :) :-) :] :^) :D :-D =) =] =D
Negative: :( :-( :[ :-[ :'( :/ :-/ =( =[


Page 57: Aspect and Sentiment Unification Model

Positive Senti-Aspects

[Figure: top stemmed words of six positive senti-aspects, roughly: desserts (cream, chocol, cake, cooki, yum, peanut, strawberri), morning greetings (coffe, morn, good, hello, :)), American Idol (adam, lambert, kri, allen, paula, susanboyl, vote), time with family and friends (dinner, home, famili, friend, parti, fun), Mother's/Father's Day and Easter (mother, father, mom, dad, bless, easter), and Teen Choice Awards (milei, cyru, jona, demi, lovato, taylor, selena).]


Page 58: Aspect and Sentiment Unification Model

Positive Senti-Aspects

[Figure: more positive senti-aspects, roughly: affection and thanks (love, :), thank, smile, yay), faith (god, bless, prai, lord, jesu, prayer), politics (obama, #tcot, health, palin, mccain, vote), shopping (##dollar##, dress, sale, shoe, vintag, gift), software (app, window, iphon, googl, firefox, tweetdeck, chrome), and funny videos (movi, funni, hilari, laugh, video).]


Page 59: Aspect and Sentiment Unification Model

Negative Senti-Aspects

[Figure: negative senti-aspects, roughly: sickness (hurt, pain, :(, sore, throat, headach, flu), financial markets (##percent##, ##dollar##, market, stock, price, trade), fatigue (tire, blah, sleepi, bore, ugh), money-making spam (monei, guarante, onlin, earn, incom), deaths of Michael Jackson and Farrah Fawcett (jackson, michael, mj, rip, fawcett, tribut), and Twitter/Facebook troubles (twitter, facebook, upload, figur, updat, tweetdeck).]


Page 60: Aspect and Sentiment Unification Model

Negative Senti-Aspects

[Figure: more negative senti-aspects, roughly: travel delays (flight, airport, plane, traffic, delai), school (quiz, class, exam, homework, paper, math, :(), sports (game, plai, laker, footbal, season), bad weather (rain, snow, storm, wind, cold), the Iran election (#iranelection, iranian, protest, #gr88, tehran, #twibbon), and health-care politics (obama, health, senat, bill, tax, #tcot).]


Page 61: Aspect and Sentiment Unification Model

ASUM: uncovering the hidden semantic structure of aspects and sentiments

Alice Oh [email protected]
Yohan Jo [email protected]
http://uilab.kaist.ac.kr


