User Trait Expression and Portrayal through Social Media

Daniel Preotiuc-Pietro

Bloomberg LP

1 November 2018

Context

The availability of large-scale user-generated data provides the context for new applications and research.

The key elements are:
• metadata
  • user
  • time
  • location
• volume
• diversity
  • text
  • images
  • network information (social connections)

User Traits and Text

Hypothesis

User-generated text reveals individual differences in both demographic and psychological traits.

Demographic Traits

• Age (Rao et al. 2010, ACL)
• Gender (Burger et al. 2011, EMNLP)
• Location (Eisenstein et al. 2010, EMNLP)
• Political Orientation (Volkova et al. 2014, ACL)
• Popularity (Lampos et al. 2014, EACL)
• Occupation (Preotiuc-Pietro et al. 2015, ACL)
• Income (Preotiuc-Pietro et al. 2015, PLoS ONE)
• Political Ideology (Preotiuc-Pietro et al. 2017, ACL)
• Race (Preotiuc-Pietro & Ungar 2018, COLING)

Psychological Traits

Psychological traits:
• Mental illness (Coppersmith et al. 2014, ACL)
• Personality (Schwartz et al. 2013, PLoS ONE)
• Empathy (Abdul-Mageed et al. 2017, ICWSM)
• ‘Dark Triad’ Personality (Preotiuc-Pietro et al. 2016, CIKM)
• Active Open-Minded Thinking (Carpenter et al. 2018, JDM, in press)

Aspects

1. Data
2. Prediction
3. Insight

Example: Political Ideology

Daniel Preotiuc-Pietro, Liu Ye, Daniel J Hopkins, and Lyle Ungar. “Beyond Binary Labels: Political Ideology Prediction of Twitter Users”. In: ACL. 2017.

Data

Social media data analysis:

• Unobtrusive
  • Observes behaviors rather than relying on self-reports
• Access to data from a larger and more diverse population
  • Traditional social science research is based on convenience lab samples
• Access to both historical and real-time data
• Fine spatial granularity

Data - Ethics

Twitter – profiles are public by default

Facebook/Instagram – users provide informed consent to share data

User-trait analysis requires trait-level information which, when provided through surveys, is sensitive and is anonymised.

All studies were approved by the Institutional Review Board (IRB).

Data - Example

We collected a new data set:
• 3,938 users (4.8M tweets)
• public Twitter handles with >100 posts

Political ideology is reported through an online survey:
• our use case is US politics
• the major US ideology spectrum is Conservative – Liberal
• seven-point scale
• additionally reported age, gender and other demographics


Data - Applications

Social media data enables new types of applications and studies.

Real-time passive polling:

Prediction

              Prediction                                  Insight
Perspective   NLP/ML                                      Social Science
Goal          Models to predict traits of unknown users   Gain a better understanding of group behaviors and differences
Framing       Predictive task                             Exploring/testing hypotheses
Methods       Regression/Classification                   Statistical hypothesis testing; interpretable features; use domain experts in analysis

Prediction - Example

• Linear Regression
• Leaning: V. Conservative (1) – V. Liberal (7)
• Engagement: Neutral (4) – Moderate C/L (3&5) – C/L (2&6) – Very C/L (1&7)
• 10-fold cross-validation
• Range of linguistic features
• Evaluation – Pearson R between predictions and true labels
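The setup listed above can be sketched in a few lines. This is only an illustrative toy, not the authors' pipeline: it uses a single made-up feature, a one-variable least-squares fit, and synthetic labels, and the helper names (`engagement`, `fit_line`, `cross_val_predict`) are hypothetical. It shows how the seven-point score maps to an engagement level and how out-of-fold predictions from 10-fold cross-validation are scored with Pearson R.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def engagement(score):
    """Map a 1..7 ideology score to 0 (neutral) .. 3 (very C/L)."""
    return abs(score - 4)


def fit_line(xs, ys):
    """One-variable least squares; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_val_predict(xs, ys, k=10):
    """Return one out-of-fold prediction per example."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    return preds


# Synthetic data (an assumption): one noisy feature tied to leaning.
random.seed(0)
leaning = [random.randint(1, 7) for _ in range(200)]
feature = [s + random.gauss(0, 2.0) for s in leaning]

preds = cross_val_predict(feature, leaning, k=10)
r_leaning = pearson_r(preds, leaning)
engagement_labels = [engagement(s) for s in leaning]
```

Engagement could be evaluated the same way by passing `engagement_labels` as the target instead of `leaning`.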

[Figure: bar chart of Pearson R (axis .00–.40) per feature set, for Leaning and Engagement.]

Features    Leaning   Engagement
Unigrams    .294      .165
LIWC        .286      .149
Topics      .300      .169
Emotions    .145      .079
Political   .256      .169
All         .369      .196


Prediction - Applications

Applications of predictive models of user traits:

• Improving downstream NLP tools:
  • sentiment analysis
  • text classification
• Personalised AI applications:
  • machine translation
  • dialogue systems with an identity
• Uncover and adjust model biases
• Control for demographic biases in data analysis
• Marketing or targeted ads
• Measure communities in real-time over space and time

Insight

Insight - Example

Differences between moderate and extreme users

Words associated with moderate liberals (5 and 6).

Words associated with extreme liberals (7).

[Word clouds; legend: relative frequency, correlation strength.]

Correlations are age and gender controlled.


Applications - Insight

Insight allows us to:

• Gain a better understanding of:
  • human behaviors
  • language use
  • language change
  • cultural differences
  • stylistic differences
  • pragmatic differences
  • human stereotypes

• Confirm or generate new data-driven hypotheses

Aspects

1. Data
2. Prediction
3. Insight

All steps pose unique challenges and implications.

Aspects

This talk will try to address some of these aspects:

1. Data
   • User sampling
   • Trait collection
2. Prediction
3. Insight
   • Content
   • Phrase choice
   • Style
   • Pragmatic roles

Aspects


1. Data collection
   • User sampling
   • Trait collection



User sampling

Collecting representative gold data for training models.

For political orientation, previous NLP research collected users:


User sampling

Our hypotheses:

1. These users are far more likely to be politically engaged
2. The prediction problem was over-simplified
3. Neutral users are not accounted for
4. There are differences between moderate and extreme users on the same side


Engagement

Data set obtained using previous methods

[Figure: “Political word usage across user groups”, bar chart of the average percentage of political word usage (axis 0.00–4.00) for two user groups. Series: Media/Pundit Names, Politician Names, Political Words. Bar values per series for the two groups: 2.64 and 2.95; 0.73 and 0.79; 0.11 and 0.18.]
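The quantity plotted here, the percentage of a user's tokens that are political terms, can be computed roughly as follows. This is a minimal sketch under stated assumptions: whitespace tokenisation, a tiny made-up lexicon (`POLITICAL_WORDS`), and a hypothetical helper name. The lexicons and tokeniser actually used in the study are not given in these slides.

```python
def political_usage_pct(tweets, lexicon):
    """Percentage of tokens (lower-cased, punctuation-stripped) found in lexicon."""
    tokens = [tok for tweet in tweets for tok in tweet.lower().split()]
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok.strip(".,!?#@:;\"'") in lexicon)
    return 100.0 * hits / len(tokens)


# Hypothetical lexicon and tweets, for illustration only.
POLITICAL_WORDS = {"senate", "congress", "election", "vote"}
tweets = ["Congress should vote today", "great coffee this morning"]

pct = political_usage_pct(tweets, POLITICAL_WORDS)  # 2 of 8 tokens match
```

Averaging this value over the users in each group would give one bar per group and lexicon.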


Engagement

Our data set (survey-based, 7 point ideology scale)

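[Figure: political word usage across user groups (axis 0.00–3.50), same three series (Media/Pundit Names, Politician Names, Political Words), now for nine user groups. The first and last values in each row match the two groups of the data set obtained using previous methods.]

Bar values per series, left to right across the nine groups:
2.64  0.76  0.55  0.42  0.36  0.46  0.51  0.76  2.95
0.73  0.24  0.14  0.07  0.07  0.09  0.12  0.19  0.79
0.11  0.03  0.03  0.02  0.02  0.03  0.03  0.04  0.18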



Over-simplification

The prediction problem was over-simplified

[Figure: bar chart of ROC AUC (axis .5–1.0) for four binary tasks.]

Features            CvL    1v7    2v6    3v5
Topics              .891   .785   .662   .581
Political Terms     .972   .785   .679   .590
Domain Adaptation   .976   .789   .690   .625

ROC AUC, Logistic Regression, 10-fold cross-validation.
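For readers less familiar with the metric: ROC AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, so .5 corresponds to random ranking. The sketch below computes it directly from that definition. The function name and the toy scores are illustrative assumptions, and the quadratic loop is chosen for clarity, not speed.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = liberal, 0 = conservative, scores from some classifier.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In the toy example three of the four positive/negative pairs are ordered correctly, so `auc` is 0.75.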


User sampling

Take aways:

• 3x more political terms for automatically identified users compared to the highest survey-based scores
• Performance drops by 15% even when predicting extreme users
• Performance drops by 35%, to close to random, when predicting between political moderates

User sampling has an important impact on experimental results.


Aspects





Trait collection

Trait collection: Identifying the trait value for users.

Several common methods exist:

1. Self-report
2. Distant Supervision
3. Perception (Annotation)
4. Survey-based

Trait collection

1. Self-Report

• Method:
  • Mining profile descriptions
  • Mining tweet contents
  • Mining network connections
  • Processing profile images
• Advantages:
  • Large volume
  • Easy to implement
• Disadvantages:
  • Sample biases - some groups of users are more likely to self-disclose personal information
  • Data usually requires post-filtering due to false positives

Trait collection

2. Distant Supervision

• Method:
  • Map users to community statistics (e.g. Census data)
• Advantages:
  • Very large volume
  • Wide variety of traits have community statistics
• Disadvantages:
  • Statistics may be outdated
  • Twitter population is a biased sample of the general population
  • Users that can be geolocated are not representative of the Twitter population
  • Geo-located tweets might be posted from a different location than the user’s home

Trait collection

3. Perception

• Method:
  • Human annotation of profiles, including text
• Advantages:
  • Accurate for common traits
  • Medium volume
• Disadvantages:
  • Contains systematic biases and stereotypes of particular traits
  • Models trained on this data will capture only the perception of the annotator

Trait collection

4. Survey-based

• Method:
  • Ask users for trait information through surveys
• Advantages:
  • Collect information from the actual users
  • Can collect multiple traits
  • Can collect less common psychological traits
• Disadvantages:
  • Costly / Low volume
  • May be untruthful – but we can safeguard

Trait collection - Comparison

Comparing trait collection methods, race prediction, evaluated on survey-based traits.

Daniel Preotiuc-Pietro and Lyle Ungar. “User-Level Race and Ethnicity Predictors from Twitter Text”. In: COLING. 2018.

Survey-based vs. Perceived

We studied how the two differ in relation to demographic traits.

Lucie Flekova, Jordan Carpenter, Salvatore Giorgi, Lyle Ungar, and Daniel Preotiuc-Pietro. “Analyzing Biases in Human Perception of User Age and Gender from Text”. In: ACL. 2016.

Experimental Setup

20 Tweets/user

9 ratings/user

Forced choice guess

Self-rated confidence (1-5)

Real traits known in advance through self-reports

This way we isolate the textual cues from any other profile-related cues (screen name, profile pic, etc.)


Data Set

Trait                   Outcome      #Users   #Raters
Gender                  M/F          2607     1083
Age                     Integer      1066     737
Education               Adv/BSc/HS   900      481
Political Orientation   Lib/Cons     2500     943

Data set statistics


Human perception accuracy

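[Figure: bar chart of human perception accuracy (axis .0–1.0); accuracy for Gender, Education and Political Orientation, Pearson r for Age.]

Trait                       Random   Accuracy   Majority/Average Guess
Gender (%)                  .517     .757       .858
Education (%)               .330     .445       .488
Political Orientation (%)   .500     .816       .903
Age (r)                     .000     .416       .631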

People are usually correct.
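The difference between per-rating accuracy and the majority guess over a user's nine ratings can be illustrated as below. The function names and the toy ratings are hypothetical, and this is only one plausible reading of how the two numbers are aggregated. For age, the analogous aggregate would be the average of the guesses.

```python
from collections import Counter


def individual_accuracy(ratings, truth):
    """Fraction of all single ratings that match the user's real label."""
    correct = total = 0
    for user, guesses in ratings.items():
        correct += sum(1 for g in guesses if g == truth[user])
        total += len(guesses)
    return correct / total


def majority_accuracy(ratings, truth):
    """Fraction of users whose most frequent guess matches the real label."""
    hits = 0
    for user, guesses in ratings.items():
        top_label, _count = Counter(guesses).most_common(1)[0]
        hits += top_label == truth[user]
    return hits / len(ratings)


# Toy data: nine forced-choice gender guesses per user (made up).
ratings = {
    "u1": ["F"] * 6 + ["M"] * 3,
    "u2": ["M"] * 5 + ["F"] * 4,
    "u3": ["F"] * 7 + ["M"] * 2,
}
truth = {"u1": "F", "u2": "M", "u3": "F"}

acc_single = individual_accuracy(ratings, truth)   # 18 of 27 ratings correct
acc_majority = majority_accuracy(ratings, truth)   # all three majorities correct
```

An odd number of ratings per user avoids ties in the binary case, which is one reason a design with nine raters is convenient.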


Inaccurate Gender Stereotypes

Trained two models on the same data with:
• perceived labels
• real labels

Training on perceived traits introduces a systematic bias.

Lucie Flekova, Jordan Carpenter, Salvatore Giorgi, Lyle Ungar, and Daniel Preotiuc-Pietro. “Analyzing Biases in Human Perception of User Age and Gender from Text”. In: ACL. 2016.


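[Figure: two bar charts over true Males and Females, values in % (axis 0–50).]

Model predictions (bars: Pred. Male, Pred. Female): correct predictions 40.1 (males) and 45.8 (females); the two error bars are 6.1 and 7.9.

Human guesses (bars: Perc. Male, Perc. Female): correct guesses 42.2 (males) and 40.7 (females); the two error bars are 9.9 and 7.2.

Model trained on >10,000 users with self-reported gender.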



The accuracies for correct predictions are reversed



The accuracies for incorrect predictions are also reversed!



Words more likely to be associated with females among male authors

Words more likely to be associated with males among female authors

The size of each word reflects how strongly it acts as an inaccurate stereotype, e.g. ‘love’ is more likely than ‘wonderful’ to mislead people into guessing female.


Controlling Perception

Can we control human perception of demographic traits?

We restrict to selecting tweets from the user’s timeline.

Daniel Preotiuc-Pietro, Sharath Chandra Guntuku, and Lyle Ungar. “Controlling Human Perception of Basic User Traits”. In: EMNLP. 2017.

Controlling Perception

Annotator accuracy on predicting gender in the three conditions.

Condition   Overall   Females   Males
Random      76.66%    40.67%    35.99%
Opposite    55.73%    32.26%    23.47%
Same        91.33%    47.83%    43.50%

Beyond Survey-based Methods

Survey-based screening methods for mental illnesses are imperfect.

Mental illness is less likely to be self-reported due to lack of awareness or social stigma.

Surveys may not be the best tool for collecting ‘gold’ labels.

Social media can be an alternative.

Johannes Eichstaedt et al. “Facebook Language Predicts Depression in Medical Records”. In: PNAS. 2018.

Beyond survey-based methods

We linked medical records with clinical diagnosis of depression to Facebook data.

Aspects





Content analysis

There are differences between neutral users and ideologically extreme users.

Words associated with either extreme conservative or liberal

Words associated with neutral users

Correlations are age and gender controlled. Extreme groups are combined using matched age and gender distributions.
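One common way to control a correlation for covariates such as age and gender is partial correlation. The slides do not say which procedure was used, so the sketch below is an assumption rather than the authors' method. It applies the standard recursive formula, removing one control variable at a time. All names and the toy vectors are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def partial_r(x, y, controls):
    """Correlation of x and y with linear effects of the controls removed."""
    if not controls:
        return pearson_r(x, y)
    *rest, z = controls
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy vectors (made up): age and gender drive both word use and ideology,
# while the remaining parts (noise_a, noise_b) are unrelated to each other.
age = [1, 1, 1, 1, -1, -1, -1, -1]
gender = [1, -1, -1, 1, 1, -1, -1, 1]
noise_a = [1, 1, -1, -1, 1, 1, -1, -1]
noise_b = [1, -1, 1, -1, 1, -1, 1, -1]
word_use = [a + g + e for a, g, e in zip(age, gender, noise_a)]
ideology = [a + g + e for a, g, e in zip(age, gender, noise_b)]

raw = pearson_r(word_use, ideology)                        # about 0.67
controlled = partial_r(word_use, ideology, [age, gender])  # about 0
```

In this toy case the raw correlation is entirely explained by the two covariates, which is the kind of spurious association that controlling is meant to remove.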


Content analysis

There are differences between moderate and extreme users on the same side.

Words associated with moderate liberals (5 and 6).

Words associated with extreme liberals (7).

[Word clouds; legend: relative frequency.]

Correlations are age and gender controlled.

Daniel Preotiuc-Pietro, Liu Ye, Daniel J Hopkins, and Lyle Ungar. “Beyond Binary Labels: Political Ideology Prediction of Twitter Users”. In: ACL. 2017.

Content analysis

Rank     Correlation   Topic (most frequent words)
1        .116          hilarious, celeb, capaldi, corrie, chatty, corden, barrowman
2        .106          photo, art, pictures, photos, instagram, photoset, image
3        .106          hot, sex, naked, adult, teen, porn, lesbian, tube, tits
4        .087          turn, accidentally, barely, constantly, onto, bug, suddenly
5        .086          ha, ooo, uh, ohhh, ohhhh, maam, gotcha, gee, ohhhhh
LIWC-1   .104          fuck, gay, sex, sexy, dick, naked, fucks, cock, aids, cum
LIWC-2   .088          hate, fuck, hell, stupid, mad, sucks, suck, war, dumb, ugly

Word2Vec topics with the highest Pearson correlation between moderately liberal users and moderately conservative users (gender/age controlled).


Aspects





Phrase Choice

Which word is more likely to be used by a female?

Charming – Fascinating

Daniel Preotiuc-Pietro, Wei Xu, and Lyle Ungar. “Discovering User Attribute Stylistic Differences via Paraphrasing”. In: AAAI. 2016.

Phrase Choice

Which word is more likely to be used by an older person?

Impressive – Amazing


Phrase Choice

Which word is more likely to be used by a person of higher occupational class?

Suggestions – Proposals


Phrase Choice


Brutal – Fierce


Phrase Choice


Defensive – Protective


Phrase Choice


Humour – Wit


Phrase Choice

[Figure: bar chart (axis 50%–100%): Gender 68.5%, Age 73.7%, Occ. Class 67.2%.]


Phrase Choice

The method for quantifying phrase choice is straightforward:

$$\mathrm{Gender}(w) = \log\left(\frac{\mathrm{Female}(w)}{\mathrm{Male}(w)}\right) \qquad (1)$$

Within a paraphrase pair (w1, w2), the difference Gender(w1) − Gender(w2) is the stylistic distance.

We use only equivalent paraphrases of 1–3 grams from PPDB 2.0.

Statistics are computed over large Twitter data sets with user traits.
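Equation (1) and the stylistic distance translate directly into code. The sketch below assumes that Female(w) and Male(w) are usage counts already normalised to comparable corpus sizes, and it adds an optional smoothing constant to avoid division by zero. Both choices, the function names, and the example counts are assumptions for illustration and do not come from the paper.

```python
import math


def gender_score(word, female_counts, male_counts, smoothing=1.0):
    """Log ratio of female to male usage of `word`, as in Equation (1)."""
    f = female_counts.get(word, 0) + smoothing
    m = male_counts.get(word, 0) + smoothing
    return math.log(f / m)


def stylistic_distance(w1, w2, female_counts, male_counts, smoothing=1.0):
    """Gender(w1) - Gender(w2) for a paraphrase pair (w1, w2)."""
    return (gender_score(w1, female_counts, male_counts, smoothing)
            - gender_score(w2, female_counts, male_counts, smoothing))


# Made-up counts for a hypothetical paraphrase pair.
female_counts = {"phrase_a": 30, "phrase_b": 10}
male_counts = {"phrase_a": 10, "phrase_b": 30}

dist = stylistic_distance("phrase_a", "phrase_b",
                          female_counts, male_counts, smoothing=0.0)
```

A positive distance means the first phrase of the pair leans more towards female usage than the second, and swapping the pair flips the sign.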


Phrase Choice

Study which attributes of words in a pair are preferred by one group:

• Word Length in Characters
• Word Length in Syllables
  Simple proxies for word complexity
• Affective Norms: Valence, Arousal, Dominance
  14k rated words
  Valence: suicide (0.15) → bacon (0.70) → laughter (1)
• Concreteness
  40k rated words: spirituality (1) → morning (3.44) → tiger (5)
• Age of Acquisition
  30k rated words: great (5.05) → splendid (7.22) → tremendous (10.63)
• More in the paper ...


Phrase Choice

[Figure: horizontal bar chart of correlation coefficients (axis -0.25 to 0.25). Word attributes: Concreteness, Happiness, Word Rareness, # Syllables, Word Length. Series: Occ. Class (High), Age (>30), Gender (M).]

Correlation coefficients between paraphrase pair word differences and user group differences in usage.


Phrase Choice

[Figure: horizontal bar chart of correlation coefficients (axis -.200 to .200). Word attributes: Age of Acquisition, Concreteness, Dominance, Arousal, Happiness, # Syllables, Word Length. Series: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism.]

Correlation coefficients between paraphrase pair preference and user group usage.

Daniel Preotiuc-Pietro, Jordan Carpenter, and Lyle Ungar. “Personality Driven Differences in Paraphrase Preference”. In: NLP+CSS Workshop, ACL. 2017.

Aspects





Stylistic Differences

Correlations of stylistic features with age and income.

[Figure: scatter plot of stylistic features, x-axis Income r and y-axis Age r (both -0.3 to 0.3). Feature groups in the legend: Surface, Readability, Syntax, Style. Features shown: # Char/Token, # Tokens/Tweet, # Chars/Tweet, # words > 5 char, Type/token Ratio, Punctuation, Smileys, URLs, ARI, F-Kincaid, Coleman-Liau, Flesch RE, FOG, SMOG, LIX, Nouns, Verbs, Pronouns, Adverbs, Adjectives, Determiners, Interjections, Named entities, Contextuality, Abstract, Hedging, Specific, Elongations, Hapax legom.]

Lucie Flekova, Lyle Ungar, and Daniel Preotiuc-Pietro. “Exploring Stylistic Variation with Age and Income on Twitter”. In: ACL. 2016.
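A few of the surface features named in the figure are simple to compute per user. The sketch below assumes whitespace tokenisation and treats "# words > 5 char" as a ratio over all tokens. These definitions and the function name are guesses for illustration, and the paper may define the features differently.

```python
def surface_features(tweets):
    """Compute simple surface statistics over one user's tweets."""
    token_lists = [tweet.split() for tweet in tweets]
    tokens = [tok for toks in token_lists for tok in toks]
    n_tokens = len(tokens)
    return {
        "chars_per_token": sum(len(t) for t in tokens) / n_tokens,
        "tokens_per_tweet": n_tokens / len(tweets),
        "chars_per_tweet": sum(len(t) for t in tweets) / len(tweets),
        "long_word_ratio": sum(1 for t in tokens if len(t) > 5) / n_tokens,
        "type_token_ratio": len({t.lower() for t in tokens}) / n_tokens,
    }


# Made-up tweets for one hypothetical user.
features = surface_features(["so happy today", "Today was wonderful"])
```

Each feature value per user would then be correlated with that user's age or income to obtain one point in the plot.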

Stylistic Differences

Specificity – quantifies the level of detail in a text.

1 – Always too much.

5 – Mascara is the most commonly worn cosmetic, and women will spend an average of $4,000 on it in their lifetimes

Yifan Gao, Yang Zhong, Daniel Preotiuc-Pietro, and Junyi Jessy Li. “Predicting and Analyzing Language Specificity in Social Media Posts”. In: AAAI. 2019.

Aspects





Pragmatic roles

Vulgar words are often used in communication (1%)

Despite this, they are a restricted set of words (100)

Demographic traits impact how often users employ vulgar words online (correlations with % vulgar use):

Isabel Cachola, Eric Holgate, Daniel Preotiuc-Pietro, and Junyi Jessy Li. “Expressively vulgar: The socio-dynamics of vulgarity and its effects on sentiment analysis in social media”. In: COLING. 2018.

Pragmatic roles

Vulgarity is employed purposefully

Vulgar words are used for different pragmatic functions

We identified six different pragmatic functions

We annotated 8,524 instances of vulgar words across 7,800 tweets from users with known demographic traits.

Eric Holgate, Isabel Cachola, Daniel Preotiuc-Pietro, and Junyi Jessy Li. “Why Swear? Analyzing and Inferring the Intentions of Vulgar Expressions”. In: EMNLP. 2018.

Pragmatic roles

1. Express aggression (15.2%)

The word is used in order to harm the person or group the tweet is about.

USER You are an ass Your industry is full of assholes and you do nothing to improve (...)


Pragmatic roles

2. Express emotion (24.8%)

The word is used to express emotions (positive or negative) related to the user’s internal states, exclamations, feelings or attitudes towards an object. If the vulgar term is removed, the expressed emotion is lost.

There are so many things I want to do, But investing in equipment is a pain in the ass


Pragmatic roles

3. Emphasise (29.8%)

The word is used to emphasize a statement or feeling.

today is a good ass day URL


Pragmatic roles

4. Auxiliary (17.0%)

The use of this word is simply a manner of speaking and does not fit any of the above descriptions. Descriptions of external emotions (those of someone else) fall into this category.

Wish USER could save my ass on these exams like he used to


Pragmatic roles

5. Signal Group Identity (4.7%)

This word is used as a marker of identity in a specific social group.

Now this is a group of ass kickers


Pragmatic roles

6. Non-Vulgar (8.2%)

The use of this word is not vulgar (e.g., named entities that involve vulgar words).

Kick Ass 2 - Red Band Trailer URL


Take Aways

• Data collection poses challenges:
  • Sampling biases
  • Label collection
• Insight is important for social science and is obtained through:
  • Interpretable modelling and prediction methods
  • Linguistically motivated features
  • Collaboration with domain experts
  • Traditional social science approaches
  • Quasi-experimental methods

Thank You!

Thank you to my amazing collaborators:
