Observational studies in social media

Observational Studies

Class: Data Mining Technology for Business and Society

Program: M. Sc. Data Science

University: Sapienza University of Rome

Semester: Spring 2016

Lecturer: Carlos Castillo http://chato.cl/

Sources:

● Multiple papers, see beginning of each section.

Matching is a popular technique

Randomized controlled experiment

1. Response of subjects assigned to treatment compared to response of subjects assigned to control

2. Assignment of subjects to groups is done using a randomization device

3. Treatment is under the control of a researcher

Matching observational study

1. Response of subjects assigned to treatment compared to response of subjects assigned to control

2. Assignment of subjects to control is done by matching the characteristics and size of the treatment group

3. Treatment is not under the control of a researcher

Matching design: hurricanes and online friendships

Phan, Tuan Q., and Edoardo M. Airoldi. "A natural experiment of social network formation and dynamics." Proceedings of the National Academy of Sciences 112.21 (2015): 6595-6600.

Example: US universities and Hurricane Ike in 2008

Treatment n=5, Control n=10, Study group n=130

Selection of control group

● Facebook posts from 1.5M students in 130 universities

● Matched 5 affected universities with 10 unaffected ones:

– Similar: size, college ranking according to USNews, whether these colleges are public or private institutions, tuition fees, and other regional factors
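The selection of controls above can be sketched as a nearest-neighbor match on covariates. This is a minimal illustration, not the procedure used by Phan and Airoldi: the `University` fields, the distance function, the greedy assignment, and the toy values are all assumptions made for the example.

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class University:
    name: str
    affected: bool    # True if hit by the hurricane (treatment)
    size: float       # standardized enrollment (hypothetical scale)
    ranking: float    # standardized ranking (hypothetical scale)
    is_public: bool
    tuition: float    # standardized tuition (hypothetical scale)

def distance(a, b):
    """Euclidean distance on numeric covariates; public/private must agree."""
    if a.is_public != b.is_public:
        return math.inf
    return math.sqrt((a.size - b.size) ** 2
                     + (a.ranking - b.ranking) ** 2
                     + (a.tuition - b.tuition) ** 2)

def match_controls(universities, k=2):
    """Greedily give each affected university its k closest unaffected ones."""
    treated = [u for u in universities if u.affected]
    pool = [u for u in universities if not u.affected]
    matches = {}
    for t in treated:
        ranked = sorted(pool, key=lambda c: distance(t, c))
        chosen = [c for c in ranked[:k] if math.isfinite(distance(t, c))]
        matches[t.name] = [c.name for c in chosen]
        # Each control is used at most once.
        pool = [c for c in pool if c not in chosen]
    return matches

# Toy data, invented for illustration only.
toy = [
    University("T1", True, 1.0, 0.5, True, 0.2),
    University("T2", True, -1.1, -0.5, False, 1.0),
    University("C1", False, 1.1, 0.4, True, 0.3),
    University("C2", False, 0.9, 0.6, True, 0.1),
    University("C3", False, -1.0, -0.5, False, 1.0),
    University("C4", False, -1.2, -0.4, False, 1.1),
    University("C5", False, 3.0, 3.0, True, 3.0),
]
matches = match_controls(toy, k=2)
```

With k=2, 5 affected universities would receive 10 controls, which is consistent with the group sizes on the slide.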

Results (red=treatment, blue=control)

Both undergo densification

Treatment has larger clustering coefficient (more triangles)
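To make "more triangles" concrete, here is a small sketch of the local clustering coefficient: the fraction of pairs of a node's neighbors that are themselves connected. The adjacency-dictionary representation and the averaging over nodes are assumptions for illustration; the slide does not say which variant of the coefficient the paper reports.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are linked to each other."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

# Toy undirected friendship graphs (node -> set of neighbors).
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

The triangle has every neighbor pair connected, while the path has none, so the first graph has the larger coefficient.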

Matching design: exercise and stress as reflected on Twitter

Dos Reis, Virgile Landeiro, and Aron Culotta: Using matched samples to estimate the effects of exercise on mental health from Twitter. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.

https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9960

Exercise → Mood

Design

1) Detect exercise at time t1

– Post a message containing hashtag #runkeeper, #nikeplus, #runtastic, #endomondo, #mapmyrun, #strava, #cyclemeter, #fitstats, #mapmyfitness, or #runmeter.

2) Measure mood at time t2 > t1

– Automatic classifier of mood, three important classes: hostility (or anger), depression (or dejection), anxiety (or tension)
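The two design steps can be sketched as follows. The hashtag list is the one given above; the timeline format (a list of `(timestamp, text)` pairs) and the function names are assumptions for illustration, and the mood classifier itself is not reproduced here.

```python
import re

EXERCISE_HASHTAGS = {
    "#runkeeper", "#nikeplus", "#runtastic", "#endomondo", "#mapmyrun",
    "#strava", "#cyclemeter", "#fitstats", "#mapmyfitness", "#runmeter",
}
HASHTAG_RE = re.compile(r"#\w+")

def mentions_exercise(text):
    """True if the message contains one of the exercise hashtags."""
    return any(tag.lower() in EXERCISE_HASHTAGS
               for tag in HASHTAG_RE.findall(text))

def first_exercise_time(timeline):
    """Earliest timestamp t1 of an exercise message, or None if there is none."""
    times = [t for t, text in timeline if mentions_exercise(text)]
    return min(times) if times else None

def posts_after(timeline, t1):
    """Messages posted at t2 > t1, to be passed to a mood classifier."""
    return [text for t, text in timeline if t > t1]
```

For example, on a timeline where the exercise post appears at time 5, only posts with a later timestamp would be scored for mood.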

Mood classifier

● Hostility (or anger)

– “shut your freaking yaphole”

● Depression (or dejection)

– “such a horrible day”

● Anxiety (or tension)

– “nervous for Monday”

Control = Random users (same country and language)

[Figure: percent change after exercising, with random control. Hostility: -21.1, Dejection: -5.4, Anxiety: -7.9]

Problem: missing variables

Exercise → Mood

Demographics (a possible missing variable)

Matching method ...

● For each user in treatment, find another user that:

– Is a reciprocal friend of the user

– In same city/state

– With same gender

– Closest number of followers, followees, tweets
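The matching criteria above can be sketched as a filter followed by a nearest-neighbor choice. The `User` fields, the log-scale gap used to define "closest", and the toy users are assumptions for illustration; the slide does not state how the three counts are combined. The caller is expected to pass only candidates who are not in the treatment group.

```python
from dataclasses import dataclass, field
import math

@dataclass
class User:
    uid: str
    location: str     # city/state
    gender: str
    followers: int
    followees: int
    tweets: int
    follows: set = field(default_factory=set)  # uids this user follows

def is_reciprocal(a, b):
    """True if a follows b and b follows a."""
    return b.uid in a.follows and a.uid in b.follows

def activity_gap(a, b):
    """Sum of log-scale differences in followers, followees, and tweets."""
    pairs = [(a.followers, b.followers), (a.followees, b.followees),
             (a.tweets, b.tweets)]
    return sum(abs(math.log1p(x) - math.log1p(y)) for x, y in pairs)

def find_match(treated, candidates):
    """Closest eligible candidate for one treated user, or None."""
    eligible = [c for c in candidates
                if c.uid != treated.uid
                and is_reciprocal(treated, c)
                and c.location == treated.location
                and c.gender == treated.gender]
    if not eligible:
        return None
    return min(eligible, key=lambda c: activity_gap(treated, c))
```

A user with no eligible reciprocal friend gets no match and would have to be dropped from the comparison.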

Control = Matched users (blue)

[Figure: percent change after exercising. With random control: Hostility -21.1, Dejection -5.4, Anxiety -7.9. With matched control: Hostility 0.9, Dejection -3.9, Anxiety -2.7]

Can you guess a possible explanation?

[Figure: %Female and %from CA (axis 0 to 60), and #Followers and #Friends (axis 0 to 500), each shown for random control, matched control, and treatment]

Matching design and difference-in-differences: question answering sites

Hüseyin Oktay, Brian J. Taylor, and David D. Jensen. 2010. Causal discovery in social media using quasi-experimental designs. SOMA 2010.

Stack Overflow

Research question

● What happens after an answer is accepted?

● Does this inhibit people from answering?

The lifetime of a question

● Most answers are received shortly after a question is posed

● Over time, fewer answers are received

● At some point, an answer might be accepted by the asker

Measurement

● Rate after

● Rate before

● Answer rate change

Results

● Results indicate that the average answer rate change is negative, i.e. there are fewer answers after an answer is selected

What is suspicious about this?

Matching design

Treatment Control (matched)

The matched question: (1) has no accepted answer by t+Δt, (2) has similar N_t/t, and (3) has similar N_{t+Δt}/Δt

Difference-in-differences

● Difference-in-differences:
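A minimal sketch of a difference-in-differences computation on matched pairs of questions. The definitions used here are assumptions for illustration: the rate before is the number of answers up to time t divided by t, the rate after is the number of answers in the following window Δt divided by Δt, and the estimate is the mean over pairs of (treatment change) minus (control change).

```python
def answer_rate_change(n_before, t, n_after, dt):
    """Rate after minus rate before, under the assumed definitions above."""
    return n_after / dt - n_before / t

def diff_in_diff(pairs):
    """Mean of (treated change - control change) over matched pairs."""
    diffs = [treated - control for treated, control in pairs]
    return sum(diffs) / len(diffs)

# Toy numbers, invented for illustration only.
treated_change = answer_rate_change(n_before=6, t=2, n_after=2, dt=2)  # 1.0 - 3.0
control_change = answer_rate_change(n_before=6, t=2, n_after=1, dt=2)  # 0.5 - 3.0
estimate = diff_in_diff([(treated_change, control_change)])
```

In this toy case both rates drop, but the treated question drops less, so the estimate is positive even though each individual change is negative.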

Results

● The matching design shows that the answering rate change is more positive for treatment questions (those having a selected answer)

● Having a selected answer slows down the reduction in answering rate = more answers!

Propensity score matching: actions and outcomes

Alexandra Olteanu, Onur Varol, and Emre Kıcıman. Towards an Open-Domain Framework for Distilling the Outcomes of Personal Experiences from Social Media Timelines. International Conference on Web and Social Media (ICWSM), AAAI - Association for the Advancement of Artificial Intelligence, 17 May 2016.

All the slides in this section are from the authors' talks:

http://research.microsoft.com/apps/pubs/default.aspx?id=264353

Have a question? Ask the Internet!

should i go to law school

should i take a multivitamin

should i text her or wait for her to text me

should i join the military

should i leave my husband

should i get married

should i pop a burn blister

should i see a doctor

should i consolidate my student loans

should i do cardio before or after weights

should i get a tattoo

Idea

● Open-domain system to extract ...

Situation → Action → Outcomes

● … from social media

● Assume there will be many mistakes

● Attempt the best possible design

Example

T1: “I got a kitten! We named her Versace :-)”

T2: “No sleep because the damn kitten is nuts!”

Basic operations

(1) Extract timelines

(2) Match events

(3) Precedents and subsequents

Many sub-problems

● Identification of experiential messages

● Timestamping event occurrences

● Recognition and canonicalization of events

● Identification of precedent and subsequent events

● Identification of positive and negative valence of events

Experiential messages classifier

Personal Experiences:

– Just completed a 15.72km run with @RunKeeper. Check it out! <URL> #RunKeeper

– Just to set the mood I brought some Marvin Gaye and Chardonnay

– Lacrosse is so much fun why didn’t I start earlier lol

– Oh yeah guys we got a new puppy

Other (news, 3rd person, etc.):

– New campaign to protect children from second hand smoke launched <URL>

– Whoa. The kid from Cincinnati just suffered a horrible injury. Not good.

– @Bob I hear you.

– @Charlie did you enjoy your night at the club?

Naïve-Bayes classifier

● Features = collocated tokens

● 10k labeled tweets

● Fleiss’ kappa = 0.325

Label distribution:

● 26% of tweets mention personal experiences

● 8% mention goals/desires

● 66% are news/3rd person or other tweets
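For readers unfamiliar with the agreement statistic reported above, here is a minimal sketch of Fleiss' kappa, which compares the observed agreement among raters with the agreement expected by chance. The input format (one row per item, one column per category, each cell holding the number of raters who chose that category) and the toy counts are assumptions for illustration and are not the labeling data from the paper.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of per-item category counts.

    counts[i][j] = number of raters who put item i in category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: mean over items of the fraction of agreeing rater pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_chance = sum(p * p for p in proportions)

    return (p_observed - p_chance) / (1 - p_chance)

# Toy tables with 3 raters and 2 categories, invented for illustration.
perfect = [[3, 0], [0, 3]]
mixed = [[2, 1], [1, 2]]
```

A value of 1 means perfect agreement; values near 0 mean agreement no better than chance.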

Event identification

“I got a new kitten and he has blue eyes and stripes and I need a good name but nothing that’s normal”

Extracted phrases:

– I got a new kitten

– he has blue eyes

– stripes

– I need a good name

– but nothing that’s normal

I got a new kitten == got a cat, got a new cat, …

Kıcıman, Emre, and Matthew Richardson. "Towards decision support and goal achievement: Identifying action-outcome relationships from social media." KDD 2015.

http://dx.doi.org/10.1145/2783258.2783310

Alignment

Alignment and matching

Compare with both neighboring quadrants

Example subsequents

Each entry gives the event, an example message, and the PosNeg score.

Pros:

– cat named: “We just got a cat and named it Versace” (PosNeg 0.70)

– I’ve got a cat: “I’ve got a kitten asleep on my lap, and my heart has softened.” (PosNeg 0.67)

– Love my new kitten: “I love my new kitten” (PosNeg 0.88)

Cons:

– Ran upstairs: “But I ran upstairs and fell and now my head hurts” (PosNeg 0.20)

– Damn kitten: “… no sleep because the damn kitten kept going nuts…” (PosNeg 0.22)

– Cat is literally: “My cat is literally the devil” (PosNeg 0.31)

Example precedents

● Event: “personal record” in marathon

[Figure: precedent events; x-axis: days before marathon]

Improving matching

● Matching ideally should take many elements into account

● Can we take all the elements we know?

– Yes!

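● Propensity score matching matches subjects by their probability of receiving the treatment, P(T=1), estimated from what is known about them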

Propensity matching stratification

Features of a user are all of their past events

Propensity score (PS) estimator trained with the averaged perceptron learning algorithm; the extracted timelines are the training data.

Decile stratification

Propensity score matching

● You got a kitten

● According to what's known about your past, your probability of getting a kitten was x

● You will be matched with someone whose probability of getting a kitten was also x

– But who did not get a kitten

● Every stratum has a different imbalance

– Which is predictable

Matching design

● 39 situations in 9 groups

● Outcome is a binary variable

● Average effect: P(outcome|T) - P(outcome|C)
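A minimal sketch of the average effect for a binary outcome, computed within each stratum and then combined. Weighting each stratum by its number of treated users is an assumption made for this example; the slide does not say how strata are combined. The input is a list of strata, each a list of `(treated, outcome)` pairs with outcome 0 or 1.

```python
def average_effect(strata):
    """Weighted mean over strata of P(outcome|T) - P(outcome|C)."""
    total_treated = 0
    weighted_sum = 0.0
    for stratum in strata:
        treated = [o for is_treated, o in stratum if is_treated]
        control = [o for is_treated, o in stratum if not is_treated]
        if not treated or not control:
            continue  # no comparison is possible in this stratum
        effect = sum(treated) / len(treated) - sum(control) / len(control)
        weighted_sum += effect * len(treated)
        total_treated += len(treated)
    if total_treated == 0:
        return None
    return weighted_sum / total_treated

# Toy strata, invented for illustration only.
toy_strata = [
    [(True, 1), (True, 0), (False, 0), (False, 0)],
    [(True, 1), (False, 1), (False, 0)],
]
effect = average_effect(toy_strata)
```

Strata that contain only treated or only control users are skipped, since they offer no within-stratum comparison.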

Example: having high triglycerides level

Outcome Count Absolute Increase Z-Score

Your_risk 46 24.8% 18.12

Statin 48 23.1% 17.69

Lower 120 35.9% 17.18

Cardiovascular 54 23.0% 16.72

Healthy_diet 55 19.3% 16.54

Fatty_acid 29 18.3% 16.37

Help_prevent 73 26.9% 16.01

Risk_factor 33 18.3% 15.55

Fish_oil 48 24.4% 15.42

inflammation 78 25.1% 15.30

Example: having belly fat

Outcome Count Absolute Increase Z-Score

Burn 156 62.2% 8.96

Ab_workout 13 8.5% 5.82

Workout_lose 13 8.5% 5.82

Help_burn 8 11.1% 5.82

add_video 26 14.0% 5.75

url_playlist 26 14.0% 5.75

Fitness 39 18.6% 5.51

Ab 43 19.1% 5.51

Playlist_mention 30 15.3% 5.39

Biceps 7 4.7% 4.74

Evaluation

Labeling by non-experts (Mechanical Turk workers)

Usual measures: precision and recall

Precision @ Rank
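Precision at rank k is the fraction of the top k ranked items that were judged correct. A minimal sketch, assuming the outcomes are already sorted by the system's score and that each carries a True/False label from the workers; the toy labels are invented for illustration.

```python
def precision_at_k(ranked_labels, k):
    """Fraction of the first k items whose label is True."""
    top = ranked_labels[:k]
    if not top:
        return 0.0
    return sum(1 for label in top if label) / len(top)

# Toy judgments for four outcomes, best-ranked first.
labels = [True, True, False, True]
curve = [precision_at_k(labels, k) for k in range(1, len(labels) + 1)]
```

Plotting `curve` against k gives a precision-at-rank plot like the one named in the slide title.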

Summary

● No matching

– Requires randomization into treatment and control groups

● Matching

– Ideally is done on all known variables

● Propensity score matching

– Powerful tool to combine known variables

● Be very skeptical about your results!

Event: Install net to keep cat inside the house

Outcome: Learning that cats do whatever they want
