Date post: | 09-Apr-2017 |
Upload: | carlos-castillo-chato |
Observational Studies
Class: Data Mining Technology for Business and Society
Program: M. Sc. Data Science
University: Sapienza University of Rome
Semester: Spring 2016
Lecturer: Carlos Castillo, http://chato.cl/
Sources:
● Multiple papers, see beginning of each section.
Matching is a popular technique
Randomized controlled experiment
1. Response of subjects assigned to treatment compared to response of subjects assigned to control
2. Assignment of subjects to groups is done using a randomization device
3. Treatment is under the control of a researcher
Matching observational study
1. Response of subjects assigned to treatment compared to response of subjects assigned to control
2. Assignment of subjects to control is done by matching the characteristics and size of the treatment group
3. Treatment is not under the control of a researcher
Matching design: hurricanes and online friendships
Phan, Tuan Q., and Edoardo M. Airoldi. "A natural experiment of social network formation and dynamics." Proceedings of the National Academy of Sciences 112.21 (2015): 6595-6600.
Example: US universities and Hurricane Ike in 2008
Treatment n=5, Control n=10, Study group n=130
Selection of control group
● Facebook posts from 1.5M students in 130 universities
● Matched 5 affected with 10 unaffected:
– Similar: size, college ranking according to US News, whether the colleges are public or private institutions, tuition fees, and other regional factors
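A minimal sketch of this kind of control selection, assuming each university is described by a few numeric characteristics and that "similar" means a small Euclidean distance after standardizing each characteristic. The function names, the greedy strategy, and the toy numbers are illustrative assumptions, not the procedure used in the paper.

```python
from math import sqrt

def standardize(rows):
    """Scale each covariate column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def distance(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_controls(treated, candidates, k=2):
    """Greedy matching without replacement: k closest candidates per treated unit."""
    units = {**treated, **candidates}
    names = list(units)
    scaled = dict(zip(names, standardize([units[n] for n in names])))
    available = set(candidates)
    matches = {}
    for t in treated:
        ranked = sorted(available, key=lambda c: (distance(scaled[t], scaled[c]), c))
        matches[t] = ranked[:k]
        available -= set(ranked[:k])
    return matches

# Hypothetical covariates: (students in thousands, ranking, tuition in $1000s).
treated = {"T1": [30, 50, 10]}
candidates = {"C1": [31, 52, 11], "C2": [29, 48, 9], "C3": [5, 150, 40]}
matches = match_controls(treated, candidates, k=2)
```

With k=2 this mirrors the 5-to-10 ratio above: every affected unit receives two unaffected units that resemble it.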
Results (red=treatment, blue=control)
Both undergo densification
Treatment has larger clustering coefficient (more triangles)
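To make "more triangles" concrete, here is a small sketch of the average local clustering coefficient on an undirected friendship graph. Which variant of the coefficient the paper reports is not stated on the slide, so this choice and the helper names are assumptions.

```python
from itertools import combinations

def build_graph(edges):
    """Undirected graph as a dict: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves connected."""
    neighbors = adj[node]
    if len(neighbors) < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    possible = len(neighbors) * (len(neighbors) - 1) / 2
    return links / possible

def average_clustering(adj):
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

triangle = build_graph([("a", "b"), ("b", "c"), ("c", "a")])
path = build_graph([("a", "b"), ("b", "c")])
```

A closed triangle scores 1.0 and an open path scores 0.0, so a group whose friendships close more triangles ends up with a larger coefficient.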
Matching design: exercise and stress as reflected on Twitter
Dos Reis, Virgile Landeiro, and Aron Culotta: Using matched samples to estimate the effects of exercise on mental health from Twitter. Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.
[Diagram: Exercise and Mood]
Design
1) Detect exercise at time t1
– Post a message containing hashtag #runkeeper, #nikeplus, #runtastic, #endomondo, #mapmyrun, #strava, #cyclemeter, #fitstats, #mapmyfitness, or #runmeter.
2) Measure mood at time t2 > t1
– Automatic classifier of mood, three important classes: hostility (or anger), depression (or dejection), anxiety (or tension)
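The first step of this design can be sketched as follows, assuming a timeline is a list of (timestamp, text) pairs. The hashtag list comes from the slide; the function names and the data layout are illustrative assumptions, and the mood classifier itself is not reproduced here.

```python
import re

EXERCISE_TAGS = {
    "#runkeeper", "#nikeplus", "#runtastic", "#endomondo", "#mapmyrun",
    "#strava", "#cyclemeter", "#fitstats", "#mapmyfitness", "#runmeter",
}
HASHTAG = re.compile(r"#\w+")

def mentions_exercise(text):
    """True if the message contains one of the exercise-app hashtags."""
    return any(tag.lower() in EXERCISE_TAGS for tag in HASHTAG.findall(text))

def split_timeline(tweets):
    """Return (t1, messages posted after t1), or None if no exercise is detected.

    t1 is the time of the first exercise message; mood would then be measured
    on the later messages (t2 > t1).
    """
    ordered = sorted(tweets)
    for i, (timestamp, text) in enumerate(ordered):
        if mentions_exercise(text):
            later = [tw for tw in ordered[i + 1:] if tw[0] > timestamp]
            return timestamp, later
    return None

timeline = [
    (3, "such a horrible day"),
    (1, "hello"),
    (2, "Just completed a run #RunKeeper"),
]
result = split_timeline(timeline)
```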
Mood classifier
● Hostility (or anger)
– “shut your freaking yaphole”
● Depression (or dejection)
– “such a horrible day”
● Anxiety (or tension)
– “nervous for Monday”
Control = Random users (same country and language)
[Bar chart: percent change after exercising]
● Hostility: -21.1
● Dejection: -5.4
● Anxiety: -7.9
Problem: missing variables
[Diagram: Exercise, Mood, and Demographics]
Matching method ...
● For each user in treatment, find another user that:– Is a reciprocal friend of the user
– In same city/state
– With same gender
– Closest number of followers, followees, tweets
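A sketch of these matching rules, assuming each user record stores city, gender, counts, and the set of accounts they follow. How the three counts are combined into one "closeness" is not given on the slide; the sum of log differences below is an assumption, as are all field names and toy users.

```python
from math import log1p

def profile_gap(a, b):
    """Assumed closeness measure: sum of absolute log differences of the counts."""
    return sum(abs(log1p(a[k]) - log1p(b[k]))
               for k in ("followers", "followees", "tweets"))

def find_match(user_id, users):
    """Pick the reciprocal friend in the same city and of the same gender
    whose follower, followee, and tweet counts are closest."""
    user = users[user_id]
    candidates = [
        other for other in user["friends"]
        if other in users
        and user_id in users[other]["friends"]      # reciprocal friendship
        and users[other]["city"] == user["city"]
        and users[other]["gender"] == user["gender"]
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: (profile_gap(user, users[c]), c))

def make_user(city, gender, followers, followees, tweets, friends):
    return {"city": city, "gender": gender, "followers": followers,
            "followees": followees, "tweets": tweets, "friends": set(friends)}

# Hypothetical users: "u" is in the treatment group.
users = {
    "u": make_user("Chicago, IL", "f", 110, 200, 900, ["a", "b", "c", "d"]),
    "a": make_user("Chicago, IL", "f", 100, 210, 950, ["u"]),
    "b": make_user("Chicago, IL", "f", 5000, 50, 20000, ["u"]),
    "c": make_user("Chicago, IL", "m", 110, 200, 900, ["u"]),   # different gender
    "d": make_user("Chicago, IL", "f", 110, 200, 900, []),      # not reciprocal
}
match = find_match("u", users)
```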
Control = Matched users (blue)
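[Bar chart: percent change after exercising, random control vs. matched control]
● Hostility: -21.1 (random control), 0.9 (matched control)
● Dejection: -5.4 (random control), -3.9 (matched control)
● Anxiety: -7.9 (random control), -2.7 (matched control)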
Can you guess a possible explanation?
[Bar chart: %Female and %from CA for random control, matched control, and treatment; axis from 0 to 60]
[Bar chart: #Followers and #Friends for random control, matched control, and treatment; axis from 0 to 500]
Matching design and difference-in-differences: question answering sites
Hüseyin Oktay, Brian J. Taylor, and David D. Jensen. 2010. Causal discovery in social media using quasi-experimental designs. SOMA 2010.
Stack Overflow
Research question
● What happens after an answer is accepted?
● Does this inhibit people from answering?
The lifetime of a question
● Most answers are received shortly after a question is posed
● Over time, fewer answers are received
● At some point, an answer might be accepted by the asker
Measurement
● Rate after
● Rate before
● Answer rate change
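The formulas for these three quantities did not survive on the slide. A plausible reading, assuming an answer is accepted at time t and the question is observed for a further window Δt, is: rate before = answers up to t divided by t, rate after = answers in the following window divided by Δt, and the change is their difference. The sketch below encodes that assumption; the function name and toy numbers are illustrative.

```python
def answer_rates(answer_times, t, delta):
    """Return (rate_before, rate_after, change) around acceptance time t.

    answer_times: times (since the question was posted) at which answers arrived.
    t: time at which an answer was accepted, must be > 0.
    delta: length of the observation window after t, must be > 0.
    """
    before = sum(1 for a in answer_times if a <= t)
    after = sum(1 for a in answer_times if t < a <= t + delta)
    rate_before = before / t
    rate_after = after / delta
    return rate_before, rate_after, rate_after - rate_before

# Hypothetical question: four early answers, one late answer.
rate_before, rate_after, change = answer_rates([1, 2, 3, 4, 10], t=5, delta=5)
```

Because most answers arrive early regardless of acceptance, this change tends to be negative for almost any question.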
Results
● Results indicate that the average answer rate change is negative, i.e. there are fewer answers after an answer is selected
What is suspicious about this?
Matching design
Treatment Control (matched)
The matched question: (1) has no accepted answer by t+Δt, (2) has similar N_t/t, and (3) has similar N_{t+Δt}/Δt
Difference-in-differences
● Difference-in-differences:
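The formula is missing from the slide. The usual form of the estimate, assuming one matched control question per treatment question, is the change in the treatment group minus the change in the control group, averaged over pairs. The sketch below uses that standard form; the input layout and numbers are hypothetical.

```python
def diff_in_diff(pairs):
    """Average difference-in-differences over matched pairs.

    pairs: list of ((treat_before, treat_after), (ctrl_before, ctrl_after)),
    where each value is an answering rate.
    """
    diffs = [
        (treat_after - treat_before) - (ctrl_after - ctrl_before)
        for (treat_before, treat_after), (ctrl_before, ctrl_after) in pairs
    ]
    return sum(diffs) / len(diffs)

# Hypothetical rates: both groups slow down, the treated question slows down less.
pairs = [
    ((0.8, 0.4), (0.8, 0.2)),
    ((1.0, 0.5), (1.0, 0.5)),
]
estimate = diff_in_diff(pairs)
```

A positive estimate means the treated questions lost less of their answering rate than their matched controls, even though both rates went down.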
Results
● The matching design shows that the answering rate change is more positive for treatment questions (those having a selected answer)
● Having a selected answer slows down the reduction in answering rate = more answers!
Propensity score matching: actions and outcomes
Alexandra Olteanu, Onur Varol, and Emre Kıcıman. Towards an Open-Domain Framework for Distilling the Outcomes of Personal Experiences from Social Media Timelines. International Conference on Web and Social Media (ICWSM), AAAI - Association for the Advancement of Artificial Intelligence, 17 May 2016.
All the slides in this section are from the authors' talks.
Have a question? Ask the Internet!
should i go to law school
should i take a multivitamin
should i text her or wait for her to text me
should i join the military
should i leave my husband
should i get married
should i pop a burn blister
should i see a doctor
should i consolidate my student loans
should i do cardio before or after weights
should i get a tattoo
Idea
● Open-domain system to extract ...
Situation → Action → Outcomes
● … from social media
● Assume there will be many mistakes
● Attempt the best possible design
Example
T1: “I got a kitten! We named her Versace :-)”
T2: “No sleep because the damn kitten is nuts!”
Basic operations
(1) Extract timelines
(2) Match events
(3) Precedents and subsequents
Many sub-problems
● Identification of experiential messages
● Timestamping event occurrences
● Recognition and canonicalization of events
● Identification of precedent and subsequent events
● Identification of positive and negative valence of events
Experiential messages classifier
Personal experiences:
● Just completed a 15.72km run with @RunKeeper. Check it out! <URL> #RunKeeper
● Just to set the mood I brought some Marvin Gaye and Chardonnay
● Lacrosse is so much fun why didn’t I start earlier lol
● Oh yeah guys we got a new puppy
Other (news, 3rd person, etc.):
● New campaign to protect children from second hand smoke launched <URL>
● Whoa. The kid from Cincinnati just suffered a horrible injury. Not good.
● @Bob I hear you.
● @Charlie did you enjoy your night at the club?
● Naïve-Bayes classifier
– Features = collocated tokens
– 10k labeled tweets
– Fleiss’ kappa = 0.325
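A toy version of such a classifier, as a rough sketch only. It assumes "collocated tokens" can be approximated by single tokens plus adjacent token pairs, uses add-one smoothing, and trains on four made-up messages instead of the 10k labeled tweets. None of these choices are confirmed by the slide.

```python
from collections import Counter, defaultdict
from math import log

def features(text):
    """Unigrams plus adjacent-token pairs (an assumed reading of 'collocated tokens')."""
    tokens = text.lower().split()
    pairs = [a + "_" + b for a, b in zip(tokens, tokens[1:])]
    return tokens + pairs

class NaiveBayes:
    def fit(self, examples):
        self.counts = defaultdict(Counter)  # label -> feature counts
        self.docs = Counter()               # label -> number of messages
        for text, label in examples:
            self.counts[label].update(features(text))
            self.docs[label] += 1
        self.vocab = {f for counts in self.counts.values() for f in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, float("-inf")
        for label, counts in self.counts.items():
            denom = sum(counts.values()) + len(self.vocab)
            score = log(self.docs[label] / total_docs)       # class prior
            for f in features(text):
                score += log((counts[f] + 1) / denom)         # add-one smoothing
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training messages, far smaller than a realistic labeled set.
TOY_EXAMPLES = [
    ("i got a new kitten", "personal"),
    ("i love my new puppy", "personal"),
    ("new campaign launched today", "other"),
    ("team wins the game", "other"),
]
model = NaiveBayes().fit(TOY_EXAMPLES)
```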
● 26% of tweets mention personal experiences
● 8% mention goals/desires
● 66% are news/3rd person or other tweets
Event identification
“I got a new kitten and he has blue eyes and stripes and I need a good name but nothing that’s normal”
I got a new kitten
he has blue eyes
but nothing that’s normal
stripes
I need a good name
== got a cat, got a new cat, …
Kıcıman, Emre, and Matthew Richardson. "Towards decision support and goal achievement: Identifying action-outcome relationships from social media." KDD 2015.
Alignment
Alignment and matching
Compare with both neighboring quadrants
Example subsequents
Group | Event | Example | PosNeg
Pros | cat named | We just got a cat and named it Versace | 0.70
Pros | I’ve got a cat | I’ve got a kitten asleep on my lap, and my heart has softened. | 0.67
Pros | Love my new kitten | I love my new kitten | 0.88
Cons | Ran upstairs | But I ran upstairs and fell and now my head hurts | 0.20
Cons | Damn kitten | … no sleep because the damn kitten kept going nuts… | 0.22
Cons | Cat is literally | My cat is literally the devil | 0.31
Example precedents
● Event: “personal record” in marathon
[Figure: precedent events by days before marathon]
Improving matching
● Matching ideally should take many elements into account
● Can we take all the elements we know?
– Yes!
● Propensity matching matches by the propensity score P(T=1 | known variables), i.e. the estimated probability of receiving the treatment
Propensity matching stratification
Features of a user are all of their past events
PS estimator trained with the averaged perceptron learning algorithm; the extracted timelines are the training data.
Decile stratification
Propensity score matching
● You got a kitten
● According to what's known about your past, your probability of getting a kitten was x
● You will be matched with someone whose probability of getting a kitten was also x
– But who did not get a kitten
● Every stratum has a different imbalance
– Which is predictable
Matching design
● 39 situations in 9 groups
● Outcome is binary variable
● Average effect: P(outcome|T) - P(outcome|C)
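The average effect can be computed directly from counts. A z-score for it can be obtained in several ways; the pooled two-proportion test below is one standard option and is an assumption here, since the slides do not say how the z-scores were computed. Counts are hypothetical.

```python
from math import sqrt

def absolute_increase(t_hits, t_n, c_hits, c_n):
    """P(outcome | T) - P(outcome | C) from raw counts."""
    return t_hits / t_n - c_hits / c_n

def two_proportion_z(t_hits, t_n, c_hits, c_n):
    """Pooled two-proportion z-score (assumed test, not confirmed by the source).

    Undefined when the pooled proportion is exactly 0 or 1.
    """
    pooled = (t_hits + c_hits) / (t_n + c_n)
    se = sqrt(pooled * (1 - pooled) * (1 / t_n + 1 / c_n))
    return absolute_increase(t_hits, t_n, c_hits, c_n) / se

# Hypothetical counts: 60 of 100 treated users mention the outcome, 40 of 100 controls do.
increase = absolute_increase(60, 100, 40, 100)
z = two_proportion_z(60, 100, 40, 100)
```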
Example: having a high triglycerides level
Outcome          Count   Absolute increase   Z-score
Your_risk        46      24.8%               18.12
Statin           48      23.1%               17.69
Lower            120     35.9%               17.18
Cardiovascular   54      23.0%               16.72
Healthy_diet     55      19.3%               16.54
Fatty_acid       29      18.3%               16.37
Help_prevent     73      26.9%               16.01
Risk_factor      33      18.3%               15.55
Fish_oil         48      24.4%               15.42
inflammation     78      25.1%               15.30
Example: having belly fat
Outcome            Count   Absolute increase   Z-score
Burn               156     62.2%               8.96
Ab_workout         13      8.5%                5.82
Workout_lose       13      8.5%                5.82
Help_burn          8       11.1%               5.82
add_video          26      14.0%               5.75
url_playlist       26      14.0%               5.75
Fitness            39      18.6%               5.51
Ab                 43      19.1%               5.51
Playlist_mention   30      15.3%               5.39
Biceps             7       4.7%                4.74
Evaluation
● Labeling by non-experts (Mechanical Turk workers)
● Usual measures: precision and recall
Precision @ Rank
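Precision at rank k is the fraction of the top k ranked outcomes that the labelers judged relevant. A small sketch, with hypothetical outcome names and hypothetical relevance labels:

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the first k ranked items that are in the relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked[:k]
    return sum(1 for item in top if item in relevant) / k

# Hypothetical ranking of extracted outcomes and hypothetical worker labels.
ranked = ["o1", "o2", "o3", "o4"]
relevant = {"o1", "o2", "o4"}
p_at_1 = precision_at_k(ranked, relevant, 1)
p_at_3 = precision_at_k(ranked, relevant, 3)
p_at_4 = precision_at_k(ranked, relevant, 4)
```

Dividing by k even when fewer than k items are ranked is a common convention; it penalizes short result lists.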
Summary
● No matching
– Requires randomization into treatment and control groups
● Matching
– Ideally is done on all known variables
● Propensity score matching
– Powerful tool to combine known variables
● Be very skeptical about your results!
Event: Install net to keep cat inside the house
Outcome: Learning that cats do whatever they want