Date post: | 14-Apr-2018 |
Category: | Documents |
Upload: | jonathan-stray |
7/27/2019 Computational Journalism at Columbia, Fall 2013, Lecture 4: Social Filtering
Frontiers of
Computational Journalism
Columbia Journalism School
Week 3: Social Filtering, September 25, 2013
Week 5: Social Filtering
Finding sources on social media
Participatory Journalism
Information Distribution on Social Networks
Social Software
Classify Users
Classic machine learning problem. Classify each user as one of:
journalist/blogger
organization
ordinary individual
First, need to encode as a vector / select features...
Features for user classifier
# of followers/following
# of posts, favorites
percentage of posts that are RTs, @replies, links
presence/absence of named entities
topic distribution of tweets (IPTC top-level topics)
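The feature list above can be sketched as a function that turns one user's statistics into a numeric vector. This is only an illustration: the `UserStats` record, its field names, and the three-entry `TOPICS` list are assumptions made for the example, not the schema or the IPTC topic list used in the study.

```python
from dataclasses import dataclass

# Placeholder topic names; the real IPTC top-level list is longer.
TOPICS = ["politics", "sport", "economy"]

@dataclass
class UserStats:
    """Hypothetical per-user counts gathered from a profile and its tweets."""
    followers: int
    following: int
    posts: int
    favorites: int
    retweets: int            # posts that are RTs
    replies: int             # posts that are @replies
    with_links: int          # posts containing a link
    has_named_entities: bool
    topic_counts: dict       # topic name -> number of tweets on that topic

def encode(u):
    """Return a fixed-length list of floats describing the user."""
    n = max(u.posts, 1)  # avoid division by zero for users with no posts
    topic_total = sum(u.topic_counts.get(t, 0) for t in TOPICS) or 1
    vec = [
        float(u.followers), float(u.following),
        float(u.posts), float(u.favorites),
        u.retweets / n, u.replies / n, u.with_links / n,
        1.0 if u.has_named_entities else 0.0,
    ]
    # Topic distribution: share of the user's topical tweets in each topic.
    vec += [u.topic_counts.get(t, 0) / topic_total for t in TOPICS]
    return vec
```

Every user maps to a vector of the same length (8 + number of topics), which is what a distance-based classifier needs.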
Digression: IPTC Media Topic Codes
International standard hierarchical taxonomy, part of the NewsML markup system. Defined by Reuters, AP, NYTimes...
K-nearest neighbor classifier
Take K closest training points (in high dimensional feature space), choose majority label.
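A minimal sketch of the K-nearest neighbor rule described above, assuming Euclidean distance and plain Python lists as vectors. The function name `knn_classify` and the toy labels are illustrative; the slides do not say which distance metric or value of K was used.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    train: list of (vector, label) pairs; all vectors have the same length.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, the label seen first among the nearest points wins.
    return votes.most_common(1)[0][0]
```

Usage: `knn_classify([([0, 0], "individual"), ([5, 5], "organization")], [1, 1], k=1)` picks the label of the single closest point. In practice features on very different scales (follower counts vs. percentages) would usually be normalized first, otherwise the largest-valued feature dominates the distance.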
Creating the training data
1,850 random users
1,532 known organizations
1,490 known journalists and bloggers
Hired Mechanical Turk workers to apply labels.
Each user labeled by two workers, discarded if disagreement.
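The agreement rule above is simple to express in code. A sketch, assuming the two workers' labels arrive as a pair per user; the function name and data layout are hypothetical.

```python
def keep_agreed(labels_by_user):
    """Keep only users whose two worker labels match.

    labels_by_user: dict mapping user id -> (label_a, label_b).
    Returns a dict mapping user id -> the agreed label.
    """
    return {user: a for user, (a, b) in labels_by_user.items() if a == b}
```

Discarding disagreements trades training set size for cleaner labels.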
Classifier Accuracy
Eyewitness classifier
Goal is to find individual tweets that are eyewitness reports.
Started with LIWC (Linguistic Inquiry and Word Count) dictionary that classifies English words along 70 different dimensions, including emotion, cognition, time, health...
Word Aspects
Used perception category words
plus insight and certainty words
Eyewitness tweet classifier
It's an eyewitness tweet if it contains any of these special words! (or their stems)
High precision! Low recall.
89% of tweets classified as eyewitness actually were.
But only 32% of eyewitness tweets detected.
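A sketch of the keyword rule and of how precision and recall are computed for it. The stem list here is a tiny placeholder, not the LIWC perception/insight/certainty vocabulary, and prefix matching stands in for real stemming; both are assumptions for illustration.

```python
import re

# Placeholder stems; the actual classifier used LIWC category words.
EYEWITNESS_STEMS = ("see", "saw", "hear", "heard", "feel", "felt")

def is_eyewitness(text, stems=EYEWITNESS_STEMS):
    """True if any word in the tweet starts with one of the stems."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(w.startswith(s) for w in words for s in stems)

def precision_recall(predicted, actual):
    """predicted, actual: equal-length lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Precision answers "of the tweets flagged, how many were really eyewitness reports?" and recall answers "of the real eyewitness reports, how many were flagged?". A short word list tends to give the pattern on the slide: few false alarms, many misses.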
Other dimensions
Tweet contains URL to photo or video (used table of domain names, e.g. flickr.com = photo)
Posted from mobile device (from tweet metadata naming posting app)
Geocode user's stated location (this is painful and unreliable)
Distribution of friends' locations. (Friend = mutual following)
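Two of the dimensions above are easy to sketch: the domain-name table for media links and the "friend = mutual following" definition. Only the flickr.com entry comes from the slide; the other table entry, the function names, and the `following` dictionary layout are assumptions.

```python
from urllib.parse import urlparse

# Domain -> media type. flickr.com is from the slide; youtube.com is an
# illustrative addition.
MEDIA_DOMAINS = {"flickr.com": "photo", "youtube.com": "video"}

def media_type(url):
    """Return "photo", "video", or None based on the URL's domain."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return MEDIA_DOMAINS.get(host)

def friends(user, following):
    """Accounts that `user` follows and that follow `user` back.

    following: dict mapping each account to the set of accounts it follows.
    """
    return {v for v in following.get(user, set())
            if user in following.get(v, set())}
```

A table lookup like this misses shortened links and unknown hosts, so it is a cheap first pass rather than a complete detector.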
Test user reactions
"This gives you context, you have the context for whether or not you think they're reputable or whether or not they're worth reaching out to."
"It's giving me a lot of context which is really useful when you're trying to verify if someone is reputable or not."
"I would tend to focus on the eyewitnesses and journalists/bloggers. Eventually I'd look at everyone else but I'd want to start my search with those two groups because they would normally provide me with the most information."
Test user reactions
Popular features:
Eyewitness filtering, user location, image/video filter
Unpopular features:
Entity extraction not helpful, no ability to filter by location and eyewitness status, focus on users instead of content
Week 5: Social Filtering
Finding sources on social media
Participatory Journalism
Information Distribution on Social Networks
Social Software
[Figure: diagram with labels "User", "stories not covered", "filtering"]
Who user chooses to follow = social filtering
Week 5: Social Filtering
Finding sources on social media
Participatory Journalism
Information Distribution on Social Networks
Social Software
Twitter follower network
"We have crawled the entire Twitter site and obtained 41.7 million user profiles, 1.47 billion social relations, 4,262 trending topics, and 106 million tweets. In its follower-following topology analysis we have found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks"
- Kwak et al., What is Twitter, a Social Network or a News Media?
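"Low reciprocity" in the quote above can be made concrete with a small calculation. This sketch uses one common definition, the fraction of directed follow edges whose reverse edge also exists; the paper may define it differently, and the function name and edge-list layout are assumptions.

```python
def reciprocity(edges):
    """Fraction of directed edges (a, b) for which (b, a) is also present.

    edges: iterable of (follower, followed) pairs. Self-loops are ignored.
    """
    edge_set = {(a, b) for a, b in edges if a != b}
    if not edge_set:
        return 0.0
    mutual = sum(1 for a, b in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)
```

In a friendship network where every link is mutual this value is 1.0; a network where many people follow a few accounts that do not follow back scores much lower.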
More followings than followers
Small avg distance between two nodes (why? and what does this mean?)
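To experiment with the question above on a toy graph, average distance can be computed with breadth-first search from every node. A sketch, assuming the graph fits in memory as a dictionary and that every node appears as a key; unreachable pairs are skipped. The function name is illustrative.

```python
from collections import deque

def average_distance(following):
    """Mean shortest-path length over all reachable ordered pairs.

    following: dict mapping each node to an iterable of nodes it points to.
    """
    total = 0
    pairs = 0
    for source in following:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in following.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0
```

Trying this on a ring of nodes versus the same nodes all connected through one hub shows one plausible answer to "why?": a few high-degree accounts put most pairs within two hops of each other.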
It's a news network
Small number of high-degree hubs
It's a news network
Small number of high-degree hubs
Different network structure than e.g. Facebook.
Different uses.
Why?
Week 5: Social Filtering
Finding sources on social media
Participatory Journalism
Information Distribution on Social Networks
Social Software
Social Software
Basic assumption: structure of software influences how groups use it.
or: architecture influences behavior
Three ways to influence behavior
Norms: culture, habits, etiquette, the users' sense of what is right or appropriate
Laws: rules enforced by the administrator
Code: what it is actually possible to do
Design problem...
What do we want the users to accomplish together?
How do we encourage this?
We can write the code, but the culture is to some degree beyond our prediction or control.