Collating Social Network Profiles

Date post: 04-Jan-2016
Upload: nomlanga-fitzgerald
Page 1: Collating Social Network Profiles

Collating Social Network Profiles

Page 2: Collating Social Network Profiles

Objective

<Company Name> → System → <Twitter Profile, Facebook Profile, G+ Profile, …>

Page 3: Collating Social Network Profiles

Objective

Input: Company Name → System → Output: Social Network Profiles <Twitter Profile, Facebook Profile, G+ Profile, …>

Page 4: Collating Social Network Profiles

Record Linkage + Identity

Page 5: Collating Social Network Profiles

Agenda

Introduction
  Objective
  Contrast to Existing Work

Work Done
  Baseline System
  Individual Network Approach
  Machine Learning Experiments

Next Steps, Q&A

Page 6: Collating Social Network Profiles

Baseline System

Page 7: Collating Social Network Profiles

Ground Truth

Two networks: Facebook and Twitter

Top seventy 2013 Fortune 500 companies

Page 8: Collating Social Network Profiles

Baseline Algorithm

1. Take the company name.

2. Search the Facebook and Twitter APIs with it.

3. Return the first result from each.
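
The three steps above can be sketched as follows. This is a minimal illustration, not the presenters' code: the search functions are hypothetical stand-ins passed in by the caller, since the slides do not show which Facebook or Twitter API calls were used.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical search signature: takes a query string and returns
# profile identifiers in the order the network ranks them.
SearchFn = Callable[[str], List[str]]


def baseline_match(company: str, searchers: Dict[str, SearchFn]) -> Dict[str, Optional[str]]:
    """Return the first search hit per network, or None if a search is empty."""
    matches: Dict[str, Optional[str]] = {}
    for network, search in searchers.items():
        results = search(company)                            # step 2: query the network
        matches[network] = results[0] if results else None   # step 3: keep the top hit
    return matches


# Stand-in searchers with made-up results, for illustration only.
def fake_facebook_search(query: str) -> List[str]:
    return ["facebook.com/" + query.lower().replace(" ", "")]


def fake_twitter_search(query: str) -> List[str]:
    return []  # simulates a search that finds nothing


result = baseline_match(
    "Acme Corp",
    {"facebook": fake_facebook_search, "twitter": fake_twitter_search},
)
```

Because the baseline trusts each network's own ranking, any error in that ranking becomes an error in the output, which is what the later scoring approach tries to correct.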

Page 9: Collating Social Network Profiles

Baseline Performance

[Bar chart: correct matches out of 70 companies. Facebook: 34, Twitter: 52, Both: 30]

Page 10: Collating Social Network Profiles

Individual Network Approach

Page 11: Collating Social Network Profiles

New Approach

Score profiles based on:

Edit Distance
  Company Name – Username
  Company Name – Display Name

Relative Popularity

Page 12: Collating Social Network Profiles

[Figure: profile fields labeled Display Name and Username]

Page 13: Collating Social Network Profiles

New Approach

Score profiles based on:

Edit Distance
  Company Name – Username
  Company Name – Display Name

Relative Popularity

Page 14: Collating Social Network Profiles

Scoring

Edit Distance Score:

Popularity Score:
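
The formulas for these two scores did not survive in the transcript, so the sketch below only illustrates the general idea under stated assumptions. The edit distance score is assumed to be Levenshtein distance normalized by the longer string, the popularity score is assumed to be a candidate's follower count relative to the most popular candidate, and the two are simply added. The field names `username` and `followers` are hypothetical.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_score(company: str, name: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for no overlap."""
    a, b = company.lower(), name.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def popularity_scores(followers: List[int]) -> List[float]:
    """Assumed definition: follower count relative to the largest candidate."""
    top = max(followers, default=0)
    return [f / top if top else 0.0 for f in followers]


def best_profile(company: str, candidates: List[Dict]) -> Dict:
    """Pick the candidate with the highest combined score (needs >= 1 candidate)."""
    pops = popularity_scores([c["followers"] for c in candidates])
    scored = [(edit_score(company, c["username"]) + p, c)
              for c, p in zip(candidates, pops)]
    return max(scored, key=lambda pair: pair[0])[1]


candidates = [
    {"username": "walmart", "followers": 1000},
    {"username": "walmartfans", "followers": 10},
]
chosen = best_profile("Walmart", candidates)
```

Normalizing both scores to the range 0 to 1 keeps either signal from dominating the sum; a weighted combination would be an equally plausible choice.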

Page 15: Collating Social Network Profiles

Best Performing Combination

Correct matches (out of 70 companies):

Approach                              Facebook   Twitter   Both
Baseline                              34         52        30
Username Edit Distance + Popularity   40         50        34
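
To compare the two approaches as rates instead of raw counts, the chart values can be divided by the size of the ground truth. The snippet below assumes the denominator is the seventy companies from the Ground Truth slide, and that the first three chart values belong to the baseline and the last three to the username edit distance plus popularity combination.

```python
TOTAL_COMPANIES = 70  # assumed denominator: the seventy-company ground truth

# Correct-match counts as read from the chart (series assignment is an assumption).
correct = {
    "Baseline": {"Facebook": 34, "Twitter": 52, "Both": 30},
    "Username Edit Distance + Popularity": {"Facebook": 40, "Twitter": 50, "Both": 34},
}


def match_rates(counts, total=TOTAL_COMPANIES):
    """Convert correct-match counts into percentages, rounded to one decimal."""
    return {network: round(100 * n / total, 1) for network, n in counts.items()}


rates = {approach: match_rates(counts) for approach, counts in correct.items()}

for approach, by_network in rates.items():
    print(approach, by_network)
```

Under these assumptions the combination gains six companies on Facebook and four on "Both" while losing two on Twitter.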

Page 16: Collating Social Network Profiles

Machine Learning Experiments

Page 17: Collating Social Network Profiles

Freebase Ground Truth

1,422 with a social media presence

917 with Facebook, 687 with Twitter

598 with both

553 with valid profiles

Page 18: Collating Social Network Profiles

Training Set

553 Correct
553 Incorrect
1106 Total

Page 19: Collating Social Network Profiles

Cross Validation Results

Classifier                 Test | Train   Train | Test
Linear Regression          0.734          0.707
Gaussian Naïve Bayes       0.972          0.956
Multinomial Naïve Bayes    0.511          0.506
Bernoulli Naïve Bayes      0.720          0.701
Decision Tree              0.954          0.935
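
The slide does not say how the two columns were produced. One plausible reading, assumed here, is a two-fold split in which each half is used once for training and once for testing. The sketch below shows that procedure with a toy threshold classifier standing in for the real models; the single-feature examples and all function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]  # (feature value, label); label 1 = correct profile


def fit_stump(train: Sequence[Example]) -> Callable[[float], int]:
    """Toy stand-in classifier: choose the threshold with the most training hits."""
    best_threshold, best_hits = 0.0, -1
    for threshold, _ in train:
        hits = sum((x >= threshold) == bool(y) for x, y in train)
        if hits > best_hits:
            best_threshold, best_hits = threshold, hits
    return lambda x: int(x >= best_threshold)


def accuracy(predict: Callable[[float], int], data: Sequence[Example]) -> float:
    """Fraction of examples whose predicted label equals the true label."""
    return sum(predict(x) == y for x, y in data) / len(data)


def two_fold_scores(data: List[Example], seed: int = 0) -> Tuple[float, float]:
    """Shuffle, split in half, then train on each half and test on the other.

    Requires at least two examples so that neither half is empty.
    """
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    fold_a, fold_b = shuffled[:half], shuffled[half:]
    return (accuracy(fit_stump(fold_a), fold_b),
            accuracy(fit_stump(fold_b), fold_a))


# Made-up examples: high feature values for correct profiles, low for incorrect.
data = [(0.9, 1), (0.8, 1), (0.95, 1), (0.85, 1),
        (0.1, 0), (0.2, 0), (0.15, 0), (0.3, 0)]
scores = two_fold_scores(data)
```

Reporting both directions of the split, as the table appears to do, shows whether a classifier's accuracy depends on which half it was trained on.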

Page 20: Collating Social Network Profiles

Next Steps

Improve training set: provide harder examples

Page 21: Collating Social Network Profiles

Next Steps

Improve training set: provide harder examples

Incorporate more profile data

Page 22: Collating Social Network Profiles

Next Steps

Improve training set: provide harder examples

Incorporate more profile data

Build system around classifiers

Page 23: Collating Social Network Profiles

Agenda

Introduction
  Objective
  Contrast to Existing Work

Work Done
  Baseline System
  Individual Network Approach
  Machine Learning Experiments

Next Steps, Q&A

