Benchmarking the Privacy-Preserving People Search

Date post: 11-Jul-2015
Benchmarking the Privacy-Preserving People Search. Shuguang Han, Daqing He and Zhen Yue. SIGIR 2014 Workshop on Privacy-Preserved Information Retrieval.
Page 1: Benchmarking the Privacy-Preserving People Search

Benchmarking the Privacy-Preserving People Search

Shuguang Han, Daqing He and Zhen Yue

SIGIR 2014 Workshop on Privacy-Preserved Information Retrieval

Page 2: Benchmarking the Privacy-Preserving People Search

Social Match is Important in People Search

because closer social similarity makes it easier for people to connect.

Therefore, people search needs the users' social networks in order to return potential candidates who have either direct or indirect connections with the given user.

Page 3: Benchmarking the Privacy-Preserving People Search

But Privacy is a BIG Concern in People Search

• People search is often performed among the many users of SNS such as Facebook and LinkedIn.
• Users of SNS often either opt out of certain social networks or provide incomplete or even fake information.
• Data mining algorithms may not work, or may even harm the user experience, when fed such incomplete and noisy social information.

Page 4: Benchmarking the Privacy-Preserving People Search

People Search Studies Often Adopt Publicly Available Networks

Coauthor networks are often used because they lack privacy concerns.

However, this limits the types of people search being studied,

so we should also study other social networks that do have privacy concerns.

Page 5: Benchmarking the Privacy-Preserving People Search

Our Research

• Obtaining a privacy-preserving social network
  • By simulating it with the publicly available coauthor networks
• Assume coauthor networks can be used as surrogates
  • Motivation 1: many real-world social networks (including coauthor networks and many privacy-concerned networks such as Facebook social networks) share the same patterns [Ugander et al. 2011, Barabási et al. 1999, Watts et al. 1998]
    • All are small-world networks, and their degree distributions are highly skewed.
  • Motivation 2: social networks are all assortatively mixed, i.e. people prefer to connect with others who share similar features [Newman 2002]
    • whereas technological and biological networks seem to be disassortative.

Page 6: Benchmarking the Privacy-Preserving People Search

Research Focuses

• Privacy issues in people search
  • Global and/or local network features used in people search
    • Global network features: features that are propagated through the whole network
      • measured by the PageRank value computed on the whole social network
    • Local network features: features that are directly related to the ego-network of the querying user
      • measured by the proportion of common social connections
  • Querying users' privacy concerns and candidates' privacy concerns
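The two feature types above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The slide only says "PageRank" and "proportion of common social connections", so the damping factor, the iteration count, and the choice of a Jaccard-style ratio (shared neighbours over all neighbours of either person) are assumptions made here for illustration. The function names and the toy network are hypothetical.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected adjacency: person -> set of connections


def pagerank(graph: Graph, damping: float = 0.85, iterations: int = 50) -> Dict[str, float]:
    """Global feature: PageRank by power iteration over the whole network."""
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by isolated people is spread uniformly over everyone.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if graph[v]:
                share = damping * rank[v] / len(graph[v])
                for u in graph[v]:
                    new_rank[u] += share
        rank = new_rank
    return rank


def common_connection_ratio(graph: Graph, user: str, candidate: str) -> float:
    """Local feature: shared connections over all connections of either person.

    The Jaccard-style denominator is an assumption; the slide does not define it.
    """
    a = graph.get(user, set()) - {candidate}
    b = graph.get(candidate, set()) - {user}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy coauthor network with hypothetical names.
toy = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"bob"},
}
scores = pagerank(toy)                              # global feature for every person
local = common_connection_ratio(toy, "ann", "dan")  # local feature for one pair
```

The global score needs the whole network, while the local score only needs the two ego-networks involved. That is why missing links from the querying user and from the candidate can both affect the local feature.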


Page 7: Benchmarking the Privacy-Preserving People Search

Data

• Academic publication collection
  • 219,677 conference papers from the ACM Digital Library, published between 1990 and 2013.
  • Only publicly available information about a paper is used: the title, abstract and authors.
  • No further author disambiguation beyond the ACM Digital Library author ID.
  • In total, the collection contains 253,390 unique authors and 953,685 coauthor connection instances.
• Users' people search activities: taken from the evaluation of a people search system by Han et al. [5].
  • Four different people search tasks, each aimed at finding 5 candidates.
  • A baseline plain content-based people search system.
  • An experimental system that enhances people search with three interactive facets: content relevance, social similarity between the user and a candidate (the local network feature) and the authority of a candidate (the global network feature).
    • The experimental system allowed the querying users to tune the value associated with each facet in order to generate better candidate search results.
  • 24 participants were recruited for that user study.
    • At the beginning of the user study, each participant was asked to provide their publications and their close social connections (such as advisors).
    • In the post-task questionnaire, the participants were asked to rate the relevance of each marked candidate on a five-point Likert scale (1 for non-relevant and 5 for highly relevant).
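To make the three tunable facets concrete, here is a small Python sketch of one way such a ranking could work. The slide does not say how the experimental system combines the facets, so the normalised weighted sum is an assumption, and the facet keys, candidate names and scores are hypothetical.

```python
from typing import Dict, List, Tuple

FACETS = ("content", "social", "authority")  # hypothetical facet keys


def rank_candidates(
    candidates: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank candidates by a weighted sum of facet scores (assumed combination rule)."""
    total = sum(weights.get(f, 0.0) for f in FACETS)
    if total <= 0:
        raise ValueError("at least one facet weight must be positive")
    scored = []
    for name, facets in candidates.items():
        # Normalising by the weight total keeps scores comparable across settings.
        score = sum(weights.get(f, 0.0) * facets.get(f, 0.0) for f in FACETS) / total
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical facet scores in [0, 1] for two candidates.
candidates = {
    "cand_a": {"content": 0.9, "social": 0.1, "authority": 0.3},
    "cand_b": {"content": 0.6, "social": 0.8, "authority": 0.2},
}
content_only = rank_candidates(candidates, {"content": 1.0, "social": 0.0, "authority": 0.0})
social_heavy = rank_candidates(candidates, {"content": 1.0, "social": 2.0, "authority": 0.0})
```

With only the content facet switched on, the sketch behaves like the baseline content-based system. Raising the social weight moves socially close candidates up the list.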


Page 8: Benchmarking the Privacy-Preserving People Search

Simulating Privacy-Concern Networks

p_i is the probability that a given user i has a privacy concern, d_i is the degree of association of user i, d_max is the maximum degree in the network, and λ helps to establish different selection strategies.
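The simulation idea can be sketched in Python as follows. The formula that relates p_i, d_i, d_max and λ is not recoverable from this transcript, so the probability function is left pluggable. The default `example_probability` is a hypothetical stand-in, not the authors' formula. The sketch also assumes that a user with a privacy concern hides all of their connections. All names are illustrative.

```python
import random
from typing import Callable, Dict, Set

Graph = Dict[str, Set[str]]  # undirected adjacency: user -> set of connections


def example_probability(degree: int, max_degree: int, lam: float, rate: float = 0.3) -> float:
    """HYPOTHETICAL p_i = rate * (d_i / d_max) ** lam; not the formula from the slide."""
    if lam < 0:
        raise ValueError("this sketch only handles lam >= 0")
    ratio = degree / max_degree if max_degree else 0.0
    return min(1.0, rate * ratio ** lam)


def simulate_privacy_network(
    graph: Graph,
    lam: float,
    prob: Callable[[int, int, float], float] = example_probability,
    seed: int = 0,
) -> Graph:
    """Return a copy of the network in which privacy-concerned users hide all links."""
    rng = random.Random(seed)
    d_max = max((len(nbrs) for nbrs in graph.values()), default=0)
    # One random draw per user; sorted() keeps the draw order reproducible.
    hidden = {v for v in sorted(graph) if rng.random() < prob(len(graph[v]), d_max, lam)}
    return {v: set() if v in hidden else graph[v] - hidden for v in graph}


# Toy coauthor network with hypothetical names.
toy = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"bob"},
}
# Deterministic case: only the highest-degree user ("bob") has a privacy concern.
only_hub = simulate_privacy_network(
    toy, lam=1.0, prob=lambda d, d_max, lam: 1.0 if d == d_max else 0.0
)
# Random case using the hypothetical default probability.
sampled = simulate_privacy_network(toy, lam=1.0)
```

In the hypothetical default, `lam = 0` selects users uniformly at random and larger values concentrate privacy concerns on high-degree users. This is one way a single parameter could "establish different selection strategies".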

Evaluation:

• How the new network impacts people search performance, measured by MAP (mean average precision)
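For reference, MAP can be computed as in the short Python sketch below. It uses the standard binary-relevance definition. How the five-point Likert ratings were turned into relevant or non-relevant labels is not stated in the slides, so the relevant sets here are assumed to be given. The candidate names are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k at which a relevant candidate appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            precision_sum += hits / position
    # Relevant candidates that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over all (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)


# Two hypothetical search tasks.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
```

Comparing `map_score` on the original network with the score on a simulated privacy-concern network gives the kind of performance difference the evaluation refers to.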

Page 9: Benchmarking the Privacy-Preserving People Search

Impacts of Global Network Feature of Privacy-Concern Network on People Search

Page 10: Benchmarking the Privacy-Preserving People Search

Impacts of Local Network Feature of Privacy-Concern Network on People Search

Page 11: Benchmarking the Privacy-Preserving People Search

Impacts of Query Users' Privacy Concerns on People Search

Page 12: Benchmarking the Privacy-Preserving People Search

Insights

• Both the local and global network features are important for the performance of people search (compared to not using the social network).
  • Compared to the global network feature, the local network feature is more important.
• Privacy concerns reflected in local and global network features can significantly influence the performance of people search.
  • Privacy concerns of the high-degree candidates in the network have more impact on global features.
  • The local network feature is related to both the querying users and the candidates in the network.
    • Privacy concerns of both have a significant impact on people search performance.
    • Privacy concerns of high-degree candidates have a bigger influence on people search than those of lower-degree candidates, especially when those high-degree candidates are related to the querying user.
• We also find that if the querying users provide more social connections, search performance increases steadily.
