Post on 27-Dec-2015

Link Recommendation In P2P Social Networks

Yusuf Aytaş, Hakan Ferhatosmanoğlu, Özgür Ulusoy

Bilkent University, Ankara, Turkey

VLDB WOSS 2012

Outline

• Introduction

• Motivation for P2P Social Networks

• Link Recommendation

• P2P Top-k Common Neighbor

• Experiments

• Discussion

• Future Work

Introduction

• Social networks are mostly based on centralized infrastructure (“fat server thin client”).

• However, P2P infrastructure is a natural alternative for social networks.

• Problems with centralized infrastructure.

Problems with Centralized Systems

• Privacy: Social network providers can misuse users’ data.

• Censorship: Social network providers can censor what users share.

• Scalability: All data sits with the central provider, although it could be distributed over the network.

• These problems can be avoided in P2P social networks.

Advantages of P2P Systems

• Data can be maintained by the peers themselves; no separate server is needed.

• Each user can define their own level of privacy.

• Misuse of both linkage data and user data can be prevented.

• Accordingly, a significant amount of research is needed on algorithms and systems for P2P social networks.

P2P Social Network Challenges

• Algorithm Perspective
– Distributed graph algorithms
– P2P Performance

• Systems Perspective
– Storage
– Robustness
– Security

• SOWHOO: Our open source implementation
» https://github.com/yusufaytas/sowhoo

Social Network Algorithms on P2P Environment

• In a P2P Social Network, peers have limited information about the network.

• Known algorithms such as link prediction, community detection, and information diffusion should be revisited.

• The efficiency of the overlay network should be taken into account as well as algorithm accuracy.

• In this context, we propose a new approach: “Link Recommendation”.

Problem Background

• Common Neighbor: A node is more likely to interact with another node if the number of their shared neighbors is high.

• Top-K Query Processing: Finding the k objects that have the highest scores.

Id   S1
a    0.9
d    0.85
e    0.83
h    0.75
.    .
.    .

Id   S2
e    0.96
f    0.84
b    0.83
d    0.56
.    .
.    .

Id   S1     S2
a    0.9
e    0.83   0.96
d    0.85   0.56
f           0.84
b           0.83
h    0.75

Values for the four empty cells (placement not preserved): 0.23, 0.34, 0.41, 0.27
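The two definitions above can be illustrated with a minimal Python sketch: it counts shared neighbors for every non-neighbor of one node and keeps the k best candidates. The toy graph, the function name `top_k_common_neighbors`, and the tie-breaking by node id are illustrative assumptions, not taken from the slides, and the sketch assumes full knowledge of the graph, which a peer in a P2P network does not have.

```python
import heapq

def top_k_common_neighbors(graph, node, k):
    """Return up to k (candidate, shared_count) pairs for `node`.

    `graph` maps each node id to the set of its neighbors (undirected).
    Only nodes that are not already neighbors of `node` are candidates.
    """
    neighbors = graph[node]
    scores = {}
    for candidate, candidate_neighbors in graph.items():
        if candidate == node or candidate in neighbors:
            continue
        shared = len(neighbors & candidate_neighbors)
        if shared > 0:
            scores[candidate] = shared
    # Highest shared count first; ties broken by node id (an assumption).
    return heapq.nsmallest(k, scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy graph, not data from the slides.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "e"},
    "c": {"a", "e", "f"},
    "d": {"a", "e"},
    "e": {"b", "c", "d"},
    "f": {"c"},
}
print(top_k_common_neighbors(graph, "a", 2))  # [('e', 3), ('f', 1)]
```

Node e shares all three of a's neighbors, so it ranks first; f shares only c.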

Problem Background

• Zhang proposed a Common Neighbor algorithm (NCNP) to predict links in a distributed graph.

• Kermarrec proposed a distributed social graph embedding algorithm (SocS) for link prediction.

• We consider P2P environment settings.

• Our approach uses P2P Top-k retrieval to enhance performance.

• Scoring methods improve the network overlay.

Link Recommendation

• Link recommendation: suggesting new links by considering both neighborhood information and network performance.

• To measure both social information and P2P network properties, we use node scoring.

• We adapted Common Neighbors to a distributed environment using Fagin’s Algorithm (FA) and the Threshold Algorithm (TA).

Link Recommendation (Cont’d)

[Figure: example graph; surviving labels: 2, 23, 9, 5]

Node Scoring

• Node Importance

• Reputation Scoring

• P2P Systems Measures

• Composite Measures
– Trusted Centrality
– Available Authority

• Our weighting strategy may suggest friendships that improve the P2P topology.

Top-K Common Neighbor

[Figure: example graph with nodes A, B, C, D, E, F]

• Node A requests a new recommended node.

• Each node returns a recommended node.

• Node A evaluates the returned nodes and terminates if the algorithm has converged.

Top-K FA and TA Common Neighbor

• The Top-K FA Common Neighbor algorithm stops once it has received k recommended nodes from all neighbors.
– This generally results in the worst-case scenario.

• The Top-K TA Common Neighbor algorithm stops once it has k recommended nodes whose scores exceed the (approximated) threshold.
– The threshold is recalculated at each iteration.
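To make the two stopping rules concrete, here is a minimal Python sketch of the textbook Fagin's Algorithm and Threshold Algorithm over per-neighbor score lists. It is not the authors' P2P variant: the sum aggregation, the rule that a candidate missing from a list scores 0 there, the choice to count only sorted accesses as interactions, and all names and data are assumptions made for illustration.

```python
def rank(totals, k):
    """Top-k (candidate, total) pairs, highest total first, ties by id."""
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:k]

def total_score(lookup, candidate):
    """Random access: sum the candidate's score over every neighbor list."""
    return sum(table.get(candidate, 0) for table in lookup.values())

def fagin_algorithm(lists, k):
    """Stop once k candidates have been seen in every list (FA)."""
    lookup = {peer: dict(entries) for peer, entries in lists.items()}
    seen_in = {}   # candidate -> set of peers whose list has shown it
    accesses = 0   # number of sorted accesses
    for depth in range(max(len(entries) for entries in lists.values())):
        for peer, entries in lists.items():
            if depth < len(entries):
                candidate, _ = entries[depth]
                accesses += 1
                seen_in.setdefault(candidate, set()).add(peer)
        complete = sum(1 for peers in seen_in.values() if len(peers) == len(lists))
        if complete >= k:
            break
    totals = {c: total_score(lookup, c) for c in seen_in}
    return rank(totals, k), accesses

def threshold_algorithm(lists, k):
    """Stop once k candidates score at least the current threshold (TA)."""
    lookup = {peer: dict(entries) for peer, entries in lists.items()}
    totals = {}
    accesses = 0
    for depth in range(max(len(entries) for entries in lists.values())):
        last_seen = []
        for peer, entries in lists.items():
            if depth >= len(entries):
                last_seen.append(0)  # exhausted list: unseen candidates score 0
                continue
            candidate, score = entries[depth]
            accesses += 1
            last_seen.append(score)
            if candidate not in totals:
                totals[candidate] = total_score(lookup, candidate)
        threshold = sum(last_seen)  # recomputed at each iteration
        best = rank(totals, k)
        if len(best) == k and best[-1][1] >= threshold:
            return best, accesses
    return rank(totals, k), accesses

# Hypothetical lists: each neighbor reports candidates sorted by score.
lists = {
    "B": [("x", 10), ("y", 5), ("z", 4), ("w", 1)],
    "C": [("w", 6), ("z", 5), ("y", 4), ("x", 3)],
}
print(fagin_algorithm(lists, 1))      # ([('x', 13)], 6)
print(threshold_algorithm(lists, 1))  # ([('x', 13)], 4)
```

Both return the same answer on this input, but TA stops after four sorted accesses because the threshold (5 + 5 = 10) has already dropped below x's total of 13, while FA has to keep reading until some candidate has appeared in both lists.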

Setup For Experiments

• Synthetic and real data

• For real data
– Gnutella (6301 nodes and 20777 edges)
– Wikipedia (7115 nodes and 103689 edges)

• For synthetic data, we implemented:
– Uniformly distributed model,
– Small world model of Watts and Strogatz,
– Clustering model of Holme and Kim.

• We plan to use data from SOWHOO.
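As an illustration of one of the synthetic models listed above, the following is a minimal pure-Python sketch of a Watts and Strogatz small-world generator. It is not the authors' implementation; the function name, parameters, and adjacency-set representation are assumptions, and it expects an even `k` smaller than `n`.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Build a small-world graph as {node: set(neighbors)}.

    Start from a ring of n nodes, each linked to its k nearest
    neighbors (k even, k < n), then rewire each ring edge with
    probability p to a uniformly chosen non-neighbor.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular ring lattice.
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            u = (v + offset) % n
            adj[v].add(u)
            adj[u].add(v)
    # Rewiring pass: one removed edge is always replaced by one new edge,
    # so the total number of edges stays n * k / 2.
    for offset in range(1, k // 2 + 1):
        for v in range(n):
            u = (v + offset) % n
            if u in adj[v] and rng.random() < p:
                choices = [w for w in range(n) if w != v and w not in adj[v]]
                if not choices:
                    continue
                w = rng.choice(choices)
                adj[v].discard(u)
                adj[u].discard(v)
                adj[v].add(w)
                adj[w].add(v)
    return adj

graph = watts_strogatz(n=20, k=4, p=0.1, seed=1)
edge_count = sum(len(neighbors) for neighbors in graph.values()) // 2
print(edge_count)  # 40
```

With `p=0` the result is the plain ring lattice; larger `p` adds more long-range shortcuts. For larger experiments, the NetworkX library offers `watts_strogatz_graph` and `powerlaw_cluster_graph` (the latter follows the Holme and Kim model).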

Experiments (Performance)

• We evaluated the algorithms’ efficiency as the number of interactions versus the number of edges.

• An interaction (access) is one retrieval of a recommended node’s information, i.e., its weight and address, from a peer.

• We assigned weights to the network both globally and locally, drawn from power-law and uniform distributions.

• A global weight is a single value per node and does not depend on which node is looking at it. Local weights are assigned by each node and therefore differ from node to node.
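The difference between global and local weights can be sketched in a few lines of Python. This is one possible reading of the setup, not the authors' code: the function names, the Pareto shape parameter of 2.0 standing in for "power-law", and the assumption that each node assigns local weights to its own neighbors are all illustrative.

```python
import random

def draw_weight(rng, distribution):
    """Draw one weight from the named distribution."""
    if distribution == "uniform":
        return rng.random()            # uniform in [0, 1)
    if distribution == "power-law":
        return rng.paretovariate(2.0)  # Pareto, shape 2.0 (assumed), values >= 1
    raise ValueError(f"unknown distribution: {distribution}")

def global_weights(graph, distribution, seed=None):
    """One weight per node, identical from every node's point of view."""
    rng = random.Random(seed)
    return {v: draw_weight(rng, distribution) for v in sorted(graph)}

def local_weights(graph, distribution, seed=None):
    """Each node assigns its own weight to each of its neighbors."""
    rng = random.Random(seed)
    return {
        v: {u: draw_weight(rng, distribution) for u in sorted(graph[v])}
        for v in sorted(graph)
    }

# Hypothetical toy graph.
toy_graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
g = global_weights(toy_graph, "uniform", seed=7)
l = local_weights(toy_graph, "power-law", seed=7)
print(g["c"])                 # the single global weight of c
print(l["a"]["c"], l["b"]["c"])  # c's weight as seen by a and by b
```

Under global weights, c has one score no matter who asks; under local weights, a and b each hold their own value for c.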

Top-K TA vs. Top-K FA

Experiments (Accuracy)

• We evaluated the algorithms by comparing their recommended nodes against regular Common Neighbor as the baseline.

• We also need to evaluate by using:
– Rank of recommended nodes.
– Sum of weights for recommended nodes.

• Performance measure (ω) for the accuracy and efficiency trade-off: [equation not preserved in the transcript]
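The accuracy checks listed above could be computed along the following lines. This is a hedged sketch: the overlap fraction, the helper names, and the sample data are assumptions, and the ω measure is deliberately not reproduced because its definition is missing from the transcript.

```python
def overlap_with_baseline(recommended, baseline):
    """Fraction of the baseline's nodes that were also recommended."""
    if not baseline:
        return 0.0
    return len(set(recommended) & set(baseline)) / len(baseline)

def ranks_in_baseline(recommended, baseline_ranking):
    """1-based rank of each recommended node in the full baseline ranking.

    Nodes absent from the baseline ranking get None.
    """
    position = {node: i + 1 for i, node in enumerate(baseline_ranking)}
    return [position.get(node) for node in recommended]

def weight_sum(recommended, weights):
    """Sum of the node weights of the recommended nodes."""
    return sum(weights[node] for node in recommended)

# Hypothetical results for one querying node.
baseline_ranking = ["e", "d", "f", "h"]   # regular Common Neighbor order
recommended = ["e", "f"]                  # output of a top-k algorithm
weights = {"e": 5, "d": 4, "f": 2, "h": 1}

print(overlap_with_baseline(recommended, baseline_ranking[:2]))  # 0.5
print(ranks_in_baseline(recommended, baseline_ranking))          # [1, 3]
print(weight_sum(recommended, weights))                          # 7
```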

Top-K TA vs. Top-K FA

SOWHOO

• We are building a P2P Social Network application to test our algorithms.

[Figure: SOWHOO network diagram with two nodes labeled “Super Peer”]

SOWHOO (Cont’d)

• SOWHOO has 3 layers: application layer, system layer, and network layer.

[Figure: layer diagram showing Application Layer, System Layer, Network Layer]

• The Application Layer handles user requests and provides the user interface.

• The System Layer provides mechanisms like pub/sub, notify/update and so on.

• The Network Layer provides the messaging infrastructure between peers.
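To give a feel for the kind of mechanism the System Layer is described as providing, here is a minimal in-process publish/subscribe sketch in Python. It is not SOWHOO code: the class name `PubSub`, its methods, and the topic string are hypothetical, and a real P2P system would deliver messages over the network layer rather than through local callbacks.

```python
from collections import defaultdict

class PubSub:
    """Tiny topic-based publish/subscribe registry (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register `callback` to be called for every message on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver `message` to all subscribers; return how many received it."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(message)
        return len(callbacks)

# Hypothetical usage: a peer listens for friend updates.
received = []
bus = PubSub()
bus.subscribe("friend-updates", received.append)
delivered = bus.publish("friend-updates", {"peer": "A", "status": "online"})
print(delivered, received)
```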

Discussion

• We presented ongoing work on Link Recommendation.

• We proposed the P2P Top-K FA and TA Common Neighbor algorithms to find recommended links for a node.

• P2P Top-K TA Common Neighbor is significantly better than P2P Top-K FA Common Neighbor in terms of efficiency.

• We also presented weighting methods and proposed combined weights.

Future Work

• We are planning to improve the Top-K TA Common Neighbor algorithm into Top-K TA Common Neighbor+.

• We will test our algorithms against the accuracy measures we have discussed.

• We are planning to complete the implementation of SOWHOO.

• We will test our algorithms on data generated by SOWHOO.
