Peer to Peer Systems and File Sharing Carl Lagoze – CS431 – Cornell University May 3, 2004 Portions borrowed from sources listed on next slide
Transcript
Page 1: L27

Peer to Peer Systems and File Sharing

Carl Lagoze – CS431 – Cornell University
May 3, 2004

Portions borrowed from sources listed on next slide

Page 2: L27

Sources of this lecture

• J. Berkes, Decentralized Peer-to-Peer Network Architecture: Gnutella and Freenet

• R. Morris, Chord+DHash+Ivy: Building Principled Peer-to-Peer Systems

• S. Kamvar, M. Schlosser, H. Garcia Molina, EigenRep: Reputation Management in P2P Networks

• J. Golbeck, B. Paris, J. Hendler, Trust Networks on the Semantic Web

Page 3: L27

Characteristics of P2P Network

• Sharing of computing resources by direct exchange

• Blur between clients, servers, routers
• Nodes are autonomous

Page 4: L27

P2P Advantages

• Efficient use of resources
• Scalability
• Reliability
• Administrative simplicity
• Democracy

Page 5: L27

P2P’s Political History

• Major basis in music sharing context
• Overshadows numerous applications
• Recent research is investigating generic applicability
  – DHTs
  – Reputation and Trust

Page 6: L27

Small-World Phenomenon

• Milgram’s “six degrees of separation” (1967)
• Forwarding of letters from Nebraska to Boston, MA
• Average chain of about six hops

Page 7: L27

Power Laws and Small Worlds

• Out-degree distribution follows a power law:
  – Fraction of nodes with out-degree k is proportional to 1/k^α, where α > 0
• Characteristic of a variety of phenomena
  – Web graph
  – IMDb connections (acted in the same movie)
  – Social interactions
  – P2P networks (Gnutella)
  – Epidemiology

Page 8: L27

Strength of Weak Ties

• Extension of power-law phenomenon

• Short-cuts (between cliques) critical to small world phenomenon

Page 9: L27

Napster and P2P

• Not really P2P
• Central search index
• Direct interaction for access (P2P)
• Central index was key to litigation

Page 10: L27

Gnutella

• Fully P2P
• Flooded query
  – Scalability problems
  – TTL controls broadcast
  – “Query memory” controls circularity
• Reliability problems
• But “whom to sue?”
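The flooded query with a TTL and a "query memory" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Gnutella's wire protocol: the overlay graph, the peer names, the `flood_query` function and the shared-file table are all hypothetical. It shows how the TTL bounds how far the broadcast travels, how remembering seen queries stops loops, and how the message count still grows quickly with the TTL.

```python
from collections import deque

def flood_query(graph, origin, wanted, files, ttl):
    """Flood a query through the overlay with a hop limit.

    graph: dict mapping peer -> list of neighbor peers (hypothetical overlay)
    files: dict mapping peer -> set of file names that peer shares
    Returns (peers holding `wanted`, number of query messages sent).
    """
    seen = {origin}                 # "query memory": peers that already saw the query
    hits = []
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if wanted in files.get(peer, set()):
            hits.append(peer)
        if remaining == 0:
            continue                # TTL exhausted: do not forward any further
        for neighbor in graph[peer]:
            messages += 1           # every forward costs one message
            if neighbor in seen:
                continue            # duplicate is dropped, so cycles cannot loop
            seen.add(neighbor)
            queue.append((neighbor, remaining - 1))
    return hits, messages

# Hypothetical five-peer overlay; only d and e share the file.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
shared = {"d": {"song.mp3"}, "e": {"song.mp3"}}

print(flood_query(overlay, "a", "song.mp3", shared, ttl=2))
print(flood_query(overlay, "a", "song.mp3", shared, ttl=3))
```

With the smaller TTL the query never reaches peer e, which illustrates the reliability problem: a file can exist in the network and still not be found.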

Page 11: L27

Kazaa

• Hybrid of Napster model and Gnutella model

• Notion of a super peer
  – Like a regionalized Napster server
  – Dynamically chosen by characteristics
• P2P relationship among super-peers
• Queries directed towards super-peers

Page 12: L27

“Free-riding”

• Definition: downloading but not sharing any data

• On Gnutella networks, 15% of users contribute 94% of content

Page 13: L27

Freenet

• Goal: create an uncensorable and secure global information store
  – Anonymity and fault tolerance
• http://freenet.sourceforge.net/
• Three types of network messages:
  – Advertise storage space to store unknown data
  – Insert a file into the network
  – Request a file (with a key) from the network
• Use of one-way secure hashes to identify files and encryption to store files
  – Node does not know what it is storing
• Non-traceability of messages
  – A node cannot determine where its message is stored
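The idea of identifying files by a one-way hash can be sketched as follows. This is a minimal illustration, not Freenet's actual key scheme: SHA-256 is an assumption (the slide does not name a hash function), the `Node` class and function names are hypothetical, and encryption of the stored bytes is left out entirely. It shows only that a file is inserted and requested by its hash, and that the requester can check the returned bytes against the key.

```python
import hashlib

def content_key(data: bytes) -> str:
    """Key for a file: a one-way hash of its bytes (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()

class Node:
    """Hypothetical storage node that holds opaque bytes under hash keys."""

    def __init__(self):
        self.store = {}

    def insert(self, data: bytes) -> str:
        key = content_key(data)
        self.store[key] = data      # the node indexes by key, not by file name
        return key

    def request(self, key: str):
        return self.store.get(key)  # None if this node does not hold the key

def verify(key: str, data) -> bool:
    """Requester-side check that returned bytes really match the key."""
    return data is not None and content_key(data) == key

node = Node()
key = node.insert(b"some document")
print(key[:16], verify(key, node.request(key)))
```

Because the key is derived from the content, a node that returns altered bytes is detected by `verify`; hiding the content from the storing node would additionally require the encryption step that this sketch omits.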

Page 14: L27

Freenet Request/Response Sequence

Page 15: L27

Distributed Hash Tables (DHT)

• Overcoming the flooded search problem
• Operationally like standard hash tables
• Data is distributed around the network
• Features
  – Efficient:
    • O(log N) messages per lookup
    • Even distribution of keys among nodes
  – Adaptable:
    • Network reconfiguration does not cascade to all nodes
  – Robust: replication of tables allows survival of node failures

Page 16: L27

Chord

• One implementation of DHT within a larger P2P project

• http://www.pdos.lcs.mit.edu/chord/
• Algorithm properties
  – Common hash function distributes node ID (IP) and document ID uniformly
  – Maps a content key to its successor node

Page 17: L27

Chord Key Mapping

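[Figure: circular ID space with nodes N10, N32, N60, N80, N100. Each key ID is stored at its successor node ID: K5, K10 at N10; K11, K30 at N32; K33, K40, K52 at N60; K65, K70 at N80; K100 at N100.]

Robustness via each node remembering N successors and replicating its table at those successors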
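The key-to-successor mapping on this slide can be sketched with a sorted list of node IDs. This is an illustrative sketch rather than Chord's implementation: the function names are hypothetical, the node and key IDs are the ones shown on the slide, and the replica count `r` is an arbitrary parameter chosen for the example.

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """First node whose ID is >= key_id, wrapping around the ring.

    node_ids must be sorted in ascending order.
    """
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]

def replicas(node_ids, key_id, r):
    """The key's successor plus the next r - 1 nodes clockwise (replica holders)."""
    start = bisect_left(node_ids, key_id)
    return [node_ids[(start + j) % len(node_ids)] for j in range(r)]

nodes = [10, 32, 60, 80, 100]       # N10, N32, N60, N80, N100 from the slide

for key in (5, 10, 11, 30, 33, 40, 52, 65, 70, 100):
    print(f"K{key} -> N{successor(nodes, key)}")

print(replicas(nodes, 100, 2))      # owner of K100 and the next node on the ring
```

A key larger than every node ID wraps around to the smallest node, which is what makes the ID space circular.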

Page 18: L27

Use of finger table to avoid linear lookups

The i-th finger table entry points to the first node that succeeds n by at least 2^(i-1) on the ID circle

Page 19: L27

Key location with finger table

• Use finger table to find furthest node that precedes key

• O(log N) hops lead to the target
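The finger-table lookup can be sketched as below. Several things here are assumptions made for illustration: a 7-bit ID space, the convention that finger i (counting from 1) is the successor of n + 2^(i-1), the node IDs reused from the earlier key-mapping slide, and all function names. Every node's table is computed from a global list, which a real deployment would not have; the point is only to show the "furthest finger that precedes the key" step.

```python
from bisect import bisect_left

M = 7                        # ID bits (assumption); the ring has 2**M positions
RING = 2 ** M

def successor(nodes, ident):
    """First node at or after ident on the ring (nodes sorted ascending)."""
    i = bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]

def finger_table(nodes, n):
    """Entry i (1-based) is successor(n + 2**(i-1))."""
    return [successor(nodes, n + 2 ** (i - 1)) for i in range(1, M + 1)]

def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b    # interval wraps past zero

def lookup(nodes, start, key):
    """Return (node responsible for key, number of hops taken)."""
    n, hops = start, 0
    while True:
        succ = successor(nodes, n + 1)
        if key == succ or in_open(key, n, succ):
            return succ, hops                # key falls in (n, succ]
        # Jump to the furthest finger that still precedes the key.
        # The first finger (succ) always qualifies here, so next() finds one.
        n = next(f for f in reversed(finger_table(nodes, n))
                 if in_open(f, n, key))
        hops += 1

nodes = [10, 32, 60, 80, 100]
print(lookup(nodes, 10, 70))     # N10 jumps to N60, whose successor N80 owns K70
```

Each jump at least halves the remaining clockwise distance to the key, which is where the logarithmic hop count comes from.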

Page 20: L27

From DHTs to P-trees

• DHTs only support equality queries
  – Return the value of resources with ID1
• Need to support range queries
  – OAI-type query: find all resources that were changed between D1 and D2

• P-tree reuses aspects of fault-tolerant ring of Chord with logarithmic search properties for equality and range queries.

Page 21: L27

Pastry Project

• Factors in network locality as part of DHT algorithm

• http://research.microsoft.com/~antr/Pastry/

Page 22: L27

Identity, Trust, Reputation

• Identity
  – Who is making a statement?
  – Certificates, PKI
• Trust
  – Can I believe the person who is making the statement?
  – PGP Web of Trust
• Reputation
  – What is the history of trust in the person making the statement?
  – Reputation management

Page 23: L27

Reputation Issues

• Small world phenomenon makes web of trust feasible

• Reputation is context specific
  – I can be trusted with questions about OAI-PMH
  – Can you trust me to belay for you?

Page 24: L27

Simple reputation network

[Figure: chain of three nodes, A → B → C]

• A knows and trusts B
• B knows and trusts C
• A can infer trust for C

Page 25: L27

Reputation Inference Algorithm

• Begin at source (node seeking a reputation)

• Poll each of the neighbors whose reputation it trusts
  – Ignore neighbors with bad reputation

• Have each neighbor recursively find reputation of sink (node for which reputation is sought)
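The recursive polling described above can be sketched as follows. This is not the exact algorithm from the cited paper; it is a minimal sketch under several assumptions: ratings are numbers in [0, 1], a neighbor counts as trusted when its rating is at least 0.5, answers are combined as a trust-weighted average, and the `ratings` table and function name are hypothetical.

```python
def infer(ratings, source, sink, threshold=0.5, _visited=frozenset()):
    """Infer source's rating of sink by polling trusted neighbors.

    ratings: dict mapping node -> {neighbor: rating in [0, 1]}
    Returns a float, or None when no trusted path reaches the sink.
    """
    visited = _visited | {source}       # nodes on the current path, to avoid cycles
    direct = ratings.get(source, {})
    if sink in direct:
        return direct[sink]             # source already has its own opinion
    total = weight = 0.0
    for neighbor, trust in direct.items():
        if trust < threshold or neighbor in visited:
            continue                    # ignore neighbors with a bad reputation
        value = infer(ratings, neighbor, sink, threshold, visited)
        if value is not None:
            total += trust * value      # weigh each answer by trust in the neighbor
            weight += trust
    return total / weight if weight else None

# Hypothetical network: A trusts B, distrusts D; both have rated C.
ratings = {
    "A": {"B": 1.0, "D": 0.2},
    "B": {"C": 0.8},
    "D": {"C": 0.0},
}
print(infer(ratings, "A", "C"))
```

Here D's rating of C is never consulted because A does not trust D, so a poorly rated node cannot influence the result.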

Page 26: L27

Accuracy of inferences

• Incorrect bad rating by a node has minimal effect
  – Will be dropped from path in reputation seeking
  – Will be overcome by “correct” good rating by another node
• Incorrect good rating by a node can have cascading effect
  – Can cause ratings of good nodes to be ignored through lies
  – Serious threat to network
• Good trust algorithm minimizes effect of “bad nodes”

Page 27: L27

From Golbeck and Hendler

Page 28: L27

Trellis

• http://trellis.semanticweb.org
• Semantic Web based system for decision making and for assessing the reliability of information and sources
• Decision maker can construct compound statements justifying a decision and providing a basis for others’ decisions

Page 29: L27

Trellis (cont.)

• Components
  – Statements (Carl Lagoze is a bad teacher)
  – Basis of statement
    • http://cornellbigred.collegesports.com/sports/m-crew/mtt/kruse_william00.html
  – Principal source of basis/statement
    • William Kruse
  – Qualifications to state certainty of component

Page 30: L27

Trellis compound statement

From Gil and Ratnakar

Page 31: L27

Advogato

• Trust metrics for open source software developers

• http://www.advogato.org/
• Three levels of trust/certification
  – Master
  – Journeyer
  – Apprentice

Page 32: L27

Advogato (cont)

• Graph structure of trust
  – Domain of master is only master
  – Domain of journeyer is master and journeyer
  – Domain of apprentice is all
• Computation of trust is via network flow (a well-known problem with efficient solutions)
  – Hard-wired set of users from which all trust flows (gods of the system)
  – People reached by the flow are those accepted by the trust metric
  – With the three levels, the max flow is computed three times
• Robust (resistant to attack) and efficient

Page 33: L27

Eigentrust

• Algorithm for Reputation Management in P2P Networks

• Kamvar, Schlosser, Garcia-Molina (Stanford)
• http://www.stanford.edu/~sdkamvar/research.html

Page 34: L27

Eigentrust Approach

• Goal: identify sources of inauthentic files and bias peers against downloading from them
• Method: give each peer a trust value based on its previous behavior
• Trust values
  – Local: opinion a peer has of another based on past experience
  – Global: trust that the entire system places in a peer
• Want the latter computed from an aggregate of the former
• Dual goals
  – Know all peers
  – Perform minimal computation and store minimal data

Page 35: L27

Past History Approach

• Each peer biases its choice of downloads using its own opinion vector

• Problems:
  – Each peer has limited past experience
  – Inertia: if a peer has good past experience with another, it will be biased towards relying on it

Page 36: L27

Friends of friends approach

• Ask for opinions of the people who you trust

• Weigh their opinions by your trust in them
• Problems
  – You have a lot of friends: too much to compute and store
  – Few friends: won’t have enough data
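Weighing friends' opinions by your trust in them amounts to a weighted sum. The sketch below assumes each peer's local opinions are already normalized so that they sum to 1, and computes peer i's one-step view of peer k as the sum over friends j of c[i][j] * c[j][k]. The table `c`, the peer names and the function name are hypothetical.

```python
def friends_opinion(c, i):
    """Peer i's view of every peer k, asking each friend j once.

    c: dict mapping peer -> {other peer: normalized local trust}
    Returns {k: sum over j of c[i][j] * c[j][k]}.
    """
    peers = list(c)
    return {
        k: sum(c[i].get(j, 0.0) * c[j].get(k, 0.0) for j in peers)
        for k in peers
    }

# Hypothetical normalized local trust values (each row sums to 1).
c = {
    "a": {"b": 0.5, "c": 0.5},
    "b": {"c": 1.0},
    "c": {"a": 1.0},
}
print(friends_opinion(c, "a"))
```

Every extra round of asking (friends of friends of friends, and so on) repeats this sum, which is why a peer with many friends has a lot to compute and store.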

Page 37: L27

Eigentrust Approach

• Whole network cooperates to store and compute the trust vector
• Each peer holds its own opinions
• Each peer holds its own global reputation
• Iterative algorithm that converges to compute global trust ratings (in the nature of PageRank)
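The iterative computation can be sketched as a centralized power iteration. This is a simplification for illustration only: the algorithm described on the slide is distributed across peers, and refinements such as pre-trusted peers are left out. The normalized local trust table `c`, the tolerance, the iteration cap and the function name are all assumptions.

```python
def global_trust(c, tol=1e-9, max_iter=1000):
    """Repeat t[k] = sum over j of c[j][k] * t[j] until t stops changing.

    c: dict mapping peer -> {other peer: normalized local trust}
       (each peer's values are assumed to sum to 1)
    Returns a dict of global trust values that sums to 1.
    """
    peers = sorted(c)
    t = {p: 1.0 / len(peers) for p in peers}     # start from uniform trust
    for _ in range(max_iter):
        new = {
            k: sum(c[j].get(k, 0.0) * t[j] for j in peers)
            for k in peers
        }
        if max(abs(new[p] - t[p]) for p in peers) < tol:
            return new
        t = new
    return t    # not converged within max_iter; return the last estimate

# Hypothetical normalized local trust values (each row sums to 1).
c = {
    "a": {"b": 0.5, "c": 0.5},
    "b": {"c": 1.0},
    "c": {"a": 1.0},
}
print(global_trust(c))
```

As with PageRank, the result is the stationary vector of the normalized trust matrix, so a peer's global rating depends on who trusts it and on how trusted those peers are themselves.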

Page 38: L27

More Eigentrust Issues

• Secure Score Management
  – Voting among multiple score managers
  – Peer score held by another peer
• Threat scenarios
  – Malicious individuals (always bad)
  – Malicious collectives (always bad, think highly of each other)
  – Camouflaged collectives (sometimes good to trick people)
  – Malicious spies (good all the time but friends with bad folks)

