Distributed Hash Tables
David Tam
Patrick Pang
Presentation Outline
  What is a DHT (Distributed Hash Table)?
  Why DHTs?
  Applications
  How does lookup work?
  Alternatives to DHTs
  Performance -- Routing
  Performance -- Load Balancing
  Security -- Routing Attack
  Security -- Inconsistent Behaviour
  Comparison to Other Facilities
  Current Research Projects
  Conclusion
What is a DHT?
  (Figure: a distributed application calls put(key, data) and get(key) on the distributed hash table, which returns data. Figure adapted from Frans Kaashoek.)
  A DHT provides the information lookup service for P2P applications.
  Nodes are uniformly distributed across the key space
  Nodes form an overlay network
  Nodes maintain a list of neighbours in a routing table
  The overlay is decoupled from the physical network topology
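As a rough illustration of the put/get interface on this slide, the sketch below places nodes and keys on a hashed identifier ring and stores each key on the first node at or after it. The class name ToyDHT, the node names, the use of SHA-1, and the 16-bit ring are assumptions made for illustration; the slides do not prescribe any of them, and everything runs in one process rather than over a network.

```python
import hashlib
from bisect import bisect_left


def ident(name: str, bits: int = 16) -> int:
    """Hash a string onto a 2**bits identifier ring (SHA-1 is an assumed choice)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class ToyDHT:
    """Single-process stand-in for a DHT: each key lives on the first node
    whose identifier is >= the key's identifier, wrapping around the ring."""

    def __init__(self, node_names, bits: int = 16):
        self.bits = bits
        self.ring = sorted((ident(n, bits), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.ring]
        self.store = {name: {} for name in node_names}

    def owner(self, key: str) -> str:
        # bisect_left may return len(ring); the modulo wraps back to the first node.
        i = bisect_left(self.ids, ident(key, self.bits)) % len(self.ring)
        return self.ring[i][1]

    def put(self, key: str, data) -> None:
        self.store[self.owner(key)][key] = data

    def get(self, key: str):
        return self.store[self.owner(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
dht.put("song.mp3", b"example bytes")
print(dht.owner("song.mp3"), dht.get("song.mp3"))
```

The application only ever sees put and get; which node holds the data is decided by the hash, not by the caller.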
Why DHTs?
  Why do we need DHTs?
    Simplify the development of large-scale distributed applications
    Better security and robustness
    Simple API
    Exploit P2P resources
  Why middleware?
    Simplifies the development of large-scale distributed applications
    Better security and robustness
    Simple API
Applications
  Anything that requires a hash table
  Databases, FSes, storage, archival
  Web serving, caching
  Content distribution
  Query & indexing
  Naming systems
  Communication primitives
  Chat services
  Application-layer multi-casting
  Event notification services
  Publish/subscribe systems
  ?
How does lookup work?
  Example: Chord [Stoica et al.]
  (Figure: Chord ring with identifiers 0 to 15)
  Finger Table for Node 2
    start   interval   succ.
    3       [3,4)      5
    4       [4,6)      5
    6       [6,10)     7
    10      [10,2)     10
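The finger table for Node 2 can be reproduced with a few lines of code. The sketch assumes a 4-bit identifier space and the node set {1, 2, 5, 7, 10, 12, 14, 15}, which is inferred from the successor columns on these slides rather than stated explicitly; the function names are illustrative only.

```python
M = 4                                  # identifier bits, so the ring holds 2**4 = 16 ids
NODES = [1, 2, 5, 7, 10, 12, 14, 15]   # assumed node set, inferred from the slides


def successor(ident, nodes=NODES):
    """First node at or after ident, wrapping past the top of the ring."""
    candidates = [n for n in sorted(nodes) if n >= ident]
    return candidates[0] if candidates else min(nodes)


def finger_table(n, m=M, nodes=NODES):
    """Rows of (start, (lo, hi), succ) for node n; each interval is [lo, hi)."""
    size = 2 ** m
    rows = []
    for k in range(m):
        start = (n + 2 ** k) % size
        end = (n + 2 ** (k + 1)) % size   # equals n itself for the last finger
        rows.append((start, (start, end), successor(start, nodes)))
    return rows


for start, (lo, hi), succ in finger_table(2):
    print(f"{start:>5}  [{lo},{hi})  {succ}")
```

Finger k starts 2^k identifiers past the node, so the intervals double in width and the last one wraps around to the node's own identifier.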
How does lookup work?
  Example: Chord
  (Figure: Chord ring with identifiers 0 to 15)
  Finger Table for Node 10
    start   interval   succ.
    11      [11,12)    12
    12      [12,14)    12
    14      [14,2)     14
    2       [2,10)     2
How does lookup work?
  Example: Chord
  (Figure: Chord ring with identifiers 0 to 15)
  Finger Table for Node 14
    start   interval   succ.
    15      [15,0)     15
    0       [0,2)      1
    2       [2,6)      2
    6       [6,14)     7
How does lookup work?
  Example: Chord
  (Figure: Chord ring with identifiers 0 to 15)
  Now Node 2 can retrieve the information for key 0 from Node 1.
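The walk in this example (Node 2 asks Node 10, which points at Node 14, which points at Node 1) can be traced in code. This sketch follows the simplified rule used on the slides, where each node forwards to the successor listed for the finger interval containing the key; the routine in the Chord paper differs in detail. The node set is inferred from the finger tables, the helper names are made up, and the stopping test uses global knowledge of the key's owner, which a real node would not have.

```python
M = 4
SIZE = 2 ** M
NODES = [1, 2, 5, 7, 10, 12, 14, 15]   # assumed node set, inferred from the slides


def successor(ident):
    """First node at or after ident, wrapping past the top of the ring."""
    later = [n for n in sorted(NODES) if n >= ident]
    return later[0] if later else min(NODES)


def in_interval(x, lo, hi):
    """True if x lies in the half-open ring interval [lo, hi)."""
    if lo < hi:
        return lo <= x < hi
    return x >= lo or x < hi           # the interval wraps past 0


def next_hop(node, key):
    """Successor entry of the finger whose interval contains key, or None
    when key equals node, which no finger interval covers."""
    for k in range(M):
        start = (node + 2 ** k) % SIZE
        end = (node + 2 ** (k + 1)) % SIZE
        if in_interval(key, start, end):
            return successor(start)
    return None


def lookup(start_node, key):
    """Return the list of nodes visited, ending at the node holding key."""
    path = [start_node]
    owner = successor(key)             # simulation shortcut: global knowledge
    while path[-1] != owner:
        hop = next_hop(path[-1], key)
        if hop is None or hop in path:  # no progress possible; stop
            break
        path.append(hop)
    return path


print(lookup(2, 0))
```

Each hop uses a strictly smaller finger than the previous one, which is why the path length grows with the number of identifier bits rather than with the number of nodes.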
Alternatives to DHTs
  Distributed file system
  Centralized lookup
  P2P flooding queries
  (Figures adapted from Frans Kaashoek)
Performance -- Lookup
  Purpose: to locate a target node
  At each step, try to get closer to the target node
    Ask a closer neighbour
  Performance & scalability are tied directly to the lookup algorithm
  2 aspects of scalability
    Size of routing table: O(log N)
    Lookup path length: O(log N)
  2 aspects of performance
    Path latency
    Lookup path length (# hops)
  3 techniques
    Proximity lookup
    Proximity neighbour selection
    Geographic layout
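To get a feel for the O(log N) path length, the following sketch counts hops on randomly populated rings using Chord-style finger routing. It is a simulation under stated assumptions, not a measurement of any real system: the 16-bit identifier space, the ring sizes, the random seed, and the function names are all chosen for illustration.

```python
import math
import random
from bisect import bisect_left

M = 16                      # identifier bits (assumed)
SIZE = 2 ** M


def successor(sorted_ids, ident):
    """First node id at or after ident, wrapping around the ring."""
    i = bisect_left(sorted_ids, ident % SIZE)
    return sorted_ids[i % len(sorted_ids)]


def hops(sorted_ids, node, key):
    """Count forwarding steps from node to the node responsible for key."""
    owner = successor(sorted_ids, key)
    count = 0
    while node != owner:
        dist = (key - node) % SIZE            # clockwise distance to the key
        k = dist.bit_length() - 1             # finger whose interval holds key
        node = successor(sorted_ids, node + 2 ** k)
        count += 1
    return count


random.seed(0)
for n in (16, 256, 4096):
    ids = sorted(random.sample(range(SIZE), n))
    trials = [hops(ids, random.choice(ids), random.randrange(SIZE))
              for _ in range(2000)]
    print(f"N={n:5d}  mean hops={sum(trials) / len(trials):.2f}  "
          f"max hops={max(trials)}  log2(N)={math.log2(n):.0f}")
```

Every hop at least halves the remaining clockwise distance to the key, so no lookup in this model can take more than M hops; the printed means can be compared against log2(N).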
Performance -- Load Balancing
  Issues
    Hot-spots
      Content
      Lookup
    Heterogeneous nodes & paths
    System flux
  Solution
    Replication is the key
      Also good for fault-tolerance
    Cache lookup answers backwards along the path
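Caching answers backwards along the lookup path could look roughly like the sketch below: once a query is answered, every node it passed through remembers the answer, so later queries for a popular key stop early. The Node class and resolve function are hypothetical, and a real system would also need cache expiry and invalidation, which are omitted here.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.cache = {}      # key -> name of the node that holds the key


def resolve(path, key, owner_name):
    """Walk a lookup path until some node has the answer cached (or the
    path ends and the owner answers), then cache the answer on every node
    the query passed through."""
    visited = []
    answer = owner_name
    for node in path:
        visited.append(node)
        if key in node.cache:
            answer = node.cache[key]
            break
    for node in visited:
        node.cache[key] = answer
    return answer, len(visited)


# Path taken by the Chord example: Node 2, Node 10, Node 14; key 0 lives on Node 1.
path = [Node("2"), Node("10"), Node("14")]
first = resolve(path, 0, "1")    # cold caches: the query visits all three nodes
second = resolve(path, 0, "1")   # Node 2 now answers from its own cache
print(first, second)
```

Because the cached copies sit on the nodes that queries already traverse, load for a hot key spreads away from its owner without any extra lookups.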
Security -- Incorrect Lookup (1)
  When asked for the next hop, give a wrong answer
  (Figure: Chord ring with identifiers 0 to 15)
  Node 2 to Node 10: Please tell me how to reach key 0.
  Finger Table for Node 2
    start   interval   succ.
    3       [3,4)      5
    4       [4,6)      5
    6       [6,10)     7
    10      [10,2)     10
Security -- Incorrect Lookup (2)
  When asked for the next hop, give a wrong answer
  (Figure: Chord ring with identifiers 0 to 15)
  Node 2 to Node 10: Please tell me how to reach key 0.
  Node 10 answers: ask Node 14
  Finger Table for Node 10
    start   interval   succ.
    11      [11,12)    12
    12      [12,14)    12
    14      [14,2)     14
    2       [2,10)     2
Security -- Incorrect Lookup (3)
  When asked for the next hop, give a wrong answer
  (Figure: Chord ring with identifiers 0 to 15)
  Node 2 to Node 14: Please tell me how to reach key 0.
  Node 14 answers: ask Node 10
  Finger Table for Node 14
    start   interval   succ.
    15      [15,0)     15
    0       [0,2)      1
    2       [2,6)      2
    6       [6,14)     7
Security -- Incorrect Lookup (4)
  Solution [Sit and Morris]:
    Define a verifiable system invariant
    Allow the querier to observe lookup progress
  Our idea for implementing this:
    Use an integer-valued, monotonically decreasing quantity to represent progress.
    Monotonically decreasing quantities are used in program construction to guarantee total correctness [Parnas].
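One way such a quantity might be chosen, offered as a sketch rather than as the scheme from the cited papers: since an honest lookup only ever moves clockwise and never passes the querier, the clockwise distance from the current hop back round to the querier is a non-negative integer that must shrink at every step. The function names and the 16-identifier ring are assumptions; the check detects loops and backward steps, not every possible wrong answer.

```python
SIZE = 16   # identifier space of the 4-bit example ring


def remaining(start, node):
    """Clockwise distance from node back round to the querier. It is a
    non-negative integer, so it cannot decrease forever."""
    return (start - node) % SIZE or SIZE   # the querier itself counts as SIZE


def check_progress(start, hops):
    """Return the index of the first hop that fails to decrease the
    quantity, or None if every hop makes progress."""
    bound = remaining(start, start)
    for i, node in enumerate(hops):
        value = remaining(start, node)
        if value >= bound:
            return i
        bound = value
    return None


print(check_progress(2, [10, 14, 1]))    # honest walk from the Chord example
print(check_progress(2, [10, 14, 10]))   # Node 14 lies and points back at Node 10
```

In the attack from the preceding slides, the quantity goes 8, 4, then back up to 8, so the querier can reject Node 14's answer at the third hop.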
Security -- Inconsistent Behaviour
  Inconsistent behaviour, i.e., lie intelligibly
    Sybil attack [Kaashoek]
  Solution 1: public key solution
  Solution 2: Byzantine protocol
    Byzantine Generals Problem: how to find out the traitors among the generals? [Lamport]
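A full Byzantine agreement protocol is beyond a short example, but a much simpler stand-in conveys the flavour: ask several replicas the same question and accept an answer only if a strict majority agree. The function name and the sample replies below are hypothetical, and majority voting alone does not provide the guarantees of the protocols Lamport describes.

```python
from collections import Counter


def majority_answer(replies):
    """Return the value reported by a strict majority of replicas, or None
    if no value reaches a majority."""
    if not replies:
        return None
    value, votes = Counter(replies).most_common(1)[0]
    return value if votes > len(replies) / 2 else None


# Three honest replicas and one that lies about who holds key 0.
print(majority_answer(["node 1", "node 1", "node 10", "node 1"]))
# A split vote gives no trustworthy answer.
print(majority_answer(["node 1", "node 10"]))
```

A node that tells different stories to different queriers is outvoted as long as most of the replicas consulted are honest.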
Comparison to Other Facilities
  Facility               Abstraction   Ease of Use/Programming   Scalability   Load-Balance
  DHT                    high          high                      high          yes
  Centralized Lookup     medium        medium                    low           no
  P2P flooding queries   medium        high                      low           no
  Distributed FS         low           medium                    medium        no

  Facility               Fault-Tolerance   Self-Organizing   Administration
  DHT                    high              yes               low
  Centralized Lookup     low               no                medium
  P2P flooding queries   depends           yes               low
  Distributed FS         medium            no                high
Research Projects
  Iris: security & fault-tolerance (US Govt)
  Chord: circular key space
  Pastry: circular key space
  Tapestry: hypercube space
  CAN: n-dimensional key space
  Kelips: n-dimensional key space
  DDS
    -- middleware platform for internet service construction
    -- cluster-based
    -- incremental scalability
Summary
  Good middleware platform
  Exploits P2P networks
  An exciting new research area
References
  Lamport, Leslie et al. The Byzantine Generals Problem.
  Sit, Emil, and Morris, Robert. Security Considerations for Peer-to-Peer Distributed Hash Tables.
  Kaashoek, Frans. Distributed Hash Tables: Building large-scale, robust distributed applications.
  Stoica, Ion et al. Chord: A scalable peer-to-peer lookup service for Internet applications.
  Parnas, D. L. Connecting Theory to Practice: Software Engineering Programme.
Make it simple to write decentralized, distributed applications.