Date post: | 08-Jul-2015 |
Upload: | nuno-loureiro |
Distributed Systems
dumpFS
A Distributed Storage Solution
Carnegie Mellon University
Project for Distributed Systems
• Bruno Garrancho
• Eugénio Pinto
• Nuno Loureiro
Tuesday, December 21, 2010
Acknowledgements
• Prof. António Casimiro
• Prof. Bill Nace
Motivation
• Current demand for massive storage
• Commodity hardware
• Simple semantics of web context
• Alternative solutions: too generic, too complex, extra overhead, too expensive
• Not end user demand
Goals
• Availability
• Performance
• Scalability
How it works
• Black box storage
• API/Middleware for developers
• Web, Web & Web...
• Streams, Streams & Streams...
• WORM
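The write-once, read-many (WORM) contract listed above can be illustrated with a minimal in-memory sketch. The class name `WormStore`, its `put`/`get` methods, and the generated file IDs are assumptions made for illustration; the slides do not describe the actual dumpFS API.

```python
import io
import uuid


class WormStore:
    """Toy in-memory store with write-once, read-many semantics (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put(self, stream):
        """Consume a byte stream, store it, and return a new file ID."""
        file_id = uuid.uuid4().hex
        self._blobs[file_id] = stream.read()
        return file_id

    def get(self, file_id):
        """Return a new stream over the stored bytes."""
        return io.BytesIO(self._blobs[file_id])


store = WormStore()
fid = store.put(io.BytesIO(b"hello web"))
print(store.get(fid).read())  # b'hello web'
```

There is deliberately no update method in this sketch: every `put` creates a new ID, so stored content never changes once written.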
Architecture
[Figure: architecture diagram, built up step by step. Labelled components: End User (several), Application, API, dumpFS, Cerebrum (...), Storage (...), Monitor. Successive builds add further Cerebrum and Storage instances, a second dumpFS group with its own Monitor, and a second Application with its own API and End Users.]
Architecture - PUT
[Figure: the architecture diagram (End User, Application, API, dumpFS, Cerebrum, Storage, Monitor) animated for a PUT operation.]
Architecture - GET
[Figure: the architecture diagram (End User, Application, API, dumpFS, Cerebrum, Storage, Monitor) animated for a GET operation.]
Revisiting the goals
• Availability
• Performance
• Scalability
How do we provide these properties?
Monitoring
• Heartbeat (between all nodes)
  - Detection of failures
• Distributed System State (local node state sent to Cerebrums)
  - CPU load
  - Disk space
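Heartbeat-based failure detection can be sketched as a table of last-seen timestamps. This is only an illustration: the class `HeartbeatTracker`, the node names, and the 15-second timeout are assumptions, not values taken from dumpFS.

```python
import time


class HeartbeatTracker:
    """Tracks the last heartbeat seen from each node (illustrative only)."""

    def __init__(self, timeout_s=15.0, clock=time.monotonic):
        self.timeout_s = timeout_s  # assumed value, not from the slides
        self._clock = clock
        self._last_seen = {}

    def beat(self, node):
        """Record a heartbeat from `node`."""
        self._last_seen[node] = self._clock()

    def online(self):
        """Return the nodes whose last heartbeat is within the timeout."""
        now = self._clock()
        return sorted(n for n, t in self._last_seen.items()
                      if now - t <= self.timeout_s)


# A fake clock keeps the example deterministic.
fake_now = [0.0]
tracker = HeartbeatTracker(timeout_s=15.0, clock=lambda: fake_now[0])
tracker.beat("storage-1")
tracker.beat("storage-2")
fake_now[0] = 10.0
tracker.beat("storage-2")   # storage-1 stays silent
fake_now[0] = 20.0
print(tracker.online())     # ['storage-2']
```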
Distributed System State
[Figure: two Cerebrum nodes and two Storage nodes, each labelled Monitor, Server, and HTTP API. Node state {load; disk} is sent every 5 secs. The final build shows two gauges with a 0 to 100 scale.]
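The {load; disk} reports could be collected on a Cerebrum roughly as follows. The JSON encoding, the field names, and the `ClusterView` class are assumptions for illustration; only the report contents and the 5-second period come from the slides, and the periodic sending itself is left to the caller.

```python
import json


def make_report(node, cpu_load, disk_free):
    """Build the {load; disk} message a node would send (JSON is assumed)."""
    return json.dumps({"node": node, "load": cpu_load, "disk": disk_free})


class ClusterView:
    """Cerebrum-side table holding the latest report per node."""

    def __init__(self):
        self.state = {}

    def apply(self, message):
        """Parse one report and overwrite that node's previous entry."""
        report = json.loads(message)
        self.state[report["node"]] = {"load": report["load"],
                                      "disk": report["disk"]}


view = ClusterView()
view.apply(make_report("storage-1", cpu_load=20, disk_free=57))
view.apply(make_report("storage-2", cpu_load=65, disk_free=47))
view.apply(make_report("storage-1", cpu_load=35, disk_free=56))  # newer report wins
print(view.state["storage-1"])  # {'load': 35, 'disk': 56}
```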
Availability
Tolerance to failures
• Crash failures & broken links
  - Heartbeat: only online nodes are selected
  - Replicated files
  - Replicated components
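Combining "only online nodes are selected" with replicated files suggests a read path that skips offline replicas and falls back to the next copy on a broken link. The sketch below assumes a hypothetical `fetch(node, file_id)` callback and made-up node names; it is not the dumpFS client code.

```python
def readable_replicas(replicas, online):
    """Return the replicas of a file that are currently online, in order."""
    return [node for node in replicas if node in online]


def get_with_failover(file_id, replicas, online, fetch):
    """Try each online replica in turn until one returns the data."""
    for node in readable_replicas(replicas, online):
        try:
            return fetch(node, file_id)
        except ConnectionError:
            continue  # link broke after the last heartbeat; try the next copy
    raise LookupError(f"no online replica could serve {file_id}")


def fake_fetch(node, file_id):
    """Stand-in for a network read; storage-2 simulates a broken link."""
    if node == "storage-2":
        raise ConnectionError("broken link")
    return f"{file_id}@{node}".encode()


data = get_with_failover("f1", ["storage-1", "storage-2", "storage-3"],
                         online={"storage-2", "storage-3"}, fetch=fake_fetch)
print(data)  # b'f1@storage-3'
```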
Tolerance to failures
[Figure: three animated sequences over the architecture diagram (End User, Application, API, dumpFS, Cerebrum, Storage, Monitor). The third sequence adds a component labelled LB.]
Performance
• Cerebrums provide only file locations to the API, not the data itself
• The primary storage node replicates the file in parallel while receiving data (PUT)
• Probabilistic weighted node selection for PUT and GET operations
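Replicating while receiving can be pictured as forwarding each chunk of the upload as soon as it arrives. The function below is a simplified, single-threaded sketch with in-memory sinks; the name `receive_and_replicate` and the chunk size are assumptions, and a real server would presumably send to replicas concurrently over the network.

```python
import io


def receive_and_replicate(incoming, local, replicas, chunk_size=4):
    """Copy `incoming` to `local` and to every replica, chunk by chunk.

    Each chunk is forwarded as soon as it is read, so replicas fill up
    while the upload is still in progress instead of after it completes.
    """
    total = 0
    while True:
        chunk = incoming.read(chunk_size)
        if not chunk:
            break
        local.write(chunk)
        for sink in replicas:
            sink.write(chunk)
        total += len(chunk)
    return total


upload = io.BytesIO(b"streams, streams & streams")
primary, replica_a, replica_b = io.BytesIO(), io.BytesIO(), io.BytesIO()
n = receive_and_replicate(upload, primary, [replica_a, replica_b])
print(n, replica_b.getvalue() == primary.getvalue())  # 26 True
```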
Performance
Probabilistic weighted node selection
• PUT uses available disk space
• GET uses CPU load
Node A: available disk space 57%
Node B: available disk space 47%
Should node A always be selected in PUT operations?
Performance
Probabilistic weighted node selection
Node A: available disk space 57%
Node B: available disk space 47%
Rand(A) = Rand(1..57)
Rand(B) = Rand(1..47)
Rand(B) can be greater than Rand(A), but the probability that it happens is < 50%.
Use Rand(Node) instead of the direct value!
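The Rand(Node) rule can be sketched and checked numerically. The function names below are hypothetical, and the sketch assumes that weights are positive integers and that Rand(1..n) is a uniform integer draw, which the slide does not state explicitly.

```python
import random
from fractions import Fraction


def pick_node(weights, rng=random):
    """Draw Rand(1..weight) per node and return the node with the highest draw.

    On a tie, the node listed first in `weights` wins (an arbitrary choice).
    """
    draws = {node: rng.randint(1, w) for node, w in weights.items()}
    return max(draws, key=draws.get)


def prob_b_beats_a(a, b):
    """Exact P(Rand(1..b) > Rand(1..a)) by enumerating all pairs of draws."""
    wins = sum(1 for x in range(1, a + 1) for y in range(1, b + 1) if y > x)
    return Fraction(wins, a * b)


print(prob_b_beats_a(57, 47))  # 23/57
print(pick_node({"A": 57, "B": 47}, random.Random(2010)))
```

Under these assumptions node B out-draws node A with probability 23/57, roughly 40%, which agrees with the "< 50%" claim: A is favoured, but B still receives a substantial share of PUT operations.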
Scalability
dumpFS allows:
• Redundant DB
• Partitioning for “infinite” growth
• Straightforward storage addition
• Clusters of clusters
Technology
• REST / HTTP
• Erlang!!! - Server
• .NET - Client API
What didn’t work
• Our graphic design skills
• HDD I/O
• Time
Future work
• Delete & garbage collection
• Read operations at arbitrary locations in files
The END!
Questions?