Date posted: 30-Jun-2015
Category: Technology
Uploaded by: manfred-furuholmen
Fabrizio Manfred Furuholmen
Use Distributed File system as a Storage Tier
14/04/2023
Agenda
Introduction: Next Generation Data Center
Distributed File systems: OpenAFS, GlusterFS, HDFS, Ceph
Case Studies
Conclusion
Class Exam
What do you know about DFS?
How can you create petabyte-scale storage?
How can you build a centralized system log?
How can you allocate space for your users or systems when you have thousands of them?
How can you retrieve data from anywhere?
Introduction
Next Generation Data Center: the “FABRIC”
Key categories:
Continuous data protection and disaster recovery
File and block data migration across heterogeneous environments
Server and storage virtualization
Encryption for data in-flight and at-rest
In other words: Cloud data center
Introduction
Storage Tier in the “FABRIC”:
High Performance
Scalability
Simplified Management
Security
High Availability

Solutions:
Storage Area Network
Network Attached Storage
Distributed file system
Introduction
What is a Distributed File system ?
“A distributed file system takes advantage of the interconnected nature of the network by storing files on more than one computer in the network and making them accessible to all of them.”
Part II
Implementations
How many DFSs do you know?
OpenAFS: introduction
OpenAFS is the open source implementation of IBM's Andrew File System.

Key ideas:
Make clients do work whenever possible.
Cache whenever possible.
Exploit file usage properties and understand them: one third of Unix files are temporary.
Minimize system-wide knowledge and change; do not hardwire locations.
Trust the fewest possible entities; do not trust workstations.
Batch operations into groups where possible.
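These key ideas can be sketched in miniature. The toy classes below (illustrative names, not the real OpenAFS code) show whole-file caching with callback promises: the client serves reads from its local cache, and the server "breaks the callback" of every other client when a file is written back.

```python
# Toy sketch of AFS-style whole-file caching with callback promises.
# Class and method names are illustrative, not the OpenAFS implementation.

class Server:
    def __init__(self):
        self.files = {}
        self.callbacks = {}  # path -> set of clients holding a callback promise

    def fetch(self, path, client):
        # A client fetches a whole file and receives a callback promise.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, path, data, writer):
        # On write-back, break the callback of every other client.
        self.files[path] = data
        for client in self.callbacks.get(path, set()) - {writer}:
            client.invalidate(path)
        self.callbacks[path] = {writer}

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, path):
        # A cache hit is served locally: "make clients do the work".
        if path not in self.cache:
            self.cache[path] = self.server.fetch(path, self)
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data
        self.server.store(path, data, self)

    def invalidate(self, path):
        self.cache.pop(path, None)
```

After one client writes, every other client's cached copy is invalidated, so its next read refetches; this is how AFS keeps the server cheap per client while staying consistent.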
OpenAFS: design
OpenAFS: components
[Diagram: OpenAFS components distributed across Server A, Server A+B, and Server C]
OpenAFS: performances

[Charts: write and read throughput (kB) as a function of block size and transfer size, comparing OpenAFS with OpenAFS OSD on 2 servers]
OpenAFS: features
Uniform name space: same path on all workstations
Security: based on krb4/krb5, extended ACLs, traffic encryption
Reliability: read-only replication, HA database, read/write replicas in the OSD version
Availability: maintenance tasks without stopping the service
Scalability: server aggregation
Administration: delegation of administration
Performance: client-side persistent disk cache, high client-per-server ratio
OpenAFS: who uses it?
OpenAFS: good for ...
GlusterFS
“Gluster can manage data in a single global namespace on commodity hardware.”

Keys:
Lower Storage Cost: open source software runs on commodity hardware
Scalability: linearly scales to hundreds of petabytes
Performance: no metadata server means no bottlenecks
High Availability: data mirroring and real-time self-healing
Virtual Storage for Virtual Servers: simplifies storage and keeps VMs always on
Simplicity: complete web-based management suite
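The "no metadata server" claim rests on elastic hashing: the brick responsible for a file is computed from the file path itself, so any client can locate data without consulting a central server. A minimal sketch, with made-up brick names and a plain modulo in place of Gluster's actual hash-range layout:

```python
import hashlib

# Sketch of metadata-free placement in the spirit of GlusterFS's elastic
# hashing: the brick holding a file is a pure function of the path, so no
# metadata server is consulted. Brick names are illustrative, and real
# Gluster assigns hash ranges to bricks rather than a simple modulo.

def brick_for(path, bricks):
    """Map a file path to a brick by hashing the path."""
    digest = hashlib.md5(path.encode()).hexdigest()
    return bricks[int(digest, 16) % len(bricks)]

bricks = ["server1:/export1", "server2:/export1", "server3:/export1"]
```

Because placement is pure computation, every client arrives at the same answer with no lookup traffic; the trade-off is that adding bricks changes the mapping, which is why Gluster pairs hashing with rebalancing.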
GlusterFS: design
GlusterFS: components
volume posix1
  type storage/posix
  option directory /home/export1
end-volume

volume brick1
  type features/posix-locks
  option mandatory
  subvolumes posix1
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.listen-port 6996
  subvolumes brick1
  option auth.addr.brick1.allow *
end-volume
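A client would reach this brick through a matching protocol/client volume. A hedged sketch (the volume name is ours, and the host and port are assumed to match the server definition above):

```
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option transport.socket.remote-port 6996
  option remote-subvolume brick1
end-volume
```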
Gluster: components
Gluster: performance
Gluster: characteristics
Uniform name space: same path on all workstations
Reliability: synchronous replication, asynchronous replication for disaster recovery
Availability: no system downtime for maintenance (better in the next release)
Scalability: truly linear scalability
Administration: self-healing, centralized logging and reporting, appliance version
Performance: files striped across dozens of storage bricks, automatic load balancing, per-volume I/O tuning
Gluster: who uses it?
Avail TVN (USA): 400 TB for video on demand, video storage
Fido Film (Sweden): visual FX and animation studio
University of Minnesota (USA): 142 TB, supercomputing
Partners Healthcare (USA): 336 TB, integrated health system
Origo (Switzerland): open source software development and collaboration platform
Gluster: good for ...
Implementations
Old way:
Metadata and data in the same place
Single stream per file

New way:
Multiple streams are parallel channels through which data can flow
Files are striped across a set of nodes in order to facilitate parallel access
OSD: separation of file metadata management (MDS) from the storage of file data
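The striping idea can be sketched as round-robin placement of fixed-size stripes across nodes. This is a simplified illustration: the stripe size is made up, and real systems layer replication, checksums, and metadata handling on top.

```python
# Sketch of file striping across a set of storage nodes using a simple
# round-robin layout. Real DFSs use stripe sizes of 64 KB-64 MB and add
# replication and metadata on top; this only shows the parallel layout.

STRIPE_SIZE = 4  # bytes per stripe, kept tiny for illustration

def stripe(data, num_nodes):
    """Split data into stripes and assign them round-robin to nodes."""
    placement = [[] for _ in range(num_nodes)]
    for i in range(0, len(data), STRIPE_SIZE):
        placement[(i // STRIPE_SIZE) % num_nodes].append(data[i:i + STRIPE_SIZE])
    return placement

def reassemble(placement):
    """Read stripes back from the nodes in round-robin order."""
    stripes = []
    for round_idx in range(max(len(node) for node in placement)):
        for node in placement:
            if round_idx < len(node):
                stripes.append(node[round_idx])
    return b"".join(stripes)
```

Because consecutive stripes live on different nodes, a client can fetch them over parallel channels instead of a single stream per file.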
HDFS: Hadoop
HDFS is part of the Apache Hadoop project, which develops open-source software for reliable, scalable, distributed computing.
Hadoop was inspired by Google's MapReduce and the Google File System.
HDFS: Google File System
“ Design of a file systems for a different environment where assumptions of a general purpose file system do not hold—interesting to see how new assumptions lead to a different type of system…”
Key ideas:
Component failures are the norm.
Huge files (not just the occasional huge file).
Append rather than overwrite is typical.
Co-design of application and file system API (specialization); for example, consistency can be relaxed.

“Moving Computation is Cheaper than Moving Data”
HDFS: MapReduce
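The model can be illustrated in a few lines of plain Python (no Hadoop required; the function names are ours): map emits key/value pairs, the framework shuffles them into per-key groups, and reduce folds each group.

```python
from collections import defaultdict

# Minimal in-process sketch of the MapReduce model that Hadoop runs on
# top of HDFS, using word count as the classic example. In Hadoop the
# map and reduce phases run on the nodes that hold the data blocks:
# "moving computation is cheaper than moving data".

def map_fn(line):
    for word in line.split():
        yield word, 1

def reduce_fn(word, counts):
    return word, sum(counts)

def map_reduce(lines):
    groups = defaultdict(list)
    for line in lines:                 # map phase
        for key, value in map_fn(line):
            groups[key].append(value)  # shuffle: group values by key
    return dict(reduce_fn(k, v) for k, v in groups.items())  # reduce phase
```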
HDFS: goals
HDFS: design
HDFS: components
HDFS: features
Uniform name space: same path on all workstations
Reliability: read/write replication, re-balancing, copies in different locations
Availability: hot deploy
Scalability: server aggregation
Administration: HOD (Hadoop on Demand)
Performance: “grid” computation, parallel transfers
HDFS: who uses it?
Major players
Yahoo!, A9.com, AOL, Booz Allen Hamilton, EHarmony, Facebook, Freebase, Fox Interactive Media, IBM, ImageShack, ISI, Joost, Last.fm, LinkedIn, Metaweb, Meebo, Ning, Powerset (now part of Microsoft), Proteus Technologies, The New York Times, Rackspace, Veoh, Twitter, …
HDFS: good for ...
Ceph
“Ceph is designed to handle workloads in which tens of thousands of clients or more simultaneously access the same file or write to the same directory, usage scenarios that bring typical enterprise storage systems to their knees.”

Keys:
Seamless scaling: the file system can be seamlessly expanded by simply adding storage nodes (OSDs). Unlike most existing file systems, Ceph proactively migrates data onto new devices in order to maintain a balanced distribution of data.
Strong reliability and fast recovery: all data is replicated across multiple OSDs. If any OSD fails, data is automatically re-replicated to other devices.
Adaptive MDS: the Ceph metadata server (MDS) dynamically adapts its behavior to the current workload.
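Ceph's actual placement algorithm is CRUSH, which is considerably more sophisticated; but the core idea, that hash-based placement lets a cluster absorb new OSDs by migrating only a fraction of the data, can be illustrated with a simple consistent-hashing ring (OSD names and vnode count are made up):

```python
import bisect
import hashlib

# Rough illustration of why hash-based placement makes proactive data
# migration cheap. Ceph's real algorithm is CRUSH, not this ring; the
# sketch only shows that adding a node moves a small fraction of the
# objects, and that every moved object lands on the new node.

def _h(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=64):
        # Each node owns many points on the ring to smooth the distribution.
        self._points = sorted(
            (_h(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def locate(self, obj):
        # An object belongs to the first node point clockwise of its hash.
        i = bisect.bisect(self._keys, _h(obj)) % len(self._points)
        return self._points[i][1]
```

In the sketch, every object that moves after a node is added moves onto the new node; nothing shuffles between the old nodes, which is the property that keeps rebalancing traffic proportional to the added capacity.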
Ceph: design
Ceph: features
Ceph: good for …
Others
Part III
Case Studies
Class Exam
What can DFS do for you?
How can you create petabyte-scale storage?
How can you build a centralized system log?
How can you allocate space for your users or systems when you have thousands of them?
How can you retrieve data from anywhere?
File sharing
Web Service
Internet Disk: myS3
Log concentrator
Private cloud
Conclusion: problems
Failure: for 10 PB of storage, you will have an average of 22 consumer-grade SATA drives failing per day.
Read/write time: each 2 TB drive takes, best case, approximately 24,390 seconds to be read and written over the network.
Data replication: replication multiplies the number of disk drives required, plus the bandwidth to keep the copies in sync.
Do you have enough bandwidth?
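The read/write figure works out from simple arithmetic, assuming (our assumption; the slide does not state a rate) a sustained transfer rate of roughly 82 MB/s per 2 TB drive:

```python
# Back-of-the-envelope check of the figures above. The ~82 MB/s sustained
# transfer rate is our assumption, not stated on the slide.

DRIVE_BYTES = 2 * 10**12        # one 2 TB drive
RATE = 82 * 10**6               # ~82 MB/s sustained over the network
TOTAL_BYTES = 10 * 10**15       # 10 PB of storage

seconds_to_read = DRIVE_BYTES / RATE          # time to stream one drive once
drives_for_10pb = TOTAL_BYTES // DRIVE_BYTES  # drive count at this capacity
```

At that rate a full drive takes about 6.8 hours to stream once, and 10 PB spans about 5,000 such drives, which is why rebuild and replication bandwidth dominates at this scale.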
Conclusion
Conclusion: next step
Links
I look forward to meeting you…
XVII European AFS Meeting 2010, Pilsen, Czech Republic
September 13-15
Who should attend:
Everyone interested in deploying a globally accessible file system
Everyone interested in learning more about real-world usage of Kerberos authentication in single-realm and federated single sign-on environments
Everyone who wants to share their knowledge and experience with other members of the AFS and Kerberos communities
Everyone who wants to find out the latest developments affecting AFS and Kerberos
More Info: http://afs2010.civ.zcu.cz/