Burkhard Noltensmeier, teuto.net Netzdienste GmbH
Erkan Yanar, Consultant
teuto.net Netzdienste GmbH
● 18 employees
● Linux systems integrator and web development
● Ubuntu Advantage Partner
● OpenStack Ceph service
● Offices and data center in Bielefeld
Why OpenStack?
Infrastructure as a Service
● cloud-init (automated instance provisioning; sketched below)
● Network virtualization
● Multiple storage options
● Multiple APIs for automation
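As an aside on the provisioning workflow, a minimal sketch of booting an instance with cloud-init user data (image, flavor, file, and instance names are all hypothetical):

$ # Boot a VM; cloud-init applies userdata.yaml on first boot
$ nova boot --image ubuntu-12.04-server --flavor m1.small \
    --user-data userdata.yaml web1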
● Closed beta since September 2013
● Updated to Havana in October
● Ubuntu Cloud Archive
● 20 compute nodes
● 5 Ceph nodes
● Additional monitoring with Graphite
Provisioning and Orchestration
OpenStack Storage Types
● Block storage
● Object storage
● Image repository
● Internal cluster storage
  – Temporary image store
  – Databases (MySQL Galera, MongoDB)
Storage Requirements
● Scalability
● Redundancy
● Performance
● Efficient pooling
Key Facts for our Decision
● One Ceph cluster fits all OpenStack needs
● No "single point of failure"
● POSIX compatibility via RADOS Block Device
● Seamless scalability
● Commercial support by Inktank
● Open source (LGPL)
RADOS Block Storage
● Live migration
● Efficient snapshots
● Different types of storage available (tiering)
● Cloning for fast restore or scaling (see the sketch below)
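A minimal sketch of the snapshot-and-clone workflow (pool and image names are hypothetical); RBD clones are copy-on-write, which is what makes restore and scale-out fast:

$ rbd snap create rbd/base-image@gold          # take a snapshot
$ rbd snap protect rbd/base-image@gold         # protect it so it can be cloned
$ rbd clone rbd/base-image@gold rbd/instance-1 # copy-on-write clone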
How to start
● Determine the cluster size
● An odd number of nodes, so the monitors can negotiate a quorum
● Start small, with at least 5 nodes
● Either 8 or 12 disks per chassis
● One journal per disk
● 2 journal SSDs per chassis (layout sketched below)
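One way to realize this layout with ceph-deploy, as a sketch (host and device names are hypothetical): each data disk becomes one OSD, with its journal on a dedicated partition of one of the two chassis SSDs:

$ ceph-deploy osd create node1:sdb:/dev/sda5   # OSD on sdb, journal on SSD partition
$ ceph-deploy osd create node1:sdc:/dev/sda6   # next disk, next journal partition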
Rough calculation
● 3 nodes, 8 disks per node, 2 replicas
● Net = Gross / 2 replicas, minus 1 node in 3 (33%) ⇒ net ≈ 33% of gross

Cluster gross
● 24 × 2 TB SATA disks, 100 IOPS each

Cluster net
● 15.8 terabytes, 790 IOPS
Rough calculation
● 5 nodes, 8 disks per node, 3 replicas
● Net = Gross / 3 replicas, minus 1 node in 5 (20%) ⇒ net ≈ 27% of gross

Cluster gross
● 40 × 2 TB SATA disks, 100 IOPS each

Cluster net
● 21.3 terabytes, 1066 IOPS (checked below)
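The same arithmetic can be checked quickly in a shell; the figures below reproduce the 5-node example (gross / replicas × (nodes − 1) / nodes):

$ echo '40 * 2 / 3 * 4 / 5' | bc -l     # net capacity: ~21.3 TB
$ echo '40 * 100 / 3 * 4 / 5' | bc -l   # net IOPS: ~1066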
Ceph specifics
● Data is distributed throughout the cluster
● Unfortunately this destroys data locality
● Trade-off between block size and IOPS: the bigger the blocks, the better the sequential performance (benchmark sketch below)
● Double writes: SSD journals strongly advised
● Long-term fragmentation caused by small writes
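The block-size trade-off is easy to demonstrate with rados bench; a sketch against a hypothetical test pool:

$ rados -p testpool bench 60 write -b 4096      # 4 KB objects: limited by IOPS
$ rados -p testpool bench 60 write -b 4194304   # 4 MB objects: limited by throughput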
Operational Challenges
● Performance
● Availability
● QoS (Quality of Service)
Ceph Monitoring in OpenStack
● Ensure quality with monitoring
● Easy spotting of congestion problems
● Event monitoring (e.g. disk failure)
● Capacity management
What we did
● Disk monitoring with Icinga
● Collect data via the Ceph admin socket JSON interface
● Put it into Graphite
● Enrich it with metadata (pipeline sketched below)
  – OpenStack tenant
  – Ceph node
  – OSD
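A compressed sketch of that pipeline (OSD id, counter name, metric path, and Graphite host are illustrative): read one counter from the admin socket's JSON output and feed it to Graphite's plaintext port:

$ ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump \
    | jq '.osd.op_w' \
    | xargs -I{} echo "ceph.node1.osd0.op_w {} $(date +%s)" \
    | nc -q0 graphite.example.com 2003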
Cumulative OSD performance
Single OSD performance
Sum by OpenStack tenant
Verify Ceph Performance
● fio benchmark with fixed file size:
  fio --fsync=<n> --runtime=60 --size=1g --bs=<n> ...
● Different sync options: nosync, 1, 100
● Different Cinder QoS service options
● Block sizes: 64k, 512k, 1024k, 4096k
● 1 up to 4 VM clients
● Resulting in about 500 benchmark runs (one cell of the matrix is filled in below)
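One concrete run from that matrix, as a sketch (job name and mount point of the Cinder volume are hypothetical):

$ fio --name=cephtest --directory=/mnt/cinder-volume \
      --rw=write --bs=64k --size=1g --runtime=60 --fsync=100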
Cinder Quality of Service
$ cinder qos-create highiops consumer="front-end" \
    read_iops_sec=100 write_iops_sec=100 \
    read_bytes_sec=41943040 write_bytes_sec=41943040

$ cinder qos-create lowiops consumer="front-end" \
    read_iops_sec=50 write_iops_sec=50 \
    read_bytes_sec=20971520 write_bytes_sec=20971520

$ cinder qos-create ultralowiops consumer="front-end" \
    read_iops_sec=10 write_iops_sec=10 \
    read_bytes_sec=10485760 write_bytes_sec=10485760
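A QoS spec only takes effect once it is associated with a volume type that users can select; a sketch with a hypothetical type name and placeholder IDs:

$ cinder type-create highiops
$ cinder qos-associate <qos_spec_id> <volume_type_id>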
Speed per Cinder QoS
Does it scale?
Effect of syncing files
Different block sizes with sync
Ceph is somewhat complex, but
● Reliable
● No unpleasant surprises (so far!)
● Monitoring is important for resource management and availability!