Date posted: 12-Aug-2015
Uploaded by: donaghmccabe
© Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The OpenStack™ attribution statement should be used: The
OpenStack wordmark and the Square O Design, together or apart, are trademarks or registered trademarks of OpenStack Foundation in the United States and other countries, and are used with the
OpenStack Foundation’s permission.
Vancouver OpenStack® Summit
Maintaining and Operating Swift at Public Cloud Scale
Lorcan Browne
Donagh McCabe
May 18, 2015
Agenda
• Swift in Helion Public Cloud
• Monitoring
• Swift Runbook
• Deployments/Operations
• Q & A
Swift in HP Helion Public Cloud
HP Cloud Swift Ecosystem
[Diagram relating Swift to the following activities and teams]
Activities:
• Service Monitoring/Customer Issues
• Deployments/Operations
• Operations Runbook
Teams:
• Data Center Operators
• Tech Ops
• Network Operations Center
• Swift Service Team
• OpenStack Core Team
Swift at Scale
HP Helion Public Cloud
Two Data Centers
• Over 3 years operational
• 18 PB of raw storage
• 3.5 billion objects
• 130 Proxy Servers
• ~700 Storage Nodes
• ~8,000 Storage Drives
Features
• 3 Replicas
• 3 “Availability” Zones
• Single storage policy
• Upstream Swift, except for:
– Content Delivery Network
– Support for legacy SWAUTH accounts
Server Racking Example
1xRack: 1620 “TB” raw
replica count: 3
usable: 80%
3xRack: ~1 PB usable
Proxy-Account-Container
6x HP DL360p Gen9
4x HP 800 GB SSD
Swift Object-Server
18x HP DL380 G9
15x HP 6TB SAS 7.2K
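The rack figures above can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers on the slide; the function and constant names are illustrative, and TB is assumed to mean decimal terabytes.

```python
# Figures taken from the slide; names are illustrative only.
SERVERS_PER_RACK = 18    # object servers per rack
DRIVES_PER_SERVER = 15   # drives per object server
DRIVE_TB = 6             # capacity per drive, in TB
REPLICAS = 3             # replica count
USABLE_PERCENT = 80      # the slide's "usable: 80%"


def raw_tb(racks: int) -> int:
    """Raw capacity of the object servers in `racks` racks, in TB."""
    return racks * SERVERS_PER_RACK * DRIVES_PER_SERVER * DRIVE_TB


def usable_tb(racks: int) -> float:
    """Capacity left after replication and the usable-space allowance."""
    return raw_tb(racks) * USABLE_PERCENT / (REPLICAS * 100)


print(raw_tb(1))     # raw TB in one rack
print(usable_tb(3))  # usable TB across three racks
```

One rack works out to 1620 TB raw, matching the slide. Three racks give 1296 TB after dividing by the replica count and applying the 80% allowance, which the slide rounds to ~1 PB.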
Data Center – Failure Zones
[Diagram: load balancers (LB) in front of three availability zones: AZ1, AZ2, AZ3]
Objects – Number and Size
• Most objects are small
• Bulk of objects are 1k to 64k
• A tiny fraction (0.01%) of objects are very large
[Pie chart: objects by size. Segment labels: < 1KB, 1KB–100KB, 100KB–100MB, > 100MB; segment values: 37%, 55%, 8%, 0%]
Objects – Space usage
• Half of capacity is used by 0.01% of objects
• Account and container databases are ~1% of object size
[Pie chart: capacity by category. Segment labels: small objects, large objects, containers; segment values: 52%, 47%, 1%]
Millions of Operations per day
Normalized to 1PB of user data
[Bar chart, y-axis in millions (0 to 15): PUTs (4m), GETs (5.8m), DELETEs (1.5m), Other (10m)]
PUT/GET Size – per 1PB of user data
[Pie chart: PUTs 83%, GETs 17%]
• Normalized to 1PB of user data
• PUTs: 2.5 TB per day
• GETs: 508 GB per day
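Combining these daily volumes with the daily operation counts from the operations chart (4m PUTs, 5.8m GETs) gives a rough average request size. The sketch below assumes both slides are normalized to the same 1PB of user data and that TB and GB are decimal units; the variable names are illustrative.

```python
# Daily volumes and counts quoted on the slides (per 1PB of user data).
PUT_BYTES_PER_DAY = 2.5e12   # 2.5 TB
GET_BYTES_PER_DAY = 508e9    # 508 GB
PUTS_PER_DAY = 4.0e6         # "PUTs (4m)"
GETS_PER_DAY = 5.8e6         # "GETs (5.8m)"

# Average payload per request, in KB.
avg_put_kb = PUT_BYTES_PER_DAY / PUTS_PER_DAY / 1e3
avg_get_kb = GET_BYTES_PER_DAY / GETS_PER_DAY / 1e3

# Share of transferred bytes that are uploads.
put_share = PUT_BYTES_PER_DAY / (PUT_BYTES_PER_DAY + GET_BYTES_PER_DAY)

print(f"average PUT size: {avg_put_kb:.0f} KB")
print(f"average GET size: {avg_get_kb:.1f} KB")
print(f"PUT share of bytes: {put_share:.0%}")
```

Under these assumptions the average PUT is about 625 KB and the average GET about 88 KB, and the PUT share of bytes comes out at roughly 83%, in line with the chart.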
Monitoring
Service Monitoring
Monitoring/Alerting
Uptime/latency Monitor
swift-uptime-mon
Over several cycles we will visit every server in the
system
• 60 seconds between cycles
• ~100 Requests:
– GET/PUT/DELETE object requests
– GET /healthcheck requests
• Measures and logs:
– “Soft” failure – any failure
– “Hard” failure – still fails after a retry
– Latency (average/max in cycle)
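The soft/hard failure accounting described above can be sketched as follows. This is not the swift-uptime-mon source; it is a hypothetical illustration in which each probe is a callable that returns a latency in seconds or raises an exception, and a single retry is assumed.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List


@dataclass
class CycleResult:
    soft_failures: int = 0   # any failed request
    hard_failures: int = 0   # requests that also failed on retry
    latencies: List[float] = field(default_factory=list)

    @property
    def avg_latency(self) -> float:
        return mean(self.latencies) if self.latencies else 0.0

    @property
    def max_latency(self) -> float:
        return max(self.latencies, default=0.0)


def run_cycle(probes: List[Callable[[], float]]) -> CycleResult:
    """Run every probe once, retrying each failed probe a single time."""
    result = CycleResult()
    for probe in probes:
        try:
            result.latencies.append(probe())
        except Exception:
            result.soft_failures += 1          # "soft": any failure
            try:
                result.latencies.append(probe())
            except Exception:
                result.hard_failures += 1      # "hard": failed after retry
    return result
```

In a real monitor the probes would issue the GET/PUT/DELETE and /healthcheck requests, and `run_cycle` would be called once per 60-second cycle with a different subset of servers each time.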
External view of uptime – 2 years of Pingdom
Rounded! Actually 99.998%
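To put 99.998% in perspective, the figure can be converted into minutes of downtime over the two-year window. A small sketch, assuming 365-day years:

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # assumes 365-day years
YEARS = 2
UPTIME_PERCENT = 99.998

total_minutes = YEARS * MINUTES_PER_YEAR
downtime_minutes = total_minutes * (100 - UPTIME_PERCENT) / 100

print(f"{downtime_minutes:.1f} minutes of downtime in {YEARS} years")
```

Under that assumption the result is roughly 21 minutes of externally visible downtime across the two years.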
Smoke Tests
Jenkins Job
• Emphasis on features that require external
support (specifically Keystone). Runs in a
few minutes.
• Not regression or functional testing – we
cover that elsewhere in the development
cycle
• Runs hourly and more frequently prior to,
during and after software deploys
Monitoring/Alerts
Obvious
• swift-recon – async_pending
• Hardware:
– Drive status
– NIC status
• Services running?
– PIDs still there?
– Responds to /healthcheck?
• Keystone validating tokens?
Less obvious
• Hardware
– NIC speeds
– I/O wait times (next slide)
– Firmware versions
• Replication time
• SSL certs approaching end of life?
• Numbers of file descriptors
• NTP operating?
• Connectivity to 11211 (memcached)
• DNS failures
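As an illustration of one of the less obvious checks, the sketch below estimates how many days remain before an SSL certificate expires, using only the Python standard library. The 30-day warning threshold and the function names are assumptions for the example, not values from the talk.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the notAfter field of the certificate presented by host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between `now` (default: current time) and the expiry date."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400.0


def needs_renewal(not_after: str, warn_days: float = 30.0,
                  now: Optional[float] = None) -> bool:
    """True when the certificate expires within `warn_days` days."""
    return days_until_expiry(not_after, now) < warn_days
```

A monitoring job could call `needs_renewal(fetch_not_after("swift.example.com"))` (a placeholder hostname) on a schedule and raise an alert when it returns True.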
Drive I/O wait times
From collectl toolkit – histogram of number of requests in time range buckets
% disk-anal.pl -d
Disk: sda Wait: 62804 8 1 0 0 0 0 0 0 0 0
Disk: sdb Wait: 54901 886 138 14 70 52 1 0 15 0 0
...
Disk: sdk Time: 63410 9 11 0 0 0 0 0 0 0 0
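Output in this shape lends itself to a simple automated check. The sketch below parses lines like the ones above and reports the fraction of requests that fell outside the first (fastest) bucket. The bucket boundaries are not given on the slide, so the 1% threshold and the choice to treat every bucket after the first as slow are assumptions for illustration.

```python
import re
from typing import List, Optional, Tuple

# Matches e.g. "Disk: sdb Wait: 54901 886 138 ..." (label word may vary).
LINE = re.compile(r"^Disk:\s+(\S+)\s+\w+:\s+([\d\s]+)$")


def slow_fraction(line: str) -> Optional[Tuple[str, float]]:
    """Return (disk, fraction of requests beyond the first bucket)."""
    match = LINE.match(line.strip())
    if not match:
        return None  # not a histogram line (e.g. "...")
    counts = [int(value) for value in match.group(2).split()]
    total = sum(counts)
    slow = sum(counts[1:])
    return match.group(1), (slow / total if total else 0.0)


def flag_slow_disks(lines: List[str], threshold: float = 0.01) -> List[str]:
    """List disks whose slow fraction exceeds `threshold`."""
    flagged = []
    for line in lines:
        parsed = slow_fraction(line)
        if parsed and parsed[1] > threshold:
            flagged.append(parsed[0])
    return flagged


sample = [
    "Disk: sda Wait: 62804 8 1 0 0 0 0 0 0 0 0",
    "Disk: sdb Wait: 54901 886 138 14 70 52 1 0 15 0 0",
    "...",
    "Disk: sdk Time: 63410 9 11 0 0 0 0 0 0 0 0",
]
print(flag_slow_disks(sample))
```

With the sample lines from the slide, only sdb has more than 1% of its requests outside the first bucket, which is the kind of drive worth investigating before it fails outright.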
Viewing monitoring/alert data
Dashboard – for metrics
• Trend spotting
• 24 hours and 7 days
• Uses Public Cloud analytics pipeline
• Collectd/rrdtool
• Vertica database
Swift Runbook
Swift Runbook
[Diagram: the runbook, its related elements, and the teams that use it]
Diagram elements:
• CMDB
• Basic systems checks
• Swift operations
• Monitoring results explanation
• Log interpretation
• How-to-fix procedures
Teams:
• Data Center Operators
• Tech Ops
• Network Operations Center
Notes:
• Essential for scale and continued development
• Populated by service team
• Consumed by NOC/Ops teams
• Continuously updated
• Automated where possible
Case Study - Unrecoverable Read Errors
Background:
• Data, once written, cannot later be read (URE)
• As seen in Swift:
– In objects: Swift automatically renames the object and gets the object-replicator to create a new copy
– In filesystem metadata: swift-drive-audit scans kern.log for evidence
Issue:
• The NOC was spending significant time fixing problems manually
• Warnings were not going away immediately after the fix
Action:
• Improve efficiency and reports from swift-drive-audit
• Automate repair
• Revalidate “problem” sectors – clears old alarms
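The kind of kern.log scan mentioned above can be sketched as a per-device error count. This is not swift-drive-audit's code or configuration; the regular expressions and the sample log lines are hypothetical and only show the general technique of matching error messages that name a block device.

```python
import re
from collections import Counter
from typing import Iterable

# Illustrative patterns: an "error" message that also names an sd* device.
ERROR_PATTERNS = [
    re.compile(r"\berror\b.*\b(sd[a-z]{1,2})\d?\b"),
    re.compile(r"\b(sd[a-z]{1,2})\d?\b.*\berror\b"),
]


def count_device_errors(lines: Iterable[str]) -> Counter:
    """Count log lines that report an error for each device."""
    errors: Counter = Counter()
    for line in lines:
        for pattern in ERROR_PATTERNS:
            match = pattern.search(line)
            if match:
                errors[match.group(1)] += 1
                break  # count each line once
    return errors


# Hypothetical kernel log lines, for illustration only.
sample_log = [
    "May 18 10:00:01 node1 kernel: end_request: I/O error, dev sdb, sector 123456",
    "May 18 10:00:05 node1 kernel: XFS (sdc1): metadata I/O error: block 0x10",
    "May 18 10:01:00 node1 kernel: eth0: link up",
    "May 18 10:02:00 node1 kernel: end_request: I/O error, dev sdb, sector 123999",
]
print(count_device_errors(sample_log))
```

A count per device gives the automation a simple trigger for repair, and re-running the scan after revalidating the affected sectors is one way to confirm that old alarms can be cleared.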
Deployment/Operating
Making Software/Configuration Changes
Development/Test systems → QA production-mimic system → Production system
Deploying Changes To Production
[Flowchart: Start → System pre-check → Deploy to single AZ → System operating? If No, revert back to original code. If Yes → More AZs? If Yes, deploy to the next AZ; if No → End Deploy]
• System Pre-Check:
– Smoke Tests
– Icinga
– Dashboard
• Check before, during and after deploy
• Currently using Chef infrastructure
• 1 availability zone at a time
• 2/3 of the system is always in a usable state
• Some deploys require rolling restarts
• Reload (not restart)
• Limit to ~10 servers at a time
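The deploy loop described on this slide can be sketched as below. The real process runs on Chef; here the pre-check, deploy, health check and revert steps are hypothetical callables so the control flow can be shown on its own. Reverting every zone touched so far is an assumption about what "revert back to original code" covers.

```python
from typing import Callable, Dict, Iterator, List


def batches(servers: List[str], size: int = 10) -> Iterator[List[str]]:
    """Yield the servers in groups of at most `size`."""
    for start in range(0, len(servers), size):
        yield servers[start:start + size]


def rolling_deploy(
    zones: Dict[str, List[str]],
    precheck: Callable[[], bool],
    deploy: Callable[[List[str]], None],
    healthy: Callable[[], bool],
    revert: Callable[[List[str]], None],
    batch_size: int = 10,
) -> bool:
    """Deploy one availability zone at a time; revert if the system degrades."""
    if not precheck():
        return False                      # do not start on an unhealthy system
    done: List[str] = []
    for zone, servers in zones.items():
        for batch in batches(servers, batch_size):
            deploy(batch)                 # ~10 servers at a time
        done.append(zone)
        if not healthy():                 # "System operating?"
            for name in reversed(done):
                revert(zones[name])       # assumption: revert all touched zones
            return False
    return True
```

Because only one zone is changed at a time, the other two zones keep serving requests while the health check decides whether to continue or roll back.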
Managing the Ring
[Flowchart: CSV file → Diff CSV-builder: generate add/remove commands → swift-ring-builder → Package rings into Debian package → Deploy rings to /var/cache/swift → Match checksum? If Yes, copy rings to /etc/swift; if No, deploy fails. Alongside: Add checksum to system config → Deploy checksum to nodes]
CSV file:
# Ring parameters
acc_conf,15,3,24
con_conf,15,3,24
obj_conf,15,3,24
# IP, ZONE, TYPE
*10.184.9.123,1,SE1170s_3
*10.184.9.124,1,SE1170s_3
*10.184.10.1,1,SE1170s_3
*10.184.10.2,1,SE1170s_3
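The checksum gate in this flow (stage the rings, compare checksums, and only then copy into place) can be sketched as follows. The slide does not say which hash is used or how the expected checksum is stored, so SHA-256 and the `expected` mapping are assumptions, as are the function names.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict


def sha256_of(path: Path) -> str:
    """Hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def install_rings(staging_dir: str, target_dir: str,
                  expected: Dict[str, str]) -> bool:
    """Copy staged ring files into place only if every checksum matches.

    `expected` maps ring file names to the checksum distributed through
    the system configuration. Returns False (deploy fails) on any mismatch.
    """
    staging, target = Path(staging_dir), Path(target_dir)
    for name, digest in expected.items():
        staged = staging / name
        if not staged.is_file() or sha256_of(staged) != digest:
            return False              # nothing is copied on a mismatch
    for name in expected:
        shutil.copy2(staging / name, target / name)
    return True
```

In the slide's terms, `staging_dir` would be /var/cache/swift and `target_dir` would be /etc/swift. Verifying all files before copying any of them keeps a node from running with a mix of old and new rings.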
Deploying Racks – Sequencing Ring Change
• Series of ring changes often required:
[Flowchart: System Pre-check → Replication time? → Dispersion report? → Deploy Ring Changes → More Ring Changes? (Yes: repeat; No: End). The replication and dispersion checks each have a Wait branch. Edge labels: Active, NO, 100%, >3 copies]
• Swift Dispersion Report – checks a sample of containers and objects. Is each replica in its canonical location?
• Replication Time – time increases dramatically after a ring deploy because partitions are shifted between devices
• We don’t change rings for drive/server failures
• Swift proxies are not in the ring, so they are much easier to add or remove
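The sequencing idea (hold each ring change until replication has settled and the dispersion report is back to 100%) can be sketched as a gated loop. The callables below are hypothetical stand-ins for reading the replication time and the dispersion report; the exact gate conditions in the original flowchart are not fully recoverable, so this is one plausible reading.

```python
import time
from typing import Callable, Iterable


def apply_ring_changes(
    changes: Iterable[str],
    replication_settled: Callable[[], bool],
    dispersion_percent: Callable[[], float],
    deploy: Callable[[str], None],
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Deploy ring changes one at a time, waiting for the cluster to settle."""
    for change in changes:
        # Hold the next change until the previous one has been absorbed.
        while not (replication_settled() and dispersion_percent() >= 100.0):
            sleep(wait_seconds)
        deploy(change)
```

Spacing the changes this way limits how many partitions are in motion at once, which matters because replication time rises sharply after each ring deploy.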
Rack deployments over one month
Replication Time
Operating Swift - Summary
Lessons
• Make procedures repeatable
• Monitor everything
• Keep on top of problems and
faults
• However, don’t panic when
async_pending is high
Day to Day
• General break-fix
• UREs (swift-drive-audit)
• Reviewing system state
• User queries
Q & A