Copyright © 2014 NTT DATA Corporation
Yuji Hagiwara
Platform Engineer, NTT DATA Corp.
Introduction to OpenStack Swift
CloudOpen Japan 2014
Agenda
1. What is Swift?
2. Swift’s Latest Information
3. Swift’s Future
Who am I
Yuji Hagiwara – Platform Engineer, NTT DATA Corp.
• Since 2011: Using OpenStack
• Since 2013: Developing search on Swift
(Figure: Demo App for Searching on Swift)
Background
Data explosion in the enterprise: the amount of unstructured data has been growing.
• We need storage with scalability, durability, and availability.
(Figure: amount of unstructured data from 2004 to 2016, growing exponentially to PB or EB scale)
Examples of unstructured data
• Media (images, video, audio)
• Web content
• Documents
• Backups/archives
Where should we store this data?
One of the solutions is Swift.
What is Swift?
Swift is...
• A storage system with scalability, durability, and availability.
• A RESTful distributed object store, like Amazon S3.
• One of the OpenStack core components.
• Implemented in Python.
• Open source software.
(Figure: OpenStack storage components: ① Block Storage (Cinder), ② Object Storage (Swift))
Usage
Usage is simple.
$ curl -X PUT --data-binary '@mydoc.txt' \
    http://swift.example.com:8080/v1/account/container/object
$ curl -X GET \
    http://swift.example.com:8080/v1/account/container/object
$ curl -X DELETE \
    http://swift.example.com:8080/v1/account/container/object
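The same three requests can be sketched in Python, the language Swift itself is written in. This is a minimal illustration using only the standard library. The URL is the placeholder from the slide, and the helper name `build_requests` is made up for this example. The requests are built but not sent, so the snippet runs without a cluster.

```python
from urllib.request import Request

# Placeholder endpoint copied from the slide; replace with a real cluster URL.
OBJECT_URL = "http://swift.example.com:8080/v1/account/container/object"


def build_requests(url: str, payload: bytes) -> dict:
    """Build (but do not send) the upload, download, and delete requests."""
    return {
        "upload": Request(url, data=payload, method="PUT"),
        "download": Request(url, method="GET"),
        "delete": Request(url, method="DELETE"),
    }


requests_by_name = build_requests(OBJECT_URL, b"contents of mydoc.txt")
for name, req in requests_by_name.items():
    # Print the HTTP verb and target of each request.
    print(name, req.get_method(), req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen`. A real cluster would normally also expect an authentication token header, which the slide omits.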
Use cases of Swift
Swift as storage for a variety of applications
Applications reach Swift through its REST API:
• System backup
• CMS
• Cyberduck (FTP-like use)
• Digital distribution
• Web apps …
OpenStack Swift deployments and use cases
Name of enterprise | Product/service | Description
Rackspace (USA) | Cloud Files | Cloud file-sharing service run by Rackspace itself. It uses the same code as the OSS release except for features such as authentication, accounting, and CDN (<500PB).
Korea Telecom (South Korea) | ucloud storage service | Object storage service using OpenStack Swift (16PB+).
Sina (China) | Sina App Engine (SAE) | Public storage service. Moved to OpenStack from another technology (MongoDB) in 2012.
San Diego Supercomputer Center (USA) | SDSC Cloud Storage Services | Cloud storage service at SDSC. Users can select Amazon S3 or Rackspace Swift.
SME Storage (USA) | SMEStorage Open Cloud Platform | Cloud storage service based on Rackspace Cloud Files.
SoftLayer (USA) | SoftLayer Object Storage | Public object storage service. SoftLayer was acquired by IBM.
SwiftStack (USA) | SwiftStack | Provides professional services and an operation and management product.
HP (USA) | HP Cloud | Private cloud storage service using OpenStack.
Wikimedia (USA) | Wikimedia storage | Media file store for Wikipedia.
NII (Japan) | Academic cloud service | Academic cloud service by the National Institute of Informatics (NII) in Japan, integrated and supported by NTT DATA.
Inside Swift
Architecture: Nodes
Swift consists of two types of nodes: Proxy Nodes and Storage Nodes.
• Proxy Node: forwards data to the Storage Nodes.
• Storage Node: stores data.
(Figure: Application → HTTP load balancer → Proxy Nodes → Storage Nodes)
Architecture: The Ring
The Ring (a static table for data allocation on storage nodes) decides the optimal Storage Node by name.
(Figure: Application → HTTP load balancer → Proxy Nodes → Storage Nodes; every Proxy Node and every Storage Node holds a copy of the Ring)
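The idea of a static table that maps a name to storage nodes can be sketched as follows. This is a simplified illustration, not Swift's actual ring code: the node names, the partition count, and the round-robin assignment are all assumptions made for the example, and a production ring also has to consider things like device capacity and failure domains.

```python
import hashlib

PART_POWER = 4   # assumption: 2**4 = 16 partitions
REPLICAS = 3
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def build_table(nodes, part_power, replicas):
    """Static table: partition number -> list of distinct replica nodes."""
    return [
        [nodes[(part + r) % len(nodes)] for r in range(replicas)]
        for part in range(2 ** part_power)
    ]


def partition_for(name: str, part_power: int) -> int:
    """Hash the name and keep the top `part_power` bits as the partition."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - part_power)


def nodes_for(name: str, table, part_power: int = PART_POWER):
    """Look up which nodes should hold the replicas of `name`."""
    return table[partition_for(name, part_power)]


RING_TABLE = build_table(NODES, PART_POWER, REPLICAS)
print(nodes_for("account/container/object", RING_TABLE))
```

Because the lookup depends only on the name and the table, every node holding the same table reaches the same answer without asking a central server.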
Architecture: The role of the Ring
When you request to store data “A”, the three replica nodes chosen by the Ring store data “A”.
(Figure: the Ring says “A” must be located at Storage Nodes 1, 2, and 4, so the Proxy Node writes Data “A” to those three nodes; Data “B” is held on three nodes in the same way)
Architecture: The role of the Ring
When you request to get data “A”, one of the replica nodes returns data “A”.
(Figure: the Ring says “A” is located at Storage Nodes 1, 2, and 4; the Proxy Node reads Data “A” from one of them and returns it to the application)
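The store and get paths can be imitated with a small in-memory model. This is only a sketch under simple assumptions: the `StorageNode` class and both helper functions are invented for the example, and the replica list is hard-coded to nodes 1, 2, and 4 as in the figure instead of being looked up in a ring.

```python
class StorageNode:
    """Toy storage node: a name, an on/off switch, and a dict of objects."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.objects = {}


def put_object(replica_nodes, key, data):
    """Store: write the object to every replica node the Ring selected."""
    for node in replica_nodes:
        node.objects[key] = data


def get_object(replica_nodes, key):
    """Get: return the object from the first replica node able to serve it."""
    for node in replica_nodes:
        if node.online and key in node.objects:
            return node.name, node.objects[key]
    raise KeyError(key)


nodes = {i: StorageNode(f"Storage Node {i}") for i in (1, 2, 3, 4)}
replicas_for_a = [nodes[1], nodes[2], nodes[4]]  # "A" must be at 1, 2, 4

put_object(replicas_for_a, "A", b"data A")
print(get_object(replicas_for_a, "A")[0])  # the first replica answers
nodes[1].online = False
print(get_object(replicas_for_a, "A")[0])  # another replica answers instead
```

The second read shows why any one replica is enough for a get: losing a node does not make the data unavailable.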
Scalability
(1) Expand proxy servers for more throughput.
(2) Expand storage servers or disks for more volume.
(Figure: a Proxy/Storage cluster before and after adding a Proxy, and before and after adding Storage)
Many processes working together
Swift is made up of the following processes:
• proxy-server
• account-server, account-replicator, account-auditor, account-reaper
• container-server, container-replicator, container-auditor, container-updater, container-sync
• object-server, object-replicator, object-auditor, object-updater, object-expirer
Replicator
Normal
(1) Each node checks the data on the other nodes.
Failure
(2) Disk failure
(3) Detect the disk trouble
(4) Copy data to another node
Recovery
(5) Recover the disk
(6) Recover data to the original node
(7) Delete the temporary data
Normal state
Each piece of data has been replicated.
(Figure: Data “A” and Data “B” are each stored on three Storage Nodes; an Auditor and a Replicator run on every Storage Node)
Failure state
If a disk is broken...
(Figure: the same cluster with one disk on a Storage Node marked Broken)
Failure state
The Replicator detects the lost data and temporarily replicates it to another node.
(Figure: when a Replicator detects lost data, it replicates the data to another Storage Node as temporary data; the failed disk is still marked Broken)
Recovery state
When the broken disk is replaced with a fresh disk...
(Figure: the cluster after the broken disk has been replaced)
Recovery state
The Replicator replicates the data to the correct node and removes the temporary data.
(Figure: the data is replicated to the correct node, and the temporary copy is marked Removed)
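The normal, failure, and recovery states can be walked through with a toy model of one replication pass. This is a sketch under stated assumptions, not the real object-replicator: the node names, the choice of `node1` as the temporary location, and the dictionary layout are all invented for the example.

```python
def make_node():
    """A toy node: whether its disk works, and the objects stored on it."""
    return {"disk_ok": True, "objects": {}}


def replication_pass(nodes, primaries, handoff, key):
    """One pass of a toy replicator for a single object."""
    # Find any surviving copy of the object on a healthy disk.
    source = next(
        (n for n in nodes.values() if n["disk_ok"] and key in n["objects"]),
        None,
    )
    if source is None:
        return  # nothing left to copy from
    data = source["objects"][key]

    if any(not nodes[p]["disk_ok"] for p in primaries):
        # Failure state: keep the replica count up with a temporary copy.
        nodes[handoff]["objects"][key] = data
        return

    # Recovery state: refill the primaries, then drop the temporary copy.
    for p in primaries:
        nodes[p]["objects"].setdefault(key, data)
    nodes[handoff]["objects"].pop(key, None)


nodes = {name: make_node() for name in ("node1", "node2", "node3", "node4")}
primaries, handoff = ["node2", "node3", "node4"], "node1"  # hypothetical
for p in primaries:
    nodes[p]["objects"]["B"] = b"data B"

nodes["node4"] = {"disk_ok": False, "objects": {}}   # a disk breaks
replication_pass(nodes, primaries, handoff, "B")     # temporary copy is made
temp_copy_exists = "B" in nodes["node1"]["objects"]

nodes["node4"] = make_node()                         # fresh, empty disk
replication_pass(nodes, primaries, handoff, "B")     # data returns, temp removed
print(temp_copy_exists, sorted(n for n in nodes if "B" in nodes[n]["objects"]))
```

After the second pass the object is back on its three original nodes and the temporary copy is gone, matching the final state described above.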
Latest Information
History and Trend of Community
History of OpenStack
• 2010.6 Start of the OpenStack project
• 2010.10 1st release “Austin”
• 2013.4 “Grizzly”
• 2013.10 “Havana”
• 2014.4 “Icehouse”
• 2014.10 “Juno”
Development trend in Swift (timeline of each function)
• Fundamental functions
• Global Cluster
• Erasure Coding (hot topic now)
• Storage Policy (hot topic now)
(Figure: timeline with a “Now” marker; each function is shown as Developing or Supported)
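Latest info: Erasure Coding
Method | Size
Replication | 3x original
Erasure Coding | 2x original
(Figure: with replication, the data is distributed as three full copies; with erasure coding, the data is split into two pieces of partial data and distributed together with Parity 1 and Parity 2)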
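The 3x and 2x figures follow from simple arithmetic: replication stores every byte once per copy, while erasure coding stores the data fragments plus the parity fragments, each the size of one data fragment. A short sketch, with function names chosen for this example and the 2 data plus 2 parity layout taken from the figure:

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Stored size relative to the original when keeping full copies."""
    return Fraction(copies, 1)


def erasure_coding_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Stored size relative to the original: (data + parity) / data."""
    return Fraction(data_fragments + parity_fragments, data_fragments)


print(f"Replication, 3 copies: {replication_overhead(3)}x original")
print(f"Erasure coding, 2 data + 2 parity: {erasure_coding_overhead(2, 2)}x original")
```

Other fragment counts give other ratios, for example 10 data plus 4 parity fragments would store 1.4x the original.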
Latest info: Storage Policy
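Before: the same policy across the whole cluster.
After: a variety of policies within one cluster.
(Figure: before, every piece of data is stored as three replicas; after, some data is stored as three replicas, some as two replicas, and some as partial data with Parity 1 and Parity 2)
More flexibility, but more complexity.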
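In Swift, a storage policy is chosen per container when the container is created, by sending an `X-Storage-Policy` header. The sketch below only builds such a request with the standard library and does not send it. The URL, the container name, and the policy name `erasure-coded` are placeholders: the policies that actually exist are defined by each cluster's operator.

```python
from urllib.request import Request

# Placeholder endpoint and container name; a real cluster has its own.
CONTAINER_URL = "http://swift.example.com:8080/v1/account/archive"


def create_container_request(url: str, policy: str) -> Request:
    """Build (but do not send) a container PUT that selects a storage policy."""
    return Request(url, method="PUT", headers={"X-Storage-Policy": policy})


# "erasure-coded" is a hypothetical policy name used only for illustration.
req = create_container_request(CONTAINER_URL, "erasure-coded")
print(req.get_method(), req.full_url, req.header_items())
```

Objects written to that container afterwards would then be stored according to the selected policy.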
Future Direction of Swift
Two concepts:
• Integrated Searchable Storage
• Intelligent Resource Management
Integrated Searchable Storage
Swift should be integrated with search.
This means the search feature must be as scalable, durable, and available as Swift itself.
(Figure: users, operators, and managers Store, Get, and Search against Swift as a single storage system)
Use cases of Search
1. Content search
2. Detection for de-duplication: search by the hash of the data to answer “Already stored?”
3. Tiered storage: search for cold content (less modified), as opposed to hot content (more modified), with a query such as “Modified date is older than 05/20/2014?”, and place it on cheap storage.
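The de-duplication and tiering use cases both reduce to lookups over object metadata. The sketch below illustrates that idea with an in-memory catalog. The `Catalog` class, its method names, and the sample objects are invented for this example and are not part of Swift.

```python
import hashlib
from datetime import date


class Catalog:
    """Toy metadata catalog supporting the two lookups described above."""

    def __init__(self):
        self.by_hash = {}    # content hash -> name of the stored object
        self.modified = {}   # object name -> last modified date

    def store(self, name: str, data: bytes, modified: date) -> str:
        """Return the name that holds this content, storing it only if new."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self.by_hash.get(digest)
        if existing is not None:
            return existing  # "Already stored?" -> yes, reuse it
        self.by_hash[digest] = name
        self.modified[name] = modified
        return name

    def cold_objects(self, cutoff: date) -> list:
        """Objects last modified before `cutoff`: cheap-storage candidates."""
        return sorted(n for n, m in self.modified.items() if m < cutoff)


catalog = Catalog()
catalog.store("report.pdf", b"quarterly report", date(2014, 1, 10))
catalog.store("photo.jpg", b"holiday photo", date(2014, 5, 25))
duplicate_of = catalog.store("report-copy.pdf", b"quarterly report", date(2014, 5, 26))
print(duplicate_of)
print(catalog.cold_objects(date(2014, 5, 20)))
```

The duplicate upload resolves to the existing object, and the cutoff query returns only the object that has not been modified since before 05/20/2014.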
Future: Integrated Searchable Storage
How do we implement it?
• Internal: the search feature and its index live inside Swift, next to the storage feature.
• External: a hook in Swift passes stored data to a pre-existing search engine, which holds the index and answers searches.
Item | Internal | External
Where to search | Swift with a search library (such as Lucene) | Search engine (such as Solr)
Redundancy | High | Depends on the search engine
Availability | High | Depends on the search engine
Scalability | High | Depends on the search engine
Difficulty of implementation | Hard | Easy
Our Implementation
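Internal approach
• Hack Swift to embed the search library.
• The application sends data and metadata through Swift’s ordinary API, and queries through a new search API, both on the Proxy Node.
• The container-server performs indexing and searching: next to each container’s SQLite DB (Container A DB, Container B DB, …) it keeps a Lucene index (Container A Index, Container B Index, …).
• The indexes are distributed by the Ring.
(Figure: Application → Proxy Node → account-server, container-server, and object-server)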
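The per-container index can be pictured as a small inverted index that is updated when an object is stored and consulted when a query arrives. The sketch below is only an illustration of that idea in plain Python. It does not use Lucene, and the `ContainerIndex` class and the sample metadata are assumptions made for the example.

```python
from collections import defaultdict


class ContainerIndex:
    """Toy inverted index over object metadata for one container."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> names of matching objects

    def index(self, object_name: str, metadata: dict) -> None:
        """Indexing: record every word of every metadata value."""
        for value in metadata.values():
            for term in str(value).lower().split():
                self._postings[term].add(object_name)

    def search(self, query: str) -> set:
        """Searching: return objects whose metadata contains all query words."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


# One index per container, mirroring "Container A Index" and "Container B Index".
indexes = {"ContainerA": ContainerIndex(), "ContainerB": ContainerIndex()}
indexes["ContainerA"].index("tokyo.jpg", {"title": "Tokyo Tower at night"})
indexes["ContainerA"].index("kyoto.jpg", {"title": "Kyoto temple at night"})
print(indexes["ContainerA"].search("night tokyo"))
```

Keeping one index per container means an index can be placed and replicated together with the container it describes.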
Future: Intelligent Resource Management
Swift has more and more different functions.
• Existing processes: proxy-server; account-server, account-replicator, account-auditor, account-reaper; container-server, container-replicator, container-auditor, container-updater, container-sync; object-server, object-replicator, object-auditor, object-updater, object-expirer
• Erasure Coding: ec-auditor, ec-reconstructor, ec-stripe-auditor
• Storage Policy: multi-ring support
• Other arbitrary processes: Search...? Compression...? Encryption...?
Future: Intelligent Resource Management
Resources are drained! (IOPS, CPU, network, memory)
The performance priorities of these functions differ by requirement.
Ex1) Store performance vs. search performance
Ex2) Service level during business hours vs. outside business hours
More intelligent resource management, with cgroups, is necessary.
(Figure: a 24-hour clock marked 0, 6, 12, 18. Business hours: high priority to processing requests. Outside business hours: high priority to checking durability.)
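The time-of-day policy can be expressed as a small function that returns relative CPU weights for two groups of processes. This is a sketch only: the business-hour range, the weight values, and the group names are assumptions, since the slide gives none of them. The weights are of the kind that could be written to a cgroup CPU control file such as `cpu.shares` in cgroup v1, but the snippet does not touch the system.

```python
# Assumed values: the slide gives neither exact hours nor weights.
BUSINESS_HOURS = range(9, 18)   # 09:00 up to, but not including, 18:00
HIGH, LOW = 1024, 256           # relative CPU weights


def cpu_weights(hour: int) -> dict:
    """Return the relative CPU weight of each process group for this hour."""
    if hour not in range(24):
        raise ValueError(f"hour must be 0..23, got {hour}")
    if hour in BUSINESS_HOURS:
        # Business hours: favour the processes that serve requests.
        return {"request_processing": HIGH, "durability_checks": LOW}
    # Outside business hours: favour replicators and auditors.
    return {"request_processing": LOW, "durability_checks": HIGH}


for hour in (3, 12, 21):
    print(hour, cpu_weights(hour))
```

A scheduler on each host could call such a function periodically and apply the result, which leaves open the question of how to coordinate many hosts.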
Summary
1. What is Swift? Swift is a great OSS for storing unstructured data.
2. Swift’s Latest Information: Erasure Coding and Storage Policy
3. Swift’s Future: Integrated Searchable Storage and Intelligent Resource Management
PR: Demonstration is Now Available!
We are exhibiting a demo application (a content delivery system) built with Swift.
• On-demand delivery of a large amount of content (pictures or movies) stored in Swift.
• Search on Swift, using our original implementation.
(map for demo booth)
Thank you for your attention!
Please contact [email protected] if you have any questions or comments.
Q&A: Do you have any questions?
Challenges and Questions
How to integrate Swift with cgroups?
• How to use cgroups?
• What is the best toolset for cgroups? VFS? libcgroup? systemd?
• How to control multiple hosts with cgroups dynamically?
How to integrate Swift with search?
• What is the best way to implement it?
• What is the best search middleware?
• How to search multilingual content?