
OSDC 2014: Martin Gerhard Loschwitz - What's next for Ceph?

Date post: 10-Jun-2015
Upload: netways
Description:
Ceph has recently gained considerable momentum as a possible replacement for conventional storage technologies. Every new Ceph release brings a number of important improvements and interesting features such as Erasure Coding and Multi-Site replication. Work is under way to make CephFS, the POSIX-compatible Ceph file system, ready for enterprise usage, and the number of companies using Ceph is steadily increasing. More than enough reasons to take a closer look at recent Ceph developments: What's hot and boiling, and which features do the Ceph developers have on their list for implementation next?
Transcript
Page 1: OSDC 2014: Martin Gerhard Loschwitz - What's next for Ceph?

What’s next for Ceph? On the future of scalable storage

Martin Gerhard Loschwitz

© 2014 hastexo Professional Services GmbH. All rights reserved.

Page 2:

Who?

Page 12:

Quick reminder: Object Storage

Page 13:

[Diagram: Users, Objects, and seven HDDs, each with a file system (FS)]

Page 15:

Cephalopod (Wikipedia, user Nhobgood)

Page 17:

RADOS

Page 18:

Redundant Autonomic Distributed Object Store

Page 19:

2 Components

Page 20:

OSDs

Page 21:

[Diagram: Users, Objects, and seven HDDs, each with a file system (FS)]

Page 22:

[Diagram: Users, Objects, and seven OSDs]

Page 23:

Unified Storage

Page 24:

[Diagram: Users, Objects, and seven OSDs]

Pages 25-27:

[Diagram sequence: Users, Objects, and 14 OSDs]

Page 28:

MONs

Page 29:

[Diagram: Users, Objects, 14 OSDs, and three MONs]

Page 30:

Data Placement

Pages 32-38:

[Diagram sequence labelled "MONs"]

Page 39:

Parallelization

Pages 40-43:

[Diagram sequence: blocks labelled "2 2 1 1" (pages 40-41) and "2 2 1 1 1 2 2 1" (page 42); MONs]

Page 44:

CRUSH

Page 45:

Controlled Replication Under Scalable Hashing

Page 46:

By configuring CRUSH, you make the cluster rack-aware.
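
To make "rack-aware" concrete, here is a minimal sketch of the idea only. It is not the real CRUSH algorithm: it uses plain highest-score hashing, and the cluster layout, rack names, and the `place` function are all hypothetical. What it shows is that placement is computed from the object name alone and that the two replicas never share a rack.

```python
import hashlib

# Hypothetical cluster layout: rack name -> OSDs in that rack.
CLUSTER = {
    "rack-a": ["osd.0", "osd.1"],
    "rack-b": ["osd.2", "osd.3"],
    "rack-c": ["osd.4", "osd.5"],
}


def _score(obj_name, item):
    """Deterministic pseudo-random score for an (object, item) pair."""
    digest = hashlib.sha256(f"{obj_name}:{item}".encode()).hexdigest()
    return int(digest, 16)


def place(obj_name, racks, replicas=2):
    """Pick `replicas` OSDs for an object, each from a different rack.

    Toy stand-in for CRUSH: rank the racks by score, take the top ones,
    then take the highest-scoring OSD inside each chosen rack.
    """
    if replicas > len(racks):
        raise ValueError("need at least as many racks as replicas")
    ranked = sorted(racks, key=lambda rack: _score(obj_name, rack), reverse=True)
    return [max(racks[rack], key=lambda osd: _score(obj_name, osd))
            for rack in ranked[:replicas]]


if __name__ == "__main__":
    print(place("my-object", CLUSTER))
```

Because the result depends only on the object name and the layout, any client holding the same layout computes the same placement without asking a central lookup table.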

Page 47:

[Diagram: three MONs and 14 OSDs holding the Users' Objects, with three access paths:]

RADOS Block Device: Block-level interface driver for RADOS

RADOS Gateway: RESTful API to access RADOS

CephFS: POSIX file system access to RADOS

Page 48:

“Booooring!”

Page 49:

Cool Stuff ahead:

Erasure Coding

Tiering

Multi-DC Setups

Automation

CephFS

Enterprise Support

Page 50:

Erasure Coding

Page 51:

[Diagram: blocks labelled "2 2 1 1 1 2 2 1"; MONs]

Page 52:

Until now, Ceph has really worked like a standard RAID 1.

Page 53:

Every binary object exists twice.

Page 54:

[Diagram: blocks labelled "2 2 1 1 2 1 2 1"; MONs]

Page 55:

Works great. But it also reduces the net capacity by 50%.

At least.

Page 56:

That is where Erasure Coding comes in.

It makes Ceph work like a RAID 5.

Page 57:

Mostly developed by Loic Dachary

Page 58:

Idea: Split binary objects into even smaller chunks
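
As a rough illustration of the RAID 5 analogy, the sketch below splits an object into chunks, adds one XOR parity chunk, and rebuilds a lost chunk from the survivors. It assumes plain XOR parity, which tolerates exactly one lost chunk; Ceph's erasure-code plugins use more general codes, and the function names here are hypothetical.

```python
def split(data, k):
    """Split `data` into k equally sized chunks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_parity(chunks):
    """Byte-wise XOR of equally sized chunks."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)


def rebuild(chunks, parity):
    """Recompute the single chunk marked as None from survivors plus parity."""
    missing = chunks.index(None)
    survivors = [c for c in chunks if c is not None] + [parity]
    restored = xor_parity(survivors)
    return chunks[:missing] + [restored] + chunks[missing + 1:]


if __name__ == "__main__":
    data = b"hello ceph object"
    chunks = split(data, 3)
    parity = xor_parity(chunks)
    damaged = [chunks[0], None, chunks[2]]  # pretend one chunk was lost
    restored = rebuild(damaged, parity)
    print(b"".join(restored)[:len(data)])
```

Three data chunks plus one parity chunk occupy 4/3 of the object size, compared with twice the object size for two full copies.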

Pages 59-62:

[Diagram sequence labelled "MONs"]

Page 63:

This reduces the amount of space required for replicas enormously!
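
The space saving can be checked with simple arithmetic. The sketch below compares the usable share of raw capacity for full copies against an erasure-coded layout with k data chunks and m coding chunks. The values k=4 and m=2 and the 100 TB cluster size are assumptions chosen for illustration, not figures from the talk.

```python
def usable_fraction_replicated(copies):
    """Share of raw capacity left when every object is stored `copies` times."""
    return 1 / copies


def usable_fraction_erasure(k, m):
    """Share of raw capacity left with k data chunks plus m coding chunks."""
    return k / (k + m)


if __name__ == "__main__":
    raw_tb = 100  # hypothetical raw cluster size in TB
    print("2 copies:", raw_tb * usable_fraction_replicated(2), "TB usable")
    print("k=4, m=2:", round(raw_tb * usable_fraction_erasure(4, 2), 1), "TB usable")
```

With two copies, half of the raw capacity is usable; with the assumed k=4, m=2 layout, two thirds is usable while two lost chunks can still be tolerated.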

Page 64:

Different replication factors available

Page 65:

But: The lower the level is, the longer it takes to re-calculate missing chunks.

Page 66:

Available in Ceph 0.80.

Page 67:

Tiering

Page 68:

Not all data stored in Ceph is equal.

Page 69:

Frequently accessed, fresh data is usually expected to be served quickly.

Page 70:

Also, customers may be willing to accept slower performance in exchange for lower prices.

Page 71:

Until now, that wasn’t easy to implement in RADOS due to a number of limitations.

Page 72:

With Ceph 0.80, pools make it possible to store data on different hardware, based on its performance.

Page 73:

Wait. Pools?

Page 74:

Pools are a logical unit in RADOS. A pool is a bunch of Placement Groups.

Page 75:

By using tiering, Pools can be tied to specific hardware components.

Page 76:

All replication happens intra-pool

Page 77:

Data may be moved from one pool to another pool in RADOS.
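
The following toy model sketches how data might move between a fast and a slow pool. It is an assumption-laden illustration, not Ceph's tiering implementation: the `TieredStore` class, the least-recently-used demotion policy, and the capacity limit are all hypothetical.

```python
from collections import OrderedDict


class TieredStore:
    """Toy model: a small fast pool in front of a large slow pool."""

    def __init__(self, hot_capacity):
        self.hot = OrderedDict()  # fast pool, ordered from least to most recently used
        self.cold = {}            # slow pool
        self.hot_capacity = hot_capacity

    def write(self, name, data):
        """New or updated objects land in the fast pool."""
        self.cold.pop(name, None)
        self.hot[name] = data
        self.hot.move_to_end(name)
        self._demote()

    def read(self, name):
        """Serve from the fast pool; promote from the slow pool on a miss."""
        if name in self.hot:
            self.hot.move_to_end(name)
            return self.hot[name]
        data = self.cold.pop(name)  # raises KeyError for unknown objects
        self.hot[name] = data
        self._demote()
        return data

    def _demote(self):
        """Move least recently used objects to the slow pool when over capacity."""
        while len(self.hot) > self.hot_capacity:
            name, data = self.hot.popitem(last=False)
            self.cold[name] = data


if __name__ == "__main__":
    store = TieredStore(hot_capacity=2)
    for name in ("a", "b", "c"):
        store.write(name, name.upper())
    print(sorted(store.hot), sorted(store.cold))
```

Reading an object that sits in the slow pool promotes it again, and the least recently used object takes its place there.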

Page 78:

Available in Ceph 0.80.

Page 79:

Multi-DC Setups

Page 80:

Ceph was designed for high-performance, synchronous replication.

Page 81:

Off-Site replication is typically asynchronous.

Page 82:

Bummer!

Page 83:

But starting with Ceph 0.67, the RADOS Gateway supports “Federation”.

Pages 84-85:

[Diagram sequence: DC 1 and DC 2, each with MONs, a RADOS Gateway, and a Sync-Agent; an item labelled "1" appears once on page 84 and twice on page 85]

Page 86:

In fact, the federation feature adds asynchronous replication on top of the RADOS storage cluster.

Page 87:

Still needs better integration with the other Ceph components.

Page 88:

Automation

Page 89:

Ceph clusters will almost always be deployed using tools for automation.

Page 90:

Thus, Ceph needs to work well with Chef, Puppet & Co.

Page 91:

Chef: Yay!

Chef cookbooks are maintained and provided by Inktank.

Page 92:

Puppet: Ouch

Inktank does not provide Puppet modules for Ceph deployment.

Page 93:

Right now, at least 6 competing modules exist on GitHub, some of them forks of each other.

Page 94:

None of these use ceph-deploy, though.

Page 95:

But there is hope: puppet-cephdeploy does use ceph-deploy

Page 96:

Needs some additional work, but generally looks very promising and already works.

Page 97:

Plays together nicely even with ENCs (external node classifiers) such as the Puppet Dashboard or the Foreman project.

Page 99:

CephFS

Page 100:

Considered vaporware by some people already.

But that’s not fair!

Page 101:

CephFS is already available and works.

Well, mostly.

Page 102:

For CephFS, the really critical component is the Metadata Server (MDS)

Page 103:

Running CephFS today with exactly one active MDS is fine and will most likely not cause trouble.

Page 104:

But Sage wants the MDS to scale out properly so that running several active MDSes at a time works.

Page 105:

That’s called Subtree Partitioning. Every active MDS will be responsible for the metadata of a certain subtree of the POSIX-compatible FS.
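
A minimal sketch of the subtree idea, assuming a static, hand-written map from directory subtrees to MDS names: the most specific subtree containing a path decides which MDS answers for it. The map, the MDS names, and the `responsible_mds` function are hypothetical; how Ceph actually assigns and rebalances subtrees is not shown here.

```python
import posixpath

# Hypothetical assignment of directory subtrees to active MDS daemons.
SUBTREES = {
    "/": "mds.a",
    "/home": "mds.b",
    "/home/alice/build": "mds.c",
}


def responsible_mds(path, subtree_map):
    """Return the MDS owning the most specific subtree that contains `path`."""
    path = posixpath.normpath(path)
    best_root = None
    for root in subtree_map:
        inside = (root == "/" or path == root
                  or path.startswith(root.rstrip("/") + "/"))
        if inside and (best_root is None or len(root) > len(best_root)):
            best_root = root
    if best_root is None:
        raise LookupError(f"no subtree covers {path}")
    return subtree_map[best_root]


if __name__ == "__main__":
    for p in ("/home/alice/build/out.o", "/home/bob", "/etc/hosts"):
        print(p, "->", responsible_mds(p, SUBTREES))
```

The prefix check compares whole path components, so `/homework` falls under `/` rather than `/home`.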

Page 106:

Right now, Subtree Partitioning is what’s causing trouble.

Page 107:

CephFS is not Inktank’s main priority; likely to be released as “stable” in Q4 2014.

Page 108:

Enterprise Support

Page 109:

Major companies willing to run Ceph need some type of support contract.

Page 110:

Inktank has started to offer that support through a product called “Inktank Ceph Enterprise” (ICE).

Page 111:

Gives users long-term support for certain Ceph releases (such as 0.80) and hot-fixes for problems.

Page 112:

Also brings Calamari, Inktank’s Ceph GUI

Page 114:

Distribution Support

Page 115:

Inktank already does a lot to make installing Ceph on different distributions as smooth as possible.

Page 116:

Ye olde OSes:

Ubuntu 12.04

Debian Wheezy

RHEL 6

SLES 11

Page 117:

Ubuntu 14.04: May 2014

Page 118:

RHEL 7: December 2014

Page 119:

Release Schedule

Page 120:

Firefly (0.80): May 2014, along with ICE 1.2

Page 121:

Giant: Summer 2014 (Non-LTS version)

Page 122:

The “H”-release: December 2014, along with ICE 2.0

Page 123:

Ceph Days

Page 124:

Ceph Days are information events run by Inktank all over the world.

Page 125:

Two have happened in Europe so far:

London (October 2013)

Frankfurt (February 2014)

Page 126:

Ceph Days let you meet others interested in using Ceph and exchange experiences.

Page 127:

And you can meet Sage Weil

Page 128:

No shit. You can meet Sage Weil!

Page 129:

Special thanks to

Sage Weil (Twitter: @liewegas) & Crew for Ceph

Inktank (Twitter: @inktank) for the Ceph logo

Page 130:

[email protected]

goo.gl/S1sYZ (me on Google+)

twitter.com/hastexo

hastexo.com
