2012 Storage Developer Conference. © EMC. All Rights Reserved.2014 Storage Developer Conference. © EMC Corporation. All Rights Reserved.

Implementing Witness service for various cluster failover scenarios

Rafal Szczesniak, EMC/Isilon

Long time ago vs. now

SMB1 – no high availability at all

SMB2 – durable and resilient handles (file opens)

SMB3 – persistent handles, multi-channel and Witness

What is Witness?

DCE/RPC interface (see [MS-SWN])

Service providing early detection of connection failures instead of relying on TCP timeouts

Means of (partial) control over client connections

OneFS cluster

[Diagram: clients reach the cluster through the Ethernet client/application layer (NFS, SMB, FTP, HTTP, HDFS); the Isilon IQ storage layer nodes communicate intracluster over InfiniBand; optional 2nd switches are shown]

Witness Service in OneFS cluster

[Diagram: OneFS cluster layout (clients, Ethernet client/application layer, Isilon IQ storage layer, InfiniBand intracluster communication, optional 2nd switches), annotated with the SMB Connection and the Witness Registration]

Interfaces and Groups

Interface group as an abstraction of cluster nodes’ network interfaces

Usually the same as OneFS address pool

Separate groups for separate OneFS Access Zones

Caching the state of interfaces

Requesting the interface information from the system all the time can be expensive

The interface state does not change very often

We can cache the information for as long as we need it

Caching on-demand

The internal list of interfaces is populated when needed

The number of interfaces can be substantial, especially in a cluster with multiple Access Zones

Updating a large cache could be expensive too, so it’s easier to keep track of only those interfaces the clients ask about
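
The on-demand idea could be sketched roughly as follows. This is a minimal Python illustration, not the OneFS implementation: `InterfaceCache`, `query_system`, the addresses, and the state strings are all hypothetical names introduced for the example.

```python
from typing import Callable, Dict, List


class InterfaceCache:
    """Caches interface state only for interfaces clients have asked about."""

    def __init__(self, query_system: Callable[[str], str]) -> None:
        # query_system stands in for the (expensive) request to the system.
        self._query_system = query_system
        self._states: Dict[str, str] = {}

    def get(self, address: str) -> str:
        # Populate the entry on first request; later requests hit the cache.
        if address not in self._states:
            self._states[address] = self._query_system(address)
        return self._states[address]

    def update(self, address: str, state: str) -> bool:
        # Apply a change only to tracked interfaces; changes to interfaces
        # nobody asked about are ignored, which keeps updates cheap.
        if address in self._states:
            self._states[address] = state
            return True
        return False

    def forget(self, address: str) -> None:
        # Drop the entry once no client needs it any more.
        self._states.pop(address, None)


calls: List[str] = []


def fake_query(address: str) -> str:
    # Hypothetical system query that records how often it is called.
    calls.append(address)
    return "available"


cache = InterfaceCache(fake_query)
cache.get("10.0.0.1")
cache.get("10.0.0.1")                     # served from the cache
cache.update("10.0.0.2", "unavailable")   # untracked interface, ignored
```

Only the first `get` reaches the system; the update for the untracked address is dropped without touching the cache.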

Resource monitor

Thin layer providing access to the cluster “resources”

The only resources monitored (at the moment): Interface, Interface Group

Allows querying the current information

Allows subscribing to events and unsubscribing when the server is no longer interested in updates
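
A thin layer with these three operations (query, subscribe, unsubscribe) might look like the sketch below. The class and method names, the resource naming scheme, and the `publish` helper are assumptions made for illustration; the slides do not show the actual interface.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Callback = Callable[[str, str], None]  # called with (resource, new state)


class ResourceMonitor:
    """Hypothetical thin layer over cluster resources."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def query(self, resource: str) -> str:
        # Return the current information, or "unknown" if never reported.
        return self._state.get(resource, "unknown")

    def subscribe(self, resource: str, callback: Callback) -> None:
        self._subscribers[resource].append(callback)

    def unsubscribe(self, resource: str, callback: Callback) -> None:
        if callback in self._subscribers[resource]:
            self._subscribers[resource].remove(callback)

    def publish(self, resource: str, state: str) -> None:
        # Record the new state and deliver it to current subscribers.
        self._state[resource] = state
        for callback in list(self._subscribers[resource]):
            callback(resource, state)


seen: List[Tuple[str, str]] = []


def on_change(resource: str, state: str) -> None:
    seen.append((resource, state))


monitor = ResourceMonitor()
monitor.subscribe("interface:10.0.0.1", on_change)
monitor.publish("interface:10.0.0.1", "unavailable")
monitor.unsubscribe("interface:10.0.0.1", on_change)
monitor.publish("interface:10.0.0.1", "available")   # no longer delivered
```

After unsubscribing, the server can still query the latest state but receives no further events for that resource.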

What does availability mean?

Network interface failures

Server process crashes or deadlocks

System crashes

Resource monitor modules and events

Individual modules can keep track of all sorts of things independently

Subscribing to certain (or any) changes enables a module to submit events to an Interface or Interface Group

Witness server has the authority to filter the events and make its own decisions on how the clients should be notified

Resource events

Virtually any change happening to a subscribed resource can generate an event

Examples of events to watch for:

Interface state change to unavailable

New interface added to an Interface Group

Submitted events are “pre-treated” by the server before they are used to generate client notifications

Resource events (contd.)

Modules have a large degree of freedom in what can cause an event submission

The server has the authority to say which events will turn into the actual notifications

Resource event

What does it include?

Module Id

Type of event (changed/added/removed)

Resource

Destination (optional, if the module has any suggestions)
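
The four fields listed above map naturally onto a small record type. The sketch below is one possible shape in Python; the field names, the enum values, and the example module and resource identifiers are assumptions, since the slides only name the fields.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    CHANGED = "changed"
    ADDED = "added"
    REMOVED = "removed"


@dataclass(frozen=True)
class ResourceEvent:
    module_id: str                      # which module submitted the event
    event_type: EventType               # changed / added / removed
    resource: str                       # the Interface or Interface Group
    destination: Optional[str] = None   # optional suggestion from the module


# A plain state change, with no suggestion about where clients should go.
event = ResourceEvent("network", EventType.CHANGED, "interface:10.0.0.1")

# The same kind of event carrying a suggested destination.
move = ResourceEvent(
    "maintenance",
    EventType.CHANGED,
    "interface:10.0.0.1",
    destination="interface:10.0.0.2",
)
```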

[Diagram: Interface events queue]

Keeping track of the availability

Multiple different modules look at different aspects of availability

We need all of them to give us a “go” in order to consider an Interface available

The Witness server updates a list of Problems for each Interface as “go” and “no-go” signals arrive in their respective events

The list is empty = There are no problems = The interface is available
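
The "empty list means available" rule can be illustrated with a short Python sketch. `AvailabilityTracker`, its method names, and the module identifiers are hypothetical; treating an interface with no reports at all as available is also an assumption of this example.

```python
from typing import Dict, Set


class AvailabilityTracker:
    """Keeps a set of Problems (module ids reporting no-go) per interface."""

    def __init__(self) -> None:
        self._problems: Dict[str, Set[str]] = {}

    def report(self, interface: str, module_id: str, ok: bool) -> bool:
        # Record a go (ok=True) or no-go (ok=False) from one module and
        # return whether the interface is available afterwards.
        problems = self._problems.setdefault(interface, set())
        if ok:
            problems.discard(module_id)
        else:
            problems.add(module_id)
        return self.is_available(interface)

    def is_available(self, interface: str) -> bool:
        # No problems recorded means the interface is available.
        return not self._problems.get(interface)


tracker = AvailabilityTracker()
tracker.report("10.0.0.1", "network", ok=False)            # link went down
tracker.report("10.0.0.1", "process", ok=False)            # server process died
after_link_up = tracker.report("10.0.0.1", "network", ok=True)
after_restart = tracker.report("10.0.0.1", "process", ok=True)
```

Here `after_link_up` is still `False`, because the process module has not yet given its "go"; only `after_restart` is `True`.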

Updating interface state

Any module can submit events to an interface at any time (given subscriptions)

Witness server starts a work item (a function started in a separate thread) to process the events

After processing, subsequent work items are started to queue notifications in each individual client registration

Work items queuing the notifications resume execution of the pending asynchronous requests and send the responses to the Witness clients
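
The two-stage flow above (one work item to process events, then one per client registration to queue notifications) could be sketched with a thread pool. Everything here is illustrative: the function names, the duplicate-dropping filter, and the string events are assumptions, and resuming the asynchronous request is reduced to putting items on a per-registration queue.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import List


class Registration:
    """One client registration with its own notification queue."""

    def __init__(self, client: str) -> None:
        self.client = client
        self.notifications = queue.Queue()


def process_events(events: List[str]) -> List[str]:
    # Stand-in for the server's processing: drop duplicates, keep order.
    return list(dict.fromkeys(events))


def queue_notifications(registration: Registration, notes: List[str]) -> int:
    # A real server would also resume the client's pending asynchronous
    # request here; this sketch only queues the notifications.
    for note in notes:
        registration.notifications.put(note)
    return len(notes)


def handle_submission(pool: ThreadPoolExecutor, events: List[str],
                      registrations: List[Registration]) -> int:
    # First work item: process the submitted events in a worker thread.
    notes = pool.submit(process_events, events).result()
    # Subsequent work items: one per individual client registration.
    futures = [pool.submit(queue_notifications, r, notes) for r in registrations]
    return sum(f.result() for f in futures)


registrations = [Registration("client-a"), Registration("client-b")]
with ThreadPoolExecutor(max_workers=4) as pool:
    queued = handle_submission(
        pool, ["iface down", "iface down", "iface added"], registrations
    )
```

Keeping a separate queue per registration means a slow client only delays its own notifications, not those of other clients.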

[Diagrams: Updating interface state, shown in four stages: submit, process, wake up, notify]

Resource monitor modules

Different modules can keep track of different things independently

Each module handles its specific failover scenario

Scenario: Testing

A module with an IPC interface and a command line client simulates the network interfaces and groups and their changes

Can create and keep an arbitrary number of groups and interfaces

Useful for simulating unusual events

[Diagram: Testing module (netsim)]

Scenario: Network interface failure

Wired to OneFS cluster networking configuration (Flexnet)

Interface and address pool information received from the system service

Waiting for changes in a separate thread watching individual address pools

Notified through file descriptors
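
Waiting on file descriptors in a separate thread can be illustrated with `select` on ordinary pipes. This is a Unix-only sketch under stated assumptions: the pipes stand in for whatever descriptors the system service provides, and the pool names and the one-byte wake-up convention are invented for the example.

```python
import os
import select
import threading
from typing import Dict, List


def watch_pools(pool_fds: Dict[int, str], changes: List[str], expected: int) -> None:
    # Block until a descriptor becomes readable, then record which
    # address pool signalled a change.
    while len(changes) < expected:
        readable, _, _ = select.select(list(pool_fds), [], [])
        for fd in readable:
            os.read(fd, 1)                  # consume the one-byte wake-up
            changes.append(pool_fds[fd])


# One pipe per address pool (hypothetical pool names).
read_a, write_a = os.pipe()
read_b, write_b = os.pipe()
changes: List[str] = []

watcher = threading.Thread(
    target=watch_pools,
    args=({read_a: "pool-a", read_b: "pool-b"}, changes, 2),
)
watcher.start()

os.write(write_b, b"x")                     # pool-b changes
os.write(write_a, b"x")                     # pool-a changes
watcher.join(timeout=5)

for fd in (read_a, write_a, read_b, write_b):
    os.close(fd)
```

The watcher thread sleeps inside `select` until a pool changes, so no polling of the networking configuration is needed.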

Flexnet Service in OneFS cluster

[Diagram: OneFS cluster layout (clients, Ethernet client/application layer with NFS, SMB, FTP, HTTP, HDFS, Isilon IQ storage layer, InfiniBand intracluster communication, optional 2nd switches), with Flexnet marked]

[Diagram: Network module]

Scenario: Server process failure

OneFS Group Manager watching other nodes in the cluster provides the feed

It can keep track of the state of certain processes on other nodes

The module gets notified about the changes in the same way as the Network module

Group Manager in OneFS cluster

[Diagram: OneFS cluster layout (clients, Ethernet client/application layer with NFS, SMB, FTP, HTTP, HDFS, Isilon IQ storage layer, InfiniBand intracluster communication, optional 2nd switches), with the Group Manager marked]

Scenario: Maintenance

Sometimes we need to gracefully take a node off the cluster

Existing client connections should “go away”

The module can make the node interfaces look unavailable

It can also move all connections to a different node or even a completely different group
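
A maintenance module of this kind might produce one event per interface on the node being drained. The following Python sketch is only an illustration of that idea: `MaintenanceEvent`, `drain_node`, the `"unavailable"` and `"move"` action names, and the node and address values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MaintenanceEvent:
    resource: str                  # interface on the node being drained
    action: str                    # "unavailable" or "move"
    destination: Optional[str]     # target for a move, otherwise None


def drain_node(node: str, interfaces: Dict[str, List[str]],
               move_to: Optional[str] = None) -> List[MaintenanceEvent]:
    # Without a target, make every interface of the node look unavailable;
    # with a target, ask for the connections to be moved there instead.
    action = "move" if move_to is not None else "unavailable"
    return [
        MaintenanceEvent(iface, action, move_to)
        for iface in interfaces.get(node, [])
    ]


cluster = {
    "node-1": ["10.0.0.1", "10.0.0.2"],
    "node-2": ["10.0.0.3"],
}
hide = drain_node("node-1", cluster)
move = drain_node("node-1", cluster, move_to="10.0.0.3")
```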

Beyond failover

Witness “move” notification can be used for load balancing

What would it take?

Connection resource type (to have control over individual connections)

A module checking the load on other nodes and requesting the move if one of them is overloaded (perhaps another use for witness)

Beyond Witness itself

Witness RPC is, in fact, not tightly tied to the SMB protocol

Information provided by the Resource Monitor (network interfaces status) may be useful for other services, too

Thank you!

Questions?

Rafal Szczesniak

[email protected]
