
IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Transcript
Page 1: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

© 2015 IBM Corporation

AME-2273 IBM MQ: Managing workloads, scaling and availability with MQ clusters

David Ware

Page 2: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

2

Agenda

• Why a cluster?

• Service availability

• Location dependency

• Avoiding interference

• Clusters and DR

Page 3: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

3

Introduction

• This session breaks with tradition for clustering sessions by approaching the topic from the point of view of a set of common clustering scenarios or ‘use cases’.

• We will build up from a fairly simple and common initial clustering setup to tackle some of the more complicated issues which often come up in evolving clustered environments.

• Although we are not looking at the topic as a list of features or a “how do I use widget x” exercise, as we work through the examples we will see where some of the recent additions to IBM MQ’s clustering capabilities are relevant to these everyday problems.

Page 4: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

4

Terminology / Key

Lots of different terminology floats around when you get into application infrastructure discussions… clients, applications, servers, services, requesters, responders… For the purposes of this session:

Client – a ‘client’ in the general sense (whether connected locally or remotely); uses IBM MQ to send one-off datagrams or to initiate requests to services and wait for replies.

Service – a process which consumes messages and takes some action, often needing to reply to the requesting Client.

Note that there may be more than one instance of a client or service, either connected to a given queue manager or to the infrastructure as a whole.

A set of clients and services working together to achieve some useful end goal make up an ‘application’.

[Key: the diagrams use ‘QMgr’ for a queue manager, with ‘Full repository’ marking the queue managers that hold a full cluster repository]

Page 5: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

5

Why a Cluster?


Page 6: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

6

Where it all begins…

[Diagram: Client 1 and Service 1, each attached to a queue manager]

Page 7: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

7

Where it all begins…

[Diagram: Client 1 and Service 1 on separate queue managers, connected by manually defined channels]

Page 8: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

8

Over time…

[Diagram: more clients (Client 1–3) and services (Service 1–3) appear across several queue managers, each connection needing its own manually defined channels]

Page 9: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

9

Over time…

[Diagram: the environment grows to many queue managers hosting clients and services (Client 1–4, App 1, App 4, Service 1–4), with an ever denser mesh of manually defined channels between them]

Page 10: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

10

…Now you need a cluster

[Diagram: the same sprawling topology – at this scale, manually administering the channel mesh is no longer practical]

Page 11: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

11

Basic Cluster

• This illustrates the first reason we may move to an IBM MQ cluster – simplified administration. Channel definitions – and, if we choose, the addressing of individual queues within the cluster – are tasks which no longer have to be carried out manually by the administrator (see the sketch below).

• At this point we still only have single instances of ‘service’ queues.

• A degree of vertical scaling can be achieved by adding instances of the Service processes connecting to the single queue manager hosting a particular queue, if the application can cope with ‘interleaved’ processing.
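A minimal sketch of what that simplified administration looks like in MQSC – the cluster name DEMO, queue manager names and connection names here are invented for illustration. Two full repositories anchor the cluster; every other queue manager needs only one cluster receiver and one cluster sender, and clustering a queue is a single attribute:

* On FR1, one of the two full repositories:
ALTER QMGR REPOS(DEMO)
DEFINE CHANNEL(TO.FR1) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME('fr1.example(1414)') CLUSTER(DEMO)
DEFINE CHANNEL(TO.FR2) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('fr2.example(1414)') CLUSTER(DEMO)

* On QM1, a partial repository hosting the service queue:
DEFINE CHANNEL(TO.QM1) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME('qm1.example(1414)') CLUSTER(DEMO)
DEFINE CHANNEL(TO.FR1) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME('fr1.example(1414)') CLUSTER(DEMO)
DEFINE QLOCAL(SERVICE.Q) CLUSTER(DEMO)

No manually paired sender/receiver channels per destination are needed; the repositories propagate the rest.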

Page 12: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

12

Starting to scale horizontally…

• The second reason for a cluster…
• Workload balancing
• Service availability

[Diagram: Client 1 opens QUEUE1, which now has an instance on each of two queue managers, each with its own Service 1 instance]

Page 13: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

13

Starting to scale horizontally

• By adding instances of ‘service’ queues, we can start to scale applications across multiple hosts, beyond the ‘vertical’ scaling possible on a single (albeit maybe multi-CPU) system – see the sketch below.

• Using IBM MQ clustering allows instances to be added and removed without modifying the client or server applications
• But it may require some thinking ahead to design with this in mind – e.g. avoiding hard-coded queue manager names

• As soon as we have multiple instances, questions arise about ‘choosing’ an instance for a request, so workload balancing becomes available as a natural consequence

• Location transparency for ‘the service’ has been achieved at this point, but there is still a strong coupling between ‘an instance’ and a particular queue manager – for example for reply routing (see next section)

Page 14: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Availability

Page 15: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

15

Where can the messages get stuck?

• Target cluster queues
• Cluster transmission queues

[Diagram: a request from Client 1 can be queued on the sending queue manager’s cluster transmission queue or on the target cluster queue at the service queue manager]

Page 16: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

16

The service queue manager/host fails

When a queue manager fails:
• Ensure messages are not bound to it

Message reallocation: unbound messages on the transmission queue can be diverted to another instance.

[Diagram: messages queued for the failed queue manager are reallocated to the surviving Service 1 instance]

Page 17: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

17

The service queue manager/host fails

When a queue manager fails:
• Ensure messages are not bound to it
• Restart it to release queued messages

Locked messages: messages on the failed queue manager are locked until it is restarted.

Restart the queue manager: use multi-instance queue managers or HA clusters to restart a queue manager automatically.

Reconnect the service: make sure the service is restarted and reconnects to the restarted queue manager.

[Diagram: the failed queue manager is restarted and its trapped messages are released for processing]

Page 18: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

18

Failed Service Queue Manager

• When a ‘service’ queue manager fails, request messages which have reached that queue manager, and responses on its transmission queues, are inevitably trapped until that queue manager can be restarted.

• However, service availability can be maintained by making sure that there is sufficient capacity on other hosts to cope with all requests being loaded onto them.

• This will be smoother and give higher availability if client applications can be designed to avoid server affinity and strict message ordering requirements – BIND_NOT_FIXED. Reallocation will then mean that even in-flight requests can be re-routed (see the sketch below).

• To avoid the trapped-request problem, consider HA clustering technology or multi-instance queue managers.
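A minimal sketch of the BIND_NOT_FIXED point, again using the hypothetical SERVICE.Q – with DEFBIND(NOTFIXED) on the queue, applications that open with MQOO_BIND_AS_Q_DEF get not-fixed binding, so their messages waiting on a cluster transmission queue for a failed destination become eligible for reallocation:

* On each queue manager hosting an instance of the queue:
ALTER QLOCAL(SERVICE.Q) DEFBIND(NOTFIXED)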

Page 19: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Service application availability

19

Page 20: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

20

The service application fails

• Cluster workload balancing does not take into account the availability of receiving applications
• Or a build-up of messages
• Messages will continue to be routed to a queue that is not being processed

Blissful ignorance: this queue manager is unaware of the failure of one of the service instances.

Unserviced messages: half the messages will quickly start to build up on the unserviced queue.

[Diagram: Client 1’s messages continue to be split 50/50 across both queue instances even though one Service 1 instance has failed]

Page 21: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

21

Monitoring for service failures

• IBM MQ provides a sample monitoring service
• It regularly checks for attached consuming applications
• It updates the cluster workload balancing configuration to reflect this
• So, what happens?...
• Generally suited to steady-state service applications

1. Detecting a change: when a change to the open handles is detected, the cluster workload balancing state is modified.

2. Sending queue managers: newly sent messages will be sent to active instances of the queue.

3. Moving messages: any messages that slipped through will be transferred to an active instance of the queue.

Page 22: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

22

Cluster Queue Monitoring Sample

• amqsclm is provided (since 7.0.1.8) to ensure messages are directed towards the instances of clustered queues that have consuming applications currently attached. This allows all messages to be processed effectively even when a system is asymmetrical (i.e. consumers are not attached everywhere).

• In addition, it will move already-queued messages from instances of the queue where no consumers are attached to instances of the queue with consumers. This removes the chance of long-term marooned messages when consuming applications disconnect.

• The above allows for more versatility in the use of clustered queue topologies where applications are not under the direct control of the queue managers. It also gives a greater degree of high availability in the processing of messages.

• The tool provides a monitoring executable to run against each queue manager in the cluster hosting queues, monitoring the queues and reacting accordingly.

• The tool is provided as source (the amqsclm.c sample) to allow the user to understand its mechanics and customise it where needed.

Page 23: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

23

AMQSCLM Logic

• Based on the existing MQ cluster workload balancing mechanics:
• Uses the cluster priority of individual queues – all else being equal, preferring to send messages to instances of queues with the highest cluster priority (CLWLPRTY).
• Using CLWLPRTY always allows messages to be put to a queue instance, even when no consumers are attached to any instance.
• Changes to a queue’s cluster configuration are automatically propagated to all queue managers in the cluster that are workload balancing messages to that queue.

• Single executable, set to run against each queue manager with one or more cluster queues to be monitored.

• The monitoring process polls the state of the queues on a defined interval:
• If no consumers are attached:
– CLWLPRTY of the queue is set to zero (if not already set).
– The cluster is queried for any active (positive cluster priority) queues.
– If they exist, any queued messages on this queue are got/put to the same queue. Cluster workload balancing will re-route the messages to the active instance(s) of the queue in the cluster.
• If consumers are attached:
– CLWLPRTY of the queue is set to one (if not already set).

• Defining the tool as a queue manager service will ensure it is started with each queue manager (see the sketch below).
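A hedged sketch of that service definition – the DEFINE SERVICE object and the +QMNAME+ insert are standard MQSC, but the executable path and the -m/-q arguments shown here are illustrative assumptions; check the options documented in the amqsclm sample itself:

DEFINE SERVICE(CLUSTER.Q.MON) +
       CONTROL(QMGR) +
       SERVTYPE(SERVER) +
       STARTCMD('/opt/mqm/samp/bin/amqsclm') +
       STARTARG('-m +QMNAME+ -q SERVICE.*')

CONTROL(QMGR) starts and stops the monitor with the queue manager, which is exactly the behaviour the bullet above describes.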

Page 24: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Client failures

Page 25: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

25

Client host failure with an in-flight request/response

• Reply messages are bound to the originating queue manager, with no ability to redirect.
• This may or may not be what’s required…

Request message: typically a request message will fill in the reply’s ReplyToQMgr based on the outbound queue manager.

Reply message bound: the reply message will be locked to that outbound queue manager.

[Diagram: the client queue manager fails; replies for it build up with nowhere else to go]

Page 26: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

26

Unlocking a response message

• Reply-to queue aliases and reply-to queue manager aliases can be used to blank out the outbound resolution of the ReplyToQMgr field.
• But watch out: replies will now go to queue managers where the requestor may not be connected – this only suits some scenarios.

On the client queue managers:
DEF QLOCAL(REPLYQ) CLUSTER(CLUSTER1)
DEF QREMOTE(REPLYQALIAS) RNAME(REPLYQ) RQMNAME(DUMMY)

On the service queue manager:
DEF QREMOTE(DUMMY) RNAME(' ') RQMNAME(' ')

Requesting application: the request message sets ReplyToQ to ‘REPLYQALIAS’ and ReplyToQMgr to ‘ ’.

Name resolution: the outgoing request resolves the ReplyToQ to be ‘REPLYQ’ and the ReplyToQMgr to be ‘DUMMY’.

Replying application: the application replies to ‘REPLYQ’ on queue manager ‘DUMMY’.

Name resolution: target queue manager ‘DUMMY’ is resolved to ‘ ’, allowing cluster resolution to occur.

Page 27: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

27

Failed ‘client’ queue manager

• Traditional applications will use the ReplyToQMgr which was set on the outgoing request, so you may need to consider a reply-to queue alias to route responses through workload balancing.

• Managing reconnection is beyond the scope of this session, and in an ideal world the client will reconnect to the same queue manager (which may involve HA clusters or multi-instance queue managers), however…

• Clustered reply queues give various possibilities. The simplest case is ‘shared responses’, but that is not really worth discussing further… let’s assume we need to get back to a particular client ‘instance’.

1) Can use priority to prefer the ‘usual’ location. Using some form of polling perhaps, ensure the client connects/reconnects to a particular queue manager whenever it is up. If it is down, the client and its replies fail over to a backup (see the sketch below).

2) OR: can use AMQSCLM again to get replies to follow the connection.
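A minimal sketch of option 1, assuming the REPLYQ from the previous slide is clustered on the client’s ‘usual’ queue manager and on a backup – the higher CLWLPRTY instance is preferred while its channels are available, and replies fail over to the backup otherwise:

* On the client’s usual queue manager:
DEF QLOCAL(REPLYQ) CLUSTER(CLUSTER1) CLWLPRTY(9)
* On the backup queue manager:
DEF QLOCAL(REPLYQ) CLUSTER(CLUSTER1) CLWLPRTY(4)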

Page 28: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Location Dependency

28

Page 29: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

29

Global applications

• Prefer traffic to stay geographically local

[Diagram: two regions, USA (New York) and Europe (London), each with local clients, services and queue managers]

Page 30: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

30

Global applications

• Prefer traffic to stay geographically local
• Except when you have to look further afield
• How do you do this with clusters?

[Diagram: one region’s services are unavailable, so its clients need to reach the services in the other region]

Page 31: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

31

One cluster

• A single cluster is often the simplest and best approach even when large distances are involved – a cluster certainly doesn’t have to be limited to a particular datacenter.

• However, often for obvious reasons we would rather keep as much traffic as possible ‘local’, and we would like to know that if we lose our connection to the outside world for a period of time, things can keep running.

• Conversely though, if a particular service is down locally, we’d like to make use of the remote instance (even if it may be a bit slower than usual).

• Finally, we’d like our applications to ‘look the same’ wherever we connect – the deploying administrator might know this instance is running in London, but does the application really have to be coded to cope with that?

Page 32: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Setting this up

Page 33: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

33

One cluster

• Clients always open AppQ
• A local alias determines the preferred region
• Cluster workload priority is used to target the geographically local cluster aliases
• Use of CLWLPRTY enables automatic failover
• CLWLRANK can be used for manual failover

On the New York client queue managers:
DEF QALIAS(AppQ) TARGET(NYQ)

On the London client queue managers:
DEF QALIAS(AppQ) TARGET(LonQ)

On the New York service queue managers (local ReqQ preferred):
DEF QALIAS(NYQ) TARGET(ReqQ) CLUSTER(Global) CLWLPRTY(9)
DEF QALIAS(LonQ) TARGET(ReqQ) CLUSTER(Global) CLWLPRTY(4)

On the London service queue managers:
DEF QALIAS(LonQ) TARGET(ReqQ) CLUSTER(Global) CLWLPRTY(9)
DEF QALIAS(NYQ) TARGET(ReqQ) CLUSTER(Global) CLWLPRTY(4)

[Diagram: clients in each region open AppQ; the alias chain resolves to the local ReqQ instance while it is available, and to the remote region’s instance otherwise]

Page 34: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

34

The two cluster alternative

• The service queue managers join both geographical clusters
• Each has separate cluster receivers for each cluster, at different cluster priorities; queues are clustered in both clusters
• The client queue managers are in their local cluster only

[Diagram: the New York and London service queue managers sit in both the USA and EUROPE clusters; the client queue managers sit only in their local cluster]

Page 35: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

35

Two cluster approach

• Define two clusters: USA and EUROPE.

• For service queue managers, define separate cluster receiver channels, one for each cluster. Set CLWLPRTY high on the one for the local cluster and low for the remote one.

• For service queue managers in New York:
– DEFINE CHANNEL(USA.NYxx) CHLTYPE(CLUSRCVR) …. CLUSTER(USA) CLWLPRTY(9)
– DEFINE CHANNEL(EUROPE.NYxx) CHLTYPE(CLUSRCVR) …. CLUSTER(EUROPE) CLWLPRTY(4)

• For service queue managers in London:
– DEFINE CHANNEL(EUROPE.LONxx) CHLTYPE(CLUSRCVR) …. CLUSTER(EUROPE) CLWLPRTY(9)
– DEFINE CHANNEL(USA.LONxx) CHLTYPE(CLUSRCVR) …. CLUSTER(USA) CLWLPRTY(4)

• Define a namelist on each service queue manager that contains both clusters and use this when clustering queues:
– DEFINE NAMELIST(GLOBAL) NAMES(USA,EUROPE)
– DEFINE QLOCAL(QUEUE1) CLUSNL(GLOBAL)

• Client queue managers only join the cluster that is local to them.

• The client queue managers will choose the instances of queues that are on queue managers with the highest CLWLPRTY on the channel.

• For example, a queue manager in the EUROPE cluster will only see the EUROPE.* channels. So London queue managers will have a CLWLPRTY of 9 and New York queue managers only 4, preferring London whilst it is available.

Page 36: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Avoiding interference

36

Page 37: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

37

One cluster, one pipe

• Often an IBM MQ backbone will be used for multiple types of traffic:
• Real-time queries
• Big data transfer
• Audit events

[Diagram: several client/service pairs share the same backbone – real-time queries, big data transfer and audit events all flow through it]

Page 38: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

38

One cluster, one pipe

• Often an IBM MQ backbone will be used for multiple types of traffic
• When using a single cluster and the same queue managers, messages all share the same channels
• Even multiple cluster receiver channels in the same cluster will not separate out the different traffic types

[Diagram: all the application pairs’ traffic funnels through the same cluster channels between the same queue managers]

Page 39: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

39

The Cluster as the ‘pipe’ – problems

• Mice and elephants
• Large non-real-time data contends for resources with small ‘live’ request/response transactions
• With due attribution to T-Rob: http://www.ibm.com/developerworks/websphere/techjournal/0804_mismes/0804_mismes.html

• All workload balancing is at the messaging/channel level
• No distinction between a request that needs a week of CPU at the other end and one which needs 1 ms

• Pub/sub requires high ‘meshing’ – all queue managers aware of the whole cluster
• Potentially lots of channel work for hosts not interested in pub/sub activity when superimposed on an existing cluster

• Denial of service potential
• One application out of control = a full cluster transmit queue until someone can manually intervene

Page 40: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

40

One cluster, one pipe

• Often an IBM MQ backbone will be used for multiple types of traffic
• When using a single cluster and the same queue managers, messages all share the same channels
• Even multiple cluster receiver channels in the same cluster will not separate out the different traffic types
• Multiple overlaid clusters with different channels enable separation

[Diagram: the same queue managers overlaid with separate clusters, each traffic type now flowing over its own channels]

Page 41: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

41

The Cluster as the ‘pipe’

• Putting an application in a separate cluster gives the option of also giving it its own channel

• Applications with a need for a strictly controlled WLM ratio can be given their own clusters for this reason. However, bear in mind the cost of too many overlapping clusters
– RFE 23391

• In general, try to group applications with similar requirements rather than ending up with a channel for every application
• Real time / batch / pub-sub

• Applications don’t need to know which cluster their resources are in, as long as configuration is managed correctly on their behalf

• New in WebSphere MQ 7.1: pub/sub can be limited to specialised clusters/queue managers using the PSCLUS attribute (see the sketch below)

• New in WebSphere MQ 7.5: channels for different clusters can also be separated at the transmission queue level
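A minimal sketch of the PSCLUS point – on queue managers in clusters that should carry no clustered pub/sub, the attribute can be turned off (it defaults to ENABLED):

* Exclude this queue manager from participating in clustered pub/sub:
ALTER QMGR PSCLUS(DISABLED)

Set consistently across a cluster, this avoids the extra ‘meshing’ channel traffic that clustered pub/sub would otherwise superimpose on it.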

Page 42: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

42

Workload balancing level interference

• Multiple applications sharing the same queue managers and the same cluster channels.

• Cluster workload balancing is at the channel level.
• Messages sharing the same channels, but destined for different target queues, will be counted together.

• The two channels here have an even 50/50 split of messages…
• …but the two instances of Service 1 do not!

• Split the Service 1 and Service 2 queues out into separate clusters or queue managers, or customise the workload balancing logic.

[Diagram: Client 1 puts 100 messages to Queue 1 and Client 2 puts 50 to Queue 2 – 150 in total, split 75/75 across the two channels. One queue manager’s 75 all arrive on its Queue 1 instance; the other’s 75 split into 50 for Queue 2 and only 25 for its Queue 1 instance.]

Page 43: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

43

Cluster transmit queue

• Separation of message traffic
• With a single transmission queue there is the potential for pending messages for one cluster channel to interfere with messages pending for another cluster channel.

• Management of messages
• Use of queue concepts such as MAXDEPTH is not useful when using a single transmission queue for more than one channel.

• Monitoring
• Tracking the number of messages processed by a cluster channel can be difficult/impossible using a single queue.

• Performance
• In reality a shared transmission queue is not always the bottleneck; often other solutions to improving channel throughput (e.g. multiple cluster receiver channels) are really what’s needed. But sometimes it will help.

• Multiple cluster transmission queues: V7.5 on distributed platforms, V8 on z/OS & IBM i.

Page 44: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

44

Multiple cluster transmit queues: Automatic

• Configured on the sending queue manager, not on the owners of the cluster receiver channel definitions.

• A queue manager switch automatically creates a dynamic transmission queue per cluster sender channel:
ALTER QMGR DEFCLXQ(CHANNEL)

• Dynamic queues are based upon the model queue SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE

• Well-known queue names: SYSTEM.CLUSTER.TRANSMIT.<CHANNEL-NAME>

[Diagram: the sending queue manager drains SYSTEM.CLUSTER.TRANSMIT.ChlA, .ChlB and .ChlC, one per cluster sender channel]

Page 45: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

45

Multiple cluster transmit queues: Manual

• Still configured on the sending queue manager, not on the owners of the cluster receiver channel definitions.

• Administratively define a transmission queue and configure which cluster sender channels will use it:
DEFINE QLOCAL(GREEN.XMITQ) CLCHNAME('Green.*') USAGE(XMITQ)

• Set a channel name pattern in CLCHNAME
• Single/multiple channels (wildcard)
– E.g. all channels for a specific cluster (assuming a suitable channel naming convention!)

• Any cluster sender channel not covered by a manual transmission queue defaults to the DEFCLXQ behaviour

[Diagram: channel Green.A drains GREEN.XMITQ while channels Pink.A and Pink.B drain PINK.XMITQ]
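As a small hedged follow-on, the mapping can be checked from runmqsc once either mode is active – XMITQ in the channel status and the CLCHNAME attribute on the queue show which transmission queue each cluster sender channel is draining (verify attribute availability on your version):

DISPLAY CHSTATUS(*) XMITQ
DISPLAY QLOCAL(GREEN.XMITQ) CLCHNAME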

Page 46: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Disaster Recovery

Page 47: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

47

MQ Clusters and Disaster Recovery

• Everybody seems to do disaster recovery slightly differently

• MQ Clusters can be made to fit with most setups

• But some more easily than others…

Page 48: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Synchronous replication

Page 49: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

49

What to recover

• Before discussing disaster recovery in a clustered environment, we need to think about what we mean by disaster recovery even on a single queue manager basis.

• Many people will think of synchronous replication (using underlying host or disk replication, etc.) as the ‘gold standard’
• The only way we can really achieve guaranteed identical configuration OR data
• Significant performance cost
• Impractical in many scenarios (e.g. distance between data centres)

• When moving beyond synchronous we have to consider whether it is only ‘configuration’ (queue and channel definitions, etc.) or also data which we are trying to restore.
• Restoring data implies IBM MQ may be being questionably used as a ‘system of record’ – possibly revisit the architecture?
• The application must be designed to deal with duplicate and/or lost messages

• In a cluster, the line between data and config is further blurred
• Cluster knowledge exists as state data on the repository queue, for example.

Page 50: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

50

Datacenter Replication ‘in step’

Synchronous replication

• Outside the datacenter, DR queue managers must be locatable in either site
• An IP switching layer or comma-separated connection names
• After failover, everything continues as if nothing had happened

[Diagram: every queue manager in datacenter 1, including a full repository, is synchronously replicated – queue managers and databases alike – to a standby copy in datacenter 2]

Page 51: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

51

Synchronous disk replication

• In either of these scenarios life is simple from a clustering perspective

• As long as all queue managers see the failed-over instance as ‘in the same place’, or understand 7.0.1-style connection names, no other consideration is required

• Some administrators will prefer to include full repositories in the DR failover unit – there is no strong requirement for this unless other factors apply
• As long as there is a good availability expectation for the pair, a new full repository can be added at leisure to pick up the strain in the event of real loss of one
• Including them may add significantly to the performance cost of replication

Page 52: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Asynchronous replication

Page 53: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

53

Datacenter Replication ‘asynchronous’

Asynchronous replication

• Backup queue managers contain historic messages (if any)…
• …Cluster state will also be historic, so refreshing the cluster state on failover and failback is essential
– But refreshing a full repository is often too invasive…

[Diagram: datacenter 1’s queue managers, full repositories included, are asynchronously replicated to standby copies in datacenter 2]

Page 54: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

54

Datacenter Replication ‘asynchronous’

Asynchronous replication

• Don’t replicate or fail over the full repositories
• Perhaps have one in each datacenter, each always running
• There is then no need to refresh them, and at least one full repository is always available

[Diagram: only the partial repository queue managers are replicated; a full repository runs permanently in each datacenter]

Page 55: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

55

Asynchronous replication

• Remember that cluster state will be persisted as ‘data’
• REFRESH CLUSTER is mandatory on restore from backup, on failover, or on ‘fail back’ to live (see the sketch below)

• Either a cloned queue manager or a backup can be made to work
• Our experience is that when things go wrong, a ‘true’ backup is easier to work with
• The process is then the same whether preserving some application data or not
• Most common problem – a missed refresh when reverting to ‘live’ – things may appear to work for a while…

• The IP address / conname can:
• Change and be OK once re-advertised (after a REFRESH)
– For a manually cloned queue manager this is probably the default
– Blank connames also make this easy
• Change but not actually need re-advertising
– E.g. a comma-separated list – the REFRESH step is still needed though
• Remain the same
– Assuming there is the capability to modify routing, or a DNS conname is used

• A new queue manager appearing with the same name as an existing (possibly failed) queue manager will be allowed in to take its place
• Message AMQ9468 / CSQX468I was added to warn when this occurs
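A minimal sketch of that refresh step – run against the restored or failed-over partial repository queue manager, using the hypothetical cluster name DEMO from earlier, so it discards its historic view of the cluster and rebuilds it from the full repositories:

* Not for use on a full repository; REPOS(YES) also discards
* locally cached information about the full repositories:
REFRESH CLUSTER(DEMO) REPOS(YES)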

Page 56: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

No replication

Page 57: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

57

No Replication ‘warm standby’

• Backup queue managers are always running
• Cluster workload balancing is used to direct traffic to the primary queue managers (e.g. CLWLPRTY/RANK)
• Failed connections are detected and messages are routed to the secondary queue managers
• Messages will be trapped on failed queue managers in the event of a failure
• The applications and the system must be designed to accommodate this configuration

[Diagram: datacenter 1 runs the primary queue managers and a full repository; datacenter 2 runs always-on standby queue managers and a second full repository]

Page 58: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

58

Warm standby

• This is the other ‘easy’ scenario, and the recommendation wherever possible

• Applications must be designed to have loose affinities to any particular queue manager
• Put messages to clustered queues rather than to Queue X @ QMgr Y

• CLWLPRTY – or, for manual control, CLWLRANK – allows smooth switching between primary and secondary instances as required (see the sketch below)

• Data on failed queue managers will be trapped unless and until they are restarted
• This implies applications must be able to replay
• If this means duplication is possible, restart procedures may need to clear out old messages

• This is the same as the earlier ‘One cluster’ scenario
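A minimal sketch of that switching, reusing the hypothetical DEMO cluster and SERVICE.Q – CLWLPRTY prefers the primary while it is reachable, and SUSPEND QMGR gives the manual control for a planned switch:

* Primary (datacenter 1) instance is preferred:
DEFINE QLOCAL(SERVICE.Q) CLUSTER(DEMO) CLWLPRTY(9)
* Standby (datacenter 2) instance:
DEFINE QLOCAL(SERVICE.Q) CLUSTER(DEMO) CLWLPRTY(4)

* Planned switch: steer new traffic away from this queue manager...
SUSPEND QMGR CLUSTER(DEMO)
* ...and bring it back afterwards:
RESUME QMGR CLUSTER(DEMO)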

Page 59: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

59

Summary

• Why a cluster?

• Service availability

• Location dependency

• Avoiding interference

• Clusters and disaster recovery

Page 60: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

60

Notices and Disclaimers

Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or

transmitted in any form without written permission from IBM.

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with

IBM.

Information in these presentations (including information relating to products that have not yet been announced by IBM) has been

reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM

shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY,

EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF

THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT

OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the

agreements under which they are provided.

Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without

notice.

Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are

presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual

performance, cost, savings or other results in other operating environments may vary.

References in this document to IBM products, programs, or services do not imply that IBM intends to make such products,

programs or services available in all countries in which IBM operates or does business.

Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not

necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither

intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.

It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal

counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s

business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or

represent or warrant that its services or products will ensure that the customer is in compliance with any law.

Page 61: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

61

Notices and Disclaimers (cont’d)

Information concerning non-IBM products was obtained from the suppliers of those products, their published

announcements or other publicly available sources. IBM has not tested those products in connection with this

publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM

products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to

interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED,

INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A

PARTICULAR PURPOSE.

The provision of the information contained herein is not intended to, and does not, grant any right or license under any

IBM patents, copyrights, trademarks or other intellectual property right.

• IBM, the IBM logo, ibm.com, Bluemix, Blueworks Live, CICS, Clearcase, DOORS®, Enterprise Document

Management System™, Global Business Services ®, Global Technology Services ®, Information on Demand,

ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™,

PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®,

pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, SoDA, SPSS, StoredIQ, Tivoli®, Trusteer®,

urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of

International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and

service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on

the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.

Page 62: IBM MQ: Managing Workloads, Scaling and Availability with MQ Clusters

Thank You

Your Feedback is Important!

Access the InterConnect 2015 Conference CONNECT Attendee Portal to complete your session surveys from your smartphone, laptop or conference kiosk.
