Post on 14-Apr-2017
transcript
© 2016 IBM Corporation
SAP HANA Distinguished Engineers
ITM216
Deployment Options with Business Continuity for
SAP HANA (HA and DR) (update 1.11.2016)
Tomas KROJZL, SAP HANA Distinguished Engineer, SAP Mentor
Disclaimer
This presentation is based on official IBM and SAP information; however, it does not
represent the official position of IBM or SAP and does not contain statements about
future direction. The content of this presentation is based on the personal view and
experience of the presenter and cannot be associated with IBM or SAP corporations.
This document is provided without a warranty of any kind, either express or implied,
including but not limited to the implied warranties of merchantability, fitness for a
particular purpose, or non-infringement. Neither IBM nor SAP assumes any responsibility
for the content of this document.
Agenda
• Introduction into Business Continuity
• SAP HANA Host Auto-Failover
• SAP HANA System Replication
• Comparison of Individual Options
• Typical Deployment Options
Availability Considerations
• Availability can be measured on different
levels in the stack
• It can include planned downtime
(maintenances) or can be based only on
unplanned downtime (outages)
• Period against which the availability is
measured (monthly versus yearly) –
important if “mean time between failures”
(MTBF) is more than one month
[Figure: levels in the stack at which availability can be measured]
• Business Process Level: Business Process
• Application Level: Application (e.g. SAP NetWeaver)
• Database Level: SAP HANA
• Operating System Level: SUSE Linux Enterprise Server for SAP, Red Hat Enterprise Linux for SAP HANA
• Infrastructure Level: Server HW, Internal Network, Storage Subsystem
Downtime categories: Unplanned Downtime (outages), Planned Downtime (maintenances)
Availability %        99%          99.5%        99.8%        99.9%
Downtime per year     3.65 days    1.83 days    17.5 hours   8.76 hours
Downtime per month    7.20 hours   3.60 hours   86.4 min     43.2 min
(monthly values assume a 30-day month)
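The downtime figures follow from simple arithmetic: the allowed downtime is the unavailable fraction multiplied by the length of the measurement period. The following is a minimal sketch of that calculation; the function name and the choice of a 365-day year and a 30-day month are assumptions made for illustration, not part of the original slides.

```python
def allowed_downtime_hours(availability_pct: float, period_hours: float) -> float:
    """Return the downtime budget in hours for an availability target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100.0 - availability_pct) / 100.0 * period_hours


HOURS_PER_YEAR = 365 * 24   # assumption: leap years are ignored
HOURS_PER_MONTH = 30 * 24   # assumption: a month has 30 days

if __name__ == "__main__":
    for pct in (99.0, 99.5, 99.8, 99.9):
        yearly = allowed_downtime_hours(pct, HOURS_PER_YEAR)
        monthly = allowed_downtime_hours(pct, HOURS_PER_MONTH)
        print(f"{pct}%: {yearly:.2f} h per year, {monthly * 60:.1f} min per month")
```

The same target gives a much smaller absolute budget when measured monthly, which is why the measurement period matters when failures are rare.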
Important Business Continuity KPIs
Recovery Time Objective (RTO): “Outage Duration”
States how fast you can recover from an outage
• Maximum targeted duration of time within which a business process must be restored after a disruption
• Can include investigation time, the recovery time itself, testing and communication to the users
Recovery Point Objective (RPO): “Data Loss Period”
States how much data can be lost during recovery
• Maximum targeted time period in which data might be lost as a result of a disruption
• Only loss of committed transactions is taken into account (loss of incomplete transactions is not taken into account)
[Figure: timeline marking two backups and the recovery]
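To make the two KPIs concrete, the sketch below measures an incident against an RTO and an RPO. It is an illustration only: the `Incident` class, its field names and the sample timestamps are hypothetical, and the data loss period is taken as the time between the newest recoverable commit and the start of the outage.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    last_recoverable_commit: datetime  # newest committed data that survived
    outage_start: datetime
    service_restored: datetime

    @property
    def data_loss_period(self) -> timedelta:
        """Committed work lost: compared against the RPO."""
        return self.outage_start - self.last_recoverable_commit

    @property
    def outage_duration(self) -> timedelta:
        """Time until the business process runs again: compared against the RTO."""
        return self.service_restored - self.outage_start


def meets_objectives(incident: Incident, rto: timedelta, rpo: timedelta) -> bool:
    return incident.outage_duration <= rto and incident.data_loss_period <= rpo


if __name__ == "__main__":
    incident = Incident(
        last_recoverable_commit=datetime(2016, 11, 1, 2, 0),
        outage_start=datetime(2016, 11, 1, 3, 30),
        service_restored=datetime(2016, 11, 1, 5, 0),
    )
    print(incident.outage_duration, incident.data_loss_period)
    print(meets_objectives(incident, rto=timedelta(hours=2), rpo=timedelta(hours=1)))
```

In this sample the outage duration is within the two-hour RTO, but the data loss period exceeds the one-hour RPO, so the incident misses its objectives.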
SAP HANA Business Continuity
Availability (incl. High Availability)
• Protection against single server failure
  • Hardware failure (e.g. CPU failure)
  • Software malfunction (e.g. OS crash)
• Typically inside the same location
  • Goal is to stay close to the rest of the customer landscape (bandwidth, latency)
Disaster Recovery
• Protection against Data Center failure
  • Natural disasters (e.g. floods, earthquakes)
  • Man-made disasters (e.g. riots, terrorism)
• Typically to another location
  • Goal is to move the whole customer landscape to a new location (Data Center)
  • Different requirements on “safe distance” (next city versus different continent)
SAP HANA Business Continuity
Availability Techniques
• SAP HANA Host Auto-Failover
  • Approach based on +1 node (might be up to +3)
  • Additional nodes CANNOT be used for non-prod
• SAP HANA System Replication
  • Additional compatible environment required
  • Additional nodes CAN be used for non-prod
  • Failover can be automated for single-node
• VMware High Availability (Infrastructure Level)
Disaster Recovery Techniques
• Passive Spare (Backup/Restore)
  • Backups must be replicated across both sites
• SAP HANA System Replication
• Storage Replication
  • Continuous replication of data between the storage subsystems on the primary and secondary side
SAP HANA Host Auto-Failover for SAP HANA Single-node
• Each SAP HANA node represents a running instance
• Only the active node has its own data and log files
• There are no data and log files on the stand-by node
• Data is written only to the active node
• An SQL query retrieves data from the active node
[Figure: Node 01 (active) with data and log files and Node 02 (stand-by) without, each running an SAP HANA instance]
SAP HANA Host Auto-Failover for SAP HANA Single-node
• The active node fails (the data set is no longer available)
• The stand-by node takes over the data and log files of the failed node and replaces it (the data set is available again)
• SQL queries can again retrieve data from the new active node
[Figure: Node 01 (crashed); Node 02 changes from stand-by to active and takes over the data and log files]
SAP HANA Host Auto-Failover for SAP HANA Scale-out
• Each SAP HANA node represents a running instance
• Each active node has its own data and log files
• There are no data and log files on the stand-by node
• Data is distributed into pre-defined locations (shared-nothing architecture)
• An SQL query retrieves data from all nodes in parallel
[Figure: Node 01, Node 02 and Node 03 (active), each with its own data and log files; Node 04 (stand-by) without; each node runs an SAP HANA instance]
SAP HANA Host Auto-Failover for SAP HANA Scale-out
• One node fails (the data set is no longer complete)
• The stand-by node takes over the data and log files of the failed node and replaces it (the data set is complete again)
• SQL queries can again retrieve data from all nodes in parallel
• Multiple stand-by nodes are allowed but not common
• One stand-by node serves as the failover target for any of the worker nodes, so no data preloading is possible
[Figure: Node 02 (crashed); Node 04 changes from stand-by to active and takes over its data and log files; Node 01 and Node 03 stay active]
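The failover sequence described on these slides can be sketched as a small simulation. This is a toy model for illustration and not how SAP HANA implements Host Auto-Failover: the `Node` and `Cluster` classes, the role strings and the volume names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Node:
    name: str
    role: str                      # "active", "stand-by" or "crashed"
    volume: Optional[str] = None   # data and log files attached to this node


class Cluster:
    def __init__(self, nodes: Iterable[Node]) -> None:
        self.nodes: Dict[str, Node] = {node.name: node for node in nodes}

    def fail(self, name: str) -> Optional[Node]:
        """Crash a node; return the stand-by node that replaced it, if any."""
        failed = self.nodes[name]
        orphaned = failed.volume
        failed.role, failed.volume = "crashed", None
        if orphaned is None:
            return None            # a stand-by node crashed: nothing to take over
        for node in self.nodes.values():
            if node.role == "stand-by":
                # The stand-by node takes over the data and log files.
                node.role, node.volume = "active", orphaned
                return node
        return None                # no stand-by node left: data set stays incomplete

    def data_set_complete(self, volumes: Iterable[str]) -> bool:
        attached = {n.volume for n in self.nodes.values() if n.role == "active"}
        return set(volumes) <= attached


def build_example() -> Cluster:
    """Three active worker nodes plus one stand-by node, as on the slide."""
    return Cluster([
        Node("Node 01", "active", "volume 1"),
        Node("Node 02", "active", "volume 2"),
        Node("Node 03", "active", "volume 3"),
        Node("Node 04", "stand-by"),
    ])
```

Calling `build_example().fail("Node 02")` returns Node 04 with `volume 2` attached. A second failure finds no stand-by node, which is the situation that additional stand-by nodes would cover.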
SAP HANA System Replication for SAP HANA Single-node
• SAP HANA System Replication needs an additional “compatible” environment
• During System Replication each primary worker node is paired with one dedicated node on the secondary side
• A stand-by node is not required on the secondary side
• When data is written on the primary node, the related change is replicated to the secondary node
• This replication is performed at database level
[Figure: primary side with Node A1 (active) and Node A2 (stand-by); secondary side with Node B1 (replicating); A1 and B1 each have their own data and log files]
SAP HANA System Replication for SAP HANA Scale-out
• SAP HANA System Replication needs an additional “compatible” environment
• During System Replication each primary worker node is paired with one dedicated node on the secondary side
• A stand-by node is not required on the secondary side
• When data is written on the primary node, the related change is replicated to the secondary node
• This replication is performed at database level
[Figure: primary side with Node A1, Node A2 and Node A3 (active) and Node A4 (stand-by); secondary side with Node B1, Node B2 and Node B3 (replicating); each active and replicating node has its own data and log files]
SAP HANA System Replication - Advantages
• Main advantages of the SAP HANA System Replication approach:
  • The secondary side can be “disconnected” to serve a different purpose, for example:
    • Specialized testing
    • Fast back-out plan in case of a failed change
    • Near Zero Downtime Database Upgrades
  • Non-production can be hosted on the secondary side (it must be stopped in case of failover)
  • Protection against disk corruptions
  • Can be used in various scenarios, including Disaster Recovery or DB copy
[Figure: Node A1 (active) and Node A2 (stand-by) on the primary side; Node B1 (replicating) on the secondary side, optionally also hosting a non-production SAP HANA system]
SAP HANA System Replication – Replication Modes
• The replication process has three important milestones:
  1. Primary system sends the information
  2. Secondary system receives the information (network impact)
  3. Secondary system persists the information (I/O impact)
• Four different replication modes are available
• Synchronous modes are sensitive to network latency and can therefore be used only over short distances
• Full sync mode stops operation on the primary system if the secondary system is not reachable
Replication mode             RPO       Impact if the connection between the systems is lost
Asynchronous                 RPO > 0   No impact
Synchronous in-memory        RPO ~ 0   Primary side may be blocked until time-out
Synchronous                  RPO = 0   Primary side may be blocked until time-out
Synchronous with full sync   RPO = 0   Primary side is blocked until secondary reconnects
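The trade-off between the four replication modes can be captured in a small lookup structure. The sketch below only restates what this slide says about RPO and about the effect of a lost connection; the enum, the tuple type and the helper functions are hypothetical names chosen for the example and do not correspond to any SAP HANA API.

```python
from enum import Enum
from typing import Dict, List, NamedTuple


class ReplicationMode(Enum):
    ASYNC = "Asynchronous"
    SYNCMEM = "Synchronous in-memory"
    SYNC = "Synchronous"
    SYNC_FULL = "Synchronous with full sync"


class Behaviour(NamedTuple):
    rpo: str                 # as stated on the slide
    on_connection_loss: str  # impact on the primary side


BEHAVIOUR: Dict[ReplicationMode, Behaviour] = {
    ReplicationMode.ASYNC: Behaviour("> 0", "no impact"),
    ReplicationMode.SYNCMEM: Behaviour("~ 0", "may be blocked until time-out"),
    ReplicationMode.SYNC: Behaviour("= 0", "may be blocked until time-out"),
    ReplicationMode.SYNC_FULL: Behaviour("= 0", "blocked until secondary reconnects"),
}


def modes_with_rpo(rpo: str) -> List[ReplicationMode]:
    """Return the modes whose RPO matches the given label, in slide order."""
    return [mode for mode, b in BEHAVIOUR.items() if b.rpo == rpo]


def blocks_until_reconnect(mode: ReplicationMode) -> bool:
    """True if the primary side stops until the secondary side is reachable again."""
    return BEHAVIOUR[mode].on_connection_loss == "blocked until secondary reconnects"


if __name__ == "__main__":
    for mode in modes_with_rpo("= 0"):
        print(mode.value, "| blocks until reconnect:", blocks_until_reconnect(mode))
```

Filtering for an RPO of zero leaves the two synchronous modes that persist on the secondary side, and only the full sync variant trades primary availability for that guarantee.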
SAP HANA System Replication – Operation Modes
Delta Data Shipping Mode
• Same layout of data across pages
• Full compatibility across features
• More free resources on secondary side for non-productive system
• Shipment types: Full Data, Delta Data, Log Entries
Log Replay Mode (SP11+)
• Same information but different layout
• Reduced network traffic
• Reduced takeover time
• Base for planned active-active setup
• Shipment types: Full Data, Log Entries
[Figure: for each operation mode, Node A1 (active) ships to Node B1 (replicating); both nodes have data and log files on their own storage]
SAP HANA Multitier System Replication
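A » B     B » C     Version   Use Case (example)
SYNC      SYNC      SP12+     Local solution for fast failover combined with DR to nearby location.
SYNC      SYNCMEM   SP12+     Local solution for fast failover combined with DR to nearby location.
SYNCMEM   SYNC      SP11+     Local solution for fast failover combined with DR to nearby location.
SYNCMEM   SYNCMEM   SP12+     Local solution for fast failover combined with DR to nearby location.
SYNC      ASYNC     SP07+     Local solution for fast failover combined with DR to distant location.
SYNCMEM   ASYNC     SP07+     Local solution for fast failover combined with DR to distant location.
ASYNC     ASYNC     SP11+     Multi-level DR solution.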
• Only supported scenario (today) - three
SAP HANA databases in “pipeline”
topology (A » B » C)
• Supported for both single-node and
scale-out deployment options
• Only listed combinations of replication
modes are possible
• Combinations of operation modes
(Delta Data Shipping and Log Replay)
are not supported
[Figure: Node A (active) » Node B (replicating) » Node C (replicating); each node is an SAP HANA instance with its own data and log files]
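Because only the listed combinations of replication modes are possible, a planned multitier chain can be checked against the table before it is built. The following is a minimal sketch that encodes the table from this slide; the function name, the operation mode labels and the error messages are assumptions made for illustration.

```python
from typing import Dict, Tuple

# (A » B, B » C) -> version from which the combination is listed as supported
SUPPORTED_COMBINATIONS: Dict[Tuple[str, str], str] = {
    ("SYNC", "SYNC"): "SP12+",
    ("SYNC", "SYNCMEM"): "SP12+",
    ("SYNCMEM", "SYNC"): "SP11+",
    ("SYNCMEM", "SYNCMEM"): "SP12+",
    ("SYNC", "ASYNC"): "SP07+",
    ("SYNCMEM", "ASYNC"): "SP07+",
    ("ASYNC", "ASYNC"): "SP11+",
}


def minimum_version(a_to_b: str, b_to_c: str,
                    op_mode_ab: str = "log replay",
                    op_mode_bc: str = "log replay") -> str:
    """Return the listed version for an A » B » C chain or raise ValueError."""
    if op_mode_ab != op_mode_bc:
        # The slide states that mixing operation modes is not supported.
        raise ValueError("operation modes must be the same on both tiers")
    try:
        return SUPPORTED_COMBINATIONS[(a_to_b.upper(), b_to_c.upper())]
    except KeyError:
        raise ValueError(f"{a_to_b} » {b_to_c} is not a listed combination") from None


if __name__ == "__main__":
    print(minimum_version("SYNC", "ASYNC"))
    try:
        minimum_version("ASYNC", "SYNC")
    except ValueError as error:
        print(error)
```

Note that the table has no row with an asynchronous first tier followed by a synchronous second tier, so such a chain is rejected.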
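Availability / High Availability - Comparison
High Availability scenarios
SAP HANA Host Auto-Failover (one stand-by node)
• Protection against: HW failure Y; OS failure Y; failure of one DB node Y; failure of all DB nodes N
• Can decrease planned downtime: single-node HW, OS only; scale-out N
• Operation complexity: low
• Approach: N+1
• Cost implications: single-node medium; scale-out medium
• Relative RTO: medium
• Typical RPO: 0
• Can passive HW host non-prod: N
SAP HANA System Replication (synchronous) - manual failover
• Protection against: HW failure Y; OS failure Y; failure of one DB node Y; failure of all DB nodes Y
• Can decrease planned downtime: single-node Y; scale-out Y
• Operation complexity: low
• Approach: N+N
• Cost implications: single-node medium; scale-out high
• Relative RTO: very high
• Typical RPO: 0
• Can passive HW host non-prod: Y (*1)
SAP HANA System Replication (synchronous) - automated failover
• Protection against: HW failure Y; OS failure Y; failure of one DB node Y; failure of all DB nodes Y
• Can decrease planned downtime: single-node Y (*2); scale-out Y (*2)
• Operation complexity: medium
• Approach: N+N
• Cost implications: single-node medium; scale-out high
• Relative RTO: very low (*3)
• Typical RPO: 0
• Can passive HW host non-prod: Y/N (*4)
VMware High Availability (Infrastructure Level option)
• Protection against: HW failure Y; OS failure partial; failure of one DB node N; failure of all DB nodes N
• Can decrease planned downtime: single-node HW only; scale-out HW only
• Operation complexity: low
• Approach: Infra. option
• Cost implications: single-node low; scale-out low
• Relative RTO: medium
• Typical RPO: 0
• Can passive HW host non-prod: n/a
(*1) Non-productive system will be stopped in case of failover; separate storage for the non-productive system is required
(*2) Special treatment is required to prevent the cluster from taking over during maintenance
(*3) Very low only in case of “Log Replay Mode” (“HotStandby” setup), otherwise low
(*4) Non-productive system is supported only in single-node (scale-up) scenario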
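Disaster Recovery - Comparison
Disaster Recovery scenarios
Passive Spare (Backup/Restore) - preinstalled
• Checked for consistency: Y
• Operation complexity: low
• Approach: N+N
• Cost implications: medium
• Relative RTO: high
• Typical RPO: hours
• Can passive HW host non-prod: Y (*1)
SAP HANA System Replication (synchronous)
• Checked for consistency: Y
• Operation complexity: low
• Approach: N+N
• Cost implications: medium
• Relative RTO: low
• Typical RPO: 0
• Can passive HW host non-prod: Y (*1)
SAP HANA System Replication (asynchronous)
• Checked for consistency: Y
• Operation complexity: low
• Approach: N+N
• Cost implications: medium
• Relative RTO: low
• Typical RPO: seconds / minutes
• Can passive HW host non-prod: Y (*1)
Storage Replication (synchronous) - preinstalled
• Checked for consistency: N
• Operation complexity: low
• Approach: N+N
• Cost implications: medium
• Relative RTO: low
• Typical RPO: 0
• Can passive HW host non-prod: Y (*1)
Storage Replication (asynchronous) - preinstalled
• Checked for consistency: N
• Operation complexity: low
• Approach: N+N
• Cost implications: medium
• Relative RTO: low
• Typical RPO: seconds / minutes
• Can passive HW host non-prod: Y (*1)
(*1) Non-productive system will be stopped in case of failover; separate storage for the non-productive system is required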
Business Continuity – Typical Deployment Options (Single-node)
[Figure: typical single-node deployment options laid out across Location 1, Location 2 (low latency) and Location 3 (high latency)]
Elements recoverable from the figure:
• Active node P with a stand-by (sby) node and automatic failover (HANA native), also referred to as “Single-node HA cluster”
• Synchronous replication from P to P’ with automatic failover (Pacemaker cluster or 3rd party SW)
• Synchronous replication from P to P’ with manual failover
• Asynchronous replication to P’ or, as a further tier, to P’’ with manual failover
• Replication targets can additionally host non-production systems (Q, D)
Legend: P = Production Node; Q = Quality Node; D = Development Node; Mandatory Component; Optional Component; SAP HANA System Replication Flow
Additional combinations of replication modes are possible (note the network latency dependencies):
A » B     B » C     Version
SYNC      SYNC      SP12+
SYNC      SYNCMEM   SP12+
SYNCMEM   SYNC      SP11+
SYNCMEM   SYNCMEM   SP12+
SYNC      ASYNC     SP07+
SYNCMEM   ASYNC     SP07+
ASYNC     ASYNC     SP11+
Business Continuity – Typical Deployment Options (Scale-out)
[Figure: typical scale-out deployment options laid out across Location 1, Location 2 (low latency) and Location 3 (high latency)]
Elements recoverable from the figure:
• Several active nodes P with a stand-by (sby) node and automatic failover (HANA native), also referred to as “Scale-out cluster with HA”
• Synchronous replication from P to P’ with automatic failover (Pacemaker cluster or 3rd party SW)
• Synchronous replication from P to P’ with manual failover
• Asynchronous replication to P’ or, as a further tier, to P’’ with manual failover
• Replication targets can additionally host non-production systems (Q, D)
Legend: P = Production Node; Q = Quality Node; D = Development Node; Mandatory Component; Optional Component; SAP HANA System Replication Flow
Additional combinations of replication modes are possible (note the network latency dependencies):
A » B     B » C     Version
SYNC      SYNC      SP12+
SYNC      SYNCMEM   SP12+
SYNCMEM   SYNC      SP11+
SYNCMEM   SYNCMEM   SP12+
SYNC      ASYNC     SP07+
SYNCMEM   ASYNC     SP07+
ASYNC     ASYNC     SP11+
Additional Materials (1/3)
• How to Perform System Replication for SAP HANA
https://scn.sap.com/docs/DOC-47702
• FAQ: High Availability for SAP HANA
https://scn.sap.com/docs/DOC-66702
• Introduction: High Availability for SAP HANA
https://scn.sap.com/docs/DOC-65585
• High Availability for SAP HANA
https://help.sap.com/saphelp_hanaplatform/helpdata/en/6d/252db7cdd044d19ad85b46e6c294a4/content.htm
• High Availability and Disaster Recovery with the SAP HANA Platform
https://open.sap.com/courses/hshd1
• SAP Note 1999880 - FAQ: SAP HANA System Replication
https://launchpad.support.sap.com/#/notes/1999880/E
• SAP Note 2303243 - SAP HANA Multitier System Replication – supported replication modes between sites
https://launchpad.support.sap.com/#/notes/2303243/E
• SAP HANA Distinguished Engineer (HDE) Webinar: Overview of SAP HANA On-Premise Deployment Options (replay)
https://scn.sap.com/community/hana-in-memory/blog/2016/05/25/sap-hana-distinguished-engineer-hde-webinar-overview-of-sap-hana-on-premise-deployment-options
• SAP HANA Distinguished Engineer (HDE) Webinar: Overview of SAP HANA On-Premise Deployment Options
https://www.slideshare.net/TomasKrojzl/sap-hana-distinguished-engineer-hde-webinar-overview-of-sap-hana-onpremise-deployment-options
Additional Materials (2/3)
• SAP on SUSE
https://scn.sap.com/docs/DOC-49204
• Best Practices for Mission-Critical SAP Applications
https://www.suse.com/products/sles-for-sap/resource-library/sap-best-practices
• Automate SAP HANA System Replication with SLES for SAP Applications
https://scn.sap.com/docs/DOC-56278
• How to Set Up SAPHanaSR in the Cost Optimized SAP HANA SR Scenario - Part I
https://scn.sap.com/docs/DOC-65899
• How to Set Up SAPHanaSR in the Cost Optimized SAP HANA SR Scenario - Part II
https://scn.sap.com/docs/DOC-68633
• SAPHanaSR-ScaleOut: Automating SAP HANA System Replication for Scale-Out Installations with SLES for SAP Applications
https://www.suse.com/communities/blog/saphanasr-scaleout-automating-sap-hana-system-replication-scale-installations-sles-sap-applications
• SAP HANA System Replication Automation (HanaSR) for HANA Scale-Out now officially available with SLES for SAP 12 SP2
https://www.suse.com/communities/blog/sap-hana-system-replication-automation-hanasr-hana-scale-now-officially-available-sles-sap-12-sp2
Additional Materials (3/3)
• SAP HANA System Replication simplified
https://www.slideshare.net/snoopy1710/sap-hana-system-replication-simplified
• SAP on Red Hat
https://scn.sap.com/docs/DOC-37811
• Technical Resources for SAP HANA on Red Hat
https://scn.sap.com/docs/DOC-37812
• Automated SAP HANA System Replication with Pacemaker on RHEL Setup Guide
https://access.redhat.com/articles/1466063
Tomas KROJZL
SAP HANA Architect, SAP HANA Distinguished Engineer, SAP Mentor
IBM Innovation Center Central Europe
Technicka 2995/21, 616 00 Brno, Czech Republic
Mobile: +420-731-435-817
tomas_krojzl@cz.ibm.com
Twitter: @krojzl
Thank You!
Feedback: Please complete your session evaluation.