Geographically Dispersed Percona XtraDB Cluster Deployment
Marco (the Grinch) Tusa September 2017 Dublin
About me
Marco “The Grinch”
• Open source enthusiast
• Percona consulting Team Leader
Agenda
• What is PXC
• When nodes interact
• Let us clarify, geo dispersed - What to keep in mind then
• How to measure latency correctly
• Use the right way (sync/async)
• Use help like replication manager
What is PXC/Galera?
(Virtually) Synchronous Replication:
• True multi-master
• No slave lag
• No master-slave failover or VIP
• Multi-threaded appliers
• Automatic node provisioning
• Elastic scale (in–out)
• Geographically distributed (with segments)
• Mix with Async replication
(Diagram: Web traffic, Balancer, Galera)
What PXC/Galera is NOT?
• Not a write-scalable solution
• Not great for a high amount of parallel, small requests
• Not great for working with Foreign Keys
• Not good for sharding data (each node has the entire dataset)
(Diagram: Web traffic, Balancer, Galera)
What is a Node
• Standard MySQL Replication: Master, Slave, Slave
• Galera MySQL Replication: Node, Node, Node
9cba28fa-a8be-11e4-8f41-9f963e1dbf4f
Segments
A segment is a logical grouping of nodes.
• Replication between segments is optimized (writeset - some level of communication)
• Traffic and messaging are reduced
• In case of SST, the donor is chosen by proximity
More nodes more problems
With two-phase commit or distributed locking, the message load follows the capacity formula m = n x o x t, where m is the number of messages/sec, n the number of nodes, o the number of operations, and t the transaction throughput.
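To make the formula concrete, here is a small worked sketch. The operation count and throughput are made-up illustration values, not measurements, and `messages_per_second` is a hypothetical helper name.

```python
def messages_per_second(nodes: int, operations: int, throughput: float) -> float:
    """m = n x o x t: the message load grows linearly with the node count."""
    return nodes * operations * throughput


# Illustration values only (assumptions): 5 operations, 100 transactions/sec.
for n in (3, 5, 7):
    print(n, "nodes ->", messages_per_second(n, 5, 100), "messages/sec")
```

Every node added multiplies the messaging load, which is the reason to keep the cluster small.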
When nodes interact
• Keepalive and checks for cluster health
• Writeset on writer commit
• Certification results
• Ack on local apply
• Flow Control
• IST/SST
Let us clarify, geo dispersed 1
Geo dispersed, or multi-site, cluster is a cluster configuration used to help ensure high system and application availability in the event of site disaster. In this configuration, servers are separated geographically and the physical storage (quorum disk or) DATA is synchronously replicated between sites. (http://www.expertglossary.com/storage/definition/geo-dispersed-cluster)
Let us clarify, geo dispersed 2
For some environments, latency is the sole focus of performance.
An example of latency is a network transfer, such as an HTTP GET request, with the time split into latency and data transfer components.
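A minimal sketch of that split, assuming a deliberately simple model where total time = latency + payload size / bandwidth. The payload size, link speed and latency figures are illustrative assumptions, and `transfer_breakdown` is a hypothetical helper.

```python
def transfer_breakdown(latency_s: float, payload_bytes: int, bandwidth_bytes_per_s: float) -> dict:
    """Split the total time of one request into a latency part and a data-transfer part."""
    transfer_s = payload_bytes / bandwidth_bytes_per_s
    total_s = latency_s + transfer_s
    return {
        "latency_s": latency_s,
        "transfer_s": transfer_s,
        "total_s": total_s,
        "latency_share": latency_s / total_s,
    }


# Assumed values: a 2 KB message on a 100 Mbit/s link (12.5 MB/s),
# once with a LAN-like latency and once with a WAN-like latency.
for name, latency in (("LAN", 0.0005), ("WAN", 0.080)):
    print(name, transfer_breakdown(latency, 2048, 12_500_000))
```

For small messages the latency component dominates, so adding bandwidth does not help much.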
Geo dispersed
Geo dispersed is determined by the latency existing between nodes, NOT by the geographic location itself.
How to measure latency correctly 1
• wsrep_evs_repl_latency (It measures latency from the time point when a message is sent out to the time point when a message is received.)
• wsrep_replicated / wsrep_received
• netperf
How to measure latency correctly 2
PING
Why?
Brief digression
Ref:https://goo.gl/kDTYnW
Brief digression
• PING uses ICMP (Internet Control Message Protocol), NOT TCP over IP
• Default data size is 56 bytes plus header (8 bytes)
ping -M do -s 1473 -c 3 192.168.0.34
Not good enough!
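The payload size in the ping command above is not arbitrary. Assuming a standard Ethernet MTU of 1500 bytes and an IPv4 header without options (20 bytes), 1473 bytes of data plus the 8-byte ICMP header is one byte too large for a single packet, so with `-M do` (do not fragment) the probe should fail unless the path MTU is larger. A small sketch of that arithmetic, with hypothetical helper names:

```python
IPV4_HEADER = 20  # bytes, assuming an IPv4 header without options
ICMP_HEADER = 8   # bytes, as stated on the slide


def max_ping_payload(mtu: int) -> int:
    """Largest `ping -s` value that still fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def fits_without_fragmentation(payload: int, mtu: int = 1500) -> bool:
    """True if an ICMP echo with this payload fits within the given MTU."""
    return payload <= max_ping_payload(mtu)


print(max_ping_payload(1500))
print(fits_without_fragmentation(1472), fits_without_fragmentation(1473))
```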
Brief digression
TCP means Transmission Control Protocol and, as the name says, it is designed to control the data transmission happening between source and destination.
TCP implementations use the IP protocol encapsulation for the transmission of the data:
Looks the same as before, right?
WRONG!
Brief digression
A TCP implementation has several characteristics that make sense to summarize:
• Is stream oriented
• Establishes a connection
• Monitors the data transfer
• Buffered transmission
• Unstructured stream
• Full-duplex connection
• Stream as a sequence of octets split in segments
Brief digression
The TCP dispatcher uses a dynamic sliding window
The dispatcher manages three pointers associated with each connection:
• The first pointer indicates the start of the sliding window
• The second pointer indicates the highest octet that can be dispatched
• The third pointer indicates the window limit
How to measure latency correctly 3
Back to us
• Check for gateway MTU cut
  • ping -M do -s 1473 -c 3 192.168.0.34
• Consider sent and received messages (i.e. Wsrep replicated bytes & Wsrep received bytes)
• Check kernel settings for:
  • Buffering
  • Congestion control
  • Frame utilization
• Test with netperf (e.g.)
  • netperf -H 192.168.1.51 -t TCP_RR -v 2 -l 60 -- -b 2 -r 250K -R 1M -s 250K,10M -S 10K,256K
• Check the wsrep_evs_repl_latency value in SHOW GLOBAL STATUS LIKE 'wsrep%';
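The netperf TCP_RR test above measures request/response round trips over TCP rather than ICMP echoes. The same idea can be sketched with Python's standard library only. This is not a replacement for netperf: the echo server runs on loopback purely so the example is self-contained, and the function names and the 250,000-byte message size are assumptions for illustration.

```python
import socket
import threading
import time


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes; TCP is a stream, so one recv may return less."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = conn.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_once(server: socket.socket, size: int, rounds: int) -> None:
    """Accept one client and echo back `rounds` requests of `size` bytes."""
    conn, _ = server.accept()
    with conn:
        for _ in range(rounds):
            conn.sendall(recv_exact(conn, size))


def tcp_round_trips(host: str, port: int, size: int, rounds: int) -> list:
    """Return the round-trip time in seconds of each request/response pair."""
    payload = b"x" * size
    timings = []
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(rounds):
            start = time.perf_counter()
            conn.sendall(payload)
            recv_exact(conn, size)
            timings.append(time.perf_counter() - start)
    return timings


def demo(size: int = 250_000, rounds: int = 20) -> list:
    """Run server and client on loopback; in real use the server runs on the remote node."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=echo_once, args=(server, size, rounds), daemon=True)
    worker.start()
    try:
        return tcp_round_trips("127.0.0.1", port, size, rounds)
    finally:
        worker.join(timeout=10)
        server.close()


if __name__ == "__main__":
    rtts = sorted(demo())
    print("min %.6fs  median %.6fs  max %.6fs" % (rtts[0], rtts[len(rtts) // 2], rtts[-1]))
```

Unlike ping, each round trip here carries a message of realistic size through the TCP stack, so buffering, segmentation and window behaviour all show up in the timing.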
What is the right limit?
Depends on the usage
Balance: incoming writes/s vs. consistent reads
Last chance for (virtually) Synchronous
WAN settings:
evs.inactive_check_period = PT30S;
evs.inactive_timeout = PT1M;
evs.suspect_timeout = PT40S;
evs.stats_report_period = PT3M;
evs.join_retrans_period = PT0.5S
! Don't use PING
Master_Slave, not a very good option though:
wsrep_provider_options = "gcs.fc_limit = 256; gcs.fc_factor = 0.99; gcs.fc_master_slave = YES"
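The evs.* values above are periods written in ISO 8601 style (PT30S is 30 seconds, PT1M is one minute). A small sketch that converts them to seconds, which makes them easier to compare with the latency actually measured, and that joins them into a single option string. The helper names are hypothetical, and the parser only covers the hours/minutes/seconds forms used here.

```python
import re

# Values copied from the slide.
WAN_SETTINGS = {
    "evs.inactive_check_period": "PT30S",
    "evs.inactive_timeout": "PT1M",
    "evs.suspect_timeout": "PT40S",
    "evs.stats_report_period": "PT3M",
    "evs.join_retrans_period": "PT0.5S",
}

_NUM = r"(\d+(?:\.\d+)?)"
_PERIOD = re.compile(rf"^PT(?:{_NUM}H)?(?:{_NUM}M)?(?:{_NUM}S)?$")


def period_seconds(value: str) -> float:
    """Convert a period such as PT1M or PT0.5S to seconds."""
    match = _PERIOD.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported period: {value!r}")
    hours, minutes, seconds = (float(g) if g else 0.0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def provider_options(settings: dict) -> str:
    """Join key/value pairs in the 'key = value; key = value' form used on the slide."""
    return "; ".join(f"{key} = {value}" for key, value in settings.items())


for key, value in WAN_SETTINGS.items():
    print(f"{key:28s} {value:7s} {period_seconds(value):6.1f} s")
print(provider_options(WAN_SETTINGS))
```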
Async replication kicks in
We can use almost the same models we used
The challenge is to shift from one Master-Node to a new one, or from one slave to another
Async replication ways
Standard binlog position, using XID and wsrep_last_committed
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| wsrep_last_committed | 3282552 |
+----------------------+---------+
Binlog
# at 544
#170920 19:26:56 server id 3306 end_log_pos 575 CRC32 0x3ae1edcd Xid = 3282552
• Simple to install/setup
• Nightmare to maintain
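In the example above the Xid of the binlog event has the same value as wsrep_last_committed, which is what lets a position be matched across nodes. A minimal sketch of pulling both numbers out of a mysqlbinlog line; the function name is hypothetical and the regular expression is only checked against the line shown on this slide.

```python
import re

# Sample data copied from the slide.
BINLOG_EVENT = "#170920 19:26:56 server id 3306 end_log_pos 575 CRC32 0x3ae1edcd Xid = 3282552"
WSREP_LAST_COMMITTED = 3282552


def xid_and_end_pos(line: str):
    """Return (xid, end_log_pos) from a mysqlbinlog event header, or None if absent."""
    match = re.search(r"end_log_pos\s+(\d+).*?Xid\s*=\s*(\d+)", line)
    if match is None:
        return None
    return int(match.group(2)), int(match.group(1))


xid, end_pos = xid_and_end_pos(BINLOG_EVENT)
if xid == WSREP_LAST_COMMITTED:
    print("Xid", xid, "matches wsrep_last_committed; event ends at position", end_pos)
```

Doing this by hand on every failover is the "nightmare to maintain" part.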
Async replication ways
Using GTID
• All nodes in a cluster will have the same GTID
• Moving a slave from one node to another can be automated
Existing:
Yves Trudeau solution: https://github.com/y-trudeau/Mysql-tools/tree/master/PXC
• Single link DC1->DC2
• Multiple links DC1->DC2->DC3
Conclusions
• Plan carefully your network and DC-DC connectivity
• Keep the number of nodes inside a PXC cluster to a minimum
• Test the latency on the network properly (not with ping)
• Use PXC/Galera replication between geo-distributed sites only if it is safe
• Do not hesitate to shift to Async replication
• Use existing solutions to help you manage async replication between PXCs
Q&A
Contacts
To contact Me
To follow me
http://www.tusacentral.net/
http://www.percona.com/blog/
https://www.facebook.com/marco.tusa.94
@marcotusa
http://it.linkedin.com/in/marcotusa/
“Consulting = No mission refused!”