MariaDB ColumnStore

"We should be talking about the analytics of things, not the internet of things."
Jim Davis, CMO, SAS
Current State of Analytics
• Traditional OLAP
o Cost to perform
o Appliances or Proprietary Solutions
• Big-Data Analytics
o Scale to perform
o Non-SQL Interfaces
• Analytics and Transaction Separation
Why MariaDB ColumnStore
Price to Performance at Scale
Data Analytics using SQL or Spark
Unified Simplicity (Transaction and Analytics under the same Roof)
Open-Source GPL2
SQL
Why Customers Choose MariaDB ColumnStore
SCALE
● Massively parallel architecture designed for big data scaling to process petabytes of data
● Read performance scales linearly with data growth
SPEED
● Exceptional performance
● Real-time response to analytics queries and high-speed data loading
SECURITY and RELIABILITY
● Data with encryption for data in motion, role-based access and audit features of MariaDB Enterprise
● Built-in high availability at access and data layers
SIMPLICITY with POWER
● Simplified management and maintenance, easy installation and scaling
● Same interface as MariaDB and MySQL, attaches to wide range of BI tools
MariaDB ColumnStore Architecture
▪ User Module: Processes SQL Requests
▪ Performance Module: Distributed Processing Engine
[Architecture diagram. Labels: Clients; User Connections; User Modules (MariaDB SQL Front End, Query Engine); Performance Modules 1, 2, 3 ... N; Columnar Distributed Data Storage]
Row-Oriented vs Column-Oriented
Row-oriented: rows stored sequentially in a file

Key  Fname     Lname  State  Zip    Phone          Age  Sales
1    Bugs      Bunny  NJ     11217  (123)938-3235  34   100
2    Yosemite  Sam    CT     95389  (234)375-6572  52   500
3    Daffy     Duck   IA     10013  (345)227-1810  35   200
4    Elmer     Fudd   CT     04578  (456)882-7323  43   10
5    Witch     Hazel  CT     01970  (567)744-0991  57   250

Column-oriented: each column is stored in a separate file. Each column for a given row is at the same offset.

Key:    1, 2, 3, 4, 5
Fname:  Bugs, Yosemite, Daffy, Elmer, Witch
Lname:  Bunny, Sam, Duck, Fudd, Hazel
State:  NJ, CT, IA, CT, CT
Zip:    11217, 95389, 10013, 04578, 01970
Phone:  (123)938-3235, (234)375-6572, (345)227-1810, (456)882-7323, (567)744-0991
Age:    34, 52, 35, 43, 57
Sales:  100, 500, 200, 10, 250
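The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration using the example table from the slide; the function names are hypothetical, and in-memory lists stand in for the per-column files. It shows that an aggregate over two columns never touches the other six, and that a full row can still be rebuilt because every column keeps a row's value at the same offset.

```python
# Column names and rows copied from the slide's example table.
COLUMNS = ["Key", "Fname", "Lname", "State", "Zip", "Phone", "Age", "Sales"]
ROWS = [
    (1, "Bugs", "Bunny", "NJ", "11217", "(123)938-3235", 34, 100),
    (2, "Yosemite", "Sam", "CT", "95389", "(234)375-6572", 52, 500),
    (3, "Daffy", "Duck", "IA", "10013", "(345)227-1810", 35, 200),
    (4, "Elmer", "Fudd", "CT", "04578", "(456)882-7323", 43, 10),
    (5, "Witch", "Hazel", "CT", "01970", "(567)744-0991", 57, 250),
]


def to_columnar(rows):
    """Pivot row tuples into one list per column.

    Offset i in every list belongs to row i (hypothetical helper;
    each list stands in for a separate column file).
    """
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def total_sales_for_state(columns, state):
    """Aggregate Sales for one State, reading only those two columns."""
    return sum(
        sales
        for st, sales in zip(columns["State"], columns["Sales"])
        if st == state
    )


def row_at(columns, offset):
    """Rebuild a full row by reading the same offset from every column."""
    return tuple(columns[name][offset] for name in COLUMNS)


if __name__ == "__main__":
    cols = to_columnar(ROWS)
    print(total_sales_for_state(cols, "CT"))
    print(row_at(cols, 2))
```

An analytic query such as "total sales per state" reads two of the eight columns here, which is the I/O saving a column-oriented layout aims for.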
Data Storage - Extents and PMs
[Diagram: Extents 1 to 8 distributed across PM 1 and PM 2; the same eight extents distributed across PM 1, PM 2, PM 3 and PM 4]
● Extent Map
○ In-memory metadata of an extent's min and max value for a column, the extent's physical block offset and the PM on which the extent resides
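One way such min/max metadata is commonly used is to skip extents that cannot contain matching rows for a range predicate. The sketch below is a minimal illustration under that assumption; the `ExtentEntry` class, the `extents_to_scan` function and all values in `EXTENT_MAP` are hypothetical and are not taken from the ColumnStore code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtentEntry:
    """One extent map entry for a single column (hypothetical layout)."""
    extent_id: int
    min_value: int      # smallest column value stored in the extent
    max_value: int      # largest column value stored in the extent
    block_offset: int   # physical block offset of the extent
    pm: str             # Performance Module on which the extent resides


def extents_to_scan(extent_map, low, high):
    """Return entries whose [min, max] range overlaps the filter [low, high].

    Extents outside the range cannot hold matching rows and are skipped.
    """
    return [e for e in extent_map if e.max_value >= low and e.min_value <= high]


# Made-up example values, for illustration only.
EXTENT_MAP = [
    ExtentEntry(1, 1, 1000, 0, "PM 1"),
    ExtentEntry(2, 1001, 2000, 8192, "PM 2"),
    ExtentEntry(3, 2001, 3000, 16384, "PM 1"),
    ExtentEntry(4, 3001, 4000, 24576, "PM 2"),
]

if __name__ == "__main__":
    # A filter such as "value BETWEEN 1500 AND 2500" only needs extents 2 and 3.
    for entry in extents_to_scan(EXTENT_MAP, 1500, 2500):
        print(entry.extent_id, entry.pm, entry.block_offset)
```

Because the map also records the PM for each extent, the same lookup tells the query layer which nodes need to take part in the scan.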
Data Ingestion
● Bulk data load
○ cpimport: CSV and Binary
○ LOAD DATA INFILE: CSV
● Apache Sqoop Integration:
○ Integration with cpimport and SQL interface
● Future Release
○ Data Streaming from MariaDB/MySQL database to MariaDB ColumnStore cluster
• via Kafka
• Avro data record
Data Ingestion - Bulk Data Load
● cpimport
○ Fastest way to load data
• Load data from CSV file
• Load data from Standard Input
• Load data from Binary Source file
○ Multiple tables can be loaded in parallel by launching multiple jobs
○ Read queries continue without being blocked
○ Successful cpimport is auto-committed
○ In case of errors, entire load is rolled back
● LOAD DATA INFILE
○ Traditional way of importing data into any MariaDB storage engine table
○ Up to 2 times slower than cpimport for large size imports
○ Either success or error operation can be rolled back
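Loading several tables in parallel by launching one cpimport job per table could be scripted roughly as follows. This is a sketch under assumptions: the command line is assumed to take the form `cpimport -s <delimiter> dbName tblName loadFile`, the database, table and file names are hypothetical, and the `runner` parameter exists only so the orchestration can be exercised without a ColumnStore installation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_cpimport_command(database, table, csv_path, delimiter=","):
    """Build the argument list for one cpimport job.

    Assumed form: cpimport -s <delimiter> dbName tblName loadFile.
    Check the flags against the installed version before relying on this.
    """
    return ["cpimport", "-s", delimiter, database, table, csv_path]


def run_job(command):
    """Run one job and return its exit code (0 is treated as success)."""
    return subprocess.run(command, check=False).returncode


def load_tables_in_parallel(database, table_files, runner=run_job):
    """Launch one job per table concurrently; return {table: exit_code}.

    table_files maps a table name to the CSV file that feeds it.
    """
    tables = list(table_files)
    commands = [
        build_cpimport_command(database, table, table_files[table])
        for table in tables
    ]
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        exit_codes = list(pool.map(runner, commands))
    return dict(zip(tables, exit_codes))


if __name__ == "__main__":
    # Hypothetical names; print the commands instead of running them.
    files = {"orders": "/tmp/orders.csv", "items": "/tmp/items.csv"}
    for table, path in files.items():
        print(" ".join(build_cpimport_command("sales", table, path)))
```

Since a failed cpimport rolls back its entire load, a non-zero exit code for one table means only that table's job needs to be rerun.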
MariaDB ColumnStore 1.0
Analytics: In-database analytics with complex joins, windowing functions and UDFs; out-of-box BI tools connectivity; analytics integration with R
Scale: Columnar, massively parallel; linear scalability
Performance: High-performance ad hoc analysis; consistent query response time
High Availability: Built-in redundancy and high availability
Ease of Use: ANSI SQL compatible; ACID compliant; no indexes, no materialized views; no manual partitioning
Data Ingestion: CONNECT Engine; Create Table as Select; high-speed parallel data load and extract
Security: SSL support, Audit Plugin, Authentication Plugin, Role Based Access
Deployment Options: On premise, AWS, Hadoop
Use Case: Scaling Big Data Analytics
• Harvest new value from large historical data sets by deriving new insights
• Support growth in your business while continuing to deliver high service levels for data analytics
[Chart: Rows / Data Size Scope. Rows axis: 1; 100; 10,000; 1,000,000; 100,000,000; 10,000,000,000; 100,000,000,000. Data size axis: 10-100 GB; 100-1000 GB; 1-10 TB; 10-100 TB; ... PB. Regions: MariaDB Enterprise OLTP; MariaDB Enterprise / Enterprise OLAP]
Use Case: Scaling Big Data Analytics
Business Challenge
● An organization is generating a large amount of operational data
● Multiple terabytes of historical data
● With growth in business and in operational data
○ Analytics query performance degrades
○ Impractical to do analytics
MariaDB Solution
● Put past data into MariaDB ColumnStore
● As data grows
● Perform analytics without performance degradation
● Linear scalability with data growth
[Diagram: steps 1, 2, 3; MariaDB ColumnStore 1.0; Add new node(s)]
Use Case: Discover Insight
● Uncover new business opportunity with data exploration and analytics on petabyte data volumes
● Generate real-time insights to inform and enhance live customer interactions
Use Case: Discover Insight
Global Commercial Aviation Manufacturer
Challenges
● Need to analyze real-time and historical flight parameter data
● Too time-consuming to perform analytics with current toolset
● Most data analysts have SQL background
Objectives:
● Maintain flight safety - accurately predict part replacement
● Provide high service levels and minimize cost - proactively plan equipment maintenance and retirement
[Diagram. Labels: Historical data; Real-time in-flight performance data; Micro-batch upload of real-time flight performance into MariaDB ColumnStore; Analytics; Data Scientist; Familiar SQL Interface; Timely maintenance forecast, part replacement, flight retirement]
• Complex-join, aggregation and windowing functions
• High-speed real-time performance
The company plans to sell this solution as a service to commercial airliners
Use Case: Accelerated Analytics with SQL & Spark
● Familiar SQL interfaces democratize access to big data for a larger user base
● Attach wide range of BI tools via MariaDB/MySQL connectors
● Getting most value out of big data while minimizing Opex cost
● Leverage Hadoop deployments
Use Case: Accelerated Analytics with Hadoop
Business Challenge
● Large amount of data in Hadoop
● Hadoop is suitable for
○ batch processing
○ Transforms via Map-Reduce programming
● Real-time analytics on Hadoop
○ Speed cannot meet business requirement with the Hadoop toolset
● Shortage of Hadoop skills for Data Scientist/BA
○ SQL interfaces on Hadoop tools are not mature
MariaDB Solution
● MariaDB ColumnStore OLAP can run on premise, on cloud or on Hadoop cluster
● Ingest data from Hadoop
● Mature ANSI-SQL compliance
● Stellar performance: 70 to 80 times faster than SQL-on-Hadoop counterparts Hive, HBase and Impala
● Mature interfaces
[Diagram. Labels: MapReduce; HBase; Pig/Hive; MariaDB ColumnStore; Hadoop Distributed File System; Batch Processing; High Performance analytics]
Use Case: Simplifying Big Data Management
● Improved DBA productivity
● Familiar SQL interfaces democratize access to big data for a larger user base
● Reduced operational complexity
● Getting most value out of big data while minimizing DBA Opex cost
Use Case: Simplifying Big Data Management
Business Challenge
Complexity of data management increases as data volume grows
● Tedious to keep up with indexes and partitioning as data grows
● Scaling-out or scaling-up management
● Moving operational data to big data analytics platform in real-time
MariaDB Solution
● MariaDB ColumnStore
● Liberation from index management
● Automatic partitioning
● Easy to grow
● Micro-batch bulk load for real-time data-flow
[Diagram. Labels: Source, Source, Source; cpimport; UM Node; PM Node, PM Node, PM Node]
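The micro-batch idea, grouping a continuous flow of records into small bulk loads, can be sketched as below. This is a minimal illustration, not the product's loader: the function names are hypothetical, and the CSV text it produces is simply the kind of payload a bulk loader reading from standard input could consume.

```python
import csv
import io
from itertools import islice


def micro_batches(records, batch_size):
    """Yield lists of at most batch_size records from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def batch_to_csv(batch):
    """Serialize one batch as CSV text, one record per line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(batch)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical operational records: (id, status).
    stream = [(1, "ok"), (2, "ok"), (3, "late"), (4, "ok"), (5, "late")]
    for batch in micro_batches(stream, batch_size=2):
        # Each payload would be handed to one bulk load job.
        print(repr(batch_to_csv(batch)))
```

The batch size trades latency against per-load overhead: smaller batches make data visible sooner, larger ones amortize the cost of starting each load.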
MariaDB ColumnStore Roadmap
First release
• MariaDB ColumnStore (Porting of InfiniDB on MariaDB 10.1)
• Amazon EBS support
• Create Table Like / As Select
Future Releases
• Spark Integration
• Data Streaming integration with MaxScale
• Native API for columnar file
• Join and Filter performance optimization
• ROLLUP, CUBE in MariaDB ColumnStore
• AS OF implementation in MariaDB Server
• CONNECT Engine support in MariaDB Server
• SQL Editor (OSS or 3rd party partner)
Subscription offering
Learn more about MariaDB ColumnStore
● BETA release in Q4 2016.
● Sign up for notification of BETA availability today
● Product Page https://mariadb.com/products/mariadb-columnstore

Q&A

Thank You