Post on 10-Jul-2020

13 June 2019 © MARKLOGIC CORPORATION

Pete Aven, Senior Principal Solutions Engineer

@peteaven

Intro to MarkLogic Data Hub Architecture

Data is (still) in silos

MarkLogic Data Hub

ADVANCED SECURITY

SMART CURATION

UNIFIED PLATFORM

LOAD DATA AS IS

SIMPLE DEVELOPMENT

KEY PRINCIPLES

Agility in Action

Data Services First

Business Answer First

Expect and Embrace Change

Governed by Default

Deploy Anywhere

DATA AGILITY

Data Services First

Minimize up-front work by focusing on business value and working back to data

Reduce execution risk with aggressive scoping, frequent iterations, continuous feedback

Increase returns on cumulative data

Direct analogue and enabler for Agile software development

Focus on Ease and Impact (ROI)

LEADING WITH BUSINESS VALUE

Chart axes: Ease of Execution, Business Impact

Chart labels: Find Orders, Flag Fraud, Predict fraud, High impact

The Data Hub In Action

DATA SERVICES FIRST

Diagram labels: ERP, CRM, Taxonomy, Inventory; Customer, Order, Product; places, includes; wheresMyOrder, Flag Fraud, Segregate Cohorts

MarkLogic Architecture

INTERFACE LAYER: Data Services, REST API, Graph / SPARQL

QUERY LAYER: JavaScript, XQuery, SPARQL, SQL

INDEXES: Universal Index, Geospatial Index, Triple Index, Reverse Index

STORAGE LAYER: Scalability and Elasticity, ACID Transactions, Automated Failover

Data: JSON, XML, RDF, Geo, Text, Binaries

DATA LOGIC

Expect and Embrace Change

Upstream

Quality and Meaning

New Sources

Messy or unexpected data

Ambiguous or conflicting definitions

Downstream

Business Requirements

New opportunities enabled by creative reuse

Get value sooner

Experiment with less cost

Everywhere

Compliance and Governance

New regulations and enforcement

Increased threats

Sharing not hoarding

ERP

/wheresMyOrder

Customer

Collection:/acme/customers

Hierarchical, sparse, high cardinality

Precise structure to free text

Change the data, change the schema

Standard JSON or XML, text, binary

Documents Represent Data More Naturally

DATA MODEL

Document Data Model

Load as is

Universal indexing: values, full text, structure, scalar ranges, geospatial

Schema on read

Organize by collections, directories
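The "load as is" and "universal indexing" ideas above can be illustrated with a toy in-memory model. This is only a sketch of the concept, not MarkLogic's implementation or API: the `ToyHub` class, its method names, and the document URIs are all hypothetical, and only the collection name `/acme/customers` comes from the slides.

```python
from collections import defaultdict


def walk(node, path=""):
    """Yield (path, scalar) pairs for every leaf of a JSON-like value."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, f"{path}/{key}")
    elif isinstance(node, list):
        for child in node:
            yield from walk(child, path)
    else:
        yield path, node


class ToyHub:
    """Hypothetical store: documents are loaded unchanged and indexed on the way in."""

    def __init__(self):
        self.docs = {}
        self.collections = defaultdict(set)  # collection name -> document URIs
        self.values = defaultdict(set)       # (path, value) -> document URIs
        self.words = defaultdict(set)        # lowercased word -> document URIs

    def load(self, uri, doc, collections=()):
        # No schema is declared up front; whatever structure arrives is indexed.
        self.docs[uri] = doc
        for name in collections:
            self.collections[name].add(uri)
        for path, value in walk(doc):
            self.values[(path, value)].add(uri)
            if isinstance(value, str):
                for word in value.lower().split():
                    self.words[word].add(uri)

    def by_value(self, path, value):
        return set(self.values.get((path, value), ()))

    def by_word(self, word):
        return set(self.words.get(word.lower(), ()))

    def in_collection(self, name):
        return set(self.collections.get(name, ()))


hub = ToyHub()
# Two sources describe a customer with different shapes; both load as is.
hub.load("/erp/customer/1.json",
         {"name": "Ada Lovelace", "address": {"city": "London"}},
         collections=["/acme/customers"])
hub.load("/crm/customer/7.json",
         {"fullName": "Ada King", "city": "London", "tags": ["vip"]},
         collections=["/acme/customers"])

erp_londoners = hub.by_value("/address/city", "London")
anyone_named_ada = hub.by_word("ada")
all_customers = hub.in_collection("/acme/customers")
```

The structural query only matches the ERP-shaped document, while the word query and the collection lookup find both, which is the practical meaning of schema on read: differences in shape are handled at query time rather than at load time.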

Diagram labels: ERP, CRM, eCommerce; Customer, Customer; prov:derivedFrom, prov:generatedBy, prov:wasRevisionOf, rdf:type; /wheresMyOrder; e7e9879a…

Relationships

GRAPHS

Diagram labels: Customer, Order, Product; Places, Includes, Purchased

Entities are documents

Relationships are triples

- Entities related to Entities

- Entities related to Facts

- Facts related to Facts

Infer new relationships
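A minimal sketch of "relationships are triples" and "infer new relationships", in plain Python rather than SPARQL or a MarkLogic API. The predicate names echo the `acme:places`, `acme:includes`, and `acme:purchased` labels in the diagrams; the identifiers and the inference rule itself (a customer who places an order that includes a product has purchased that product) are assumptions made for illustration.

```python
# Each fact is a (subject, predicate, object) triple.
# The identifiers below are hypothetical stand-ins for entity documents.
triples = {
    ("customer-1", "rdf:type", "Customer"),
    ("order-9", "rdf:type", "Order"),
    ("product-42", "rdf:type", "Product"),
    ("customer-1", "acme:places", "order-9"),
    ("order-9", "acme:includes", "product-42"),
}


def infer_purchased(facts):
    """Assumed rule: C places O and O includes P implies C purchased P."""
    placed = [(s, o) for s, p, o in facts if p == "acme:places"]
    included = [(s, o) for s, p, o in facts if p == "acme:includes"]
    return {
        (customer, "acme:purchased", product)
        for customer, order in placed
        for included_order, product in included
        if order == included_order
    }


inferred = infer_purchased(triples)
```

Here `inferred` holds a fact that no source system stated directly. In a triple store this kind of rule would normally be expressed declaratively and evaluated by the engine; the join is written out by hand only to show what the rule does.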

Diagram labels: Derived from, Type, Type, PII, SSN

Semantic Relationships

GRAPHS

Entities are documents

Relationships are triples

- Entities related to Entities

- Entities related to Facts

- Facts related to Facts

Diagram labels: Customer, Order, Product; 73fa4dc0…, 30d623ff…, e7e9879a…; acme:places, acme:includes, acme:purchased, acme:powerOfAttorney; rdf:type, prov:generatedBy; isConcept, Is Concept, Same As, SKOS

Governed by Default

PUTTING THE “MS” BACK IN DBMS

Manage policy along with the data and metadata that it governs

Query that policy just like data to make enforcement model-driven

Automatically enforce policy in the database

Track lineage as data and policy change
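One way to picture "query that policy just like data" is to hold the policy as ordinary records and let a single generic function enforce whatever those records say. The sketch below is an assumption-laden illustration: the policy shape, the role names, and the `redact` helper are hypothetical, enforcement runs in application code rather than in the database, and lineage tracking is not covered. The PII and SSN labels are borrowed from the earlier diagram.

```python
# Policy stored as plain data (hypothetical shape), alongside the records it governs.
policies = [
    {"property": "ssn", "classification": "PII", "allowed_roles": {"compliance"}},
]


def properties_classified(policy_records, classification):
    """Query the policy like any other data: which properties carry this label?"""
    return sorted(
        p["property"] for p in policy_records
        if p["classification"] == classification
    )


def redact(doc, roles, policy_records):
    """Return a copy of doc without governed properties the caller's roles may not see."""
    governed = {p["property"]: p for p in policy_records}
    visible = {}
    for key, value in doc.items():
        policy = governed.get(key)
        # Ungoverned properties pass through; governed ones need a matching role.
        if policy is None or roles & policy["allowed_roles"]:
            visible[key] = value
    return visible


customer = {"name": "Ada Lovelace", "ssn": "000-00-0000"}

support_view = redact(customer, {"support"}, policies)
compliance_view = redact(customer, {"compliance"}, policies)
pii_properties = properties_classified(policies, "PII")
```

Because `redact` reads the policy at call time, adding a record for another sensitive property changes enforcement without touching the code, which is the sense in which enforcement becomes model-driven.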

Developer

Ummm, can you repeat that please?

Domain Expert

MODEL-DRIVEN

Data, Metadata, and Policy

Model important business concepts as needed (and not before)

Manage policy along with the data and the metadata it governs

Drive business processes and configuration from queryable policy definitions

Diagram labels: Customer, e7e9879a…

Secure by Design

GOVERNED BY DEFAULT

Confidentiality: Role-based access control and encryption at rest and in motion

Integrity: Transactional consistency and auditable trustworthiness

Availability: Elastic scale out and HA/DR

Deploy Anywhere

AGILE INFRASTRUCTURE

Align infrastructure costs with SLAs using elastic scaling

Avoid lock-in with flexible cloud and on-premises deployment

Reduce risk with automation and componentization

MarkLogic in Any Cloud

CLOUD NEUTRAL

• Proven in the cloud

• Private, hybrid, or public cloud

• AWS, Azure, and Google Cloud (and others)

• Deployment automation

MarkLogic Data Hub Service

Diagram labels: CUSTOMER LDAP; CUSTOMER VPC; VPC PEERING; SERVICE VPC (CUSTOMER ISOLATED); INGESTION & CURATION ACCESS; LOAD BALANCER; LOAD BALANCER; LOAD BALANCER (8010-13-8000); Data; D-NODES; Operational, Analytical, Curation

MarkLogic Tools: DMSDK, MLCP, REST API

Much More Than a Database as a Service

The most comprehensive out-of-the-box cloud service stack

Columns: On-Premises, IaaS, DBaaS, Data Hub Service

Stack shown in each column: DATA CENTERS, NETWORKING, STORAGE, SERVERS, VIRTUALIZATION, OS, DOCUMENT DB, GRAPH DB, RELATIONAL DB, SEARCH, ETL TOOLS, MDM, SECURITY, APPS, HARMONIZATION

Tool Chain

KEY PRINCIPLES

Agility in Action

Data Services First

Business Answer First

Expect and Embrace Change

Governed by Default

Deploy Anywhere

Thank you