Date posted: 14-Dec-2015
Uploaded by: stuart-robards
The Only Operational Database Technology for Mission-Critical Big Data Applications
Paul Preuveneers – Principal Technologist
Lee Pollington – Principal Consultant
This document is CONFIDENTIAL and its circulation and use are RESTRICTED. © 2012 KPMG LLP, a UK limited liability partnership, is a subsidiary of KPMG Europe LLP and a member firm of the KPMG network of independent member firms affiliated with KPMG International, a Swiss cooperative. All rights reserved.
Agenda
•Big Data and MarkLogic
•What is MarkLogic?
•MarkLogic in Financial Services
•MarkLogic Integration Points (Connectors / Toolkits)
•Volume
• Petabyte / Exabyte
• Billions of items
• Social Media
• Machine data
• Data processes producing data
•Velocity
• 10Ks of transactions per second
• In & out
• Streams
• Bulk processing
•Complexity / Variability
• Patterns
• Inference
• Unstructured
• Disparate events
• Relationships
•Variety
• Varied sources
• Varied data types
• Changing data types
•Value
• Value from decision support
• Value from operational efficiencies
Agenda
•Big Data and MarkLogic
•What is MarkLogic?
•MarkLogic in Financial Services
•MarkLogic Integration Points (Connectors / Toolkits)
What is MarkLogic Server?
•Special Purpose DBMS for poly-structured information, with enterprise expectations
• ACID transactions
• Backup, Full/Partial Replication, Distributed Txns
•Search Engine Kernel, with enterprise expectations
• Full text
• Faceted navigation, at massive scale
• Boolean, proximity, stemming, tokenization, decompounding, case, diacritics, language…
•Application Server
• HTTP (including RESTful)
• XCC Java/.NET
• WebDAV
What makes MarkLogic DBMS Special?
•Not Relational (RDBMS)
•XML
• The Only Data Model Required
• Schema Agnostic
• Text a First-class Citizen among Data Types
• XQuery/XSLT
•Optimized Search Engine Algorithms
•Very Low DBA Overhead (0.5 FTE / 100 hosts)
•5-Minute Install
•5-Minute Scale-Out
•Database and Search Engine are the same
What makes MarkLogic Search Special?
•Transactional: Enterprise Scale (no index latency)
•Unicode (Internationalization)
•Multiple Query Types
• Analytics: Aggregation, Facets & Ranges, Co-occurrence, Geospatial
• Text Search: Boolean, Stemming, Word Lexicons, Dictionary & Thesauri
• Alerting: Profiles, Alerts, Filters, Tipping, Selectors, “Triggers” …
• Powerful Search Combination (e.g. Text + Analytics + Alerting)
•Processing Near the Data (fast search, low bandwidth)
• Database and Search Engine are the same
Search: Universal Index
[Diagram: each indexed term points to a term list of document references.]
•Terms shown: “accelerating”, “creation”, “content”, “application”, “agility”, <article>, <article> / <title>, product: MarkLogic
•Term lists shown:
• 123, 127, 129, 152, 344, 791 . . .
• 122, 125, 126, 129, 130, 167 . . .
• 123, 126, 130, 142, 143, 167 . . .
• 123, 130, 131, 135, 162, 177 . . .
• 126, 130, 167, 212, 219, 377 . . .
•Document References: 126, 130, 167, …
•Also shown: Range Indexes, Geospatial
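The term lists on this slide can be read as an inverted index: each term maps to a sorted list of document references, and a query over several terms is resolved by intersecting those lists. The following is a minimal Python sketch of that idea, not MarkLogic's implementation. The term lists are copied from the slide, but which term owns which list is an assumption made purely for illustration, since the slide text does not preserve the pairing.

```python
from typing import Dict, List

# Term lists copied from the slide. The term-to-list pairing is an
# assumption for illustration only.
TERM_LISTS: Dict[str, List[int]] = {
    "creation": [122, 125, 126, 129, 130, 167],
    "content": [123, 126, 130, 142, 143, 167],
    "agility": [126, 130, 167, 212, 219, 377],
}


def intersect(a: List[int], b: List[int]) -> List[int]:
    """Merge-style intersection of two sorted lists of document ids."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search_all(terms: List[str], index: Dict[str, List[int]]) -> List[int]:
    """Return ids of documents that contain every term (boolean AND)."""
    if not terms:
        return []
    # Start from the shortest list so intermediate results stay small.
    lists = sorted((index.get(term, []) for term in terms), key=len)
    result = lists[0]
    for other in lists[1:]:
        result = intersect(result, other)
    return result


matches = search_all(["creation", "content", "agility"], TERM_LISTS)
```

With these three lists the intersection is 126, 130, 167, which matches the "Document References" shown on the slide.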
MarkLogic Can Scale
•Scale Up: Typically 1 TB+ XML per Server
•Scale Out: Low Hundreds(++) of Servers in a Cluster
•Commodity Hardware
• 2-CPU x 6-core/hyperthreaded
• 32+ GB RAM
• 3x disk: local mount with failover
•OS
• Linux RHEL 5
• Solaris 10
• Windows 2003/8 (XP/Vista/7 for Dev)
Shared-Nothing Cluster
[Diagram: hosts arranged in an AppServer layer and a Data layer.]
•AppServer: E Host 1, E Host 2, E Host 3
•Data: D Host 4, D Host 5, D Host 6 … D Host k
•Partitions: partition1, partition2, partition3, partition4 … partitionm
•HA&DR
•Same Code-base
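The shared-nothing layout on this slide can be sketched in a few lines of Python: each D host owns its partition outright, and any E host answers a query by fanning it out to every D host and merging the results. This is an illustrative model only. The class names `DHost` and `EHost` are hypothetical, and the hash-based document placement is an assumption, not a description of how MarkLogic assigns documents to partitions.

```python
import hashlib
from typing import Dict, List


class DHost:
    """Data host: owns one partition outright; nothing is shared with peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.partition: Dict[str, str] = {}  # document URI -> text

    def search(self, word: str) -> List[str]:
        """Return URIs in this partition whose text contains the word."""
        return [uri for uri, text in self.partition.items() if word in text.split()]


class EHost:
    """Evaluator host: routes writes and fans queries out to every D host."""

    def __init__(self, d_hosts: List[DHost]) -> None:
        self.d_hosts = d_hosts

    def _owner(self, uri: str) -> DHost:
        # Assumption: hash-based placement, chosen only for illustration.
        digest = hashlib.sha256(uri.encode("utf-8")).digest()
        return self.d_hosts[int.from_bytes(digest[:8], "big") % len(self.d_hosts)]

    def insert(self, uri: str, text: str) -> None:
        self._owner(uri).partition[uri] = text

    def search(self, word: str) -> List[str]:
        hits: List[str] = []
        for host in self.d_hosts:  # scatter the query to each partition
            hits.extend(host.search(word))
        return sorted(hits)  # gather and merge


d_hosts = [DHost(f"D Host {n}") for n in (4, 5, 6)]
e_host_1, e_host_2 = EHost(d_hosts), EHost(d_hosts)
for n in range(12):
    e_host_1.insert(f"/trades/{n}.xml", "swap" if n % 2 == 0 else "option")
swap_hits = e_host_2.search("swap")
```

Because the E hosts hold no data themselves, a document written through one E host is visible through any other, and each document lives on exactly one D host.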
Agenda
•Big Data and MarkLogic
•What is MarkLogic?
•MarkLogic in Financial Services
•MarkLogic Integration Points (Connectors / Toolkits)
Financial Services Solutions
Highly Transactional
• Operational Data Store / Trade Store
Content Aggregation & Discovery
• ISDA Contract Analysis (Electronic & Paper)
• Document Analysis (e.g. Sales Process, Financial Directives)
• Situational Awareness
• Customer On-Boarding
Content Publishing
• Research / Policy Authoring & Distribution
Operational Data Store / Trade Store
What is it?
- High Volume Trades (Derivatives, Equities, FX etc.) in siloes
- Mostly represented in XML (e.g. FpML, FIXML)
- Point-in-time queries (e.g. exposure by counterparty)
- Risk Management (understand exposure, auditing)
Why are we good at it?
- High Performance with Native XML compared to RDBMS
- We are a transactional DB (ACID + business continuity)
- Less hardware required / commodity servers
- No shredding of XML (lowers risk of corruption)
- Can aggregate over multiple schemas
- Easily accommodate new schemas, changes in schema
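To make "aggregate over multiple schemas" and "exposure by counterparty" concrete, here is a small Python sketch that sums exposure across trade documents of two different shapes without first shredding them into tables. Everything in it is hypothetical: the documents are heavily simplified stand-ins rather than real FpML or FIXML, the element and attribute names are invented, and the date filter is only a rough approximation of a point-in-time query.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical, heavily simplified trade documents in two different shapes.
# They are NOT real FpML or FIXML; all names here are invented.
TRADES = [
    '<trade date="2012-01-10"><counterparty>Bank A</counterparty>'
    "<notional>100</notional></trade>",
    '<trade date="2012-01-15"><counterparty>Bank B</counterparty>'
    "<notional>50</notional></trade>",
    '<Order TrdDt="2012-02-01" Cpty="Bank A" Amt="25"/>',
]


def read_trade(doc: str) -> Tuple[str, str, float]:
    """Return (trade date, counterparty, amount) for either document shape."""
    root = ET.fromstring(doc)
    if root.tag == "trade":
        return (
            root.get("date"),
            root.findtext("counterparty"),
            float(root.findtext("notional")),
        )
    if root.tag == "Order":
        return root.get("TrdDt"), root.get("Cpty"), float(root.get("Amt"))
    raise ValueError(f"unknown document shape: {root.tag}")


def exposure_by_counterparty(docs: Iterable[str], as_of: str) -> Dict[str, float]:
    """Sum amounts per counterparty for trades dated on or before as_of."""
    totals: Dict[str, float] = defaultdict(float)
    for doc in docs:
        date, counterparty, amount = read_trade(doc)
        if date <= as_of:  # ISO-8601 dates compare correctly as strings
            totals[counterparty] += amount
    return dict(totals)


end_of_january = exposure_by_counterparty(TRADES, "2012-01-31")
end_of_year = exposure_by_counterparty(TRADES, "2012-12-31")
```

Supporting a new message shape in this sketch means adding one branch to `read_trade`; the stored documents themselves are left untouched.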
Operational Data Store / Trade Store
Example: JP Morgan Chase ODS
- Live for 12+ months
- 2.25 million OTC Derivatives (450+ million documents)
- Strategic platform mandated for core transaction processing
- Short-listed for Best Investment Banking Initiative at The Banking Technology Awards 2011
- Agile onboarding of new Derivatives products
- Huge reduction in time to process FO XML messages
- 20 Sybase systems replaced with 3-Node MarkLogic cluster
It's a Trade Processing Story
Started with Derivatives
Natural fit with documents
Complex instruments, “low volume” instruments
It’s a trade workflow engine
Enterprise Service Bus / Component architecture
New products
Modifications to existing products
Securities had a new challenge for us
ISDA Contract Analysis
What is it?
- Swaps / Derivatives Contracts
- Risk Management (understand exposure)
- Effect of Change (e.g. credit rating, termination events)
Why are we good at it?
- Contracts are combination data/text
- Front-end solutions like Exari use Word for contract authoring but output structured XML
- Good query functions for filtering and aggregation of exposure as well as other what-if scenarios
Where do we need help?
- If in paper form, OCR and enrichment is required. This is hard, time-consuming and costly (up to $150 per doc for managed service)
- Most contracts are in paper form (90+ percent)
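A what-if scenario of the kind described above ("effect of change, e.g. credit rating, termination events") might look like the following Python sketch. It asks which contracts with a given counterparty would hit a rating-based termination event after a downgrade, and how much exposure is affected. The contract XML, the `ratingBelow` event type, and the simplified rating scale are all hypothetical, invented for illustration; real ISDA documentation is far richer.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Tuple

# Simplified rating scale, best to worst (an assumption for illustration).
RATING_ORDER = ["AAA", "AA", "A", "BBB", "BB", "B"]

# Hypothetical structured contracts; element names are invented.
CONTRACTS = [
    '<contract id="c1"><counterparty>Bank A</counterparty><exposure>10</exposure>'
    '<terminationEvent type="ratingBelow">A</terminationEvent></contract>',
    '<contract id="c2"><counterparty>Bank A</counterparty><exposure>20</exposure>'
    '<terminationEvent type="ratingBelow">BB</terminationEvent></contract>',
    '<contract id="c3"><counterparty>Bank B</counterparty><exposure>5</exposure>'
    '<terminationEvent type="ratingBelow">A</terminationEvent></contract>',
]


def triggered_by_downgrade(
    docs: Iterable[str], counterparty: str, new_rating: str
) -> List[Tuple[str, float]]:
    """Return (contract id, exposure) pairs whose rating trigger would fire."""
    new_rank = RATING_ORDER.index(new_rating)
    hits: List[Tuple[str, float]] = []
    for doc in docs:
        root = ET.fromstring(doc)
        if root.findtext("counterparty") != counterparty:
            continue
        for event in root.findall("terminationEvent"):
            threshold_rank = RATING_ORDER.index(event.text)
            # The trigger fires when the new rating is worse than the threshold.
            if event.get("type") == "ratingBelow" and new_rank > threshold_rank:
                hits.append((root.get("id"), float(root.findtext("exposure"))))
                break
    return hits


affected = triggered_by_downgrade(CONTRACTS, "Bank A", "BBB")
affected_exposure = sum(exposure for _, exposure in affected)
```

In this example only contract c1 is triggered, because BBB is below its threshold of A but not below c2's threshold of BB.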
Document Analysis (e.g. Sales Process, Financial Directives)
What is it?
- Making sense of poly-structured data (avoid BIG fines)
- Extracting patterns and trends (e.g. did we say the right thing to our customer at the right time? / PPI mis-selling)
- Developing value calculations in hard-to-handle formats (i.e. aggregating and unlocking the calculations in Excel)
Why are we good at it?
- Good conversion tools for PDF, MS Office etc.
- Great full-text search to analyse converted documents
- Inclusion of external content where applicable (RSS, Social Media, Web Sites)
- Group individual Excel spreadsheets for powerful analysis
Where do we need help?
- Enrichment often requires substantial domain expertise
Situational Awareness
What is it?
- Trading Decision Support
- Amalgamation of internal/external poly-structured data
- Heavy geospatial element
- Analysis across datasets (vessels, pipes, weather, RSS)
Why are we good at it?
- Quick take-up of new sets of data
- ML is good at geospatial queries
- ML is good at incorporating external data (web, RSS etc.)
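A typical geospatial question in this setting is "which vessels are within a given distance of a point?". The Python sketch below answers it with the haversine great-circle formula. It only illustrates the kind of query involved and says nothing about how MarkLogic's geospatial indexes work; the vessel names and coordinates are made up.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(
    points: Dict[str, Tuple[float, float]],
    centre: Tuple[float, float],
    radius_km: float,
) -> List[str]:
    """Return the names of points lying within radius_km of the centre."""
    lat0, lon0 = centre
    return sorted(
        name
        for name, (lat, lon) in points.items()
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    )


# Hypothetical vessel positions as (latitude, longitude).
VESSELS = {
    "vessel-1": (51.95, 1.30),
    "vessel-2": (51.90, 1.40),
    "vessel-3": (40.70, -74.00),
}

nearby = within_radius(VESSELS, centre=(51.95, 1.35), radius_km=10.0)
```

The same filter could be combined with other datasets named on the slide (pipes, weather, RSS items) as long as each record carries a position.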
Customer On-Boarding
What is it?
- Content Aggregation from multiple CMS
- KYC / Holistic view of customer (good communication)
- Avoid duplication of effort (faster on-boarding)
- Rapid search and retrieval
Why are we good at it?
- Feature-rich, fast search at volume
- 30 Digits allows us to extract from multiple CMS
- Flexible metadata-handling (dynamic facets)
- Able to apply security model from underlying CMS
Where do we need help?
- Lots of content is image-based / requires OCR and data enrichment
Research / Policy Authoring & Distribution
What is it?
- Template-driven authoring
- Ensuring consistency, validation and component re-use
- Dynamic Publishing (VISA, Morgan Stanley, Citigroup)
Why are we good at it?
- Easy template creation and maintenance
- Great integration with MS Office
- Componentisation and versioning easy in ML
- Dynamic assembly based on role/geography etc.
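"Dynamic assembly based on role/geography" can be pictured as filtering reusable components by audience metadata at publication time. The Python sketch below shows one way to model that. The `Component` type, its fields, and the sample policy text are hypothetical, and the rule that an empty set means "visible to everyone" is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Component:
    """A reusable piece of content with optional audience restrictions."""

    text: str
    roles: FrozenSet[str] = frozenset()    # empty set: visible to every role
    regions: FrozenSet[str] = frozenset()  # empty set: visible in every region


def assemble(components: Iterable[Component], role: str, region: str) -> List[str]:
    """Return, in order, the component texts this reader is allowed to see."""
    return [
        c.text
        for c in components
        if (not c.roles or role in c.roles)
        and (not c.regions or region in c.regions)
    ]


# Hypothetical policy document built from four components.
POLICY = [
    Component("Introduction"),
    Component("UK regulatory notice", regions=frozenset({"UK"})),
    Component("Trader-only guidance", roles=frozenset({"trader"})),
    Component("US trader limits", roles=frozenset({"trader"}), regions=frozenset({"US"})),
]

uk_trader_view = assemble(POLICY, role="trader", region="UK")
us_analyst_view = assemble(POLICY, role="analyst", region="US")
```

Each component is stored once and appears in as many assembled views as its metadata allows, which is the component re-use the slide refers to.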
Thank You – Questions?