High Performance Enterprise Data Propagation Russell Donovan.

Date post: 11-Jan-2016
Upload: tyrone-floyd
Transcript
Page 1: High Performance Enterprise Data Propagation Russell Donovan.

High Performance Enterprise Data Propagation

Russell Donovan

Page 2: High Performance Enterprise Data Propagation Russell Donovan.

BMC Company Profile

Established in 1980
Leader in Application Management
Estimated FY2000 Revenues of $1.8B
Over 6,000 Employees
Development Labs in Austin (TX), Conyers (GA), Houston, San Jose, Sunnyvale (CA), Waltham (MA), Germany, Israel, Singapore
Market Coverage in Over 50 Countries
Member of the S&P 500

Page 3: High Performance Enterprise Data Propagation Russell Donovan.

BMC Software e-Business Availability

Provides application management solutions that ensure the availability, performance, and recovery of business-critical applications.

We call this application service assurance and it means that the applications companies and their customers rely on will be there when they need them.

e-vailability - We Guarantee Our Solutions!

Page 4: High Performance Enterprise Data Propagation Russell Donovan.

Enterprise Data Propagation (EDP): Requirement For All Enterprises

Need to synchronize data between legacy systems and distributed relational databases for:

Data warehousing, operational data stores, data mining

e-Business applications access to legacy data

Enterprise application integration

Distributed enterprises, ERP solutions, Acquisitions

70% of corporate data in IMS, VSAM, DB2

[Pie chart segments: DB2, VSAM, IMS, Other]

Need high performance solutions
Need near real time solutions

Page 5: High Performance Enterprise Data Propagation Russell Donovan.

Data Propagation - Strategies For Synchronizing Multiple Copies of Data

Copy

Unload/Load

Distributed Database 2 Phase Commit

SQL Query

Source Target

Change Capture --- With Asynchronous Propagation

Page 6: High Performance Enterprise Data Propagation Russell Donovan.

Key Challenges Implementing a Data Warehouse

Data Management Review Survey

Business rule analysis

Managing End User Expectation

Business data modeling

Reliability and integrity of data

Data acquisition

Meta Data management

Managing Management Expectation

Database performance

Page 7: High Performance Enterprise Data Propagation Russell Donovan.

Data Warehouse Implementations

For Customers With
  Large operational databases
  High transaction rates
  24x7 operations requirements

Critical Management Issues
  Availability of operational systems
  Performance of operational transactions
  Maintaining service levels
  Increasing volumes of data
  Time required to load and refresh data warehouse
  Quality, currency & accuracy of decision making data

Page 8: High Performance Enterprise Data Propagation Russell Donovan.

Building Data Warehouses: A Perspective

[Diagram labels: Operational Databases; VSAM Files; Operational Database Images; Subject Oriented Operational Images; Time Variant, Subject Oriented Data Warehouses; Data Marts]

Page 9: High Performance Enterprise Data Propagation Russell Donovan.

Building Data Warehouses: A Perspective

[Diagram labels. Sources: VSAM, IMS, DB2. Mainframe Tools: Prism, ETI, Carleton, SAS, Platinum. Targets: Operational Data Store, Data Warehouse, Data Marts. Query Tools for End-Users: Brio, Business Objects, COGNOS, Microstrategy.]

Page 10: High Performance Enterprise Data Propagation Russell Donovan.

Building Data Warehouses: A Perspective

[Diagram labels. Sources: VSAM, IMS, DB2. Mainframe Tools: Prism, ETI, Carleton, SAS, Platinum. Distributed Systems Tools: Informatica, Constellar D2K, Sagent, Ardent. Integration Area. Targets: Operational Data Store, Data Warehouse, Data Marts. Query Tools for End-Users: Brio, Business Objects, COGNOS, Microstrategy.]

Page 11: High Performance Enterprise Data Propagation Russell Donovan.

Building Data Warehouses: A Perspective

[Diagram labels. Sources: VSAM, IMS, DB2. BMC Solution: ChangeDataMove, Bulk DataMove. Mainframe Tools: Prism, ETI, Carleton, SAS, Platinum. Distributed Systems Tools: Informatica, Constellar D2K, Sagent, Ardent. Integration Area. Targets: Operational Data Store, Data Warehouse, Data Marts. Query Tools for End-Users: Brio, Business Objects, COGNOS, Microstrategy.]

Page 12: High Performance Enterprise Data Propagation Russell Donovan.

Change Data Propagation: A Perspective

Change Data Propagation Is Preferred When:
  Databases are large and bulk move would take too long
    Batch window limitations
    Database availability limitations
  Support for 24 x 7 is a requirement of operational application
  Minimum latency “Near-real-time” is required in target database!
  Currency of information in target database is important
  Small percentage of a large database has changed
  Need to reduce network traffic by transmitting only data changes

[Diagram labels: Source, Target]

Page 13: High Performance Enterprise Data Propagation Russell Donovan.

Transaction Based Change Data Propagation

Synchronous Data Propagation
  Original update waits until all targets are updated
  Single, global transaction with multi-site, coordinated commit processing

Asynchronous Data Propagation
  Propagation of updates occurs asynchronously to the originating transaction
  Minimizes resource consumption at source
  Minimizes impact on source transaction response times

[Diagram labels, one pair per mode: Source, Target]

Page 14: High Performance Enterprise Data Propagation Russell Donovan.

Synchronous vs Asynchronous Change Data Propagation

Synchronous 2 Phase Commit
  Source transaction completes when all databases are updated
  Advantages:
    Real time propagation
    All sites always synchronized
  Disadvantages:
    Transaction response time
    Data availability impact
    System resiliency
    Usually not practical

Asynchronous Data Propagation
  Source transaction does not wait for target databases to be updated
  Advantages:
    Minimum performance impact
    Availability
    Autonomy
    Recoverability
  Disadvantages:
    Target location updates may be delayed
    All sites not always synchronized

Page 15: High Performance Enterprise Data Propagation Russell Donovan.

Asynchronous Change Capture: Implementation Considerations

Trigger Based
  Triggers used to capture changes to database records
  Incremental updates collected in staging tables
  Significant resource consumption for triggers and logging
  Typically low volume applications (< 20 transactions/second)

Log Exit Based
  Increased logging in operational environment
  Increased response times for source transactions
  Increased resource consumption
  Log management issues

Log Post Process Based
  Increased logging in operational environment
  Log management issues
  Long latency interval cannot support near real time
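To make the trigger-based option concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `account` and `account_changes` tables are hypothetical and stand in for an operational table and its staging table; this is not how any BMC product works, only an illustration of why every source update costs an extra write.

```python
import sqlite3

def build_demo() -> sqlite3.Connection:
    """Create a source table, a staging table, and capture triggers (all hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
        -- Staging table: one row per captured change, numbered in capture order.
        CREATE TABLE account_changes (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            op TEXT, id INTEGER, balance INTEGER);
        CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
            INSERT INTO account_changes (op, id, balance)
            VALUES ('I', NEW.id, NEW.balance);
        END;
        CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
            INSERT INTO account_changes (op, id, balance)
            VALUES ('U', NEW.id, NEW.balance);
        END;
        CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
            INSERT INTO account_changes (op, id, balance)
            VALUES ('D', OLD.id, OLD.balance);
        END;
    """)
    return conn

conn = build_demo()
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

# Each source statement produced one extra write into the staging table.
changes = conn.execute(
    "SELECT op, id, balance FROM account_changes ORDER BY seq").fetchall()
print(changes)
```

The staging rows are written inside the source transaction, which is the resource cost the slide refers to.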

Page 16: High Performance Enterprise Data Propagation Russell Donovan.

Enterprise Data Propagation (EDP): The BMC Solution

A Data Propagation Management System
  Common look and feel
  Integrated transformations and mappings
  Integrated recovery/restart

A single point of access for managing legacy data propagation across the enterprise
  Efficient change capture
  Basic data transformation
  High performance data movement
  High performance utilities

[Diagram labels: VSAM, IMS, DB2, Fast Path; Bulk Data; Change Data; Operational Data Store]

Page 17: High Performance Enterprise Data Propagation Russell Donovan.

ChangeDataMove: Product Positioning

Positioning
  ChangeDataMove is a high performance, efficient, change data propagation solution, which captures changes made to IMS, Fast Path, VSAM, and DB2 databases, and propagates those changes to the most prevalent relational databases.

What It Does
  Transaction-based data propagation
  Supports high volume production applications with hundreds of transactions per second
  Supports ‘near real-time’ as well as scheduled data propagation

Advantages
  A data propagation system (complete solution vs a point product)
  Highly efficient change capture does not impact applications
  Only solution for IMS, Fast Path and VSAM that does not require logging
  Optionally integrated with DataMove for bulk data movement

Page 18: High Performance Enterprise Data Propagation Russell Donovan.

Change Data Propagation for IMS and VSAM

Synchronous Change Capture
  Transparent high performance change capture
  Minimum impact on source system logging, CPU & user response time
  Data is available immediately for asynchronous propagation

Asynchronous Data Propagation
  Data propagated within context of original transaction
  Updates applied in proper sequence
  Inter and intra-table consistency
  Source and target(s) consistent within transaction boundaries

[Diagram labels: updates 1, 2, 3 at the source; EDP Log; EDP Apply; updates 1, 2, 3 at the targets]

Not affected by network delays or slow remote processors
Supports “Near Real Time” and/or Scheduled Propagation
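The combination described on this slide, where the source commits without waiting and the target still sees updates in sequence and in whole transactions, can be sketched with a queue and a single apply worker. Every name here (`change_log`, `target_db`, `commit_source_transaction`) is hypothetical and chosen for illustration; the real EDP Log and EDP Apply components are not documented in this deck.

```python
import queue
import threading

change_log = queue.Queue()   # stands in for the change log between capture and apply
target_db = {}               # stands in for the target database
applied_order = []           # records the order in which transactions reached the target

def commit_source_transaction(tx_id, updates):
    """Source side: hand over the whole unit of work and return without waiting."""
    change_log.put((tx_id, list(updates)))

def apply_worker():
    """Target side: apply one unit of work at a time, in commit order."""
    while True:
        item = change_log.get()
        if item is None:          # sentinel: no more work
            break
        tx_id, updates = item
        staged = dict(target_db)  # build the new state off to the side
        for key, value in updates:
            staged[key] = value
        target_db.clear()         # then swap it in, so a unit of work is never half applied
        target_db.update(staged)
        applied_order.append(tx_id)

worker = threading.Thread(target=apply_worker)
worker.start()
commit_source_transaction(1, [("acct-A", 100), ("acct-B", 50)])
commit_source_transaction(2, [("acct-A", 70), ("acct-B", 80)])
commit_source_transaction(3, [("acct-A", 0)])
change_log.put(None)
worker.join()
print(applied_order, target_db)
```

A single worker keeps the sequence trivially correct; a slow target only lengthens the queue, it never delays the source call.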

Page 19: High Performance Enterprise Data Propagation Russell Donovan.

IMS Change Capture

Resides within the IMS environment
Captures DL/I calls as they occur
Supports IMS/TM (MPP, BMP), Fast Path, CICS DBCTL, Batch DL/I
Commits updates at transaction or job (batch) end
Based on BMC Software’s CHANGE RECORDING FACILITY

[Diagram labels: User Application; IMS Subsystem; IMS Database; ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]

Page 20: High Performance Enterprise Data Propagation Russell Donovan.

CICS/VSAM Change Capture

Captures changes at each Get, Put & Erase request
Utilizes CICS TRUE, File, and Re-sync exits
Resides as functional part of CICS address space
Participates in two phase commit with CICS transaction
Updates are committed when transaction commits

[Diagram labels: User Application; CICS Subsystem; VSAM Database; ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]

Page 21: High Performance Enterprise Data Propagation Russell Donovan.

VSAM Batch Change Capture

JRNAD (journaling) exit dynamically activated
ECCR resides within the batch address space
UOW is complete when application closes VSAM file
Based on BMC Software’s RECOVERY PLUS for CICS/VSAM product.

[Diagram labels: User Application; Batch Address Space; VSAM Database; ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]

Page 22: High Performance Enterprise Data Propagation Russell Donovan.

DB2 MVS Change Capture

Requires DB2 change data capture be activated
Reads log records via DB2 IFI, external decompression
Maintains multiple versions of schema
Uses DB2 IFI Facility

[Diagram labels: User Application; DB2 MVS; DB2 ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]

Page 23: High Performance Enterprise Data Propagation Russell Donovan.

The Transformation Process

[Diagram labels: VSAM Files; Transformation]

Transforms IMS, Fast Path and VSAM data to relational formats
  Hierarchical structures to relational structures
  Converts non-relational data types to relational
Uses relational DBMS catalog information
Uses copy libraries and IMS database descriptors
Automatically handles
  Dates, Times, Data Types
  Repeating groups, Redefined records
Customizable through user exits
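As an illustration of what converting non-relational data types involves, the sketch below decodes one mainframe-style record in Python: an EBCDIC character field and a packed decimal (COMP-3) amount. The record layout is an assumption made up for this example (NAME PIC X(5), AMOUNT PIC S9(5)V99 COMP-3), and `cp037` is one of several EBCDIC code pages; the product's own transformation rules are not shown in the deck.

```python
from decimal import Decimal

def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)           # last byte holds one digit...
    sign_nibble = raw[-1] & 0x0F          # ...and the sign (0xD means negative)
    value = int("".join(str(d) for d in digits))
    if sign_nibble == 0x0D:
        value = -value
    return Decimal(value).scaleb(-scale)  # apply the implied decimal point

def decode_record(raw: bytes) -> dict:
    """Hypothetical layout: bytes 0-4 EBCDIC name, bytes 5-8 packed amount, 2 decimals."""
    return {
        "name": raw[0:5].decode("cp037").rstrip(),
        "amount": unpack_comp3(raw[5:9], scale=2),
    }

# Build a sample record: "SMITH" in EBCDIC followed by packed -1234.56.
record = "SMITH".encode("cp037") + bytes([0x01, 0x23, 0x45, 0x6D])
row = decode_record(record)
print(row)
```

The decoded `row` is already in a shape that maps onto CHAR and DECIMAL columns in a relational target.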

Page 24: High Performance Enterprise Data Propagation Russell Donovan.

Possible Target Keys

To allow resulting target rows to be unique

Replication Key (REPKEY)
  This key will make the target row unique
  For IMS it is the full concatenated key or the segment's RBA

Ancestor Keys
  If REPKEY is a composite key (i.e. IMS concatenated key), each level is available to be used as the key of the target row

Sequential number
  If a single input segment or record creates multiple output rows, a sequential numeric column can be generated.

Any field in the input segment or record

Page 25: High Performance Enterprise Data Propagation Russell Donovan.

Transforming Cobol Structures

Repeating Groups
  All repeated fields to a single target column
  As individual rows in the same or a different table
  Update results in set of deletes and inserts for target rows

Redefined Records
  Assigned unique names and schema definitions
  Record identification exit identifies record types
  Schema applied to segment or record based on redefined record type
  Redefined records can be propagated to same or different targets
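The repeating-group case can be sketched in a few lines of Python: each occurrence becomes its own target row, keyed by the replication key plus a generated sequential number, and an update is applied as deletes followed by inserts. The `phones` field and the function names are hypothetical, picked only to show the shape of the transformation.

```python
def rows_from_segment(repkey, segment):
    """Produce one target row per occurrence of the repeating group."""
    return [
        {"repkey": repkey, "seq": seq, "phone": phone}
        for seq, phone in enumerate(segment["phones"], start=1)
    ]

def propagate_update(target_rows, repkey, new_segment):
    """Apply an update as: delete every row for this REPKEY, then insert the new set."""
    kept = [row for row in target_rows if row["repkey"] != repkey]
    return kept + rows_from_segment(repkey, new_segment)

# Initial load of one source segment with two occurrences.
target_rows = rows_from_segment("CUST0001", {"phones": ["555-0100", "555-0101"]})

# The source segment is updated and now carries a single occurrence.
target_rows = propagate_update(target_rows, "CUST0001", {"phones": ["555-0199"]})
print(target_rows)
```

Replacing the whole set avoids having to work out which occurrence changed position when the group shrinks or grows.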

Page 26: High Performance Enterprise Data Propagation Russell Donovan.

High Performance Transport & Apply
  Data is blocked, compressed and encrypted
  Multi-threaded apply tasks for increased performance

[Diagram labels: Send-side TRANSPORT; TCP/IP; Receive-side TRANSPORT; DB2 Dynamic Memory Staging Queue; Oracle Dynamic Memory Queue; multiple EDP Apply tasks]
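A rough sketch of blocking, compression and multi-threaded apply, using only the Python standard library. Encryption is left out because the standard library has no cipher, and the JSON block format, block size and worker count are assumptions for illustration rather than anything stated in the deck.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor

def make_blocks(records, block_size):
    """Sender: group records into blocks and compress each block before transport."""
    for start in range(0, len(records), block_size):
        block = records[start:start + block_size]
        yield zlib.compress(json.dumps(block).encode("utf-8"))

def apply_block(payload):
    """Receiver: decompress one block and 'apply' it (here, just count the rows)."""
    records = json.loads(zlib.decompress(payload).decode("utf-8"))
    return len(records)

records = [{"id": i, "status": "OPEN"} for i in range(1000)]
blocks = list(make_blocks(records, block_size=250))

raw_size = len(json.dumps(records).encode("utf-8"))
sent_size = sum(len(b) for b in blocks)

# Several apply tasks work on different blocks at the same time.
with ThreadPoolExecutor(max_workers=4) as pool:
    applied = sum(pool.map(apply_block, blocks))

print(len(blocks), applied, raw_size, sent_size)
```

Blocking cuts per-record network overhead and compression cuts bytes on the wire; a real apply stage would also have to keep dependent transactions in order across threads, which this sketch does not attempt.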

Page 27: High Performance Enterprise Data Propagation Russell Donovan.

Automated Schema Replication

Reduce administration costs by automating the creation of target tables from IMS, VSAM, and DB2 source schema

[Diagram labels. Sources: VSAM Files (Copybook), IMS (DBD, Copybook), DB2 (Catalog). SchemaMove. Targets: Oracle, DB2, MS SQL Server.]
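The idea of generating target tables from source schema can be illustrated with a small Python function that turns a list of field descriptions into a CREATE TABLE statement. The field descriptions, the `kind` names and the type mapping are all hypothetical; they are not SchemaMove's actual input format or mapping rules.

```python
def create_table_ddl(table, fields):
    """Build a CREATE TABLE statement from simple, copybook-like field descriptions."""
    # Assumed mapping from source field kinds to relational column types.
    type_map = {
        "char": "CHAR({length})",
        "packed": "DECIMAL({length}, {scale})",
        "binary": "INTEGER",
    }
    columns = []
    for field in fields:
        sql_type = type_map[field["kind"]].format(
            length=field.get("length", 0), scale=field.get("scale", 0))
        columns.append(f"  {field['name']} {sql_type}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n)"

fields = [
    {"name": "CUST_ID", "kind": "char", "length": 8},
    {"name": "BALANCE", "kind": "packed", "length": 9, "scale": 2},
    {"name": "BRANCH", "kind": "binary"},
]
ddl = create_table_ddl("CUSTOMER", fields)
print(ddl)
```

Driving the DDL from the source definitions is what removes the manual step of hand-writing and maintaining target table definitions.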

Page 28: High Performance Enterprise Data Propagation Russell Donovan.

Bulk Data Propagation

Bulk move is usually simpler and easier to implement
Needed to initially create or to refresh a target database
Bulk move is the preferred solution when:
  Data volumes are not large and the move can be performed within time constraints
  Database availability is not a concern (source/target)
  Network volumes and network overhead are not issues
  Currency of information in target database is not a concern
  Change data propagation cannot handle the volumes

[Diagram labels: Source, Target]

Page 29: High Performance Enterprise Data Propagation Russell Donovan.

Bulk Data Movement DB2 to Oracle: The Traditional Approach

[Diagram labels. MVS Host: DB2, DB2 Extract, File, Gateway. TCP/IP. UNIX Server: Gateway, File, Oracle Loader, Oracle.]

Step               Time      Share
DB2 Unload         20 min.   35%
File Transfer       7 min.   13%
Oracle SQL Load    28 min.   52%
Total Time         55 min.

Page 30: High Performance Enterprise Data Propagation Russell Donovan.

Bulk Data Movement DB2 to Oracle: Parallel Unload, Parallel Load

[Diagram labels. MVS Host: DB2, DB2 Extract, File, Gateway. TCP/IP. UNIX Server: Gateway, File, Oracle Loader, Oracle.]

Step               Time
Parallel Unload     7 min.
File Transfer       7 min.
Oracle SQL Load    16 min.
Total Time         30 min.

Page 31: High Performance Enterprise Data Propagation Russell Donovan.

Bulk Data Movement DB2 to Oracle: Parallel Unload/Load & Piping

[Diagram labels. MVS Host: DB2, DB2 Extract, Parallel Unload, Gateway. PIPING over TCP/IP. UNIX Server: Gateway, Parallel Load, Oracle Loader, Oracle.]

Oracle load starts as first record is read from DB2

Total Time 17 min.
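The three totals on these slides follow a simple pattern, which the Python sketch below works through. When stages run one after another, elapsed time is the sum of the stage times; when the stages are piped so they overlap, elapsed time cannot be less than the slowest stage. The stage times come from the slides; treating the slowest stage as a lower bound for the piped case is a simplifying assumption, not a figure from the deck.

```python
def sequential_minutes(stages):
    """Stages run back to back: elapsed time is the sum."""
    return sum(stages.values())

def pipelined_lower_bound(stages):
    """Stages overlap completely: elapsed time is at least the slowest stage."""
    return max(stages.values())

traditional = {"DB2 unload": 20, "file transfer": 7, "Oracle SQL load": 28}
parallel = {"parallel unload": 7, "file transfer": 7, "Oracle SQL load": 16}

traditional_total = sequential_minutes(traditional)   # matches the 55 min. slide
parallel_total = sequential_minutes(parallel)         # matches the 30 min. slide
piped_bound = pipelined_lower_bound(parallel)         # slide reports 17 min. measured

print(traditional_total, parallel_total, piped_bound)
```

The reported 17 minutes for piping sits just above the 16 minute bound, which is what one would expect if the load is the bottleneck and the other stages are almost fully hidden behind it.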

Page 32: High Performance Enterprise Data Propagation Russell Donovan.

DataReach: Product positioning

Positioning

  DataReach is a high performance, high availability data movement solution for extracting MVS/ESA DB2 data and loading it into an Informix, Oracle or Sybase database on Unix.
  A joint development effort of EMC & BMC - Not A Product We Sell Today

What It Does
  Uses EMC Storage to move data at channel speeds vs network speeds
  Moves the work of extracting DB2 MVS data from MVS to Unix

Advantages
  Moves data 10 to 100 times faster than network solutions
  Completely eliminates mainframe processing
  Completely eliminates network traffic and network overhead
  Allows nearly 100% availability of the source DB2 database
  Enables customers to more frequently refresh data warehouses

Page 33: High Performance Enterprise Data Propagation Russell Donovan.

Bulk Data Movement DB2 to Oracle: The DataReach Approach

[Diagram labels. MVS Host: DB2. UNIX Host: DB2 Extract, Intermediate File, Oracle Loader, Oracle.]

DataReach Directly Extracts DB2 Data
  Eliminates network traffic & network overhead
  Familiar SQL-based SELECT syntax
  Subset of data via WHERE predicate
  Optional parallel extraction capability
  Optional access via DB2 Index structures
  Data conversion
    EBCDIC to ASCII
    DB2 to generic format

Direct load of Oracle, Sybase, Informix
  Optional parallel load capability
  Distributed capabilities

Page 34: High Performance Enterprise Data Propagation Russell Donovan.

DataReach: How It Works

[Diagram labels. SYMMETRIX ESP: DB2 Source on CKD Volumes, FBA Volumes. MVS System with DB2, attached via Escon Channels. UNIX side, attached via SCSI Channels: Extractor, Translation Module, UNIX Flat File, Native load utility of the Target RDBMS, Target DBMS.]

Page 35: High Performance Enterprise Data Propagation Russell Donovan.

DataReach: Performance Benchmark

[Chart: elapsed Minutes (axis 0.00 to 1,000.00) by Size (10 Mbytes, 100 Mbytes, 1 Gbyte, 5 Gbytes, 10 Gbytes), Traditional vs DataReach, DB2 to Oracle on HP/UX]

Page 36: High Performance Enterprise Data Propagation Russell Donovan.

Traditional Process vs DataReach

Elapsed Time Components, 1 GB of Data, DB2 to Oracle on HP/UX

Traditional Process
  DB2 Unload Time          0:20:06
  File Transfer Time       0:07:10
  Oracle SQL Load Time     0:27:53

DataReach Process
  DataReach Process Time   0:16:35

[Chart: stacked bars of elapsed time per process, axis 0:00:00 to 1:04:48]
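The component times above can be added up to compare the two processes directly. This short Python calculation only restates the chart's figures; the helper names are made up for the example.

```python
def to_seconds(hms):
    """Convert an h:mm:ss string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def to_hms(total_seconds):
    """Convert seconds back to h:mm:ss."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

traditional = ["0:20:06", "0:07:10", "0:27:53"]   # unload, transfer, load
datareach = "0:16:35"

traditional_total = sum(to_seconds(t) for t in traditional)
datareach_total = to_seconds(datareach)
speedup = traditional_total / datareach_total

print(to_hms(traditional_total), to_hms(datareach_total), round(speedup, 2))
```

For this 1 GB case the traditional steps add up to 0:55:09, so the DataReach run is a little over three times faster.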

Page 37: High Performance Enterprise Data Propagation Russell Donovan.

DataReach: Operational Considerations

Data Consistency: Quiesce DB2
High Availability: Use a mirror copy in Symmetrix
Security: DataReach Authorization Table in DB2
  DB2 Read access
  Unix Login
  Target RDBMS authorizations

Page 38: High Performance Enterprise Data Propagation Russell Donovan.

Extract, Transform, Move & Load Options: A Performance Perspective

[Chart: M Bytes per Hour (axis 0 to 4000) for Change Data Propagation, RYO Bulk Move Solutions, Parallel Unload/Load, Piping, DataReach]

Making The Right Choice

Page 39: High Performance Enterprise Data Propagation Russell Donovan.

High Performance Data Propagation Strategy for Supporting Data Warehouse

[Diagram labels. Operational Applications: IMS, Fast Path, DB2, VSAM, Other. High Performance Data Propagation. Integration Area: Operational Data Store, Data Warehouse Refresh, Change History. Data Warehouse, Data Marts. Business Intelligence Systems.]

Page 40: High Performance Enterprise Data Propagation Russell Donovan.

High Performance Data Propagation Strategy for Supporting DW & e-Business

[Diagram labels. Operational Applications: IMS, Fast Path, DB2, VSAM, Other. High Performance Data Propagation. Integration Area: Operational Data Store, Data Warehouse Refresh, Change History. Data Warehouse, Data Marts. Business Intelligence Systems. Web Server and App. Server: Updates, Inquiries.]

Page 41: High Performance Enterprise Data Propagation Russell Donovan.

High Performance Data Propagation Strategy for Enterprise Application Integration

[Diagram labels. Operational Applications: IMS, DB2, VSAM, Other. High Performance Data Propagation. Messaging: Bulk Message Queue, Change Message Queue. Targets: PeopleSoft, SAP, Baan, Oracle, e-business Applications, ERP Tools, Data Warehouse, Data Mart.]

Note: This is a BMC Services Offering

Page 42: High Performance Enterprise Data Propagation Russell Donovan.

Major U.S. Brokerage Firm

Application Integration example
Global corporation headquartered in New York City providing:
  Securities
  Asset Management
  Credit and transaction services

Page 43: High Performance Enterprise Data Propagation Russell Donovan.

The Problem

Business challenge
  Migration to new strategic DBMS could not impact business operations

Technical challenge
  Keep current ADABAS DBMS synchronized with new strategic DB2 DBMS
  The solution had to be sustainable for the long term and also be scalable

Page 44: High Performance Enterprise Data Propagation Russell Donovan.

The Solution

Client already had an ADABAS log capture mechanism and MQSeries.
A “Custom Adapter for Source MQSeries” to Change Data Move, written in ASM, runs as a started task.
Primarily batch with over 700 files (as sources).

[Diagram labels: ADABAS Log Capture; Batch Address Space; MQSeries Queue; MQGET; Custom Adapter; EDM Logger; EDM Log]

Page 45: High Performance Enterprise Data Propagation Russell Donovan.

Major U.S. Bank

e-Business example
Provides anytime, anywhere access to products and services through:
  Walk up services
  Automated Teller Machines (ATM)
  24-Hour Phone Banking
  Internet banking
Offices in 17 Midwestern and Western states

Page 46: High Performance Enterprise Data Propagation Russell Donovan.

The Problem

Business Challenge
  Multiple access methods drive a need to provide a common method to authenticate an account owner

Technical Challenge
  Account verification information is maintained in purchased IMS application
  Move to leading edge Storage Area Network technology and required integration.

Page 47: High Performance Enterprise Data Propagation Russell Donovan.

The Solution

Target is not a “conventional DBMS” but a storage area network.
High data volumes
Target data written to MQSeries

[Diagram labels: LRP; TNR; DB2; Process Action Controller; End UOW Data; Custom Adapter; MQPUT; MQSeries Queue]

Page 48: High Performance Enterprise Data Propagation Russell Donovan.

High Performance Data Propagation: Facilitating DBMS Migrations

Change target DBMS without impacting operational applications
Move target DB from Sybase to Oracle to SQL Server to UDB to ??

[Diagram labels. Sources: User Application, IMS, Fast Path, VSAM, DB2. ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply. Targets: Oracle, DB2, UDB, SQL Server, Sybase.]

Page 49: High Performance Enterprise Data Propagation Russell Donovan.

BMC’s Data Propagation is Different?

Transaction based data propagation supports applications executing hundreds of transactions/second

For IMS, Fast Path, CICS VSAM and VSAM Batch
  Does not use IBM* capture exits, logs, or require any additional logging
  Automatically transforms non-relational data structures to relational
  Supports “Near-Real-Time” with minimum latency for target updates
  No requirement for DB2 staging tables and associated logging
  Captures changes from VSAM batch applications even when no logs are used

For DB2
  No requirement for DB2 staging tables and associated logging
  Transaction consistent propagation
  Supports “Near-Real-Time” with minimum latency for target updates

Component of a Complete Enterprise Data Movement Solution
  Common management console - Easy to administer
  Integrated restart/recovery of the propagation process
  Shared data transformations

Page 50: High Performance Enterprise Data Propagation Russell Donovan.

Extract, Transform, Move & Load Options: A Performance Perspective

[Chart: M Bytes per Hour (axis 0 to 4000) for Change Data Propagation, RYO Bulk Move Solutions, Parallel Unload/Load, Piping, DataReach]

Making The Right Choice

