Date post: | 11-Jan-2016 |
Upload: | tyrone-floyd |
High Performance Enterprise Data Propagation
Russell Donovan
BMC Company Profile
Established in 1980
Leader in Application Management
Estimated FY2000 Revenues of $1.8B
Over 6,000 Employees
Development Labs in Austin (TX), Conyers (GA), Houston, San Jose, Sunnyvale (CA), Waltham (MA), Germany, Israel, Singapore
Market Coverage in Over 50 Countries
Member of the S&P 500
BMC Software e-Business Availability
Provides application management solutions that ensure the availability, performance, and recovery of business-critical applications.
We call this application service assurance and it means that the applications companies and their customers rely on will be there when they need them.
e-vailability - We Guarantee Our Solutions!
Enterprise Data Propagation (EDP): Requirement For All Enterprises
Need to synchronize data between legacy systems and distributed relational databases for:
Data warehousing, operational data stores, data mining
e-Business applications access to legacy data
Enterprise application integration
Distributed enterprises, ERP solutions, Acquisitions
70% of corporate data in IMS, VSAM, DB2
[Pie chart labels: DB2, VSAM, IMS, Other]
Need high performance solutions
Need near real time solutions
Data Propagation - Strategies For Synchronizing Multiple Copies of Data
Copy
Unload/Load
Distributed Database, 2 Phase Commit
SQL Query
Change Capture with Asynchronous Propagation
[Diagram: Source, Target]
Key Challenges Implementing a Data Warehouse
Data Management Review Survey
Business rule analysis
Managing End User Expectation
Business data modeling
Reliability and integrity of data
Data acquisition
Meta Data management
Managing Management Expectation
Database performance
Data Warehouse Implementations
For Customers With:
Large operational databases
High transaction rates
24x7 operations requirements
Critical Management Issues:
Availability of operational systems
Performance of operational transactions
Maintaining service levels
Increasing volumes of data
Time required to load and refresh data warehouse
Quality, currency & accuracy of decision making data
Building Data Warehouses: A Perspective
[Diagram labels: Operational Database Images; Subject Oriented; Operational Images; Time Variant Subject Oriented Data Warehouses; Data Marts; Operational Databases; VSAM Files]
Building Data Warehouses: A Perspective
[Diagram labels]
VSAM, IMS, DB2
Mainframe Tools: Prism, ETI, Carleton, SAS, Platinum
Dist. Systems Tools: Informatica, Constellar D2K, Sagent, Ardent
Query Tools: Brio, Bus. Objects, COGNOS, Microstrategy
Integration Area
Oper. Data Store
Data Warehouse
Data Mart
End-Users
BMC Solution: ChangeDataMove, Bulk DataMove
Change Data Propagation: A Perspective
Change Data Propagation Is Preferred When:
Databases are large and bulk move would take too long
Batch window limitations
Database availability limitations
Support for 24 x 7 is a requirement of operational application
Minimum latency: “Near-real-time” is required in target database
Currency of information in target database is important
Small percentage of a large database has changed
Need to reduce network traffic by transmitting only data changes
[Diagram: Source, Target]
Transaction Based Change Data Propagation
Synchronous Data Propagation
Original update waits until all targets are updated
Single, global transaction with multi-site, coordinated commit processing
Asynchronous Data Propagation
Propagation of updates occurs asynchronously to the originating transaction
Minimizes resource consumption at source
Minimizes impact on source transaction response times
[Diagram: Source, Target]
Synchronous vs Asynchronous Change Data Propagation
Synchronous 2 Phase Commit
Source transaction completes when all databases updated
Advantages:
Real time propagation
All sites always synchronized
Disadvantages:
Transaction response time
Data availability impact
System resiliency
Usually not practical
Asynchronous Data Propagation
Source transaction does not wait for target databases to be updated
Advantages:
Minimum performance impact
Availability
Autonomy
Recoverability
Disadvantages:
Target location updates may be delayed
All sites not always synchronized
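The difference in when the source transaction completes can be illustrated with a minimal Python sketch. This is not BMC's implementation: the `Target` class, the function names, and the in-memory queue are hypothetical stand-ins, and the synchronous variant omits the prepare/commit voting of a real two-phase commit.

```python
import queue
import threading

class Target:
    """Hypothetical stand-in for a remote target database."""
    def __init__(self):
        self.rows = {}

    def apply(self, key, value):
        self.rows[key] = value

def update_synchronous(source, targets, key, value):
    # The source update returns only after every target has been updated.
    source[key] = value
    for target in targets:
        target.apply(key, value)

def update_asynchronous(source, change_queue, key, value):
    # The source update records the change and returns immediately.
    source[key] = value
    change_queue.put((key, value))

def propagate(change_queue, targets):
    # Background worker: drains captured changes in the order they were made.
    while True:
        item = change_queue.get()
        if item is None:  # sentinel value: stop the worker
            break
        key, value = item
        for target in targets:
            target.apply(key, value)

source, targets, changes = {}, [Target(), Target()], queue.Queue()
worker = threading.Thread(target=propagate, args=(changes, targets))
worker.start()
update_asynchronous(source, changes, "acct-1", 100)
update_asynchronous(source, changes, "acct-1", 250)
changes.put(None)
worker.join()
```

Until the worker catches up, the targets lag the source, which is the "updates may be delayed" cost listed above; in exchange, a slow or unavailable target never blocks the source update.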
Asynchronous Change Capture: Implementation Considerations
Trigger Based
Triggers used to capture changes to database records
Incremental updates collected in staging tables
Significant resource consumption for triggers and logging
Typically low volume applications (< 20 transactions/second)
Log Exit Based
Increased logging in operational environment
Increased response times for source transactions
Increased resource consumption
Log management issues
Log Post Process Based
Increased logging in operational environment
Log management issues
Long latency interval cannot support near real time
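As an illustration of the trigger-based approach only, the following sketch uses SQLite from Python's standard library. The table names, trigger names, and column layout are assumptions made for the example, not part of any product described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
-- Hypothetical staging table that collects incremental updates.
CREATE TABLE account_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT, id INTEGER, balance INTEGER
);
CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('I', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('U', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('D', OLD.id, OLD.balance);
END;
""")

conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 250 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

# A propagation job would read the staging table in sequence order.
changes = conn.execute(
    "SELECT op, id, balance FROM account_changes ORDER BY seq"
).fetchall()
```

Every write to `account` now causes a second write to the staging table inside the same transaction, which is the extra resource consumption the slide attributes to this approach.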
Enterprise Data Propagation (EDP) The BMC Solution
A Data Propagation Management System
Common look and feel
Integrated transformations and mappings
Integrated recovery/restart
A single point of access for managing Legacy data propagation across the enterprise
Efficient change capture
Basic data transformation
High performance data movement
High performance utilities
[Diagram labels: Operational Data Store; Bulk Data; Change Data; VSAM; IMS; DB2; Fast Path]
ChangeDataMove: Product Positioning
Positioning
ChangeDataMove is a high performance, efficient, change data propagation solution, which captures changes made to IMS, Fast Path, VSAM, and DB2 databases, and propagates those changes to the most prevalent relational databases.
What It Does
Transaction-based data propagation
Supports high volume production applications with hundreds of transactions per second
Supports ‘near real-time’ as well as scheduled data propagation
Advantages
A data propagation system (complete solution vs a point product)
Highly efficient change capture does not impact applications
Only solution for IMS, Fast Path and VSAM that does not require logging
Optionally integrated with DataMove for bulk data movement
Change Data Propagation for IMS and VSAM
Synchronous Change Capture
Transparent high performance change capture
Minimum impact on source system logging, CPU & user response time
Data is available immediately for asynchronous propagation
Asynchronous Data Propagation
Data Propagated Within Context of Original Transaction
Updates applied in proper sequence
Inter and intra-table consistency
Source and target(s) consistent within transaction boundaries
Not affected by network delays or slow remote processors
Supports “Near Real Time” and/or Scheduled Propagation
[Diagram labels: EDP Log; EDP Apply; updates 1, 2, 3]
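Applying updates in sequence and within transaction boundaries might look like the following Python sketch. The record layout `(uow_id, sequence, table, key, value)` and the function name are hypothetical, and the sketch assumes that the records of one unit of work (UOW) are contiguous in the log because they are written when the source transaction commits.

```python
from itertools import groupby

# Hypothetical captured change records: (uow_id, sequence, table, key, value)
log = [
    (1, 1, "orders", "o1", "new"),
    (1, 2, "order_lines", "o1-1", "widget"),
    (2, 3, "orders", "o1", "shipped"),
]

def apply_log(log, target):
    """Apply changes one unit of work (UOW) at a time, in captured sequence."""
    applied = []
    ordered = sorted(log, key=lambda rec: rec[1])
    for uow_id, records in groupby(ordered, key=lambda rec: rec[0]):
        staged = {}
        for _, _, table, key, value in records:
            staged[(table, key)] = value
        target.update(staged)  # the whole UOW becomes visible together
        applied.append(uow_id)
    return applied

target = {}
applied = apply_log(log, target)
```

Because the order header and its line item arrive in the same unit of work, a reader of the target never sees one without the other, which is what inter-table consistency means here.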
IMS Change Capture
Resides within the IMS environment
Captures DL/I calls as they occur
Supports IMS/TM (MPP, BMP), Fast Path, CICS DBCTL, Batch DL/I
Commits updates at transaction or job (batch) end
Based on BMC Software’s CHANGE RECORDING FACILITY
[Diagram labels: User Application; IMS Subsystem; ECCR; IMS Database; EDP Logger; LRP; TNR; EDP Log; BMC Apply; OEM Apply; DB2; Oracle; SQL Server; Sybase; UDB]
CICS/VSAM Change Capture
Captures changes at each Get, Put & Erase request
Utilizes CICS TRUE, File, and Re-sync exits
Resides as functional part of CICS address space
Participates in two phase commit with CICS transaction
Updates are committed when transaction commits
[Diagram labels: User Application; CICS Subsystem; ECCR; VSAM Database; EDP Logger; LRP; TNR; EDP Log; BMC Apply; OEM Apply; DB2; Oracle; SQL Server; Sybase; UDB]
VSAM Batch Change Capture
Journad exit dynamically activated
ECCR resides within the batch address space
UOW is complete when application closes VSAM file
Based on BMC Software’s RECOVERY PLUS for CICS/VSAM product.
[Diagram labels: User Application; Batch Address Space; ECCR; VSAM Database; EDP Logger; LRP; TNR; EDP Log; BMC Apply; OEM Apply; DB2; Oracle; SQL Server; Sybase; UDB]
DB2 MVS Change Capture
Requires DB2 change data capture be activated
Reads log records via DB2 IFI, external decompression
Maintains multiple versions of schema
Uses DB2 IFI Facility
[Diagram labels: User Application; DB2 MVS; DB2 ECCR; EDP Logger; LRP; TNR; EDP Log; BMC Apply; OEM Apply; DB2; Oracle; SQL Server; Sybase; UDB]
The Transformation Process
[Diagram labels: VSAM Files; Transformation]
Transforms IMS, Fast Path and VSAM data to relational formats
Hierarchical structures to relational structures
Converts non-relational data types to relational
Uses relational DBMS catalog information
Uses copy libraries and IMS database descriptors
Automatically handles Dates, Times, Data Types, Repeating groups, Redefined records
Customizable through user exits
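The slide does not name the data types involved. As one example of a non-relational type that commonly appears in COBOL copybooks, the sketch below decodes an IBM packed-decimal (COMP-3) field into a Python `Decimal`. The function name and the `scale` parameter are assumptions made for this illustration.

```python
from decimal import Decimal

def unpack_comp3(data: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in data[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(data[-1] >> 4)
    sign_nibble = data[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):  # 0xD (and 0xB) mark negative values
        value = -value
    # scale is the number of implied decimal places (the V in a PIC clause)
    return Decimal(value).scaleb(-scale)

amount = unpack_comp3(bytes([0x12, 0x34, 0x5C]), scale=2)
```

The decoded value can then be bound to a DECIMAL column in the target database.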
Possible Target Keys
To allow resulting target rows to be unique:
Replication Key (REPKEY)
This key will make the target row unique
For IMS it is the full concatenated key or the segment's RBA
Ancestor Keys
If REPKEY is a composite key (i.e. IMS concatenated key), each level is available to be used as the key of the target row
Sequential number
If a single input segment or record creates multiple output rows, a sequential numeric column can be generated.
Any field in the input segment or record
Transforming Cobol Structures
Repeating Groups
All repeated fields to a single target column
As individual rows in the same or a different table
Update results in set of deletes and inserts for target rows
Redefined Records
Assigned unique names and schema definitions
Record identification exit identifies record types
Schema applied to segment or record based on redefined record type
Redefined records can be propagated to same or different targets
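The "individual rows" option for repeating groups, combined with a replication key and a generated sequence number, can be sketched as follows. The function names, the field names, and the dictionary-based record format are hypothetical.

```python
def explode_repeating_group(record, group_field, repkey_field):
    """Turn one source record with a repeating group into one target row per occurrence."""
    rows = []
    for seq, item in enumerate(record[group_field], start=1):
        row = {"repkey": record[repkey_field], "seq": seq}
        row.update(item)
        rows.append(row)
    return rows

def update_rows(target, record, group_field, repkey_field):
    """Apply an update as a set of deletes followed by inserts."""
    repkey = record[repkey_field]
    kept = [row for row in target if row["repkey"] != repkey]
    return kept + explode_repeating_group(record, group_field, repkey_field)

customer = {"cust_id": "C42", "phones": [{"phone": "555-0100"}, {"phone": "555-0101"}]}
table = update_rows([], customer, "phones", "cust_id")

# The source record changes: it now has a single phone occurrence.
customer["phones"] = [{"phone": "555-0199"}]
table = update_rows(table, customer, "phones", "cust_id")
```

Replacing all rows for the key avoids having to match old and new occurrences one by one when the number of occurrences changes.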
High Performance Transport & Apply
Data is blocked, compressed and encrypted
Multi-threaded apply tasks for increased performance
[Diagram labels: TRANSPORT (Send); TCP/IP; TRANSPORT (Receive); DB2 Dynamic Memory Staging Queue; Oracle Dynamic Memory Queue; multiple EDP Apply tasks]
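Blocking and compression can be sketched with Python's standard library. The JSON encoding, block size, and function names below are assumptions chosen for the example; the product's wire format is not described in the slides. Encryption is left out because the standard library has no cipher module.

```python
import json
import zlib

def make_blocks(records, block_size):
    """Group records into fixed-size blocks and compress each block."""
    for start in range(0, len(records), block_size):
        block = records[start:start + block_size]
        payload = json.dumps(block).encode("utf-8")
        yield zlib.compress(payload)

def read_block(compressed):
    """Reverse of make_blocks for a single block."""
    return json.loads(zlib.decompress(compressed).decode("utf-8"))

records = [{"key": i, "status": "ACTIVE"} for i in range(1000)]
blocks = list(make_blocks(records, block_size=250))
received = [row for block in blocks for row in read_block(block)]
```

Sending a few large compressed blocks instead of many small records reduces both the per-message overhead and the bytes on the network.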
Automated Schema Replication
Reduce administration costs by automating the creation of target tables from IMS, VSAM, and DB2 source schema
[Diagram labels: IMS DBD and Copybook; VSAM Files Copybook; DB2 Catalog; SchemaMove; Oracle; DB2; MS SQL Server]
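To make the idea of generating target tables from a source schema concrete, here is a minimal sketch that maps a small subset of COBOL PIC clauses to SQL column types. It does not describe how SchemaMove works; the mapping rules, function names, and the sample fields are all assumptions.

```python
import re

def column_type(pic):
    """Map a simplified COBOL PIC clause to a SQL type (illustrative subset only)."""
    match = re.fullmatch(r"X\((\d+)\)", pic)
    if match:
        return f"CHAR({int(match.group(1))})"
    match = re.fullmatch(r"S?9\((\d+)\)(?:V9\((\d+)\))?", pic)
    if match:
        whole = int(match.group(1))
        frac = int(match.group(2) or 0)
        return f"DECIMAL({whole + frac},{frac})"
    raise ValueError(f"unsupported PIC clause: {pic}")

def create_table(name, fields):
    """Build a CREATE TABLE statement from (column, PIC) pairs."""
    cols = ",\n".join(f"  {col} {column_type(pic)}" for col, pic in fields)
    return f"CREATE TABLE {name} (\n{cols}\n);"

ddl = create_table("CUSTOMER", [("CUST_ID", "X(8)"), ("BALANCE", "S9(7)V9(2)")])
```

A real tool would also have to handle OCCURS, REDEFINES, and usage clauses such as COMP-3, which this sketch rejects or ignores.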
Bulk Data Propagation
Bulk move is usually simpler and easier to implement
Needed to initially create or to refresh a target database
Bulk move is the preferred solution when:
Data volumes are not large and the move can be performed within time constraints
Database availability is not a concern (source/target)
Network volumes and network overhead are not issues
Currency of information in target database is not a concern
Change data propagation cannot handle the volumes
[Diagram: Source, Target]
Bulk Data Movement DB2 to Oracle: The Traditional Approach
[Diagram labels: MVS Host; DB2; DB2 Extract; File; Gateway; TCP/IP; UNIX Server; Gateway; File; Oracle Loader; Oracle]
[Pie chart shares: 35%, 52%, 13%]
Time
DB2 Unload 20 min.
File Transfer 7 min.
Oracle SQL Load 28 min.
Total Time 55 min.
Bulk Data Movement DB2 to Oracle: Parallel Unload, Parallel Load
[Diagram labels: MVS Host; DB2; DB2 Extract; File; Gateway; TCP/IP; UNIX Server; Gateway; File; Oracle Loader; Oracle]
Time
Parallel Unload 7 min.
File Transfer 7 min.
Oracle SQL Load 16 min.
Total Time 30 min.
Bulk Data Movement DB2 to Oracle: Parallel Unload/Load & Piping
[Diagram labels: MVS Host; DB2; DB2 Extract; Gateway; TCP/IP; UNIX Server; Gateway; Oracle Loader; Oracle; Parallel Unload; PIPING; Parallel Load]
Oracle load starts as first record is read from DB2
Total Time 17 min.
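The totals quoted for the three approaches follow from simple arithmetic, sketched below. The stage times come from the slides, taking the parallel Oracle load as 16 minutes (the value that makes the 30 minute total add up). The function names are made up for this illustration, and the lower bound for piping is a general property of overlapped stages rather than a figure from the slides.

```python
def serial_minutes(stages):
    """Stages run one after another, so elapsed time is the sum."""
    return sum(stages.values())

def pipelined_lower_bound(stages):
    """With piping, stages overlap; elapsed time cannot be less than the slowest stage."""
    return max(stages.values())

traditional = {"unload": 20, "transfer": 7, "load": 28}
parallel = {"unload": 7, "transfer": 7, "load": 16}

traditional_total = serial_minutes(traditional)
parallel_total = serial_minutes(parallel)
piping_bound = pipelined_lower_bound(parallel)
```

The reported 17 minutes for piping sits just above the slowest single stage, which is consistent with the load starting as soon as the first record is read.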
DataReach: Product positioning
Positioning
DataReach is a high performance, high availability data movement solution for extracting MVS/ESA DB2 data and loading it into an Informix, Oracle or Sybase database on Unix.
A joint development effort of EMC & BMC - Not A Product We Sell Today
What It Does
Uses EMC Storage to move data at channel speeds vs network speeds
Moves the work of extracting DB2 MVS data from MVS to Unix
Advantages
Moves data 10 to 100 times faster than network solutions
Completely eliminates mainframe processing
Completely eliminates network traffic and network overhead
Allows nearly 100% availability of the source DB2 database
Enables customers to more frequently refresh data warehouses
Bulk Data Movement DB2 to Oracle: The DataReach Approach
[Diagram labels: MVS Host; DB2; DB2 Extract; Intermediate File; UNIX Host; Oracle Loader; Oracle]
DataReach Directly Extracts DB2 Data
Eliminates network traffic & network overhead
Familiar SQL-based SELECT syntax
Subset of data via WHERE predicate
Optional parallel extraction capability
Optional access via DB2 Index structures
Data conversion: EBCDIC to ASCII, DB2 to generic format
Direct load of Oracle, Sybase, Informix
Optional parallel load capability
Distributed capabilities
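The EBCDIC to ASCII step can be illustrated with Python's built-in codecs. The choice of code page `cp037` is an assumption for this example; the slides do not say which EBCDIC code page the source data uses.

```python
def ebcdic_to_ascii(raw: bytes, codepage: str = "cp037") -> bytes:
    """Decode EBCDIC bytes and re-encode them as ASCII."""
    return raw.decode(codepage).encode("ascii")

sample = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])  # "HELLO" in EBCDIC (cp037)
converted = ebcdic_to_ascii(sample)
```

This applies only to character fields. Binary and packed-decimal fields must be converted by type, not translated byte by byte.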
DataReach: How It Works
[Diagram labels: MVS System; DB2; DB2 Source; CKD Volumes; Escon Channels; SYMMETRIX ESP; FBA Volumes; SCSI Channels; Extractor; Translation Module; UNIX Flat File; Native load utility; Target RDBMS; Target DBMS]
DataReach: Performance Benchmark
[Chart: DB2 to Oracle on HP/UX, Traditional Process vs DataReach. Y axis: Minutes (0 to 1,000). X axis: Size (10 Mbytes, 100 Mbytes, 1 Gbyte, 5 Gbytes, 10 Gbytes). Series: Traditional, DataReach.]
Elapsed Time Components, 1 GB of Data
DB2 to Oracle on HP/UX
[Stacked bar chart, elapsed time in h:mm:ss]
Traditional Process: DB2 Unload Time 0:20:06, File Transfer Time 0:07:10, Oracle SQL Load Time 0:27:53
DataReach Process: DataReach Process Time 0:16:35
DataReach: Operational Considerations
Data Consistency: Quiesce DB2
High Availability: Use a mirror copy in Symmetrix
Security: DataReach Authorization Table in DB2, DB2 Read access, Unix Login, Target RDBMS authorizations
Extract, Transform, Move & Load Options: A Performance Perspective
[Bar chart: M Bytes per Hour (0 to 4000) for Change Data Propagation, RYO Bulk Move Solutions, DataReach, Parallel Unload/Load, Piping]
Making The Right Choice
High Performance Data Propagation Strategy for Supporting Data Warehouse
[Diagram labels: Operational Applications; IMS; Fast Path; DB2; VSAM; Other; High Performance Data Propagation; Integration Area; Operational Data Store; Data Warehouse; Data Warehouse Refresh; Change History; Data Mart; Data Mart; Business Intelligence Systems]
High Performance Data Propagation Strategy for Supporting DW & e-Business
[Diagram labels: Operational Applications; IMS; Fast Path; DB2; VSAM; Other; High Performance Data Propagation; Integration Area; Operational Data Store; Data Warehouse; Data Warehouse Refresh; Change History; Data Mart; Data Mart; Business Intelligence Systems; App. Server; Web Server; Updates; Inquiries]
High Performance Data Propagation Strategy for Enterprise Application Integration
[Diagram labels: Operational Applications; IMS; DB2; VSAM; Other; High Performance Data Propagation; Messaging; Bulk Message Queue; Change Message Queue; PeopleSoft; SAP; Baan; Oracle; e-business Applications; ERP Tools; Data Mart; Data Warehouse]
Note: This is a BMC Services Offering
Major U.S. Brokerage Firm
Application Integration example
Global corporation headquartered in New York City providing:
Securities
Asset Management
Credit and transaction services
The Problem
Business challenge
Migration to new strategic DBMS could not impact business operations
Technical challenge
Keep current ADABAS DBMS synchronized with new strategic DB2 DBMS
The solution had to be sustainable for the long-term and also be scalable
The Solution
Client already had an ADABAS log capture mechanism and MQSeries.
A “Custom Adapter for Source MQSeries” to Change Data Move, written in ASM, runs as a started task
Primarily batch with over 700 files (as sources).
[Diagram labels: ADABAS Log Capture; Batch Address Space; MQSeries Queue; MQGET; Custom Adapter; EDM Logger; EDM Log]
Major U.S. Bank
e-Business example
Provides anytime, anywhere access to products and services through:
Walk up services
Automated Teller Machines (ATM)
24-Hour Phone Banking
Internet banking
Offices in 17 Midwestern and Western states
The Problem
Business Challenge
Multiple access methods drive a need to provide a common method to authenticate an account owner
Technical Challenge
Account verification information is maintained in purchased IMS application
Move to leading edge Storage Area Network technology and required integration.
The Solution
Target is not a “conventional DBMS” but a storage area network.
High data volumes
Target data written to MQSeries
[Diagram labels: LRP; TNR; DB2; Process Action Controller; End UOW Data; Custom Adapter; MQPUT; MQSeries Queue]
High Performance Data Propagation: Facilitating DBMS Migrations
Change target DBMS without impacting operational applications
Move target DB from Sybase to Oracle to SQL Server to UDB to ??
[Diagram labels: User Application; IMS; Fast Path; VSAM; DB2; ECCR; EDP Logger; LRP; TNR; EDP Log; BMC Apply; OEM Apply; Oracle; DB2; UDB; SQL Server; Sybase]
BMC’s Data Propagation is Different?
Transaction based data propagation supports applications executing hundreds of transactions/second
For IMS, Fast Path, CICS VSAM and VSAM Batch:
Does not use IBM* capture exits, logs, or require any additional logging
Automatically transforms non-relational data structures to relational
Supports “Near-Real-Time” with minimum latency for target updates
No requirement for DB2 staging tables and associated logging
Captures changes from VSAM batch applications even when no logs are used
For DB2:
No requirement for DB2 staging tables and associated logging
Transaction consistent propagation
Supports “Near-Real-Time” with minimum latency for target updates
Component of a Complete Enterprise Data Movement Solution:
Common management console - Easy to administer
Integrated restart/recovery of the propagation process
Shared data transformations