
TIBCO® MDM Performance Tuning Guide
Version 9.3.0
December 2020
Document Updated: March 2021

Copyright © 1999-2020. TIBCO Software Inc. All Rights Reserved.


Contents

Performance Tuning Overview
Deployment Mode
Hardware and Operating System Tuning
Physical Memory
CPU
Input Output Throughput
Java Virtual Machine Tuning
Heap Size
JVM Parameters
Data Source Configuration on JBoss WildFly Application Server
Multiple JBoss Instances on One Machine
TIBCO MDM Cache Calculation
Apache Ignite Tuning
Oracle Database Configuration and Performance Tuning
Initialization Parameters Configuration
Tablespaces Configuration
Database Table Pinning in Memory
Microsoft SQL Server Database Configuration and Tuning
Database Optimizations on Microsoft SQL Server - Customer Study
Heavy Input Output Contentions on TEMPDB Files
CXPACKET Wait-type Observed Frequently
Lesser Dedicated Memory to Operating System
Incorrect Snapshot Isolation Level Settings
File-Grouping
Antivirus Configuration
Lack of Indexes on MVT Tables
Query Optimization on Product Queries
Storage or IO Contention
Over-Aggressive Scheduled Maintenance Jobs
EMS Configuration and Tuning
Load Balancing
Database Loader Tuning
When to Use Database Loader
Performance Best Practices for Database Loader
Data Source Indexes
Data Source Upload Limit
Import
Direct Load Import
Performance Best Practices for Import
Rulebase Execution Optimization
Rulebase Parallel Execution
Rulebase Execution Directive
Validation Algorithm on Rulebase Execution Directives
Configuration Properties for Record Bundle Optimization
Configuration Properties for Rulebase Directives
Impact on Propagate Rulebase
Workflow Execution
Regular Workflow Processing
In-Memory Workflow Processing
Impact of In-Memory Workflows on UI
Regular Workflows versus In-Memory Workflows
In-Memory Configuration through the Configurator
When to Use In-Memory Workflow Execution
Workflow Configurations
Limitations for In-Memory Workflows
Workflow Activity Parallelization
Activity Parallelization Configuration
Async Call Queue
Activity Parallelization Workflow
Record Bundling Optimization
Record Caching Optimization
Preload
Performance Best Practices for Preloading
Guidelines for Sizing
Purge
Performance Best Practices for Purge
Configuration for Logging
Standard Logging Levels
Performance Tuning Tips
TIBCO Documentation and Support Services
Legal and Third-Party Notices


Performance Tuning Overview

TIBCO MDM handles multiple functions such as data cleansing, data retrieval, and bulk data inserts, where the volume of data can be millions of records. This topic provides information on TIBCO MDM performance tuning methodologies and best practices, and hardware and software tuning techniques. It also contains information on tuning OS, JVM, and TIBCO MDM configuration parameters.

System performance can vary depending on the volume of records that TIBCO MDM manages, the number of concurrent users accessing the application, and the record-level hierarchy (record depth with relationships). The TIBCO MDM Performance Tuning Guide provides information on tuning the various layers of TIBCO MDM for stability, scalability, and large data volumes, which can help achieve optimal system performance.

Deployment Mode

The following diagram depicts a high-level architectural overview of TIBCO MDM and shows the various layers described in the guide.

The following layers are described in the guide:

 l Hardware and Operating system

 l Application layer

 l Distributed Cache layer

 l Messaging layer


TIBCO MDM Overview


Hardware and Operating System Tuning

TIBCO MDM manages the existing data as well as the data in motion across the organization.

The key performance parameters include:

 l Volume of data

 l Complexity of data (record hierarchy with relationships)

 l Workload on the system

Allocating the correct hardware is critical for proper sustained performance of the solution. The correct size of the hardware required to effectively run the final solution depends on the volume of the data and the overall complexity of the solution.

You can scale TIBCO MDM performance by running TIBCO MDM on faster hardware (that is, more CPUs and increased input/output) or by increasing the physical memory available.

Depending on the resources available, you can opt for vertical scaling by deploying multiple TIBCO MDM instances on the same machine, or horizontal scaling by deploying TIBCO MDM instances on different machines.

Physical Memory

Physical memory available on the system must satisfy the TIBCO MDM heap settings (along with JVM or native thread stack overhead) and accommodate any other processes.

Size the available physical memory accordingly to accommodate the heap settings of the TIBCO MDM Server, memory allocated to the cache server, and any other processes running on the machine.

Before you start multiple instances of TIBCO MDM on the same machine and the cache server, ensure that enough memory is available.

Allocate memory to the distributed cache up front, based on the system memory available. Size the cache according to the available memory to ensure that eviction is not triggered when extensive caching is used in the implementation.

Preload, an important function, loads the specified number of records into the cache from the database as defined in the TIBCO MDM configuration.


Insufficient memory available on the system for the records to be loaded in cache can trigger an eviction. Memory requirements depend on the attributes defined in the metadata model and the complexity of the data (record hierarchy with relationships). For more details, see Preload.

CPU

TIBCO MDM relies on highly efficient threading to achieve its throughput in handling data and serving content to the user. For optimal performance, deploy TIBCO MDM, Database, and EMS on separate physical machines.

Note: Running other third-party applications creates competition for CPU and might have a negative effect on TIBCO MDM Server performance.

Direct Load Import is one of the CPU-intensive TIBCO MDM features. If CPU resources on a TIBCO MDM instance are available, you can increase the pool size of the asynchronous queue or increase the number of TIBCO MDM instances.

In-memory workflow execution is another CPU-intensive feature.

For concurrent processing including operations such as UI, web services, and Bulk Load, ensure that you have adequate CPU resources available for processing the requests.

If CPU usage on a TIBCO MDM instance is high, you might want to use load balancing. See Load Balancing for more details.

Input Output Throughput

TIBCO MDM is a highly input/output intensive system that receives, stores, and retrieves data from the database.

Application logging level can also have an impact on input/output performance. See Configuration for Logging for more details.

Consider deploying faster storage hardware, such as network storage or faster drives, or implement striping across multiple data destinations. If the input/output throughput for the system is below industry standards, check for input/output chain misconfiguration, ensure that the firmware is up to date, and validate the SAN (Storage Area Network) routing configuration. You can also reduce the latency between the database and the TIBCO MDM server.


Java Virtual Machine Tuning

This section describes Java Virtual Machine (JVM) tuning for TIBCO MDM.

Heap Size

The JVM heap is a repository of active objects, inactive objects, and free memory. When an object can no longer be reached from any pointer in the running program, it is considered garbage and ready for collection. When the JVM heap runs out of memory, all processing in the JVM stops until garbage collection completes.

The JVM heap size is important because it controls how often, and for how long, garbage collection runs. If garbage collection runs too infrequently, each collection takes longer to complete, and that extra time can negatively affect TIBCO MDM performance.

These are the recommended JVM settings for TIBCO MDM:

 l Java heap (-Xms4096m -Xmx4096m): to avoid out-of-memory errors, use identical settings for minimum and maximum memory. A good practice is to allocate 512 MB of heap per configured worker thread. For example, if there are eight worker threads, a heap of 4 GB is sufficient. The setting also depends on the bundle sizes.

 l Performance options:

-XX:NewSize=256m. Default size of the new generation.
-XX:MaxPermSize=512m. Size of the Permanent Generation.

 l Behavioral options:

-XX:+UseParallelGC. Use parallel garbage collection for scavenges.

 l Debugging options:

-XX:+HeapDumpOnOutOfMemoryError. Dump heap to file when java.lang.OutOfMemoryError occurs.

Use the debugging options only if you suspect a memory leak and need to take a heap dump when the JVM runs out of memory. This is not recommended in a production environment.

For garbage collection, -XX:+UseParallelGC is recommended to select the parallel garbage collector for the new generation of the Java heap. For more information, see the Oracle documentation.


The JVM heap size is a configurable parameter that you can set when you start the TIBCO MDM instance. You might also want to review the G1 Garbage Collector. For more information, see the Oracle documentation.

JVM Parameters

The JVM parameters are set in standalone.conf under the $JBOSS_HOME/bin directory.

You can set the following JVM parameters for TIBCO MDM:

-Xms4096m: minimum Java heap size.

-Xmx4096m: maximum Java heap size.

-XX:MaxPermSize=512m: maximum size of the Permanent Generation.

-XX:NewSize=256m: default size of the new generation.

-XX:+UseParallelGC: use parallel garbage collection for scavenges.

-XX:+HeapDumpOnOutOfMemoryError: dump heap to file when java.lang.OutOfMemoryError is thrown.

-Dsun.rmi.dgc.server.gcInterval=3600000: ensures that unreachable remote objects are unexported and garbage collected in a timely fashion. The value represents the maximum interval (in milliseconds) that the RMI runtime allows between garbage collections of the local heap. The default value is 60000 milliseconds (60 seconds).

-Dsun.rmi.dgc.client.gcInterval=3600000: ensures that DGC clean calls for unreachable remote references are delivered in a timely fashion. The value represents the maximum interval (in milliseconds) that the RMI runtime allows between garbage collections of the local heap. The default value is 60000 milliseconds (60 seconds).
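As an illustration only (the values mirror the list above and are not mandates; -XX:MaxPermSize applies to older JDKs and is ignored or rejected on JDK 8 and later), these options can be appended to JAVA_OPTS in $JBOSS_HOME/bin/standalone.conf:

    # Illustrative JAVA_OPTS fragment for standalone.conf
    JAVA_OPTS="$JAVA_OPTS -Xms4096m -Xmx4096m -XX:NewSize=256m"
    JAVA_OPTS="$JAVA_OPTS -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError"
    JAVA_OPTS="$JAVA_OPTS -Dsun.rmi.dgc.server.gcInterval=3600000 -Dsun.rmi.dgc.client.gcInterval=3600000"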

To improve overall throughput, consider using a parallel garbage collection strategy. This type of strategy works best with workloads that aim to:

 l reduce or eliminate garbage collection pauses

 l trade some memory throughput to accomplish the reduction or elimination

For more information about JVM options, see the Oracle documentation.

Note: To verify the effectiveness of the garbage collection strategy, run the application, and measure either response times or the throughput relative to garbage collection pause times, or both.


Data Source Configuration on JBoss WildFly Application Server

With Java EE 6, the @DataSourceDefinition annotation enables you to configure a data source directly from within your application.

A data source is a Java Naming and Directory Interface (JNDI) object used to obtain a connection from a connection pool to a database.

You can set the data source configurations at the following locations:

 l JBoss WildFly: edit the standalone.xml file located in the $JBOSS_HOME/standalone/configuration directory.

Consider setting the following parameters in the data source configuration file for performance tests:

<min-pool-size>100</min-pool-size>
<max-pool-size>500</max-pool-size>
<blocking-timeout-millis>30000</blocking-timeout-millis>

 l min-pool-size: specifies the minimum number of connections a pool must hold. These pool instances are not created until an initial request for a connection is made. The default is 0.

 l max-pool-size: specifies the maximum number of connections for a pool. The max-pool-size number of connections is the maximum that is created in a pool. The default is 20.

 l blocking-timeout-millis: specifies the maximum time in milliseconds to block while waiting for a connection before throwing an exception. Note that this blocks only while waiting for a permit for a connection, and never throws an exception if creating a new connection takes an inordinately long time. The default is 5000.
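For illustration, a complete datasource definition in standalone.xml might look like the following sketch. The JNDI name, pool name, connection URL, driver, and credentials are placeholders; only the pool and timeout values correspond to the settings above.

    <datasource jndi-name="java:/jdbc/MDMDataSource" pool-name="MDMPool" enabled="true">
        <!-- Placeholder connection details; use your actual database host and driver -->
        <connection-url>jdbc:oracle:thin:@dbhost:1521/MDM</connection-url>
        <driver>oracle</driver>
        <pool>
            <min-pool-size>100</min-pool-size>
            <max-pool-size>500</max-pool-size>
        </pool>
        <timeout>
            <blocking-timeout-millis>30000</blocking-timeout-millis>
        </timeout>
        <security>
            <user-name>mdmuser</user-name>
            <password>changeit</password>
        </security>
    </datasource>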

Multiple JBoss Instances on One Machine

Multiple instances of JBoss WildFly Application Server can run on a single machine provided you have the necessary system resources such as RAM or CPU. These instances can be clustered or run independently depending on your business requirements.


Reasons for running multiple instances include:

 l Scaling: you might have to test horizontal scaling of the application to determine if you have enough system resources available.

 l Isolation: you might require complete isolation of your applications if one application is unstable and negatively impacts the other applications.

 l QA: you might like a separate QA environment which is isolated from the development environment on the same box.

 l JBoss WildFly Application Server version dependencies: you might have multiple applications with some applications requiring version X of JBoss WildFly Application Server or version Y of JBoss WildFly Application Server.

 l JVM version dependencies: JBoss WildFly application server uses JDK 11.


TIBCO MDM Cache Calculation

TIBCO MDM uses cache objects of different sizes depending on the amount of data held. TIBCO MDM internally calculates the cache size for each object based on the ListSize provided in CacheConfig.xml for the individual objects. For heavily used cache objects, allocate more memory to the object files in the configuration file.

Depending on the memory resources available, set the ListSize of individual objects in CacheConfig.xml. A ListSize of -1 (<ListSize>-1</ListSize>) means that unlimited size is set for that particular object in the Apache Ignite cache. Depending on the ListSize defined in the configuration file, if the memory allocated for a particular object is full, the eviction policy comes into effect, which can impact performance. For high-performance scenarios, it is recommended to set the ListSize to -1 for the RECORD, RECORDMAXMODVERSION, PRODUCTKEY, and MV_VALUE objects.
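As a sketch only (the element and attribute names around ListSize are illustrative; check the CacheConfig.xml shipped with your installation for the exact structure), an unlimited list size for the RECORD object would be expressed as:

    <!-- Illustrative fragment; match it against the shipped CacheConfig.xml -->
    <CacheObject name="RECORD">
        <ListSize>-1</ListSize>
    </CacheObject>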


Apache Ignite Tuning

The generic tuning recommendations for Apache Ignite are as follows:

 l Disable internal events notifications: the default configuration supplied with TIBCO MDM disables all event notifications

 l Turn off backup: the default configuration supplied with TIBCO MDM does not configure backups; however, backups can be configured to avoid a single point of failure

 l Disable SWAP storage: the default configuration supplied with TIBCO MDM disables SWAP usage

 l Tune cache data rebalancing: any change in topology results in rebalancing. Rebalancing might require additional resources and affect cache performance. The default configuration of Apache Ignite is used.

 l Configure thread pools: Apache Ignite has its main thread pool size set to two times the available CPU count

 l Disable peer class loading: the default configuration supplied with TIBCO MDM disables peer class loading by defining the property peerClassLoadingEnabled=false in the IgniteMember.xml file (see the sketch after this list).

 l Tune garbage collection: Apache Ignite recommends G1 garbage collector with the following settings as a starting point for JDK 1.8:

-XX:NewSize=512m

-XX:SurvivorRatio=6

-XX:+AlwaysPreTouch

-XX:+UseG1GC

-XX:MaxGCPauseMillis=2000

-XX:GCTimeRatio=4

-XX:InitiatingHeapOccupancyPercent=30

-XX:G1HeapRegionSize=8M

-XX:ConcGCThreads=16

-XX:G1HeapWastePercent=10


-XX:+UseTLAB

-XX:+ScavengeBeforeFullGC

-XX:+DisableExplicitGC
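The peer class loading setting mentioned above is a standard Apache Ignite configuration property. As a sketch, assuming IgniteMember.xml defines the IgniteConfiguration bean in Spring XML, the property looks like this:

    <!-- Fragment of an IgniteConfiguration bean definition; surrounding bean XML omitted -->
    <property name="peerClassLoadingEnabled" value="false"/>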


Oracle Database Configuration and Performance Tuning

TIBCO MDM stores the master data of the enterprises in the backend database and retrieves potentially large amounts of data from the database. If the database does not perform well, the TIBCO MDM Server slows down as a result. A properly tuned backend database helps the TIBCO MDM Server run more efficiently.

For better performance of the Oracle database, you must perform some important database configuration. Consult an experienced database administrator (DBA) who can make the appropriate decisions for your particular database environment.

Initialization Parameters Configuration

The following table describes the initialization parameters set in the performance lab for testing medium and large systems. The SGA and PGA requirements for the performance payload were 24 GB and 10 GB, respectively.

Initialization Parameters

db_block_size = 8192: specifies (in bytes) the size of Oracle database blocks. Typical values are 4096 and 8192. The value of this parameter must be a multiple of the physical block size at the device level. This parameter affects the maximum value of the FREELISTS storage parameter for tables and indexes.

db_cache_size = 10G: specifies the size of the DEFAULT buffer pool for buffers with the primary block size (the block size defined by the DB_BLOCK_SIZE initialization parameter). The value must be at least 4M * number of CPUs * granule size (smaller values are automatically rounded up to this value).

db_file_multiblock_read_count = 16: minimizes disk input/output during table scans by specifying the maximum number of blocks read in one input/output operation during a sequential scan.

db_writer_processes = 10: useful for systems that modify data heavily. It specifies the initial number of database writer processes for an instance.

job_queue_processes = 10: specifies the maximum number of processes that can be created for the execution of jobs.

open_cursors = 3024: specifies the maximum number of open cursors (handles to private SQL areas) a session can have at once. This parameter prevents a session from opening an excessive number of cursors. It is important to set the value of OPEN_CURSORS high enough to prevent your application from running out of open cursors.

parallel_threads_per_cpu = 8: specifies the default degree of parallelism for the instance and determines the parallel adaptive and load balancing algorithms. The parameter describes the number of parallel execution processes or threads that a CPU can handle during parallel execution.

processes = 1000: specifies the maximum number of operating system user processes that can simultaneously connect to an Oracle server. This value accommodates all background processes such as job queue (SNP) and parallel execution (Pnnn) processes.

sessions = 2000: specifies the total number of user and system sessions.

optimizer_adaptive_features = true: enables or disables all of the adaptive optimizer features, including adaptive plans (adaptive join methods and bitmap plans), automatic re-optimization, SQL plan directives, and adaptive distribution methods.

Note: The optimizer_adaptive_features property (default value true) can have a large negative impact on performance, especially with a RAC database. If negative performance is observed, consider setting this property to false on the RAC database with Oracle 12c.
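For example (values as in the table above; have your DBA apply and verify them), initialization parameters can be set with ALTER SYSTEM:

    -- Illustrative only; scope and restart requirements vary by parameter
    ALTER SYSTEM SET open_cursors = 3024 SCOPE = BOTH;
    ALTER SYSTEM SET processes = 1000 SCOPE = SPFILE;  -- static parameter, requires an instance restart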

Tablespaces Configuration

TIBCO MDM manages the master data of the enterprises across different tables in the database. To manage the data in the database, create individual tablespaces for all the large tables and their indexes.

Create all the indexes including unique and primary keys for each large table in a separate tablespace.

Create separate tablespaces for the following tables and their associated indexes:

 l Master Catalog Table (MCT)

 l MV_SHARED Tables for different data types

 l MV Non Shared Tables (MVT)

 l Relationship Attribute Tables (RCT)

 l GOLDENCOPY

 l PRINCIPALKEY


 l PROCESSLOG

 l PRODUCTKEY

 l PRODUCTLOG

 l RELATIONSHIP

For optimal performance, it is recommended to place the data files on solid state drives (SSDs).
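As an illustration only (tablespace names, file paths, and sizes are placeholders; consult your DBA and the scripts mentioned in the note below), separate data and index tablespaces for a large table might be created as follows:

    -- Hypothetical tablespaces for a Master Catalog Table and its indexes
    CREATE TABLESPACE MCT_DATA
      DATAFILE '/u01/oradata/mdm/mct_data01.dbf' SIZE 10G AUTOEXTEND ON NEXT 1G;
    CREATE TABLESPACE MCT_IDX
      DATAFILE '/u01/oradata/mdm/mct_idx01.dbf' SIZE 5G AUTOEXTEND ON NEXT 1G;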

Note: Contact the engineering team for information regarding the scripts.

Database Table Pinning in Memory

To improve the performance of TIBCO MDM when using Oracle, ensure that the commonly used tables are pinned (cached) in memory.

The following commonly used tables must be pinned:

 l ASSOCIATION

 l CONFIGURATIONDEFINITION

 l DOMAIN

 l DOMAINENTRY

 l DOMAINLINK

 l DOMAINSTRING

 l DOMAINVALUE

 l RESOURCEACCESS

 l RESOURCEACL

 l QUEUEENTRY

 l FUNCTION

 l OBJECTSEQUENCE

 l ORGANIZATION

 l ENTERPRISE


 l WORKFLOWFORM

Run the sample script after the installation is complete. The sample script is located under $MQ_HOME/db/oracle/install/tablepinning.sql and contains a complete list of tables which must be pinned in memory.
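The exact statements are in the shipped tablepinning.sql script. As a sketch of the general technique only, a frequently used table can be kept in the buffer cache with a statement such as:

    -- Illustrative; the shipped script covers the complete list of tables
    ALTER TABLE ORGANIZATION STORAGE (BUFFER_POOL KEEP);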

Note: Consult your database administrator (DBA) to modify and run this script.


Microsoft SQL Server Database Configuration and Tuning

This chapter describes how to manage the data in the database and create individual data files for large tables.

Data Files Configuration

TIBCO MDM manages the master data of the enterprises across different tables in the database. To manage the data in the database, create individual data files for all the large tables and their indexes.

Create all the indexes including unique and primary keys for each large table in a separate data file.

Create separate data files for the following tables and their associated indexes.

 l Master Catalog Table (MCT)

 l MV_SHARED Tables for different data types

 l MV Non Shared Tables (MVT)

 l Relationship Attribute Tables (RCT)

 l GOLDENCOPY

 l PRINCIPALKEY

 l PROCESSLOG

 l PRODUCTKEY

 l PRODUCTLOG

 l RELATIONSHIP

Out-of-the-box scripts are used for the following actions:

 l Create file groups

 l Add data files in file groups

 l Move tables or indexes

 l Delete data files


Note: For additional information, contact the engineering team.

Database Optimizations on Microsoft SQL Server - Customer Study

The database optimization recommendations are based on the tuning that was carried out at one of our customer sites.

To optimize Microsoft SQL Server performance, start by running the SQL diagnosis, identify the bottlenecks, and then apply any configuration changes to the Microsoft SQL Server.

TIBCO followed the same approach in tuning the database at the customer location: the SQLdiag utility was run iteratively, the top bottlenecks were highlighted, and the common items, such as heavy input/output contention on the TEMPDB files, were addressed first.

These configuration changes might or might not be applicable to other customers.

The following information lists some of the bottlenecks that TIBCO identified and fixed after running the Microsoft SQL diagnosis or Microsoft SQL profiler.

Heavy Input Output Contentions on TEMPDB Files

Heavy input or output contention on the temp database was one of the bottlenecks that appeared in the analysis.

Optimum TEMPDB configuration is important for any solution that has heavy data access as in the case of TIBCO MDM. If you observe heavy TEMPDB input or output contention due to huge spikes on the Microsoft SQL server CPU, check the TEMPDB file. The contention can occur even if there is only one TEMPDB file.

If this is the case, you must consider increasing the number of TEMPDB files according to the initial data access requirements and server configuration.

Although there is no convention on the number of files or size, TIBCO recommends having one TEMPDB file per two CPUs with identical sizes.

Another factor to consider while configuring the TEMPDB files is the growth of each file. You can set the growth to auto-growth with a percentage growth factor, or to a no auto-growth limit. A Microsoft best practice is to allow no auto-growth, but you can also set an auto-growth of 300 MB.
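As an illustration (file name, path, size, and growth are placeholders; follow the one-file-per-two-CPUs guideline above and size all files identically), an additional TEMPDB file can be added with:

    -- Illustrative only
    ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdb2.ndf', SIZE = 4GB, FILEGROWTH = 300MB);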

CXPACKET Wait-type Observed Frequently

CXPACKET wait is another bottleneck TIBCO observed. This wait type occurs when SQL Server spawns a single query across multiple worker threads.

The default maximum degree of parallelism (MAXDOP) is set to 0, which allows a query to use all available worker threads. Microsoft recommends that you optimize the queries; however, because most of these are product queries, you might encounter limitations when optimizing some of them.

TIBCO recommends that you set the MAXDOP to 10. In this environment, there were 32 physical CPUs and 40 logical CPUs with hyper-threading.

TIBCO also recommends setting the MAXDOP to 25% of the total number of available CPUs. A MAXDOP of 10 allows a query to spawn across 25% of the total CPUs, leaving the remaining 75% available for other threads. If you allow one query to spawn across 100% of the CPUs, it might cause CPU contention.
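For example (the value 10 reflects the environment described above; have your DBA choose a value appropriate for yours), MAXDOP is changed with sp_configure:

    -- Illustrative only
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'max degree of parallelism', 10;
    RECONFIGURE;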

Lesser Dedicated Memory to Operating System

TIBCO recommends allocating sufficient memory to the operating system. If the operating system is running with less dedicated memory, it can become a bottleneck.

TIBCO also recommends increasing the memory dedicated to the operating system. A slow operating system might affect the database instances it hosts. This might not be reflected in the SQL profiler report.

Incorrect Snapshot Isolation Level Settings

It is important to set up the correct snapshot isolation level as recommended in the TIBCO MDM product documents. Even though this occurs in the earlier stages of the project, you must verify this configuration in case of performance issues. As noted in the product documentation, the isolation level is set to READ_COMMIT.
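A quick way to verify the current snapshot isolation settings is shown below; the database name is a placeholder, and the required values should be taken from the TIBCO MDM product documentation.

    -- Illustrative verification query
    SELECT name, snapshot_isolation_state_desc, is_read_committed_snapshot_on
    FROM sys.databases
    WHERE name = 'MDMDB';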


File-Grouping

TIBCO MDM product documents recommend creating various file-groups while setting up seed data. In addition, depending on the data model and the solution, you must create several file-groups so that the larger entities are on separate file-groups.

This reduces the input/output contention on the same file group. You can decide if you need more file groups by observing the file-group input/output in the SQL profiler. A Microsoft best practice is to limit the number of file groups.

For more details, see Microsoft SQL Server Database Configuration and Tuning.

Antivirus Configuration

The SQL Server runs on a Windows operating system containing antivirus software. TIBCO recommends that you configure the antivirus software to exclude SQL Server related files from on-access scanning.

Lack of Indexes on MVT Tables

TIBCO recommends that you include the recommended set of indexes on the MVT tables, including clustered indexes on CMODVERSION and CPRODUCTKEYID. TIBCO has addressed this product defect.

Query Optimization on Product Queries

The SQL profiler report might also show top select statements that include heavy logical reads or that consume heavy CPU cycles. These can be external queries that are not executed by TIBCO MDM itself but that communicate with TIBCO MDM. Based on regular observation, you might have to tune these types of queries. You can contact TIBCO MDM engineering for more information.

Storage or IO Contention

An SQL input/output test can highlight various storage or input/output related bottlenecks. Ensure that there are no faulty disks, which can cause a performance bottleneck.


Over-Aggressive Scheduled Maintenance Jobs

Over-Aggressive Backup Schedules

The backup schedule at the customer site was over-aggressive:

 l A full backup was completed daily.

 l A differential backup was completed once per hour.

Such backup schedules not only cause performance issues but also cause high CPU and input/output utilization. TIBCO recommends changing the scheduled maintenance activities to no incremental backups and a full backup once a week.

There were improvements in the CPU and input/output utilization which in turn improved performance at the customer location.

Over-Aggressive T-Logs Backup

To avoid overhead, TIBCO recommends configuring the T-Log backup schedule to run every four hours rather than every 15 minutes.

Over-Aggressive Reorganization and Rebuild of Indexes

To avoid overhead, set the schedule for the reorganization and rebuild of indexes to a weekly basis rather than every day.

Additional places to look for performance issues include your network and the overall hardware configuration of the database server.

The information in this chapter is based on a recent experience at a customer site and this effort is an ongoing activity.


EMS Configuration and Tuning

TIBCO MDM uses TIBCO EMS internally for proper functioning of the application. Minimal customization (pool sizes) is required for proper functioning of TIBCO MDM.

TIBCO MDM also uses some queues to exchange messages and events with external applications. You can set the pool size of the queues as well as for the individual members in the cluster.

The following figure shows how to set the pool size using the Configurator (NodeID > Async Task Management):

Setting the Pool Size

You can set the pool size for Asynchronous and Workflow queues for each member in the cluster.

Note: If the CPU on a TIBCO MDM instance is available, you can increase the pool size of asynchronous queues of TIBCO MDM instances to increase the throughput.


Load Balancing

Load balancing is configured for distributing workloads across multiple TIBCO MDM instances, network links, central processing units, disk drives, or other resources. Load balancing optimizes resource use, maximizes throughput, minimizes response time, and avoids overload. Using multiple components with load balancing instead of a single component increases the overall throughput of the system.

Load Balancing

You can configure load balancing by replicating the member properties in the Configurator. Additionally, you can configure load balancing to start TIBCO MDM instances on the same machine or on different machines, referring to the same configuration file with a different Member ID. Because all the cluster members point to the same EMS queues, the load on the system is distributed equally.

Use an Apache web server with mod_jk for HTTP load balancing with JBoss Application Server. Refer to the JBoss Application Server documentation for the configurations.
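As a rough sketch only (worker names, hosts, and AJP ports are placeholders; refer to the JBoss and mod_jk documentation for authoritative settings), a workers.properties file for balancing two TIBCO MDM instances behind Apache might look like:

    # Hypothetical mod_jk workers.properties for two MDM instances
    worker.list=loadbalancer
    worker.mdm1.type=ajp13
    worker.mdm1.host=mdmhost1
    worker.mdm1.port=8009
    worker.mdm2.type=ajp13
    worker.mdm2.host=mdmhost2
    worker.mdm2.port=8009
    worker.loadbalancer.type=lb
    worker.loadbalancer.balance_workers=mdm1,mdm2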


Database Loader Tuning

The Database Loader utility addresses the need to import millions of records along with their relationships. To optimize the time involved to import a large number of records, validation is not performed and workflows are not considered while importing records into a repository.

When to Use Database Loader

Database Loader provides fast loading of data with relationships and must be your preferred option if you need to import large initial data in a single load, with or without relationships.

Database Loader is faster because it does not include:

 l Workflows

 l Approval

 l Validations

 l Record history

 l Named version

Use Database Loader when the imported data is clean, does not require any validation, and only for initial versions. Data is mapped from data sources to input maps and imported as is.

Database Loader performance depends on the following factors:

 l Database setup: Database Loader performs bulk operations (mostly inserts, some updates and deletes); therefore, the larger the batch size, the more database resources it requires. If you cannot change the database parameters, consider using smaller sets per import.

 l Number and size of attributes: more attributes and larger attribute sizes take longer to import.

 l Indexes on MCT tables: although you can create many indexes on MCT tables to facilitate searches, this slows down inserts and deletes. Determine if you really need these indexes and consider if you can drop the indexes during the initial load.

 l Number of records in a repository: a larger size means the database must work harder to insert and delete the records. To address this issue, ensure that the database advisors do not report any problems with file systems, segments, or tablespaces. You can also partition your data.

 l Partitioning strategy: verify how your indexes are partitioned and that your partitioning is appropriate and not causing slow inserts.

You cannot improve Database Loader performance by adding CPU or memory to TIBCO MDM or cache instances.

Performance Best Practices for Database Loader

To obtain optimum performance when using Database Loader, consider the following points:

 l Create the indexes that are shipped with the product as part of the seed data.

 l Create Indexes on data source as mentioned in Data Source Indexes.

 l Create separate tablespaces or data files for large TIBCO MDM tables and Indexes. For information, see Tablespaces Configuration and Database Optimizations on Microsoft SQL Server - Customer Study.

 l Disable logging on the Oracle database for the following tables (a sketch follows this list):

 o MCT's

 o Productkey

 o Principalkey

 o Relationship

 o Goldencopy

 o MVT

 o RCT
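A sketch of the technique, assuming logging is re-enabled (and a backup is taken) after the initial load; the table names below are examples, and the MCT, MVT, and RCT table names are repository-specific:

    -- Illustrative only; repeat for each large table listed above
    ALTER TABLE PRODUCTKEY NOLOGGING;
    ALTER TABLE GOLDENCOPY NOLOGGING;
    -- After the load, restore the default:
    ALTER TABLE PRODUCTKEY LOGGING;
    ALTER TABLE GOLDENCOPY LOGGING;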

Data Source Indexes

When multiple data sources are joined in an import, and the load sizes are huge (for example, 200 K records), configuring indexes for data sources is helpful.


Indexes are created on the Data File (DF) tables for bulk loading scenarios.

Create the index file under $MQ_COMMON_DIR\enterpriseinternalname\datasource\datasourcename.idx. The file name must match the data source name. Indexes are created only when uploading data sources; therefore, the files must be created before the actual upload.

 l Example 1

If the CID column of the DF_33969_37793_TAB data source table is mapped to PRODUCTID; and not mapped to PRODUCTIDEXT, create an index file as UPPER("CID")

 l Example 2

If the CID column of the DF_33969_37793_TAB data source table is mapped to PRODUCTID, and CEXT is mapped to PRODUCTIDEXT, create an index file as UPPER("CID"),UPPER("CEXT")
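For instance, assuming an enterprise with the internal name acme and a data source named productfeed (both hypothetical), the index file for Example 2 would be $MQ_COMMON_DIR\acme\datasource\productfeed.idx, containing the single line:

    UPPER("CID"),UPPER("CEXT")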

Data Source Upload Limit

For large loads (Database Loader and Import) in a single batch, increase the data source upload limit. The default limit is 10485760 bytes. You can set the limit from the Configurator for the UI operation only, and not for the FileWatcher operation.

Data Source Upload Limit


 l Set the limit to 1048576000 bytes (1000 MB).

 l Zip the data source and upload it for larger loads.

For large loads, split the data to ensure the upload file size does not exceed 1 GB.


Import

Import is used in bulk loading of data with and without relationships. The Import utility addresses the need to import millions of records along with their relationships. To optimize the time involved to import a large number of records, validation is not performed and workflows are not considered while importing records into a repository.

Direct Load Import

Direct load import depends on the following factors:

 l CPUs available for processing records in parallel: if the CPU resources on a TIBCO MDM instance are available, you can increase the pool size of the asynchronous queue or the number of TIBCO MDM instances.

 l Preloaded data: this has a pronounced impact when the data you are importing are modifications of existing records. Preloading does not noticeably improve the performance when you import new records.

 l Database: the database insert performance could be a limiting factor. Insert performance depends on the number of indexes, the file system, and the partitioning strategy.

 l Database parameters: TIBCO recommends that you review the advisories generated by Oracle, focusing on SGA and PGA sizing.

Performance Best Practices for Import

To obtain optimum performance when using Import, consider the following points:

 l Create Indexes on data source as mentioned in Data Source Indexes.

 l Create separate tablespaces for each MDM table and Indexes. For information, see Tablespaces Configuration and Database Optimizations on Microsoft SQL Server - Customer Study.

 l If your load is large, TIBCO recommends splitting the load into smaller loads in the range of 500K and using the Import utility to load the data.


 l Disable logging on the Oracle database for the following tables:

 o MCT's

 o Productkey

 o Principalkey

 o Relationship

 o Goldencopy

 o MVT

 o RCT


Rulebase Execution Optimization

Typically, rulebase execution on a single record executes quickly and is a lightweight job. In cases where the overall bundle size is big, rulebase execution on the complete bundle has been seen as a significant contributor to execution time and, eventually, throughput.

You can have more control over the processing order. You can define dependencies in rulebases and utilize parallel execution within a bundle. With no dependencies to define, almost all records in a bundle can be processed in parallel to optimize overall rulebase execution time.

The following are some of the directives introduced to have more control over rulebase execution and optimization:

 l Evaluate Parent First (parentFirst)

 l Can Parallelize (parallelize)

 l Can Skip If no Change (skipIfNoChange)

For example, updating one or a few records from a bundle results in execution of the rulebase on all records in the bundle. With proper dependencies defined, execution can be optimized by utilizing parallel execution. Additionally, the skipIfNoChange directive can be used to skip complete rulebase execution when the record is not modified.

The following two properties have been added at the constraints level for performance improvement:

 l Parallelize

 l Active

For example, if the parallelize flag is enabled, the constraint can be executed in parallel with other constraints in the rulebase. If the active flag is disabled, the respective constraint is not evaluated.

For more information about rulebase execution optimization, see the "Rulebase Execution Optimization" section in TIBCO MDM Studio Rulebase Designer User's Guide.

Rulebase Parallel Execution

Depending on the dependencies defined, you can parallelize execution.


Because the rulebase for each record executes quickly and is a lightweight job, even the most expensive but well-tuned rulebases take approximately one to one and one-half seconds. As dependencies are understood, most of the executions can happen in parallel. However, due to the light nature of the tasks, it is not desirable to distribute them over JMS to other cluster members. Instead, threads within the same JVM are used. This, however, limits the degree of parallelism to the number of threads available on a single machine.

Rulebase execution uses the existing thread pool mechanism, which can be configured through ConfigValues (CIM Worker thread pool).

The work done in these threads is different from other usages of this thread pool. For example, each task can take one second for large rulebases. If this happens, the thread pool might lose threads if the pool is not configured correctly. Most of the rulebases for a bundle are small, and even with some large rulebases, the average is assumed to be no more than 100 milliseconds. To handle this, you can change the default thread pool configuration as follows:

 1. Change the worker job depth from 8 to 124. This sets the worker job depth to the maximum thread count so that the number of threads remains finite. Because the callerRunsTask policy is used, there is no danger of tasks being rejected if the queue fills up.

 2. Set the thread pool size to a minimum of 8 and a maximum of 32 or 2 multiplied by the number of cores, whichever is higher. The default maximum is 32.

Note: All records in a stack can be executed in parallel. After each stack, the next stack is executed only after all previously submitted tasks are complete.

Rulebase Execution Directive

You can provide execution directives to declare dependencies.

For example, you might need to evaluate the parent first: this declares that, in a bundle, the parent record must be evaluated first because the record's data has dependencies on the parent's data.

To determine dependencies, the following execution directives are used:

 l Skip First Pass

This is an existing directive which indicates that the first pass can be skipped.


However, if the rulebase has a connect action or disconnect action, the first pass directive is implicit and the first pass is always executed.

 l Evaluate Parent First (parentFirst)

This directive indicates that if there is one or more parents in the hierarchy, evaluate the parent first. This is usually indicated when children's data has a dependency on the parent or higher level parents.

 l Can Parallelize (parallelize)

This directive indicates that the record does not depend on the order of execution within the same stack and can be evaluated in parallel. This is the most commonly used directive. The following examples indicate when this directive is not used:

 o Data from siblings must be collated to perform some validation. In this case, the better way would be to do a separate validation step and evaluate it for one sibling only instead of repeating it for all.

 o The sibling must be updated based on other siblings, such as sequence numbers.

 l Can Skip If No Change (skipIfNoChange)

This directive applies only for operations that modify the record, such as merge, modify, or delete. It indicates that if records in a hierarchy are not modified, validation does not need to be done. This, in conjunction with the depth, can be used to speed up processing without losing any validation capability. For example, if the depth is set to two, all records up to a depth of two are explored, but they are not validated if they have not been modified.

Note: All modified records are validated, and there is no danger in not validating an unmodified record. You can use the depth and the skipIfNoChange directive with each other to skip some repositories while validating other hierarchies. This is the only correct usage for this directive.

Validation Algorithm on Rulebase Execution Directives

The validation algorithm for the rulebase execution directives establishes dependencies, uses a first pass for all records where the first pass applies, uses a second pass for all records, and processes relationship records.

Stack Process

The root is always in stack zero and is always processed first synchronously. Even if the root is not modified, the root is always validated. The stacks are processed in order of increasing stack numbers. All records within a stack must be processed before processing the next stack. If a stack has only one record, it is processed synchronously. Within a stack, records are processed in a random order. While processing a stack, all the records that can be processed in parallel are processed first by initiating the processing threads. After initiating these threads and without waiting for completion, all the other records that cannot be processed in parallel are processed one by one.

Records Process

Records are processed in two passes. The first pass must be complete before the second pass can start. The first pass is applied to only those records for which either of the following is true for the rulebase:

 l skipFirstPass is set to false in the rulebase directive.

 l The rulebase has a connect action.

First pass process

 l The first pass pre-validates the record. This pass formats data and identifies the records that are in error. If a validation class applies, the pass executes pre-validate on the validation class. This method is empty for StandardCatalogValidator. A validation class applies only if the record is either a deep copy or modified.

 l If Modify = true and there are connect actions, the first pass applies the connect actions. To execute connect actions, all assign and select actions are also executed.

 l If Modify = true, there are propagate actions, and skipFirstPass = true, the first pass applies propagate. During this time, assign actions are also executed.

 l The first pass initializes the rulebase by identifying and setting the record action, other context, and regular values.

Second pass process

 l The second pass pre-validates and prepares the rulebase if the first pass was skipped.

 l The second pass executes the full rulebase.

Relationship Records Process

After all stacks are processed, another full evaluation is done to process all relationship records. The relationship records are processed in stacks similar to records. There is only one pass for this. If there is only one record in the stack, it is processed synchronously.


Note: The order and properties of stacks for relationship records are identical to those for records. This means that the execution directives of the child record apply. Additionally, all relationship records are considered eligible for parallel execution, and the execution directive (canParallalize = false) is ignored.


Configuration Properties for Record Bundle Optimization

TIBCO MDM optimizes the record bundle traversal and validation without requiring any additional inputs.

The following table lists the properties for record bundle optimization specified in the Configurator:

Validation Record Bundle Depth (com.tibco.cim.optimization.recordbundlevalidation.depth)
Location: Cluster level (InitialConfig > Optimization)
Value: Default value is 1
Description: This property defines the depth for validations of related records in the bundle. Unless there are dependencies between the records at multiple levels (that is, unless validation of a record at any level requires records at other levels), a depth of 1 is sufficient. If A is related to B, B is related to C, C is related to D, and the depth is specified as 2, the record bundle loads up to C, that is, it loads all related records up to depth 2.

Relationship Exclusion List (tibco.optimization.recordbundle.excluderelationship)
Location: Cluster level (InitialConfig > Optimization)
Value: Default value is CatalogName.RelationshipName,CatalogName.
Description: List of relationships excluded from loading in the bundle.

Configuration Properties for Rulebase Directives

The following table lists the properties for rulebase directives specified in the Configurator.

Default rulebase directive - canParallalize (com.tibco.cim.rulebase.directive.canparallalize)
Location: Cluster level (InitialConfig > Rule Base)
Value: Default value is true
Description: This property indicates whether rulebase execution should be parallelized when no directive is specified for a rulebase.

Default rulebase directive - skipifnochange (com.tibco.cim.rulebase.directive.skipifnochange)
Location: Cluster level (InitialConfig > Rule Base)
Value: Default value is true
Description: This property indicates whether the rulebase should be skipped when there is no data change and no directive is specified for the rulebase.


Impact on Propagate Rulebase

Propagate is done in the first pass of the rulebase. When propagation is done, all child records affected by the propagate are updated.

The following is the propagate process:

 l When dependencies are evaluated, all child records are pushed to a higher stack so that parent evaluation finishes first. Update to children happens as part of the parent propagation.

 l When execution is done, the properties of the propagation rulebase are checked. If it is INLINE, the default applies. If the propagation rulebase properties are set to serialize, the execution is serialized; otherwise, propagate is executed in parallel.


Workflow Execution

TIBCO MDM uses a workflow message to initiate any new process, and the messages are posted to JMS queues for processing. When the JMS queue receives a message, an event (associated with the message) and a document are created from the message and stored in the distributed cache. The event and document type and subtype (defined in the workflow manager configuration file) determine the execution process.

Regular Workflow Processing

A workflow is executed in the transactional mode.

 l When the process is executed in transactional mode, TIBCO MDM persists the event to the database and the documents to the file system. The file name is associated with the event in the database. The event is then posted to a workflow queue for processing.

 l The event type and subtype determine the workflows triggered for an event. The workflow reads the message associated with the event from the file system and starts a new process. This file is the main input parameter to the workflow. Each workflow has a set of activities which in turn have input and output parameters.

 l Activity state changes are stored to the database. Activities can generate documents (written to the file system) that act as output parameters for that activity and input parameters to a subsequent activity in the flow.

In-Memory Workflow Processing

TIBCO MDM retrieves the workflow data from the cache. The workflow can run without persisting the workflow state information when the workflow is executed in the in-memory mode.

The following object types are put in memory:

 l Event

 l EventDetails


 l Process

 l ProcessLog

 l ProcessDetails

 l ProcessState

 l AttributeLog

 l MLXMLDoc

Input and output parameters such as XML files are stored in memory and are not written to a disk. When a workflow is executed, the local cache manages the workflow states and documents.

Impact of In-Memory Workflows on UI

This section describes how in-memory workflows impact the UI.

Checking Progress in the Event Log

 l An event is not generated when a workflow is run as in-memory. In this case, you cannot check event details for operations triggered by an in-memory workflow. However, if you have configured in-memory workflows that are persisted on success (for example, if you set the savestateonsuccess property to true), and the event is generated, you can view the result in the Event Log after the workflow has finished executing.

 l Intermediate event logs are available if the workflow contains the CheckpointWorkflow activity, async subflow, spawn workflow activity, or if the activity suspends. If a workflow errors out, the entire Event Log is available for debugging.

 l If a data source upload or an import operation triggers an in-memory workflow, the Check Progress link (which you can use to track progress in case of regular workflows) redirects the event to an appropriate page with a message that the event is an in-memory operation.

Checking Progress of Records for Workflows Run In-Memory

If you add a record in-memory, you can check the success of the workflow by searching for the record in the catalog or checking the record count in the catalog.


Record history does not contain any reference to events processed in-memory. The link to the event is disabled, and the event column shows an "in memory" message.

Regular Workflows versus In-Memory Workflows

The following table compares and contrasts the regular workflow with the in-memory workflow.

Regular workflows: during each activity execution, process states are persisted to the database and distributed cache.
In-memory workflows: during each activity execution, process states are written to the local cache.

Regular workflows: input, output, and intermediate documents are persisted to the file system.
In-memory workflows: input, output, and intermediate documents are maintained in the local cache.

Regular workflows: each activity is considered a transaction.
In-memory workflows: the entire workflow (consisting of multiple activities) is considered a transaction.

Regular workflows: when a transaction is committed, all workflow changes (process states, documents, and record related data) in that activity are committed.
In-memory workflows: when a transaction is committed, record related data in the workflow is committed. If the saveonsuccess flag is on, data from the local cache (process states, documents) is committed.

In-Memory Configuration through the Configurator

You can use the Configurator to enable in-memory workflows. The setting is available under Workflow Settings in the Advanced Configuration outline.

To complete the in-memory configuration, add the names of the workflows you want to execute in-memory in the Configurator. You can add workflows to the following two lists:

 l List of Workflows that run in-memory: The Value column of this property contains a pop-up dialog where you can enter the names of the workflows you want to run in-memory. Click the cross icon to remove a specified workflow from in-memory execution.


 l List of workflows that run in-memory and whose state needs to be persisted on success: The Value column of this property contains a pop-up dialog where you can enter the names of the workflows you want to run in-memory and persist to the database on success. Click the cross icon to remove a specified workflow from the in-memory execution and database persistence.

Note: You cannot deploy the property at runtime. Therefore, you must restart the server after saving the configuration.

Workflow Settings

When to Use In-Memory Workflow Execution

In-memory workflow execution provides a boost in performance; throughput is two to three times better than in the regular workflow execution mode.

In-memory execution is most effective if you can eliminate the persisted process state. It also saves disk space in the database.

TIBCO does not recommend using in-memory workflow in development.

Additionally, there is an intermediate mode where workflows run in-memory and where the state is persisted on success.

Workflow Configurations

You can use the following configurations for web services tests under load for regular workflow execution, in-memory workflow execution, and concurrent scenarios (involving UI, web services, and bulk import scenarios simultaneously).


The web service request Validation property enables inbound web service request validation: the incoming request is validated against the XML schema. Validation adds some processing overhead, so the default value is set to false for performance. You can set this property through the Configurator.

Web services Tests

These TIBCO MDM configurations vary according to the number of concurrent users accessing the system:

 l Maximum concurrent http service count: This property sets the maximum number of HTTP threads (web services and HTTP). The default value is 20, and the property is specific to each application server member.

 l Maximum concurrent web service listener count: This property sets the maximum number of web service listeners. The default value is 10, and the property is specific to each application server member.

TIBCO MDM Configurations


Limitations for In-Memory Workflows

Do not use the following activities for in-memory workflows:

 l UploadData Source

 l Purge

 l ProcessServiceMessage

 l ImportCatalog

 l ImportClassificationScheme

 l ReclassifyRecord

Additionally, in-memory workflows cannot be used in case of CIM2CIM. Mass update and synchronization export are not supported in-memory.


Workflow Activity Parallelization

Asynchronous execution of activities provides numerous benefits, including the ability to process data in batches and to execute batches in parallel.

Activity parallelization works as follows (a conceptual sketch follows this list):

 1. A COMMAND in and out parameter is used to indicate run modes for the activity. This is an internal parameter and no configuration is needed. A null command indicates that the activity is executing for the first time.

 2. When the activity suspends, it sets the command to indicate the next step.

 3. The workflow passes this command back to the activity when the activity restarts.

 4. The activity initializes the run counters (these counters are stored in the ProcessDetail Table):

 l Total number of records to be processed.

 l Initial counter for records processed, set to zero.

 l Hidden (with respect to the UI).

 5. The activity initiates parallel batch processing and suspends.

 6. The activity creates batches of the records to be processed and sends messages for each batch.

 7. Each batch keeps track of the record/bundle count for the batch.

 8. At the end of each batch, the processing batch increments the counters in an atomic operation.

 9. At the end of processing, the batch checks for restart: if the total number of records processed matches the total number of records to be processed, a restart event is sent.

 10. The workflow manager restarts the workflow and resumes the suspended activity.
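The counter-and-restart pattern in the preceding steps can be illustrated conceptually. The following Java sketch is not the TIBCO MDM implementation: the class and method names are hypothetical, and it uses a local thread pool with an atomic counter where TIBCO MDM dispatches each batch as a JMS message.

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Conceptual sketch of the counter-and-restart pattern described above.
// Names are hypothetical; the product dispatches each batch as a JMS message.
public class ParallelBatchSketch {

    private final AtomicInteger processed = new AtomicInteger(0);

    public void runInBatches(List<String> records, int recordsPerBatch) {
        final int total = records.size();          // step 4: total records to be processed
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Steps 5-6: create batches and dispatch one task (message) per batch.
        for (int start = 0; start < total; start += recordsPerBatch) {
            final List<String> batch =
                    records.subList(start, Math.min(start + recordsPerBatch, total));
            pool.submit(() -> {
                batch.forEach(ParallelBatchSketch::processRecord);
                // Step 8: increment the shared counter atomically at the end of the batch.
                // Step 9: if all records are processed, send the restart event.
                if (processed.addAndGet(batch.size()) == total) {
                    sendRestartEvent();            // step 10: the workflow manager resumes the activity
                }
            });
        }
        pool.shutdown();
    }

    private static void processRecord(String record) {
        // record-level processing
    }

    private void sendRestartEvent() {
        System.out.println("All batches complete; restart event sent");
    }
}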


Activity Parallelization Configuration

The following properties are specified in the Configurator (Initial Config > Basic > Optimization):

 l Records Per Asynchronous Call (com.tibco.cim.optimization.recordsperasynccall): Fine-tunes the processing of records in batches. The records to be processed are grouped into asynchronous batches, and each batch contains the configured number of records. A larger value might provide performance improvements in certain situations. Default value: 100.

 l Bundles Per Asynchronous Call (com.tibco.cim.optimization.bundlesperasynccall): Defines how many groups of records (bundles) are allowed in one separate asynchronous processing batch. Setting this value higher can lead to performance improvements in certain situations. Default value: 20.

Optimization Properties

Override in the Activity

You can override the values by specifying the following parameters in the activity:

 l <Parameter direction="in" name="RecordsPerAsyncCall" type="long" eval="constant">10</Parameter>

 l <Parameter direction="in" name="BundlesPerAsyncCall" type="long" eval="constant">10</Parameter>

Activity Timeout

It is possible that an activity takes too long to complete or does not correctly restart. In this case, the activity times out, and the activity must handle the timeout. This special timeout is pre-configured using a default value:

com.tibco.cim.optimization.parallelactivity.timeout (default value 24)

In most cases, the activity does not do anything other than setting the status to Timeout.


Async Call Queue

The application can initiate a task in the background using the Async Call queue. An “Async Call” queue is defined with appropriate senders and receivers:

 l AsyncCallQueueSenderManager

 l AsyncCallQueueReceiverManager

This configuration provides a default async call listener that expects all async calls to pass a handler. The handler must implement the IAsyncCallable interface.

For example:

public class AsyncCatalogImport implements IAsyncCallable{

This interface has an onAsyncCall method, which is invoked by the listener.

public void onAsyncCall() {
    System.out.println("Processing importData/processRelationship : " + importData + "/" + processRelationship + " onAsyncCall()");
    process();
}

To initiate a call, create the AsyncCallable object, initialize it with the input parameter, and then send it for async processing as follows:

AsyncCaller.callAsync(object); // object is the AsyncCallable object
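Combining the fragments above, the following is a minimal sketch of a complete handler. It assumes that the IAsyncCallable interface declares a no-argument onAsyncCall() method, as the fragment above suggests; the fields, constructor, and process() method are illustrative placeholders and not part of the TIBCO MDM API.

// Minimal sketch of an asynchronous handler. The fields, constructor, and
// process() method are illustrative placeholders.
public class AsyncCatalogImport implements IAsyncCallable {

    private final String importData;            // hypothetical input parameter
    private final boolean processRelationship;  // hypothetical input parameter

    public AsyncCatalogImport(String importData, boolean processRelationship) {
        this.importData = importData;
        this.processRelationship = processRelationship;
    }

    // Invoked by the default async call listener when the queued call is received.
    public void onAsyncCall() {
        System.out.println("Processing importData/processRelationship : "
                + importData + "/" + processRelationship);
        process();
    }

    private void process() {
        // perform the actual background work here
    }
}

The handler is then created, initialized, and queued:

AsyncCatalogImport task = new AsyncCatalogImport("catalog.xml", true);
AsyncCaller.callAsync(task); // queues the task on the Async Call queue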


Activity Parallelization Workflow

Record Bundling Optimization

When a bundle of records is created, all related records are loaded into the bundle. If the bundle is too large, it takes a long time to load. You can reduce the time it takes to display records as follows (a conceptual sketch of depth-limited validation follows this list):

 l When a bundle is loaded for view/edit, it is validated to pre-configured depth.

 l Every time a user navigates the bundle, the next level is processed.

 l The default depth is set to 2.

 l If any record is modified and the bundle is saved:

 o All modified records are validated.

 o Related records are validated to specified depth.


 l The UI display changes to show only the specified depth.

 l If the record bundle has a large hierarchy, loading and displaying the bundle on the UI can be controlled or restricted through configuration.

 l Custom Validations can be turned off or on during record view.
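The depth-limited behavior described above can be sketched conceptually. In the following Java sketch, the Record type, getRelatedRecords(), and validate() are hypothetical placeholders and not part of the TIBCO MDM API; the sketch only illustrates validating a bundle down to a configured depth.

import java.util.List;

// Conceptual sketch: validate a record and its related records only down to a
// configured depth. Record, getRelatedRecords(), and validate() are placeholders.
public class DepthLimitedValidator {

    private final int maxDepth; // for example, 2 (the default depth described above)

    public DepthLimitedValidator(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    public void validateBundle(Record root) {
        validate(root, 0);
    }

    private void validate(Record record, int depth) {
        record.validate(); // run validations for this record
        if (depth >= maxDepth) {
            return; // deeper levels are processed later, as the user navigates the bundle
        }
        List<Record> related = record.getRelatedRecords();
        for (Record child : related) {
            validate(child, depth + 1);
        }
    }

    // Hypothetical record type used only for this sketch.
    public interface Record {
        void validate();
        List<Record> getRelatedRecords();
    }
}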

Record Caching Optimization

Only one record is added to the cache when it is requested. Previously, when records were requested, they were retrieved from the database and added to the cache.

Records can be preloaded into the cache when the server is restarted; at runtime, when a record is requested, it can be retrieved from the cache and returned. This is done through asynchronous messaging.

The message for a given Organization loads all PRODUCTKEYS into the cache. All records for a given repository and organization are loaded into the cache.

Processing of the PRODUCTKEY objects and the records is done in the same message, that is, synchronously. If both must be processed in parallel, add a new instance of the class in the ConfigValues.xml file, passing PRODUCTKEY in one instance and RECORD in the second for the catalogName and OrganizationName configurations.

<ConfValue description="The list of catalog/repository names for which the record data should be cached on startup. Specify a comma separated list. Example : MASTERCATALOG, TEST"
           isHotDeployable="false" listDefault="DEMO"
           name="Cache Preloader Catalog/Repository Name List"
           propname="com.tibco.cim.init.PreLoadManager.catalogName"
           sinceVersion="7.0" visibility="Advanced">
  <ConfList>
    <ConfListString value="DEMO" />
  </ConfList>
</ConfValue>


<ConfValue description="The list of organization names used to select catalogs/repositories for preloading on startup. This should correspond to catalog/repository names. Specify a comma separated list. Example : MYORG, TIBCOCIM"
           isHotDeployable="false" listDefault="TIBCOCIM"
           name="Cache Preloader Organization List"
           propname="com.tibco.cim.init.PreLoadManager.OrganizationName"
           sinceVersion="7.0" visibility="Advanced">
  <ConfList>
    <ConfListString value="TIBCOCIM" />
  </ConfList>
</ConfValue>

<ConfValue description="List of object types which should be cached on startup. Only the record (RECORD) and the key information of the record (PRODUCTKEY) are supported right now."
           isHotDeployable="false" listDefault="RECORD PRODUCTKEY"
           name="Cache Preloader Record Types"
           propname="com.tibco.cim.init.PreLoadManager.ObjectName"
           sinceVersion="7.0" visibility="All">
  <ConfList>
    <ConfListString value="RECORD" />
    <ConfListString value="PRODUCTKEY" />
  </ConfList>
</ConfValue>

<ConfValue description="The list of input map names used to filter records for preloading on startup. Example : INPUTMAP1"
           isHotDeployable="false" listDefault="DEMO"
           name="Cache Preloader Input Map Name List"
           propname="com.tibco.cim.init.PreLoadManager.inputMapName"
           sinceVersion="7.1" visibility="All">
  <ConfList>
  </ConfList>
</ConfValue>

If an input map is specified, records and product keys for the data source related to that input map are loaded into the cache.

Preloading can also be done through a utility ($MQ_HOME/bin/preload.sh or .bat) while the server is running. The utility sends an asynchronous message to preload records and product keys per the configuration in the config file.


Preload

Use Preload to load important data quickly at startup and enhance TIBCO MDM performance. Preload uses multiple threads across all nodes in the TIBCO MDM cluster.

You can preload the following objects:

 l Product keys

 l Records and record version numbers

 l Synchronized records and synchronization logs

 l Repository metadata

 l Enterprise specific data such as Enterprise, organization, users, or roles

Preload configuration is common to the whole cluster and must not be configured as member specific properties. To set up the preload, start Configurator and go to InitialConfig > Optimization.

Preload Optimization


Performance Best Practices for Preloading

Use the following best practices when you use preload:

 l Size the cache according to the requirements (number of records to be preloaded, size of records, percentage of synchronized records). Refer to TIBCO MDM Cache Calculation.

 l Enable the preload tranche for preloading large volumes of data.

 l Set the preload tranche size high for preloading large volumes of data. A tranche enables quick loading of the data from a single repository by splitting the repository's records into portions and initiating the load for each portion in parallel.

In the performance lab, the tranche size is set to 500,000 for preloading 35 million records.
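With these values, the 35 million records are split into 35,000,000 / 500,000 = 70 tranches, each of which is initiated as a separate asynchronous preload message and processed in parallel.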

For more information about the preload properties, see the Configuring Preload Properties section in TIBCO MDM System Administration.

Preloading is an asynchronous activity. After you initiate preloading, JMS messages are queued up on the ASYNC queue. An increase in the number of listeners or TIBCO MDM members increases the processing speed. Preloading also supports clustering.

Guidelines for Sizing

The following information provides guidelines for sizing the cache appropriately based on the data to be preloaded. The sizing can vary with the number of records and repositories, the size of records, the metadata definition, the number of synchronized records, and the objects to be preloaded.

Preload


Note: The performance lab has four repositories with a fixed number of attributes. The preceding estimates are only for RECORD, RECORDMAXMODVERSION, and PRODUCTKEY objects preloaded. For more information, see TIBCO MDM Cache Calculation.


Purge

Purge, a data clean-up operation, removes data from the database, cache, and text index, unlike the logical delete done in other delete operations. Purge enables you to keep only the required data and reduce the disk capacity required.

Use the command line to purge the following objects:

 l Records

 l Repositories

 l Data sources

 l Events

To trigger purge, use the command line utility $MQ_HOME/bin/datacleanup.sh.

For example, the command datacleanup.sh -o repository -a 69390 -rn ALL -m 20101 purges all records of all repositories in enterprise ID 69390 and member ID 20101.
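Based on this example, -o specifies the object type to purge, -a the enterprise ID, -rn the repository name (ALL purges all repositories), and -m the ID of the member who initiates the purge; see the product documentation for the complete list of options.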

The purge log file (located in $MQ_COMMON_DIR\Temp\Year\Month\Date\Hour) provides details such as:

 l Purge Start Date

 l Member who initiated the purge

 l Event Descriptor

 l Repository or the Repositories’ ID to which Purge is applicable.

 l Exec mode

 l Number of rows deleted

 l Number of records deleted

 l All relevant data

Performance Best Practices for Purge

These are some of the best practices to follow when you use Purge:


 l Create or alter indexes on MATCHRESULTDETAILS, MATCHCANDIDATE_PTIDX1, MATCHCANDIDATEDETAILS_PTIDX1, MERGERESULT_PTIDX1, PRODUCTSTATUS, PRODUCTKEY, ACTIVITYRESULT and RELATIONSHIPDEFINITION tables.

Contact the TIBCO Engineering team for the index scripts.

 l Resize the redo logs based on an analysis of Automatic Workload Repository (AWR) reports, in particular the redo size and the redo interval. Consider increasing both the size and the number of redo logs.

 l Log file switch: for a 10 minute switch, the total redo log file size must be approximately 2.5 GB. This enables the log switch to occur in 12 to 15 minutes.

 o Use the Report Summary section of the AWR report to calculate the required size.

 o In the Load Profile, check the redo size per second and calculate the size required for 12 to 15 minutes of redo (see the worked example after this list).

 o Next, check the assigned redo logs. You can increase the size of the logs and the number of logs. If the log file size is too large, the switch does not occur for a long time, which delays data recovery and the writing of data. If there are too many log files, multiple concurrent accesses are supported, but switches still occur.

 o Consider moving the redo logs to faster disks to avoid log file switch delays.

 o Set the size of the System Global Area (SGA) in Oracle, based on the recommendations in Automatic Database Diagnostic Monitor (ADDM) reports.
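As a worked example with illustrative numbers only: if the AWR Load Profile reports roughly 3 MB of redo per second, then 15 minutes of redo is 3 MB/s x 900 s = 2,700 MB (roughly 2.6 GB), which is in line with the approximately 2.5 GB guidance above. Spread across four redo log groups, that is roughly 675 MB per log file.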


Configuration for Logging

TIBCO MDM logs the entire request and response data depending on the log level set for the application.

You can set the following logging levels for the TIBCO MDM server through Configurator:

 l Debugging level for TIBCO MDM package

 l MASSUPDATEUI package

 l DQ package

 l HeaderExtractor package

 l Root logging level

In the following figure, the logging is set at an individual application server member:

Logging

Standard Logging Levels

You can set the following log levels through the Configurator (NodeID > Logging > Standard Log):

 l FATAL


 l ERROR

 l WARN

 l INFO

 l DEBUG

Note:
 o Set the log level to DEBUG to debug functional issues as well as to understand the flow. You must also set logging to DEBUG in the production environment if you encounter errors and need to debug the issue.
 o Set the log level to ERROR for performance scenarios because logging set to DEBUG might impact performance by 15 percent.

Logging is hot deployable and you can change the logging level without restarting the TIBCO MDM server.


Performance Tuning Tips

 l If you have very large workflows, split them into smaller sub-flows.

 l Reduce the workflow pool size to 1 if you have limited memory, by setting the value for com.tibco.cim.init.WmQueueReceiverManager.poolSize in the Configurator. The recommended maximum pool size is 4 to 6.

 l Review validation rules. Reviewing all validation rules to simplify their logic improves performance. With the increased robustness of the rulebase syntax, you might be able to reduce the time to view, validate, and save records and optimize performance.

 l Modify enumerated data lists. It is recommended that enumerated data lists (valid value lists) be changed to use data sources. TIBCO MDM caches data sources, which helps to improve display time for the record view and edit page. It is also recommended that drop-down lists contain no more than 100 choices.

 l Reduce the revivify frequency. The revivify interval is used to time out work items and to restart workflows that have timed out. When set to a high frequency, it slows down all aspects of TIBCO MDM. Reduce the revivify frequency as follows:

 o Set it to an interval of 20 hours (a value of 72,000,000 milliseconds).

 l When using Oracle, caching a few tables in Oracle memory is recommended.

 l Optimization and faster loading of the Relationship tab: The default value for the Rulebase execution on related records (com.tibco.ui.rulebase.processrelated.flag) property is set to false. This means that rendering of the Relationship tab is deferred until the user visits the tab.


TIBCO Documentation and Support Services

How to Access TIBCO Documentation

Documentation for TIBCO products is available on the TIBCO Product Documentation website, mainly in HTML and PDF formats.

The TIBCO Product Documentation website is updated frequently and is more current than any other documentation included with the product. To access the latest documentation, visit https://docs.tibco.com.

Product-Specific Documentation

The following documentation for TIBCO® MDM is available at the TIBCO® MDM Product Documentation page:

 l TIBCO® MDM Release Notes

 l TIBCO® MDM Installation and Configuration

 l TIBCO® MDM Custom Installation

 l TIBCO® MDM Cloud Deployment

 l TIBCO® MDM User's Guide

 l TIBCO® MDM System Administration

 l TIBCO® MDM Customization

 l TIBCO® MDM Workflow Reference

 l TIBCO® MDM Web Services

 l TIBCO® MDM Best Practices

 l TIBCO® MDM Performance Tuning

 l TIBCO® MDM Troubleshooting

 l TIBCO® MDM Security Guidelines

 l TIBCO® MDM API Reference

To access documentation for this product, go to the following location:

TIBCO_HOME/release_notes/TIB_cim_9.3.0_docinfo.html


Here, TIBCO_HOME is the top-level directory in which TIBCO products are installed. On Windows, the default TIBCO_HOME is C:\tibco. On UNIX systems, the default TIBCO_HOME is /opt/tibco.

How to Contact TIBCO Support

You can contact TIBCO Support in the following ways:

 l For an overview of TIBCO Support, visit http://www.tibco.com/services/support.

 l For accessing the Support Knowledge Base and getting personalized content about products you are interested in, visit the TIBCO Support portal at https://support.tibco.com.

 l For creating a Support case, you must have a valid maintenance or support contract with TIBCO. You also need a user name and password to log in to https://support.tibco.com. If you do not have a user name, you can request one by clicking Register on the website.

How to Join TIBCO Community

TIBCO Community is the official channel for TIBCO customers, partners, and employee subject matter experts to share and access their collective experience. TIBCO Community offers access to Q&A forums, product wikis, and best practices. It also offers access to extensions, adapters, solution accelerators, and tools that extend and enable customers to gain full value from TIBCO products. In addition, users can submit and vote on feature requests from within the TIBCO Ideas Portal. For a free registration, go to https://community.tibco.com.


Legal and Third-Party Notices

SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO SOFTWARE IS SOLELY TO ENABLE THE FUNCTIONALITY (OR PROVIDE LIMITED ADD-ON FUNCTIONALITY) OF THE LICENSED TIBCO SOFTWARE. THE EMBEDDED OR BUNDLED SOFTWARE IS NOT LICENSED TO BE USED OR ACCESSED BY ANY OTHER TIBCO SOFTWARE OR FOR ANY OTHER PURPOSE.

USE OF TIBCO SOFTWARE AND THIS DOCUMENT IS SUBJECT TO THE TERMS AND CONDITIONS OF A LICENSE AGREEMENT FOUND IN EITHER A SEPARATELY EXECUTED SOFTWARE LICENSE AGREEMENT, OR, IF THERE IS NO SUCH SEPARATE AGREEMENT, THE CLICKWRAP END USER LICENSE AGREEMENT WHICH IS DISPLAYED DURING DOWNLOAD OR INSTALLATION OF THE SOFTWARE (AND WHICH IS DUPLICATED IN THE LICENSE FILE) OR IF THERE IS NO SUCH SOFTWARE LICENSE AGREEMENT OR CLICKWRAP END USER LICENSE AGREEMENT, THE LICENSE(S) LOCATED IN THE “LICENSE” FILE(S) OF THE SOFTWARE. USE OF THIS DOCUMENT IS SUBJECT TO THOSE TERMS AND CONDITIONS, AND YOUR USE HEREOF SHALL CONSTITUTE ACCEPTANCE OF AND AN AGREEMENT TO BE BOUND BY THE SAME.

This document is subject to U.S. and international copyright laws and treaties. No part of this document may be reproduced in any form without the written authorization of TIBCO Software Inc.

TIBCO, the TIBCO logo, the TIBCO O logo, and BusinessConnect, ActiveMatrix BusinessWorks, and Enterprise Message Service are either registered trademarks or trademarks of TIBCO Software Inc. in the United States and/or other countries.

Java and all Java based trademarks and logos are trademarks or registered trademarks of Oracle Corporation and/or its affiliates.

This document includes fonts that are licensed under the SIL Open Font License, Version 1.1, which is available at: https://scripts.sil.org/OFL

Copyright (c) Paul D. Hunt, with Reserved Font Name Source Sans Pro and Source Code Pro.

All other product and company names and marks mentioned in this document are the property of their respective owners and are mentioned for identification purposes only.

This software may be available on multiple operating systems. However, not all operating system platforms for a specific software version are released at the same time. See the readme file for the availability of this software version on a specific operating system platform.

THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.


THIS DOCUMENT COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS OF THIS DOCUMENT. TIBCO SOFTWARE INC. MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THIS DOCUMENT AT ANY TIME.

THE CONTENTS OF THIS DOCUMENT MAY BE MODIFIED AND/OR QUALIFIED, DIRECTLY OR INDIRECTLY, BY OTHER DOCUMENTATION WHICH ACCOMPANIES THIS SOFTWARE, INCLUDING BUT NOT LIMITED TO ANY RELEASE NOTES AND "READ ME" FILES.

This and other products of TIBCO Software Inc. may be covered by registered patents. Please refer to TIBCO's Virtual Patent Marking document (https://www.tibco.com/patents) for details.

Copyright © 1999-2020. TIBCO Software Inc. All Rights Reserved.

