Informatica PowerMart / PowerCenter 7 Basics
Education Services PC6B-20030512
Informatica Corporation, 2003. All rights reserved.
2
Course Objectives

At the end of this course you will:
• Understand how to use all major PowerCenter 7 components
• Be able to perform basic Repository administration tasks
• Be able to build basic ETL Mappings and Mapplets
• Be able to create, run and monitor Workflows
• Understand available options for loading target data
• Be able to troubleshoot most problems
3
Extract, Transform, and Load

Operational Systems (RDBMS, Mainframe, Other):
• Transaction-level data
• Optimized for transaction response time
• Current
• Normalized or de-normalized data

Data Warehouse (Decision Support):
• Aggregated data
• Historical

ETL: Extract from the operational systems, Transform (aggregate data, cleanse data, consolidate data, apply business rules, de-normalize), and Load into the warehouse.
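The extract/transform/load flow above can be sketched in a few lines of Python. This is an illustrative stand-in only (not Informatica syntax); the order records and region names are hypothetical.

```python
# Minimal ETL sketch: extract from an operational source, cleanse and
# aggregate in the transform step, then load into a warehouse structure.

def extract():
    # Stand-in for reading transaction-level rows from an operational system
    return [
        {"region": "EAST", "amount": "100"},
        {"region": "east", "amount": "250"},
        {"region": "WEST", "amount": "75"},
    ]

def transform(rows):
    # Cleanse (normalize case, cast types) and aggregate by region
    totals = {}
    for row in rows:
        region = row["region"].upper()                               # cleanse
        totals[region] = totals.get(region, 0) + int(row["amount"])  # aggregate
    return totals

def load(totals, warehouse):
    # Stand-in for writing aggregated, historical data to the warehouse
    warehouse.update(totals)

warehouse = {}
load(transform(extract()), warehouse)
print(warehouse)  # {'EAST': 350, 'WEST': 75}
```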
4
PowerCenter Architecture
5
PowerCenter 7 Architecture
Not Shown: Client ODBC Connections for Source and Target metadata
[Architecture diagram: heterogeneous Sources and Targets connect natively to the PowerCenter Server; the Repository Server and its Repository Agent manage the Repository over native connections; the client tools (Repository Manager, Designer, Workflow Manager, Workflow Monitor, Rep Server Administrative Console) communicate over TCP/IP.]
6
PowerCenter 7 Components
PowerCenter Repository
PowerCenter Repository Server
PowerCenter Client
• Designer
• Repository Manager
• Repository Server Administration Console
• Workflow Manager
• Workflow Monitor
PowerCenter Server
External Components
• Sources
• Targets
7
Repository Topics
By the end of this section you will be familiar with:
The purpose of the Repository Server and Agent
The Repository Server Administration Console GUI interface
The Repository Manager GUI interface
Repository maintenance operations
Security and privileges
Object sharing, searching and locking
Metadata Extensions
8
Repository Server
Each Repository has an independent architecture for the management of the physical Repository tables
Components: one Repository Server, and a Repository Agent for each Repository
Client overhead for Repository management is greatly reduced by the Repository Server
[Diagram: the Repository Server and its Repository Agent sit between the Repository database and the clients (Repository Manager, Repository Server Administration Console) and the PowerCenter Server.]
9
Repository Server Features
Manages connections to the Repository from client applications
Can manage multiple Repositories on different machines on a network
Uses one Repository Agent process to insert, update and fetch objects from the Repository database tables, for each Repository it manages
Maintains object consistency by controlling object locking
The Repository Server runs on the same system running the Repository Agent
10
Repository Server Administration Console
Use the Repository Server Administration Console to administer Repository Servers and Repositories through the Repository Server. The following tasks can be performed:
• Add, edit and remove Repository Configurations
• Export and import Repository Configurations
• Create a Repository
• Promote a local Repository to a Global Repository
• Copy a Repository
• Delete a Repository from the database
• Back up and restore a Repository
• Start, stop, enable and disable a Repository
• View Repository connections and locks
• Close Repository connections
• Upgrade a Repository
11
Repository Server Administration Console
Information Nodes
Console Tree
HTML View
Hypertext Links to Repository Maintenance Tasks
12
Repository Management
Perform all Repository maintenance tasks through Repository Server from the Repository Server Administration Console
Create the Repository Configuration
Select Repository Configuration and perform maintenance tasks:
• Create
• Delete
• Backup
• Copy from
• Disable
• Export Connection
• Make Global
• Notify Users
• Propagate
• Register
• Restore
• Un-Register
• Upgrade
13
Repository Manager
Use the Repository Manager to navigate through multiple folders and Repositories, and to perform the following tasks:
• Manage the Repository: launch the Repository Server Administration Console for this purpose
• Implement Repository security: manage users and user groups
• Perform folder functions: create, edit, copy and delete folders
• View metadata: analyze Source, Target, Mapping and Shortcut dependencies
14
Repository Manager Interface
Navigator Window
Main Window
Dependency Window
Output Window
15
Users, Groups and Repository Privileges
Steps:
Create groups
Create users
Assign users to groups
Assign privileges to groups
Assign additional privileges to users (optional)
16
Managing Privileges
Check box assignment of privileges
17
Folder Permissions
Assign one user as the folder owner for first tier permissions
Select one of the owner’s groups for second tier permissions
All users and groups in the Repository will be assigned the third tier permissions
18
Object Locking
Object Locks preserve Repository integrity
Use the Edit menu for Viewing Locks and Unlocking Objects
19
Object Searching (Menu: Analyze | Search)

• Keyword search: limited to keywords previously defined in the Repository (via Warehouse Designer)
• Search all: filter and search objects
20
Object Sharing
• Reuse existing objects
• Enforces consistency
• Decreases development time
• Share objects by using copies and shortcuts

Required security settings for sharing objects:
• Repository Privilege: Use Designer
• Originating Folder Permission: Read
• Destination Folder Permissions: Read/Write

COPY                                       SHORTCUT
Copy object to another folder              Link to an object in another folder
Changes to original object not captured    Dynamically reflects changes to original object
Duplicates space                           Preserves space
Copy from shared or unshared folder        Created from a shared folder
21
Adding Metadata Extensions
Allows developers and partners to extend the metadata stored in the Repository
Accommodates the following metadata types:• Vendor-defined - Third-party application vendor-created
metadata lists • For example, Applications such as Ariba or PowerConnect for
Siebel can add information such as contacts, version, etc.
• User-defined - PowerCenter/PowerMart users can define and create their own metadata
Must have Administrator Repository or Super User Repository privileges
22
Sample Metadata Extensions
Sample User Defined Metadata, e.g. - contact information, business user
Reusable Metadata Extensions can also be created in the Repository Manager
23
Design Process
1. Create Source definition(s)
2. Create Target definition(s)
3. Create a Mapping
4. Create a Session Task
5. Create a Workflow from Task components
6. Run the Workflow
7. Monitor the Workflow and verify the results
24
Source Object Definitions
By the end of this section you will:
Be familiar with the Designer GUI interface
Be familiar with Source Types
Be able to create Source Definitions
Understand Source Definition properties
Be able to use the Data Preview option
25
Source Analyzer
Analyzer Window
Navigation Window
Designer Tools
26
Methods of Analyzing Sources

The Source Analyzer stores source definitions (relational, flat file, COBOL file, XML file) in the Repository:
• Import from Database
• Import from File
• Import from COBOL File
• Import from XML file
• Create manually
27
Analyzing Relational Sources
[Diagram: a relational source (table, view or synonym) is imported into the Source Analyzer via ODBC; the source definition is stored in the Repository through the Repository Server and Repository Agent.]
28
Analyzing Relational Sources
Editing Source Definition Properties
29
Analyzing Flat File Sources
[Diagram: a fixed-width or delimited flat file is read from a mapped drive, NFS mount or local directory into the Source Analyzer; the definition is stored in the Repository through the Repository Server and Repository Agent.]
30
Flat File Wizard
Three-step wizard
Columns can be renamed within wizard
Text, Numeric and Datetime datatypes are supported
Wizard ‘guesses’ datatype
31
XML Source Analysis
In addition to the DTD file, an XML Schema or XML file can be used as a Source Definition.

[Diagram: a .DTD file and XML data are read from a mapped drive, NFS mount or local directory into the Source Analyzer; the definition is stored in the Repository through the Repository Server and Repository Agent.]
32
Analyzing VSAM Sources
Supported numeric storage options: COMP, COMP-3, COMP-6

[Diagram: a .CBL copybook file and VSAM data are read from a mapped drive, NFS mount or local directory into the Source Analyzer; the definition is stored in the Repository through the Repository Server and Repository Agent.]
33
VSAM Source Properties
34
Target Object Definitions
By the end of this section you will:
Be familiar with Target Definition types
Know the supported methods of creating Target Definitions
Understand individual Target Definition properties
35
Creating Target Definitions
Methods of creating Target Definitions
Import from Database
Import from an XML file
Manual Creation
Automatic Creation
36
Automatic Target Creation
Drag-and-drop a Source Definition into the Warehouse Designer Workspace
37
Import Definition from Database
Can “reverse engineer” existing object definitions from a database system catalog or data dictionary.

[Diagram: a table, view or synonym is imported into the Warehouse Designer via ODBC; the target definition is stored in the Repository through the Repository Server and Repository Agent.]
38
Manual Target Creation

1. Create empty definition
2. Add desired columns (ALT-F can also be used to create a new column)
3. Finished target definition
39
Target Definition Properties
40
Target Definition Properties
41
Creating Physical Tables
LOGICAL: Repository target table definitions
PHYSICAL: Target database tables

Execute SQL via the Designer to create the physical tables from the logical definitions.
42
Creating Physical Tables

Create tables that do not already exist in the target database:
• Connect: connect to the target database
• Generate SQL file: create DDL in a script file
• Edit SQL file: modify the DDL script as needed
• Execute SQL file: create the physical tables in the target database

Use Preview Data to verify the results (right-mouse-click on the object).
43
Transformation Concepts
By the end of this section you will be familiar with:
Transformation types and views
Transformation calculation error treatment
Null data treatment
Informatica data types
Expression transformation
Expression Editor
Informatica Functions
Expression validation
44
Transformation Types
Informatica PowerCenter 7 provides 23 objects for data transformation:
• Aggregator: performs aggregate calculations
• Application Source Qualifier: reads Application object sources such as ERP
• Custom: calls a procedure in a shared library or DLL
• Expression: performs row-level calculations
• External Procedure (TX): calls compiled code for each row
• Filter: drops rows conditionally
• Joiner: joins heterogeneous sources
• Lookup: looks up values and passes them to other objects
• Normalizer: reorganizes records from VSAM, Relational and Flat File sources
• Rank: limits records to the top or bottom of a range
• Input: defines mapplet input rows; available in the Mapplet Designer
• Output: defines mapplet output rows; available in the Mapplet Designer
45
Transformation Types
• Router: splits rows conditionally
• Sequence Generator: generates unique ID values
• Sorter: sorts data
• Source Qualifier: reads data from Flat File and Relational Sources
• Stored Procedure: calls a database stored procedure
• Transaction Control: defines Commit and Rollback transactions
• Union: merges data from different databases
• Update Strategy: tags rows for insert, update, delete, reject
• XML Generator: reads data from one or more input ports and outputs XML through a single output port
• XML Parser: reads XML from one or more input ports and outputs data through a single output port
• XML Source Qualifier: reads XML data
46
Transformation Views
A transformation has three views:
Iconized - shows the transformation in relation to the rest of the mapping
Normal - shows the flow of data through the transformation
Edit - shows transformation ports and properties; allows editing
47
Edit Mode
Allows users with folder “write” permissions to change or create transformation ports and properties
Switch between transformations
Define transformation level properties
Define port level handling
Enter comments
Make reusable
48
Expression Transformation
Passive Transformation, Connected

Ports:
• Mixed
• Variables allowed

Create the expression in an output or variable port

Usage:
• Perform the majority of data manipulation
• Perform calculations using non-aggregate functions (row level)

[Screenshot callout: click to invoke the Expression Editor]
49
Expression Editor
An expression formula is a calculation or conditional statement Used in Expression, Aggregator, Rank, Filter, Router, Update Strategy Performs calculation based on ports, functions, operators, variables,
literals, constants and return values from other transformations
50
Informatica Functions - Samples

Character Functions

Used to manipulate character data: ASCII, CHR, CHRCODE, CONCAT, INITCAP, INSTR, LENGTH, LOWER, LPAD, LTRIM, RPAD, RTRIM, SUBSTR, UPPER, REPLACESTR, REPLACECHR
• CHRCODE returns the numeric value (ASCII or Unicode) of the first character of the string passed to this function
• CONCAT is for backwards compatibility only; use || instead
51
Informatica Functions

Conversion Functions

Used to convert datatypes: TO_CHAR (numeric), TO_DATE, TO_DECIMAL, TO_FLOAT, TO_INTEGER, TO_NUMBER

Date Functions

Used to round, truncate, or compare dates; extract one part of a date; or perform arithmetic on a date: ADD_TO_DATE, DATE_COMPARE, DATE_DIFF, GET_DATE_PART, LAST_DAY, ROUND (date), SET_DATE_PART, TO_CHAR (date), TRUNC (date)

To pass a string to a date function, first use the TO_DATE function to convert it to a date/time datatype.
52
Informatica Functions

Numerical Functions

Used to perform mathematical operations on numeric data: ABS, CEIL, CUME, EXP, FLOOR, LN, LOG, MOD, MOVINGAVG, MOVINGSUM, POWER, ROUND, SIGN, SQRT, TRUNC

Scientific Functions

Used to calculate geometric values of numeric data: COS, COSH, SIN, SINH, TAN, TANH
53
Informatica Functions

Test Functions

Used to test if a lookup result is null and to validate data: ISNULL, IS_DATE, IS_NUMBER, IS_SPACES

Special Functions

Used to handle specific conditions within a session, search for certain values, and test conditional statements: ERROR, ABORT, DECODE, IIF (format: IIF(Condition, True, False))

Encoding Functions

Used to encode string values: SOUNDEX, METAPHONE
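The conditional functions IIF and DECODE can be sketched in Python to make their semantics concrete. These are illustrative stand-ins, not the Informatica implementations; the state-code data is hypothetical.

```python
# Python sketches of IIF and DECODE semantics.

def iif(condition, true_value, false_value):
    # IIF(Condition, True, False): returns one of two values
    return true_value if condition else false_value

def decode(value, *pairs_and_default):
    # DECODE(value, search1, result1, search2, result2, ..., default):
    # returns the result for the first matching search value
    args = list(pairs_and_default)
    default = args.pop() if len(args) % 2 == 1 else None
    for search, result in zip(args[0::2], args[1::2]):
        if value == search:
            return result
    return default

print(iif(10 > 5, "BIG", "SMALL"))                                    # BIG
print(decode("CA", "CA", "California", "NY", "New York", "Unknown"))  # California
print(decode("TX", "CA", "California", "NY", "New York", "Unknown"))  # Unknown
```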
54
Expression Validation
The Validate or ‘OK’ button in the Expression Editor will:
• Parse the current expression, including remote port searching (resolves references to ports in other transformations)
• Parse transformation attributes, e.g. filter condition, lookup condition, SQL query
• Parse default values
• Check spelling, the correct number of arguments in functions, and other syntactical errors
55
Variable Ports
• Use to simplify complex expressions, e.g. create and store a depreciation formula to be referenced more than once
• Use in another variable port or an output port expression
• Local to the transformation (a variable port cannot also be an input or output port)
• Available in the Expression, Aggregator and Rank transformations
56
Informatica Data Types
Transformation datatypes allow mix and match of source and target database types
When connecting ports, native and transformation datatypes must be compatible (or must be explicitly converted)
NATIVE DATATYPES: specific to the source and target database types; displayed in source and target tables within the Mapping Designer

TRANSFORMATION DATATYPES: PowerMart / PowerCenter internal datatypes based on ANSI SQL-92; displayed in transformations within the Mapping Designer

Data passes from native to transformation datatypes and back to native as it moves from source to target.
57
Datatype Conversions
• All numeric data can be converted to all other numeric datatypes, e.g. integer, double, and decimal
• All numeric data can be converted to string, and vice versa
• Date can be converted only to date and string, and vice versa
• Raw (binary) can only be linked to raw
• Other conversions not listed above are not supported
• These conversions are implicit; no function is necessary
From \ To   Integer  Decimal  Double  Char  Date  Raw
Integer        X        X        X      X
Decimal        X        X        X      X
Double         X        X        X      X
Char           X        X        X      X     X
Date                                    X     X
Raw                                                X
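The conversion rules above can be encoded as a simple lookup table. This is an illustrative Python sketch of the matrix as stated on this slide, not part of the product.

```python
# Implicit datatype conversion matrix from the slide, as a lookup table.
IMPLICIT_CONVERSIONS = {
    "Integer": {"Integer", "Decimal", "Double", "Char"},
    "Decimal": {"Integer", "Decimal", "Double", "Char"},
    "Double":  {"Integer", "Decimal", "Double", "Char"},
    "Char":    {"Integer", "Decimal", "Double", "Char", "Date"},
    "Date":    {"Char", "Date"},
    "Raw":     {"Raw"},
}

def can_convert(source, target):
    # True if the implicit conversion is supported
    return target in IMPLICIT_CONVERSIONS.get(source, set())

print(can_convert("Date", "Char"))     # True
print(can_convert("Integer", "Date"))  # False
```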
58
Mappings
By the end of this section you will be familiar with:
Mapping components
Source Qualifier transformation
Mapping validation
Data flow rules
System Variables
Mapping Parameters and Variables
59
Mapping Designer
Iconized Mapping
Mapping List
Transformation Toolbar
60
Pre-SQL and Post-SQL Rules
Can use any command that is valid for the database type; no nested comments
Can use Mapping Parameters and Variables in SQL executed against the source
Use a semi-colon (;) to separate multiple statements
Informatica Server ignores semi-colons within single quotes, double quotes or within /* ...*/
To use a semi-colon outside of quotes or comments, ‘escape’ it with a back slash (\)
Workflow Manager does not validate the SQL
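The semicolon rules above can be made concrete with a small parser sketch. This is a hedged illustration of the stated rules (split on `;` except inside quotes or `/* ... */` comments, with `\;` as an escape), not the Informatica Server's actual parser.

```python
# Sketch of the statement-splitting rules for pre-/post-SQL.

def split_statements(sql):
    statements, current = [], []
    in_single = in_double = in_comment = False
    i, n = 0, len(sql)
    while i < n:
        ch = sql[i]
        if in_comment:
            current.append(ch)
            if ch == "*" and i + 1 < n and sql[i + 1] == "/":
                current.append("/")
                in_comment = False
                i += 1
        elif in_single:
            current.append(ch)
            if ch == "'":
                in_single = False
        elif in_double:
            current.append(ch)
            if ch == '"':
                in_double = False
        elif ch == "\\" and i + 1 < n and sql[i + 1] == ";":
            current.append(";")   # escaped semicolon stays literal
            i += 1
        elif ch == ";":
            statements.append("".join(current).strip())  # statement boundary
            current = []
        else:
            current.append(ch)
            if ch == "'":
                in_single = True
            elif ch == '"':
                in_double = True
            elif ch == "/" and i + 1 < n and sql[i + 1] == "*":
                in_comment = True
        i += 1
    if "".join(current).strip():
        statements.append("".join(current).strip())
    return statements

print(split_statements("UPDATE t SET c = 'a;b'; DELETE FROM t"))
# ["UPDATE t SET c = 'a;b'", 'DELETE FROM t']
```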
61
Data Flow Rules
Each Source Qualifier starts a single data stream (a dataflow)
Transformations can send rows to more than one transformation (split one data flow into multiple pipelines)
Two or more data flows can meet together -- if (and only if) they originate from a common active transformation
Cannot add an active transformation into the mix
The example also holds true with a Normalizer in lieu of the Source Qualifier. Exceptions are the Mapplet Input and Joiner transformations.

[Diagram: merging two flows that pass through an Active transformation is DISALLOWED; merging flows that pass only through Passive transformations is ALLOWED.]
62
Connection Validation
Examples of invalid connections in a Mapping:
• Connecting ports with incompatible datatypes
• Connecting output ports to a Source
• Connecting a Source to anything but a Source Qualifier or Normalizer transformation
• Connecting an output port to an output port, or an input port to another input port
• Connecting more than one active transformation to another transformation (invalid dataflow)
63
Mapping Validation
Mappings must:
• Be valid for a Session to run
• Be end-to-end complete and contain valid expressions
• Pass all data flow rules
Mappings are always validated when saved; can be validated without being saved
Output Window will always display reason for invalidity
64
Workflows
By the end of this section, you will be familiar with:
The Workflow Manager GUI interface
Workflow Schedules
Setting up Server Connections
Relational, FTP and External Loader
Creating and configuring Workflows
Workflow properties
Workflow components
Workflow Tasks
65
Workflow Manager Interface
Task Tool Bar
Output Window
Navigator Window
Workspace
Status Bar
Workflow Designer
Tools
66
Workflow Manager Tools
Workflow Designer
• Maps the execution order and dependencies of Sessions, Tasks and Worklets for the Informatica Server

Task Developer
• Create Session, Shell Command and Email tasks
• Tasks created in the Task Developer are reusable

Worklet Designer
• Creates objects that represent a set of tasks
• Worklet objects are reusable
67
Workflow Structure
A Workflow is a set of instructions for the Informatica Server to perform data transformation and load

Combines the logic of Session Tasks, other types of Tasks, and Worklets

The simplest Workflow is composed of a Start Task, a Link and one other Task

[Diagram: Start Task, connected by a Link to a Session Task]
68
Workflow Scheduler Objects
Set up reusable schedules to associate with multiple Workflows. Used in Workflows and Session Tasks.
69
Server Connections
Configure Server data access connections (used in Session Tasks)
Configure:
1. Relational
2. MQ Series
3. FTP
4. Custom
5. External Loader
70
Relational Connections (Native)

• Create a relational (database) connection
• Instructions to the Server to locate relational tables
• Used in Session Tasks
71
Relational Connection Properties
Define native relational (database) connection
Optional Environment SQL (executed with each use of database connection)
User Name/Password
Database connectivity information
Rollback Segment assignment (optional)
72
FTP Connection
• Create an FTP connection
• Instructions to the Server to FTP flat files
• Used in Session Tasks
73
External Loader Connection
• Create an External Loader connection
• Instructions to the Server to invoke database bulk loaders
• Used in Session Tasks
74
Task Developer
Create basic reusable “building blocks” to use in any Workflow

Reusable Tasks:
• Session: a set of instructions to execute Mapping logic
• Command: specify OS shell / script command(s) to run during the Workflow
• Email: send email at any point in the Workflow
75
Session Task
Server instructions to run the logic of ONE specific Mapping, e.g. source and target data location specifications, memory allocation, optional Mapping overrides, and scheduling, processing and load instructions
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Session Task is reusable (optional)
76
Command Task
Specify one (or more) Unix shell or DOS (NT, Win2000) commands to run at a specific point in the Workflow
Becomes a component of a Workflow (or Worklet)
If configured in the Task Developer, the Command Task is reusable (optional)
Commands can also be referenced in a Session through the Session “Components” tab as Pre- or Post-Session commands
77
Command Task
78
Additional Workflow Components
Two additional components are Worklets and Links
Worklets are objects that contain a series of Tasks
Links are required to connect objects in a Workflow
79
Developing Workflows
Create a new Workflow in the Workflow Designer
Customize Workflow name
Select a Server
80
Workflow Properties
Customize Workflow Properties
Workflow log displays
Select a Workflow Schedule (optional)
May be reusable or non-reusable
81
Workflow Properties

Define Workflow Variables that can be used in later Task objects (example: Decision Task)

Create a User-defined Event, which can later be used with the Raise Event Task
82
Building Workflow Components
• Add Sessions and other Tasks to the Workflow
• Connect all Workflow components with Links
• Save the Workflow
• Start the Workflow

Sessions in a Workflow can be independently executed
83
Workflow Designer - Links
Required to connect Workflow Tasks
Can be used to create branches in a Workflow
All links are executed -- unless a link condition is used which makes a link false
84
Session Tasks
After this section, you will be familiar with:
How to create and configure Session Tasks
Session Task properties
Transformation property overrides
Reusable vs. non-reusable Sessions
Session partitions
85
Session Task
Created to execute the logic of a mapping (one mapping only)
Session Tasks can be created in the Task Developer (reusable) or the Workflow Designer (Workflow-specific)
Steps to create a Session Task:
• Select the Session button from the Task Toolbar, or
• Select menu Tasks | Create
Session Task Bar Icon
86
Session Task - General
87
Session Task - Properties
88
Session Task – Config Object
89
Session Task - Sources
90
Session Task - Targets
91
Session Task - Transformations
Allows overrides of some transformation properties
Does not change the properties in the Mapping
92
Session Task - Partitions
93
Monitor Workflows
By the end of this section you will be familiar with:
The Workflow Monitor GUI interface
Monitoring views
Server monitoring modes
Filtering displayed items
Actions initiated from the Workflow Monitor
Truncating Monitor Logs
94
Monitor Workflows
The Workflow Monitor is the tool for monitoring Workflows and Tasks
Review details about a Workflow or Task in two views:
• Gantt Chart view
• Task view
95
Monitoring Workflows
Perform operations in the Workflow Monitor:
• Restart: restart a Task, Workflow or Worklet
• Stop: stop a Task, Workflow or Worklet
• Abort: abort a Task, Workflow or Worklet
• Resume: resume a suspended Workflow after a failed Task is corrected

View Session and Workflow logs

Abort has a 60 second timeout:
• If the Server has not completed processing and committing data during the timeout period, the threads and processes associated with the Session are killed

Stopping a Session Task means the Server stops reading data
96
Monitoring Workflows
[Task View screenshot: columns show Task, Workflow, Worklet, Start Time and Completion Time; the status bar offers Start, Stop, Abort and Resume for Tasks, Workflows and Worklets.]
97
Monitor Window Filtering
• Task View provides filtering
• Monitoring filters can be set using drop-down menus
• Minimizes items displayed in Task View
• Right-click on a Session to retrieve the Session Log (from the Server to the local PC Client)
98
Debugger
By the end of this section you will be familiar with:
Creating a Debug Session
Debugger windows & indicators
Debugger functionality and options
Viewing data with the Debugger
Setting and using Breakpoints
Tips for using the Debugger
99
Debugger Features
The Debugger is a Wizard-driven tool:
• View source / target data
• View transformation data
• Set break points and evaluate expressions
• Initialize variables
• Manually change variable values

The Debugger is:
• Session driven
• Data can be loaded or discarded
• The debug environment can be saved for later use
100
Debugger Interface
Debugger windows & indicators:
• Session Log tab
• Debugger Log tab
• Target Data window
• Transformation Instance Data window
• Flashing yellow SQL indicator
• Debugger Mode indicator
• Solid yellow arrow: current transformation indicator
101
Filter Transformation

Active Transformation, Connected

Ports:
• All input / output

Specify a Filter condition

Usage:
• Filter rows from flat file sources
• Single pass source(s) into multiple targets

Drops rows conditionally
102
Aggregator Transformation
Active Transformation, Connected

Ports:
• Mixed
• Variables allowed
• Group By allowed

Create expressions in output or variable ports

Usage:
• Standard aggregations

Performs aggregate calculations
103
Informatica Functions
Aggregate Functions: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE
• Return summary values for non-null data in selected ports
• Use only in Aggregator transformations
• Use in output ports only
• Calculate a single value (and row) for all records in a group
• Only one aggregate function can be nested within an aggregate function
• Conditional statements can be used with these functions
104
Aggregate Expressions
Conditional aggregate expressions are supported; the conditional SUM format is SUM(value, condition)

Aggregate functions are supported only in the Aggregator transformation
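The conditional SUM(value, condition) idea can be sketched in Python. This is an illustrative stand-in (not Informatica expression syntax); the sales rows are hypothetical.

```python
# Sketch of a conditional aggregate: only rows satisfying the condition
# contribute to the sum, mirroring SUM(value, condition).

def conditional_sum(rows, value_key, condition):
    return sum(row[value_key] for row in rows if condition(row))

sales = [
    {"region": "EAST", "amount": 100},
    {"region": "WEST", "amount": 40},
    {"region": "EAST", "amount": 60},
]

# Equivalent in spirit to SUM(amount, region = 'EAST')
print(conditional_sum(sales, "amount", lambda r: r["region"] == "EAST"))  # 160
```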
105
Aggregator Properties
• Sorted Input property: instructs the Aggregator to expect the data to be sorted
• Set Aggregator cache sizes (on the Informatica Server machine)
106
Sorted Data
The Aggregator can handle sorted or unsorted data:
• Sorted data can be aggregated more efficiently, decreasing total processing time

The Server will cache data from each group and release the cached data upon reaching the first record of the next group

Data must be sorted according to the order of the Aggregator “Group By” ports

Performance gain will depend upon varying factors
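The group-and-release behavior described above can be sketched in Python. This is an illustrative stand-in under the assumption that input is pre-sorted on the group-by key; the department rows are hypothetical.

```python
# Sketch of sorted-input aggregation: because rows arrive grouped, each
# group's total can be released as soon as the first row of the next
# group appears, so only one group is held in memory at a time.

def aggregate_sorted(rows, group_key, value_key):
    results = []
    current_key, total = None, 0
    for row in rows:
        key = row[group_key]
        if key != current_key:
            if current_key is not None:
                results.append((current_key, total))  # release previous group
            current_key, total = key, 0
        total += row[value_key]
    if current_key is not None:
        results.append((current_key, total))          # release final group
    return results

rows = [{"dept": "A", "qty": 1}, {"dept": "A", "qty": 2}, {"dept": "B", "qty": 5}]
print(aggregate_sorted(rows, "dept", "qty"))  # [('A', 3), ('B', 5)]
```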
107
Incremental Aggregation
Triggered in Session Properties, Performance tab

The cache is saved into $PMCacheDir: aggregatorname.DAT, aggregatorname.IDX

Upon the next run, the files are overwritten with new cache information

Example, a month-to-date (MTD) calculation: when triggered, the PowerCenter Server will save new MTD totals. Upon the next run (new totals), the Server will subtract the old totals; the difference will be passed forward

Best Practice is to copy these files in case a rerun of data is ever required. Reinitialize when no longer needed, e.g. at the beginning of new-month processing
108
Joiner Transformation
By the end of this section you will be familiar with:
When to use a Joiner Transformation
Homogeneous Joins
Heterogeneous Joins
Joiner properties
Joiner Conditions
Nested joins
109
Homogeneous Joins
Joins that can be performed with a SQL SELECT statement:
• Source Qualifier contains a SQL join
• Tables are on the same database server (or are synonyms)
• The database server does the join “work”
• Multiple homogeneous tables can be joined
110
Heterogeneous Joins
Joins that cannot be done with a SQL statement:
• An Oracle table and a Sybase table
• Two Informix tables on different database servers
• Two flat files
• A flat file and a database table
111
Joiner Transformation
Active Transformation, Connected

Ports:
• All input or input / output
• “M” denotes a port that comes from the master source

Specify the Join condition

Usage:
• Join two flat files
• Join two tables from different databases
• Join a flat file with a relational table

Performs heterogeneous joins on records from different databases or flat file sources
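A heterogeneous join can be sketched in Python: master rows (say, from a flat file) are cached by the join key, then detail rows (say, from a database) are matched against the cache. This is an illustrative sketch of the idea, not the Joiner's actual algorithm; names and data are hypothetical.

```python
# Sketch of a heterogeneous "Normal" (inner) join on one key.

def join(master_rows, detail_rows, key):
    cache = {row[key]: row for row in master_rows}  # cache the master source
    joined = []
    for detail in detail_rows:
        master = cache.get(detail[key])
        if master is not None:                      # inner join: keep matches only
            joined.append({**master, **detail})
    return joined

masters = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]
details = [{"id": 1, "amount": 500}, {"id": 3, "amount": 75}]
print(join(masters, details, "id"))  # [{'id': 1, 'name': 'Acme', 'amount': 500}]
```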
112
Joiner Conditions
Multiple join conditions are supported
113
Joiner Properties
Join types:
• “Normal” (inner)
• Master outer
• Detail outer
• Full outer

The Joiner can accept sorted data (configure the join condition to use the sort origin ports)

Set the Joiner cache
114
Mid-Mapping Join
The Joiner does not accept input in the following situations:
• Both input pipelines begin with the same Source Qualifier
• Both input pipelines begin with the same Normalizer
• Both input pipelines begin with the same Joiner
• Either input pipeline contains an Update Strategy
115
Sorter Transformation
Can sort data from relational tables or flat files
Sort takes place on the Informatica Server machine
Multiple sort keys are supported
The Sorter transformation is often more efficient than a sort performed on a database with an ORDER BY clause
116
Lookup Transformation
By the end of this section you will be familiar with:
Lookup principles
Lookup properties
Lookup conditions
Lookup techniques
Caching considerations
117
How a Lookup Transformation Works
[Diagram: lookup value(s) flow into the Lookup transformation; return value(s) flow back.]

For each Mapping row, one or more port values are looked up in a database table

If a match is found, one or more table values are returned to the Mapping. If no match is found, NULL is returned
118
Lookup Transformation
Looks up values in a database table and provides data to other components in a Mapping

Passive Transformation, Connected / Unconnected

Ports:
• Mixed
• “L” denotes a Lookup port
• “R” denotes a port used as a return value (unconnected Lookup only)

Specify the Lookup Condition

Usage:
• Get related values
• Verify if records exist or if data has changed
119
Lookup Properties
• Override Lookup SQL option
• Native Database Connection Object name
• Toggle caching
120
Additional Lookup Properties
• Set cache directory
• Set Lookup cache sizes
• Make cache persistent
121
Lookup Conditions
Multiple conditions are supported
122
To Cache or not to Cache?
Caching can significantly impact performance

Cached:
• Lookup table data is cached locally on the Server
• Mapping rows are looked up against the cache
• Only one SQL SELECT is needed

Uncached:
• Each Mapping row needs one SQL SELECT

Rule of thumb: cache if the number (and size) of records in the Lookup table is small relative to the number of mapping rows requiring lookup
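The cached vs. uncached trade-off can be sketched in Python. The "database" here is a hypothetical in-memory stand-in that counts SELECTs; this is an illustration of the principle, not PowerCenter's lookup implementation.

```python
# Sketch of cached vs. uncached lookup: cached issues one SELECT total,
# uncached issues one SELECT per mapping row.

class FakeTable:
    def __init__(self, data):
        self.data = data
        self.select_count = 0
    def select_all(self):
        self.select_count += 1
        return dict(self.data)
    def select_one(self, key):
        self.select_count += 1
        return self.data.get(key)

def lookup_cached(table, keys):
    cache = table.select_all()                  # one SQL SELECT total
    return [cache.get(k) for k in keys]

def lookup_uncached(table, keys):
    return [table.select_one(k) for k in keys]  # one SELECT per row

t1 = FakeTable({1: "A", 2: "B"})
print(lookup_cached(t1, [1, 2, 1, 3]), t1.select_count)    # ['A', 'B', 'A', None] 1

t2 = FakeTable({1: "A", 2: "B"})
print(lookup_uncached(t2, [1, 2, 1, 3]), t2.select_count)  # ['A', 'B', 'A', None] 4
```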
123
Target Options
By the end of this section you will be familiar with:
Row type indicators
Row operations at load time
Constraint-based loading considerations
Rejected row handling options
124
Target Properties
Set in the Session Task, Properties tab:
• Select target instance
• Row loading operations
• Error handling
125
Constraint-based Loading
Maintains referential integrity in the Targets

[Diagram: two examples of three target tables related by keys (pk1; fk1, pk2; fk2).]

Example 1: with only one Active source, rows for Targets 1-3 will be loaded properly and maintain referential integrity

Example 2: with two Active sources, it is not possible to control whether rows for Target 3 will be loaded before or after those for Target 2

The following transformations are ‘Active sources’: Advanced External Procedure, Source Qualifier, Normalizer, Aggregator, Sorter, Joiner, Rank, Mapplet (containing any of the previous transformations)
126
Update Strategy Transformation
By the end of this section you will be familiar with:
Update Strategy functionality
Update Strategy expressions
Refresh strategies
Smart aggregation
127
Update Strategy Transformation
Used to specify how each individual row will be used to update target tables (insert, update, delete, reject)

Active Transformation, Connected

Ports:
• All input / output

Specify the Update Strategy Expression

Usage:
• Updating Slowly Changing Dimensions
• IIF or DECODE logic determines how to handle the record
128
Target Refresh Strategies
Single snapshot: Target truncated, new records inserted
Sequential snapshot: new records inserted
Incremental: Only new records are inserted. Records already present in the target are ignored
Incremental with Update: Only new records are inserted. Records already present in the target are updated
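The "incremental with update" strategy above can be sketched in Python. This is an illustrative stand-in (a dict keyed by primary key represents the target table); the records are hypothetical.

```python
# Sketch of "incremental with update": new records are inserted,
# records already present in the target are updated.

def incremental_with_update(target, incoming, key):
    for row in incoming:
        target[row[key]] = row   # insert if new, update if already present
    return target

target = {1: {"id": 1, "status": "old"}}
incoming = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
incremental_with_update(target, incoming, "id")
print(target)  # {1: {'id': 1, 'status': 'new'}, 2: {'id': 2, 'status': 'new'}}
```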
129
Router Transformation
Rows are sent to multiple filter conditions

Active Transformation, Connected

Ports:
• All input/output
• Specify filter conditions for each Group

Usage:
• Link source data in one pass to multiple filter conditions
130
Router Transformation in a Mapping
131
Parameters and Variables
By the end of this section you will understand:
System Variables
Creating Parameters and Variables
Features and advantages
Establishing values for Parameters and Variables
132
System Variables
$$$SessStartTime: Returns the system date value as a string; uses the system clock on the machine hosting the Informatica Server
• Format of the string is database-type dependent
• Used in SQL overrides
• Has a constant value

SESSSTARTTIME: Returns the system date value on the Informatica Server
• Used with any function that accepts transformation date/time datatypes
• Not to be used in a SQL override
• Has a constant value

SYSDATE: Provides the current datetime on the Informatica Server machine
• Not a static value
133
Mapping Parameters and Variables
Apply to all transformations within one Mapping
Represent declared values
Variables can change in value during run-time
Parameters remain constant during run-time
Provide increased development flexibility
Defined in Mapping menu
Format is $$VariableName or $$ParameterName
134
Mapping Parameters and Variables
Sample declarations
Declare Variables and Parameters in the Designer Mappings menu
Set the appropriateaggregation type
Set optional Initial Value
User-defined names
135
Functions to Set Mapping Variables
SetCountVariable -- Counts the number of evaluated rows and increments or decrements a mapping variable for each row
SetMaxVariable -- Evaluates the value of a mapping variable to the higher of two values
SetMinVariable -- Evaluates the value of a mapping variable to the lower of two values
SetVariable -- Sets the value of a mapping variable to a specified value
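The aggregation semantics of these functions can be sketched in Python. The SetMaxVariable analogue below keeps the higher of the current variable value and each row's value; the variable name and dates are hypothetical.

```python
def set_max_variable(current, value):
    """SetMaxVariable analogue: evaluate to the higher of the two values."""
    return value if current is None or value > current else current

# Simulate evaluating a $$MaxOrderDate mapping variable across a session's rows.
var = None
for order_date in ["2003-01-10", "2003-03-02", "2003-02-15"]:
    var = set_max_variable(var, order_date)

print(var)  # '2003-03-02' -- the final value saved back to the repository
```

At the end of a successful session run, PowerCenter saves the final variable value to the repository for use in the next run.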
136
Unconnected Lookup
Will be physically “unconnected” from other transformations
• There can be NO data flow arrows leading to or from an unconnected Lookup
Function in the Aggregator calls the unconnected Lookup
Lookup function can be set within any transformation that supports expressions
Lookup data is called from the point in the Mapping that needs it
137
Conditional Lookup Technique
Two requirements:
• Must be an Unconnected (or “function mode”) Lookup
• Lookup function used within a conditional statement
Conditional statement is evaluated for each row
Lookup function is called only under the pre-defined condition
IIF ( ISNULL(customer_id),:lkp.MYLOOKUP(order_no))
Condition
Lookup function
Row keys (passed to Lookup)
138
Conditional Lookup Advantage
Data lookup is performed only for those rows that require it, so substantial performance gains are possible
EXAMPLE: A Mapping will process 500,000 rows. For two percent of those rows (10,000) the item_id value is NULL. Item_ID can be derived from the SKU_NUMB.
IIF ( ISNULL(item_id), :lkp.MYLOOKUP (sku_numb))
Condition (true for 2 percent of all rows)
Lookup (called only when condition is true)
Net savings = 490,000 lookups
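The savings can be demonstrated with a Python sketch of the same scenario: 500,000 rows, item_id NULL for 2% of them, and a lookup stand-in that counts its own invocations. The data and lookup function here are hypothetical.

```python
lookup_calls = 0

def my_lookup(sku_numb):
    """Stand-in for :lkp.MYLOOKUP -- counts how often it is actually invoked."""
    global lookup_calls
    lookup_calls += 1
    return f"item-for-{sku_numb}"

# 500,000 rows; every 50th row (2%) has a NULL item_id.
rows = [{"item_id": None if i % 50 == 0 else i, "sku_numb": i}
        for i in range(500_000)]

for row in rows:
    # IIF(ISNULL(item_id), :lkp.MYLOOKUP(sku_numb)) -- call only when needed
    if row["item_id"] is None:
        row["item_id"] = my_lookup(row["sku_numb"])

print(lookup_calls)  # 10000 calls instead of 500,000
```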
139
Connected vs. Unconnected Lookups
CONNECTED LOOKUP UNCONNECTED LOOKUP
Part of the mapping data flow Separate from the mapping data flow
Returns multiple values (by linking output ports to another transformation)
Returns one value (by checking the Return (R) port option for the output port that provides the return value)
Executed for every record passing through the transformation
Only executed when the lookup function is called
More visible, shows where the lookup values are used
Less visible, as the lookup is called from an expression within another transformation
Default values are used Default values are ignored
140
Heterogeneous Targets
By the end of this section you will be familiar with:
Heterogeneous target types
Heterogeneous target limitations
Target conversions
141
Definition: Heterogeneous Targets
Supported target definition types:
Relational database
Flat file
XML
ERP (SAP BW, PeopleSoft, etc.)
A heterogeneous target is where the target types are different or the target database connections are different within a single Session Task
142
Step One: Identify Different Target Types
Oracle table
Flat file
Oracle table
Tables are EITHER in two different databases, or require different (schema-specific) connect strings
One target is a flat file load
143
Step Two: Different Database Connections
The two database connections WILL differ
Flatfile requires separate location information
144
Target Type Override (Conversion)
Example: Mapping has SQL Server target definitions. Session Task can be set to load Oracle tables instead, using an Oracle database connection.
Only the following overrides are supported:
Relational target to flat file target
Relational target to any other relational database type
SAP BW target to a flat file target
CAUTION: If the target definition datatypes are not compatible with the datatypes in the newly selected database type, modify the target definition
145
Mapplet Designer
Mapplet Transformation
Icons
Mapplet Output Transformation
Mapplet Designer Tool
146
Mapplet Advantages
Useful for repetitive tasks / logic
Represents a set of transformations
Mapplets are reusable
Use an ‘instance’ of a Mapplet in a Mapping
Changes to a Mapplet are inherited by all instances
Server expands the Mapplet at runtime
147
Active and Passive Mapplets
Passive Mapplets contain only passive transformations
Active Mapplets contain one or more active transformations
CAUTION: changing a passive Mapplet into an active Mapplet may invalidate Mappings which use that Mapplet
• Do an impact analysis in Repository Manager first
148
Using Active and Passive Mapplets
Multiple Passive Mapplets can populate the same target instance
Multiple Active Mapplets or Active and Passive Mapplets cannot populate the same target instance
Active
Passive
149
Reusable Transformations
By the end of this section you will be familiar with:
Reusable transformation advantages
Reusable transformation rules
Promoting transformations to reusable
Copying reusable transformations
150
Reusable Transformations
Define once - reuse many times
Reusable Transformations
• Can be a copy or a shortcut
• Edit ports only in the Transformation Developer
• Can edit properties in the Mapping
• Instances dynamically inherit changes
• Be careful: it is possible to invalidate Mappings by changing reusable transformations
Transformations that cannot be made reusable
• Source Qualifier
• ERP Source Qualifier
• Normalizer used to read a COBOL data source
151
Promoting a Transformation to Reusable
Place a check in the “Make reusable” box
This action is not reversible
152
Sequence Generator Transformation
Generates unique keys for any port on a row
Passive Transformation / Connected
Ports
• Two predefined output ports, NEXTVAL and CURRVAL
• No input ports allowed
Usage
• Generate sequence numbers
• Shareable across mappings
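The NEXTVAL/CURRVAL behavior can be sketched in Python. This is an illustrative model, not the server implementation; the start value and increment are assumptions.

```python
class SequenceGenerator:
    """Sketch of Sequence Generator semantics (start value and increment assumed)."""

    def __init__(self, start=1, increment=1):
        self._next = start
        self.increment = increment

    def nextval(self):
        # Each call hands out the next key in the sequence.
        value = self._next
        self._next += self.increment
        return value

    @property
    def currval(self):
        # Modeled as NEXTVAL plus the increment, per the documented behavior.
        return self._next

seq = SequenceGenerator()
keys = [seq.nextval() for _ in range(3)]
print(keys)         # [1, 2, 3]
print(seq.currval)  # 4
```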
153
Sequence Generator Properties
Number of Cached Values
154
Dynamic Lookup
By the end of this section you will be familiar with:
Dynamic lookup theory
Dynamic lookup advantages
Dynamic lookup rules
155
Additional Lookup Cache Options
Dynamic Lookup Cache
• Allows a row to know about the handling of a previous row
Cache File Name Prefix
• Reuse cache by name for another similar business purpose
Recache from Database
• Overrides other settings and Lookup data is refreshed
Make cache persistent
156
Persistent Caches
By default, Lookup caches are not persistent
When Session completes, cache is erased
Cache can be made persistent with the Lookup properties
When Session completes, the persistent cache is stored on server hard disk files
The next time Session runs, cached data is loaded fully or partially into RAM and reused
Can improve performance, but “stale” data may pose a problem
157
Dynamic Lookup Cache Advantages
When the target table is also the Lookup table, cache is changed dynamically as the target load rows are processed in the mapping
New rows to be inserted into the target or for update to the target will affect the dynamic Lookup cache as they are processed
Subsequent rows will know the handling of previous rows
Dynamic Lookup cache and target load rows remain synchronized throughout the Session run
158
Update Dynamic Lookup Cache
NewLookupRow port values
• 0 – static lookup, cache is not changed
• 1 – insert row into Lookup cache
• 2 – update row in Lookup cache
Does NOT change row type
Use the Update Strategy transformation before or after Lookup, to flag rows for insert or update to the target
Ignore NULL Property
• Per port
• Ignore NULL values from the input row and update the cache using only non-NULL values from input
159
Example: Dynamic Lookup Configuration
Router Group Filter Condition should be:
NewLookupRow = 1
This allows isolation of insert rows from update rows
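The NewLookupRow semantics can be sketched in Python: the cache reports whether each incoming row caused an insert (1), an update (2), or no change (0), so later rows "know" how earlier ones were handled. The keys and values here are hypothetical.

```python
class DynamicLookupCache:
    """Sketch of dynamic Lookup cache / NewLookupRow semantics."""

    def __init__(self):
        self.cache = {}

    def process(self, key, value):
        if key not in self.cache:
            self.cache[key] = value
            return 1            # NewLookupRow = 1: row inserted into cache
        if self.cache[key] != value:
            self.cache[key] = value
            return 2            # NewLookupRow = 2: row updated in cache
        return 0                # NewLookupRow = 0: cache not changed

cache = DynamicLookupCache()
flags = [cache.process(k, v) for k, v in [(10, "a"), (10, "a"), (10, "b")]]
print(flags)  # [1, 0, 2]
```

A Router group filtering on `NewLookupRow = 1` would then carry only the insert rows, exactly as in the configuration above.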
160
Concurrent and Sequential Workflows
By the end of this section you will be familiar with:
Concurrent Workflows
Sequential Workflows
Scheduling Workflows
Stopping, aborting, and suspending Tasks and Workflows
161
Multi-Task Workflows - Sequential
Tasks can be run sequentially:
Tasks shown are all Sessions, but they can also be other Tasks, such as Command, Timer or Email Tasks
162
Multi-Task Workflows - Concurrent
Tasks can be run concurrently:
Tasks shown are all Sessions, but they can also be other Tasks, such as Command, Timer or Email Tasks
163
Multi-Task Workflows - Combined
Tasks can be run in a combination concurrent and sequential pattern within one Workflow:
Tasks shown are all Sessions, but they can also be other Tasks, such as Command, Timer or Email Tasks
164
Additional Transformations
By the end of this section you will be familiar with:
The Rank transformation
The Normalizer transformation
The Stored Procedure transformation
The External Procedure transformation
The Advanced External Procedure transformation
165
Rank Transformation
Active Transformation / Connected
Ports
• Mixed
• One pre-defined output port, RANKINDEX
• Variables allowed
• Group By allowed
Usage
• Select the top/bottom number of records
Filters the top or bottom range of records
166
Normalizer Transformation
Normalizes records from relational or VSAM sources
Active Transformation / Connected
Ports
• Input / output or output
Usage
• Required for VSAM Source definitions
• Normalize flat file or relational source definitions
• Generate multiple records from one record
167
Normalizer Transformation
Turn one row
YEAR,ACCOUNT,MONTH1,MONTH2,MONTH3, … MONTH12
1997,Salaries,21000,21000,22000,19000,23000,26000,29000,29000,34000,34000,40000,45000
1997,Benefits,4200,4200,4400,3800,4600,5200,5800,5800,6800,6800,8000,9000
1997,Expenses,10500,4000,5000,6500,3000,7000,9000,4500,7500,8000,8500,8250
Into multiple rows
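The pivot the Normalizer performs can be sketched in Python: each input row with twelve MONTH columns becomes twelve output rows of (year, account, month, amount). The sketch uses the first two sample rows above.

```python
import csv
import io

data = """YEAR,ACCOUNT,MONTH1,MONTH2,MONTH3,MONTH4,MONTH5,MONTH6,MONTH7,MONTH8,MONTH9,MONTH10,MONTH11,MONTH12
1997,Salaries,21000,21000,22000,19000,23000,26000,29000,29000,34000,34000,40000,45000
1997,Benefits,4200,4200,4400,3800,4600,5200,5800,5800,6800,6800,8000,9000
"""

normalized = []
for rec in csv.DictReader(io.StringIO(data)):
    for month in range(1, 13):
        # One output row per occurring column, as the Normalizer would emit.
        normalized.append((rec["YEAR"], rec["ACCOUNT"], month, int(rec[f"MONTH{month}"])))

print(len(normalized))  # 24 rows produced from 2 input rows
print(normalized[0])    # ('1997', 'Salaries', 1, 21000)
```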
168
Stored Procedure Transformation
Calls a database stored procedure
Passive Transformation / Connected or Unconnected
Ports
• Mixed
• “R” denotes a port that will return a value from the stored function to the next transformation
Usage
• Perform transformation logic outside PowerMart / PowerCenter
169
External Procedure Transformation (TX)
Calls a passive procedure defined in a dynamic link library (DLL) or shared library
Passive Transformation / Connected or Unconnected
Ports
• Mixed
• “R” designates the return value port of an unconnected transformation
Usage
• Perform transformation logic outside PowerMart / PowerCenter
• Option to allow partitioning
170
Advanced TX Transformation
Calls an active procedure defined in a dynamic link library (DLL) or shared library
Active Transformation / Connected Mode only
Ports
• Mixed
Usage
• Perform transformation logic outside PowerMart / PowerCenter
• Sorting, Aggregation
• Option to allow partitioning
171
Transaction Control Transformation
Passive Transformation / Connected Mode only
Ports
• Input and Output
Properties
• Continue
• Commit Before
• Commit After
• Rollback Before
• Rollback After
Allows custom commit types (source- or target-based) and user-defined conditional commits
172
Transaction Control Functionality
Commit Types
• Target Based Commit – commit based on an “approximate” number of records written to the target
• Source Based Commit – ensures that a source record is committed in all targets
• User Defined Commit – uses the Transaction Control transformation to specify commits and rollbacks in the mapping, based on conditions
Set the Commit Type (and other specifications) in the Transaction Control Condition
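Target-based commit can be sketched in Python: a commit is issued once at least the commit-interval number of rows has been written since the last commit (in PowerCenter the interval is approximate because commits fall on buffer-block boundaries). The interval and row count are hypothetical.

```python
def load_with_target_based_commit(rows, commit_interval=10_000):
    """Sketch of target-based commit: count rows written since the last
    commit and commit whenever the interval is reached."""
    written_since_commit = 0
    commits = 0
    for _ in rows:
        written_since_commit += 1        # row written to target
        if written_since_commit >= commit_interval:
            commits += 1                 # database commit would be issued here
            written_since_commit = 0
    # Any remaining rows are committed at the end of the load.
    return commits

print(load_with_target_based_commit(range(25_000)))  # 2 interval commits
```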
173
Versioning
View Object Version Properties
Track Changes to an Object
Check objects “in” and “out”
Delete or Purge Object version
Apply Labels and Run queries
Deployment Groups
174
Informatica Business Analytics Suite
Custom Built Analytic Solutions
Packaged Analytic Solutions
Modular Plug-&-Play Approach
175
Informatica Warehouses / Marts
Informatica Warehouse ™
Supply ChainHumanResources
FinanceCustomerRelationship
Sales
Marketing
Service
Web
G/L
Receivables
Payables
Profitability
Compensation
Scorecard
Planning
Sourcing
Inventory
Quality
Customer Product Supplier Geography
Organization Time Employee
Common Dimensions
176
Inside the Informatica Warehouse
Business Adapters™ (Extract)
• Data source connectivity with minimal load
• Structural/functional knowledge of sources
Analytic Bus™ (Transform)
• Transaction consolidation and standardization
• Source-independent interface
Warehouse Loader™ (Load)
• Type I, II slowly changing dimensions
• History and changed-record tracking
Analytic Data Model
• Industry best-practice metrics
• Process-centric model & conformed dimensions
Advanced Calculation Engine
• Pre-aggregations for rapid query response
• Complex calculation metrics (e.g. statistical)
[Diagram: sources (SAP, i2, PSFT, SEBL, ORCL, Custom) feed the Informatica Warehouse™ – Business Adapters™ (Extract), Analytic Bus™ (Transform), Warehouse Loader™ (Load), Analytic Data Model, Advanced Calculation Engine – which serves Business Intelligence]
177
PowerConnect Products
Family of enterprise software products that allow companies to directly source and integrate ERP, CRM, real-time message queue, mainframe, AS/400, remote data and metadata with other enterprise data
PowerConnect for MQSeries (real time)
PowerConnect for TIBCO (real time)
PowerConnect for PeopleSoft
PowerConnect for SAP R/3
PowerConnect for SAP BW
PowerConnect for Siebel
PowerConnect for Mainframe
PowerConnect for AS/400
PowerConnect for Remote Data
PowerConnect SDK
178