Advanced ETL: Embedding Integration Services
DAT411
Ashvini Sharma, Development Lead, Microsoft Corporation
Sergei Ivanov, Technical Lead, Microsoft Corporation
Prerequisites
  Knowledge of Integration Services
  Knowledge of Data Flow functionality
  Level 400. Really.
Objectives
  Introduction to the SSIS programming model
  Learn how to integrate with dynamic metadata
  Learn how to use data cleansing functionality in your apps
Integration Services
SSIS Terminology
  Package
    Tasks
    Precedence Constraints
    Connection Managers
    Containers
  Data Flow Task
    Components: Source Adapters, Transformations, Destination Adapters
    Paths
Application Overview
  Get data from an Excel file
  Provide fuzzy cleansing for certain text fields
    FirstName, LastName
  Save cleaned data in another Excel file
  Look at the finished application first, then go through several iterations to build it
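The cleansing step described above can be illustrated outside SSIS. The following is a minimal sketch, assuming a small reference list of known-good names; it uses Python's difflib similarity matching as a stand-in and is not the algorithm behind the SSIS fuzzy transforms. The reference names, the `clean_value` and `clean_row` helpers, and the 0.8 cutoff are all hypothetical.

```python
import difflib

# Hypothetical reference data: the known-good spellings to cleanse against.
REFERENCE_FIRST_NAMES = ["Jonathan", "Katherine", "Michael"]


def clean_value(value, reference, cutoff=0.8):
    """Return the closest reference value, or the input unchanged if none is close."""
    matches = difflib.get_close_matches(value, reference, n=1, cutoff=cutoff)
    return matches[0] if matches else value


def clean_row(row, references):
    """Cleanse only the columns that have reference data; pass the rest through."""
    return {
        column: clean_value(value, references[column]) if column in references else value
        for column, value in row.items()
    }


dirty = {"FirstName": "Jonathon", "City": "Seattle"}
cleaned = clean_row(dirty, {"FirstName": REFERENCE_FIRST_NAMES})
```

Only FirstName is looked up here; a LastName column would be handled the same way with its own reference list.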
Application
SSIS is embeddable
  SQL Server uses SSIS
    SMO
    Maintenance Plans
  Other (non-SQL) products in development are using SSIS
  Writing your own UI is possible
    SSIS designer, Management Studio, Import/Export Wizard, Migration Wizard
  Uses no secret APIs
    Enumerating/adding/removing/changing/listening/scheduling/…
  Considering releasing Migration Wizard in Shared Source
  Digital signing enables tamper resistance
  Several customers doing metadata-driven package development
Pipeline Metadata
  Pipeline engine requires static metadata
    Early design decision
    Buffers laid out during pre-execute
    Strict data types
    Cannot map columns during execution
  Designer debugging expects design-time metadata at execution time
  Configured (dynamic) queries must resolve to design-time metadata at runtime
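Why static metadata matters can be seen in a small model. The sketch below is an illustration only, not the SSIS buffer implementation: it fixes a row layout from column metadata before any row is processed, then rejects rows that do not match it. The type names, the `plan_buffer` and `pack_row` helpers, and the sample columns are assumptions made for the example.

```python
import struct

# Hypothetical type names mapped to struct format codes (not SSIS data types).
TYPE_CODES = {"int32": "i", "float64": "d", "str10": "10s"}


def plan_buffer(columns):
    """Fix the row layout once, before any row is processed."""
    fmt = "<" + "".join(TYPE_CODES[type_name] for _, type_name in columns)
    return struct.Struct(fmt)


def pack_row(layout, columns, row):
    """Pack one row; a row whose columns differ from the plan is rejected."""
    if set(row) != {name for name, _ in columns}:
        raise ValueError("row does not match design-time metadata")
    values = []
    for name, type_name in columns:
        value = row[name]
        if type_name.startswith("str"):
            value = value.encode("utf-8")
        values.append(value)
    # struct enforces the declared types: a string in an int32 slot raises struct.error.
    return layout.pack(*values)


COLUMNS = [("CustomerID", "int32"), ("FirstName", "str10"), ("Balance", "float64")]
LAYOUT = plan_buffer(COLUMNS)  # the "pre-execute" step: row size is known up front
ROW_SIZE = LAYOUT.size
```

Because the layout is computed once, every row has the same size and offsets, which is what makes adding or remapping a column mid-execution impossible in this model.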
Dynamic Metadata
  Scenarios
    Source schema changes or is not known until execution
    Metadata-driven ETL processes
  Handling dynamic metadata
    Generate data flows dynamically
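Generating a data flow dynamically amounts to discovering the schema first and then building a fresh flow definition for it. A rough sketch of that idea, assuming a CSV header as the schema source and a plain dictionary as the generated flow; neither is an SSIS structure, and `discover_columns` and `generate_data_flow` are hypothetical helpers.

```python
import csv
import io


def discover_columns(csv_text):
    """Read only the header row to learn the schema at execution time."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader)


def generate_data_flow(columns, cleanse=()):
    """Build a source -> transform -> destination description for these columns.

    The dictionary layout is hypothetical; it stands in for a generated package.
    """
    unknown = [name for name in cleanse if name not in columns]
    if unknown:
        raise ValueError(f"cannot cleanse missing columns: {unknown}")
    return {
        "source": {"outputs": list(columns)},
        "transform": {
            "cleanse": list(cleanse),
            "pass_through": [name for name in columns if name not in cleanse],
        },
        "destination": {"inputs": list(columns)},
    }


sample = "FirstName,LastName,City\nJon,Smith,Seattle\n"
flow = generate_data_flow(discover_columns(sample), cleanse=("FirstName", "LastName"))
```

Each generated flow still has fixed metadata once built, so this approach works with, rather than around, the static-metadata requirement.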
Creating Packages
  From scratch through the object model
    Create all package elements from scratch
    Fast, small, efficient
    Harder to evolve the application
  From a template package
    Adjust only what needs adjusting after loading the template package
    Need to embed a potentially large template file
    Easier to evolve the application
    Digital signing detects user changes
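The template approach, including tamper detection, can be sketched as follows. This is an illustration under stated assumptions: the template is a JSON document rather than a real package file, and an HMAC with a shared key stands in for a certificate-based digital signature. `SECRET_KEY`, `sign`, `load_template`, and `package_from_template` are hypothetical names.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key"  # assumption: real signing would use a certificate, not a shared key


def sign(template_bytes):
    """Compute a keyed digest of the template as shipped."""
    return hmac.new(SECRET_KEY, template_bytes, hashlib.sha256).hexdigest()


def load_template(template_bytes, signature):
    """Refuse to load a template that changed after it was signed."""
    if not hmac.compare_digest(sign(template_bytes), signature):
        raise ValueError("template was modified after signing")
    return json.loads(template_bytes)


def package_from_template(template_bytes, signature, source_path, dest_path):
    """Load the verified template and adjust only the connection settings."""
    package = load_template(template_bytes, signature)
    package["connections"]["source"] = source_path
    package["connections"]["destination"] = dest_path
    return package


TEMPLATE = json.dumps(
    {"name": "CleanNames", "connections": {"source": "", "destination": ""}, "tasks": ["DataFlow"]}
).encode("utf-8")
SIGNATURE = sign(TEMPLATE)
```

Everything in the template other than the two connection paths is reused as-is, which is the trade described above: a larger embedded file in exchange for an application that is easier to evolve.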
Components Terminology
  Component
    Input
      Input Columns (only data referenced by the component)
      Virtual Input Columns (all available data produced by upstream components; used at design time for selecting input columns)
      External Metadata Columns (schema snapshot)
    Output
      Output Columns (produced data)
      External Metadata Columns (schema snapshot)
  LineageID uniquely identifies a column
    Every output column gets a new LineageID
  Column Mapping
    Sources: ExternalColumn <-> OutputColumn
    Transforms: InputColumn <-> OutputColumn
    Destinations: InputColumn <-> ExternalColumn
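The relationship between lineage IDs and the three mapping kinds can be modeled in a few lines. This is a conceptual sketch, not the SSIS object model: the `OutputColumn` class, the counter that hands out IDs, and the helper functions are all hypothetical.

```python
import itertools
from dataclasses import dataclass, field

_next_lineage_id = itertools.count(1)


@dataclass
class OutputColumn:
    """A produced column; every instance receives a new lineage ID."""
    name: str
    lineage_id: int = field(default_factory=lambda: next(_next_lineage_id))


def source_outputs(external_columns):
    """Source: map each external column name to a new output column."""
    return {ext: OutputColumn(ext) for ext in external_columns}


def select_inputs(virtual_inputs, wanted):
    """Transform, design time: pick input columns from everything available upstream."""
    return [col for col in virtual_inputs if col.name in wanted]


def transform_outputs(input_columns, suffix):
    """Transform: map each input lineage ID to a newly produced output column."""
    return {col.lineage_id: OutputColumn(col.name + suffix) for col in input_columns}


def destination_mapping(input_columns, external_names):
    """Destination: map input lineage IDs to external columns by position."""
    return {col.lineage_id: ext for col, ext in zip(input_columns, external_names)}


src = source_outputs(["FirstName", "LastName", "City"])
virtual = list(src.values())  # all upstream columns
inputs = select_inputs(virtual, {"FirstName", "LastName"})  # only what is referenced
cleaned = transform_outputs(inputs, "_Clean")
```

Note that the cleaned columns carry different lineage IDs from the columns they were derived from, so a downstream component can tell the original and the cleansed value apart even if they were given the same name.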
Pipeline Programming Model
  ComponentMetaData
    Provided for all components by the engine automatically
    Manages metadata and persistence for the component
    Contact information for unregistered components
    Helps delay creation of components until necessary
  Runtime Connection Collection
    Connection managers used by the component
  [Diagram labels: ComponentMetaData, Inputs, Outputs, Component, RCC]
Configuring Data Flows
Using Fuzzy transforms
SSIS As A Source
  ETL processes typically encode complex business rules
  Reuse is important
    One version of the truth
    Updates in one place
  Leverage advantages of SSIS: scalability, manageability, visual building of complex processes, etc.
SSIS Source Implementation
  Implements IDbConnection
  ConnectionString is the command-line arguments to dtexec.exe
  Command
    CommandText is the name of the DataReaderDest component in the package
    ExecuteReader runs the package when asked for data, returns IDataReader
    Supports SchemaOnly also
  DataReaderDest implements IDataReader
    Gets the first buffer and waits for data request
  [Diagram labels: Microsoft.SqlServer.Dts.DtsClient, Data Reader Destination Component]
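The shape of this design, a command object whose reader starts the package only when data is requested, can be sketched in Python. This models the pattern only and is not the DtsClient API: `PackageCommand`, `fake_package`, and the package file name are hypothetical, and the package is simulated by a generator.

```python
class PackageCommand:
    """Models a command whose text names the reader destination in a package."""

    def __init__(self, connection_string, command_text, run_package):
        self.connection_string = connection_string  # dtexec-style arguments
        self.command_text = command_text            # name of the reader destination
        self._run_package = run_package             # hypothetical callable that yields rows
        self.executions = 0

    def execute_reader(self):
        """Return a lazy row iterator; the package starts on the first data request."""
        def rows():
            self.executions += 1
            yield from self._run_package(self.connection_string, self.command_text)
        return rows()


def fake_package(arguments, destination_name):
    """Stand-in for a package run; a real one would stream rows from its data flow."""
    yield {"FirstName": "Jon", "LastName": "Smith"}
    yield {"FirstName": "Ann", "LastName": "Lee"}


# The package file name is a made-up example.
command = PackageCommand('/FILE "CleanNames.dtsx"', "DataReaderDest", fake_package)
reader = command.execute_reader()
```

Calling `execute_reader` does no work by itself; the simulated package runs when the consumer pulls the first row, which mirrors the "runs the package when asked for data" behavior listed above.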
Putting it together
Summary
  Programming SSIS is straightforward
  Several embedding options exist
  SSIS can handle flexible metadata
  SSIS provides rich functionality and high performance
Resources
  Embedding Reporting and Analysis in your Smart Client App: DAT313, 502AB, 5:00PM
  Samples installed by setup
  Community site, run by MVPs
    http://www.sqlis.com
  Interact with product team on MSDN Forums
    http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=80
  Webcasts, training, blog links, books, …
    http://msdn.microsoft.com/SQL/sqlwarehouse/SSIS/default.aspx
© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.