Date post: | 26-Mar-2015 |
Category: |
Documents |
Upload: | samuel-sweeney |
InfoSphere Information Server: Trends & Tactics for Improving Data Quality of your Business Intelligence Solution
Session Abstract
Many organizations struggle with broader user adoption of Business Intelligence and Performance Management due to a lack of trust in their data, and the inability to deliver the breadth, speed and consolidated information perspective necessary to keep pace with the business.
This "how-to" session will discuss how to enhance your existing and planned Cognos initiatives by addressing the need for on-time delivery of trusted information.
Specifically, learn how to leverage the IBM InfoSphere product portfolio, as a foundation for Cognos 8 BI, to immediately address your data quality, real-time information integration, and data warehousing challenges and drive more business value.
Performance Management Challenges Faced
How to deliver quality information from fragmented, disparate systems at the volume and velocity required by the business?
How to address the diverse needs of everyone in the business with a complete, consistent view of information?
How to establish standards and governance, and break down barriers to establish an IT-business partnership?
Business Challenge
Information Challenge
Process Challenge
Increasing Focus on Data Quality
Businesses are beginning to realize that data quality issues not only cost them time and money, but also inhibit their ability to address core strategic projects
More and more businesses are establishing programs for data quality, to measure and improve the reliability of information
Analysts contend that companies with focused data quality programs will find more opportunities to outperform their peers
Why Does this Problem Exist?
Most enterprises are running distinct sales, services, marketing, manufacturing and financial applications, each with its own “master” reference data.
No one system is the universally agreed-to system of record.
Enterprise Application Vendors do not guarantee a complete & accurate integrated view – they point to their dependence on the quality of the raw input data
Data quality continues to erode at the point of entry, though it is not a data entry problem
Business Drivers for Investment Depend on Data Quality
Empowering risk and compliance initiatives with the information they require
Optimizing Revenue Opportunities by ensuring effective and efficient interactions with customers, partners, and suppliers
Enabling collaborative business processes with consistent and trustworthy information
Reducing the total cost of ownership for maintaining consistent information across the enterprise
What is the Impact of Poor Data Quality?
Lost Sales Opportunity
“Hard” Losses
• SKU misplaced or hard to find
• Out of stocks attributed to the store
“Soft” Losses
• Lost potential for cross-sell and up-sell (staff not trained or available)
• Reduced store visit frequency
• Abandoned carts (poor service or excessive queues)
Lost sales by cause, in the order shown: 1.5%, 1.7%, 2-4%, 1-3%, 1-2%
Total: 7.2%-12%
Source: GMA/FMI/CIES 2003 (US grocery), ECR Europe 2003, Lineraires.com, California Management Review, IBM case studies, interviews and IBM Institute for Business Value analysis
Data Quality is a Subjective Business Standard
Data = facts used as a basis for decision making suitable for storage on a computer
Quality = the general standard or grade of something
Data Quality = a subjective standard used to determine if a set of facts is suitable for a particular business purpose
Relevant?
Accurate?
Valid?
Complete?
Business Purpose
Ultimately, Data Quality = Trust
So, What Constitutes Data Quality?
Data is standardized
Data is fit for purpose (conforms to rules)
Each record is unique
View of information is complete
Records are certified against authoritative sources
Lineage is understood
Data quality is measured over time
What Do You Need to Establish a Data Quality Program?
A foundation platform that centralizes quality rules and provides auditable data quality
Business-driven, data-centric design environment for data quality rules
An ongoing process for data quality
A way to measure quality over time
Universal deployment of quality rules across all points of entry
Data quality ownership and data governance
Management sponsorship and a corporate mandate for data quality improvement
Common Data Problems
Lack of information standards - different formats & structures across different systems
Data surprises in individual fields - data misplaced in the database
Information buried in free-form fields
Data myopia - lack of consistent identifiers inhibits a single view
The redundancy nightmare - duplicate records with a lack of standards
Kate A. Roberts 416 Columbus Ave #2, Boston, Mass 02116
Catherine Roberts Four sixteen Columbus APT2, Boston, MA 02116
Mrs. K. Roberts 416 Columbus Suite #2, Suffolk County 02116
Name Tax ID Telephone
J Smith DBA Lime Cons. 228-02-1975 6173380300
Williams & Co. C/O Bill 025-37-1888 415-392-2000
1st Natl Provident 34-2671434 3380321
HP 15 State St. 508-466-1200 Orlando
WING ASSY DRILL 4 HOLE USE 5J868A HEXBOLT 1/4 INCH
WING ASSEMBY, USE 5J868-A HEX BOLT .25” - DRILL FOUR HOLES
USE 4 5J868A BOLTS (HEX .25) - DRILL HOLES FOR EA ON WING ASSEM
RUDER, TAP 6 WHOLES, SECURE W/KL2301 RIVETS (10 CM)
19-84-103 RS232 Cable 6' M-F CandS
CS-89641 6 ft. Cable Male-F, RS232 #87951
C&SUCH6 Male/Female 25 PIN 6 Foot Cable
90328574 IBM 187 N.Pk. Str. Salem NH 01456
90328575 I.B.M. Inc. 187 N.Pk. St. Salem NH 01456
90238495 Int. Bus. Machines 187 No. Park St Salem NH 04156
90233479 International Bus. M. 187 Park Ave Salem NH 04156
90233489 Inter-Nation Consults 15 Main Street Andover MA 02341
90345672 I.B. Manufacturing Park Blvd. Bostno MA 04106
A Platform for Data Quality
A Process For Data Quality
Establish Data Quality Ownership & Sponsorship
Analyze Source Data
Measure & Baseline Data Quality
Standardize
Certify & Enrich
Match
Link or Survive
Re-Measure
Report
Understanding Data Quality
Enforcing Data Quality Standards
Monitoring Data Quality
Analyzes data structure, Quality Controls for Completeness and Validity of data values
Incomplete or Invalid values set by value, range, or reference sources
Consistency checks for data formats
Removes duplicates
Cross-references matching records
Survives a single complete record
Cleanses and enriches data
Understanding and Monitoring Data Quality
Enforcing Data Quality Standards
Data Quality Capabilities
Understanding Data Quality: Data Quality Assessment Methodology
Define clear business problem statement
• Increase revenue by cross-selling our services more effectively to all clients
• Reduce materials costs by negotiating better prices from our suppliers
• Reduce parts inventory across our manufacturing plants
• Reduce IT costs and improve service levels by consolidating overlapping applications
Over 5 days, our technical experts analyze data that supports your business problem statement
• IBM and customer map issues to relevant data samples
• Agree scope of measures and customer provides data sample: e.g., 4 or 5 key tables and 5-10 key columns
IBM analyzes the data
• Column usage and completeness
• Compliance with business formats
• Variation in standards
• Range and outliers
• Incidence of duplicates
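A few of the checks listed above (completeness, compliance with formats, incidence of duplicates) can be illustrated with a minimal Python sketch. This is not how InfoSphere Information Analyzer works internally; the function names `format_pattern` and `profile_column` are hypothetical, and the sample values reuse telephone numbers from the Common Data Problems slide plus one missing and one repeated value added for illustration. Range and outlier checks are left out.

```python
import re
from collections import Counter

def format_pattern(value: str) -> str:
    """Map digits to 9 and letters to A so values with the same layout share a pattern."""
    pattern = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", pattern)

def profile_column(values):
    """Return completeness, format frequencies, and duplicate count for one column."""
    filled = [v.strip() for v in values if v is not None and v.strip()]
    counts = Counter(filled)
    return {
        # Share of rows that carry a non-empty value.
        "completeness": len(filled) / len(values) if values else 0.0,
        # How often each layout (for example 999-999-9999) occurs.
        "formats": Counter(format_pattern(v) for v in filled),
        # Number of rows that repeat an earlier value.
        "duplicates": sum(n - 1 for n in counts.values() if n > 1),
    }

# Four telephone values from the slide, one missing value, one repeat (illustrative).
telephones = ["6173380300", "415-392-2000", "3380321", "508-466-1200", None, "415-392-2000"]
report = profile_column(telephones)
print(report)
```

A column whose `formats` counter shows several competing layouts is a candidate for a standardization rule; a low `completeness` value flags a column that may not support the business problem statement.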
Data Quality Analysis
Business Subject Matter Expert
Data Steward
InfoSphere Information Analyzer
Understanding Data Quality: Assessment Outcomes
Management report and presentation of findings
• Identify Performance Management project exposures
• Optional follow-on workshops
• Regulatory exposures
Data Discovery
• Quantitative results
• Data completeness and format issues
• Business rule compliance
Data Quality Baseline
• The DQA sets a shared baseline platform for an ongoing data quality improvement initiative (data governance) or tactical remedial project
Case Study: Pharmaceutical company
The Tipping Point – unable to get a consolidated view of data. Report accuracy was suspect.
The Hurdle – marketing and sales data warehouse contained many data quality issues
The Result – using IBM InfoSphere Information Analyzer and IBM InfoSphere QualityStage they reduced development time and their reports now support better targeted marketing
Enforcing Data Quality Standards: Investigation
Parsing: Separating multi-valued fields into individual pieces
123 | St. | Virginia | St.
Lexical analysis: Determining business significance of individual pieces
Number | Street Type | Alpha | Street Type
123 | St. | Virginia | St.
Context Sensitive: Identifying various data structures and content
House Number | Street Name | Street Type
123 | St. Virginia | St.
“The instructions for handling the data are inherent within the data itself.”
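The three investigation steps can be sketched in Python using the slide's own example, "123 St. Virginia St.". This is an illustrative sketch only, not the QualityStage rule language: the `STREET_TYPES` table, the class names, and the single pattern rule in `interpret` are assumptions made for the example.

```python
# Hypothetical classification table for the lexical step.
STREET_TYPES = {"ST", "AVE", "RD", "BLVD"}

def parse(address: str):
    """Parsing: split a free-form field into individual tokens."""
    return address.split()

def classify(token: str) -> str:
    """Lexical analysis: give each token a class, ignoring its position."""
    bare = token.rstrip(".").upper()
    if bare.isdigit():
        return "NUMBER"
    if bare in STREET_TYPES:
        return "STREET_TYPE"
    return "ALPHA"

def interpret(tokens):
    """Context-sensitive step: use the pattern of classes to decide each token's role."""
    classes = [classify(t) for t in tokens]
    # Assumed rule: a leading number is the house number and a trailing street
    # type is the type; everything between is the street name, even when a
    # token in the middle looks like a street type ("St." in "St. Virginia").
    if len(tokens) >= 3 and classes[0] == "NUMBER" and classes[-1] == "STREET_TYPE":
        return {
            "house_number": tokens[0],
            "street_name": " ".join(tokens[1:-1]),
            "street_type": tokens[-1],
        }
    return {"unparsed": " ".join(tokens)}

tokens = parse("123 St. Virginia St.")
classes = [classify(t) for t in tokens]
result = interpret(tokens)
print(classes)
print(result)
```

The first "St." is classified as a street type in isolation, and only the surrounding pattern shows that it belongs to the street name. Inputs that match no rule fall through as `unparsed` so they can be reviewed rather than silently misfiled.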
Enforcing Data Quality Standards: Standardization
Input File:
Address Line 1 | Address Line 2
639 N MILLS AVENUE | ORLANDO, FLA 32803
306 W MAIN STR, | CUMMING, GA 30130
3142 WEST CENTRAL AV | TOLEDO OH 43606
843 HEARD AVE | AUGUSTA-GA-30904
1139 GREENE ST ACCT #1234 | AUGUSTA GEORGIA 30901
4275 OWENS ROAD SUITE 536 | EVANS GA 30809
Result File:
House # | Dir | Str. Name | Type | Unit No. | NYSIIS | City | SOUNDEX | State | Zip | ACCT#
639 | N | MILLS | AVE | | MAL | ORLANDO | O645 | FL | 32803 |
306 | W | MAIN | ST | | MAN | CUMMING | C552 | GA | 30130 |
3142 | W | CENTRAL | AVE | | CANTRAL | TOLEDO | T430 | OH | 43606 |
843 | | HEARD | AVE | | HAD | AUGUSTA | A223 | GA | 30904 |
1139 | | GREENE | ST | | GRAN | AUGUSTA | A223 | GA | 30901 | 1234
4275 | | OWENS | RD | STE 536 | ON | EVANS | E152 | GA | 30809 |
Results in strongly “typed”, fixed-field, standardized data
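Two parts of the standardization result can be illustrated with a short Python sketch: replacing variant spellings with a standard abbreviation, and deriving a SOUNDEX code for the city. The `ABBREVIATIONS` table is a hypothetical stand-in built from the values on the slide, and `soundex` follows the commonly documented American Soundex rules; this is not the product's own rule set, and NYSIIS is not covered.

```python
# Hypothetical abbreviation table built from the variants shown on the slide.
ABBREVIATIONS = {"AVENUE": "AVE", "AV": "AVE", "STR": "ST",
                 "ROAD": "RD", "SUITE": "STE", "WEST": "W"}

# Standard Soundex letter groups.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def standardize_tokens(line: str):
    """Upper-case, strip commas, and replace known variants with one spelling."""
    tokens = [t.strip(",").upper() for t in line.split()]
    return [ABBREVIATIONS.get(t, t) for t in tokens if t]

def soundex(name: str) -> str:
    """Return the four-character American Soundex code for a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            prev = code
    return "".join(result)[:4].ljust(4, "0")

print(standardize_tokens("3142 WEST CENTRAL AV"))
print(soundex("ORLANDO"), soundex("AUGUSTA"))
```

The codes produced for the five cities agree with the SOUNDEX column in the result file. A blind token substitution like this one would also rewrite a street that is actually named "WEST", which is why the substitution is normally applied only after each token's role has been identified.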
Enforcing Data Quality Standards: Matching
Clerical review
Record linkage
Survivorship
Append/Fix sources
Cross-reference
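How a matching pass might route record pairs to a match, a clerical review queue, or a non-match, and then survive one record per group, can be sketched as follows. This is a simplified illustration that uses Python's `difflib` string similarity rather than the matching method of any IBM product; the cutoff values, field names, and the "longest value wins" survivorship rule are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical cutoffs; real values would be tuned against sample data.
MATCH_CUTOFF = 0.85
CLERICAL_CUTOFF = 0.60

def similarity(a: dict, b: dict, fields) -> float:
    """Average string similarity over the compared fields (0.0 to 1.0)."""
    scores = [
        SequenceMatcher(None, a.get(f, "").upper(), b.get(f, "").upper()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)

def decide(score: float) -> str:
    """Route a pair: automatic match, clerical review, or non-match."""
    if score >= MATCH_CUTOFF:
        return "match"
    if score >= CLERICAL_CUTOFF:
        return "clerical review"
    return "non-match"

def survive(records, fields):
    """Survivorship: for each field keep the longest non-empty value in the group."""
    return {f: max((r.get(f, "") for r in records), key=len) for f in fields}

# Two records from the Common Data Problems slide, split into assumed fields.
group = [
    {"id": "90328574", "name": "IBM", "street": "187 N.Pk. Str."},
    {"id": "90328575", "name": "I.B.M. Inc.", "street": "187 N.Pk. St."},
]
fields = ["name", "street"]
score = similarity(group[0], group[1], fields)
survivor = survive(group, fields)
cross_reference = [r["id"] for r in group]  # source keys kept for lineage
print(round(score, 2), decide(score), survivor, cross_reference)
```

Pairs that land between the two cutoffs are the ones a data steward would inspect by hand; keeping the source identifiers alongside the surviving record preserves the cross-reference back to each contributing system.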
Lessons Learned and Best Practice
Recruit an executive sponsor
• Signals that the initiative is important
• Assures that funds continue to be available
• Discourages other business units from implementing conflicting projects
Convene a data quality working group
• Assess and report on quality early in the process
• May coincide with implementation teams or data warehousing teams
• Business leads, but IT coordinates and facilitates
• Strive for consensus
Have the business appoint a data quality steward for each business unit
• For business units with large user populations, several stewards are appropriate
Summary
Data quality is becoming an increasingly important organizational issue
Improving data quality and ensuring information delivery requires a focused, programmatic, and varied approach
At the core of any data quality program is a platform capable of providing auditable data quality assessment services
IBM InfoSphere Information Server, InfoSphere Warehouse and Cognos 8 BI deliver informational understanding, ownership and trust
How Can IBM Help?
Comprehensive platform for data quality assessment, cleansing and on-going monitoring
Experience and repeatable process for helping organizations set up data quality programs
Domain and industry-specific expertise in establishing repeatable data quality services
Data quality assessment offering to report on existing data quality and establish the business value of a data quality program
Stop by the “Solution Center” for demos of InfoSphere with Cognos 8 BI integration
Contact your Cognos or IBM InfoSphere representative for more information, or visit: www.ibm.com/infosphere
Thank you for your time
© Copyright IBM Corporation 2008 All rights reserved. The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. References in these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in these materials may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. IBM, the IBM logo, Cognos, the Cognos logo, and other IBM products and services are trademarks of the International Business Machines Corporation, in the United States, other countries or both. Other company, product, or service names may be trademarks or service marks of others.