Date post: | 12-Apr-2017 |
Upload: | ashish-nachane |
Enterprise Information Management
Pristine Data Inc. Ashish Nachane
Contents
• Introduction
• Information Management Paradigm
• Information Management Evolution Cycle
• Data Categories
• Information Management Initiatives
• Data Service
• Data Governance
• Data Quality
• Appendix
• Working Diagrams
• Implementation Techniques
Introduction – Information Management Paradigm
Governance
• Guiding Principles
• Project Management
• Enhancement Prioritization
• Business Case Development
• Funding requests
• Privacy
Data Quality
• Data Profiling
• Quality Measurement
• Cleanup Initiatives
Content Expertise
• Data Content SME’s
• Source System SME’s
• Report Librarian
• Atomic Data SME’s
• Metadata Tools
Nomenclature
• Common Code Tables
• Data Naming Standards
• KPI Definition
• Element Reference Directory
Content & Use Management
Development / Design
• Analytic Sandbox
• Analytic Pilots
• Information Delivery Tool SME’s
• ETL SME’s
Metadata
• Conceptual Models
• Logical Models
• Physical Data Models
• Business Definitions
Architecture
• Process Blueprints
• Technology evaluation
• Tool/product Standards
• Corporate Standards Development
Shared Services
• Reusable Components
• Reusable Infrastructure
• Enterprise Licenses
• Web Services Stack
Training
• Tools
• Metadata / Content
Information/Data Domain
Functional Area
Storage Management
• Allocation of Local / SAN / NAS Storage
• Mapping Importance to Physical Media
• Physical Failover (RAIDx)
• Physical Data Layout
Disaster Recovery
• Business Continuity Planning
• Backup / Recovery Strategy
• DR SLA’s
Performance Management
• Query Monitoring
• DB Tuning
• Usage Monitoring
• Data Archive Strategy
Capacity Planning
• Physical Storage
• Index / Working Overhead
• Growth / Usage Projections
• ETL Staging
ETL Architecture
• Integration Paradigms
• ETL Best Practices
• Shared Process Management
• Common Operational Controls / Monitoring
Data Acquisition
• Data Sourcing / Mapping
• Extract Management
• Incremental Derivation
Data Transformation
• Transformation Rules
• Transformation Services
• Source / Target Maps
Data Access
• Standard Reporting
• Ad Hoc Queries
• Analytic marts
• OLAP
• Data Mining
Information Delivery, Operations, Integration, Architecture
Security
• Stratification
• Policies
• Access Control Mechanisms
• Compliance Procedures
Infrastructure
• Network Connectivity & Capacity
• Storage Mechanisms
• Servers
EAI / ESB / Messaging
• Message based integration
• Business Activity Monitoring
EII
• Real-time Integration
Information Management Life Cycle
Business operations managed effectively
Information Integration and Delivery
Basic Data Management
Enterprise Information Management
• Enterprise view of data exists
• Master Data Management in place
• Data definitions and standards exist across the enterprise
• Standardized processes to create and maintain high-quality data exist
• Industry standards-compliant structure and format of data
• 3rd party data integration
• Information created and delivered at the point of operational performance or management control
• Complete information – bad news with the good
• Formal information network predominates
• Collaborative exchange of transactions & information with trading partners
• Analytics applied to operational measurement
• Increased effectiveness of management decision making and faster answers to critical business questions
• Organizational incentives and information processes designed to maximize enterprise value
• Enhanced information, tools and decision-making at every management control point
• Value creation from information used to disrupt the value network
Strategic Information Capability
Facts >>>> Understanding >>>> Optimization >>>> Innovation
Increasing Executive-Level Commitment
• Data exists in operational silos
• Limited analytic capabilities exist
• Information not shared across functional areas
Business performance analyzed across silos – knowing what happened
Information integrated into operations to optimize performance
Competitive advantages derived from leveraging information to accelerate desired capabilities
Introduction
Introduction – Data Categories
Most organizations today accumulate various categories of data. Each of these data categories has a specific usage within the enterprise.
Enterprise Process Management
Gartner defines Business Process Management as:
A management discipline that treats business processes as assets that directly contribute to enterprise performance by driving operational excellence and business agility.
http://www.gartner.com/it-glossary/business-process-management-bpm/
http://www.metacase.com/methods/bpmn.html
An organization’s business evolves, and so must the processes that support it. A typical BPM initiative will follow these steps:
• Analyze current processes (‘As is Model’)
• Design the future processes (‘To be Model’)
• Develop future processes (Workflows, Business Rules, Process Models etc.)
• Execute workflows
• Monitor business activity
Software vendors such as IBM, Oracle, Pegasystems, Appian (& many more) offer BPM suites to accomplish the tasks involved in a BPM program. Some of the components offered by these suites are:
• Process Modeling, Simulation
• Rules Engines, Forms Designer
• Web Services (SOA), Application Integration
• Content Repositories, Document Management
• Data and Database access
• Business Activity Monitoring, Portal, Analytics
BPM evolution cycle
Information Management Initiatives
Service Oriented Architecture
Solaris, Windows, Linux, AIX, Mainframes
Middleware: SOA Products, .Net, MQSeries, TIBCO, CICS
A service is an action or collection of actions performed in order to provide predictable results. Many such services exist within an organization. These standalone services need to collaborate with each other to provide an organization with a platform to accomplish its goals.
This collaboration is made possible by ensuring that the services:
• Are available
• Are complete and reliable
• Can communicate with each other using a common interface (protocol)
Establishing these baseline requirements is a key goal of Service Oriented Architecture.
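The three baseline requirements can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen only for illustration: the `Service` interface, the `ServiceRegistry` class and the two example services are not taken from any SOA product.

```python
from typing import Protocol


class Service(Protocol):
    """Common interface every service is assumed to expose (hypothetical)."""

    name: str

    def is_available(self) -> bool: ...

    def handle(self, request: dict) -> dict: ...


class OrderProcessing:
    name = "order-processing"

    def is_available(self) -> bool:
        return True

    def handle(self, request: dict) -> dict:
        # Predictable result: the reply always has the same shape.
        return {"status": "accepted", "order_id": request["order_id"]}


class AccountsReceivable:
    name = "accounts-receivable"

    def __init__(self, orders: Service) -> None:
        # Depends only on the common interface, not on the implementation.
        self.orders = orders

    def is_available(self) -> bool:
        return self.orders.is_available()

    def handle(self, request: dict) -> dict:
        order = self.orders.handle({"order_id": request["order_id"]})
        return {"status": "invoiced", "order_id": order["order_id"],
                "amount": request["amount"]}


class ServiceRegistry:
    """Looks services up by name and refuses calls to unavailable ones."""

    def __init__(self) -> None:
        self._services = {}

    def register(self, service: Service) -> None:
        self._services[service.name] = service

    def call(self, name: str, request: dict) -> dict:
        service = self._services[name]
        if not service.is_available():
            raise RuntimeError(f"service {name!r} is not available")
        return service.handle(request)


registry = ServiceRegistry()
orders = OrderProcessing()
registry.register(orders)
registry.register(AccountsReceivable(orders))
invoice = registry.call("accounts-receivable", {"order_id": 42, "amount": 99.5})
```

Callers reach a service through the registry and the shared interface, which is one way to de-couple users from service implementations.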
Application Layer: SAP, ORACLE, CA, In House / Custom Applications
Internal Services External Services
Order Processing
Accounts Receivables
Market Reporting
Outsourced Services
SOA Benefits
• Quicker response to changing market conditions
• Holistic approach to organization needs
• Better business control over IT solutions
• Agile and scalable infrastructure
• Lower application development costs
• De-coupling users from service implementations
• Reusability
Information Management Initiatives – Enterprise Application Integration
ESB Stack
Enterprise Service Bus
In a Service Oriented Architecture, an ESB is a software architecture that facilitates interaction between mutually exclusive software applications. An ESB exploits asynchronous messaging for communicating between applications. An ESB needs to:
• Govern message exchange between services
• Control deployment and versioning of services
• Resolve contention between services
• Promote reusability of services
• Provide commodity services such as protocol conversion, event handling, data mapping, message and event queuing, exception handling, and service quality enforcement
Messaging – Message Service, Message Routing & Consolidation (MQ Series, MSMQ)
Protocol Conversion – XML, XSL, CORBA, SOAP
Web Services – WSDL, REST, CGI
Special Message Services – Test tools, loop back
Business Application Monitoring
Data Consolidation & Mapping – EDI, MDM, B2B
Application Adapters – RFC, IDoc, XML-RPC
Process Automation – BPEL, Workflow
ESB Architecture assumes that services are autonomous and the availability of a service cannot be guaranteed. Hence the messages need to be buffered continuously. An ESB manages message processing such that it can:
• Buffer a message and deliver it as soon as the receiver is ready
• Enforce dynamic processing and security policies
• Monitor messages and services
• Prioritize, delay and reschedule message delivery
• Maintain message logs and handle exceptions
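The buffering, prioritization and logging behavior listed above can be sketched as a toy in-memory bus. This is only an illustration under simplifying assumptions: the `MessageBus` class and its method names are hypothetical, there is no persistence or security policy, and a real ESB product works very differently internally.

```python
import heapq
import itertools
from collections import defaultdict


class MessageBus:
    """Toy store-and-forward bus: holds messages until the receiver is ready."""

    def __init__(self):
        self._buffers = defaultdict(list)  # receiver -> heap of pending messages
        self._handlers = {}                # receiver -> callable, only while ready
        self._seq = itertools.count()      # keeps FIFO order within one priority
        self.log = []                      # simple message log

    def send(self, receiver, payload, priority=5):
        # Lower number means higher priority.
        heapq.heappush(self._buffers[receiver],
                       (priority, next(self._seq), payload))
        self.log.append(("buffered", receiver, payload))
        self._deliver(receiver)

    def attach(self, receiver, handler):
        """Mark a receiver as ready and flush anything buffered for it."""
        self._handlers[receiver] = handler
        self._deliver(receiver)

    def detach(self, receiver):
        """Mark a receiver as unavailable; later messages stay buffered."""
        self._handlers.pop(receiver, None)

    def _deliver(self, receiver):
        handler = self._handlers.get(receiver)
        if handler is None:
            return  # receiver not ready: keep messages in the buffer
        queue = self._buffers[receiver]
        while queue:
            _, _, payload = heapq.heappop(queue)
            try:
                handler(payload)
                self.log.append(("delivered", receiver, payload))
            except Exception as exc:
                # Exception handling: record the failure and carry on.
                self.log.append(("failed", receiver, payload, repr(exc)))


bus = MessageBus()
received = []
bus.send("billing", "invoice-1")                  # buffered, nobody is listening
bus.send("billing", "urgent-refund", priority=1)  # buffered with higher priority
bus.attach("billing", received.append)            # receiver ready: buffer flushed
bus.send("billing", "invoice-2")                  # delivered immediately
```

Once the receiver attaches, the higher-priority message is delivered ahead of the earlier one, and the log records each buffering and delivery step.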
ESB does not implement SOA but provides features for SOA implementation. ESB is standards based and flexible. ESB is not always web services based.
Information Management Initiatives – Enterprise Business Intelligence
Information Management & Data Warehousing
Data Warehouse (DW): consolidated data storage designed to support an organization’s analytical needs
Operational Data Store (ODS): a data store with near real-time data to support operational needs
Data Mart (DM): a data store designed to support a department's analytical needs
Extract Transform Load (ETL): process used to extract transactions from different systems, transform the data into usable form and load it to the data warehouse
Business Intelligence (BI): the process of analyzing data based on relationships and trends
Knowledge Management (KM): an approach to organizing information such that it is more available and more valuable
Master Data Management (MDM): an approach to define and manage an organization’s non-transactional reference data
Executive Information Systems (EIS): provide up-to-the-minute information to executive management about the organization’s operations
On-line Analytical Processing (OLAP): a system designed to provide efficient data analytics by rolling up data into pre-defined aggregates
Data Mining: process and technique for analyzing large amounts of data to derive customer behavior patterns
Dimensions: a table used to store master data associated with a business activity, such as Contacts, Universities, Alumni, Placements etc.
Facts: a table used to store historical business activity details by dimension. Traditionally fact tables contain multiple rows for the same entity over a period of time.
Aggregates: Database table used to store business activities aggregated / rolled up based on pre-defined criteria.
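To make the dimension, fact and aggregate definitions concrete, here is a small sketch using Python's built-in sqlite3 module. The table names, column names and sample figures are invented for illustration; they only echo the Universities and Placements examples mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_university (              -- dimension: master data
    university_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE fact_placement (              -- fact: several rows per university over time
    university_id INTEGER REFERENCES dim_university(university_id),
    year INTEGER NOT NULL,
    placements INTEGER NOT NULL
);
CREATE TABLE agg_placement_by_university ( -- aggregate: pre-computed roll-up
    university_id INTEGER PRIMARY KEY,
    total_placements INTEGER NOT NULL
);
""")

conn.executemany("INSERT INTO dim_university VALUES (?, ?)",
                 [(1, "North University"), (2, "South University")])
conn.executemany("INSERT INTO fact_placement VALUES (?, ?, ?)",
                 [(1, 2015, 120), (1, 2016, 150), (2, 2015, 80), (2, 2016, 95)])

# Roll the facts up once, based on a pre-defined criterion (per university).
conn.execute("""
INSERT INTO agg_placement_by_university
SELECT university_id, SUM(placements) FROM fact_placement GROUP BY university_id
""")

# Reports read the small aggregate table and join to the dimension for labels.
rows = conn.execute("""
SELECT d.name, a.total_placements
FROM agg_placement_by_university AS a
JOIN dim_university AS d USING (university_id)
ORDER BY d.name
""").fetchall()
```

The report query never touches the fact table, which is the point of keeping a pre-defined aggregate.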
Data Organization
Data Warehouse Objectives
Data Manipulation Techniques
Data Warehousing Terms
Massively Parallel Processing (MPP): a technique that distributes data and processing across independent nodes, each with its own memory, so that thousands of rows of data are processed at the same time
Columnar Databases: Data Organization technique that stores data by columns instead of the traditional row based storage, which typically makes analytical reads over a few columns quicker
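The difference between row based and column based organization can be shown with plain Python lists. This is a conceptual sketch only; the sample records and the `to_columns` helper are hypothetical, and real columnar databases add compression and on-disk layouts that are not modeled here.

```python
# Row based organization: one record per entry, all fields together.
rows = [
    {"id": 1, "region": "East", "amount": 100.0},
    {"id": 2, "region": "West", "amount": 250.0},
    {"id": 3, "region": "East", "amount": 75.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


# Column based organization: one contiguous list per field.
columns = to_columns(rows)

# A row store has to visit every record to total one field ...
total_from_rows = sum(record["amount"] for record in rows)
# ... while a column store reads just the one column it needs.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the column layout simply avoids touching the `id` and `region` values when only `amount` is needed.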
Data Warehouse Components
Extract, Transform, Load
Acquire: Extract data from source systems
Profile: Collect statistics related to source data
Cleanse: Ensure data integrity
Transform: Apply business rules to source data
Integrate: Consolidate data from multiple sources
Load: Move the data to data storage
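The six steps can be sketched as small Python functions chained together. The source records, field names and the "major donor" business rule are all made up for illustration; an actual ETL tool would read from real source systems and write to a database.

```python
from collections import Counter

# Hypothetical extracts from two source systems.
crm_source = [
    {"contact_id": "1", "email": " Ann@Example.com ", "gift": "100"},
    {"contact_id": "2", "email": None, "gift": "50"},
]
web_source = [
    {"contact_id": "3", "email": "bob@example.com", "gift": "25"},
    {"contact_id": "1", "email": "ann@example.com", "gift": "100"},
]


def acquire(source):
    """Acquire: extract rows from a source system (here, an in-memory list)."""
    return [dict(row) for row in source]


def profile(rows):
    """Profile: collect simple statistics about the source data."""
    nulls = Counter(k for row in rows for k, v in row.items() if v is None)
    return {"row_count": len(rows), "null_counts": dict(nulls)}


def cleanse(rows):
    """Cleanse: drop rows without an email and normalize the rest."""
    return [{**row, "email": row["email"].strip().lower()}
            for row in rows if row["email"]]


def transform(rows):
    """Transform: apply a (made-up) business rule to each row."""
    out = []
    for row in rows:
        gift = float(row["gift"])
        out.append({"contact_id": int(row["contact_id"]), "email": row["email"],
                    "gift": gift, "major_donor": gift >= 100})
    return out


def integrate(*batches):
    """Integrate: consolidate batches, keeping the first row per contact_id."""
    merged = {}
    for batch in batches:
        for row in batch:
            merged.setdefault(row["contact_id"], row)
    return list(merged.values())


def load(rows, warehouse):
    """Load: move the rows into the target data storage."""
    warehouse.extend(rows)
    return len(rows)


warehouse = []
stats = profile(acquire(crm_source) + acquire(web_source))
batches = [transform(cleanse(acquire(src))) for src in (crm_source, web_source)]
loaded = load(integrate(*batches), warehouse)
```

Profiling runs on the raw extracts so that problems such as missing emails are counted before cleansing removes them.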
EDW: Enterprise Wide Single Version of Truth
ODS: Near real-time data for operational reporting
DM: Historical view of departmental data
Data Organization
Source Data: Contacts, Fund Raising, Activity Management
Reporting of operational performance metrics
Consolidated Reporting across business units
Departmental / Specific subject area analytics
Periodic Trend Analysis
Organized Adoption of analysis driven decision making
Master Data Management: central repository to hold the organization’s reference data
Information Access
Metadata: Data Definitions, Business Rules, Data Standards, Operational Metadata, Technical Metadata, Taxonomy
Data Service Architecture
Data Governance
Data governance is the exercise of authority and control over the management of data assets. The goals of a data governance initiative are:
• To define, approve, and communicate data strategies, policies, standards, architecture, procedures and metrics
• To enforce regulatory compliance and conformance to data policies, standards, architectures and procedures
• To sponsor, track and oversee the delivery of data management projects and services
• To manage and resolve data related issues
• To understand and promote the value of data assets
Activities:
1. Data Management Planning
   1. Understand Strategic Enterprise Data Needs
   2. Develop & Maintain Data Strategy
   3. Establish Data Professional Roles & Operations
   4. Identify & Appoint Data Stewards
   5. Establish Data Governance Organization
   6. Develop & Approve Data Policies, Standards & Procedures
   7. Review and Approve Data Architecture
   8. Plan & Sponsor Data Management Projects and Services
   9. Estimate Data Asset Values and Associated Costs
2. Data Management Control
   1. Supervise Data Professional Organization & Staff
   2. Coordinate Data Governance Activities
   3. Manage & Resolve Data Related Issues
   4. Monitor & Ensure Data Regulatory Compliance
   5. Enforce Conformance With Data Policies & Standards
   6. Oversee Data Management Projects and Services
   7. Communicate and Promote the Value of Data Assets
Suppliers:
• Business Executives
• IT Executives
• Data Stewards
• Regulatory Bodies
Inputs:
• Business Goals
• Business Strategies
• IT Objectives
• IT Strategies
• Data Needs
• Data Issues
• Regulatory Requirements
Outputs:
• Data Policies
• Data Standards
• Resolved Issues
• Data Mgmt. Projects & Services
• Quality Data & Information
• Recognized Data Values
Consumers:
• Data Producers
• Knowledge Workers
• Managers & Executives
• Data Professionals
• Customers
Metrics:
• Data Value
• Data Management Cost
• Achievement of Objective
• #Meetings Held, Decisions Made
• Steward Representation
• Data Professional Headcount
Participants:
• Executive Data Stewards
• Coordinating Data Stewards
• Business Data Stewards
• Data Professionals
• DM Leader
• CIO
Tools:
• Email
• Personal Productivity Tools
• Internet and Other Resources
Data Governance
Data Quality
Data Quality Phases
An organization’s data quality is determined based on:
• The accuracy of the data
• The completeness of the data
• The timeliness of the overall data availability
Data quality improvement is achieved through the following phases:
• Quality Assessment – this phase determines the current state of the data quality. This phase also involves loading the source data and profiling the data. During this phase it is easy to identify redundancies and outliers
• Design – this phase is used to design the quality process. The relationships between objects are finalized during this phase
• Transformation – any changes or updates required to the source data are made during this phase
• Monitoring – data monitoring is the process of examining data over time and sending alerts when the data violates any business rules that are set
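The assessment and monitoring phases can be sketched as rule checks over a set of records. The records, the three business rules and the fixed "as of" date below are assumptions made purely for illustration; each rule loosely stands in for completeness, accuracy and timeliness respectively.

```python
from datetime import date

AS_OF = date(2017, 4, 12)  # arbitrary fixed reference date for the example

records = [
    {"id": 1, "email": "ann@example.com", "age": 34, "updated": date(2017, 4, 1)},
    {"id": 2, "email": None, "age": 29, "updated": date(2017, 3, 30)},
    {"id": 3, "email": "bob@example.com", "age": 212, "updated": date(2016, 1, 15)},
]

# Hypothetical business rules: rule name -> check that must hold for a record.
rules = {
    "email_present": lambda r: r["email"] is not None,                   # completeness
    "age_plausible": lambda r: 0 <= r["age"] <= 120,                     # accuracy
    "updated_within_a_year": lambda r: (AS_OF - r["updated"]).days <= 365,  # timeliness
}


def assess(rows, rules):
    """Quality Assessment: share of records that pass each rule."""
    return {name: sum(1 for r in rows if check(r)) / len(rows)
            for name, check in rules.items()}


def monitor(rows, rules):
    """Monitoring: one alert (record id, rule name) per rule violation."""
    return [(r["id"], name)
            for r in rows
            for name, check in rules.items()
            if not check(r)]


scores = assess(records, rules)
alerts = monitor(records, rules)
```

Running `monitor` on a schedule against fresh data, and routing its alerts to the data stewards, would be one way to carry the Monitoring phase forward over time.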
Contents
• Introduction
• Information Management Paradigm
• Information Management Evolution Cycle
• Information Management Initiatives
• Data Governance
• Appendix
• Working Diagrams
• Implementation Techniques
Data Warehouse Architecture
Techniques to implement Data Warehouses
• Data is consolidated from multiple sources and loaded to a staging area in a single batch, usually during off-peak hours
• Business Rules are applied to the data in the staging area
• Transformed data is loaded to a common data repository
• Reporting applications reference the common data repository for historical reporting

• Data is consolidated from multiple sources and loaded directly to a common data storage multiple times during the day
• Business Rules are applied to the data while the data is loaded to the common data storage
• Data is loaded into a database structure that is very similar to the transactional data structure
• Reporting applications reference the common data repository but also have to apply rules in order to present aggregated data
Appendix
• Data is consolidated from multiple sources and loaded to a staging area in a single batch, usually during off-peak hours
• Business Rules are applied to the data in the staging area
• Reference data is loaded to specific reference tables (dimensions)
• Transformed transaction data is loaded to a subject data repository
• Reporting applications reference the historical transaction data through a common reference data repository. This presents a conformed view of the Reference Data for the Organization
Drivers for BI / Data Warehouse
Operational Efficiency
Required activities
Define business value
Business Intelligence
Decision Support Simulation
Managerial Reporting
Define info mgmt goals
Define current state
Competitive Advantage
Effective Data Management
Governance
Key Performance Metrics
Goals
Source Analysis Processes
While Business Intelligence is often the main driver for a Data Warehouse, data warehousing also supports higher levels of collaboration by providing a single version of the truth for shared data.
Data warehousing reduces IT cost by reducing the:
• Costs associated with information collection, consolidation, and dissemination
• Effort required for report development
• Maintenance of transactional systems
• Disruption caused by ad-hoc information requests
When information assets are well-managed, the business can be better positioned to focus on:
• Improving customer satisfaction
• Improving delivery of products and services
• Meeting increasing demands for regulatory reporting
• Managing risk across the organization
• Building repeatable processes and platforms
Enterprise Data Warehouse – Conceptual (WIP)