Data architecture is foundational to an information-based operational environment. Your data architecture organizes your data assets so they can be leveraged in your business strategy to create real business value. Important as this is, not all data architectures are used effectively. This webinar describes the use of data architecture as a basic analysis method. It demonstrates various uses of data architecture to inform, clarify, understand, and resolve aspects of a variety of business problems. Rather than showing how to architect data, your presenter, Dr. Peter Aiken, will show how to use data architecting to solve business problems. The goal is for you to be able to envision a number of uses for data architectures that will raise the perceived utility of this analysis method in the eyes of the business.
Copyright 2014 by Data Blueprint 1
Welcome: Data Architecture Requirements
Date: May 13, 2014
Time: 2:00 PM ET
Presented by: Peter Aiken, PhD
Two Most Commonly Asked Questions
1. Will I get copies of the slides after the event?
2. Is this being recorded so I can view it afterwards?
Get Social With Us!
• Like Us on Facebook: www.facebook.com/datablueprint
– Post questions and comments
– Find industry news, insightful content and event updates.
• Join the Group: Data Management & Business Intelligence
– Ask questions, gain insights and collaborate with fellow data management professionals
• Live Twitter Feed: Join the conversation!
– Follow us: @datablueprint, @paiken
– Ask questions and submit your comments: #dataed
Meet Your Presenter: Dr. Peter Aiken
• Internationally recognized data management thought-leader
– 30 years of experience
– Recipient of multiple international awards
– Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS, VCU (vcu.edu)
• (Past) Pres. DAMA International (dama.org)
• 9 books and dozens of articles
• Multi-year immersions with organizations as diverse as the US DoD, Deutsche Bank, Nokia, Wells Fargo, the Commonwealth of Virginia and Walmart
Presented by Peter Aiken, Ph.D.
Data Architecture Requirements
• Context: Data Management/DAMA/DM BoK/CDMP?
• What is Data/Information Architecture?
• Why is Data/Information Architecture Important?
• Data Engineering/Leverage
• Example: Software Package Implementation
• Example: Donation Center Processing
• Example: Text Mining/Analytics
• Take Aways, References & Q&A
Data Management Practices Hierarchy
You can accomplish Advanced Data Practices without becoming proficient in the Basic Data Management Practices; however, this will:
• Take longer
• Cost more
• Deliver less
• Present greater risk
Advanced Data Practices
• MDM
• Mining
• Big Data
• Analytics
• Warehousing
• SOA
Basic Data Management Practices
Data Program Management
Data Stewardship Data Development
Data Support Operations
Organizational Data Integration
Data Program Coordination
Organizational DM Practices
[Figure: flow among Organizational Strategies, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations, and Data Asset Use. Flow labels: Goals, Direction, Guidance, Feedback, Standard Data, Business Data, Integrated Models, Application Models & Designs, Implementation, Business Value.]
Descriptions shown on the figure:
• Data management processes and infrastructure
• Combining multiple assets to produce extra value
• Organizational-entity subject area data integration
• Provide reliable data access
• Achieve sharing of data within a business area
• Leverage data in organizational activities
Five Integrated DM Practices
• Data Program Coordination: Manage data coherently.
• Organizational Data Integration: Share data across boundaries.
• Data Stewardship: Assign responsibilities for data.
• Data Development: Engineer data delivery systems.
• Data Support Operations: Maintain data availability.
Data Management Functions: DAMA DM BoK & CDMP
• Published by DAMA International
– The professional association for Data Managers (40 chapters worldwide)
– DM BoK organized around primary data management functions focused around data delivery to the organization (more at dama.org)
– Organized around several environmental elements
• CDMP
– Certified Data Management Professional
– DAMA International and ICCP
– Membership in a distinct group made up of your fellow professionals
– Recognition for your specialized knowledge in a choice of 17 specialty areas
– Series of 3 exams
– For more information, please visit:
• http://www.dama.org/i4a/pages/index.cfm?pageid=3399
• http://iccp.org/certification/designations/cdmp
Data Modeling for Business Value
Inspired by: Karen Lopez http://www.information-management.com/newsletters/enterprise_architecture_data_model_ERP_BI-10020246-1.html?pg=2
• Goal must be shared IT/business understanding
– No disagreements = insufficient communication
• Data sharing/exchange is largely and highly automated and thus dependent on successful engineering
– It is critical to engineer a sound foundation of data modeling basics (the essence) on which to build advantageous data technologies
• Modeling characteristics change over the course of analysis
– Different model instances may be useful to different analytical problems
• Incorporate motivation (purpose statements) in all modeling
– Modeling is a problem-defining as well as a problem-solving activity; both are inherent to architecture
• Use of modeling is much more important than selection of a specific modeling method
• Models are often living documents
– The more easily it adapts to change, the resource utilization
• Models must have modern access/interface/search technologies
– Models need to be available in an easily searchable manner
• Utility is paramount
– Adding color and diagramming objects customizes models and allows for a more engaging and enjoyable user review process
Levels of Abstraction, Completeness and Utility
• Models more downward facing - detail
• Architecture is higher level of abstraction - integration
• In the past architecture attempted to gain complete (perfect) understanding
– Not timely
– Not feasible
• Focus instead on architectural components
– Governed by a framework
– More immediate utility
• http://www.architecturalcomponentsinc.com
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Data Architecture Management
Architecture
Architecture is both the process and product of planning, designing and constructing space that reflects functional, social, and aesthetic considerations. A wider definition may comprise all design activity from the macro-level (urban design, landscape architecture) to the micro-level (construction details and furniture). In fact, architecture today may refer to the activity of designing any kind of system and is often used in the IT world.
Architecture Representation
• Architectures are the symbolic representation of the structure, use and reuse of resources
• Common components are represented using standardized notation
• Are sufficiently detailed to permit both business analysts and technical personnel to separately read the same model and come away with a common understanding, and yet can still be developed effectively
Understanding
• A specific definition
– 'Understanding an architecture'
– Documented and articulated as a digital blueprint illustrating the commonalities and interconnections among the architectural components
– Ideally the understanding is shared by systems and humans
healthcare.gov
• 55 Contractors!
• "Anyone who has written a line of code or built a system from the ground-up cannot be surprised or even mildly concerned that Healthcare.gov did not work out of the gate," Standish Group International Chairman Jim Johnson said in a recent podcast.
• "The real news would have been if it actually did work. The very fact that most of it did work at all is a success in itself."
• Software programmed to access data using traditional data management technologies
• Data components incorporated "big data technologies"
http://www.slate.com/articles/technology/bitwise/2013/10/problems_with_healthcare_gov_cronyism_bad_management_and_too_many_cooks.html
Typically Managed Architectures
• Process Architecture
– Arrangement of inputs -> transformations = value -> outputs
– Typical elements: Functions, activities, workflow, events, cycles, products, procedures
• Systems Architecture
– Applications, software components, interfaces, projects
• Business Architecture
– Goals, strategies, roles, organizational structure, location(s)
• Security Architecture
– Arrangement of security controls in relation to IT Architecture
• Technical Architecture/Tarchitecture
– Relation of software capabilities/technology stack
– Structure of the technology infrastructure of an enterprise, solution or system
– Typical elements: Networks, hardware, software platforms, standards/protocols
• Data/Information Architecture
– Arrangement of data assets supporting organizational strategy
– Typical elements: specifications expressed as entities, relationships, attributes, definitions, values, vocabularies
Information Architectures
• The underlying (information) design principles upon which construction is based
– Source: http://architecturepractitioner.blogspot.com/
• … are plans, guiding the transformation of strategic organizational information needs into specific information systems development projects
– Source: Internet
• A framework providing a structured description of an enterprise’s information assets — including structured data and unstructured or semistructured content — and the relationship of those assets to business processes, business management, and IT systems.
– Source: Gene Leganza, Forrester 2009
• "Information architecture is a foundation discipline describing the theory, principles, guidelines, standards, conventions, and factors for managing information as a resource. It produces drawings, charts, plans, documents, designs, blueprints, and templates, helping everyone make efficient, effective, productive and innovative use of all types of information."
– Source: Information First by Roger & Elaine Evernden, 2003 ISBN 0 7506 5858 4 p.1.
• Defining the data needs of the enterprise and designing the master blueprints to meet those needs
– Source: DM BoK
Illustration by murdock23 @ http://designfestival.com/information-architecture-as-part-of-the-web-design-process/
What do you use an information architecture for?
Data Architecture – Better Definition
• All organizations have information architectures
– Some are better understood and documented (and therefore more useful to the organization) than others.
• Common vocabulary expressing integrated requirements ensuring that data assets are stored, arranged, managed, and used in systems in support of organizational strategy [Aiken 2010]
Vocabulary is Important: Tank, Tanks, Tankers, Tanked
How one inventory item proliferates data throughout the chain
One item: 555 subassemblies & subcomponents; 17,659 repair parts or consumables

System      Total items   Attributes/item   Total attributes
System 1    18,214        75                1,366,050
System 2    47            15+               720
System 3    16,594        73                1,211,362
System 4    8,535         16                136,560
System 5    15,959        22                351,098

Total for the five systems shown above: 59,350 items; 179 unique attributes; 3,065,790 values
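The totals on this slide can be re-derived from the per-system rows. The sketch below is only an illustration: the dictionary layout and function names are assumptions, and System 2's "15+" is entered as 15, so its row is not expected to multiply out exactly. With the figures as transcribed here, the attribute totals sum to the slide's 3,065,790 values, while the item counts sum to 59,349, one fewer than the 59,350 shown on the slide.

```python
# Figures transcribed from the slide. "15+" for System 2 is entered as 15,
# which is an assumption; the slide does not give an exact value.
systems = {
    "System 1": {"items": 18_214, "attrs_per_item": 75, "total_attrs": 1_366_050},
    "System 2": {"items": 47, "attrs_per_item": 15, "total_attrs": 720},
    "System 3": {"items": 16_594, "attrs_per_item": 73, "total_attrs": 1_211_362},
    "System 4": {"items": 8_535, "attrs_per_item": 16, "total_attrs": 136_560},
    "System 5": {"items": 15_959, "attrs_per_item": 22, "total_attrs": 351_098},
}


def summarize(rows):
    """Return (total items, total attribute values) across all systems."""
    total_items = sum(r["items"] for r in rows.values())
    total_values = sum(r["total_attrs"] for r in rows.values())
    return total_items, total_values


def exact_rows(rows):
    """Names of systems where items x attributes/item equals the stated total."""
    return [
        name
        for name, r in rows.items()
        if r["items"] * r["attrs_per_item"] == r["total_attrs"]
    ]


total_items, total_values = summarize(systems)
print(total_items, total_values)
print(exact_rows(systems))
```

A check like this is a cheap way to catch transcription slips before figures are reused in a business case.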
Business Value: Agency units are carrying $1.5 billion worth of expired inventory
• Generates unnecessary costs & negative impacts on operations, including:
– Resources are focused on non-value added tasks of maintaining obsolete inventory, which creates distractions to the agency’s main mission
• Storage
– Physical/real estate needed to house items
• Handling
– Includes transportation and human resources dedicated to moving, maintaining, counting and securing outdated inventory
• Opportunity
– Inventory could be returned to manufacturer or sold to free up financial assets for more needed and critical supplies
• Systemic
– Cost of inventorying information and maintaining paper or electronic records which should be used to support mission-critical acquisitions and distribution
• Maintenance
– Repairing of expired items
Why Architectural Models?
• Would you build a house without an architecture sketch?
– A model is the sketch of the system to be built in a project.
• Would you like to have an estimate of how much your new house is going to cost?
– Your model gives you a very good idea of how demanding the implementation work is going to be!
• If you hired a set of constructors from all over the world to build your house, would you like them to have a common language?
– A model is the common language for the project team.
• Would you like to verify the proposals of the construction team before the work gets started?
– Models can be reviewed before thousands of hours of implementation work are done.
• If it was a great house, would you like to build something rather similar again, in another place?
– It is possible to implement the system on various platforms using the same model.
• Would you drill into a wall of your house without a map of the plumbing and electric lines?
– Models document the system built in a project. This makes life easier for support and maintenance!
Architecture Example
Poor Quality Foundation
What they think they are purchasing!
Context Diagrams Show System Boundaries
Too Much Detail
Web Developers Understand IA
http://www.jeffkerndesign.com
Database Architecture Focus
[Figure, two panels: Programs A through I grouped into Application domains 1, 2 and 3, each program with its own Data. One panel marks the focus of a software architecture engineering effort; the other marks the focus of a database architecture engineering effort.]
Data Architecture Focus has Greater Potential Business Value
• Broader focus than either software architecture or database architecture
• Analysis scope is on the system wide use of data
• Problems caused by data exchange or interface problems
• Architectural goals more strategic than operational
Strategic Information Use: Prerequisites
[Built on definitions from Dan Appleton 1983]
[Figure labels: Fact, Meaning, Data, Request, Information, Strategic Use, Intelligence]
1. Each FACT combines with one or more MEANINGS.
2. Each specific FACT and MEANING combination is referred to as a DATUM.
3. An INFORMATION is one or more DATA that are returned in response to a specific REQUEST.
4. INFORMATION REUSE is enabled when one FACT is combined with more than one MEANING.
5. INTELLIGENCE is INFORMATION associated with its STRATEGIC USES.
6. DATA/INFORMATION must be formally arranged into an ARCHITECTURE.
Wisdom & knowledge are often used synonymously
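The FACT, MEANING, DATUM and INFORMATION definitions above can be sketched in a few lines of code. This is a minimal illustration rather than anything shown in the webinar: the class name, the helper functions, and the sample meanings (such as "invoice due date") are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    """A DATUM: one specific FACT combined with one MEANING."""
    fact: str
    meaning: str


def answer(request, data):
    """INFORMATION: the data returned in response to a specific REQUEST.

    `request` is any predicate over a Datum (an assumption of this sketch).
    """
    return [d for d in data if request(d)]


def reused_facts(data):
    """Facts combined with more than one meaning, which enables reuse."""
    meanings_by_fact = {}
    for d in data:
        meanings_by_fact.setdefault(d.fact, set()).add(d.meaning)
    return {fact for fact, meanings in meanings_by_fact.items() if len(meanings) > 1}


# Hypothetical sample data: the same fact carries two different meanings.
data = [
    Datum("2014-05-13", "webinar date"),
    Datum("2014-05-13", "invoice due date"),
    Datum("971", "labor hours"),
]

info = answer(lambda d: d.fact == "2014-05-13", data)
print(info)
print(reused_facts(data))
```

Keeping fact and meaning as separate fields is what makes the reuse in definition 4 visible: the same stored fact can answer more than one request.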
How are data structures expressed as architectures?
[Figure: components A, B, C and D shown in three different arrangements]
• Details are organized into larger components
• Larger components are organized into models
• Models are organized into architectures
How are Data Models Expressed as Architectures?
(The levels run from more granular to more abstract.)
• Attributes are organized into entities/objects
– Attributes are characteristics of "things"
– Entities/objects are "things" whose information is managed in support of strategy
– Examples
• Entities/objects are organized into models
– Combinations of attributes and entities are structured to represent information requirements
– Poorly structured data constrains organizational information delivery capabilities
– Examples
• Models are organized into architectures
– When building new systems, architectures are used to plan development
– More often, data managers do not know what existing architectures are and, therefore, cannot make use of them in support of strategy implementation
– Why no examples?
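The nesting described above (attributes into entities, entities into models, models into architectures) maps naturally onto nested data types. The following is a sketch under assumptions: the class names follow the slide's vocabulary, but the sample models ("Billing", "Clinical") and the `shared_entities` helper are hypothetical and only meant to show why the architecture level is where integration questions get answered.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A "thing" whose information is managed; attributes are its characteristics."""
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class Model:
    """Entities structured to represent a set of information requirements."""
    name: str
    entities: list[Entity] = field(default_factory=list)


@dataclass
class Architecture:
    """Models organized together so their overlaps can be analyzed."""
    name: str
    models: list[Model] = field(default_factory=list)

    def shared_entities(self):
        """Entity names that appear in more than one model (integration points)."""
        seen, shared = set(), set()
        for model in self.models:
            for name in {e.name for e in model.entities}:
                if name in seen:
                    shared.add(name)
                seen.add(name)
        return shared


# Hypothetical example models.
billing = Model("Billing", [Entity("Patient", ["id", "name"]), Entity("Invoice", ["amount"])])
clinical = Model("Clinical", [Entity("Patient", ["id", "birth_date"]), Entity("Diagnosis", ["code"])])
architecture = Architecture("Hospital data architecture", [billing, clinical])

print(architecture.shared_entities())
```

An entity that shows up in several models, such as Patient here, is exactly the kind of overlap that stays invisible when models are never organized into an architecture.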
Architectures Comprise a Network of Networks
How do data structures support organizational strategy?
• Consider the opposite question
– Were your systems explicitly designed to be integrated or otherwise work together?
– If not, then what is the likelihood that they will work well together?
– In all likelihood your organization is spending between 20-40% of its IT budget compensating for poor data structure integration
– They cannot be helpful as long as their structure is unknown
• Two answers
– Achieving efficiency and effectiveness goals
– Providing organizational dexterity for rapid implementation
[Figure labels: Data; Computers; Human resources; Communication facilities; Software; Management responsibilities; Policies, directives, and rules]
What Questions Can Architectures Address?
• How and why do the components interact?
• Where do they go?
• When are they needed?
• Why and how will the changes be implemented?
• What should be managed organization-wide and what should be managed locally?
• What standards should be adopted?
• What vendors should be chosen?
• What rules should govern the decisions?
• What policies should guide the process?
Data Architectures produce and are made up of information models that are developed in response to organizational needs
[Figure: Organizational Needs become instantiated and integrated into a Data/Information Architecture; the architecture authorizes and articulates Information System Requirements; those requirements satisfy specific organizational needs.]
Data Leverage
[Figure labels: Less ROT, Technologies, Process, People]
• Permits organizations to better manage their sole non-depletable, non-degrading, durable, strategic asset: data
– within the organization, and
– with organizational data exchange partners
• Leverage
– Obtained by implementation of data-centric technologies, processes, and human skill sets
– Increased by elimination of data ROT (redundant, obsolete, or trivial)
• The bigger the organization, the greater the potential leverage
• Treating data more asset-like simultaneously
1. lowers organizational IT costs and
2. increases organizational knowledge worker productivity
Architecture Evolution Framework
[Figure: Conceptual, Logical and Physical models, each either Validated or Not Validated]
Every change can be mapped to a transformation in this framework!
Application-Centric Development
Original articulation from Doug Bagley @ Walmart
[Figure: layers labeled Strategy, Goals/Objectives, Systems/Applications, Network/Infrastructure, Data/Information]
• In support of strategy, organizations develop specific goals/objectives
• The goals/objectives drive the development of specific systems/applications
• Development of systems/applications leads to network/infrastructure requirements
• Data/information are typically considered after the systems/applications and network/infrastructure have been articulated
• Problems with this approach:
– Ensures data is formed to the applications and not around the organization-wide information requirements
– Processes are narrowly formed around applications
– Very little data reuse is possible
Data-Centric Development
Original articulation from Doug Bagley @ Walmart
[Figure: layers labeled Strategy, Goals/Objectives, Data/Information, Network/Infrastructure, Systems/Applications]
• In support of strategy, the organization develops specific goals/objectives
• The goals/objectives drive the development of specific data/information assets with an eye to organization-wide usage
• Network/infrastructure components are developed supporting organizational data use
• Development of systems/applications is derived from the data/network architecture
• Advantages of this approach:
– Data/information assets are developed from an organization-wide perspective
– Systems support organizational data needs and complement organizational process flows
– Maximum data/information reuse
Why is Data Architecture Important?
• Poorly understood
– Data architecture asset value is not well understood
• Inarticulately explained
– Little opportunity to obtain learning and experience
• Indirectly experienced
– Cost organizations millions each year in productivity, redundant and siloed efforts
– Example: Poorly thought out software purchases
Architectural Work Product
Components may be defined as:
• The intersection of common business functionality and the subsets of the organizational technology and data architectures used to implement that functionality
• Component definition is an important activity because CM2 component engineering is focused on an entire component as an analysis unit. A concrete example of a component might be
– The business processes, the technology and the data supporting organizational human resource benefits operations. This same component could be described simply as the "PeopleSoft™ version 7.5 benefits module implemented on Windows 95." This example illustrates the integration of the three primary PeopleSoft metadata structures describing the: business processes used to organize the work flow, menu navigation required to access system functionality, and data which, when combined with meanings provided by the panels, provided information to the knowledge workers.
Engineering Standards
Hierarchical System Functional Decomposition
System Process
  Process 1
    Subprocess 1.1
    Subprocess 1.2
    Subprocess 1.3
  Process 2
  Process 3
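A hierarchical functional decomposition like the one on this slide can be held as a nested mapping and walked recursively. The sketch below is an illustration only: representing the hierarchy as nested dictionaries, and the function names `walk` and `leaves`, are assumptions, not part of the presentation.

```python
# The decomposition from the slide as a nested mapping: each key is a
# process, each value holds its subprocesses (empty dict = no further levels).
decomposition = {
    "System Process": {
        "Process 1": {
            "Subprocess 1.1": {},
            "Subprocess 1.2": {},
            "Subprocess 1.3": {},
        },
        "Process 2": {},
        "Process 3": {},
    }
}


def walk(tree, level=1):
    """Yield (level, name) pairs in depth-first order."""
    for name, children in tree.items():
        yield level, name
        yield from walk(children, level + 1)


def leaves(tree):
    """Yield the names of processes that are not decomposed any further."""
    for name, children in tree.items():
        if children:
            yield from leaves(children)
        else:
            yield name


outline = list(walk(decomposition))
for level, name in outline:
    print("  " * (level - 1) + name)
```

The same two functions work unchanged on deeper or wider decompositions, which makes it easy to compare how far different parts of a system have been analyzed.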
A three-level decomposition of the model views from the governmental pay and personnel scenario
Level 1: Pay and personnel
Level 2: Employment
– Level 3: Recruitment; Selection
Level 2: Personnel administration
– Level 3: Employee relations; Employee compensation changes; Salary planning; Classification and pay; Job evaluation; Benefits administration; Health insurance plans; Flexible spending accounts; Group life insurance; Retirement plans
Level 2: Payroll
– Level 3: Payroll administration; Payroll processing; Payroll interfaces
Level 2: Development
– Level 3: N/A
Level 2: Training administration
– Level 3: Career planning and skills inventory; Work group activities
Level 2: Health and safety
– Level 3: Accidents and workers compensation; Health and safety programs
A relatively complex model view decomposition
Health care system
1 Patient administration: 1.1 Registration; 1.2 Admission; 1.3 Disposition; 1.4 Transfer; 1.5 Medical record; 1.6 Administration; 1.7 Patient billing; 1.8 Patient affairs; 1.9 Patient management
2 Patient appointments and scheduling: 2.1 Create or maintain schedules; 2.2 Appoint patients; 2.3 Record patient encounter; 2.4 Identify patient; 2.5 Identify health care provider
3 Nursing: 3.1 Patient care; 3.2 Unit management
4 Laboratory: 4.1 Results reporting; 4.2 Specimen processing; 4.3 Result entry processing; 4.4 Laboratory management; 4.5 Workload support
5 Pharmacy: 5.1 Unit dose dispensing; 5.2 Controlled Drug Inventory; 5.3 Outpatient
6 Radiology: 6.1 Scheduling; 6.2 Exam processing; 6.3 Exam reporting; 6.4 Special interest and teaching; 6.5 Radiology workload reporting
7 Clinical dietetics: 7.1 Establish parameters; 7.2 Receive diet orders
8 Order entry and results: 8.1 Reporting; 8.2 Enter and maintain orders; 8.3 Obtain results; 8.4 Review patient information; 8.5 Clinical desktop
9 System management: 9.1 Logon and security management; 9.2 Archive run Management; 9.3 Communication software; 9.4 Management; 9.5 Site management
10 Facility quality assurance: 10.1 Provider credentialing; 10.2 Monitor and evaluation
Data model is comprised of model views
[Figure: DSS at the center, surrounded by "Governors", Taxpayers, Clients, Vendors, Program Deliver]
DSS Strategic Data Model
• Taxpayer view
• Client view
• Governance view
• Program Delivery view
• Vendor view
Taxpayer view
• Payments, Taxpayers, Social Service Programs, Taxpayer Benefits
Client view
• Payments, Clients, Client Benefits, Local Welfare Agencies
Governance view
• Payments, Social Service Programs, Governmental Resources, Governance, Governments, State Board of Social Services, Policy Approval
Program Delivery view
• Social Service Programs, Clients, Service Delivery Partners, Local Welfare Agencies
Vendor view
• Payments, Social Service Programs, Clients, Local Welfare Agencies, Goods and Services, Vendors
DSS Strategic Level Data Model
• Governmental Resources, Governance, Governments, Payments, Taxpayers, State Board of Social Services, Social Service Programs, Clients, Client Benefits, Taxpayer Benefits, Policy Approval, Service Delivery Partners, Local Welfare Agencies, Goods and Services, Vendors
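One way to see how the five model views compose into the strategic level model is to treat each view as a set of entity names and take their union. The sketch below is illustrative only: the set representation and function names are assumptions, and the entity lists are read off the slide labels with "Wellfare" spelled "Welfare".

```python
# Each model view as the set of entities that appear on its slide.
views = {
    "Taxpayer": {"Payments", "Taxpayers", "Social Service Programs", "Taxpayer Benefits"},
    "Client": {"Payments", "Clients", "Client Benefits", "Local Welfare Agencies"},
    "Governance": {
        "Payments", "Social Service Programs", "Governmental Resources", "Governance",
        "Governments", "State Board of Social Services", "Policy Approval",
    },
    "Program Delivery": {
        "Social Service Programs", "Clients", "Service Delivery Partners",
        "Local Welfare Agencies",
    },
    "Vendor": {
        "Payments", "Social Service Programs", "Clients", "Local Welfare Agencies",
        "Goods and Services", "Vendors",
    },
}


def strategic_model(model_views):
    """The strategic model as the union of the entities in every view."""
    return set().union(*model_views.values())


def views_containing(entity, model_views):
    """Sorted names of the views in which an entity appears."""
    return sorted(name for name, entities in model_views.items() if entity in entities)


model = strategic_model(views)
print(len(model))
print(views_containing("Payments", views))
```

Entities that appear in many views, such as Payments, are the natural candidates for organization-wide management; entities confined to one view can more safely be managed locally.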
Package Implementation Example
Challenge
• "Green screen" legacy system to be replaced with Windows Icons Mice Pointers (WIMP) interface; and
• Major changes to operational processes
– 1 screen to 23 screens
• Management didn't think workforce could adjust to simultaneous changes
– Question: "How big a change will it be to replace all instances of person_identifier with social_security_number?"
• Answer:
– (from "big" consultants) "Not a very big change."
PeopleSoft Process Metadata
• Home Page Name (relates to one or more)
• Business Process Name (relates to one or more)
• Business Process Component Name (relates to one or more)
• Business Process Component Step Name
Example Query Outputs
Columns: Home Page Name, Business Process Name, Business Process Component Name, Business Process Component Step Name
PeopleSoft Metadata Structure
[Figure: metadata entities with their instance counts, joined by connectors that carry relationship counts]
Entities: processes (39), homepages (7), menugroups (8), components (180), stepnames (822), menunames (86), panels (1421), menuitems (1149), menubars (31), fields (7073), records (2706), parents (264), reports (347), children (647)
Connector counts as shown: (41), (8), (182), (847), (949), (86), (281), (1259), (1916), (5873), (264), (647), (708), (647), (25906), (347)
Business Value - Better Decisions

Quantity   System Component                    Time to make change   Labor Hours
1,400      Panels                              15 minutes            350
1,500      Tables                              15 minutes            375
984        Business process component steps   15 minutes            246
Total                                                                971

X $200/hour    $194,200
X 5 upgrades   $1,000,000
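The estimate on this slide is simple arithmetic, and it can be reproduced directly. The sketch below uses only the figures from the table; the variable and function names are assumptions. Note that five upgrades at $194,200 come to $971,000, which the slide rounds to $1,000,000.

```python
from fractions import Fraction

MINUTES_PER_CHANGE = 15   # time to make one change, from the slide
RATE_PER_HOUR = 200       # dollars per labor hour, from the slide
UPGRADES = 5              # number of upgrades, from the slide

# Component counts from the slide.
components = {
    "Panels": 1_400,
    "Tables": 1_500,
    "Business process component steps": 984,
}


def labor_hours(count, minutes=MINUTES_PER_CHANGE):
    """Hours needed to change `count` components at `minutes` each."""
    return Fraction(count * minutes, 60)


hours = {name: labor_hours(count) for name, count in components.items()}
total_hours = sum(hours.values())
cost_per_upgrade = total_hours * RATE_PER_HOUR
cost_all_upgrades = cost_per_upgrade * UPGRADES

print(total_hours, cost_per_upgrade, cost_all_upgrades)
```

Having the component counts available from the metadata is what turns "not a very big change" into a number that management can weigh.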
A National Cancer Institute
• This Virginia cancer center is a leader in shaping the fight against cancer
• Over 500 researchers and staff tend to over 12,000 patients annually
• This requires robust information management and analytical services
• The problem: It takes 1 month to run a report on an incident, i.e. a patient’s hospital visit, that shows all touch points
Current State Assessment
[Figure: current data flows]
• Sources: Other Departments, Cancer Registry, Claims Database, File Export, Physician Invoices, Hospital Claims Text Files
• Data sets: Patient (Hospital, Physician, Registry); Billing Data (Hospital, Physician); Diagnoses (Hospital, Physician, Registry); Physicians (Hospital, Physician)
• Tools and transfers: SQL, SAS, Access, Excel, Word, Text Files, FTP, FTP or Email
Conceptual Target Architecture
[Figure: target data flow. Sources: Other Departments, Cancer Registry, Hospital Claims, Physician Invoices, loaded via SSIS. Staging: Patient Demographics; Billing Data (Hospital, Physician); Diagnoses (Hospital, Physician, Registry); Physicians (Hospital, Physician). Consolidated/Sandbox (via SSIS, SSAS): Patient (Consolidated), Physicians (Consolidated), Diagnoses (Consolidated), RPT. Delivery: SSRS, SharePoint, Excel; one-off reports and reusable reports.]
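The consolidation step in the target architecture (separate hospital, physician, and registry patient stores feeding one consolidated patient store) can be illustrated with a small sketch. Everything specific here is an assumption for illustration: the sample records, the field names, the shared `mrn` key, and the rule that earlier sources win on conflicts are hypothetical and are not described in the slides, which show SSIS performing this work.

```python
from collections import defaultdict

# Hypothetical extracts from the three patient stores; the field names and
# the shared "mrn" key are assumptions, not taken from the slides.
hospital = [{"mrn": "A1", "name": "Pat Doe", "dob": "1960-01-02"}]
physician = [{"mrn": "A1", "name": "Pat Doe", "phone": "555-0100"}]
registry = [{"mrn": "B2", "name": "Sam Roe", "dob": "1955-07-09"}]

def consolidate(*sources):
    """Merge records that share a key; earlier sources win on conflicts."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            target = merged[record["mrn"]]
            for field, value in record.items():
                # setdefault keeps the first value seen for each field.
                target.setdefault(field, value)
    return dict(merged)

patients = consolidate(hospital, physician, registry)
for mrn, record in patients.items():
    print(mrn, record)
```

A single consolidated record per patient is what lets a report on one incident draw on every touch point without manual matching across files.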
Reversing The Measures
• Currently:
  – Analysts spend 80% of their time manipulating data and 20% of their time analyzing data
  – Hidden productivity bottlenecks
• After rearchitecting:
  – Analysts spend less time manipulating data and more of their time analyzing data
  – Significant improvements in knowledge worker productivity
[Chart: share of analyst time (0 to 100) spent on Manipulation vs. Analysis, Current vs. Improved]
A 20-point improvement (analysis time rising from 20% to 40%) results in a doubling of productivity!
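The doubling claim follows from simple arithmetic, sketched below. The 20% starting share for analysis comes from the slide; reading "a 20% improvement" as a shift of 20 percentage points from manipulation to analysis is an interpretation, and the function name is illustrative.

```python
def analysis_gain(current_share, shift_points):
    """Ratio of analysis time after the shift to analysis time before it.

    Both arguments are percentage points of an analyst's total time.
    """
    improved_share = current_share + shift_points
    return improved_share / current_share

current_analysis = 20   # percent of time spent analyzing, from the slide
shift = 20              # assumed: points moved from manipulation to analysis

ratio = analysis_gain(current_analysis, shift)
print(f"Analysis time grows by a factor of {ratio:.1f}")
```

The same function shows why small shifts matter when the starting share is low: the gain is measured against the 20% base, not against the whole working week.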
Results: It is not always about money
• Solution:
  – Integrate multiple databases into one to create a holistic view of data
  – Automate manual processes
• Results:
  – Data is passed safely and effectively
  – Inconsistencies, redundancies, and corruption are eliminated
  – Ability to cross-analyze
  – Significantly reduced turnaround time for matching patients with potential donors, increasing the potential to make a life-saving connection in a manner that is faster, safer, and more reliable
  – Increased safe matches from 3 out of 10 to 6 out of 10
[Figure labels: Engineering, Architecture]
Engineering/Architecting Relationship
• Architecting is used to create and build systems too complex to be treated by engineering analysis alone
• Architects require technical details as the exception
• Engineers develop the technical designs
• Craftsmen deliver components supervised by:
  – Building Contractor
  – Manufacturer
USS Midway & Pancakes
What is this?
• It is tall
• It has a clutch
• It was built in 1942
• It is still in regular use!
Improving Data Quality during System Migration
• Challenge
  – Millions of NSN/SKUs maintained in a catalog
  – Key and other data stored in clear text/comment fields
  – Original suggestion was manual approach to text extraction
  – Left the data structuring problem unsolved
• Solution
  – Proprietary, improvable text extraction process
  – Converted non-tabular data into tabular data
  – Saved a minimum of $5 million
  – Literally person-centuries of work

Architecture Derived: Diminishing Returns Determination
Week #  Unmatched Items (% Total)  Ignorable Items (% Total)  Items Matched (% Total)
1       31.47%                     1.34%                      N/A
2       21.22%                     6.97%                      N/A
3       20.66%                     7.49%                      N/A
4       32.48%                     11.99%                     55.53%
…       …                          …                          …
14      9.02%                      22.62%                     68.36%
15      9.06%                      22.62%                     68.33%
16      9.53%                      22.62%                     67.85%
17      9.50%                      22.62%                     67.88%
18      7.46%                      22.62%                     69.92%
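One way to read the weekly figures for diminishing returns is to look at the week-over-week change in the matched percentage and flag a plateau when recent gains stay small. The sketch below uses only the week 14 to 18 values from the table. The `window` and `min_gain` thresholds and the function names are assumptions for illustration; the slides do not say how the determination was actually made.

```python
# "Items Matched" percentages for weeks 14-18, copied from the table.
matched_by_week = {14: 68.36, 15: 68.33, 16: 67.85, 17: 67.88, 18: 69.92}

def weekly_gains(series):
    """Week-over-week change in matched percentage, keyed by the later week."""
    weeks = sorted(series)
    return {w: series[w] - series[prev] for prev, w in zip(weeks, weeks[1:])}

def has_plateaued(series, window=3, min_gain=1.0):
    """True when none of the last `window` weekly gains reaches `min_gain`.

    `window` and `min_gain` are assumed thresholds, not values from the slides.
    """
    gains = list(weekly_gains(series).values())
    recent = gains[-window:]
    return len(recent) == window and all(g < min_gain for g in recent)

through_week_17 = {w: v for w, v in matched_by_week.items() if w <= 17}
plateau_at_17 = has_plateaued(through_week_17)
plateau_at_18 = has_plateaued(matched_by_week)

for week, gain in weekly_gains(matched_by_week).items():
    print(f"Week {week}: {gain:+.2f} points")
print("Plateau through week 17:", plateau_at_17)
print("Plateau through week 18:", plateau_at_18)
```

With these assumed thresholds the rule flags a plateau through week 17 but not once week 18 is included, which shows how sensitive a simple cutoff is and why the thresholds would need to be agreed with the business.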
Quantitative Benefits
Time needed to review all NSNs once over the life of the project:
  NSNs                                            2,000,000
  Average time to review & cleanse (in minutes)   5
  Total time (in minutes)                         10,000,000
Time available per resource over a one-year period:
  Work weeks in a year                            48
  Work days in a week                             5
  Work hours in a day                             7.5
  Work minutes in a day                           450
  Total work minutes/year                         108,000
Person-years required to cleanse each NSN once prior to migration:
  Minutes needed                                  10,000,000
  Minutes available per person/year               108,000
  Total person-years                              92.6
Resource cost to cleanse NSNs prior to migration:
  Avg salary for SME per year (not including overhead)              $60,000.00
  Projected years required to cleanse/Total DLA person-years saved  93
  Total cost to cleanse/Total DLA savings to cleanse NSNs           $5.5 million
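The person-year and cost figures above can be reproduced step by step. This is a sketch using only the inputs stated on the slide; the variable names are illustrative. The product of 93 person-years and $60,000 is $5.58 million, which the slide reports as $5.5 million.

```python
# Inputs from the slide; variable names are illustrative.
nsn_count = 2_000_000
minutes_per_nsn = 5
total_minutes = nsn_count * minutes_per_nsn

work_weeks_per_year = 48
work_days_per_week = 5
work_hours_per_day = 7.5
minutes_per_day = work_hours_per_day * 60
minutes_per_year = work_weeks_per_year * work_days_per_week * minutes_per_day

person_years = total_minutes / minutes_per_year
rounded_years = round(person_years)

sme_salary = 60_000   # per year, not including overhead
cost_to_cleanse = rounded_years * sme_salary

print(f"Total minutes needed: {total_minutes:,}")
print(f"Minutes available per person-year: {minutes_per_year:,.0f}")
print(f"Person-years: {person_years:.1f} (rounded to {rounded_years})")
print(f"Cost to cleanse manually: ${cost_to_cleanse:,}")
```

Laying the calculation out this way makes each assumption (review time per NSN, working days, salary) visible and easy to challenge or adjust.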
Take Aways
• What is an information architecture?
  – A structure of data-based information assets supporting implementation of organizational strategy
  – Most organizations have data assets that are not supportive of strategies, i.e., information architectures that are not helpful
  – The really important question is: how can organizations more effectively use their information architectures to support strategy implementation?
• What is meant by use of an information architecture?
  – Application of data assets towards organizational strategic objectives
  – Assessed by the maturity of organizational data management practices
  – Results in increased capabilities, dexterity, and self-awareness
  – Accomplished through use of data-centric development practices (including taxonomies, stewardship, and repository use)
• How does an organization achieve better use of its information architecture?
  – Continuous re-development; the starting point isn't the beginning
  – Information architecture components must typically be reengineered
  – Using an iterative, incremental approach, typically focusing on one component at a time and applying formal transformations
Upcoming Events
June Webinar: Monetizing Data Management
June 10, 2014 @ 2:00 PM ET/11:00 AM PT
Sign up here:
• www.datablueprint.com/webinar-schedule
• www.Dataversity.net
Book: Monetizing Data Management: Unlocking the Value in Your Organization’s Most Important Asset, by Peter Aiken with Juanita Billings, foreword by John Bottega
Questions?