Statistics South Africa Case Study - ESDMF Project
Data Management and Information Delivery (DMID)
Case Study
Sibongile Madonsela, Matile Malimabe, Bubele Vakalisa
Statistics South Africa
UNECE Workshop on the Common Metadata Framework
Vienna, Austria, July 4-7, 2007
Introduction
Programme Providing Frame for Stats SA Projects
• Providing Relevant Statistical Information to Meet User Needs
• Enhancing the Quality of Products and Services
• Developing and Promoting Statistical Coordination and Partnerships
• Building Human Capacity
This project supports the strategic theme “Enhancing the Quality of Products and Services”. Within the DMID, the metadata management system addresses this strategic theme.
Overall Project Objective
• The metadata management system forms part of the organisation’s broader objective to continuously improve the quality of its products
• The survey metadata tool consists of elements that provide the overall description of a statistical survey
• The survey metadata component is fashioned along the lines of Statistics Canada’s Integrated Metadata Database (IMDB), Metastat
Organisation Chart: Stats SA and current projects
The Data Management and Information Delivery (DMID) project (magenta shaded box) is located within the Data Management and Technology Division
The yellow shaded boxes indicate some of the ongoing projects that are concurrent with the DMID project.
Statistician-General
Economic Statistics
Population and Social Statistics
Quality and Integration
Statistical Support and Informatics
Corporate Services
Population Census
Social Statistics
Health and Vital Statistics
Industry and Trade Statistics
Employment and Price Statistics
Financial Statistics
Geography
System of Registers
Data Management and Technology
Statistical Information Services
Provincial Coordination
National Statistics System Division
Methodologies and Standards
Integrative Analysis
National Accounts
Finance and Provisioning
Human Resource Management
SG Support and Strategic Planning
Facilities Management, Security and Logistics
Human Capacity Development
Programme Office
Internal Audit
Data Management and Information Delivery Project
Census Comm. Project
Labour Force Survey Re-engineering Project
CPI Project
Census 2011 Project
SAS 9 Migration Project
Organisation Chart: DMID project, including supplier’s resources
Prescient Business Technologies (PBT): the supplier to the DMID project, developing the ESDMF system
ESDMF: End to end Statistical Data Management Facility
PM: Project Manager
DMID Project Structure at a high level
Standards Development and Implementation
• Led by the Chief Standards Officer
• Develops policies, standards and procedures before components of the facility can be implemented, using the Standards Life Cycle
• For Phase One, the policies for Data Quality and Metadata were implemented
• For future phases, related policies will be developed
End to end Statistical Data Management Facility (ESDMF)
• Led by the Technical Lead/Project Manager
• Uses the policies developed by the standards team to generate requirements for the system, which is then implemented using software technologies
Standards Life Cycle
Policies, standards and procedures are developed before components of the facility can be implemented, using the Standards Life Cycle
Conceptual components of the ESDMF
1. Need
2. Design
3. Build
4. Collect
5. Process
6. Analyse
7. Disseminate
End to end Statistical Data Management Facility (ESDMF)
• Uses the policies developed by the standards team to generate requirements for the system, which is then implemented using software technologies
Statistical Metadata in Each Phase of the Statistical Cycle
Description
• Metadata is used during various stages of statistical production as essential input to production processes
• The production processes, in turn, produce metadata
• Metadata is also important in documenting the trail of activities during the statistical production process
List of Metadata Groups (Categories of Metadata)
Survey Metadata (Dataset Metadata)
• Used to describe, access and update datasets and data structures
• Called survey rather than dataset metadata because some of the metadata, such as information about “the population which the data describe”, refers to the broader aspects of the survey, and not only the dataset
Definitional Metadata
• Describes the concepts used in producing statistical data
• These concepts are often encapsulated into measurement variables used to collect statistical data
Methodological Metadata
• Relates to the procedures by which data are collected and processed
• These include sampling, collection methods, editing processes, etc.
System Metadata
• Refers to active metadata used to drive automated operations
• Examples include file size, access methods to databases, etc.
Operational Metadata
• Metadata arising from and summarising the results of implementing the procedures
• Examples include respondent burden, response rates, edit failure rates, costs and other quality and performance indicators
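As an illustration only, the five metadata groups above could be represented as an enumeration, with the example items from this slide filed under their group. The names MetadataGroup, EXAMPLES and group_of are hypothetical and are not taken from the ESDMF system.

```python
from enum import Enum


class MetadataGroup(Enum):
    """The five metadata groups listed on the slide."""
    SURVEY = "Survey Metadata"
    DEFINITIONAL = "Definitional Metadata"
    METHODOLOGICAL = "Methodological Metadata"
    SYSTEM = "System Metadata"
    OPERATIONAL = "Operational Metadata"


# Example items named on the slide, filed under the group that lists them.
EXAMPLES = {
    "sampling": MetadataGroup.METHODOLOGICAL,
    "collection methods": MetadataGroup.METHODOLOGICAL,
    "editing processes": MetadataGroup.METHODOLOGICAL,
    "file size": MetadataGroup.SYSTEM,
    "access methods to databases": MetadataGroup.SYSTEM,
    "respondent burden": MetadataGroup.OPERATIONAL,
    "response rates": MetadataGroup.OPERATIONAL,
    "edit failure rates": MetadataGroup.OPERATIONAL,
}


def group_of(item: str) -> MetadataGroup:
    """Return the group of a known example item (case-insensitive lookup)."""
    return EXAMPLES[item.strip().lower()]
```

For example, group_of("Response rates") returns MetadataGroup.OPERATIONAL.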
Detailed Process Model Scheme
Need
• Understand the need for the required statistics, i.e. what the required statistics are going to be used for in concrete terms by their users
Design
• Prepares the ground for the execution of a statistical production project, for example questionnaire design, capturing tool design, tabulation plans, etc.
Build
• Puts together all the pieces of the infrastructure for a statistical production project
• E.g. the data capturing and scanning tools are developed, tested and implemented
Collect
• Refers to both direct and administrative methods of data collection
• In direct collection, Stats SA sources data directly from the respondents
• In administrative collection, data are drawn from the databases of other organizations, which in turn source them from their respondents
Process
• Includes capturing collected data into databases so that data processing may be done
Analyse
• After the data have been cleaned during the Process phase, they are ready for manipulation using analytical tools
Disseminate
• Publications are created from the datasets produced by the Analyse phase
• Disseminated in various forms, e.g. electronic, printed output and compact disks
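The seven phases form an ordered sequence, which can be sketched as an integer enumeration. This is a minimal illustration; Phase and next_phase are hypothetical names, not part of the ESDMF.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    """Phases of the process model, numbered as on the slides."""
    NEED = 1
    DESIGN = 2
    BUILD = 3
    COLLECT = 4
    PROCESS = 5
    ANALYSE = 6
    DISSEMINATE = 7


def next_phase(phase: Phase) -> Optional[Phase]:
    """Return the phase that follows, or None after Disseminate."""
    if phase is Phase.DISSEMINATE:
        return None
    return Phase(phase + 1)
```

Using IntEnum keeps the slide numbering and lets phases be compared directly, e.g. Phase.COLLECT < Phase.PROCESS.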
How the Stats SA process model maps to the METIS Metadata cycles
Post Survey Evaluation: This is currently done outside the statistical cycle. It is performed only for the large surveys such as the population census and the community survey.
METIS | Stats SA
Survey planning and design | Need and Design Phases
Survey preparation | Part of Design Phase
Data collection | Collection Phase
Input processing | Processing Phase
Derivation, Estimation, Aggregation | Processing Phase
Analysis | Analysis Phase
Dissemination | Dissemination Phase
Post Survey Evaluation | (outside the statistical cycle)
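The METIS to Stats SA correspondence can be written down as a simple lookup table, for instance to find which METIS stages fall within a given Stats SA phase. This is a sketch only: the dictionary and function names are hypothetical, and the phase names use the short forms from the process model (Need, Design, Collect, Process, Analyse, Disseminate).

```python
from typing import Dict, List

# METIS stage -> Stats SA phases it maps to (from the mapping on this slide).
METIS_TO_STATS_SA: Dict[str, List[str]] = {
    "Survey planning and design": ["Need", "Design"],
    "Survey preparation": ["Design"],
    "Data collection": ["Collect"],
    "Input processing": ["Process"],
    "Derivation, Estimation, Aggregation": ["Process"],
    "Analysis": ["Analyse"],
    "Dissemination": ["Disseminate"],
    # Done outside the statistical cycle, so no Stats SA phase is listed.
    "Post Survey Evaluation": [],
}


def metis_stages_for(stats_sa_phase: str) -> List[str]:
    """Return the METIS stages that map to the given Stats SA phase."""
    return [
        stage
        for stage, phases in METIS_TO_STATS_SA.items()
        if stats_sa_phase in phases
    ]
```

For example, metis_stages_for("Process") returns both "Input processing" and "Derivation, Estimation, Aggregation".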
How Metadata Fits into Other Stats SA Systems
ESDMF Core (the Metadata Subsystem is one part thereof)
• Consists of all the components that make up the ESDMF system (only a few shown)
Integration Layer
• Gives the existing systems within Stats SA access to the ESDMF functionality
Metadata Description
Survey Metadata Capture Tool
• This is the principal mechanism by which metadata is documented
• The tool is based on an approved Survey Metadata Standard Template
• The standard is made up of groups of metadata elements (e.g. Overview, Generic Information, Methodology, Data Quality Report, Documentation and Contacts) which describe certain characteristics of the survey data
Survey Metadata is organized around an entity known as the survey. A survey can be:
• a direct survey: data is collected directly from the respondents. This could be a sample or a census
• administrative: data is sourced from another organization, which had collected the data for its own purposes
• derived: a statistical program uses administrative data, or a data integration activity is done
Series Metadata
• The group of metadata topics that remain constant for a period of time or do not change frequently (e.g. history, objective or abstract of the survey)
Instance Metadata
• This is the metadata that is compiled and produced frequently
• This metadata set is produced every time a release is produced
Storage
• Captured metadata is stored in a database and can be saved and viewed at any time
Benefits
• Once captured, users can always access their metadata from a centralised storage location (centralisation of metadata)
• Users across the organisation use the same mechanism to capture metadata (quality of metadata)
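The distinction between series and instance metadata can be sketched with a small data model: one survey carries a single, slowly changing series record and gains a new instance record with every release. All class and field names below are assumptions made for illustration; they are not the ESDMF schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SurveyType(Enum):
    """The three kinds of survey described on the slide."""
    DIRECT = "direct"
    ADMINISTRATIVE = "administrative"
    DERIVED = "derived"


@dataclass
class SeriesMetadata:
    """Topics that stay constant for a period of time."""
    objective: str
    abstract: str
    history: str = ""


@dataclass
class InstanceMetadata:
    """Metadata compiled each time a release is produced."""
    release: str            # hypothetical release label, e.g. "2007-Q1"
    collection_start: str
    collection_end: str


@dataclass
class Survey:
    name: str
    survey_type: SurveyType
    series: SeriesMetadata
    instances: List[InstanceMetadata] = field(default_factory=list)

    def add_release(self, instance: InstanceMetadata) -> None:
        """Attach the instance metadata produced for a new release."""
        self.instances.append(instance)
```

Adding a release appends to the instances list while the series record is left untouched, which mirrors the difference in how often the two kinds of metadata change.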
Survey Metadata Capture Tool
The implemented Survey Metadata Capture Tool of the ESDMF captures the following metadata:
Overview
• Comprises the following items: Objective, Abstract, History, etc.
Generic Information
• Provides generic information about the survey time frames, e.g. frequency, collection start and end dates
Primary Data Source
• External data inputs to the survey, e.g. external or internal data sources
Methodology
• The activities conducted and the methods and processes used which are specific to the survey, e.g. survey population, instrument design, sample design, etc.
Data Quality Report
• Comprises the quality dimensions of the data, e.g. relevance, accuracy, accessibility, etc.
Documentation
• Links to additional documentation related to the survey
Contact
• Contact person who will manage enquiries related to the data or information produced by the survey
Active Metadata Sets
• The file identifier and status of the current/active metadata set, i.e. the set that the user is currently capturing, editing or viewing, are displayed immediately under this section
Loaded Metadata Sets
• Lists the file identifiers and statuses of metadata sets created by the current user
• Enables the current user to switch between metadata sets
Survey Metadata Capture Tool User Interface – Activity Selection
• Instance Metadata – Create an instance metadata set
• Create a Report – Create a report in PDF format
• View (metadata) – View approved metadata
• Approve (metadata) – Approve metadata for use in a survey
• Series Metadata – Create a series metadata set
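The create, approve and view activities suggest a simple status workflow, sketched below. This is an assumption-laden illustration: the status names "draft" and "approved", the CaptureTool class and the file identifiers are hypothetical, and PDF report generation is left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MetadataSet:
    file_id: str            # file identifier shown in the tool
    kind: str               # "series" or "instance"
    status: str = "draft"   # hypothetical status name


class CaptureTool:
    """Toy model of the create / approve / view activities."""

    def __init__(self) -> None:
        self._sets: Dict[str, MetadataSet] = {}

    def create(self, file_id: str, kind: str) -> MetadataSet:
        """Create a new series or instance metadata set as a draft."""
        if kind not in ("series", "instance"):
            raise ValueError(f"unknown metadata kind: {kind}")
        metadata_set = MetadataSet(file_id, kind)
        self._sets[file_id] = metadata_set
        return metadata_set

    def approve(self, file_id: str) -> None:
        """Mark a metadata set as approved for use in a survey."""
        self._sets[file_id].status = "approved"

    def view_approved(self) -> List[str]:
        """Return the file identifiers of approved metadata sets only."""
        return [s.file_id for s in self._sets.values() if s.status == "approved"]
```

In this sketch a newly created set is not visible through view_approved() until approve() has been called on it.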
Survey Metadata Capture Tool User Interface – Navigation
• Active Metadata Sets
• Overview
• Generic Information
• Primary Data Source
• Methodology
• Data Quality Report
• Documentation
• Contact
• Loaded Metadata Sets
IT Infrastructure Specifications
Operating Systems
• Desktops run Microsoft Windows
• The application is deployed on an open source operating system (Novell SUSE Linux)
Networks
• The network architecture is based on open protocols and industry standards
• Allows remote access for some employees
• Supports both local area (LAN) and wide area (WAN) networks
Servers
• The system is developed as a client-server application
• This means that there is a need for powerful servers capable of handling intensive processing
Data Storage
• Storage management is via the Storage Area Network (SAN)
Environments
• Three environments:
  • Application Development
  • User Acceptance Testing (UAT)
  • Production
IT Infrastructure Specifications - Details
A. Development Environment
Function | Make/Model | Operating System/Database Engine | Comment
Application Server | HP BL45p, quad processor, 4 GB RAM, 2 x 72 GB HDD | SUSE Linux Ver. 10 | Make/model exceeds recommendation
Database Server | HP BL45p, quad processor, 16 GB RAM, 2 x 72 GB HDD | Oracle 10g or Sybase ASE and Sybase IQ; Unix/Linux/Windows | Make/model exceeds recommendation
Build Server | HP DL 320, dual processor, 2 GB RAM, 2 x 72 GB HDD | SUSE Linux Ver. 10 | Make/model exceeds recommendation
B. User Acceptance Test (UAT) Environment
Function | Make/Model | Operating System/Database Engine | Comment
Application Servers | 2 x HP BL45p, quad processor, 8 GB RAM, 2 x 72 GB HDD | SUSE Linux Ver. 10 | Make/model exceeds recommendation
Database Servers | 2 x HP BL45p, quad processor, 32 GB RAM, 2 x 72 GB HDD | Oracle 10g or Sybase ASE and Sybase IQ; Linux |
C. Production Environment
Function | Make/Model | Operating System/Database Engine | Comment
Application Servers | 2 x HP BL45p, quad processor, 8 GB RAM, 2 x 72 GB HDD | SUSE Linux Ver. 10 | Make/model exceeds recommendation
Database Servers | 2 x HP BL45p, quad processor, 32 GB RAM, 2 x 72 GB HDD | Oracle 10g or Sybase ASE and Sybase IQ; Linux |
Components of Metadata Management Application
User Interface
• The user interfaces for all the metadata management system applications are web-based
• Client workstations only need a web browser to access server-based applications
• The main supported web browsers are Microsoft Internet Explorer and Firefox
Database
• The application is supported by a relational database management system (RDBMS)
• The RDBMS engine of choice for this project is Sybase
• The project is currently using the open source RDBMS, MySQL
Business Logic
• The business logic controlling the interaction between the UI and the underlying database is coded in server-side Java
• There is also business logic coded as stored procedures, which mostly performs housekeeping within the database
Application/Web Server
• The application is served to the client via Tomcat, which processes Java code
• Tomcat also handles HTTP calls from the web browser
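To illustrate the business-logic layer sitting between the UI and a relational database, the sketch below saves and reloads one section of a metadata set. It is not the project's code: the real system uses server-side Java with Sybase or MySQL, whereas this uses Python's built-in sqlite3 module as a stand-in, and the table and column names are invented for the example.

```python
import sqlite3
from typing import Optional


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE survey_metadata ("
        " file_id TEXT NOT NULL,"
        " section TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " PRIMARY KEY (file_id, section))"
    )
    return conn


def save_section(conn: sqlite3.Connection, file_id: str, section: str, content: str) -> None:
    """Store one captured section, overwriting any earlier capture."""
    conn.execute(
        "INSERT OR REPLACE INTO survey_metadata (file_id, section, content)"
        " VALUES (?, ?, ?)",
        (file_id, section, content),
    )
    conn.commit()


def load_section(conn: sqlite3.Connection, file_id: str, section: str) -> Optional[str]:
    """Return the stored content, or None if the section was never captured."""
    row = conn.execute(
        "SELECT content FROM survey_metadata WHERE file_id = ? AND section = ?",
        (file_id, section),
    ).fetchone()
    return row[0] if row else None
```

Parameterised queries (the ? placeholders) keep user-entered metadata text from being interpreted as SQL, which matters for a web-based capture tool.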
Partnerships and Cooperation between Agencies
Latvia
• Metadata model is also based on Bo Sundgren’s model
• Their outsourced supplier took a while to understand the business of the statistical organization
Ireland
• Issues regarding communication between the customer and the supplier
• Project took longer than planned
Slovenia
• Metadata model is also based on Bo Sundgren’s model, with some modifications
New Zealand
• Adopted their business process model, called the Statistical Value Chain in Stats SA
• We also adopted their breakdown of metadata into five categories
Australia
• For a successful data warehouse project, there is a need to develop policies and standards
Sweden
• Advised us on various aspects of metadata and statistical production processes
• Gave us a better idea of how to develop a data quality template, as well as how data quality should be reported on
Canada
• We applied our knowledge of their Metastat (IMDB) during the development of our Survey Metadata Capturing Tool
• Consultants from Canada also come to help on projects within Stats SA, including ours
United States
• We used the Corporate Metadata Repository (CMR) model by Dan Gillman, from the US Bureau of Labor Statistics, in our understanding of the metadata model
Organizational and Cultural Issues
Climate and Culture Assessment
• A key challenge to Stats SA is to focus the organisation on the strategic importance of the DMID project
• A Climate and Culture Assessment was done by holding focus groups as well as running an online survey via the Stats SA intranet website
Change Readiness Assessment
• A Change Readiness Assessment was conducted to determine the current capacity of Stats SA to change, and to identify areas of resistance towards DMID requiring Organisation Change Management (OCM) interventions
• The following ‘change readiness dimensions’ formed the basis of the Change Readiness Assessment:
  • Clear vision
  • Effective leadership
  • Positive experience with past change initiatives
  • Motivation to do the project
  • Effective communication
  • Adequate project team resources
What is Change Readiness?
• The Change Readiness Assessment is a process used to determine the levels of understanding, acceptance and commitment likely to affect the success of the planned change
Change Commitment Curve
Commitment
• As the DMID project phases roll out, different stakeholders will need to be at specific levels of commitment
• The level of commitment required will depend on the role they play in the DMID project and their ability to influence the programme
Framework
• The Change Commitment Curve provides a framework for understanding and tracking the levels of commitment that stakeholders need to be facilitated through, so that OCM interventions can be developed accordingly
Change Commitment Curve (figure): stages in order, each with its characteristic statement
1. Contact: “I know something is changing”
2. Awareness: “I know what it is”
3. Understanding: “I know the implications for me”
4. Engagement: “I’ll look at doing it the new way”
5. Acceptance: “I’ll do it the new way”
6. Commitment: “This is the way we do things”
7. Internalisation: “This is the way I do things”
Zones marked on the curve: Setting the scene, Achieving acceptance, Achieving commitment
Climate and Culture plus Change Readiness Assessments
The following were the findings from the assessments:
• Executive Management does not have a shared understanding of the DMID project
• Lack of communication between management and subordinates makes it difficult for subordinates to understand the purpose of the project and the impact it has on their working lives
• Lack of support from Executive Management will result in resistance and make it difficult for the project to succeed
• If management does not communicate, understand and promote the project, it will be difficult to deliver the message and get buy-in from staff in the organisation
Next Steps from the Findings
• The findings of the assessments identified where some of the key staff members belonged on the Change Commitment Curve
• In general, most were in the “Setting the Scene” and “Achieving Acceptance” area, bounded in time by “Contact” (“I know something is changing”) and “Understanding” (“I know the implications for me”)
• A lot of effort is needed to move from that area to “Achieving Commitment”, demonstrated by “Internalisation”, wherein staff can claim that “This is the way I do things”
• Another outcome of these assessments was to organize a Leadership Alignment workshop
• In this workshop, the Executive Committee was given a presentation of the findings and the path forward
• The path forward is to ensure that the leadership understands the goals of the project and how they line up with the vision of Stats SA
• The leadership was also instructed on how to communicate the same message about the project
Lessons Learned
The business of Stats SA
• The supplier had a difficult time understanding the business of Stats SA, which is statistical production processes
Skills Transfer Plan
• Under pressure to meet the deliverables, the supplier ignored the Skills Transfer Plan, with the result that the Stats SA developers were not involved in the final design and development of the phase 1 deliverable
Breakdown of deliverables
• Each phase was planned to be three months long
• Each phase was also planned to be a complete deliverable in its own right, even though the next phase was planned to build on the previous phases
• The first phase was delivered late, mainly due to the supplier’s lack of understanding
• Initially, the first deliverable did not meet the stated business objectives of data quality
END
Thank You
Contact information:
Sibongile Madonsela, Matile Malimabe, Bubele Vakalisa, Ashwell Jenneker
all from Statistics South Africa (www.statssa.gov.za)