
Statistics South Africa
Case Study - ESDMF Project

Data Management and Information Delivery (DMID)

Case Study

Sibongile Madonsela, Matile Malimabe, Bubele Vakalisa

Statistics South Africa

UNECE Workshop on the Common Metadata Framework
Vienna, Austria, July 4-7, 2007


Introduction

Programme providing the frame for Stats SA projects (strategic themes):
• Providing Relevant Statistical Information to Meet User Needs
• Enhancing the Quality of Products and Services
• Developing and Promoting Statistical Coordination and Partnerships
• Building Human Capacity

This project is aimed at supporting the strategic theme “Enhancing the Quality of Products and Services”. Within the DMID, the metadata management system addresses this strategic theme.

Overall Project Objective
• The metadata management system forms part of the organisation’s broader objective to continuously improve the quality of its products
• The survey metadata tool consists of elements that provide the overall description of a statistical survey
• The survey metadata component is modelled on Statistics Canada’s Integrated Metadata Database (IMDB), Metastat

Organisation Chart: Stats SA and current projects

The Data Management and Information Delivery (DMID) project (magenta shaded box) is located within the Data Management and Technology Division

The yellow shaded boxes indicate some of the ongoing projects that are concurrent with the DMID project.

[Organisation chart. Boxes shown: Statistician-General; Economic Statistics; Population and Social Statistics; Quality and Integration; Statistical Support and Informatics; Corporate Services; Population Census; Social Statistics; Health and Vital Statistics; Industry and Trade Statistics; Employment and Price Statistics; Financial Statistics; Geography; System of Registers; Data Management and Technology; Statistical Information Services; Provincial Coordination; National Statistics System Division; Methodologies and Standards; Integrative Analysis; National Accounts; Finance and Provisioning; Human Resource Management; SG Support and Strategic Planning; Facilities Management, Security and Logistics; Human Capacity Development; Programme Office; Internal Audit]

[Projects shown: Data Management and Information Delivery Project; Census Comm. Project; Labour Force Survey Re-engineering Project; CPI Project; Census 2011 Project; SAS 9 Migration Project]


Organisation Chart: DMID project, including supplier’s resources

• Prescient Business Technologies (PBT): the supplier to the DMID project, developing the ESDMF system
• ESDMF: End to end Statistical Data Management Facility
• PM: Project Manager

DMID Project Structure at a high level

Standards Development and Implementation
• Led by the Chief Standards Officer
• Develops policies, standards and procedures before components of the facility can be implemented, using the Standards Life Cycle
• For Phase One, the policies for Data Quality and Metadata were implemented
• For future phases, related policies will be developed

End to end Statistical Data Management Facility (ESDMF)
• Led by the Technical Lead/Project Manager
• Uses the policies developed by the standards team to generate requirements for the system, which is then implemented using software technologies

Standards Life Cycle

Policies, standards and procedures are developed before components of the facility can be implemented, using the Standards Life Cycle

Conceptual components of the ESDMF

[Diagram: the seven phases of the statistical cycle: 1 Need, 2 Design, 3 Build, 4 Collect, 5 Process, 6 Analyse, 7 Disseminate]

End to end Statistical Data Management Facility (ESDMF)
• Uses the policies developed by the standards team to generate requirements for the system, which is then implemented using software technologies

Statistical Metadata in Each Phase of the Statistical Cycle

Description
• Metadata is used during various stages of statistical production as essential input to production processes
• The production processes, in turn, produce metadata
• Metadata is also important in documenting the trail of activities during the statistical production process

List of Metadata Groups (Categories of Metadata)

Survey Metadata (Dataset Metadata)
• Used to describe, access and update datasets and data structures
• Called survey rather than dataset metadata because some of the metadata, such as information about “the population which the data describe”, refers to the broader aspects of the survey, and not only the dataset

Definitional Metadata
• Describes the concepts used in producing statistical data
• These concepts are often encapsulated into measurement variables used to collect statistical data

Methodological Metadata
• Relates to the procedures by which data are collected and processed
• These include sampling, collection methods, editing processes, etc.

System Metadata
• Refers to active metadata used to drive automated operations
• Examples include: file size, access methods to databases, etc.

Operational Metadata
• Metadata arising from and summarising the results of implementing the procedures
• Examples include: respondent burden, response rates, edit failure rates, costs and other quality and performance indicators
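As a rough illustration of the five groups above, the sketch below files a few of the example items named on this slide under their category. The enum and the lookup table are hypothetical names introduced here for illustration; they are not part of the ESDMF.

```python
from enum import Enum


class MetadataCategory(Enum):
    """The five metadata groups listed on the slide."""
    SURVEY = "Survey (dataset) metadata"
    DEFINITIONAL = "Definitional metadata"
    METHODOLOGICAL = "Methodological metadata"
    SYSTEM = "System metadata"
    OPERATIONAL = "Operational metadata"


# Example items are taken from the slide text; the table itself is a
# hypothetical illustration, not an ESDMF structure.
EXAMPLE_ITEMS = {
    "sampling": MetadataCategory.METHODOLOGICAL,
    "collection methods": MetadataCategory.METHODOLOGICAL,
    "editing processes": MetadataCategory.METHODOLOGICAL,
    "file size": MetadataCategory.SYSTEM,
    "respondent burden": MetadataCategory.OPERATIONAL,
    "response rates": MetadataCategory.OPERATIONAL,
    "edit failure rates": MetadataCategory.OPERATIONAL,
}


def items_in(category: MetadataCategory) -> list[str]:
    """Return the example items filed under one category, sorted by name."""
    return sorted(name for name, cat in EXAMPLE_ITEMS.items() if cat is category)
```

Calling `items_in(MetadataCategory.OPERATIONAL)` lists the three operational examples from the slide.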


Detailed Process Model Scheme

Need
• Understand the need for the required statistics, i.e. what the required statistics are going to be used for, in concrete terms, by their users

Design
• Prepares the ground for the execution of a statistical production project, for example questionnaire design, capturing tool design, tabulation plans, etc.

Build
• Puts together all the pieces of the infrastructure for a statistical production project
• E.g. the data capturing and scanning tools are developed, tested and implemented

Collect
• Refers to both direct and administrative methods of data collection
• In direct collection, Stats SA sources data directly from the respondents
• In administrative collection, data are drawn from the databases of other organizations, which in turn source them from their respondents

Process
• Includes capturing collected data into databases so that data processing may be done

Analyse
• After the data have been cleaned during the Process phase, they are ready for manipulation using analytical tools

Disseminate
• Publications are created from the datasets produced by the Analyse phase
• Disseminated in various forms, e.g. electronic, printed output and compact disks
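The ordering of the seven phases can be sketched as an ordered enumeration. This is a minimal illustration only: the names `Phase` and `next_phase` are hypothetical and do not come from the ESDMF, and the sketch assumes the phases run strictly in the numbered order shown on the slide.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    """The seven phases, numbered as on the slide."""
    NEED = 1
    DESIGN = 2
    BUILD = 3
    COLLECT = 4
    PROCESS = 5
    ANALYSE = 6
    DISSEMINATE = 7


def next_phase(phase: Phase) -> Optional[Phase]:
    """Return the phase that follows, or None after Disseminate."""
    if phase is Phase.DISSEMINATE:
        return None
    return Phase(phase + 1)
```

Because `Phase` is an `IntEnum`, comparisons such as `Phase.COLLECT < Phase.PROCESS` follow the slide numbering directly.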


How the Stats SA process model maps to the METIS Metadata cycles

Post Survey Evaluation: this is currently done outside the statistical cycle. It is performed only for the large surveys such as the population census and the community survey.

METIS | Stats SA
Survey planning and design | Need and Design Phases
Survey preparation | Part of Design Phase
Data collection | Collection Phase
Input processing | Processing Phase
Derivation, Estimation, Aggregation | Processing Phase
Analysis | Analysis Phase
Dissemination | Dissemination Phase
Post Survey Evaluation | (done outside the statistical cycle)
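The mapping on this slide can be written down as a simple lookup table. This is a sketch only: the dictionary name and the helper function are hypothetical, and the phase names use the short labels from the process model (Need, Design, Collect, Process, Analyse, Disseminate).

```python
# METIS stage -> Stats SA phases, transcribed from the slide.
METIS_TO_STATS_SA = {
    "Survey planning and design": ["Need", "Design"],
    "Survey preparation": ["Design"],
    "Data collection": ["Collect"],
    "Input processing": ["Process"],
    "Derivation, Estimation, Aggregation": ["Process"],
    "Analysis": ["Analyse"],
    "Dissemination": ["Disseminate"],
    "Post Survey Evaluation": [],  # done outside the statistical cycle
}


def metis_stages_for(phase: str) -> list[str]:
    """Return the METIS stages that map onto one Stats SA phase."""
    return [stage for stage, phases in METIS_TO_STATS_SA.items() if phase in phases]
```

For example, `metis_stages_for("Process")` returns the two METIS stages that both fall into the Processing Phase.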


How Metadata Fits into Other Stats SA Systems

ESDMF Core (the Metadata Subsystem is one part thereof)
• This consists of all the components that make up the ESDMF system (only a few are shown)

Integration Layer
• This gives the existing systems within Stats SA access to the ESDMF functionality

Metadata Description

Survey Metadata Capture Tool
• This is the principal mechanism by which metadata is documented
• The tool is based on an approved Survey Metadata Standard Template
• The standard is made up of groups of metadata elements (e.g. Overview, Generic Information, Methodology, Data Quality Report, Documentation and Contacts) which describe certain characteristics of the survey data

Survey Metadata is organized around an entity known as the survey. A survey can be:
• a direct survey: data is collected directly from the respondents; this could be a sample or a census
• administrative: data is sourced from another organization, which had collected the data for its own purposes
• derived: a statistical program uses administrative data, or a data integration activity is done

Series Metadata
• The group of metadata topics that remain constant for a period of time or do not change as frequently (e.g. history, objective or abstract of the survey)

Instance Metadata
• Metadata that is compiled and produced frequently: a new metadata set is produced every time a release is produced

Storage
• Captured metadata is stored in a database and can be saved and viewed at any time

Benefits
• Once captured, users can always access their metadata from a centralised storage location (centralisation of metadata)
• Users across the organisation use the same mechanism to capture metadata (quality of metadata)
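The distinction between series and instance metadata can be sketched as a small data model: one slow-changing series record per survey, plus one instance record per release. All class and field names below (`Survey`, `InstanceMetadata`, `release`, `values`) are hypothetical and chosen for illustration; the actual ESDMF schema is not described in these slides.

```python
from dataclasses import dataclass, field
from enum import Enum


class SurveyType(Enum):
    DIRECT = "direct"                  # collected directly from respondents
    ADMINISTRATIVE = "administrative"  # sourced from another organization
    DERIVED = "derived"                # built from administrative or integrated data


@dataclass
class InstanceMetadata:
    """Metadata compiled each time a release is produced."""
    release: str                                # hypothetical release label
    values: dict = field(default_factory=dict)  # hypothetical per-release fields


@dataclass
class Survey:
    name: str
    survey_type: SurveyType
    series: dict = field(default_factory=dict)     # slow-changing topics
    instances: list = field(default_factory=list)  # one entry per release

    def add_release(self, release: str, **values) -> InstanceMetadata:
        """Record instance metadata for a new release; series is untouched."""
        instance = InstanceMetadata(release, dict(values))
        self.instances.append(instance)
        return instance
```

Under this sketch, topics such as the objective or history live once in `series`, while each call to `add_release` appends a new instance record without touching them.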


Survey Metadata Capture Tool

The implemented Survey Metadata Capture Tool of the ESDMF captures the following metadata:

Overview
• Comprises the following items: Objective, Abstract, History, etc.

Generic Information
• Provides generic information about the survey time frames, e.g. frequency, collection start and end dates

Primary Data Source
• Data inputs to the survey, e.g. external or internal data sources

Methodology
• The activities conducted and the methods and processes used which are specific to the survey, e.g. survey population, instrument design, sample design, etc.

Data Quality Report
• Comprises the quality dimensions of the data, e.g. relevance, accuracy, accessibility, etc.

Documentation
• Links to additional documentation related to the survey

Contact
• Contact person who will manage enquiries related to the data or information produced by the survey

Active Metadata Sets
• The file identifier and status of the current (active) metadata set, i.e. the set that the user is currently capturing, editing or viewing, are displayed immediately under this section

Loaded Metadata Sets
• Lists the file identifiers and statuses of metadata sets created by the current user
• Enables the current user to switch between metadata sets
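The relationship between sections, the active metadata set and the loaded metadata sets can be sketched as follows. This is an assumption-laden illustration: `MetadataSet`, `Session`, the `"draft"` status value and the rule that the first loaded set becomes active are all hypothetical and are not taken from the tool itself.

```python
from dataclasses import dataclass, field

# Section names as listed on the slide.
SECTIONS = (
    "Overview",
    "Generic Information",
    "Primary Data Source",
    "Methodology",
    "Data Quality Report",
    "Documentation",
    "Contact",
)


@dataclass
class MetadataSet:
    file_id: str
    status: str = "draft"  # hypothetical status value
    sections: dict = field(default_factory=lambda: {name: {} for name in SECTIONS})


class Session:
    """Tracks the sets loaded by one user and which one is active."""

    def __init__(self) -> None:
        self.loaded = {}    # file_id -> MetadataSet
        self.active = None  # the set being captured, edited or viewed

    def load(self, metadata_set: MetadataSet) -> None:
        self.loaded[metadata_set.file_id] = metadata_set
        if self.active is None:  # assumption: first loaded set becomes active
            self.active = metadata_set

    def switch_to(self, file_id: str) -> MetadataSet:
        """Make another loaded set the active one."""
        self.active = self.loaded[file_id]
        return self.active
```

A user session would `load` each set the user created and call `switch_to` when the user picks a different entry under Loaded Metadata Sets.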


Survey Metadata Capture Tool User Interface – Activity Selection

• Instance Metadata: create instance metadata
• Create a Report: create a report in PDF format
• View (metadata): view approved metadata
• Approve (metadata): approve metadata for use in a survey
• Series Metadata: create series metadata

Survey Metadata Capture Tool User Interface – Navigation

Navigation items: Active Metadata Sets, Overview, Generic Information, Primary Data Source, Methodology, Data Quality Report, Documentation, Contact, Loaded Metadata Sets

IT Infrastructure Specifications

Operating Systems
• Desktops run Microsoft Windows
• The application is deployed on an open source operating system (Novell SuSe Linux)

Networks
• The network architecture is based on open protocols and industry standards
• Allows remote access for some employees
• Supports both local area (LAN) and wide area (WAN) networks

Servers
• The system is developed as a client-server application
• This means that there is a need for powerful computer servers capable of handling intensive processing

Data Storage
• Storage management is via the Storage Area Network (SAN)

Environments
Three environments:
• Application Development
• User Acceptance Testing (UAT)
• Production

IT Infrastructure Specifications - Details

A. Development Environment

Function | Make/Model | Operating System/Database Engine | Comment
Application Server | HP BL45p, quad processor, 4 GB RAM, 2 x 72 GB HDD | SuSe Linux Ver. 10 | Make/model exceeds recommendation
Database Server | HP BL45p, quad processor, 16 GB RAM, 2 x 72 GB HDD | Oracle 10g or Sybase ASE and Sybase IQ; Unix/Linux/Windows | Make/model exceeds recommendation
Build Server | HP DL 320, dual processor, 2 GB RAM, 2 x 72 GB HDD | SuSe Linux Ver. 10 | Make/model exceeds recommendation

B. User Acceptance Test (UAT) Environment

Function | Make/Model | Operating System/Database Engine | Comment
Application Servers | 2 x HP BL45p, quad processor, 8 GB RAM, 2 x 72 GB HDD | SuSe Linux Ver. 10 | Make and model exceeds recommendation
Database Servers | 2 x HP BL45p, quad processor, 32 GB RAM, 2 x 72 GB HDD | Oracle 10g or Sybase ASE and Sybase IQ; Linux |

C. Production Environment

Function | Make/Model | Operating System/Database Engine | Comment
Application Servers | 2 x HP BL45p, quad processor, 8 GB RAM, 2 x 72 GB HDD | SuSe Linux Ver. 10 | Make and model exceeds recommendation
Database Servers | 2 x HP BL45p, quad processor, 32 GB RAM, 2 x 72 GB HDD | Oracle 10g or Sybase ASE and Sybase IQ; Linux |


IT Infrastructure Specifications - Diagrams


Components of Metadata Management Application

User Interface
• The user interfaces for all the metadata management system applications are web-based
• Client workstations only need a web browser to access server-based applications
• The main supported web browsers are Microsoft Internet Explorer and Firefox

Database
• The application is supported by a relational database management system (RDBMS)
• The RDBMS engine of choice for this project is Sybase
• The project is currently using the open source RDBMS, MySQL

Business Logic
• The business logic controlling the interaction between the UI and the underlying database is coded using Java server-side scripting
• There is also business logic coded using stored procedures; this mostly performs housekeeping within the database

Application/Web Server
• The application is served to the client via Tomcat, which processes the Java code
• Tomcat also handles HTTP calls from the web browser
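The layering described above (web front end, business logic, relational database) can be sketched in a few lines. This is a stand-in only: the real system uses Java on Tomcat with Sybase or MySQL, whereas the sketch uses Python with an in-memory SQLite database purely so that it is self-contained. The table name, column names, request path and function names are all hypothetical.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    """Database layer: an in-memory SQLite stand-in for the project's RDBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE survey_metadata ("
        "file_id TEXT PRIMARY KEY, objective TEXT NOT NULL)"
    )
    return conn


def save_objective(conn: sqlite3.Connection, file_id: str, objective: str) -> None:
    """Business-logic layer: validate the input, then write it."""
    cleaned = objective.strip()
    if not cleaned:
        raise ValueError("objective must not be empty")
    conn.execute(
        "INSERT OR REPLACE INTO survey_metadata (file_id, objective) VALUES (?, ?)",
        (file_id, cleaned),
    )
    conn.commit()


def handle_request(conn: sqlite3.Connection, path: str) -> tuple:
    """Web layer stand-in: map a path such as '/metadata/SET-001' to a response."""
    prefix = "/metadata/"
    if not path.startswith(prefix):
        return 404, "not found"
    row = conn.execute(
        "SELECT objective FROM survey_metadata WHERE file_id = ?",
        (path[len(prefix):],),
    ).fetchone()
    return (200, row[0]) if row else (404, "not found")
```

The point of the sketch is the separation of concerns: the browser only ever talks to the web layer, validation sits in the business logic, and the database is reached only through parameterised queries.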


Partnerships and Cooperation between Agencies

Latvia
• Metadata model is also based on Bo Sundgren’s model
• Their outsourced supplier took a while to understand the business of the statistical organization

Ireland
• Issues regarding communication between the customer and the supplier
• Project took longer than planned

Slovenia
• Metadata model is also based on Bo Sundgren’s model, with some modifications

New Zealand
• We adopted their business process model and called it the Statistical Value Chain in Stats SA
• We also adopted their breakdown of metadata into five categories

Australia
• For a successful data warehouse project, there is a need to develop policies and standards

Sweden
• Advised us on various aspects of metadata and statistical production processes
• Gave us a better idea of how to develop a data quality template, as well as how data quality should be reported on

Canada
• We applied our knowledge of their Metastat (IMDB) during the development of our Survey Metadata Capture Tool
• Consultants from Canada come to help in other projects within Stats SA, including ours

United States
• We used the Corporate Metadata Repository (CMR) model by Dan Gillman, from the US Bureau of Statistics, in our understanding of the metadata model

Organizational and Cultural Issues

Climate and Culture Assessment
• A key challenge to Stats SA is to focus the organisation on the strategic importance of the DMID project
• A Climate and Culture Assessment was done by holding focus groups as well as running an online survey via the Stats SA intranet website

Change Readiness Assessment
• A Change Readiness Assessment was conducted to determine the current capacity of Stats SA to change, and to identify areas of resistance towards DMID requiring Organisation Change Management (OCM) interventions
• The following ‘change readiness dimensions’ formed the basis of the Change Readiness Assessment:
  • Clear vision
  • Effective leadership
  • Positive experience with past change initiatives
  • Motivation to do the project
  • Effective communication
  • Adequate project team resources

What is Change Readiness?
• The Change Readiness Assessment is a process used to determine the levels of understanding, acceptance and commitment likely to affect the success of the planned change

Change Commitment Curve

Commitment
• As the DMID project phases roll out, different stakeholders will need to be at specific levels of commitment
• The level of commitment required will depend on the role they play in the DMID project and their ability to influence the program

Framework
• The Change Commitment Curve will provide a framework for understanding and tracking the requisite levels of commitment that stakeholders need to be facilitated through, so that OCM interventions can be developed accordingly

[Diagram: Change Commitment Curve. Zones along the curve: Setting the scene, Achieving acceptance, Achieving commitment. Stages, in order:]
• Contact: “I know something is changing”
• Awareness: “I know what it is”
• Understanding: “I know the implications for me”
• Engagement: “I’ll look at doing it the new way”
• Acceptance: “I’ll do it the new way”
• Commitment: “This is the way we do things”
• Internalisation: “This is the way I do things”


Climate and Culture plus Change Readiness Assessments

The following were the findings from the assessments:
• Executive Management does not have a shared understanding of the DMID project
• Lack of communication between management and subordinates makes it difficult for subordinates to understand the purpose of the project and the impact it has on their working lives
• Lack of support from Executive Management will result in resistance and make it difficult for the project to succeed
• If management does not communicate, understand and promote the project, it will be difficult to deliver the message and get buy-in from staff in the organisation

Next Steps from the Findings
• The findings of the assessments identified where some of the key staff members were on the Change Commitment Curve
• In general, most were in the “Setting the Scene” and “Achieving Acceptance” area, bounded by “Contact” (“I know something is changing”) and “Understanding” (“I know the implications for me”)
• A lot of effort is needed to move from that area to “Achieving Commitment”, demonstrated by “Internalisation”, wherein staff can claim that “This is the way I do things”
• Another outcome of these assessments was the organisation of a Leadership Alignment workshop
• In this workshop, the Executive Committee was given a presentation of the findings and the path forward
• The path forward is to ensure that the leadership understands the goals of the project and how they line up with the vision of Stats SA
• The leadership was also instructed on how to communicate the same message about the project

Lessons Learned

The business of Stats SA
• The supplier had a difficult time understanding the business of Stats SA, which is statistical production processes

Skills Transfer Plan
• Under pressure to meet the deliverables, the supplier ignored the Skills Transfer Plan, with the result that the Stats SA developers were not involved in the final design and development of the Phase 1 deliverable

Breakdown of deliverables
• Each phase was planned to be three months long
• Each phase was also planned to be a complete deliverable in its own right, even though the next phase was planned to build on the previous phases
• The first phase was delivered late, mainly because of the supplier’s lack of understanding of the business
• The first deliverable did not initially meet the stated business objectives for data quality

END

Thank You

Contact information:

Sibongile Madonsela ([email protected])
Matile Malimabe ([email protected])
Bubele Vakalisa ([email protected])
Ashwell Jenneker ([email protected])

all from Statistics South Africa (www.statssa.gov.za)