Page 1: Metadata Driven Integrated Statistical Data Management System, CSB of Latvia

Metadata Driven Integrated Statistical Data Management System

CSB of Latvia

By Karlis Zeila, Vice President, CSB of Latvia

MSIS 2004, Geneva, May 17 - 19

Page 2

METADATA DRIVEN ... ?

Any action within the system is ruled by metadata.

Metadata is the key element of the system.

All software modules of the entire system are connected with the Core Metadata module (metadata base).

Any change within the system starts with a change to the metadata.

The full data processing cycle becomes possible only once the proper description in the metadata base is completed.

Page 3

INTEGRATED ... ?

Most of the system software modules are connected with the Registers module.

The Registers module is an integral part of the system.

All surveys are supported by adequate classifications stored in the metadata base.

In all surveys, respondent data fields are connected with register data.

All data are stored in the corporate data warehouse.

Statistical data processing is split into unified steps for different surveys.

Export / import procedures allow the system data files to be used with different standard software packages.

Page 4

Advantages and Restrictions

Advantages

1. Maximally standardized data entry, processing and storage procedures for the main business statistics, which provide the basis for the transfer from the stove-pipe data processing approach to a process-oriented data processing approach.

2. Centralized processing and storage of the statistical data, including metadata, using data warehouse technologies and OLAP tools.

3. All data processing procedures are driven from a common metadata system. These procedures are described in the metadata base, so executing a standardized procedure for a survey requires no individual programming.

4. The system is linked with the Business Register, which provides direct retrieval and updating of respondent data.

5. Special import and export procedures have been created for data exchange with other systems.

6. A link with PC-Axis has been created for electronic data dissemination.
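To illustrate how procedures described in a metadata base can remove the need for per-survey programming, here is a minimal sketch. It is not the ISDMS implementation: the procedure names (`drop_empty`, `scale`), the survey code `trade_quarterly` and the record layout are all hypothetical.

```python
from typing import Callable

# Registry of standardized procedures (hypothetical names, not from ISDMS).
PROCEDURES: dict[str, Callable[[list[dict], dict], list[dict]]] = {}


def procedure(name: str):
    """Register a standard procedure under a name that metadata can refer to."""
    def register(func):
        PROCEDURES[name] = func
        return func
    return register


@procedure("drop_empty")
def drop_empty(records, params):
    # Discard records where the given field was left empty.
    field = params["field"]
    return [r for r in records if r.get(field) is not None]


@procedure("scale")
def scale(records, params):
    # Multiply a field by a constant factor (e.g. thousands to units).
    field, factor = params["field"], params["factor"]
    return [{**r, field: r[field] * factor} for r in records]


# "Metadata base": each survey is only described, never programmed.
SURVEY_METADATA = {
    "trade_quarterly": [
        {"procedure": "drop_empty", "params": {"field": "turnover"}},
        {"procedure": "scale", "params": {"field": "turnover", "factor": 1000}},
    ],
}


def run_survey(survey_code: str, records: list[dict]) -> list[dict]:
    """Execute the steps listed in the metadata for one survey, in order."""
    for step in SURVEY_METADATA[survey_code]:
        records = PROCEDURES[step["procedure"]](records, step["params"])
    return records


records = [
    {"respondent": "R1", "turnover": 15},
    {"respondent": "R2", "turnover": None},
]
result = run_survey("trade_quarterly", records)
```

Adding a new survey in this sketch means adding an entry to `SURVEY_METADATA`; the runner and the registered procedures stay unchanged.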

Page 5

Restrictions

1. The system is oriented towards processing business statistics surveys of different periodicity.

2. The metadata base does not provide for the description of confidentiality rules; they are hard coded in the system.

3. Hardware and standard software requirements: PCs >= Pentium II, RAM >= 128 MB, equipped with Windows 95 to Windows 2000 and MS Office 2000.

4. The metadata base does not provide for the description of an algorithm for automatic creation of respondent lists for sample surveys from the Business Register frame.

5. Diagnostic tools for the metadata descriptions are not powerful enough; therefore experts preparing metadata descriptions need to be highly experienced.

Page 6

ISDMS architecture

Integrated statistical data management system

[Diagram: ISDMS architecture]

Corporate data warehouse: Macrodata base, Metadata base, Microdata base, Registers base, OLAP data base, User administration data base, Dissemination data base, Raw data base.

Other components: CSB Web Site, Firewall, data imputation software.

Platform: Windows 2000 Advanced Server, MS Internet Information Server, SQL Server 2000, PC-Axis.

ISDMS business application software modules: Core metadata base module, Registers module, Data entry and validation module, Data Web entry module, Data mass entry module, Missed data imputation module, Data aggregation module, Data analysis module, Data dissemination module, User administration module.

In the diagram each module is annotated with the databases it is related to (Metadata, Microdata, Macrodata, Registers, OLAP, Raw data, User administration).

Page 7

INTEGRATED, METADATA DRIVEN STATISTICAL DATA MANAGEMENT SYSTEM (ISDMS)

[Diagram: data flow through ISDMS]

Statistical metadata (structured and unstructured):

ACTIVE structured statistical metadata for ISDMS (metadata for data processing)
1. VARIABLE = INDICATOR + ATTRIBUTE (CLASSIFICATIONS)
2. QUESTIONNAIRE, TABLE, Row, Column, Code

Statistical metadata for description of the output data (metadata for data dissemination)
1. STATISTICAL DOMAINS
2. BREAKDOWNS, CLASSIFICATIONS
3. INDICATORS (basic and derived)

Data gathering: data in paper form, e-questionnaires, statistical data from other sources.

Processing chain: data entry from paper form, Web raw database, MICRO DATABASE, Data validation module, Data aggregation module, Data analyses module, MACRO DATABASE, Data output module, OUTPUT DATA, PC-Axis, WEB modules.

Administration modules: Data entry administration module, Data aggregation and analysis administration module, Data dissemination administration module.

Outputs: Search, e-Clients, Publication in paper form, e-Publication.

Actors: RESPONDENTS; CSB clients - Personnel; CSB clients - Respondents; CSB clients - data users.

Page 8

Structure of Surveys (questionnaires)

A new survey should be registered in the system. For each survey a questionnaire version shall be created, which is valid for at least one year. If the questionnaire content and/or layout do not change, the current version and its description in the metadata base can be used for the next year.

Each survey contains one or more data entry tables or chapters (data matrices). A chapter can be a constant table, with a fixed number of rows and columns, or a table with a variable number of rows or columns.

For each chapter, the rows and columns have to be described in the metadata base with their codes and names. This information is necessary for automatic generation of the data entry application, data validation, etc.

The last step in describing the questionnaire content and layout is cell formation. Cells are the smallest data units in survey data processing. A cell is created as the combination of a row and a column on the survey version side and a variable on the indicators and attributes side.
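The chapter, row, column and cell relationships described above can be sketched as a small data structure. This is an illustrative model only: the class names, the `link` and `unlinked` helpers and the sample codes are assumptions, not the actual ISDMS data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cell:
    """Smallest data unit: one row/column position linked to one variable."""
    row_code: str
    column_code: str
    variable: str


@dataclass
class Chapter:
    """One data matrix of a questionnaire version (hypothetical model)."""
    name: str
    rows: dict[str, str]      # row code -> row name
    columns: dict[str, str]   # column code -> column name
    cells: dict[tuple[str, str], Cell] = field(default_factory=dict)

    def link(self, row_code: str, column_code: str, variable: str) -> Cell:
        # A cell may only be formed from a described row and a described column.
        if row_code not in self.rows or column_code not in self.columns:
            raise KeyError(f"undescribed position [{row_code},{column_code}]")
        cell = Cell(row_code, column_code, variable)
        self.cells[(row_code, column_code)] = cell
        return cell

    def unlinked(self) -> list[tuple[str, str]]:
        # Positions that still lack a variable; the description is incomplete
        # until this list is empty.
        return [(r, c) for r in self.rows for c in self.columns
                if (r, c) not in self.cells]


chapter = Chapter(
    name="Turnover",
    rows={"2010": "Food products", "2020": "Alcoholic beverages, in total"},
    columns={"1": "Total turnover", "2": "Retail trade turnover"},
)
chapter.link("2010", "1", "VARIABLE 1")
remaining = chapter.unlinked()
```

A check such as `unlinked()` is one way a description tool could flag a questionnaire version whose cell formation is not yet complete.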

Page 9

Structure of trade statistics questionnaire (data matrix - fixed table)

Name of questionnaire, index, code, corroboration date, Nr.
Respondent's (object) code, name and address
Period (year, quarter, month)
Name of chapter

Goods and commodity groups | Row code | Total turnover (2, 3, 4) | Retail trade turnover | Public catering turnover | Wholesale trade
A | B | 1 | 2 | 3 | 4
Goods, in total (2010, 2020, 2030-2190) | 2000 | 15000 | 9000 | 5000 | 1000
Food products (except alcoholic beverages and tobacco goods) | 2010 | 12000 | 5600 | 6000 | 400
Alcoholic beverages, in total | 2020 | 3000 | 2000 | 400 | 600
of which: spirits and liqueurs, whisky, long drinks | 2021 | 500 | 300 | 100 | 100
of which: wines | 2022 | 1000 | 500 | 200 | 300

CELL [2010, 1] -> VARIABLE 1 = INDICATOR 1 + ATTRIBUTE

Metadata repository: common table of statistical indicators, table of attributes (classifications) and table of created variables.
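The "(2, 3, 4)" annotation on the Total turnover column suggests a validation rule of the kind the system could derive from metadata: column 1 should equal the sum of columns 2, 3 and 4. The sketch below checks that reading against the figures on this slide. The rule encoding (`RULE`) and the function name are hypothetical, and the interpretation of the annotation as a sum is an assumption.

```python
# Row code -> values of columns 1..4, copied from the trade questionnaire example.
ROWS = {
    "2000": (15000, 9000, 5000, 1000),
    "2010": (12000, 5600, 6000, 400),
    "2020": (3000, 2000, 400, 600),
    "2021": (500, 300, 100, 100),
    "2022": (1000, 500, 200, 300),
}

# Hypothetical metadata description of the rule "column 1 = columns 2 + 3 + 4".
RULE = {"total": 1, "parts": (2, 3, 4)}


def check_total_rule(rows, rule):
    """Return (row code, reported total, computed total) for each failing row."""
    failures = []
    for code, values in rows.items():
        reported = values[rule["total"] - 1]          # columns are 1-based
        computed = sum(values[c - 1] for c in rule["parts"])
        if reported != computed:
            failures.append((code, reported, computed))
    return failures


failures = check_total_rule(ROWS, RULE)
```

With the slide's figures, `failures` is empty, so every row satisfies the assumed rule.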

Page 10

1. Data matrix - fixed number of rows (3) and variable number of columns (n)

(Example) Main economic indicators of economic activity

Row heading | Row's code | Total | Name 1 | Name 2 | ... | Name n-1 | Name n
A | B | 9999 | NACE 1 code | NACE 2 code | ... | NACE n-1 code | NACE n code
Number of employees | 1110 | | | | ... | |
Net turnover | 1120 | | | | ... | |
Other income | 1130 | | | | ... | |

Page 11

2. Data matrix - fixed number of columns (3) and variable number of rows (n)

(Example) Production of industrial products

Name of product | Product code (PRODCOM or CN code) | Produced, in natural units | Sold, in natural units | Income in lats (LVL)
A | B | 1 | 2 | 3
Product 1 | 1234567 | | |
Product 2 | 2345678 | | |
... | ... | ... | ... | ...
Product n-1 | 4567890 | | |
Product n | 5678901 | | |

Page 12

Creating of variables

INDICATOR + ATTRIBUTES (CLASSIFICATORS) = VARIABLES

Example:

Number of employees + no attribute = Number of employees, total

Number of employees + Local kind of activity (NACE) = Number of employees in breakdown by kind of activity (~300 variables)

Number of employees + Regional code (ATVK or NUTS) = Number of employees in breakdown by regions (~26 variables)

Dimensions (vectors) of indicators
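The rule "indicator + attribute = variables" can be sketched as a function that produces one variable per classification code, or a single total when no attribute is attached. The function name, the naming scheme for variables and the three sample codes are hypothetical; real NACE or ATVK classifications would yield the roughly 300 and 26 variables mentioned on the slide.

```python
from typing import Optional


def create_variables(indicator: str,
                     attribute: Optional[str] = None,
                     codes: Optional[list[str]] = None) -> list[str]:
    """Combine one indicator with one attribute (classification).

    Without an attribute the result is the single 'total' variable;
    with an attribute there is one variable per classification code.
    """
    if attribute is None:
        return [f"{indicator}, total"]
    return [f"{indicator} | {attribute}={code}" for code in (codes or [])]


# Hypothetical three-code classification, only to keep the example short.
sample_codes = ["A", "B", "C"]

total_variable = create_variables("Number of employees")
by_activity = create_variables("Number of employees", "NACE", sample_codes)
```

The number of variables created always equals the number of codes in the attached classification, which is why a detailed classification multiplies one indicator into hundreds of variables.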

Page 13

Dimensions of objects and indicators (example)

Number of employees, total: 100

Number of employees in breakdown by kind of activity:
NACE 1 | NACE 2 | NACE 3 | NACE 4
55 | 35 | 5 | 5

Number of employees in breakdown by regions:
Region 1 | 60
Region 2 | 25
Region 3 | 15

Main dimensions (vectors) of respondents (objects O(t)): NACE, REGIONS (Territory), OWNERSHIP AND ENTREPRENEURSHIP, EMPLOYEES GROUP, TURNOVER GROUP

Dimensions (vectors) of indicators
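In this example each breakdown of the indicator adds up to the same total (55 + 35 + 5 + 5 = 100 and 60 + 25 + 15 = 100). A consistency check of that kind might look like the following sketch; the function name is hypothetical and the figures are the ones on the slide.

```python
# Figures taken from the example on this slide.
total_employees = 100
by_activity = {"NACE 1": 55, "NACE 2": 35, "NACE 3": 5, "NACE 4": 5}
by_region = {"Region 1": 60, "Region 2": 25, "Region 3": 15}


def breakdown_matches_total(breakdown: dict[str, int], total: int) -> bool:
    """A breakdown along one dimension should sum to the undivided total."""
    return sum(breakdown.values()) == total


activity_ok = breakdown_matches_total(by_activity, total_employees)
region_ok = breakdown_matches_total(by_region, total_employees)
```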

Page 14

Integrated Metadata Driven Quasi Process Oriented Technology

[Diagram: process-oriented data flow shared by all surveys; the process-oriented approach is shown in rectangles]

Inputs: SURVEY 1, SURVEY 2, ..., SURVEY N, all entering through a standardized data entry interface; respondent list; Business Register; metadata entry.

Databases: META database, MICRO database, MACRO database.

Procedures: data validation procedure, data aggregation procedure, import-export for procedures outside ISDMS, export for procedures outside ISDMS.

Outputs: standardized output data dissemination interface; data output and dissemination for SURVEY 1, SURVEY 2, ..., SURVEY N.

Page 15

Metadata base link with Microdata and Macrodata bases

[Diagram: description steps held in the META DATA BASE (REPOSITORY) and their links to the MICRO and MACRO databases]

1. General description of survey
2. Description of survey version
3. Description of chapters (data matrix)
4. Description of rows and columns
5. Selecting indicators
6. Selecting attributes
7. Creating of variables
8. Linking variables to cells
9. Generation of form for data entry (automatically) -> MICRO DATABASE
10. Defining of data aggregation rules
11. Data aggregation function (automatically) -> MACRO DATABASE

IMPORT / EXPORT
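The step from defined aggregation rules to an automatic aggregation function can be sketched as follows. This is an assumption-laden illustration, not the ISDMS algorithm: the rule format, the field names and the four microdata records are all invented for the example.

```python
from collections import defaultdict

# Hypothetical microdata: one record per respondent.
MICRODATA = [
    {"respondent": "R1", "nace": "NACE 1", "region": "Region 1", "employees": 40},
    {"respondent": "R2", "nace": "NACE 1", "region": "Region 2", "employees": 15},
    {"respondent": "R3", "nace": "NACE 2", "region": "Region 1", "employees": 20},
    {"respondent": "R4", "nace": "NACE 2", "region": "Region 3", "employees": 25},
]

# Hypothetical aggregation rules as they might be stored in a metadata base:
# which indicator to sum, and along which attribute (None = overall total).
AGGREGATION_RULES = [
    {"indicator": "employees", "group_by": None},
    {"indicator": "employees", "group_by": "nace"},
    {"indicator": "employees", "group_by": "region"},
]


def aggregate(microdata, rule):
    """Sum one indicator over the microdata, grouped as the rule describes."""
    totals = defaultdict(int)
    for record in microdata:
        key = "total" if rule["group_by"] is None else record[rule["group_by"]]
        totals[key] += record[rule["indicator"]]
    return dict(totals)


# "Macrodata": one aggregated table per rule, keyed by (indicator, attribute).
MACRODATA = {
    (rule["indicator"], rule["group_by"]): aggregate(MICRODATA, rule)
    for rule in AGGREGATION_RULES
}
```

The same `aggregate` function serves every rule, so a new breakdown is obtained by adding a rule rather than writing new code.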

Page 16

Data entry and validation

[Diagram: data entry and validation paths into the Microdata base]

Preparation (META DATA BASE, BUSINESS REGISTER): description of data entry forms, description of validation rules, creating the list of respondents.

Entry paths: standard data entry and validation; mass data entry; data import from files; Web data entry and validation.

Intermediate stores: RAW Web database (outside the firewall), RAW database.

Validation: data validation, Web data validation, full data validation.

Result: data transfer to the MICRO DATA BASE.

Page 17

ISDMS USERS in CSB of Latvia

[Diagram: CSB headquarters in Riga (~200 ISDMS users) connected on-line (2 Mbit/sec) with four Data Collection and Processing Centres (~20 users each)]

Page 18

LESSONS LEARNED

The design of a new information system should be based on the results of a deep analysis of the statistical processes and data flows.

Clear objectives have to be set, discussed and approved by all parties involved: statisticians, IT personnel and administration.

Page 19

LESSONS LEARNED

In the design and implementation of a metadata driven integrated statistical information system, both parties, statisticians and IT specialists, should be involved from the very beginning.

Both parties need a clear understanding of all statistical processes that the system will cover, as well as of the meaning and role of metadata within the system from both the production and the user side.

Page 20

LESSONS LEARNED

The initiative to move from the classical stove-pipe production approach to a process-oriented one has to come from the statisticians, not from IT personnel or administration.

Motivating the statisticians to move from the existing to the new data processing environment is essential.

Improving knowledge about metadata is one of the most important tasks throughout the design and implementation phases of the project.

Page 21

LESSONS LEARNED

A clear division of tasks and responsibilities between statisticians and IT personnel is key to a successful implementation.

To achieve the best performance of the entire system, it is important to organize the execution of the statistical processes in the right sequence.

The design of new surveys, and of questionnaires in particular, as well as changes to existing ones, should be done in accordance with the system requirements.

Page 22

LESSONS LEARNED

As a result of the feasibility study, we clearly understood that some steps of statistical data processing for different surveys defy standardization. Some surveys may require complementary functionality (non-standard procedures) that is needed only for processing the data of that particular survey.

To handle the non-standard procedures, interfaces for data export from and import to the system have been developed, enabling the use of standard statistical data processing software packages and other generalized software available on the market.
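An export/import interface of this kind can be as simple as a flat file that external packages read and write. The sketch below round-trips cell values through CSV; the field layout and function names are hypothetical, and the source does not state which file formats the ISDMS interfaces actually use.

```python
import csv
import io

# Hypothetical flat layout: one line per cell value.
FIELDS = ["respondent", "row_code", "column_code", "value"]


def export_cells(cells, stream):
    """Write cell values as CSV so that an external package can process them."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(cells)


def import_cells(stream):
    """Read cell values back, restoring the numeric type of 'value'."""
    reader = csv.DictReader(stream)
    return [{**row, "value": int(row["value"])} for row in reader]


cells = [
    {"respondent": "R1", "row_code": "2010", "column_code": "1", "value": 12000},
    {"respondent": "R1", "row_code": "2020", "column_code": "1", "value": 3000},
]

buffer = io.StringIO()          # stands in for a file on disk
export_cells(cells, buffer)
buffer.seek(0)
round_trip = import_cells(buffer)
```

Keeping row and column codes in the exported file lets the import step place results back into the correct cells without survey-specific code.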

Page 23

LESSONS LEARNED

It is necessary to establish and train a special group of statisticians who will maintain the metadata base and be responsible for the accuracy of the metadata.

For the administration and maintenance of the system it is necessary to have well-trained IT staff familiar with MS SQL Server 2000 administration, MS Analysis Services, other MS tools, the PC-Axis family of products, and the system data model and applications.

