African Commission on Agricultural Statistics
Twenty-second Session
Addis Ababa, Ethiopia
30 Nov – 3 Dec, 2011
Use of the IHSN Microdata Management Toolkit to
Document Agricultural Census Data
Alemayehu Gebretsadik
CSA
Outline
Introduction
Data Collection, Documentation, Capturing, Archiving and Dissemination in the Good Old Days
The Need for Improvement
Why IHSN Microdata Management Toolkit?
IT Data Management Framework
Achievements through the Improvement Process
Documentation
Dissemination
Infrastructure
Achievements Summary
Introduction
• The use of statistical data for better decision making is now recognized
more widely than ever.
• Statistical offices across the world produce statistical data to meet the
ever-growing demand for timely, reliable and well-documented statistical
information.
• The CSA conducts about 11 different types of surveys per annum and
generates data through censuses, usually undertaken at ten-year intervals.
• Many advances have been made in improving the quality and usability of
statistical data through improved data documentation, archiving and
dissemination systems that utilize Information and Communication
Technology (ICT).
Introduction (cont.)
• The CSA has been using ICT to facilitate its data collection, processing,
archiving and dissemination systems so that the required statistical
information can be generated and reach users for better decision making.
• Addressing the importance of documentation, archiving and dissemination in the
Ethiopian National Strategy for the Development of Statistics (NSDS) can be
considered a critical step in guaranteeing the sustainability of these activities.
Data Collection, Documentation, Capturing, Archiving
and Dissemination in the Good Old Days
• Data collection was entirely paper based
• The IBM series 12k CPU
• HP 3000/Series 44 system processor unit with 1 MB memory
• Standalone PC systems were used at the CSA until 2004, and resource
sharing and efficient communication were a serious problem.
• 1.44 MB floppy diskettes were considered the efficient means of
transferring files or documents among professionals.
• In addition, there was no centralized management of the system,
which hindered data security and data management.
• Little attention was given to metadata documentation
• There was no integration between microdata and metadata
Data Collection, Documentation, Capturing, Archiving
and Dissemination in the Good Old Days
• Access to data was difficult and the dissemination system was poor
• Utilization of data was difficult because of poor documentation
• There was no electronic method of dissemination
• Little consideration was given to using international standards for
metadata documentation
• Archiving was done in a decentralized manner
=> User dissatisfaction
=> Difficulties in maintaining institutional memory
The Need for Improvement
•Central Statistical Agency priorities:
• Improve the data collection, management and dissemination system
• Effective use of ICT
• Better profile for ICT and GIS activities in the Agency's
organizational structure
• Initiated Central Databank Project
• Establishment of a socio-economic database
• Deployment of an archiving and dissemination system
•To ensure compliance with international standards:
• IHSN Microdata Management Toolkit (DDI and Dublin Core)
• UNICEF DevInfo (SDMX, DDI, Dublin Core, ISO)
• FAO CountrySTAT
The Need for Improvement
• Strengthening of the ICT framework
• Take full advantage of new technologies for data management:
• GPS, satellite imagery, PDAs, scanners
• Internet, RDBMS, GIS, and electronic methods of data
archiving and dissemination
Provision of data also involves:
• Harmonizing and integrating statistical data,
• Filling the gap between data produced and data available,
• Laying down efficient ICT infrastructure,
• Improving the quality and comparability of data,
• Addressing the challenges of data and metadata exchange,
• Adopting the standards available in data management
The Need for Improvement
In general, improvement of the CSA’s internal ICT capacity focused on
the following:
• Looking for new tools to improve data capturing;
• Development of an integrated Central Data Bank of survey
and other data, as well as creation of an Ethiopian Socio-Economic
Database for basic indicators;
• Development of database management systems;
• Strengthening the Local Area Network (LAN) and developing a
Wide Area Network (WAN) to connect branch offices;
• Web site development;
• CD-ROM publishing;
• A comprehensive program of documentation of existing and new
data, especially related to socio-economic indicators;
• Utilization of GIS for georeferencing, spatial data analysis and
other referencing of new and existing data;
• Appropriate training and capacity building.
Why IHSN Microdata Management Toolkit?
• Statistical data production is an expensive exercise that requires a
great deal of investment in terms of experts’ time and organizational
budget in national statistical agencies
• The return on this investment comes when the data generated are well
utilized by data users
• However, when the utilization of existing data is compared with the
investment in its production, the data are clearly underutilized,
particularly in developing countries like Ethiopia
• In most African countries, for example, making accurate and timely
statistical data available for policy development is as much of a
challenge as the underutilization of already available data, owing to
the very little emphasis given to documentation, archiving and
dissemination systems.
Why IHSN Microdata Management Toolkit?
• The CSA was no exception to this problem: it was very difficult
to obtain survey data and related metadata once the CSA had produced
its basic statistical reports
• Therefore, there was a strong demand within the CSA to improve the
documentation, archiving and dissemination system, both to address
users’ dissatisfaction with data availability and to preserve
institutional memory through a centralized system of well-documented
microdata and metadata
• Accordingly, the CSA has worked very hard on this issue since 2004
and has tried to address most of the associated problems through close
collaboration with the International Household Survey Network
(IHSN) of the World Bank and the Accelerated Data Program (ADP) of
PARIS21/World Bank
Why IHSN Microdata Management Toolkit?
• The assistance obtained from the IHSN and the ADP includes provision
of software (Toolkit, NADA) and capacity building through on-the-job
training
• The Agency has benefited greatly from the IHSN/ADP assistance
described above, coupled with the strong commitment of the CSA and the
financial and technical assistance obtained from development partners
• Ethiopia is considered one of the model countries, having documented
98 of its surveys conducted since 1995
Framework Set to Improve Survey Documentation, Archiving
and Dissemination System at the CSA
• In order to improve the survey documentation, archiving and
dissemination system at the CSA, an IT-based data management
framework has been designed in accordance with international
metadata recommendations and best practices in data archiving, to
facilitate data dissemination and metadata exchange at the global
level
• It has the following basic structure.
IT Data Management Framework
[Diagram: components of the data management framework]
USERS / PRODUCERS NEEDS
DATA MANAGEMENT FRAMEWORK
• DATA PRODUCTION: Planning, collection, cleaning, processing, quality control
• DATA ARCHIVING: Conversion, packaging, confidentiality
• DATA DISSEMINATION: Publish via traditional media, multimedia, web
• ACCESS: Metadata, data and documentation, plus research tools
• ANALYSIS: Retrieve data for analysis, online analysis, technical support
• COLLABORATION: Disseminate and share knowledge, user-producer dialog, feedback
• HARMONIZATION: Standard formats, comparability, multilingual
IT CORE ARCHITECTURE
• HARDWARE: Server, workstations, laptops, CD/DVD, scanners, printers, backup, tools
• SOFTWARE: OS, DBMS, development, web, multimedia, office, statistics, security
• TELECOMS: Intranet, Internet, connectivity, LAN/WAN, security
DATA TOOLS
• SPECIALIZED SOFTWARE AND GUIDELINES: Toolkit, DevInfo, WinISIS, DDP, Nesstar, CSPro, XML
PROJECT MANAGEMENT, ICT UNIT, TRAINING
IT Data Management Framework
• This IT-based archiving and dissemination system is made possible
by the establishment of a central databank that archives all
documentation and microdata obtained from surveys and censuses,
and by the development of a user-friendly system for its dissemination
• The framework relies on specifications such as DDI, Dublin Core and
SDMX and on tools such as the International Household Survey
Network’s Microdata Management Toolkit, the UNICEF DevInfo
package and the FAO CountrySTAT system for food and agricultural
statistics.
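To make the idea of a Dublin Core compliant document catalog concrete, the following is a minimal sketch in Python, not the CSA's or WinISIS's actual implementation. It builds one catalog record using the standard Dublin Core element namespace. The wrapper element name "record", the function name and the sample field values are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a simple XML record from a dict of Dublin Core element -> value."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields.items():
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


# Hypothetical catalog entry, for illustration only.
report = dublin_core_record({
    "title": "Agricultural Sample Survey Report (example)",
    "creator": "Central Statistical Agency",
    "type": "Text",
    "format": "application/pdf",
    "language": "en",
})
xml_text = ET.tostring(report, encoding="unicode")
print(xml_text)
```

Because every field uses a standard element name (title, creator, type, format, language), a record of this kind can be exchanged with any other catalog that understands Dublin Core.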
IT Data Management Framework
[Diagram: tools used at each stage of the framework]
Data processing / analysis
• CSPro
• STATA/SPSS
• Excel
• PCTrade, EpiInfo and other specialized software for data processing and analysis
Production of documents (reports, manuals, questionnaires, etc.)
• MS-Word
• Excel
Data dissemination
• WinISIS: catalog of documents, Dublin Core compliant, XML format
• Toolkit: documented datasets, DDI compliant, XML format
• DevInfo: indicators database (SDMX compliant), SQL/OLAP format
Web development and dissemination tools (SQL server, web server, web development, DDP-Country)
Country Statistical Website
• Online access to the indicators database
• Microdata in SPSS, STATA, SAS and other formats (optional)
• Online analysis of microdata (Nesstar server; optional)
• Searchable document catalog, with documents in PDF
• Other information (contacts, legislation, methods, etc.)
CD-ROMs
Achievements through the Improvement Process
Documentation:
• A Central Databank has been established for the microdata; it contains
over 6000 data and documentation files covering 98 surveys
• 98 surveys have been archived using the IHSN Microdata Management
Toolkit, making the metadata compliant with the Data Documentation
Initiative (DDI) XML specifications, as recommended by the International
Household Survey Network
• This process enabled the CSA to document its surveys thoroughly, down to
the variable level, and to include all the related metadata.
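As an illustration of what variable-level documentation looks like in practice, the sketch below reads a small DDI-style XML fragment and lists each variable with its label and categories. It is a simplified example under stated assumptions: the element names follow the DDI Codebook vocabulary (codeBook, stdyDscr, dataDscr, var, labl, catgry, catValu), but the fragment omits namespaces and most required elements, and the study title, variable names and labels are invented.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment modeled on a DDI Codebook document.
# The study title, variable names and labels are invented for illustration.
DDI_SAMPLE = """
<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt><titl>Example Agricultural Survey</titl></titlStmt>
    </citation>
  </stdyDscr>
  <dataDscr>
    <var name="crop_type">
      <labl>Type of crop grown</labl>
      <catgry><catValu>1</catValu><labl>Cereal</labl></catgry>
      <catgry><catValu>2</catValu><labl>Pulse</labl></catgry>
    </var>
    <var name="area_ha">
      <labl>Cultivated area in hectares</labl>
    </var>
  </dataDscr>
</codeBook>
"""


def list_variables(ddi_xml):
    """Return (study title, {variable name: {"label", "categories"}})."""
    root = ET.fromstring(ddi_xml.strip())
    title = root.findtext("stdyDscr/citation/titlStmt/titl")
    variables = {}
    for var in root.iterfind("dataDscr/var"):
        # Map each category code to its label; empty for continuous variables.
        categories = {
            cat.findtext("catValu"): cat.findtext("labl")
            for cat in var.iterfind("catgry")
        }
        variables[var.get("name")] = {
            "label": var.findtext("labl"),
            "categories": categories,
        }
    return title, variables


title, variables = list_variables(DDI_SAMPLE)
print(title)
for name, info in variables.items():
    print(name, "-", info["label"], info["categories"])
```

Keeping the variable labels and category codes in the same file as the study description is what lets a user interpret the microdata without going back to the original questionnaire.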
Achievements through the Improvement Process
Dissemination:
The CSA website (WWW.CSA.GOV.ET) is the pivot of the CSA dissemination
strategy. Most importantly, the CSA uses its website to:
• Disseminate national statistics and monthly CPI figures
• Provide access to all documentation and related metadata for all of its
surveys archived in the central databank, and
• Serve as a portal to other access points for the CSA’s data, such as the
EthioInfo database, the ENADA system, CountrySTAT, and the price
database.
Achievements through the Improvement Process
• Given its openness to technology as a way to facilitate and extend
the culture of analysis, the CSA became very interested in the simple and
powerful tools developed by the IHSN and the ADP
• One of the key elements of the IHSN/ADP platform for data
dissemination is providing researchers and policy makers with
innovative tools that meet their need for country information
• The web-based tool called National Data Archive (NADA) provides this
facility
• The CSA, recognizing the functionality of this system, uses the NADA
platform as the Ethiopian National Data Archive (ENADA). The ENADA
system will simplify access to the CSA’s data through its cataloging
functionalities
Achievements through the Improvement Process
• The ENADA system will allow access to the DDI files, which provide
information on the CSA’s survey and census metadata.
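The cataloging functionality described above can be pictured with a small sketch of a searchable survey catalog. This is an illustration only and does not use or reproduce the NADA software or its interfaces. The CatalogEntry class, the search function and all catalog entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    title: str
    year: int
    keywords: tuple


# Invented entries, for illustration only; not the actual ENADA catalog.
CATALOG = [
    CatalogEntry("Example Agricultural Sample Survey", 2009, ("agriculture", "crops")),
    CatalogEntry("Example Price Survey", 2010, ("prices", "cpi")),
    CatalogEntry("Example Livestock Survey", 2010, ("agriculture", "livestock")),
]


def search(catalog, term, year=None):
    """Return entries whose title or keywords contain term (case-insensitive)."""
    term = term.lower()
    hits = []
    for entry in catalog:
        in_title = term in entry.title.lower()
        in_keywords = any(term in kw.lower() for kw in entry.keywords)
        if (in_title or in_keywords) and (year is None or entry.year == year):
            hits.append(entry)
    return hits


for entry in search(CATALOG, "agricultur"):
    print(entry.year, entry.title)
```

A real catalog would draw these fields from the archived DDI files rather than from a hand-written list, so that the search results stay consistent with the documented metadata.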
Achievements through the Improvement Process
Data Archiving and Dissemination
• CD-ROM products are available for these surveys, and metadata
and documents are published on the web
The CSA also uses other dissemination systems, such as:
• The EthioInfo database, containing the main MDG indicators,
which is available on CD-ROMs and online;
• The price database, which has been developed and
made available online
Achievements Summary
• The CSA has an official presence on the Internet
• Improved data capturing tools
• Compliance with international metadata standards
• The Central Databank has created an institutional memory
• The information is more secure
• Single point of access for data and documentation
• Micro and macro data and metadata on CD-ROMs and the Internet
• Better user/producer support through improved ICT infrastructure
• Potential new activities: metadata, data and documentation quality
assurance; survey comparability and harmonization; statistical data
disclosure control and confidentiality