ABS Statistical Databases
Session 6
Mark Viney
Australian Bureau of Statistics
6 June 2007
"Stove Pipe" approach
[Diagram: three separate INPUT, THRUPUT, OUTPUT chains side by side]
"Clearing-House" Approach
[Diagram: several INPUT and OUTPUT boxes connected through standardised interfaces to a single THRUPUT stage; labelled IDW and ABSIW]
e-Census
e-Census 2006
- Conducted the 2006 Population Census with the option of electronic submission of responses
  - drop-off/pick up
  - drop-off/mail back in 2011
- 10.2% of returns were electronic
  - no edits incorporated into the electronic form
  - fewer visits to pick up paper forms
  - fewer paper forms
    - less scanning/repair
ABS Secure Deposit Box
Secure Deposit Box
- An externally facing database that allows respondents to lodge their raw data electronically
  - Excel spreadsheet (essentially replacing a paper form)
  - Administrative datasets
ABS Statistical Databases
- ABS Input Data Warehouse (ABS IDW)
- ABS Information Warehouse (ABSIW)
ABS Input Data Warehouse (ABS IDW)
Input Data Warehouse
- Used as a repository for data as soon as it is entered into ABS computer systems
  - Initially used for data received electronically
  - Now used to load (and process) survey data
Input Data Warehouse
- Structure
  - Star schema
    - 1 fact table and several dimension tables
    - each data cell is stored as 1 row in the fact table
Star Schema
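As a rough illustration of the structure described above (one fact table, several dimension tables, one fact row per data cell), the following sketch builds a tiny star schema in an in-memory SQLite database. All table names, column names and sample values are hypothetical; the slides do not show the actual IDW schema.

```python
import sqlite3

# Hypothetical table and column names; the real IDW schema is not shown in the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_collection (collection_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_data_item  (item_id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE dim_period     (period_id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE fact_cell (
        collection_id INTEGER REFERENCES dim_collection(collection_id),
        item_id       INTEGER REFERENCES dim_data_item(item_id),
        period_id     INTEGER REFERENCES dim_period(period_id),
        respondent_id INTEGER,
        value         REAL
    );
""")
conn.execute("INSERT INTO dim_collection VALUES (1, 'Example survey')")
conn.executemany(
    "INSERT INTO dim_data_item VALUES (?, ?)",
    [(1, "Turnover"), (2, "Employment")],
)
conn.execute("INSERT INTO dim_period VALUES (1, '2007-Q1')")

# One respondent's form with two data cells becomes two rows in the fact table.
conn.executemany(
    "INSERT INTO fact_cell VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1001, 250.0), (1, 2, 1, 1001, 12.0)],
)


def cells_for_respondent(respondent_id):
    """Return (data item label, period label, value) for each cell of a respondent."""
    return conn.execute(
        """
        SELECT i.label, p.label, f.value
        FROM fact_cell f
        JOIN dim_data_item i ON i.item_id = f.item_id
        JOIN dim_period p ON p.period_id = f.period_id
        WHERE f.respondent_id = ?
        ORDER BY i.item_id
        """,
        (respondent_id,),
    ).fetchall()
```

Storing one cell per row means a new data item only needs a new dimension row, not a new column, which is one common reason for choosing this layout.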
ABS Input Data Warehouse - What it allows us to do
- Keep a historical record of what each cell was at every point in the processing
  - reason for the change
  - when it changed
  - who changed it
  - change in value
- Ready access to both current and historical data
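The per-cell history described above (reason, when, who, change in value) can be pictured as an append-only list of change records. The sketch below is a minimal in-memory illustration only; the class and field names are hypothetical and it assumes changes are recorded in time order.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CellChange:
    changed_at: datetime        # when it changed
    changed_by: str             # who changed it
    reason: str                 # reason for the change
    old_value: Optional[float]  # None for the initial load
    new_value: float

    @property
    def delta(self):
        """Change in value, or None when there was no previous value."""
        return None if self.old_value is None else self.new_value - self.old_value


class CellHistory:
    """Append-only history for one data cell (assumes changes arrive in time order)."""

    def __init__(self):
        self._changes = []

    def record(self, new_value, changed_at, changed_by, reason):
        old = self._changes[-1].new_value if self._changes else None
        self._changes.append(CellChange(changed_at, changed_by, reason, old, new_value))

    def changes(self):
        return list(self._changes)

    def current(self):
        """Latest value of the cell, or None if nothing has been loaded."""
        return self._changes[-1].new_value if self._changes else None

    def value_at(self, when):
        """Value the cell held at time `when`, or None if it did not exist yet."""
        value = None
        for change in self._changes:
            if change.changed_at <= when:
                value = change.new_value
        return value
```

Because nothing is overwritten, the same structure answers both "what is the value now" and "what was it at a given point in processing".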
ABS Input Data Warehouse - What it allows us to do
- A data store for use with:
  - editing
  - imputation
  - winsorisation
  - estimation
- Quick, easy analysis and confrontation of data:
  - across time
  - across data items
  - across data sources
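For readers unfamiliar with winsorisation, one of the processing steps listed above, the sketch below shows the simplest one-sided form: values above a cutoff are pulled back to the cutoff so that a few extreme responses do not dominate an estimate. This is a generic illustration; the slides do not describe the ABS method, and the function name and cutoff are assumptions.

```python
def winsorise(values, cutoff):
    """One-sided winsorisation: replace any value above `cutoff` with `cutoff`."""
    return [min(v, cutoff) for v in values]


# Example: a single extreme response is capped, the others are unchanged.
reported = [10, 20, 500]
capped = winsorise(reported, cutoff=100)
```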
ABS Input Data Warehouse - Flow of Information
[Diagram: data from Collection 1, Collection 2 and Collection 3; Processing (Editing, Imputation, Winsorization); Output]
What we hope to achieve from IDW
- Reduced costs
- Improved data quality
- Tools to assist with management of data providers
- Better understanding of editing processes
  - Significance Editing
- One single source of microdata
  - for all statistical collections
- Well managed and secure data storage
ABS Information Warehouse (ABSIW)
ABS Information Warehouse
- Need to make both data and metadata:
  - Visible
  - Relatable
  - Accessible
  - Understandable
  - Reliable
  - Media Independent
ABS Information Warehouse
- Visible
  - central known location
- Relatable
  - across collections
- Accessible
  - tools to allow extraction and manipulation
ABS Information Warehouse
- Understandable
  - data fully described by metadata
- Reliable
  - single source
  - high availability
- Media Independent
  - single source for outputs
    - paper publications
    - electronic releases
    - ad hoc requests
ABS Information Warehouse
- Define and manage metadata
- Load lightly aggregated data
- Validate data as compliant with metadata
- Manipulate data
- Produce statistical outputs
- Make data publicly available
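To make "validate data as compliant with metadata" concrete, the sketch below checks incoming records against a small metadata dictionary: every data item must be defined, classified items must use an allowed code, and typed items must have the expected type. The metadata layout and the function name are hypothetical, not the ABSIW's actual rules.

```python
def validate_records(records, metadata):
    """Return a list of (row index, field, problem) for records that break the metadata."""
    problems = []
    for i, record in enumerate(records):
        # Every data item defined in the metadata must be present and valid.
        for field, rule in metadata.items():
            if field not in record:
                problems.append((i, field, "missing"))
            elif "allowed" in rule and record[field] not in rule["allowed"]:
                problems.append((i, field, "not in classification"))
            elif "type" in rule and not isinstance(record[field], rule["type"]):
                problems.append((i, field, "wrong type"))
        # Data items that the metadata does not describe are also rejected.
        for field in record:
            if field not in metadata:
                problems.append((i, field, "undefined data item"))
    return problems


# Hypothetical metadata: one classified item and one numeric item.
example_metadata = {
    "state": {"allowed": {"NSW", "VIC", "QLD"}},
    "turnover": {"type": (int, float)},
}
```

An empty result means the load can proceed; anything else is reported back before the data reaches the warehouse.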
ABS Information Warehouse - Flow of information
[Flow diagram spanning Processing System, Information Warehouse and PPW]
- Data from a collection
- Load info on how to categorize data
- Load info on what data items mean
- Load info about collection
- Load data to the ABSDB Closed DB
- Sign-off data to the ABSDB Open DB
- Disseminate output tables
- Derive ad-hoc client data requests
- Disseminate time series
ABS Information Warehouse - Define and Manage Metadata
- Interfaces to manage metadata
  - load, amend, validate, extract
  - data items, classifications, collections, datasets, publications
- Application Program Interfaces (API) to link with other systems/programs
  - increasingly using XML
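As an illustration of exchanging metadata as XML between systems, the sketch below serialises a data item and its classification to XML and reads it back, using Python's standard `xml.etree.ElementTree`. The element and attribute names (`dataItem`, `classification`, `code`) are invented for the example; the slides do not show the actual ABS XML format or API.

```python
import xml.etree.ElementTree as ET


def data_item_to_xml(item_id, label, classification, codes):
    """Serialise one data item (hypothetical layout) to an XML string."""
    root = ET.Element("dataItem", id=item_id)
    ET.SubElement(root, "label").text = label
    cls = ET.SubElement(root, "classification", name=classification)
    for code, description in codes:
        ET.SubElement(cls, "code", value=code).text = description
    return ET.tostring(root, encoding="unicode")


def data_item_from_xml(xml_text):
    """Parse the XML produced by data_item_to_xml back into a dictionary."""
    root = ET.fromstring(xml_text)
    cls = root.find("classification")
    return {
        "id": root.get("id"),
        "label": root.findtext("label"),
        "classification": cls.get("name"),
        "codes": [(c.get("value"), c.text) for c in cls.findall("code")],
    }
```

A round trip through both functions should return the original values, which is a simple check that sender and receiver agree on the layout.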
ABS Information Warehouse - Loading data
- Load data from major sources
  - Input Data Warehouse
  - SAS
  - FAME
  - SuperCROSS
ABS Information Warehouse - Generating New Data Cubes
- Passing data through one or more steps to derive a new table
  - aggregation
  - drop data items
  - calculate new items
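The three kinds of step listed above (aggregation, dropping data items, calculating new items) can be chained to derive a new table from an existing one. The sketch below uses plain Python dictionaries as rows; the function names and the sample data items are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def aggregate(rows, keep_dims, measures):
    """Sum `measures` over all rows that share the same values of `keep_dims`."""
    totals = defaultdict(lambda: dict.fromkeys(measures, 0))
    for row in rows:
        key = tuple(row[d] for d in keep_dims)
        for m in measures:
            totals[key][m] += row[m]
    return [dict(zip(keep_dims, key), **vals) for key, vals in totals.items()]


def drop_items(rows, items):
    """Remove the named data items from every row."""
    return [{k: v for k, v in row.items() if k not in items} for row in rows]


def add_item(rows, name, func):
    """Calculate a new data item for every row from its existing items."""
    return [dict(row, **{name: func(row)}) for row in rows]


def derive_state_cube(rows):
    """Example pipeline: aggregate to state level, derive a ratio, drop an input item."""
    cube = aggregate(rows, ["state"], ["turnover", "employees"])
    cube = add_item(cube, "turnover_per_employee",
                    lambda r: r["turnover"] / r["employees"])
    return drop_items(cube, {"employees"})
```

Each step takes a table and returns a new one, so steps can be applied in any order that makes sense for the output being produced.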
ABS Information Warehouse - Other Manipulations
- Seasonal Adjustment
  - SeasABS (X-11)
- Chain Volume Measures
  - FAME (time series)
- Supertables
- Confidentialisation
  - Disclosure Avoidance Analysis System
ABS Information Warehouse - Data Delivery
- Data combined with metadata
- Output formats tailored to specific use
  - spreadsheets
  - time series
  - supertables
  - paper publications
  - electronic release
ABS Information Warehouse - Public Release
- Make data available on an internally accessible database at a predetermined time (usually 11:30 am Canberra time)
  - This data is then available to ABS Statistical Consultants to satisfy customer requests
- Feed data to website
  - www.abs.gov.au
ABS Website
www.abs.gov.au
National Data Network (NDN)
www.nationaldatanetwork.org
We assist and encourage informed decision making, research and discussion within governments and the community, by leading a high quality, objective and responsive national statistical service
Australian Bureau of Statistics
National Data Network
- Website that raises visibility of statistical data
  - regardless of publishing agency
- A national platform for acquiring, sharing and integrating data relevant to policy and research in Australia
National Data Network
- One central website
  - descriptions of data
  - quality statement
  - references to other data
- Several websites (Nodes) owned and maintained by other agencies
www.nationaldatanetwork.org
National Data Network
National Data Network
- Current Focus
  - Publish / Search / Acquire
- Future Focus
  - Design / Capture / Process
  - Analyse / Report
Questions?