Data archiving in Canada: problems and prospects
Presentation to NRF
by Laine GM Ruus
University of Toronto. Data Library Service
16/05/2002
Outline
• Why archive data?
• Current problems in Canada (and a personal view of some solutions)
• Current trends, in problems and solutions
Why archive data?
• Promotes research ethics/research integrity
• Enables replication analysis
• Allows re-analysis with refined or new social theories
• Allows re-analysis with new techniques (e.g. bootstrapping)
Why archive data (cont’d)
• Allows comparative analysis, similar data with new universes
• Enables analysis of change, similar data at different points in time
• Maximizes return on initial investment in data collection
• Enables training of policy makers with local data
Why archive data (cont’d)
• Promotes increased numeracy in the population
• Preservation for future generations of an aspect of our culture not measured by other means.
Current climate for data archiving in Canada
• 3 major data producers: government, academia, commercial sector
• Copyright Act:
– Crown copyright: government-produced data and information products belong to the Queen
• Government information policy:
– set by Treasury Board for government, without consultation with non-government sectors
Government sector as a data producer
• Statistics Canada is major data collector for socio-economic data
• Government has no data archive
• National Archives has not collected data over the last 15 years
• Treasury Board policy since ca. 1984 treats government data and information products as a commodity
Academia as a data producer
• SSHRCC has data deposit regulations, but no enforcement
• CIHR (formerly MRC) and NSERC have no data deposit regulations
• individual university research funds have no data deposit regulations
• no national data archive
Academia as a data producer (cont’d)
• no tradition in academic sector of depositing research data
• attitude of individual ownership vis-à-vis data files
• history of using US data
• no tradition of citing data files in publications
Academia as a data producer (cont’d)
• few periodical editors require citation of data files
• no tradition among tenure boards of treating creation of a data file as equivalent to publication
• only the commercial value of software and applications is awakening universities to their rights under the Copyright Act
Commercial sector as a data producer
• Subject to Copyright Act and new Personal Information Acts
• No national data archive
• No uniformity in approach to archiving their data
• No national body with which to negotiate arrangements
Solutions to the Canadian problems (a very personal view)
• Canada needs a national information policy
• Canada needs a national data archive
• Government (all sectors) needs government data archives
• Need to promote a culture of data deposit and data sharing in the academic sector (SSHRCC, CIHR, NSERC, etc.)
Solutions to the Canadian problems (a very personal view) cont’d
• Need to educate hiring bodies and tenure boards that data file creation is a valuable academic activity
• Need to sell commercial sector on benefits of data archiving and data sharing
• Need to promote numeracy in the population.
Recent trends in the data archiving/data service sector
• Longitudinal data
• Research data centres
• DDI/DTD
• WWW data extractor interfaces
• GIS
• Proliferating formats
Recent trends: longitudinal data
• Data producers increasingly collecting longitudinal/panel data
• Enhanced capability to test theories re social change over time
• Increased problems of preserving privacy and confidentiality
• Requires more sophisticated research techniques
Recent trends: research data centres
• Secure access to more detailed or sensitive data
• Creates segregation of research capabilities (data haves vs data have nots)
• Data producers less likely to produce public use microdata files
Recent trends: DDI/DTD
• Data Documentation Initiative (DDI) Document Type Definition (DTD)
<http://www.icpsr.umich.edu/DDI/>
• A standard format for metadata describing microdata files
• Being expanded to encompass aggregate and time-series data
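To make the idea of machine-readable metadata concrete, here is a minimal sketch of reading a codebook with Python's standard library. The XML fragment is hypothetical and heavily simplified: the study title and variables are invented for illustration, the element names are assumed to follow DDI codebook conventions, and nothing here is validated against the actual DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified codebook fragment. Element names are assumed to
# follow DDI codebook conventions; the study and variables are invented.
CODEBOOK = """\
<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Household Survey, 2001</titl>
      </titlStmt>
    </citation>
  </stdyDscr>
  <dataDscr>
    <var name="AGE"><labl>Age of respondent in years</labl></var>
    <var name="PROV"><labl>Province of residence</labl></var>
  </dataDscr>
</codeBook>
"""


def summarize_codebook(xml_text):
    """Return the study title and a {variable name: label} mapping."""
    root = ET.fromstring(xml_text)
    # Study-level metadata: the title sits under the citation block.
    title = root.findtext("stdyDscr/citation/titlStmt/titl")
    # Variable-level metadata: one <var> element per column in the data file.
    variables = {var.get("name"): var.findtext("labl") for var in root.iter("var")}
    return title, variables


title, variables = summarize_codebook(CODEBOOK)
print(title)
for name, label in variables.items():
    print(f"{name}: {label}")
```

Because study-level and variable-level descriptions live in one structured file, the same document can serve a human-readable codebook and software that lists or selects variables.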
Recent trends: data extractors
• Data extractors provide access to data/analyses of data via Internet protocols
• Enabled by development of DDI/DTD
• Selected data extractors linked at: <http://www.chass.utoronto.ca/datalib/misc/dli/extracts.htm>
Recent trends: data extractors (cont’d)
• Two major data extractor developments:
– NESSTAR <http://www.nesstar.org>
– Virtual Data Center project <http://TheData.org>
Recent trends: GIS
• Geographic information systems
• New theoretical models based on spatial analysis
• New software capable of spatial analysis (ArcGIS)
• Increasing demand for geocoded aggregate and microdata
The nice thing about standards is that there are so many to choose from!
Recent trends: proliferation of formats
• Data archiving becoming more difficult
• Many new proprietary formats and flavours to deal with
• Increasing number of formats for which we have not yet developed preservation formats, e.g. GIS shapefiles, relational databases, etc.
To finish...
• Without data archives, we will lose about 50 years of our culture
• Successful long-term preservation of our electronic culture will partly depend on bringing copyright legislation, internationally, into the 21st century
• Data archiving is not a national problem, nor a problem that is unique to any one country. The problems and solutions are similar in all countries. We can all learn which solutions are best and/or worst from each other.
• IASSIST (International Association for Social Science Information Service & Technology) is one of the venues in which we learn from each other.
<http://www.iassistdata.org/>
• This presentation is available at: <http://www.chass.utoronto.ca/~laine/misc/sada02.ppt>