Trials and Tribulations of creating DDI Codebooks at
the University of Guelph
A. Michelle Edwards and Carol Perry,
Data Resource Centre,
University of Guelph
Guelph, Ontario
Current Search Function
Search Results
Current Documentation
Identifying Variables
Rationale for Change
522 datasets to date.
No comprehensive metadata search function.
No current variable search within dataset.
Limits researcher’s autonomy.
XML tags
Started with approximately 30 tags.
As of June 5, 2002: 101 tags, of which 59 are filled.
Information contained inside tags
Codebook Templates
Used Maddie to develop initial template.
Edited the template to add tags as required.
Filled in fields common to all codebooks.
Codebook Templates
Statistics Canada data
ICPSR data
B2020 data format
Statistics Canada Codebook
Differences between Codebook Templates
Authoring entity
Distributor (DLI vs. ICPSR)
Licenses
Other material – ICPSR abstract link
B2020: no direct link to database, no variables.
How do we move our information from an HTML readme file to an XML file?
Readme to XML
Document Description
Study Description
Data Files Description
Readme to XML
Currently – copy and paste information from the Readme (HTML) file into the XML Codebook.
Alternative – a script extracts metadata from the HTML and places it into the XML.
Both approaches take about the same amount of time.
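The script-based route can be illustrated with a minimal Python sketch. It is not the script used at Guelph; it assumes a hypothetical readme whose only reliable metadata is the HTML `<title>`, and it writes that title into a skeleton codebook with the three sections named above (`docDscr`, `stdyDscr`, `fileDscr`). The class and function names are invented for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ReadmeTitleParser(HTMLParser):
    """Collects the text inside the <title> element of an HTML readme."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def readme_to_codebook(html_text):
    """Build a skeleton DDI codebook from a readme (hypothetical layout)."""
    parser = ReadmeTitleParser()
    parser.feed(html_text)
    title = parser.title.strip()

    codebook = ET.Element("codeBook")
    # Document and study descriptions both carry the title in
    # citation/titlStmt/titl.
    for section in ("docDscr", "stdyDscr"):
        citation = ET.SubElement(ET.SubElement(codebook, section), "citation")
        titl = ET.SubElement(ET.SubElement(citation, "titlStmt"), "titl")
        titl.text = title
    # The data files description is left empty in this sketch.
    ET.SubElement(codebook, "fileDscr")
    return codebook


sample_readme = (
    "<html><head><title>Sun Exposure Survey 1996</title></head>"
    "<body></body></html>"
)
codebook = readme_to_codebook(sample_readme)
xml_text = ET.tostring(codebook, encoding="unicode")
```

A real readme would need more extraction rules (distributor, abstract, file dimensions), which is where most of the effort in such a script would go.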
Variable Information
Variable Information
Sources of Variable information
Variable names, labels, and position from the SAS program.
Frequencies for each variable value from SAS output.
Variable Information
Sources of Variable information
Literal questions from questionnaires if available.
Variable Information
Script:
Looks into the SAS program – pulls out the variable names, labels and positions.
Looks into a SAS output file for frequencies and variable value labels.
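The first step of such a script can be sketched in Python as follows. This is an illustration under assumptions, not the actual Guelph script: it assumes the SAS program uses column input (`NAME start-end`) in a single INPUT statement and a single LABEL statement with single-quoted labels containing no semicolons. The sample program, variable names, and labels are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical SAS program using column input; real programs may differ.
SAS_PROGRAM = """
data survey;
  infile 'survey.dat';
  input
    AGE 1-2
    SEX 3
    PROV $ 4-5;
  label
    AGE  = 'Age of respondent'
    SEX  = 'Sex of respondent'
    PROV = 'Province of residence';
run;
"""


def parse_sas(text):
    """Return {name: {"start", "end", "label"}} from INPUT and LABEL statements."""
    variables = {}
    input_stmt = re.search(r"\binput\b(.*?);", text, re.I | re.S)
    if input_stmt:
        pattern = r"([A-Za-z_]\w*)\s+\$?\s*(\d+)(?:\s*-\s*(\d+))?"
        for m in re.finditer(pattern, input_stmt.group(1)):
            start = int(m.group(2))
            # A single-column variable has no end position; reuse the start.
            end = int(m.group(3)) if m.group(3) else start
            variables[m.group(1)] = {"start": start, "end": end, "label": ""}
    label_stmt = re.search(r"\blabel\b(.*?);", text, re.I | re.S)
    if label_stmt:
        for m in re.finditer(r"([A-Za-z_]\w*)\s*=\s*'([^']*)'", label_stmt.group(1)):
            if m.group(1) in variables:
                variables[m.group(1)]["label"] = m.group(2)
    return variables


def to_ddi(variables):
    """Write each variable as a DDI <var> with <location> and <labl>."""
    data_dscr = ET.Element("dataDscr")
    for name, info in variables.items():
        var = ET.SubElement(data_dscr, "var", {"name": name})
        ET.SubElement(var, "location",
                      {"StartPos": str(info["start"]), "EndPos": str(info["end"])})
        ET.SubElement(var, "labl").text = info["label"]
    return data_dscr


variables = parse_sas(SAS_PROGRAM)
data_dscr = to_ddi(variables)
```

The regular expressions only work when every program follows the assumed layout, which is why a parser of this kind depends on the SAS programs being written consistently.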
Variable Information
Script:
If questionnaire is available – seeks out questions and matches with variables.
Variable Information
Problems with Script:
SAS programs must be consistent in their format.
Matching variables between SAS output and questionnaires is difficult.
SAS to XML
SAS 8.2 - XML engine and ODS XML.
Can create XML SAS output: variable names, labels, value labels, and frequencies.
Variable positions with the input statement and Proc Print XML.
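Once value labels and frequencies are available, they still have to be placed into the codebook. The Python sketch below shows one way this could look, assuming the frequencies have already been read into `(value, label, count)` tuples by some earlier step; it does not parse SAS XML output itself. The variable, labels, and counts are invented for illustration.

```python
import xml.etree.ElementTree as ET


def add_categories(var, frequencies):
    """Append one DDI <catgry> per (value, label, count) tuple to a <var>."""
    for value, label, count in frequencies:
        catgry = ET.SubElement(var, "catgry")
        ET.SubElement(catgry, "catValu").text = str(value)
        ET.SubElement(catgry, "labl").text = label
        ET.SubElement(catgry, "catStat", {"type": "freq"}).text = str(count)
    return var


# Hypothetical variable and frequency table (values are not real data).
sex_var = ET.Element("var", {"name": "SEX"})
add_categories(sex_var, [(1, "Male", 10), (2, "Female", 12)])
sex_xml = ET.tostring(sex_var, encoding="unicode")
```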
SAS to XML Frequency Output
SAS to XML Proc Print Output
SAS to XML
Advantages:
SAS programs do not need to be consistent.
Use one program from start to finish – SAS.
Still in development.
XML to Viewable Document
Saxon – to render our XML documents to HTML using XSL Stylesheets.
XSL – pull out info from XML document and display with HTML tags.
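The idea of pulling information out of the XML and wrapping it in HTML tags can be shown with a small Python stand-in. This is not XSL and does not use Saxon; it only mirrors what a stylesheet template for the variable list might do. The sample codebook fragment and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical codebook fragment with two variables.
CODEBOOK_XML = """<codeBook>
  <dataDscr>
    <var name="AGE"><labl>Age of respondent</labl></var>
    <var name="SEX"><labl>Sex of respondent</labl></var>
  </dataDscr>
</codeBook>"""


def render_variables(xml_text):
    """Return an HTML table with one row per <var>: name and label."""
    root = ET.fromstring(xml_text)
    rows = []
    for var in root.iter("var"):
        name = escape(var.get("name", ""))
        label = escape(var.findtext("labl", default=""))
        rows.append("<tr><td>{}</td><td>{}</td></tr>".format(name, label))
    return "<table>\n" + "\n".join(rows) + "\n</table>"


variables_html = render_variables(CODEBOOK_XML)
```

In a stylesheet the same selection would be expressed declaratively, with one template per section of the codebook.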
XSL Templates
Set for each: Statistics Canada, ICPSR, B2020
Initial templates from University of Virginia samples.
XSL Templates
Abstract
Study Info
Methodology & File Dimensions
Questions
Variables & Frequencies
Other Documents
XSL Stylesheets
Search
Uses SAS IntrNet to call and run the UNIX SGREP search.
Creates an XML file with results.
Calls Saxon to render the file with the Variable XSL Stylesheet.
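The search step can be sketched in Python as a stand-in for the SGREP call. The sketch assumes the codebooks are available as XML strings keyed by a study identifier, matches the search term against variable names and labels, and writes the hits to a small results document that a stylesheet could then render. The `results` and `hit` element names, the study identifiers, and the sample variables are all hypothetical.

```python
import xml.etree.ElementTree as ET


def search_variables(codebooks, term):
    """Return a <results> element listing variables that match the term."""
    results = ET.Element("results", {"term": term})
    needle = term.lower()
    for study_id, xml_text in codebooks.items():
        root = ET.fromstring(xml_text)
        for var in root.iter("var"):
            name = var.get("name", "")
            label = var.findtext("labl", default="")
            # Case-insensitive match on either the name or the label.
            if needle in name.lower() or needle in label.lower():
                hit = ET.SubElement(results, "hit",
                                    {"study": study_id, "name": name})
                hit.text = label
    return results


# Hypothetical codebooks for two studies.
codebooks = {
    "study_a": "<codeBook><dataDscr>"
               "<var name='AGE'><labl>Age of respondent</labl></var>"
               "<var name='SUNBURN'><labl>Sunburns last summer</labl></var>"
               "</dataDscr></codeBook>",
    "study_b": "<codeBook><dataDscr>"
               "<var name='AGEGRP'><labl>Age group</labl></var>"
               "</dataDscr></codeBook>",
}
results = search_variables(codebooks, "age")
results_xml = ET.tostring(results, encoding="unicode")
```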
“Final Product”
Frames to put it all together.
Links to each component (abstract, etc.).
Returns the rendered HTML on the fly.
“Final Product”
Sun Exposure Survey 1996
http://tdr.uoguelph.ca/DATA/WWWDOCS/XML/SES2/ses96cbk.html
“Finished Product”
522 datasets to date.
35 Completed DDI-compliant codebooks.
Fall completion ???