Data Management Practices for Early Career Scientists: Closing
Robert Cook
Environmental Sciences Division
Oak Ridge National Laboratory
Oak Ridge, [email protected]
February 3, 2013
NACP Best Data Management Practices, February 3, 2013
Plan for archiving data
“Begin with the end in mind”
• Identified the Data Center
• Collaborated with data center during project
• Communicated:
  – Volume
  – Number of files
  – Special needs
  – Delivery dates
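The volume and number of files to report to the data center can be tallied with a short script. The following is a minimal sketch, not a data center requirement; the function name `summarize_dataset` is hypothetical, and it assumes the data set lives under a single local directory.

```python
import os


def summarize_dataset(root):
    """Return (file_count, total_bytes) for all regular files under root.

    Hypothetical helper: walks the directory tree once and sums file sizes.
    """
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip anything that is not a regular file (e.g. broken links).
            if os.path.isfile(path):
                file_count += 1
                total_bytes += os.path.getsize(path)
    return file_count, total_bytes
```

Calling `summarize_dataset("path/to/project_data")` (the path is a placeholder) returns the two numbers as a tuple, which can be passed along to the data center together with any special needs and delivery dates.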
Followed Fundamental Data Practices
Define the contents of your data files
Use consistent data organization
Use stable file formats
Assign descriptive file names
Preserve information
Perform basic quality assurance
Provide documentation
Protect your data
What to submit to the archive?
• Well-structured data files, with variables, units, and fill values well defined
• Metadata files (optional)
• Document that describes the data set
• Companion files that describe project, protocols, or field sites (photographs)
  – Material from Project Web site or Wiki
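One way to keep variables, units, and fill values well defined is to record them in a small data dictionary and check each data file against it before submission. The sketch below is illustrative only: the variable names (`site_id`, `soil_temp`), the units, the fill value `-9999`, and the function `read_with_fill` are all assumptions, not part of any archive specification.

```python
import csv
import io

# Hypothetical data dictionary: one entry per variable in the data file.
VARIABLES = {
    "site_id": {"units": None, "fill_value": None},
    "soil_temp": {"units": "degC", "fill_value": "-9999"},
}


def read_with_fill(text, variables):
    """Parse CSV text and replace declared fill values with None.

    Raises ValueError if a variable in the dictionary is absent from the file.
    """
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = reader.fieldnames or []
    missing = [name for name in variables if name not in fieldnames]
    if missing:
        raise ValueError("undefined columns: " + ", ".join(missing))
    rows = []
    for record in reader:
        clean = {}
        for name, spec in variables.items():
            value = record[name]
            # A value equal to the declared fill value is treated as missing.
            clean[name] = None if value == spec["fill_value"] else value
        rows.append(clean)
    return rows
```

Keeping the dictionary next to the data file means the same definitions can be reused in the document that describes the data set.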
Data Center: Stewardship and Archive Functions
Ingest
– perform QA checks
– compile project-provided metadata
– generate additional metadata
– convert to archival file formats
Metadata / Documentation
– prepare final metadata record and documentation
Archive / Publish
– generate citation and DOI (digital object identifier)
Exploration and Distribution
– provide tools to explore, access, and extract data
Post-Project Data Support
– provide long-term secure archiving
– serve as a buffer between end users and PIs
– provide usage statistics
Stewardship
– security, disaster recovery
– migration to new computer systems
Workshop Goal
Provide fundamental data management practices that investigators should perform during the course of data collection.
To improve the usability of data sets for:
• You
• Collaborators
• People outside your project
By following the practices taught in this workshop, your data will be
• less prone to error,
• more efficiently structured for analysis, and
• more readily understandable for any future research.
Workshop Sponsors