Open Data Challenges in Interdisciplinary Research
Open Access Week October 2012
Jennifer K. Barton, PhD
Associate Vice President for Research
Professor, Biomedical Engineering
The University of [email protected]
520-621-4116
Current State
• Non-uniform requirements for data sharing.
• Interdisciplinary research has a huge variety of data types that should be shared.
• Highly competitive research environment.
• Researchers largely untrained in how to share data, how to organize data collection.
Non-Uniform Requirements
• Although some funding agencies require a data management and sharing plan (e.g. NSF), others do not (e.g. NIH for awards under $500k/year in direct costs).
• Most journals do not require guaranteed access to study data, even when the subject material of the article clearly cries out for this requirement.
e.g. an article on finite element modeling (FEM) of light propagation through the head: “Readers should contact the senior author if they want a copy of the model.”
• This issue will probably take care of itself…
Data Types
• In a recent paper, we reported the following data: An imaging system with a novel optical design (Zemax optical model, mechanical drawings, SOPs for assembly), house-built software (C++), and house-built electronics (schematics, PCB layout).
Data Types
• In a recent paper, we reported the following data: Data on 114 mice in a chemoprevention experiment. Weights at 5 time points (data table), final gross assessment (photographs), and histological data (over 1000 slides).
Data Types
• In a recent paper, we reported the following data: Optical images of mice at 4 time points (4560 different 3MP images, in raw and processed form).
Data Types
• All boiled down to a couple of key graphs using fairly complicated statistical analysis.
• Each type of data requires different storage and presentation conditions, different expertise to handle.
[Figure: Number of Tumors (axis 0.00 to 6.00) versus Age of Mice (21, 25, 29, 33, and 38 weeks, plus histology) for four groups: AOM No Drug, AOM DFMO, AOM Sulindac, AOM DFMO/Sulindac]
Competitive Environment
• Effective data sharing presumes that all players see the advantage in providing “full disclosure”, and are willing/able to give it.
• Lack of information makes it nearly impossible to replicate results. Common in hardware articles, especially letters. Performance and general concept, but not design details, given.
Competitive Environment
• Whether intentional or unintentional, articles without sufficient data to replicate results do not best advance science.
• Reviewers and editors can insist on increased detail. This is incompatible with letter format.
• Supporting data should be available. Require this of journals? Or link to a neutral-party location?
Parts lists? Software? Wiring diagrams? How to make this truly useful?
How to Share Data?
• Typically, investigators do not know how to share data.
Typical investigator data storage facility
How to Share Data?
• How to make the data useful to others?
Not just a software program, but also installation instructions, user manual.
Not just raw image data, but also information on the instrument configuration, experimental conditions, image characteristics.
Not just a chemical recipe but careful SOP with sources of all materials, etc.
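One possible way to keep raw image data together with its instrument configuration, experimental conditions, and image characteristics is a small metadata "sidecar" file saved next to each image. The Python sketch below is only an illustration under stated assumptions: the `ImageMetadata` class, its field names, the file name, and all example values are hypothetical and do not come from any standard or from the study described in these slides.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ImageMetadata:
    """Hypothetical sidecar record; field names are illustrative, not a standard."""
    image_file: str
    instrument: str               # instrument configuration
    experimental_conditions: str  # e.g. animal group, time point
    width_px: int                 # image characteristics
    height_px: int
    processing: str = "raw"       # "raw" or a description of processing applied
    notes: list = field(default_factory=list)


def missing_fields(record: ImageMetadata) -> list:
    """Return the names of text fields that were left empty."""
    return [name for name, value in asdict(record).items()
            if isinstance(value, str) and not value.strip()]


def to_sidecar_json(record: ImageMetadata) -> str:
    """Serialize the record as JSON text suitable for saving beside the image."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Made-up example values; the instrument field is left blank on purpose
# to show how an incomplete record can be flagged before sharing.
example = ImageMetadata(
    image_file="mouse_001_week21.tif",
    instrument="",
    experimental_conditions="hypothetical group A, week 21",
    width_px=2048,
    height_px=1536,
)

print(missing_fields(example))
print(to_sidecar_json(example))
```

Checking for empty fields when the image is acquired, rather than at publication time, is one way to reduce the after-the-fact work of getting data into a shareable form.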
An Old Success
Minor Success
• There’s got to be a better way…
A good idea that didn’t work
• A group of NIH-funded investigators decided to compile an archive of optical images of normal mice.
Spent a huge amount of time determining what image data needed to be submitted with the images.
Relied on voluntary submissions.
Optical imaging has no standard data format like DICOM for MRI/CT.
Huge variety of optical imagers, each instrument unique, quality of images varies widely.
The project sank under its own weight.
Creating a Data Sharing Culture
• Many investigators will need a “kick” from their funding agency or journals.
• Some investigators will need to realize the great benefits to be made from sharing data.
• Investigators need resources to store and manage data. Standards for data storage and metadata need to be developed.
• Investigators need to think of data sharing from the beginning of a project. This minimizes the work needed after the fact to get data into a publicly presentable form.