Epi 202: Designing Clinical Research
Data Management for Clinical Research
Thomas B. Newman, MD, MPH
Professor of Epidemiology & Biostatistics and Pediatrics, UCSF
September 4, 2012
1
Outline
• Data management steps
• Advantages of database vs spreadsheet entry
• REDCap demonstration
• Take-home message: Pretest should include data entry and analysis
2
Data Management Steps
• Design data collection form
• Capture data
• Enter data
• Clean data
Then can do data analysis
3
Traditional Paper method
• Data collection form design: Word
• Data capture: pen
• Data entry: keyboard transcription into Excel
• Data cleaning: painful
4
Questionnaire from TN’s DCR section 2009
5
Oophorectomy
ID    oophorectomy
204   no
205   yes
207   no
208   no
209   no
211   no
212   yes
214   no
215   no
216   yes (one)
217   no
218   no
219   no
• Advantage of paper form: ability to write in answers you had not anticipated
• Subject might leave it blank or guess if forced to choose
6
Questionnaire from DCR 2009
7
Race coding: Problems
ID    race
204   black
205   hispanic
207   Asian
208   white
209   latina
211   white
212   asian
214   white
215   white
216   black
217   black
218   hispanic
219   white
Free text for “other”: hispanic, latina
“Asian” and “asian” are different values for a string variable
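The case problem can be illustrated with a short sketch, written here in Python purely for illustration. The `normalize_race` helper and the synonym map are assumptions, not part of the course materials. In particular, whether "latina" should be merged with "hispanic" is an analytic decision the investigator would have to make.

```python
from collections import Counter

# Race values as transcribed from the paper forms on this slide.
raw_race = ["black", "hispanic", "Asian", "white", "latina", "white", "asian",
            "white", "white", "black", "black", "hispanic", "white"]

# Hypothetical recoding map; merging "latina" into "hispanic" is an assumption.
SYNONYMS = {"latina": "hispanic"}

def normalize_race(value: str) -> str:
    """Trim whitespace, lower-case, and apply the synonym map."""
    cleaned = value.strip().lower()
    return SYNONYMS.get(cleaned, cleaned)

before = Counter(raw_race)
after = Counter(normalize_race(v) for v in raw_race)

print(before)  # "Asian" and "asian" are counted as separate values
print(after)   # one count per category after normalization
```

Restricting the options on the form itself would avoid the need for this step.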
8
Questionnaire from DCR 2009
9
Weight change
ID    race       weight change   gain/lose
204   black      40              loose
205   hispanic   35              gain
207   Asian      2               blank (+/-)
208   white      10              gain
209   latina     5               gain
211   white      0               lose
212   asian      0
214   white      15              gain
215   white      10              loose
216   black      25              loose
217   black      0
218   hispanic   15              loose
219   white      5-10 pounds     loose
10
Data cleaning before transcription: study staff
• Different color ink
• Person making changes identified
11
Data cleaning (Stata example)
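replace race = "Asian" if race == "asian"
* weightchange is still a string here, so assign "7.5" as text, then convert
replace weightchange = "7.5" if weightchange == "5-10 pounds"
destring weightchange, replace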
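The same kind of cleaning can be sketched in Python. The `parse_weight_change` helper is hypothetical. It assumes that a range such as "5-10 pounds" is replaced by its midpoint (7.5, as in the Stata line above) and that "loose" is a misspelling of "lose". Both are assumptions an investigator would need to document.

```python
import re
from typing import Optional

def parse_weight_change(amount: str, direction: str) -> Optional[float]:
    """Return signed pounds (gain positive, loss negative), or None if unusable."""
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", amount)]
    if not numbers:
        return None
    # Assumption: a range like "5-10" is replaced by its midpoint.
    bounds = numbers[:2]
    magnitude = sum(bounds) / len(bounds)
    if magnitude == 0:
        return 0.0  # no change, so the direction does not matter
    word = direction.strip().lower()
    if word == "gain":
        return magnitude
    if word in ("lose", "loose"):  # assumption: "loose" means "lose"
        return -magnitude
    return None  # nonzero change with no usable direction: needs a query

print(parse_weight_change("5-10 pounds", "loose"))
print(parse_weight_change("2", "blank (+/-)"))
```

Returning `None` rather than guessing keeps unresolved answers visible, so they can be checked against the paper form.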
12
Questionnaire from DCR 2009
13
Exercise
ID    exercise type       exercise frequency
204   walking             2-4times/week
205   stretch/walk        2-3 days/week
207   walking             3x
208   Curves              3-5 x/week
209   biking              every day
211   walking
212   walking             2x/week
214
215   aerobic-resistant   5-6days/week
216   walking             2x/week
217
218
219   blank               blank
These variables will be hard to analyze. This is what we are trying to avoid.
14
Data cleaning before transcription: study staff
15
Simple coding
Advantages of paper
• Rapid data entry anywhere
• Readily understood
• Permanent record
• Allows ready annotation
16
Disadvantages of paper
• No immediate quality control
• Branching logic harder
• Data entry required
• Allows you to postpone thinking about data analysis when you should be thinking about it now!
17
• Consider data analysis early
• Restrict options
• Provide range and logic checks
• Include coding on the paper form
• PRETEST data entry and analysis!
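A minimal Python sketch of what range and logic checks might look like at data entry. The field names, the 0 to 100 pound limit, and the `check_record` helper are all hypothetical, chosen only to match the weight change question shown earlier.

```python
from typing import Dict, List, Optional

ALLOWED_DIRECTIONS = {"gain", "lose"}  # restricted options instead of free text

def check_record(record: Dict[str, Optional[object]]) -> List[str]:
    """Return a list of problems found in one entered record (empty if clean)."""
    problems: List[str] = []
    pounds = record.get("weight_change_lb")
    direction = record.get("direction")

    # Range check: assumed plausible limits of 0 to 100 pounds.
    if not isinstance(pounds, (int, float)) or not 0 <= pounds <= 100:
        problems.append("weight_change_lb must be a number from 0 to 100")
        return problems

    # Restricted options: only the coded choices are accepted.
    if direction is not None and direction not in ALLOWED_DIRECTIONS:
        problems.append("direction must be 'gain' or 'lose'")

    # Logic checks: a direction is required only when weight changed.
    if pounds > 0 and direction is None:
        problems.append("direction is required when weight_change_lb > 0")
    if pounds == 0 and direction is not None:
        problems.append("direction should be blank when weight_change_lb is 0")
    return problems

print(check_record({"weight_change_lb": 10, "direction": "loose"}))
```

Running checks like these during the pretest shows which questions produce unusable answers before real data are collected.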
18
Data Dictionary
• Variable name
• Type of variable (binary, integer, real, string, etc.)
• Variable label (longer name)
• Value labels (e.g., 0 = No, 1 = Yes)
• Permitted values
• Notes
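One way to picture a data dictionary entry is as a small record holding the fields listed above. The sketch below is in Python. The `DictionaryEntry` class and the `oophorectomy` example entry are hypothetical and do not reproduce REDCap's actual data dictionary format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class DictionaryEntry:
    name: str        # variable name
    var_type: str    # binary, integer, real, string, ...
    label: str       # variable label (longer name)
    value_labels: Dict[int, str] = field(default_factory=dict)
    notes: str = ""

    def is_permitted(self, value: Optional[int]) -> bool:
        """Permitted values are the coded values; None stands for missing."""
        return value is None or value in self.value_labels

    def decode(self, value: Optional[int]) -> str:
        """Translate a stored code into its value label."""
        return "" if value is None else self.value_labels[value]

# Hypothetical entry; the label wording is illustrative only.
oophorectomy = DictionaryEntry(
    name="oophorectomy",
    var_type="binary",
    label="History of oophorectomy",
    value_labels={0: "No", 1: "Yes"},
    notes="Hypothetical example entry",
)

print(oophorectomy.decode(1))
print(oophorectomy.is_permitted(2))
```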
19
Research Electronic Data Capture (REDCap)
• Design survey or data collection form
• Creates data dictionary
• Can track subjects and responses
• Exports to statistical packages
• Available with MyResearch account
• Other options: Access (PC), Epi Info (PC), FileMaker Pro
20
REDCap demo
21
Home Page
22
My Projects
23
Project Setup
24
Online Survey Designer
25
Add New Field
26
New Question added
27
REDCap Creates a Stata do file
clear
insheet participant_id redcap_survey_timestamp redcap_survey_identifier mas_or_ticr want_attend_review dates_available___1 dates_available___2 dates_available___3 dates_available___4 field comments survey_complete using "DATA_DCR_FINAL_REVIEW_SESSION_SURVEY_COPY_2_TNEWMAN_2011-08-10-22-39-34.CSV", nonames
label data "DATA_DCR_FINAL_REVIEW_SESSION_SURVEY_COPY_2_TNEWMAN_2011-08-10-22-39-34.CSV"
label define mas_or_ticr_ 1 "No" 2 "Yes ===> Exit this survey"
label define want_attend_review_ 1 "No ====> Exit this survey" 2 "Yes"
label define dates_available___1_ 0 "Unchecked" 1 "Checked"
label define field_ 1 "Clinical pharmacology" 2 "Community medicine" 3 "Dentistry" 4 "Dermatology" 5 "Emergency medicine" 6 "Endocrinology" 7 "Epidemiology/environmental health" 8 "Family medicine" 9 "Global health" 10 "Hospital medicine" 11 "Infectious disease" 12 …
label variable mas_or_ticr "Are you in either the Masters Degree in Clinical Research program or the ATCR (Advanced Training in Clinical Research) program?"
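For readers who do not use Stata, the same idea (coded values stored in the CSV, labels attached afterwards) can be sketched in Python. The two variable names and their value labels are copied from the do file above. The three data rows are invented for illustration, and this is not code that REDCap generates.

```python
import csv
import io

# Value labels copied from the "label define" lines in the do file.
VALUE_LABELS = {
    "mas_or_ticr": {"1": "No", "2": "Yes ===> Exit this survey"},
    "want_attend_review": {"1": "No ====> Exit this survey", "2": "Yes"},
}

# Invented sample rows standing in for the exported CSV file.
sample_csv = io.StringIO(
    "participant_id,mas_or_ticr,want_attend_review\n"
    "1,1,2\n"
    "2,2,\n"
    "3,1,1\n"
)

def apply_labels(row: dict) -> dict:
    """Replace coded values with their labels; leave other fields unchanged."""
    return {
        name: VALUE_LABELS.get(name, {}).get(value, value)
        for name, value in row.items()
    }

labeled = [apply_labels(row) for row in csv.DictReader(sample_csv)]
for row in labeled:
    print(row)
```

Keeping the codes in the data file and the labels in a separate mapping means a label can be reworded without touching the stored data.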
28
Most Important Message:
29
Pretest!
Questions and comments
30
Extra slides
31
Main decisions
• Electronic capture vs paper
• Optical form reading vs keyboard transcription
• Enter data into database, spreadsheet or statistical package
Highly recommended!
32
Advantages of database vs spreadsheet
• Restricts choices
• Error checking
• Can track study progress, produce reports, export to statistical package
• Safer: harder to accidentally alter data
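How a database restricts choices can be shown with a small sketch using Python's built-in `sqlite3` module. The `subject` table, its list of permitted race codes, and the `try_insert` helper are hypothetical. A real study would define its own categories.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE subject (
        id INTEGER PRIMARY KEY,
        race TEXT NOT NULL
            CHECK (race IN ('asian', 'black', 'hispanic', 'white', 'other'))
    )
    """
)

def try_insert(subject_id: int, race: str) -> bool:
    """Return True if the database accepted the row, False if it rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO subject (id, race) VALUES (?, ?)",
                (subject_id, race),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(204, "black")
rejected = try_insert(207, "Asian")  # wrong case, so not in the permitted list

print(accepted, rejected)
```

A spreadsheet cell would have accepted "Asian" silently, whereas the database refuses the row at entry time.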
33