Why Dummy Tables are Smart! A Systematic Approach to Data Analysis for Your M.Sc. Thesis

Lisa Fredman, Ph.D.
Department of Epidemiology, BUSPH

CREST Seminar
March 17, 2009

Outline:

1. Research fundamentals (the basics)

2. Analytic plan in research
   a. Hypothesis guides plan
   b. Identify measures for E (exposure), D (disease outcome), and covariables
   c. Descriptive statistics on E, D, and covariables
   d. Analyses on E-D association
      i. Crude analyses
      ii. Evaluate potential confounders
      iii. Multivariable analyses

3. Present results in tables and text

Aim: describe how dummy tables are used in Steps 2a-d and 3

Research fundamentals:

- systematic investigation of E-D association

- analysis follows sequential steps: descriptive analyses -> univariate E-D association -> confounder assessment -> multivariate modeling

- document methods and variables

- document analytic steps, results at each step, decisions that influence next steps

- clear communication throughout
  - hypothesis
  - methods
  - analytic steps
  - results

Dummy tables

Definition: Dummy tables (aka mock tables) are shells of tables with variable names, SAS names, and statistical measures. They do not include data.

• Create dummy tables when you develop the analysis plan.

• Fill in dummy tables as you perform analyses.

• Use dummy tables to:
  - guide analyses
  - record SAS programs used for analyses
  - record names of measures used
  - document interim results
  - draft methods and results

Example of generic dummy table:

Title: Distribution of key variables (SAS program used to generate results, date)

Variable                          Distribution
Exposure: Variable (VARNAME)      (mean, std, range)
Outcome: Variable (VARNAME)       (%)
Covariables
  Covar1 (VARNAME)                (%)
  Covar2 (VARNAME)                (%)
  Covar3 (VARNAME)                (%)
  …

Brief notes on results, decisions, next steps
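
As a minimal sketch of the idea (in Python rather than SAS, purely for illustration), a dummy table can be kept as a small data structure that holds row labels, SAS variable names, and the statistic to report, with every result cell left empty until the analysis is run. The names `make_shell` and `render`, the dictionary layout, and the example rows are hypothetical.

```python
# Hypothetical sketch: a dummy table is a shell with labels, SAS names,
# and planned statistics, but no data.

def make_shell(title, rows):
    """Return a shell: each row has a label, SAS name, statistic, empty result."""
    return {
        "title": title,
        "rows": [
            {"label": label, "sas_name": sas_name, "statistic": stat, "result": None}
            for label, sas_name, stat in rows
        ],
        "notes": [],  # brief notes on results, decisions, next steps
    }


def render(shell):
    """Format the shell as plain text; unfilled cells show the planned statistic."""
    lines = [shell["title"], ""]
    for row in shell["rows"]:
        name = f"{row['label']} ({row['sas_name']})"
        if row["result"] is not None:
            cell = row["result"]
        else:
            cell = f"({row['statistic']})"
        lines.append(f"{name:<32}{cell}")
    lines.extend(shell["notes"])
    return "\n".join(lines)


shell = make_shell(
    "Distribution of key variables (program name, date)",
    [
        ("Exposure", "VARNAME", "mean, std, range"),
        ("Outcome", "VARNAME", "%"),
        ("Covar1", "VARNAME", "%"),
    ],
)
print(render(shell))
```

Filling in a cell later is a matter of setting that row's `result` and appending a line to `notes`, so the shell and the filled-in table stay the same object throughout the analysis.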

Why are dummy tables smart?

• Stay focused on analyses to test YOUR hypothesis.

• Provides template for systematic steps in your analysis.

• Internal documentation.

• Centralized record of analyses, results, decisions.

• Communication aid.

Dumb things that smart researchers often do:

• Revise analytic variables and not rename vars or record changes.
  DON’T LET YOURSELF FALL INTO THIS TRAP!

• Analyze associations that look interesting but are tangential to their hypothesis.
  DON’T BE TEMPTED TO DO THIS!

Dummy tables help you avoid doing these dumb things.

Guide to dummy tables for analyses for epidemiologic study:

Before starting analyses:
1. Write down hypothesis
2. Make dummy table for each stage of analysis
3. Make note to write summary of table, decisions, next steps.

Guide to dummy tables for analyses for epidemiologic study, cont’d:

Start with 4-5 dummy tables:
• Descriptive analyses: variable distributions
• Crude analyses
• Bivariate analyses
• Confounder analysis
• Multivariable analyses

Guide to dummy tables for analyses for epidemiologic study, cont’d:

While doing analyses, at each step:
• Fill in dummy table and/or checklist at each stage
• Make decisions based on analyses at this stage (operationalizing variables, selecting confounders, excluding variables from multivariate model) that will influence next stage
• Write each decision and rationale for it

Proceed to next stage
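
One way to keep the step-by-step record described in this guide is a short running log with one entry per decision. The sketch below is in Python for illustration only. The stage names come from the list of dummy tables to start with; `AnalysisLog`, `Decision`, their fields, and the sample entry are hypothetical.

```python
from dataclasses import dataclass, field

# Stage names taken from the list of dummy tables to start with.
STAGES = [
    "Descriptive analyses",
    "Crude analyses",
    "Bivariate analyses",
    "Confounder analysis",
    "Multivariable analyses",
]


@dataclass
class Decision:
    stage: str
    decision: str
    rationale: str


@dataclass
class AnalysisLog:
    hypothesis: str
    decisions: list = field(default_factory=list)

    def record(self, stage, decision, rationale):
        """Store a decision with its rationale; reject entries missing either."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        if not decision.strip() or not rationale.strip():
            raise ValueError("a decision and its rationale are both required")
        self.decisions.append(Decision(stage, decision, rationale))

    def stages_without_decisions(self):
        """List stages that have no recorded decision yet."""
        done = {d.stage for d in self.decisions}
        return [s for s in STAGES if s not in done]


log = AnalysisLog(hypothesis="(write your hypothesis here)")
log.record(
    "Confounder analysis",
    "Keep age in the multivariable model",  # illustrative entry only
    "Met the confounder criterion at this stage",
)
print(log.stages_without_decisions())
```

Requiring a rationale at the moment a decision is logged is the design point: it is much harder to reconstruct why a variable was dropped once the next stage is under way.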

EX: Making Corned Beef with Cabbage dinner

• This Epicurious.com recipe: Corned Beef with Cabbage

• 4 lb corned brisket of beef
• 3 large carrots, cut into large chunks
• 6 to 8 small onions
• 1 teaspoon dry English mustard
• large sprig fresh thyme and some parsley stalks, tied together
• 1 cabbage
• salt and freshly ground pepper

Put the brisket into a saucepan with the carrots, onions, mustard and the herbs. Cover with cold water, and bring gently to a boil. Simmer, covered, for 2 hours. Discard the outer leaves of the cabbage, cut in quarters and add to the pot. Cook for a further 1 to 2 hours or until the meat and vegetables are soft and tender.

Serve the corned beef in slices, surrounded by the vegetables and cooking liquid. Serve with lots of floury potatoes and freshly made mustard.

• Irish Traditional Cooking, © 1995 (reprinted 2005), by Darina Allen. February 2008.

EX: Making Corned Beef with Cabbage dinner

Generic dummy table aka “Shopping List”

Shopping list for Corned Beef dinner      (Title)

Ingredients               Amount     Cost      (Variables)
Cabbage                   1 head
Carrots                   3 large
Corned brisket of beef    4 lbs
Toadstools                6 small

Stop & Shop, or Shaws?

Stop & Shop or Shaws?

Need subgroup analyses!

EX: Making Corned Beef with Cabbage dinner

Fill in shopping list!

Shopping list for Corned Beef dinner      (Title)

Ingredients                          Amount     Cost          (Variables)
Cabbage                              1 head
Carrots                              3 large
Corned brisket of beef -- Hummell    4 lbs      $1.49/lb
Toadstools                           6 small

Stop & Shop, or Shaws? Either

Make notes to improve recipe

• This Epicurious.com recipe: Corned Beef with Cabbage

• 4 lb corned brisket of beef
• 3 large carrots, cut into large chunks
• 6 to 8 small onions
• 1 teaspoon dry English mustard
• large sprig fresh thyme and some parsley stalks, tied together
• 1 cabbage
• salt and freshly ground pepper

Put the brisket into a saucepan with the carrots, onions, mustard and the herbs. Cover with cold water, and bring gently to a boil. Simmer, covered, for 2 hours. Discard the outer leaves of the cabbage, cut in quarters and add to the pot. Cook for a further 1 to 2 hours or until the meat and vegetables are soft and tender.

Serve the corned beef in slices, surrounded by the vegetables and cooking liquid. Serve with lots of floury potatoes and freshly made mustard.

• Irish Traditional Cooking, © 1995 (reprinted 2005), by Darina Allen. February 2008.

LF: use fewer onions, more carrots

LF: definitely plan on 2 hrs! Use less water

Another example: is positive affect associated with better recovery in physical functioning following hip fracture?

Main study hypothesis:
• Elderly hip fracture patients with high positive affect will show recovery in more ADLs, and in more mobility-related ADLs, over 2 years following fracture than patients with low positive affect or depression.

Dummy tables for Positive Affect study:

Title: Table 1 (manuscript): baseline characteristics of hip fracture sample, by positive affect category (OCESD) (SAS pgm used for results, date)

                                              Total     High PA    Low PA     Depressed
                                              sample    (n=xxx)    (n=xxx)    (n=xxx)      p-value
Sociodemographic variables
  Age groups: % (AGE)
  Sex: % female (RACE)
Medical conditions
  Past stroke: % (V508)
  Past hip fx: % (V515)
Functional status at baseline
  ADL limitations (0-7): mean, std (KATZ0)

More dummy tables:

Age-adjusted mean KATZ ADL score at each interview point, by baseline Positive Affect Category

Positive Affect        Baseline     2-month      6-months     12-months    18-months    24-months
category (OCESD)       (KATZ0)      (KATZ02)     (KATZ06)     (KATZ12)     (KATZ18)     (KATZ24)
                       Mean (se)    Mean (se)    Mean (se)    Mean (se)    Mean (se)    Mean (se)
High pos. affect
Low pos. affect
Depressive symptoms

Filled-in dummy table and summary:

Age-adjusted mean KATZ ADL score at each interview point, by baseline Positive Affect Category (pgm=hipKatz2_age adjusted means, 5/3/06)

Positive Affect        Baseline     2-month      6-months     12-months    18-months    24-months
category (OCESD)       (KATZ0)      (KATZ02)     (KATZ06)     (KATZ12)     (KATZ18)     (KATZ24)
                       Mean (se)    Mean (se)    Mean (se)    Mean (se)    Mean (se)    Mean (se)
High pos. affect       0.72 (0.12)  3.76 (0.13)  2.49 (0.16)  2.03 (0.17)  1.98 (0.19)  2.02 (0.18)
Low pos. affect        0.49 (0.20)  3.82 (0.21)  2.59 (0.27)  2.28 (0.28)  2.14 (0.30)  1.91 (0.28)
Depressive symptoms    1.29 (0.10)  4.20 (0.11)  3.05 (0.13)  2.83 (0.14)  2.86 (0.16)  2.63 (0.15)

Summary of age-adjusted analyses: Respondents with low positive affect (PA) reported the fewest ADL limitations at baseline, and those with depressive symptoms reported the most. On average, respondents in each affect category reported more ADL limitations at each interview following the fracture than at baseline. On the KatzADL variable, the high PA group reported the fewest ADL limitations 2 months through 18 months post-fracture. However, there were no statistically significant differences between respondents with high and low PA.
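
The descriptive statements in the summary can be checked against the filled-in table. The Python sketch below (illustrative only; the original analyses were run in SAS) copies the age-adjusted means from the table and finds which group has the lowest mean, that is, the fewest ADL limitations, at each interview. The dictionary layout and the function name are assumptions.

```python
# Age-adjusted mean KATZ ADL scores copied from the filled-in table.
TIMES = ["baseline", "2mo", "6mo", "12mo", "18mo", "24mo"]
MEANS = {
    "High PA": [0.72, 3.76, 2.49, 2.03, 1.98, 2.02],
    "Low PA": [0.49, 3.82, 2.59, 2.28, 2.14, 1.91],
    "Depressive symptoms": [1.29, 4.20, 3.05, 2.83, 2.86, 2.63],
}


def group_with_fewest_limitations(means):
    """For each time point, return the group with the lowest mean score."""
    return {
        t: min(means, key=lambda g: means[g][i])
        for i, t in enumerate(TIMES)
    }


lowest = group_with_fewest_limitations(MEANS)
for t in TIMES:
    print(f"{t:>8}: {lowest[t]}")

# Check that every post-fracture mean exceeds the same group's baseline mean.
above_baseline = all(
    m > scores[0] for scores in MEANS.values() for m in scores[1:]
)
print("More limitations than baseline at every follow-up:", above_baseline)
```

This only checks the ordering of the means. Whether the differences are statistically significant depends on the standard errors and the fitted model, which the sketch does not touch.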

Dummy table for confounder assessment:

Confounder assessment for Positive Affect_ADLs analyses

                                             Beta coefficients for models with individual potential confounders
Outcome / OCESD level    Beta coefficient    Age        %change    Race       %change    medsum42   %change
KATZ ADL measure: model with cesd*time interaction term
  OCESD-level 1
  OCESD-level 2

Filled-in dummy table for confounder assessment:

Confounder assessment for Positive Affect_ADLs analyses

                                             Beta coefficients for models with individual potential confounders
Outcome / OCESD level    Beta coefficient    Age        %change    Race       %change    medsum42   %change
KATZ ADL measure: model with cesd*time interaction term
  OCESD-level 1          -0.3805             -0.354     107.5      -0.3969    95.9       -0.3612    105.3
  OCESD-level 2          -0.2796             -0.4252    65.8       -0.3021    92.6       -0.2544    109.9

from hipKatzmix1_mixed models baseline, 5/3/06

Summary: Age and 1 or more medical conditions (medsum42) met the criteria as potential confounders. I will also include race in the multivariable models since it may turn out to be a confounder in the models of the KatzADL outcome.
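
The "%change" entries in the filled-in table are reproduced if each one is taken as the coefficient in the first "Beta coefficient" column divided by the coefficient from the model with that one added covariate, times 100. That reading is inferred from the numbers and is not stated on the slide; the cutoff used to call a variable a potential confounder is not given either. A minimal Python sketch (illustrative; the function name is hypothetical):

```python
# Beta coefficients copied from the filled-in confounder table.
FIRST_COLUMN = {"OCESD-level 1": -0.3805, "OCESD-level 2": -0.2796}
WITH_COVARIATE = {
    "Age": {"OCESD-level 1": -0.354, "OCESD-level 2": -0.4252},
    "Race": {"OCESD-level 1": -0.3969, "OCESD-level 2": -0.3021},
    "medsum42": {"OCESD-level 1": -0.3612, "OCESD-level 2": -0.2544},
}


def percent_of_adjusted(first_beta, adjusted_beta):
    """First-column beta as a percentage of the covariate-adjusted beta.

    Assumption: this ratio is what the table labels "%change";
    a value of 100 would mean the covariate left the coefficient unchanged.
    """
    return round(100 * first_beta / adjusted_beta, 1)


for covariate, betas in WITH_COVARIATE.items():
    for level, beta in betas.items():
        pct = percent_of_adjusted(FIRST_COLUMN[level], beta)
        print(f"{covariate:<9}{level}: {pct}")
```

On this reading, values far from 100 indicate that adding the covariate moved the coefficient; age at OCESD-level 2 (65.8) shows the largest shift in the table.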

Dummy tables for multivariable analyses:

Predicted mean KATZ ADL score at each interview point, by baseline Positive Affect Category, PROC MIXED results (pgm=hipKatzmix2_mixed models, prelim multivariable models, 5/4/06)

Positive Affect category          2-months     6-months     12-months    18-months    24-months
(OCESD)                           (KATZ02)     (KATZ06)     (KATZ12)     (KATZ18)     (KATZ24)
                                  (n=352)      (n=321)      (n=306)      (n=245)      (n=232)
                                  Mean (se)    Mean (se)    Mean (se)    Mean (se)    Mean (se)
High positive affect
Low positive affect
Depressive symptoms

Differences and 95% CI’s:
High vs. low positive affect
High positive affect vs. depr.

Filled-in dummy tables and summary for multivariable analyses:

Predicted mean KATZ ADL score at each interview point, by baseline Positive Affect Category, PROC MIXED results (pgm=hipKatzmix2_mixed models, prelim multivariable models, 5/4/06)

Positive Affect category          2-months              6-months              12-months             18-months             24-months
                                  (n=352)               (n=321)               (n=306)               (n=245)               (n=232)
                                  Mean (se)             Mean (se)             Mean (se)             Mean (se)             Mean (se)
High positive affect              3.87 (0.14)           2.62 (0.14)           2.18 (0.14)           2.19 (0.15)           2.35 (0.16)
Low positive affect               3.96 (0.23)           2.75 (0.24)           2.51 (0.23)           2.35 (0.24)           2.27 (0.25)
Depressive symptoms               3.97 (0.12)           2.94 (0.12)           2.75 (0.12)           2.88 (0.13)           2.70 (0.13)

Differences and 95% CI’s:
High vs. low positive affect      -0.09 (-0.61, 0.43)   -0.14 (-0.68, 0.41)   -0.34 (-0.88, 0.20)   -0.15 (-0.72, 0.42)   0.07 (-0.50, 0.65)
High positive affect vs. depr.    -0.10 (-0.46, 0.25)   -0.32 (-0.68, 0.05)   -0.57 (-0.94, -0.20)  -0.68 (-1.08, -0.29)  -0.35 (-0.76, 0.06)

Summary: In the multivariable model, positive affect and followup time were associated with the KatzADL score over time. Mean KatzADL scores were significantly lower (i.e., less impaired) in respondents with high positive affect compared to those with depressive symptoms at months 12 and 18; there were no differences between respondents with high and low positive affect.
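
The summary's statement about months 12 and 18 can be read off the confidence intervals: a difference is statistically significant at the 5% level when its 95% CI excludes zero. The Python sketch below (illustrative only; the function names are hypothetical) applies that rule to the differences in the table. The 18-month interval for high positive affect vs. depressive symptoms is taken as (-1.08, -0.29), which is centered on the -0.68 estimate.

```python
MONTHS = [2, 6, 12, 18, 24]

# (difference, lower bound, upper bound) copied from the filled-in table.
HIGH_VS_LOW = [
    (-0.09, -0.61, 0.43),
    (-0.14, -0.68, 0.41),
    (-0.34, -0.88, 0.20),
    (-0.15, -0.72, 0.42),
    (0.07, -0.50, 0.65),
]
HIGH_VS_DEPR = [
    (-0.10, -0.46, 0.25),
    (-0.32, -0.68, 0.05),
    (-0.57, -0.94, -0.20),
    (-0.68, -1.08, -0.29),  # assumption: negative upper bound (centered interval)
    (-0.35, -0.76, 0.06),
]


def excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


def significant_months(rows):
    """Months whose 95% CI for the difference excludes zero."""
    return [m for m, (_, lo, hi) in zip(MONTHS, rows) if excludes_zero(lo, hi)]


print("High vs. low PA:", significant_months(HIGH_VS_LOW))
print("High PA vs. depr.:", significant_months(HIGH_VS_DEPR))
```

With these inputs the rule flags no months for high vs. low positive affect and months 12 and 18 for high positive affect vs. depressive symptoms, in line with the summary.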

Additional records to supplement dummy tables:

• Data memos to co-investigators/self

• Footers and WORD file names with filename and date created/revised

  ex: Positive Affect ADLs_datamemo3_050306
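
A naming convention like this is easier to apply consistently with a small helper. The Python sketch below is illustrative; the function name and its arguments are hypothetical, and the MMDDYY date format is inferred from the example, where 050306 is read as May 3, 2006.

```python
from datetime import date


def memo_name(project, memo_number, when):
    """Build a name like 'Positive Affect ADLs_datamemo3_050306'."""
    stamp = when.strftime("%m%d%y")  # MMDDYY, as inferred from the example
    return f"{project}_datamemo{memo_number}_{stamp}"


print(memo_name("Positive Affect ADLs", 3, date(2006, 5, 3)))
```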

Conclusion:

• Dummy tables are an organizational tool to ensure that data analyses follow hypothesis and are systematically recorded.

• Provide internal documentation.

• Link analytic plan, interim results, final tables and manuscript.

That’s why dummy tables are smart!

