Development of Trend Analysis Guidelines at NCHS
Donald Malec, Associate Director of Science
Office of Research and Methodology
and Spokesperson, NCHS Trends Analysis Workgroup
Presentation for the Board of Scientific Counselors, September 25, 2015
Guidance being developed by NCHS workgroup
- Members include analysts across NCHS
- Draft guidance document in two main parts: Guidance and Real Examples
  1. Overview of Key Trend Analysis Issues
  2. Choosing a Method for Trend Analysis
  3. Illustrative Examples of Trend Analysis with Alternative Comparisons
  4. Technical Appendixes ... with explanation of statistical guidance
An example of a trend
[Figure: Prevalence of obesity among adults aged 60 years and older, NHANES 1999-2010. x-axis: NHANES cycle, 1999-2000 through 2009-2010; y-axis: percent, 0 to 50.]
Other examples of trends ...
[Figure: Percent of persons with any emergency room use in the past 12 months among adults aged 18-64, by insurance status (Private, Medicaid, Uninsured). NHIS 2000-2013. x-axis: year; y-axis: percent, 0 to 50.]
Other examples of trends ...
[Figure: Percentage of ED visits in which an EKG was ordered or provided: United States. NHAMCS 2005-2011. x-axis: year; y-axis: percent of visits, 0 to 25.]
Other examples of trends ...
[Figure: Birth rates for women aged 15-17 and 18-19: United States, 1991-2013. Series: Age 15-17, Age 18-19. x-axis: year; y-axis scale: 0 to 100.]
Aim of a Guidance Document for Trend Analysis
- Focus on descriptive statistics, not prediction
- Document issues that may cause controversy: choice of study interval, choice of model, etc.
- Provide a summary of methods for reference
- When possible, suggest a preferred method
- Recognize limitations of rules
- Analyst will bring expertise into the analysis
- Include “just enough detail”
Why are trends analyzed at NCHS? Here are a few reasons
- Topical ... trends in obesity prevalence
- Program changes ... trends in health insurance coverage
- Surveillance/resource ... Health, United States (HUS) and Healthy People (HP)
- Interesting changes (e.g., changes in slope) noted by analysts
First step: How are trends analyzed at NCHS now?
- Linear regression (straight line)
  - least squares fit to population prevalence
  - population weighted least squares fit to population prevalence
  - sample variance weighted least squares fit to population prevalence
- Comparison to a benchmark year
  - In 2013, 17.8% of noninstitutionalized adults aged 18 and over were current cigarette smokers, a decline from 2000 (HUS 2014)
- Linear (with curves) and non-linear regression
  - polynomial regression ... orthogonal polynomials
  - linear splines ... joinpoint software (NCI)
  - log transform, logistic transform of prevalence rates
  - smoothing by collapsing years, followed by additional analysis
  - pairwise comparisons
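As a rough illustration of the polynomial regression item above, the sketch below applies the standard orthogonal polynomial contrast coefficients for five equally spaced time points to a series of estimates. The prevalence values are invented for illustration, and `contrast` is a hypothetical helper, not an NCHS routine. A real analysis would also need a design-based standard error for each contrast before any test could be made.

```python
# Standard orthogonal polynomial coefficients for 5 equally spaced points.
LINEAR = [-2, -1, 0, 1, 2]
QUADRATIC = [2, -1, -2, -1, 2]


def contrast(coefs, estimates):
    """Return the contrast sum(c_t * P_t) over the time points."""
    if len(coefs) != len(estimates):
        raise ValueError("need exactly one coefficient per estimate")
    return sum(c * p for c, p in zip(coefs, estimates))


# Hypothetical prevalence estimates (percent) for five survey cycles.
prevalence = [30.0, 31.5, 33.0, 34.5, 36.0]

linear_term = contrast(LINEAR, prevalence)        # captures the straight-line trend
quadratic_term = contrast(QUADRATIC, prevalence)  # captures curvature

# Dividing the linear contrast by the sum of squared coefficients
# gives the ordinary least-squares slope per cycle.
slope_per_cycle = linear_term / sum(c * c for c in LINEAR)
```

Because the invented series lies exactly on a straight line, the quadratic contrast is zero here; a quadratic contrast that is large relative to its standard error would suggest a departure from linearity.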
Methodological Issues: Design vs Model-based Approach
- Trend estimates are similar but significance can vary widely
- Model-based includes “lack of model fit” as error. Design-based provides variance of each point estimate.
[Figure: percent (9 to 14) by period (1 to 5), with fitted trends. Legend: SUDAAN DESCRIPT; SAS WLS & Joinpoint. Annotations, in the order shown: p-value = 0.033, DF ≈ 80, SE = 0.45; p-value = 0.285, DF = 4, SE = 0.52.]
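One way to see why significance can differ so much is that the same test statistic is referred to t distributions with very different degrees of freedom. The sketch below is a minimal illustration using only the Python standard library. The slope and standard error are hypothetical and are not the values shown on the slide, and the p-value is computed by simple numerical integration instead of a statistics package.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, integrating the density on [0, |t|] by Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|)
    return 1.0 - central_mass


# Hypothetical slope estimate and standard error.
slope, se = 1.0, 0.5
t_stat = slope / se

p_many_df = two_sided_p(t_stat, df=80)  # many degrees of freedom
p_few_df = two_sided_p(t_stat, df=4)    # only a handful of time points
```

With these assumed inputs the statistic falls below the 0.05 level when 80 degrees of freedom are used and above it when only 4 are used, even though the slope and standard error are identical.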
Methodological Issues: What Slope to Estimate?
- Not the usual issue of how to estimate (i.e., efficiency) but, rather, what to estimate
- When fitting a line to population values, NCHS uses (at least) three different targets:
  - Unweighted slope: $\text{slope} \propto \sum_{t=1}^{T} P_t \times \text{year}_t$
  - Population weighted slope: $\text{slope} \propto \sum_{t=1}^{T} N_t P_t \times \text{year}_t$
  - Sample variance weighted slope: $\text{slope} \propto \sum_{t=1}^{T} \frac{P_t \times \text{year}_t}{\operatorname{var}(\hat{P}_t)}$
- These three estimates will all be very similar if the underlying population size does not change and if the sample size/design doesn’t change
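The three targets can be compared numerically with a short sketch. The slide gives each slope only up to proportionality, so the code below assumes the usual weighted least-squares form with years centered at their weighted mean and weights of 1, N_t, and 1/var(P̂_t) respectively. All input values and the helper name `weighted_slope` are hypothetical.

```python
def weighted_slope(years, p, weights):
    """Weighted least-squares slope of p on year (years centered at the weighted mean)."""
    w_total = sum(weights)
    year_bar = sum(w * x for w, x in zip(weights, years)) / w_total
    p_bar = sum(w * y for w, y in zip(weights, p)) / w_total
    num = sum(w * (x - year_bar) * (y - p_bar)
              for w, x, y in zip(weights, years, p))
    den = sum(w * (x - year_bar) ** 2 for w, x in zip(weights, years))
    return num / den


# Hypothetical inputs: prevalence (percent), population sizes, sampling variances.
years = [2009, 2010, 2011, 2012, 2013]
p = [20.0, 19.0, 19.5, 18.0, 17.5]
population = [1.00e6, 1.05e6, 1.10e6, 1.20e6, 1.30e6]
variance = [0.40, 0.35, 0.50, 0.30, 0.25]

unweighted = weighted_slope(years, p, [1.0] * len(years))
pop_weighted = weighted_slope(years, p, population)
var_weighted = weighted_slope(years, p, [1.0 / v for v in variance])
```

When the population sizes are all equal, the population weighted slope reduces to the unweighted one, and the same holds for the variance weighted slope when the sampling variances are equal, which matches the last bullet above.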
Methodological Issues: Survey estimates may be correlated between years
- Survey estimates, such as those from NHIS, are dependent between years
  - Software will take correlation into account, but record-level data are needed
  - Many estimation operations work from table estimates (i.e., correlations are not typically constructed for all items)
- Incorporating correlation into prevalence estimation usually results in small changes, but the effect is not systematically understood
- Requiring analysis to always include correlation will increase workload and could reduce output
- Will recommend that individual reports always incorporate dependencies, but the recommendation is not clear for large publications like HUS and HP
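The effect of between-year dependence on a simple comparison of two years can be sketched with the standard formula for the variance of a difference, var(b - a) = var(a) + var(b) - 2 cov(a, b). The standard errors and the correlation below are hypothetical; in practice the correlation would have to be estimated from record-level data, as noted above.

```python
import math


def se_of_difference(se_a, se_b, correlation=0.0):
    """Standard error of (estimate_b - estimate_a) for a given correlation."""
    variance = se_a ** 2 + se_b ** 2 - 2 * correlation * se_a * se_b
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding error


# Hypothetical standard errors (percentage points) for two survey years.
se_year1, se_year2 = 0.30, 0.40

independent = se_of_difference(se_year1, se_year2)                  # ignores correlation
correlated = se_of_difference(se_year1, se_year2, correlation=0.2)  # assumed correlation
```

With a positive correlation the standard error of the difference is smaller than under independence, so ignoring the dependence makes the comparison somewhat conservative in this assumed case.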
Methodological Issues: How to pick a time frame?
- Explaining an entire long trend may not be relevant
- Pick time period based on a specific policy change?
- Focus interest on past decade?
- Interested in what has been happening “recently”?
- Issue: is it an interesting trend or a statistical anomaly?
[Figure: percent (0 to 30) by year, 1980 to 2010.]
Methodological Issues: Choice of transformation?
- View choice of transformation as flexible (“all transformations are wrong but ...”)
- Case in point: logistic regression
[Figure: p (0.0 to 1.0) by year (0 to 10).]
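One commonly cited property of the logistic transform is that a straight line on the logit scale always maps back to a proportion between 0 and 1, whereas a straight line on the proportion scale does not. The sketch below illustrates only that property; the two proportions are hypothetical, and evaluating the lines at year 10 is meant to show the behavior of the two scales, not to recommend prediction.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical proportions observed at year 0 and year 5.
p_start, p_end = 0.60, 0.90
years_elapsed = 5

slope_prob = (p_end - p_start) / years_elapsed                 # change in p per year
slope_logit = (logit(p_end) - logit(p_start)) / years_elapsed  # change in log-odds per year

# Evaluate both straight lines at year 10.
year = 10
linear_value = p_start + slope_prob * year
logit_value = inv_logit(logit(p_start) + slope_logit * year)
```

Under these assumed inputs the line on the proportion scale exceeds 1 at year 10, while the line on the logit scale stays inside the unit interval.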
Methodological Issues: Choice of model?
- View choice of model as flexible (“all models are wrong ...”)
- For trends, NCHS seems to rely on linear splines (joinpoint) and polynomial regression
  - joinpoint software:
    - selection is defensible from a design-based view
    - accounts for multiple testing in model choice
    - pinpoints an exact time point where a change takes place
  - polynomial regression model:
    - useful for checking deviations from linearity
    - relies on off-line separate multiple comparison model fitting
- how important is it to always check the overall fit of the final trend model?
  - trade-off between false positives and false negatives?
  - is the complexity of a model obvious sometimes?
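To make the idea of a linear spline with a single change point concrete, the sketch below fits value = a + b*t + c*max(0, t - knot) by least squares for each interior observed year and keeps the knot with the smallest residual sum of squares. This is a simplified stand-in under stated assumptions: it does not reproduce the NCI Joinpoint software's model selection or its handling of multiple testing, it considers only one joinpoint located at an observed year, and the data and function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_with_knot(times, values, knot):
    """Least-squares fit of value = a + b*t + c*max(0, t - knot); returns (coefs, sse)."""
    rows = [[1.0, t, max(0.0, t - knot)] for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    coefs = solve_linear_system(xtx, xty)
    sse = sum((y - sum(c * v for c, v in zip(coefs, r))) ** 2
              for r, y in zip(rows, values))
    return coefs, sse


def best_single_joinpoint(years, values):
    """Grid-search the interior observed years for the knot with the smallest SSE."""
    origin = years[0]
    times = [float(y - origin) for y in years]  # shift years to keep the system well conditioned
    best = None
    for knot_time, knot_year in zip(times[1:-1], years[1:-1]):
        coefs, sse = fit_with_knot(times, values, knot_time)
        if best is None or sse < best[2]:
            best = (knot_year, coefs, sse)
    return best


# Hypothetical percentages: rising one point per year, then flat.
years = list(range(2005, 2014))
values = [10.0, 11.0, 12.0, 13.0, 14.0, 14.0, 14.0, 14.0, 14.0]
knot_year, coefs, sse = best_single_joinpoint(years, values)
```

In the returned coefficients, `coefs[1]` is the slope before the knot and `coefs[1] + coefs[2]` is the slope after it. Whether a second segment is warranted at all is a separate model-choice question that this sketch does not address.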
Methodological Issues: Other Issues
- Collapsing years together
  - individual estimates fail precision requirements for publication
  - group individual estimates into reliable groups of neighboring years
  - do a trend analysis on grouped data
  - theory states: regression estimates are more precise if data are not grouped
  - dilemma: present slope for ungrouped data with grouped individual estimates?
- Analysis with a small number of time points (3 or 4)
  - is trend analysis appropriate?
    - Yes: can still evaluate linearity or lack thereof
    - No: what does linearity mean for so few points? Look at pairwise differences
Guidance Not Planned:
- Detailed guidance on software
- Time series methods
- Age period cohort models
- Determining underlying correlates of trend
- Causal analysis
- Superpopulation models
- Model fitting - new methods
What should guidance on trends consist of?
- What is the balance between subject matter expertise and statistical testing?
- How to guide multiple testing for balancing between false-positive and false-negative conclusions?
- How much specific guidance should be provided to anticipate challenges to conclusions?
- How much guidance should be directed to researchers outside of NCHS?
- How much detail should reports include regarding methods/guidance used?