Post on 19-Dec-2015
Evaluation Research: Issues, Methods, and Opportunities
William H Fisher, PhD
Center for Mental Health Services Research
What is Evaluation Research?
• Evaluation is the systematic assessment of the worth or merit of something
• In our context – systematic assessment of a policy, practice or service intervention.
• Many of the same research principles apply – just in more complicated ways
Types of Evaluation Research
• outcome evaluations investigate whether the program or technology caused demonstrable effects on specifically defined target outcomes
• impact evaluation is broader and assesses the overall or net effects -- intended or unintended -- of the program or technology as a whole – i.e. – it did this, but it also did that
• cost-effectiveness and cost-benefit analysis address questions of efficiency by standardizing outcomes in terms of
– dollar costs and values
– “social costs” – we reduced hospitalizations but increased incarcerations
– “opportunity costs” – because we did “A” we couldn’t do “B”
• secondary analysis reexamines existing data to address new questions or use methods not previously employed
– can be used to do all of the above relatively inexpensively
A little history
• 1965 – the big year for evaluation research
– Great Society Programs
– Recognition at HEW of social science
– The “Head Start Program” – an early example
– Question – did kids in that program benefit compared to those not in the program?
– Donald Campbell (1975) “Reforms as Experiments”
• Mental Health arena – Community Support Program
– Did people benefit from the services they received?
– Did community-based services keep people out of the hospital?
Evaluation Research vs. “Regular Research”
• Traditional research – Scientific Method
– Theory -- Hypotheses -- Operationalization -- Analysis
• Evaluations
– Scientific method but not always theory driven
– Less theory, more “logic models” (which may sometimes be theory driven)
• A logic model is a guide to what the intervention is going to do and what its goals are
Example of Logic Model
Methodological Issues in Evaluation Research
• Gold Standard of Experimental Designs – Randomized Clinical Trials – often not feasible
• Over the years, new approaches have been developed – QUASI-EXPERIMENTAL DESIGNS
Counterfactual Explanation
Basic Thrust of Evaluation
Determine what the world would look like if the event, intervention, law, etc. hadn’t been implemented.
Quasi-Experiments and “True” Experiments
True Experiments: The Randomized Control Trial
• RCTs function to minimize threats to validity – i.e., factors that can contaminate one’s study
• Randomize to “arms” from a pool of individuals who meet pre-specified criteria
• Able to isolate effects of change attributable to the intervention – everything else is controlled
OOOOO --------- “Treatment” ----------- OOOOOOOO
OOOOO -------- “No Treatment” ----------- OOOOOOOO
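The randomization step can be sketched in a few lines. This is a minimal illustration, not a trial protocol: the participant IDs, the arm names, and the fixed seed are all assumptions made up for the example, and a real trial would typically use stratified or blocked randomization managed by someone outside the study team.

```python
import random

def randomize_to_arms(eligible_ids, arms=("treatment", "no_treatment"), seed=2015):
    """Shuffle the eligible pool, then deal participants to arms in turn.

    The seed is fixed only so the example is reproducible (an assumption
    for illustration, not a recommendation for real trials).
    """
    rng = random.Random(seed)
    pool = list(eligible_ids)
    rng.shuffle(pool)
    assignment = {arm: [] for arm in arms}
    for i, participant in enumerate(pool):
        # Dealing in turn keeps the arm sizes balanced.
        assignment[arms[i % len(arms)]].append(participant)
    return assignment

# Hypothetical pool: people who already met the pre-specified criteria.
eligible = [f"P{n:02d}" for n in range(1, 11)]
assignment = randomize_to_arms(eligible)
for arm, members in assignment.items():
    print(arm, sorted(members))
```

Because chance alone decides who lands in which arm, the two groups should differ systematically only in whether they received the treatment.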
Quasi-Experimental designs
• The “bread and butter” of evaluation research
• Missing one or more features of “true experiments”
• Methods aimed at minimizing the extent of these factors
• Vary in “strength”; which one is used depends on what opportunities there are for data collection
Goal: Reduce Threats to Validity
• Internal Validity
• External Validity
Threats to Internal Validity
• Threats to “internal validity” are problems with the design that lead to inadequate control of extraneous variation.
• For example
– Sampling bias – control and treatment groups not sufficiently comparable (e.g., one group all male, other all female)
– History effects – factors in the environment of the treatment or intervention that could have had an effect on the outcome (a major event that changed the outcome independent of the treatment or intervention), such as
• change in reimbursement rates, hospital admission policies
• advent of a new medication
• economic downturn during an employment program
Threats to External Validity
• Inability to generalize findings because
– Poor internal validity – study is so badly designed that it can’t be generalized to another setting
– Study population or site is very idiosyncratic (which comes up occasionally in evaluations)
• May be okay if it’s just a local effect you’re looking for and you’re not planning to publish
Examples of Some Quasi-Experimental Designs
Post-test only, no control (weakest)
X o
• Very weak design
• Sometimes called a “case study”
• “Here’s what happened when we did ‘X’”
• Useful for guiding other studies
• Need to couch within lots of caveats
Pre-Post, No Control
oooo X oooo
– Time series analysis is an example
– Interrupted Time Series Analysis
– Get a measure of outcome for time periods before and after the intervention
• The “interruption” is the intervention
• Good for looking at legal or policy changes
A “Classic” Time Series: The Cincinnati Bell Experiment
Questions with Time Series
• How strong is the effect?
• Is the effect
– Abrupt or gradual?
– Temporary or permanent?
Statistical Methods with Time Series Analysis
• Depends on type and nature of data – ARIMA, Poisson Regression
• All take the same approach, built on the counterfactual explanation
• Based on previous patterns in the time series,
– Forecast the future (i.e., what it would be in the absence of the intervention)
– Compare observed post-intervention with forecast
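The forecast-and-compare logic can be sketched with a deliberately simple stand-in for ARIMA or Poisson regression: a straight-line trend fitted to the pre-intervention points. The monthly counts below are made up for illustration, and the function names are hypothetical. A real analysis would use a proper time series model that handles seasonality and autocorrelation.

```python
def fit_linear_trend(values):
    """Least-squares fit of value = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def counterfactual_gap(pre, post):
    """Forecast the post period from the pre-period trend, then compare."""
    intercept, slope = fit_linear_trend(pre)
    start = len(pre)
    # The forecast is the counterfactual: what we would expect with no intervention.
    forecast = [intercept + slope * (start + k) for k in range(len(post))]
    gaps = [observed - expected for observed, expected in zip(post, forecast)]
    return forecast, gaps

# Made-up monthly counts before and after a hypothetical policy change.
pre = [100, 102, 104, 106, 108, 110]
post = [100, 101, 103]
forecast, gaps = counterfactual_gap(pre, post)
for expected, observed, gap in zip(forecast, post, gaps):
    print(f"forecast={expected:.1f} observed={observed} gap={gap:+.1f}")
```

A consistently negative gap suggests the outcome fell below what the earlier pattern predicted. Whether that gap is larger than chance variation is the question the formal statistical models answer.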
Issues in Time Series
• Seasonality
• Secular trends – long term trends – years, not months
• Autocorrelation – neighboring data points may be related
• Data on a long pre-intervention period is good
– Total of 50 observations
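Autocorrelation and seasonality can both be checked with the same quantity: the correlation of a series with a lagged copy of itself. The sketch below uses the standard sample autocorrelation formula. The quarterly series is made up for illustration.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series with itself shifted by `lag` points."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom

# Made-up quarterly series with a repeating four-period pattern.
seasonal = [10, 20, 30, 20] * 5

print("lag 2:", round(autocorrelation(seasonal, lag=2), 3))
print("lag 4:", round(autocorrelation(seasonal, lag=4), 3))
```

A strong positive value at the seasonal lag (here, four quarters) signals a pattern that repeats each year. Values far from zero at any lag mean the data points are not independent, which is why ordinary pre-post comparisons can mislead with time series data.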
Pre-Post, Non-Equivalent Control Group
• One of the better designs
– E.g., comparing two types of case management where you can’t randomize
oooooooooo X ooooooooooo
oooooooooo Y ooooooooooo
X = treatment, intervention, etc.
Y = alternative, placebo, etc.
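One common way to summarize data from this design is a difference-in-differences: the pre-to-post change in the X group minus the pre-to-post change in the Y group. The slides do not name this technique, so treat it as one illustrative option rather than the method used here. All numbers below are made up.

```python
def mean(values):
    return sum(values) / len(values)

def difference_in_differences(x_pre, x_post, y_pre, y_post):
    """Change in the X group minus change in the Y (comparison) group."""
    change_x = mean(x_post) - mean(x_pre)
    change_y = mean(y_post) - mean(y_pre)
    return change_x - change_y

# Made-up outcome measures (e.g., monthly hospital days) for each group.
x_pre, x_post = [50, 52, 51, 53], [40, 38, 39, 37]
y_pre, y_post = [48, 49, 50, 49], [46, 47, 45, 46]

did = difference_in_differences(x_pre, x_post, y_pre, y_post)
print(f"difference-in-differences estimate: {did:+.1f}")
```

Subtracting the comparison group's change removes whatever affected both groups alike, such as a history effect. The estimate is only as credible as the assumption that the two groups would have followed similar trends without the intervention.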
Example: Northampton Consent Decree Study
• Federal court deinstitutionalization order in Western Massachusetts but not elsewhere
• Compared Northampton State Hospital and Worcester State Hospital
• Argued that areas “sort of different” except for the consent decree
• Looked at changes in state hospital use and other factors
Which design to choose
• What resources are available?
– Research assistants, etc.
– Funding
– Time lines
• What data are already available?
• How accessible are subjects?
• Can they be assessed before a treatment or intervention occurs?
• Is there a sensible choice for a control?
– e.g., Central Mass as control for Western Mass
Evaluating interventions in “real time”
• Common evaluation activity
• E.g., SAMHSA – often includes a requirement that an evaluation be done (usually with inadequate resources)
• E.g., A wellness program is being initiated at a local mental health center
• MISSION-DIRECT VET
• This will be monitored
What do we need to do?
• Operationalize expectations into measurable outcomes
– Develop plans for:
• collecting data
• examining “what happens” as the project goes forward – “process evaluation”
Process Analysis
• What really happened?
– Was the intervention delivered as proposed?
– Was the design (sampling, followup, etc.) implemented as proposed?
• Need to understand how the implementation actually occurs so that the outcome is more interpretable.
Is the Intervention what we said it would be? Fidelity
• Many interventions have a strong evidence base
• Some, for example the Program for Assertive Community Treatment (PACT), have formal “fidelity measures.”
– If you say you’re doing PACT, the program must have the required elements; otherwise you’re not doing PACT
Evaluation Research as an Activity
• Major call for data-driven decision making
• Scarce resources – can’t be wasted on things that don’t work or work poorly
• Evaluation research now thrives as a major focus for researchers
– American Evaluation Association
– Evaluation Research journal devoted to methods and examples from multiple fields
Issues for Academic Researchers
• Working with public agencies and other service providers can be rewarding and interesting
• Cultures are different
• Specific issues: “Quality Assurance / Evaluation” vs. “Research”
• Who owns the data?
• IRB issues
Some consumers of evaluation research may not be interested in a fancy study
Publishing
• Academic researchers want to publish
• Need to straighten out issues in advance
– Who can publish?
– Does the organizational partner want to be involved?
• Review of findings?
• “Censorship”
Final thoughts
• Evaluation research makes a significant contribution to behavioral health research
– Informs policy and practice
• Our department recognizes its importance
• Lots of opportunities