From Republic of Science to Audit Society
Irwin Feller
Professor Emeritus, Economics, Pennsylvania State University
Assessing impact of public policies - Assessing impact of public research.
Which complementarities
ASIRPA
Paris, France
June 13, 2012
Presentation’s Focus
• Good and Bad Metrics/Methodologies for Evaluating the Impacts of Research
• Good and Bad Use(s) of Assessments of Research
Outline
Perennial S&T Policy Decisions
Political Economy of Governance of Science
Use/Nonuse/Misuse of Evidence
Choice of Methods/Metrics
The Times They Are A-Changin’
Then you’d better start swimming or you’ll sink like a stone.
What Has Been Lost (As seen by some)
“The age of chivalry is gone. That of sophisters, economists and calculators
has succeeded”
Edmund Burke, Reflections on the Revolution in France
Context, Context, Context
• (Historical, Path Dependent) Context of National Science/Innovation Systems
• (Current Political/Policy) Context
• (Decision/Situational) Context
• (Linguistic/Disciplinary) Context
Assessment
Assessment is an historical, integral component of patron-based
scientific activity.
Assessment Framework
• Who is Assessing Whom
• Using What Criteria/Performance Measures/Methods to Document and Value What Outcomes, with
• What Impacts on the Vitality of the Scientific Enterprise and the Position(s) of Those Engaged in Scientific Activity?
Pre-New Public Management Assessment Paradigm
• Republic of Science (M. Polanyi)
• Peer (Expert) Review
• Social Contract
Social Contract for Science
“Government promises to fund the basic science that peer reviewers find most worthy of support, and scientists promise that the research will be performed well and honestly and will produce a
steady stream of discoveries that can be translated into new products, medicines or weapons”
(Guston and Keniston, 1994: “The Social Contract for Science”)
New Public Management Paradigm
• Accountability
• Deregulation
• Competition
• Performance Measurement
What is New
• Performance measures are increasingly mandated components of appropriation decisions and oversight reviews
• Dominant ethos is that better measures will lead to better (evidence-based) decisions
• Stream of new data sets and analytical techniques
• Increased political/policy trends towards performance based budgeting
The Evidence-based Decision Making Imperative
• “Agencies should demonstrate the use of evidence throughout their FY2014 budget submissions.”
• “….comparative cost-effectiveness of agency investments: allocation of funding across agency programs or within programs”
• OMB: Use of Evidence and Evaluation in the 2014 Budget (May 18, 2012)
Tensions Among Accountability, Efficiency and Autonomy
Fine line between improved, evidence-based decision-making (“Wanted Better Benchmarks”)
and increased influence on the direction and inner workings of the scientific
enterprise (“Asking Scientists to Measure Up”)
Low-Stakes/High-Stakes Assessments
• Low Stakes: Reputational Surveys (National Research Council Assessments of Graduate Programs; Shanghai Academic Ranking of World Universities)
• High Stakes: Performance-based University Research Funding Systems (UK Research Assessment Exercise; Germany University Excellence Competition)
3 Faces (Purposes) of Evaluation
• Learn about a program’s operations (Does it Work?; How can it be made better?)
• Control the behavior of those responsible for program implementation (Modify objectives; reallocate resources; reassign responsibilities)
• Influence the responses of outsiders in the program’s political environment (create the appearance of a well managed program; preemptively set metrics and methodologies)
Generic Science Policy Questions
“The major issues in science policy are about allocating sufficient resources to science, to distribute them wisely between activities, to make sure that resources are used efficiently
and contribute to social welfare” (Lundvall, B. and S. Borras, 2005, p. 605)
Promises of Research Performance Assessment
• Objectives provide useful baseline for assessing performance.
• Performance measurement focuses attention on the end objectives of public policy, on what’s happened or happening outside rather than inside the black box.
• Well defined objectives and documentation of results facilitate communication with funders, performers, users, and others.
Limitations of Research Performance Measurement
• Returns/Impacts to research are uncertain, long-term, and circuitous
• Impacts are typically dependent on complementary actions by agents outside of Federal agency control
• Benefits from “failure” are underestimated
• Specious precision in selection of measures
• Distortion of Incentives
• Limited (public) evidence of contributions to improved decision making
Assessment as Lever for Structural change
“Given that science is changing, the institutions that are efficient in supporting science at one point in time may be less appropriate at a later point of time. On precise dimensions, a failure to continually re-tune science policy may therefore impede scientific progress.” (B. Jones, 2010, “As Science Evolves, How Can Science Policy?”)
Performance Metrics
• Metrics Abound: Generic List of 37, with New Ones Constantly being Proposed
• Most Programs Have Multiple Objectives--Select Metrics most relevant to Decisions
• “Cherry Pick” Metrics (Strategic Retreat from Objectives)
All Performance Measures Can be Gamed
“Once STI indicators are made targets for STI policy, such indicators lose most of the
informational content that qualify them to play such a role” (Freeman and Soete, 2009)
Overview of Evaluation Methodologies
• Method: Analytical conceptual modeling of underlying theory
  Brief description: Investigating underlying concepts and developing models to advance understanding of some aspect of a program, project or phenomenon.
  Example of use: To describe conceptually the paths through which spillover effects may occur.
• Method: Survey
  Brief description: Asking multiple parties a uniform set of questions about activities, plans, relationships, accomplishments, value, or other topics, which can be statistically analyzed.
  Example of use: To find out how many companies have licensed their newly developed technology to others.
• Method: Case study - descriptive
  Brief description: Investigating in-depth a program or project, a technology, or a facility, describing and explaining how and why developments of interest have occurred.
  Example of use: To recount how a particular joint venture was formed, how its participants shared research tasks, and why the collaboration was successful or unsuccessful.
• Method: Case study - economic estimation
  Brief description: Adding to a descriptive case study quantification of economic effects, such as through benefit-cost analysis.
  Example of use: To estimate whether, and by how much, benefits of a project exceed its costs.
• Method: Econometric and statistical analysis
  Brief description: Using tools of statistics, mathematical economics, and econometrics to analyze functional relationships between economic and social phenomena and to forecast economic effects.
  Example of use: To determine how public funding affects private funding of research.
• Method: Sociometric and social network analysis
  Brief description: Identifying and studying the structure of relationships by direct observation, survey, and statistical analysis of secondary databases to increase understanding of social organizational behavior and related economic outcomes.
  Example of use: To learn how projects can be structured to increase the diffusion of resulting knowledge.
Taking the “Con” Out of Econometrics
Leamer (1983): “Hardly anyone takes the data analysis seriously.”
Distressing lack of robustness to changes in key (‘whimsical’) assumptions
Credibility Revolution in Empirical Economics
• “Empirical microeconomics has experienced a credibility revolution, with a consequent increase in policy relevance and scientific impact….”
• “Primary engine driving improvement has been a focus on the quality of empirical research designs”
Angrist and Pischke (J. Economic Perspectives, 2010)
[Diagram: before/after design with a comparison group]
                            Before (τ)    After (τ + 1)
Treatment Group             S_T           S_T
Comparison/Control Group    S_C           S_C
Issue: Before/After Design Shows changes “related” to policy intervention, but does not adjust for “intervening” factors. (Threats to internal validity)
Reframe Analysis: Did policy “cause” change(s) in treatment group different from those observable in a comparison/control group
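The reframed question corresponds to a difference-in-differences comparison: the change in the treatment group, (S_T at τ + 1) minus (S_T at τ), less the change in the comparison group, (S_C at τ + 1) minus (S_C at τ). The sketch below is illustrative only. The function name and all outcome values are hypothetical, and the estimate is credible only under the assumption that both groups would have followed the same trend without the policy.

```python
def difference_in_differences(treat_before, treat_after,
                              control_before, control_after):
    """Return the change in the treatment group net of the change
    observed in the comparison/control group over the same period."""
    treatment_change = treat_after - treat_before      # naive before/after estimate
    control_change = control_after - control_before    # proxy for intervening factors
    return treatment_change - control_change


# Hypothetical outcome levels (not taken from the presentation).
s_t_before, s_t_after = 100.0, 120.0   # treatment group at tau and tau + 1
s_c_before, s_c_after = 100.0, 112.0   # comparison group at tau and tau + 1

naive_estimate = s_t_after - s_t_before
did_estimate = difference_in_differences(s_t_before, s_t_after,
                                         s_c_before, s_c_after)

print(f"Before/after estimate:         {naive_estimate:.1f}")
print(f"Difference-in-differences:     {did_estimate:.1f}")
```

With these made-up numbers, the simple before/after design would attribute the full change in the treatment group to the policy, while the comparison-group design attributes only the part not also seen in the comparison group.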
Trends in U.S. Agricultural Productivity
Impact of R&D on U.S. Agricultural Productivity
Benefit-Cost Analysis Steps
Conduct Technical Analysis
Identify Next Best Alternative
Estimate Program Costs
Estimate Economic Benefits
Determine Agency Attribution
Estimate Benefits of Economic Return
RTI 2010
Benefit-Cost Estimates of Returns to Health Research
• An average 45 year old in 1994 had a life expectancy 4 ½ years longer than in 1950 because cardiovascular disease mortality had decreased.
• “…(U)nambiguous conclusion …that medical research on cardiovascular disease is clearly worth the cost” (2002; p.113).
• In benefit-cost terms, this increase is estimated to yield a 4-to-1 return for medical treatment and a 30-to-1 return for research and dissemination costs related to behavioral change. (Cutler and Kadiyala, 2002)
Benefit-Cost Estimates of DOE-EERE Geothermal Technology Studies
(Dollar figures in thousands of 2008 dollars)
Metric                       PDC           Binary        TOUGH         Cement
Total Benefits               $15,823,141   $379,870      $928,369      $7,828
Program Cost                 $108,886      $195,469      $101,807      $137,223
Net Benefits                 $15,714,255   $184,401      $826,562      -$129,395
NPV of Net Benefits @3%      $8,147,636    $74,642       $418,997      -$106,679
NPV of Net Benefits @7%      $3,850,834    $3,380        $173,667      -$86,579
Benefits/Costs Ratio         145.3         1.9           9.1           0.06
Benefits/Costs Ratio @3%     94.4          1.5           6.1           0.03
Benefits/Costs Ratio @7%     56.4          1.0           3.7           0.012
Internal Rate of Return      91%           7%            19%           NA
RTI: 2010
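The undiscounted rows of the table follow from simple arithmetic on total benefits and program cost, and the discounted rows depend on the year-by-year timing of costs and benefits, which the slide does not report. The sketch below reproduces the undiscounted net benefits and benefit/cost ratios from the table's own totals. The `npv` helper and the cash-flow stream used with it are hypothetical, included only to show how a higher discount rate lowers the present value of benefits that arrive late.

```python
def benefit_cost_summary(total_benefits, program_cost):
    """Undiscounted net benefits and benefit/cost ratio."""
    return {
        "net_benefits": total_benefits - program_cost,
        "bc_ratio": total_benefits / program_cost,
    }


def npv(cash_flows, rate):
    """Net present value of yearly net cash flows; year 0 is undiscounted."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


# (total benefits, program cost) in thousands of 2008 dollars, from the table.
studies = {
    "PDC": (15_823_141, 108_886),
    "Binary": (379_870, 195_469),
    "TOUGH": (928_369, 101_807),
    "Cement": (7_828, 137_223),
}

for name, (benefits, cost) in studies.items():
    summary = benefit_cost_summary(benefits, cost)
    print(f"{name:8s} net benefits = {summary['net_benefits']:>12,d}  "
          f"B/C = {summary['bc_ratio']:.2f}")

# Hypothetical stream: cost up front, benefits in later years (not RTI data).
hypothetical_flows = [-100.0, 0.0, 60.0, 60.0, 60.0]
for rate in (0.03, 0.07):
    print(f"NPV at {rate:.0%}: {npv(hypothetical_flows, rate):.1f}")
```

Because the program costs are incurred early and the benefits accrue later, moving from a 3% to a 7% discount rate shrinks the net present value, which is the pattern visible across the discounted rows of the table.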
Econometric Approach: Manufacturing Extension Partnership
Summary Statistics
Variable                                      Client Mean    Non-Client Mean
N                                             1,559          15,982
Age, 1992                                     15.97          16.04
Employment, 1992                              170.21         71.70
Employment Growth Rate, 1987-1992             0.013          -0.088
Sales, 1992                                   30,797,199     13,418,587
Sales Growth Rate, 1987-1992                  0.052          -0.085
Sales Growth Rate, 1982-1987                  0.427          0.338
Annual wage, 1992                             28,072         25,013
Production Worker Share, 1992                 0.699          0.724
Value Added Per Worker, 1987                  53,042         50,853
Value Added Per Worker, 1992                  56,709         52,797
Labor Productivity Growth Rate, 1987-1992     0.215          0.203
Labor Productivity Growth Rate, 1982-1992     0.052          0.010
# of Extension Projects                       3.82           NA
Total Project Costs                           63,787         NA
R. Jarmin, Measuring the Impact of Manufacturing Extension
“Dominant” U.S. Methodology is Expert Panels
“The most effective means of evaluating federally funded research programs is expert review. Expert review, which includes quality review, relevance review, and benchmarking,
should be used to assess both basic research and applied research programs”
(National Academies, Evaluating Federal Research Programs, 1999, p. 5)
Bibliometrics: US
• Added to reputational surveys in NRC assessments
• Patent to Citation Linkage to Document Impacts of Basic Research
• Increasingly used by departments/colleges
• No use of performance-based funding
• Little evidence (to date) of impacts on Federal funding of academic research
Use of Bibliometric Data to Allocate Resources Across Fields
• Over the last 3 decades, even as the US position in the life sciences has remained strong, its world share of engineering papers has been cut almost in half, from 38% in 1981 to 21% in 2009, placing it below the share (33%) for the EU27. Similar declines in world share are noted for mathematics, physics, and chemistry.
• If bibliometric performance is a function of resource allocation, a nation gets what it funds.
• This formulation raises the questions of whether what a nation is producing is what it most needs, and whether it is being produced in the most efficient manner.
Is Anyone Listening?
“The ideas of economists and political philosophers, both when they are right and when they are wrong, are more powerful than is commonly understood. Indeed the world is ruled by little else. Practical
men, who believe themselves to be quite exempt from any intellectual influence, are usually the slaves of some defunct economist. Madmen in
authority, who hear voices in the air, are distilling their frenzy from some academic scribbler of a few
years back” (J. M. Keynes)
Is Anyone Listening? Yes
• Continuing Impact of “Academic Scribblers” on Men in Power in Setting Policy Worldview
  – Solow-Abramovitz-Romer
  – Arrow-Nelson
  – Mansfield
BIG “3” FEDERAL SCIENCE QUESTIONS
QUESTION
• How much should be allocated to Federal research?
• How much to spend across missions/agencies/fields of science?
• Which performers; what allocation criteria?
ROLE OF PERFORMANCE MEASURES
• Measures do not provide a basis for determining if, say, 3% is too high, too low, or just right.
• Measures/methodologies provide multiple answers, leading to multiple possible decisions
• Potentially of considerable value, but underutilized
Using Social Rates of Return to Guide Resource Allocations
“But it is evident that these studies can provide very limited guidance to the Office of Management and Budget or to the Congress regarding many pressing
issues. Because they are retrospective, they shed little light on current resource allocation decisions, since these decisions depend on the benefits and costs of proposed projects, not those completed in the past”.
(Mansfield, 1991, p. 26).
Use, Non-Use, Misuse of Program Assessments
• Use: NIH Benefit-Cost Studies
• Nonuse: ATP (Program terminated); DOE-EERE (Budget slashed)
• Misuse: Overgeneralization of findings to different decision/policy settings
• “Bunkum”-Worthless, mundane, incompetently done assessments
Asymmetrical Impacts of Well Done Evaluations
• If program is not working, kill it because it is ineffective
• If program is working, prima facie evidence that the private sector would engage in it were it not being crowded out
The Ever Lurking (Ideological) Counterfactual: ATP
“ATP’s defenders claim that these subsidies generate greater technological innovation. They point out all the
technologies on the market that ATP funded. Of course, ATP grants have funded some successful products. But
the key question is whether the market would have produced those products even without ATP. Both economic
theory and practice say, ‘Yes.’”
Brian Riedl, Testimony before the Homeland Security and Governmental Affairs Committee, United States Senate,
May 2005
Thank you