From Republic of Science to Audit Society

Irwin Feller

Professor Emeritus, Economics, Pennsylvania State University

Assessing impact of public policies - Assessing impact of public research.

Which complementarities

ASIRPA

Paris, France

June 13, 2012

Presentation’s Focus

• Good and Bad Metrics/Methodologies for Evaluating the Impacts of Research

• Good and Bad Use(s) of Assessments of Research

Outline

Perennial S&T Policy Decisions

Political Economy of Governance of Science

Use/Nonuse/Misuse of Evidence

Choice of Methods/Metrics

The Times They Are A-Changin’

Then you’d better start swimming or you’ll sink like a stone.

What Has Been Lost (As seen by some)

“The age of chivalry is gone. That of sophisters, economists and calculators has succeeded”

Edmund Burke, Reflections on the Revolution in France

Context, Context, Context

• (Historical, Path Dependent) Context of National Science/Innovation Systems

• (Current Political/Policy) Context

• (Decision/Situational) Context

• (Linguistic/Disciplinary) Context

Assessment

Assessment is an historical, integral component of patron-based scientific activity.

Assessment Framework

• Who is Assessing Whom

• Using What Criteria/Performance Measures/Methods to Document and Value What Outcomes, with

• What Impacts on the Vitality of the Scientific Enterprise and the Position(s) of Those Engaged in Scientific Activity?

Pre-New Public Management Assessment Paradigm

• Republic of Science (M. Polanyi)

• Peer (Expert) Review

• Social Contract

Social Contract for Science

“Government promises to fund the basic science that peer reviewers find most worthy of support, and scientists promise that the research will be performed well and honestly and will produce a steady stream of discoveries that can be translated into new products, medicines or weapons”

(Guston and Keniston, 1994: “The Social Contract for Science”)

New Public Management Paradigm

• Accountability

• Deregulation

• Competition

• Performance Measurement

What is New

• Performance measures are increasingly mandated components of appropriation decisions and oversight reviews

• Dominant ethos is that better measures will lead to better (evidence-based) decisions

• Stream of new data sets and analytical techniques

• Growing political/policy trend towards performance-based budgeting

The Evidence-based Decision Making Imperative

• “Agencies should demonstrate the use of evidence throughout their FY2014 budget submissions.”

• “...comparative cost-effectiveness of agency investments: allocation of funding across agency programs or within programs”

• OMB: Use of Evidence and Evaluation in the 2014 Budget (May 18, 2012)

Tensions Among Accountability, Efficiency and Autonomy

Fine line between improved, evidence-based decision-making (“Wanted: Better Benchmarks”) and increased influence on the direction and inner workings of the scientific enterprise (“Asking Scientists to Measure Up”)

Low-Stakes/High-Stakes Assessments

• Low Stakes - Reputational Surveys: National Research Council Assessments of Graduate Programs; Shanghai Academic Ranking of World Universities

• High Stakes - Performance-based University Research Funding Systems: UK Research Assessment Exercise; Germany University Excellence Competition

3 Faces (Purposes) of Evaluation

• Learn about a program’s operations (Does it Work?; How can it be made better?)

• Control the behavior of those responsible for program implementation (Modify objectives; reallocate resources; reassign responsibilities)

• Influence the responses of outsiders in the program’s political environment (create the appearance of a well managed program; preemptively set metrics and methodologies)

Generic Science Policy Questions

“The major issues in science policy are about allocating sufficient resources to science, to distribute them wisely between activities, to make sure that resources are used efficiently and contribute to social welfare” (Lundvall, B. and S. Borras, 2005, p. 605)

Promises of Research Performance Assessment

• Objectives provide useful baseline for assessing performance.

• Performance measurement focuses attention on the end objectives of public policy, on what’s happened or happening outside rather than inside the black box.

• Well defined objectives and documentation of results facilitate communication with funders, performers, users, and others.

Limitations of Research Performance Measurement

• Returns/Impacts to research are uncertain, long-term, and circuitous

• Impacts are typically dependent on complementary actions by agents outside of Federal agency control

• Benefits from “failure” are underestimated

• Specious precision in selection of measures

• Distortion of Incentives

• Limited (public) evidence of contributions to improved decision making

Assessment as Lever for Structural change

“Given that science is changing, the institutions that are efficient in supporting science at one point in time may be less appropriate at a later point of time. On precise dimensions, a failure to continually re-tune science policy may therefore impede scientific progress.”

B. Jones (2010), As Science Evolves, How Can Science Policy?

Performance Metrics

• Metrics Abound: Generic List of 37, with New Ones Constantly being Proposed

• Most Programs Have Multiple Objectives--Select Metrics most relevant to Decisions

• “Cherry Pick” Metrics (Strategic Retreat from Objectives)

All Performance Measures Can be Gamed

“Once STI indicators are made targets for STI policy, such indicators lose most of the informational content that qualify them to play such a role” (Freeman and Soete, 2009)

Overview of Evaluation Methodologies

Method: Analytical conceptual modeling of underlying theory
Brief description: Investigating underlying concepts and developing models to advance understanding of some aspect of a program, project or phenomenon.
Example of use: To describe conceptually the paths through which spillover effects may occur.

Method: Survey
Brief description: Asking multiple parties a uniform set of questions about activities, plans, relationships, accomplishments, value, or other topics, which can be statistically analyzed.
Example of use: To find out how many companies have licensed their newly developed technology to others.

Method: Case study - descriptive
Brief description: Investigating in-depth a program or project, a technology, or a facility, describing and explaining how and why developments of interest have occurred.
Example of use: To recount how a particular joint venture was formed, how its participants shared research tasks, and why the collaboration was successful or unsuccessful.

Method: Case study - economic estimation
Brief description: Adding to a descriptive case study quantification of economic effects, such as through benefit-cost analysis.
Example of use: To estimate whether, and by how much, benefits of a project exceed its costs.

Method: Econometric and statistical analysis
Brief description: Using tools of statistics, mathematical economics, and econometrics to analyze functional relationships between economic and social phenomena and to forecast economic effects.
Example of use: To determine how public funding affects private funding of research.

Method: Sociometric and social network analysis
Brief description: Identifying and studying the structure of relationships by direct observation, survey, and statistical analysis of secondary databases to increase understanding of social organizational behavior and related economic outcomes.
Example of use: To learn how projects can be structured to increase the diffusion of resulting knowledge.

Taking the “Con” Out of Econometrics

Leamer (1983): “Hardly anyone takes the data analysis seriously”.

Distressing lack of robustness to changes in key (‘whimsical’) assumptions

Credibility Revolution in Empirical Economics

• “Empirical microeconomics has experienced a credibility revolution, with a consequent increase in policy relevance and scientific impact….

• “Primary engine driving improvement has been a focus on the quality of empirical research designs”

Angrist and Pischke (J. Economic Perspectives, 2010)

                            Before (τ)    After (τ + 1)
Treatment Group             S_T(τ)        S_T(τ + 1)
Comparison/Control Group    S_C(τ)        S_C(τ + 1)

Issue: Before/After Design Shows changes “related” to policy intervention, but does not adjust for “intervening” factors. (Threats to internal validity)

Reframe Analysis: Did policy “cause” change(s) in treatment group different from those observable in a comparison/control group
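
The reframed question can be read as a difference-in-differences comparison: the change in the treatment group minus the change in the comparison/control group. A minimal sketch in Python follows. The group means are hypothetical and purely illustrative (they come from no study cited here), and the estimate is only meaningful under the assumption that both groups would have followed the same trend without the policy.

```python
from dataclasses import dataclass


@dataclass
class GroupMeans:
    """Mean outcome for one group before (tau) and after (tau + 1) the policy."""
    before: float
    after: float

    def change(self) -> float:
        return self.after - self.before


def before_after_estimate(treated: GroupMeans) -> float:
    """Naive before/after change, using the treatment group only."""
    return treated.change()


def diff_in_diff_estimate(treated: GroupMeans, control: GroupMeans) -> float:
    """Treatment-group change net of the comparison-group change."""
    return treated.change() - control.change()


# Hypothetical, illustrative values only.
treated = GroupMeans(before=100.0, after=130.0)
control = GroupMeans(before=90.0, after=110.0)

naive = before_after_estimate(treated)
did = diff_in_diff_estimate(treated, control)
print(f"before/after: {naive:.1f}, difference-in-differences: {did:.1f}")
# prints "before/after: 30.0, difference-in-differences: 10.0"
```

In this illustration two thirds of the before/after change also appears in the comparison group, so attributing all of it to the policy would overstate the effect.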

Trends in U.S. Agricultural Productivity

Impact of R&D on U.S. Agricultural Productivity

Benefit-Cost Analysis Steps

Conduct Technical Analysis

Identify Next Best Alternative

Estimate Program Costs

Estimate Economic Benefits

Determine Agency Attribution

Estimate Benefits of Economic Return

RTI 2010

Benefit-Cost Estimates of Returns to Health Research

• An average 45-year-old in 1994 had a life expectancy 4 ½ years longer than in 1950 because cardiovascular disease mortality had decreased.

• “…(U)nambiguous conclusion …that medical research on cardiovascular disease is clearly worth the cost” (2002; p.113).

• In benefit-cost terms, this increase is estimated to yield a 4-to-1 return for medical treatment and a 30-to-1 return for research and dissemination costs related to behavioral change. (Cutler and Kadiyala, 2002)

Benefit-Cost Estimates of DOE-EERE Geothermal Technology Studies

Dollar values in thousands of 2008 dollars.

Metric                      PDC            Binary      TOUGH       Cement
Total Benefits              $15,823,141    $379,870    $928,369    $7,828
Program Cost                $108,886       $195,469    $101,807    $137,223
Net Benefits                $15,714,255    $184,401    $826,562    -$129,395
NPV of Net Benefits @3%     $8,147,636     $74,642     $418,997    -$106,679
NPV of Net Benefits @7%     $3,850,834     $3,380      $173,667    -$86,579
Benefits/Costs Ratio        145.3          1.9         9.1         0.06
Benefits/Costs Ratio @3%    94.4           1.5         6.1         0.03
Benefits/Costs Ratio @7%    56.4           1.0         3.7         0.012
Internal Rate of Return     91%            7%          19%         NA

Source: RTI, 2010
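
The undiscounted rows of the table can be checked with simple arithmetic: net benefits are total benefits minus program cost, and the benefits/costs ratio is their quotient. The sketch below recomputes both from the table's totals. The `present_value` helper and the cash-flow stream fed to it are hypothetical additions for illustration only. The slide does not report the annual flows behind the discounted rows, so those rows cannot be reproduced here.

```python
# Undiscounted totals copied from the table (thousands of 2008 dollars).
PROGRAMS = {
    "PDC":    {"benefits": 15_823_141, "cost": 108_886},
    "Binary": {"benefits": 379_870,    "cost": 195_469},
    "TOUGH":  {"benefits": 928_369,    "cost": 101_807},
    "Cement": {"benefits": 7_828,      "cost": 137_223},
}


def net_benefits(benefits: float, cost: float) -> float:
    return benefits - cost


def benefit_cost_ratio(benefits: float, cost: float) -> float:
    return benefits / cost


def present_value(flows, rate):
    """Discount annual flows to year 0; flows[0] is the year-0 flow."""
    return sum(f / (1.0 + rate) ** t for t, f in enumerate(flows))


summary = {
    name: (net_benefits(**p), benefit_cost_ratio(**p))
    for name, p in PROGRAMS.items()
}
for name, (net, ratio) in summary.items():
    print(f"{name:7s} net benefits = {net:>12,.0f}   B/C = {ratio:.2f}")

# Hypothetical stream (NOT from the RTI study): cost up front, benefits later.
hypothetical_net_flows = [-100.0, 0.0, 0.0, 60.0, 60.0, 60.0]
pv_at_3 = present_value(hypothetical_net_flows, 0.03)
pv_at_7 = present_value(hypothetical_net_flows, 0.07)
print(f"hypothetical NPV @3% = {pv_at_3:.1f}, @7% = {pv_at_7:.1f}")
```

Because costs come early and benefits late, a higher discount rate lowers the net present value. The table shows the same pattern between its 3% and 7% rows.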

Econometric Approach: Manufacturing Extension Partnership

Summary Statistics

Variable                                      Client Mean    Non-Client Mean
N                                             1,559          15,982
Age, 1992                                     15.97          16.04
Employment, 1992                              170.21         71.70
Employment Growth Rate, 1987-1992             0.013          -0.088
Sales, 1992                                   30,797,199     13,418,587
Sales Growth Rate, 1987-1992                  0.052          -0.085
Sales Growth Rate, 1982-1987                  0.427          0.338
Annual wage, 1992                             28,072         25,013
Production Worker Share, 1992                 0.699          0.724
Value Added Per Worker, 1987                  53,042         50,853
Value Added Per Worker, 1992                  56,709         52,797
Labor Productivity Growth Rate, 1987-1992     0.215          0.203
Labor Productivity Growth Rate, 1982-1992     0.052          0.010
# of Extension Projects                       3.82           NA
Total Project Costs                           63,787         NA

Source: R. Jarmin, Measuring the Impact of Manufacturing Extension
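
A first, purely descriptive reading of the summary statistics compares client and non-client means. The sketch below computes raw gaps and the change in value added per worker for each group, using only figures from the table. It is not the econometric model itself. The variable names are illustrative, and the gaps are not impact estimates, since clients differ from non-clients on characteristics such as size.

```python
# (client mean, non-client mean), copied from the summary statistics.
MEANS = {
    "employment_1992": (170.21, 71.70),
    "value_added_per_worker_1987": (53_042.0, 50_853.0),
    "value_added_per_worker_1992": (56_709.0, 52_797.0),
}


def raw_gap(variable: str) -> float:
    """Client mean minus non-client mean."""
    client, non_client = MEANS[variable]
    return client - non_client


def relative_gap(variable: str) -> float:
    """Raw gap as a fraction of the non-client mean."""
    _, non_client = MEANS[variable]
    return raw_gap(variable) / non_client


# Change in value added per worker, 1987 to 1992, for each group.
client_change = (MEANS["value_added_per_worker_1992"][0]
                 - MEANS["value_added_per_worker_1987"][0])
non_client_change = (MEANS["value_added_per_worker_1992"][1]
                     - MEANS["value_added_per_worker_1987"][1])
excess_change = client_change - non_client_change

print(f"employment gap, 1992: {raw_gap('employment_1992'):.2f} "
      f"({relative_gap('employment_1992'):.0%} of the non-client mean)")
print(f"change in value added per worker: clients {client_change:,.0f}, "
      f"non-clients {non_client_change:,.0f}, difference {excess_change:,.0f}")
```

Client plants are more than twice the size of non-client plants on average, which is why a raw comparison of outcomes cannot by itself separate program impact from selection into the program.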

“Dominant” U.S. Methodology is Expert Panels

“The most effective means of evaluating federally funded research programs is expert review. Expert review - which includes quality review, relevance review, and benchmarking - should be used to assess both basic research and applied research programs”

(National Academies, Evaluating Federal Research Programs, 1999, p. 5)

Bibliometrics: US

• Added to reputational surveys in NRC assessments

• Patent to Citation Linkage to Document Impacts of Basic Research

• Increasingly used by departments/colleges

• No use of performance-based funding

• Little evidence (to date) of impacts on Federal funding of academic research

Use of Bibliometric Data to Allocate Resources Across Fields

• Over the last 3 decades, even as the US position in the life sciences has remained strong, its world share of engineering papers has been cut almost in half, from 38% in 1981 to 21% in 2009, placing it below the share (33%) for the EU27. Similar declines in world share are noted for mathematics, physics, and chemistry.

• If bibliometric performance is a function of resource allocation, a nation gets what it funds.

• This formulation begs the questions of whether what the nation is producing is what it most needs, and whether it is being produced in the most efficient manner.

Is Anyone Listening?

“The ideas of economists and political philosophers, both when they are right and when they are wrong, are more powerful than is commonly understood. Indeed the world is ruled by little else. Practical men, who believe themselves to be quite exempt from any intellectual influence, are usually the slaves of some defunct economist. Madmen in authority, who hear voices in the air, are distilling their frenzy from some academic scribbler of a few years back” (J. M. Keynes)

Is Anyone Listening? Yes

• Continuing Impact of “Academic Scribblers” on Men in Power in Setting Policy Worldview

  - Solow-Abramovitz-Romer

  - Arrow-Nelson

  - Mansfield

BIG “3” FEDERAL SCIENCE QUESTIONS

• Question: How much should be allocated to Federal research?
  Role of performance measures: Measures do not provide a basis for determining if, say, 3% is too high, too low, or just right.

• Question: How much to spend across missions/agencies/fields of science?
  Role of performance measures: Measures/methodologies provide multiple answers, leading to multiple possible decisions.

• Question: Which performers; what allocation criteria?
  Role of performance measures: Potentially of considerable value, but underutilized.

Using Social Rates of Return to Guide Resource Allocations

“But it is evident that these studies can provide very limited guidance to the Office of Management and Budget or to the Congress regarding many pressing issues. Because they are retrospective, they shed little light on current resource allocation decisions, since these decisions depend on the benefits and costs of proposed projects, not those completed in the past”.

(Mansfield, 1991, p. 26).

Use, Non-Use, Misuse of Program Assessments

• Use: NIH Benefit-Cost Studies

• Nonuse: ATP (Program terminated); DOE-EERE (Budget slashed)

• Misuse: Overgeneralization of findings to different decision/policy settings

• “Bunkum”: Worthless, mundane, incompetently done assessments

Asymmetrical Impacts of Well Done Evaluations

• If program is not working, kill it because it is ineffective

• If program is working, prima facie evidence that the private sector would engage in it were it not being crowded out

The Ever Lurking (Ideological) Counterfactual: ATP

“ATP’s defenders claim that these subsidies generate greater technological innovation. They point out all the technologies on the market that ATP funded. Of course, ATP grants have funded some successful products. But the key question is whether the market would have produced those products even without ATP. Both economic theory and practice say, ‘Yes.’”

Brian Riedl, Testimony before the Homeland Security and Governmental Affairs Committee, United States Senate, May 2005

Thank you
