Research, Development, and Engineering Metrics
by
John R. Hauser
October 1997
John R. Hauser is the Kirin Professor of Marketing, Massachusetts Institute of Technology,
Sloan School of Management, 38 Memorial Drive, E56-314, Cambridge, MA 02142, (617) 253-2929,
(617) 258-7597 fax, [email protected].
This research was funded by the International Center for Research on the Management of
Technology (ICRMOT), Sloan School of Management, M.I.T. We wish to thank the managers,
scientists, and engineers who donated their time to talk to us about this important topic. This paper has
benefitted from presentations at the M.I.T. Marketing Group Seminar, Stanford University Marketing
Seminar, the US Army Soldier Systems Command, the ICRMOT sponsors meeting, the Marketing
Science Conference at the University of Florida, and the Marketing Science Institute’s conference on
Interfunctional Issues in Marketing. Special thanks to Florian Zettelmeyer who completed most of the
qualitative interviews described in this paper. I have discussed many of these ideas with him and have
benefitted greatly from his feedback.
We would appreciate your feedback on any and all aspects of this working paper. Our goal in
this paper is to provide practical insight into an interesting managerial problem by combining
qualitative fieldwork with formal analysis. An annotated bibliography and a summary of the qualitative
interviews are available upon request. A complete listing of ICRMOT working papers is available at
http://web.mit.edu/icrmot/www/.
Research, Development, and Engineering Metrics
Abstract
We seek to understand how the use of Research, Development, and Engineering (R,D&E)
metrics can lead to more effective management of R,D&E. This paper combines qualitative and
quantitative research to understand and improve the use of R,D&E metrics. Our research begins with
interviews of 43 representative Chief Technical Officers, Chief Executive Officers, and researchers at
10 research-intensive international organizations. These interviews, and an extensive review of the
literature, provide qualitative insights. Formal mathematical models attempt to explore these
qualitative insights based on more general principles.
Our research suggests that metrics-based evaluation and management vary based on the
characteristics of the R,D&E activity. For applied projects, we find that project selection can be based
on market-outcome metrics when firms use central subsidies to account for short-termism, risk
aversion, and scope. With an efficient form of subsidies known as “tin-cupping,” the business units
have the incentives to choose the projects that are in the firm’s best long-term interests. For core-
technological development, longer time delays and more risky programs imply that popular R,D&E
effectiveness metrics lead researchers to select programs that are not in the firm’s long-term interest.
Our analyses suggest that firms moderate such market-outcome metrics by placing a larger weight on
metrics that attempt to measure research effort more directly. Such metrics include standard measures
such as publications, citations, patents, citations to patents, and peer review. For basic research, the
issues shift to getting the right people and encouraging a breadth of ideas. Unfortunately, metrics that
identify the “best people” based on research success lead directly to “not-invented-here” behaviors.
Such behaviors result in research empires that are larger than necessary, but lead to fewer ideas. We
suggest that firms use “research tourism” metrics which encourage researchers to take advantage of
research spillovers from universities, other industries, and, even, competitors.
R&D expenditure is often a convenient target when it comes to maintaining or increasing the
company dividend. In fact, with R&D expenditure roughly the same amount as the dividend in many
companies, it is a significant temptation.
James W. Tipping (1993, p. 13)
Director of Research and Technology, ICI Americas, Inc.
Pioneering research is closely connected to the company's most pressing business problems. ...
Research must "coproduce" new technologies and work practices by developing with partners
throughout the organization a shared understanding of why these innovations are important.
John Seely Brown (1991, pp. 103-104)
Director of Xerox Palo Alto Research Center (PARC)
Balancing Market- and Research-Driven R,D&E
Research, Development, and Engineering (R,D&E) provides the science and technology which
firms use to serve tomorrow's customers profitably. Many managers, consultants, and researchers have
argued that, to succeed in the next century, R,D&E should be market driven. See review in Griffin and
Hauser (1996). John Seely Brown's comments are typical of those heard in interviews with Chief
Technical Officers (CTOs) and Chief Executive Officers (CEOs). Indeed a recent international CTO
task force on the evaluation of R,D&E opines that success is more likely if a product delivers unique
benefits to the user (EIRMA 1995, p. 36).
However, it is not easy for R,D&E to be market-driven. If we limit our definition of the
customer to "today's customers," it might not even be desirable. R,D&E, almost by definition,
represents the long-term technological capability of the organization. While many successful new
products are developed based on customer needs (von Hippel 1988), an organization cannot meet
customer needs if it does not have the capability to do so (EIRMA 1995). The laser was not invented to
provide high quality music or to store large quantities of data on compact disks. The US Army
Research Laboratory (ARL) and their affiliated research, development, and engineering centers
(RDECs) would not have been able to adapt rapidly to the post-cold-war era if they did not have
capabilities in the basic research areas. By maintaining basic chemistry and chemical engineering
expertise, the Hoechst Celanese Advanced Technology Group, a major producer of chemicals for
automotive tires, was able to turn a chance discovery of a chemical process into a thriving
pharmaceutical business. Other examples include Carothers' research on linear superpolymers that led
to Nylon and Westinghouse's research on water flows through porous geological formations that led to
breakthroughs in uranium mining, the evaluation of environmental impacts for real estate development,
and heat flow analyses for high-temperature turbines and for below-ground heat pumps (Nelson 1959,
Mechlin and Berg 1980). On the other hand, the great isolation of Bayer A.G.'s corporate research
center was a failure (Corcoran 1994).
Perhaps today’s popular conviction that R,D&E should be evaluated based on market outcomes
is too strong. For example, Mansfield (1980) demonstrates that, holding total R,D&E expenditures
constant, an organization's innovative output is directly related to the percentage of expenditures
allocated to basic research. In a statistical study of new product development at 135 firms, Cooper and
Kleinschmidt (1995) find that adequate resources devoted to R,D&E is a key driver that separates
successful firms from unsuccessful firms. Bean (1995) indicates that a greater percentage of research
activities in R,D&E (vs. business units) implies more growth.
We seek to understand how metrics can be used to manage R,D&E more effectively.
Specifically, we examine how the use of market-outcome metrics should vary as research activities
move from basic explorations to applied projects. We demonstrate how risk, time lags, scope,
spillovers, and the management of creative people affect the metrics used to evaluate R,D&E.
Our methodology combines qualitative and quantitative methods. We began by interviewing
43 CTOs, CEOs, and researchers at 10 research-intensive organizations. See table 1. We next
reviewed the public statements of CTOs, consultants, and academic researchers. (See Zettelmeyer and
Hauser 1995 for more details on the qualitative interviews and Hauser 1996 for an annotated
bibliography.) Together these activities led to both a qualitative description of R,D&E's activities and
formal analyses that attempt to generalize the insights. These insights suggest the properties of metrics
that can be used to evaluate and manage R,D&E more effectively.
The remainder of this paper is structured into five sections. In the next section we describe a
tier metaphor for R,D&E. We then devote a section to each tier. We close with a summary and
suggested extensions.
A Tier Metaphor for Describing R,D&E
Many of the firms we interviewed used a tier metaphor to describe their R,D&E activities
(Figure 1). This metaphor recognizes that R,D&E activities vary based on risk, on the time lag from
conception to market outcomes, and on the number of potential applications (scope). Tier 1 represents
basic research. Activities in this area are exploratory and less tied to the market – they concentrate on
understanding basic phenomena that might have applicability to many business units. They are often
long-term and risky. Tier 2 represents the development of core-technological competence. Tier 2
activities fulfill an organization's existing strategic directions and set new ones. Tier 3 is applied
engineering. Activities in tier 3 are usually done with some funding from business units and are often
evaluated based on market outcomes. Not only is the tier metaphor common at the firms we
interviewed (for example, the US Army uses funding numbers such as 6.1, 6.2, and 6.3 to describe their
tiers), but it is consistent with concepts in the R,D&E literature (Bachman 1972, Krause and Liu 1993,
Pappas and Remer 1985, Tipping, Zeffren and Fusfeld 1995).
Some firms use a formal tier structure, while others use the metaphor to aid evaluation and
management. Although many firms assign activities to tiers, all recognize that the assignment is fuzzy.
Some activities overlap tiers and most activities evolve from one tier to another as knowledge is gained.
Real explorations, programs, and projects often have elements of more than one tier. Indeed, many
scientists and engineers work on activities drawn from two or more tiers. We use the tier metaphor to
focus on activities that have properties typical of each tier. This metaphor simplifies exposition
and makes the insights more transparent. For example, we treat the value of research scope in tier 3 as
if it were fully determined by tier 2 activities. In reality, there is still residual uncertainty about research
scope that is resolved by tier 3 activities. Thus, the lessons of tier 2 apply to tier 3, but to a lesser
extent. By focusing our analyses by tier we avoid repetition.
We present the tiers in a pyramid to represent conceptually the amount of funding that is
allocated to the tiers. For example, in a study of 108 corporations, Mansfield (1981) found that roughly
5% of company-financed research was devoted to tier 1. However, this does not mean that tier 1 is
unimportant. In many ways tier 1 is the R&D lab of the R,D&E lab.
In the R,D&E literature many words, such as program and project, are used interchangeably
(Steele 1988). For the purpose of this paper we adopt Steele's terminology and use the words
"objectives" and/or "explorations" for basic research activities, the word "programs" for development
activities, and the word "projects" for applied engineering activities. This trichotomy is somewhat
arbitrary, but it indicates clearly to which tier we refer.
Tier 3. Applied Engineering for R,D&E’s Customers
We begin our analyses with the most market-oriented of the tiers, applied engineering (tier 3).
Activities in this tier have the following properties: (1) The business unit managers have the knowledge
and skill to evaluate the projects. (2) The projects have more immediate application with relatively less
risk. And, (3) previous R,D&E activities have provided acceptable estimates of scope, the time stream
of payoffs, the magnitude of payoffs, and the probability of success. We focus on metrics that are used
to select among tier 3 projects.
Qualitative Ideas
Our interviewees suggested that project selection is the most important and difficult
management task in tier 3. They were satisfied with the monitoring and feedback mechanisms that they
used once a project was selected. Many CTOs believed that the business units (the customers of tier 3)
have the means and information with which to judge tier 3 projects. Furthermore, they believed that the
business units were better able to judge a project's value than R,D&E management. We found a major
trend toward making project selection more business-unit driven.
Among the statements that we heard were: "Customer satisfaction is the number one priority."
"R,D&E has to be developed in the marketplace." “The key is to become customer focused."
"Technology assessment is `What does it do for the customer?'" At one firm, R,D&E proposes tier 3
projects and the business unit managers decide whether or not to fund them. In many firms R,D&E
maintains its budget by "selling" projects to business units.
On the other hand, many firms subsidized R,D&E with central funds. Business units were
asked to pay only a fraction of the cost of applied engineering projects. One interviewee stated that the
business units could judge research better if they did not have to pay the entire cost. For other examples
of subsidies see Corcoran (1994), Mechlin and Berg (1980), Szakonyi (1990).
Our interviewees proposed at least three justifications for subsidies: research scope, risk
aversion, and varying time horizons between the business unit managers and the corporation. Research
scope affects subsidies when the results of a pilot test have applications beyond those for which the
business unit paid. Other business units often benefit without incurring R,D&E costs. See Mansfield
(1982), Mechlin and Berg (1980), and Vest (1995). Scope economies also apply across technological
disciplines, for example, when discoveries in chemistry enhance research in biology (Henderson and
Cockburn 1996, Koenig 1983). Risk aversion affects subsidies when, without a subsidy, a business unit
manager would decide to avoid a risky project even though the expected payoff to the firm justifies the
project. Different time horizons affect subsidies when, as expressed in our interviews, business unit
managers have shorter time horizons than the firm. They often favor quick fixes for their immediate
problems. See Braunstein and Salsamendi (1994), Hultink and Robben (1995), Negroponte (1996), and
Whelen (1976). Holmstrom (1989) adds theoretical justification that market expectations can make it
rational for managers to be short-term oriented.
Finally, in calculating the net value of an applied project, many firms recognize that they need
only commercialize those technologies that prove profitable in pilot tests (Mitchell and Hamilton
1988). The cost of commercialization can be avoided for failed pilot projects. We assume that the firm
implements strategies which minimize the tendency of business unit managers to escalate commitments
to failing projects (Boulding et al. 1997).
We now incorporate these ideas into a formal model.
Model
We illustrate the contingent nature of applied research decisions with the simple model in
Figure 2. First, business unit managers and/or R,D&E managers (and engineers) select among potential
projects and begin initial development. For project j let the pilot engineering costs be kj. If the project
succeeds (with probability pj), the business unit and R,D&E managers observe the commercial value
(tj ≥ 0) of the project. This commercial value is modeled as being drawn from a probability density
function, f(tj). If the project fails or if the realized commercial value is below a cutoff (tc), then the firm
can abort the project without further costs. If the commercial value is sufficient, the firm can exercise
its "option" and apply the technology elsewhere in the firm. We model this “research scope” as if the
firm can apply the technology to mj applications at a cost of cj for each application. Let αj be the
percent of the applications that are within the business unit that funded the project. (For tier 3 we
assume αj and mj are given. In the next section, we address how tier 2 might determine these values.)
The parameters in Figure 2 are feasible to obtain. Many organizations routinely make
judgments about the expected value of a pilot test (E[tj]), the probability of success for various
outcomes (pj), and costs (both for the pilot application, kj, and for eventual commercialization, cj). For
example, EIRMA (1995) suggests that the "3 main components that must be estimated for any project
are project cost, benefits, and probability of success." See Abt et al. (1979), Block and Ornati (1987),
Boschi et al. (1979), Krogh et al. (1988), and Schainblatt (1982) for discussion and methods.
To model the difference in time horizons we define γj and γF as the business unit and firm
discount factors. These factors reflect the fact that commercial values and costs are really time streams
of revenue and costs. If the business unit managers and the firm discount these time streams differently,
then the net present value as perceived by the business unit managers will differ from that perceived by
the firm. Without loss of generality, we normalize γF=1 and treat γj as the value relative to the firm. The
business unit manager is more short-term oriented when γj<1. For issues in the measurement of γj see
Hodder and Riggs (1985) and Patterson (1983).
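As a numerical illustration of this normalization (the payoff stream and discount factors below are hypothetical, chosen for exposition rather than estimated from our interviews), γj can be read as the ratio of the business unit's perceived net present value to the firm's:

```python
# Sketch (hypothetical numbers): how differential discounting makes a
# business unit undervalue a long-horizon payoff stream relative to the firm.

def npv(cash_flows, discount_factor):
    """Net present value of a stream under a constant per-period discount factor."""
    return sum(cf * discount_factor ** t for t, cf in enumerate(cash_flows))

stream = [100.0] * 5             # five periods of equal payoffs
firm_value = npv(stream, 1.00)   # gamma_F normalized to 1
bu_value = npv(stream, 0.80)     # a short-term-oriented business unit

# gamma_j is the ratio of the business unit's perceived value to the
# firm's; it shrinks as the payoff horizon lengthens.
gamma_j = bu_value / firm_value
print(round(gamma_j, 3))  # → 0.672
```

The same payoff stream with a longer horizon would yield a smaller γj, which is why, in the model below, short-termism varies by project.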
For simplicity, we include all project costs in kj such that tj is positive. This allows us to
illustrate the effect of f(tj) with a negative exponential distribution with expected value λj. Such
probabilistic processes are common in the R,D&E literature. When the business unit managers are risk
averse we model them as constantly risk averse with utility, u(x)=1-exp(-rx), where x is monetary
outcomes and r is the risk aversion parameter.1 For risk neutrality, u(x) becomes linear as r → 0.
Analyses
In the appendix we show that the optimal cutoff, tc, equals the cost of commercialization, cj, and
that the expected rewards (to the business unit) of the decision tree in Figure 2 are:
The computations are straightforward applications of conditional probability. The term, exp(-cj/λj),
appears in the formula to represent the fact that the firm need only invest further (and incur costs of cj)
when tj is above the cutoff. The expected outcome from the decision tree in Figure 2 exceeds the naïve
valuation, γjαjmjpj(λj-cj)-kj, that would be made without anticipating the "option" nature of the project.
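The option term in Equation 1 can be checked numerically. The following sketch (all parameter values are hypothetical, chosen for exposition rather than drawn from any interviewed firm) computes the Equation 1 valuation under the exponential assumption and compares it with the naïve valuation:

```python
# Sketch of the Equation 1 valuation under the exponential assumption.
# Parameter values are hypothetical, for illustration only.
import math
import random

gamma_j, alpha_j = 0.9, 0.5        # short-termism and scope concentration
m_j, p_j = 4, 0.6                  # number of applications, success probability
lam_j, c_j, k_j = 10.0, 3.0, 2.0   # E[t_j], commercialization cost, pilot cost

# Closed-form option term: E[max(t - c, 0)] = lam * exp(-c / lam)
option_term = lam_j * math.exp(-c_j / lam_j)

# Equation 1: expected net rewards as perceived by the business unit
eq1 = gamma_j * alpha_j * m_j * p_j * option_term - k_j

# Naive valuation that ignores the option to abort when t_j < c_j
naive = gamma_j * alpha_j * m_j * p_j * (lam_j - c_j) - k_j

# Monte Carlo check of the option term
random.seed(0)
mc = sum(max(random.expovariate(1 / lam_j) - c_j, 0.0)
         for _ in range(200_000)) / 200_000

print(eq1 > naive)  # → True: the option value exceeds the naive value
```

The gap between the two valuations widens as cj grows relative to λj, because aborting failed pilots avoids a larger share of commercialization cost.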
If the business unit manager is risk neutral, he or she will value the project via Equation 1. If
the manager is risk averse, the certainty equivalent (c.e.) can be approximated by:
1 The qualitative implications should be the same for most reasonable density and utility functions. Some readers may prefer
a two-parameter lognormal distribution to facilitate the option-value calculations and to separate risk from expected outcomes.
(1)  Expected net rewards = γj αj mj pj λj exp(-cj/λj) - kj
For risk neutrality, Rj → 1. The firm values the project differently than the business unit managers. It
earns value from all commercializations within the firm, discounts future value and cost streams less,
and can diversify risk. The firm will want at least one business unit to select the project if:
Subsidies
Comparing Equations 1 and 3 we see that the firm can match its incentives with those of the
business unit managers by subsidizing projects. If business units are asked to pay only a fraction, sj, of
the project costs, then the business unit manager(s) will choose the same projects as the firm if:
In other words, the subsidy adjusts for the concentration of research scope (αj), short-termism (γj), and
risk aversion (Rj). The subsidy varies by project because both scope and short-termism vary by project.
(Short-termism varies because the effect of a differential discount rate has a greater impact on projects
with longer time horizons. Rj varies by project because the uncertainty in payoffs varies by project.)
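A sketch of the Equation 4 subsidy calculation for two stylized projects (all parameter values are hypothetical, and the form of Rj follows Equation 2) illustrates how the required subsidy grows with scope dilution, short-termism, and risk:

```python
# Sketch of the Equation 4 subsidy rate; parameter values are hypothetical.

def subsidy_share(alpha, gamma, r, m, lam):
    """Fraction s_j of project cost the business unit should pay (Eq. 4),
    with R_j = 1 / (1 + r * alpha * m * lam) as in Equation 2."""
    R = 1.0 / (1.0 + r * alpha * m * lam)
    return alpha * gamma * R

# Concentrated, short-horizon, low-risk project: small subsidy (s_j near 1)
s_safe = subsidy_share(alpha=0.9, gamma=0.95, r=0.001, m=2, lam=5.0)

# Diffuse, long-horizon, risky project: large subsidy (s_j small)
s_risky = subsidy_share(alpha=0.3, gamma=0.6, r=0.02, m=10, lam=20.0)

print(s_safe > s_risky)  # → True
```

The business unit's share falls, and the required central subsidy rises, exactly when benefits are diffuse (small αj), payoffs are distant (small γj), or payoffs are uncertain (small Rj), matching Implication 1(c).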
In principle, the subsidy also varies by business unit. Thus, the firm needs a means by which it
can entice either a single business unit or a coalition of business units to fund a project. (The firm
benefits if other business units "free ride" on the initial business unit's investment. We leave strategic
free riding among business units to future papers.)
In theory, the firm can implement the subsidy with a Dutch auction, lowering sj until one and
only one business unit selects the project (with the limit that the subsidy is not so low that Equation 3 is
violated). In practice, the subsidies, which vary from 30% to 90% among our interviewees, are set by a
complex negotiation process that allows information to be transferred and coalitions to form. (One
manager called this "tin cupping" because, like a beggar with a tin cup, she went to other business unit
managers asking them to contribute to projects that she championed.) An average subsidy will
(2)  c.e. of expected net rewards ≈ Rj γj αj mj pj λj exp(-cj/λj) - kj,
     where Rj = 1/(1 + r αj mj λj)

(3)  mj pj λj exp(-cj/λj) - kj ≥ 0

(4)  sj = αj γj Rj
introduce selection inefficiencies whenever there is substantial variation in αj, γj, λj, and mj.
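The tin-cupping auction described above can be sketched in stylized form (the acceptance thresholds below stand in for the business units' certainty-equivalent cutoffs and are hypothetical, not observed data):

```python
# Sketch of the "Dutch auction" subsidy mechanism: lower the cost share s
# until exactly one business unit is willing to fund the project.
# Thresholds are hypothetical acceptance levels in percent of cost.

def dutch_auction(acceptance_thresholds, step=1):
    """Lower s from 100% in integer-percent steps; unit i accepts once s
    falls to its threshold.  Return (s_percent, winning unit), or None
    if no unit ever accepts at a positive cost share."""
    for s in range(100, 0, -step):
        takers = [i for i, t in enumerate(acceptance_thresholds) if s <= t]
        if takers:
            return s, takers[0]
    return None

# Three business units; unit 1 values the project most and funds it first
result = dutch_auction([40, 72, 55])
print(result)  # → (72, 1)
```

In practice, as noted above, the subsidy is set by negotiation rather than a literal auction, and Equation 3 bounds how far s may fall; the sketch only makes the single-winner logic concrete.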
We summarize this section by stating the implications of Equations 1-4 as a set of qualitative
hypotheses that can be used for empirical testing. Equations 1-3 provide explicit quantification of the
value of applied engineering projects.
IMPLICATION 1. (a) The "option value" of a tier 3 project should be higher than the (naïve) expected
value. This option value anticipates future decisions on subsequent investment. (b) For applied
projects, firms should use subsidies and implicit auctions. The subsidies and auctions correct for the
tendency of business unit managers to choose projects that are more concentrated in a single business
unit, have shorter-term payoffs, and are less risky than the firm would find optimal. (c) Subsidies
should be larger (sj smaller) when projects have benefits that are less concentrated, have revenue
streams over longer periods, and are perceived as more risky.
Tier 2. Development Programs to Match or Create Core Technological Competence
We now focus on development activities (tier 2) that provide the bridge from basic research
(tier 1) to applied engineering (tier 3). These activities are more risky and have longer-term payoffs
than tier 3 projects. These activities are more difficult for business unit managers (and line managers) to
evaluate because evaluation requires more detailed information and greater current technical
experience. Instead, business unit managers rely more heavily on the decisions of R,D&E managers and
engineers. The challenge for activities having tier 2 characteristics is to develop a set of metrics with
which to evaluate the decisions and the efforts of R,D&E managers and engineers. Because the firm
must rely on their decisions, we seek metrics which encourage R,D&E managers and engineers to make
those decisions and allocate those efforts that are in the firm’s best interests. In addition, because tier 2
programs evolve into tier 3 projects, we examine how the activities in tier 2 determine the parameters
used to select tier 3 projects.
Qualitative Ideas
Our qualitative interviews and the R,D&E literature suggest that the primary task of
development is to match expertise with strategic direction. See Adler et al. (1992), Allio and Sheehan
(1984), Block and Ornati (1987), Boblin et al. (1994), Chester (1994), EIRMA (1995), Frohman
(1980), Ransley and Rogers (1994), Schmitt (1987), Sen and Rubenstein (1989), and Steele (1987,
1988). As one of our interviewees said: "The customer knows the direction, but lacks the expertise;
researchers have the expertise, but lack the direction." Tier 2 researchers and managers are judged both
for their competence in developing technologies and for their ability to align the values of R,D&E with
those of the firm (Steele 1987). Our interviewees said that development succeeds if it gets the
programs right. However, researchers in tier 2 must also have the incentives to invest the right amount
of scientific, engineering, and process effort.
R,D&E researchers (and managers) appear to have more expertise and knowledge than top-
level managers about the specifics of the development programs. Thus, firms use metrics to encourage
tier 2 researchers to select the right programs and to put forth sufficient scientific, engineering, and
process effort to develop those programs. We heard concerns that net present value metrics favor short-
term, predictable, incremental development programs (Steele 1988, Irvine 1988). Our interviewees felt
that tier 2 metrics should not imply a penalty for failure that is too strong. Such penalties encourage
researchers to focus only on “safe” technologies and not take sufficient risks. Failure was part of the
territory (estimates varied from 20% to 80%); interviewees felt that metrics which eliminated failure also
eliminated success. Instead, we often found metrics such as patents, publications, citations, citations to
patents, and peer review. See also Edwards and McCarrey (1973), Henderson and Cockburn (1996),
Irvine (1988), Miller (1992), Pappas and Remer (1985), and Shapira and Globerson (1983). These
metrics appear to be surrogates for the scientific, engineering, and process effort that is devoted to
development programs. There appears to be a tension, when designing a tier 2 evaluation system,
between market-outcome metrics and “effort” metrics.
Model
Figure 3 represents our conceptual model of tier 2 activities. In step 1, researchers select
programs based on the ongoing results of basic research (tier 1) explorations.2 Naturally, tier 2
researchers do so anticipating potential outcomes but taking uncertainty into account. In step 2,
researchers evaluate each program to resolve some of the uncertainty. In this evaluation they determine
research scope (mj) and concentrations (αj's for each business unit). This step also clarifies uncertainty
in the value (to the firm) of the program so that business unit managers and applied engineers have
2We refer to development decisions and efforts as if they were made by researchers. The same analyses apply to teams of
researchers and managers (as long as we account for free riding within teams).
sufficient information to estimate the parameters for Equations 1-4. If the program shows sufficient
potential, then, in step 3, development researchers invest significant scientific, engineering, and process
efforts to develop the program into potential applied projects.
Because development researchers select programs before they know the outcomes of the
development programs, we model a key parameter, research scope, as a random variable, m̃j.
Specifically, we model the process of determining m̃j as if there were Mj potential applications within
the firm. During step 2, the researcher determines how many of these applications apply to the firm – a
priori each applies with a probability, qj. (Estimates of Mj and qj are based on the result of basic
research explorations and on expertise in evaluating the outcomes of these explorations.) We define vj
as the "value" of each realized application.
We model the scientific, engineering, and process effort in step 3 with an additive parameter, ej,
that measures the expected incremental profit to the firm of this effort.3 The effort by development
researchers to obtain these results is costly to the researchers and this cost may be difficult for the firm
to observe. We call this cost, dj(ej), and assume that it is convex in ej. Finally, there is some fixed cost
to the firm, Kj, of developing program j.
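A sketch of this scope model (with hypothetical values of Mj, qj, vj, ej, and Kj) shows the expected scope Mjqj and the resulting expected program value before wages:

```python
# Sketch of the tier 2 scope model: realized scope m~_j is the number of
# the M_j potential applications that turn out to apply, each with
# probability q_j.  All parameter values are hypothetical.
import random

M_j, q_j, v_j = 12, 0.25, 5.0   # potential applications, hit rate, value each
e_j, K_j = 4.0, 10.0            # expected effort profit, fixed program cost

# Expected scope is M_j * q_j; expected program value before wages
expected_scope = M_j * q_j
expected_value = expected_scope * v_j + e_j - K_j

# Monte Carlo draws from the realized scope distribution
random.seed(1)
draws = [sum(random.random() < q_j for _ in range(M_j)) for _ in range(100_000)]
mc_scope = sum(draws) / len(draws)

print(round(expected_value, 1))  # → 9.0
```

The spread of the draws around Mjqj is what a risk-averse researcher perceives as costly, which is why uncertainty in m̃j matters for the metrics analyzed below.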
Each program might have different anticipated time streams of net revenues and development
researchers might be more short-term oriented than the firm. We model this by a discount factor, Γj. We
allow researchers to be (constantly) risk averse. (We expect that Γj < γj because of longer time lags
associated with development programs. The case of no short-termism is represented by Γj=1 and the case
of risk neutrality is represented by r → 0.)
To focus on key phenomena and avoid redundancy, we have simplified our model in this
section. Each of these simplifications can be relaxed readily. First, we set kj=0 to simplify the options
analysis that has already been discussed. (Options analysis applies to tier 2 in the same manner that it
applies to tier 3.) Second, we model uncertainty in m̃j but not vj because the effect of uncertainty in vj
would only reinforce the effects due to m̃j. Finally, we model the effort allocated in step 3 but not the
effort allocated in step 2. The basic intuition would be the same, but the algebra would be
unnecessarily complicated. None of these simplifications change the basic insights derived here.
3We define ej based on effort that is induced incrementally by the metrics system above and beyond any effort that the
researcher would put forth based solely on his or her base wage. We might consider alternative formulations treating either ej or (1+ej) as multiplicative terms. These formulations provide the same qualitative implications when we focus on program choice or effort allocation. However, scaling constants and the detailed optimizations vary.
Development Metrics
Recently, many firms have adopted development metrics which are based on comparing market
outcomes to development costs. For example, see McGrath and Romeri (1994). However, some of our
interviewees believe that such schemes distort development decisions. Thus, we want to contrast these
metrics with “effort” metrics.
Because many firms try to measure effort directly with metrics such as publications,
citations, patents, citations to patents, and peer review, we represent these metrics with a normal
random variable, ẽj, with mean ej and variance σe². The uncertainty in this measure represents the fact
that these metrics are, at best, noisy indicators of the incremental profit to the firm of the researchers’
efforts.
Market outcomes result from the value and scope of the chosen program and from the
researchers’ efforts. To explore development metrics we recognize that the market outcomes in our
model are m̃j vj + ẽj and the costs are Kj. This implies a net market outcome metric of m̃j vj + ẽj - Kj.
To represent our observations that firms combine market-outcome and effort metrics, we consider a
more general metric which allows a weight of η1 on market outcomes, η2 on effort, and η3 on costs.
If we define βv=η1, βe=η1+η2, and βK=η3, then this implies the linear development metric given by
equation 5.
In this notation, the metric advocated by McGrath and Romeri is represented by a special case where
βv=βe=βK=1, or equivalently, η1=η3 and η2=0.4 The linear function suffices to demonstrate the basic
tension in development metrics. However, future analyses might improve upon observed practice by
introducing non-linear reward systems.
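A sketch of the Equation 5 weighting scheme (all values hypothetical) contrasts the pure net-market-outcome special case with a metric that shifts weight onto measured effort:

```python
# Sketch of the Equation 5 linear development metric, with
# beta_v = eta1, beta_e = eta1 + eta2, beta_K = eta3.
# All parameter values are hypothetical.

def development_metric(m, v, e, K, eta1, eta2, eta3):
    """Linear metric with weight eta1 on market outcomes (m*v + e),
    eta2 on measured effort e, and eta3 on cost K."""
    beta_v, beta_e, beta_K = eta1, eta1 + eta2, eta3
    return beta_v * m * v + beta_e * e - beta_K * K

# Net-market-outcome special case: beta_v = beta_e = beta_K = 1
pure_market = development_metric(m=3, v=5.0, e=4.0, K=10.0,
                                 eta1=1.0, eta2=0.0, eta3=1.0)

# Shifting weight toward the effort metric (eta2 > 0) rewards measured
# research effort relative to realized market outcomes
effort_weighted = development_metric(m=3, v=5.0, e=4.0, K=10.0,
                                     eta1=0.5, eta2=1.0, eta3=0.5)

print(pure_market)  # → 9.0
```

Raising η2 while lowering η1 insulates researchers from the noise in m̃j, which is the tension between market-outcome and effort metrics that the analysis below makes precise.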
Development metrics enable top management to motivate researchers to choose those
development programs (and allocate effort) that are in the best interests of the firm. Top management
would have less need for these metrics if it could simply dictate to the researchers the programs on
which they should work and then monitor costlessly how hard they work. The metrics enable top
management to delegate the selection of programs and the allocation of scientific, engineering, and
4Specifically, their effectiveness index (EI) is equal to (% of revenue from new products)*[(% of revenue that is profit)/(%
of revenue spent on R&D) + 1]. For clarity of exposition our representation is a linear rather than a ratio function. We might also note that their metric does not include the impact of development activities on existing products.
(5) development metric = m v + e - Kv j j e j K jβ β β~ ~
R, D & E METRICS PAGE 12
process effort to those who have the unique technical knowledge and experience necessary to judge the
merits of the programs.
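As a check on the mapping between the η-weights and the β-weights of Equation 5, the following numeric sketch (all weight and outcome values are illustrative, not taken from the paper) confirms that the two parameterizations of the linear development metric coincide:

```python
# Check that weighting eta1 on market outcomes (m~v + e~), eta2 on effort,
# and eta3 on costs reduces to Equation 5 with beta_v = eta1,
# beta_e = eta1 + eta2, beta_K = eta3.

def metric_eta(m_v, e, K, eta1, eta2, eta3):
    """Metric with weights on net market outcome, effort, and cost."""
    return eta1 * (m_v + e) + eta2 * e - eta3 * K

def metric_beta(m_v, e, K, beta_v, beta_e, beta_K):
    """Linear development metric of Equation 5."""
    return beta_v * m_v + beta_e * e - beta_K * K

eta1, eta2, eta3 = 0.2, 0.6, 0.2            # hypothetical weights
beta_v, beta_e, beta_K = eta1, eta1 + eta2, eta3

m_v, e, K = 10.0, 4.0, 3.0                  # hypothetical realizations
assert abs(metric_eta(m_v, e, K, eta1, eta2, eta3)
           - metric_beta(m_v, e, K, beta_v, beta_e, beta_K)) < 1e-12

# McGrath-Romeri special case: beta_v = beta_e = beta_K = 1
# reduces the metric to the simple net outcome m~v + e~ - K.
assert metric_beta(m_v, e, K, 1, 1, 1) == m_v + e - K
```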
To represent how researchers will evaluate rewards, explicit or implicit, that are based on this
metric, we first recognize that researchers will find effort to be costly. Thus, we subtract d_j(e_j).
Second, we recognize that there is a time lag in observed outcomes, but not costs. Thus, we discount
observed outcomes. Further, if ẽ_j is observed before m̃_j v_j, then we allow different discounting
constants, Γ_j^m and Γ_j^e.⁵ Finally, if researchers are risk averse they will perceive the uncertainty in
m̃_j and ẽ_j to be costly. Thus, we represent the uncertain rewards with their certainty equivalent. In
the appendix we derive the researcher's certainty equivalent based on the development metric:

(6)  c.e. = β_v Γ_j^m M_j q_j v_j + β_e Γ_j^e e_j − β_K K_j − d_j(e_j) − (r/2){β_v² (Γ_j^m)² M_j q_j (1 − q_j) v_j² + β_e² (Γ_j^e)² σ_e²}

It is immediately clear that either β_v or β_e must be non-zero. Otherwise, researchers would select no
programs for development and allocate no effort.

In contrast to researchers, the (risk-neutral) firm wants to select those programs that maximize
the expected value of the program (net of the wages the firm must pay). To calculate this value we use
standard agency-theory methods (e.g., Holmstrom 1989) to represent the profit the firm can earn. First,
we recognize that M_j q_j is the expected value of m̃_j and e_j is the expected value of ẽ_j. Thus, before
wages, the firm's expected profits are M_j q_j v_j + e_j − K_j. However, if the firm is to retain its employees it
must pay them their market wages net of switching costs, w_o, and it must reimburse them for any effort
costs and for any risk costs. (By definition, w_o represents the minimum amount that would be required
to retain a researcher who did not have to incur incremental effort and risk costs on the firm's
programs.) Thus, the firm's profit is given by:

(7)  Firm's profit = M_j* q_j* v_j* + e_j** − K_j* − d_j*(e_j**) − w_o − risk costs

where j* indicates the researchers' program selection and e_j** indicates the researchers' response to the
firm's choice of the β's. The firm will select the β's to maximize its profit. This optimization will, by
implication, determine the program choice and the effort that the researchers allocate.

⁵We have chosen to define the Γ's with respect to the β's rather than the η's. It is possible to derive one from the other.
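To make the structure of the certainty equivalent in Equation 6 concrete, here is a small sketch (all parameter values are hypothetical) that computes it as the discounted expected reward minus the effort cost and a risk premium:

```python
def certainty_equivalent(beta_v, beta_e, beta_K, M, q, v, e, K,
                         d_of_e, r, gamma_m, gamma_e, sigma_e2):
    """Researcher's certainty equivalent of the development metric (Equation 6):
    discounted expected reward, minus effort cost, minus a risk premium that
    grows with risk aversion r and the variances of m~ and e~."""
    expected = (beta_v * gamma_m * M * q * v
                + beta_e * gamma_e * e
                - beta_K * K
                - d_of_e)
    risk_cost = (r / 2.0) * (beta_v**2 * gamma_m**2 * M * q * (1 - q) * v**2
                             + beta_e**2 * gamma_e**2 * sigma_e2)
    return expected - risk_cost

# Illustrative numbers: a risk-neutral researcher (r = 0) values the program
# at its discounted expected reward; risk aversion lowers that valuation.
ce_neutral = certainty_equivalent(1, 1, 1, M=5, q=0.5, v=2, e=1, K=2,
                                  d_of_e=0.5, r=0.0, gamma_m=0.8,
                                  gamma_e=0.95, sigma_e2=0.2)
ce_averse = certainty_equivalent(1, 1, 1, M=5, q=0.5, v=2, e=1, K=2,
                                 d_of_e=0.5, r=1.0, gamma_m=0.8,
                                 gamma_e=0.95, sigma_e2=0.2)
assert ce_averse < ce_neutral
```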
In principle, we could solve the complete agency problem by choosing the β's to maximize
Equation 7, recognizing that the certainty equivalent of the researchers' wages is given by Equation 6.
The resulting solution would balance the tension between inducing the best program choice and
motivating the optimal effort. However, we gain greater insight into this tension with a simpler
approach that analyzes the problem in stages. We begin by holding effort constant and illustrating how
β_v and β_K affect the choice among programs. We then hold research scope constant to show how β_e
affects the researchers' efforts. This allows us to interpret the relative magnitudes of β_v, β_K, and β_e.
Selecting the Right Programs
For this subsection we assume that e_j* and d(e_j*) do not vary by research program and that
σ_e² = 0. Under these conditions, the anticipated effort allocation will not affect program choice.⁶ With
efforts constant among programs, the effort benefits, effort costs, and fixed wages would simply shift K_j
by a fixed constant in the following discussion. Thus, we can normalize [e_j* − d(e_j*) − w_o] = 0 without loss
of generality.
Differential discounting (Γ_j^m < 1) and risk aversion (r > 0) cause the researcher's c.e. to differ
from the expected profit the firm could earn if it did not need to rely on metrics and could dictate the
choice of program. In the latter case the firm's profit would be M_j q_j v_j − K_j. For program choice to
matter to the researcher, Equation 6 requires a non-zero β_v. However, a larger β_v increases the firm's risk
costs. Furthermore, Equation 6 suggests that β_v could distort program choice. That is, differential
discounting and risk aversion might cause researchers to reject some programs that would be profitable
for the firm and to favor less profitable programs (for the firm) over more profitable ones.
We find it is easier to illustrate these effects graphically. Figure 4 maps the magnitude of the
phenomena for the case of two alternative research programs and for representative values of the
parameters (given in the appendix). The horizontal and vertical axes represent the values (vj) of
programs 1 and 2, respectively.
Figure 4a isolates the effect of discounting (with risk neutrality); the equations are derived in the
appendix. If researchers discount the time stream of revenue, then some programs will be falsely
rejected (the inverse-L-shaped region in Figure 4a). If revenues from one research program occur faster
than those from another (Γ_1^m > Γ_2^m), then researchers will be more likely to choose the program with better short-
term prospects (the diagonal false-selection region in Figure 4a). We can eliminate the false-rejection
regions if β_K = Γ_j^m β_v, but eliminating the false-selection region requires, in addition, that we allow β_v to
vary by program.

⁶The technical conditions of the problem formulation assure us that we can choose β_e* independently of β_v* and β_K*. Thus,
all terms involving e will be the same for each project being compared. For a multiplicative formulation e_j would scale the
value and e_j² would scale the variance. If e_j = 1 in the multiplicative formulation, then Figure 4 would be the same.
Figure 4b isolates the effect of risk on false rejection. (We expand the scale in Figure 4b, relative to
Figure 4a, in order to illustrate this effect.) When m̃_j is a random variable and researchers are risk
averse, the certainty equivalent will be less than the expected value (see also Holmstrom 1989). For a
given cost (K_j), when the value (v_j) and the implied risk become large, the certainty equivalent becomes
negative and researchers no longer find it attractive to begin development, even though the program
provides a very large expected return to the firm. The areas where both programs are falsely rejected
are shaded. (We might also shade the regions above the upper bound to illustrate that at least one
program is falsely rejected.) If the firm wants to eliminate these false-rejection regions, it must make β_v
sufficiently small that the false-rejection regions lie beyond any feasible outcome, but large
enough that researchers prefer high-expected-return programs. Placing too large a weight on
market-outcome metrics leads researchers to avoid high-expected-return development
programs that are risky and/or long-term.
Figure 4c isolates the effect of risk on false selection. The concept is similar to that of false
rejection. In the shaded regions of Figure 4c, uncertainty and risk aversion cause researchers to avoid
high-return development programs when the returns are risky and/or long-term. The firm can eliminate
these false-selection regions by making β_v sufficiently small.

Figure 4d summarizes the effects of both discounting and risk. The regions are more complex,
but the phenomena are the same: discounting and risk aversion lead to large regions of false rejection
and false selection when researchers are evaluated too heavily on market-outcome metrics.
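The false-rejection logic can be illustrated numerically. In the sketch below (assumed parameter values; effort terms normalized to zero as in this subsection), the firm's expected profit keeps growing in v_j while the researcher's certainty equivalent turns negative, and a smaller weight β_v restores the researcher's willingness to select the program:

```python
def researcher_ce(v, M, q, K, r, gamma_m, beta_v=1.0, beta_K=1.0):
    """Certainty equivalent of Equation 6 restricted to the market-outcome
    terms (effort terms normalized to zero for the program-choice analysis)."""
    mean = beta_v * gamma_m * M * q * v - beta_K * K
    risk = (r / 2.0) * beta_v**2 * gamma_m**2 * M * q * (1 - q) * v**2
    return mean - risk

M, q, K, r, gamma_m = 1.0, 0.5, 0.2, 1.0, 0.9   # hypothetical parameters
firm_profit = lambda v: M * q * v - K            # risk-neutral firm's view

# Small v: both the firm and the researcher find the program attractive.
assert firm_profit(1.0) > 0 and researcher_ce(1.0, M, q, K, r, gamma_m) > 0
# Large v: the firm's expected profit is large, but the quadratic risk cost
# makes the researcher's certainty equivalent negative -> false rejection.
assert firm_profit(10.0) > 0 and researcher_ce(10.0, M, q, K, r, gamma_m) < 0
# A smaller weight beta_v shrinks the risk cost and restores a positive c.e.
assert researcher_ce(10.0, M, q, K, r, gamma_m, beta_v=0.1, beta_K=0.1) > 0
```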
Encouraging Tier 2 Scientists and Engineers to Put Enough Effort into Developing a Program
In this subsection we focus on the effort that is allocated after a program is selected. We hold
the realized scope (m̃_j), the value (v_j), and the costs (K_j) constant and focus on step 3 in Figure 4. With
only effort being analyzed, the selection of a weight (β_e) to encourage researchers to allocate optimal
efforts is a standard agency-theory problem. See Holmstrom (1989). In the appendix we show that the
firm can choose an "optimal" β_e such that researchers allocate the scientific, engineering, and process
effort that maximizes the firm's profits. The optimal weight is:

(8)  Γ_j^e β_e* = [1 + r σ_e² ∂²d_j(e_j*)/∂e_j²]⁻¹

Because d_j(e_j*) is convex, Γ_j^e β_e* ∈ [0, 1]. When researchers are very good at anticipating the outcomes
of their efforts, σ_e² will be close to 0.0. When the effort metrics are observed much faster than market
outcomes, Γ_j^e will be close to 1.0. Under these conditions, β_e* will be close to 1.0.

We now see the tension. If market outcomes were the only metrics available, then the metrics
would measure m̃_j v_j and ẽ_j simultaneously. To avoid false program choice the firm would want the
weight on market outcomes to be small, but to induce the right research and process efforts the firm
would want the weight on market outcomes to be large. One way to finesse this tension is for the firm
to search for metrics that correlate with effort, but not necessarily with market outcomes. The firm can
then implement a small weight on m̃_j v_j and a large weight on ẽ_j by placing a small weight on market
outcomes and a large weight on the "effort" metrics.⁷ The firm finds it attractive to use effort metrics
more than market outcomes because (1) effort metrics can be observed sooner than market outcomes
and because (2) the measurement uncertainty relating the effort metrics to true effort is less than the
uncertainty in predicting ultimate market outcomes. The reduced discounting and risk motivate
researchers to allocate the most profitable amount of effort to the development programs. The effort
metrics make it feasible for the firm to place a small, but positive, weight on market outcomes. A small
weight on market outcomes avoids false selection and false rejection in the choice of development
programs.

⁷Returning to the η's for a moment, we see that a small β_v and a large β_e imply a small η_1 and a large η_2, and vice versa.
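The comparative statics claimed for Equation 8 can be checked directly. In this sketch (the curvature d″ of the effort-cost function is an assumed value), the optimal weight satisfies Γ_j^e β_e* ∈ [0, 1] and approaches 1 as σ_e² → 0 and Γ_j^e → 1:

```python
def optimal_beta_e(r, sigma_e2, gamma_e, d2):
    """Optimal effort weight from Equation 8:
    Gamma_e * beta_e* = [1 + r * sigma_e^2 * d''(e*)]^(-1)."""
    return 1.0 / (gamma_e * (1.0 + r * sigma_e2 * d2))

d2 = 2.0  # hypothetical (positive) curvature of the convex effort-cost function

# Convex effort costs keep Gamma_e * beta_e* inside [0, 1].
b = optimal_beta_e(r=1.0, sigma_e2=0.5, gamma_e=0.9, d2=d2)
assert 0.0 <= 0.9 * b <= 1.0

# As measurement noise vanishes and the effort metric is observed quickly
# (sigma_e^2 -> 0, Gamma_e -> 1), the optimal weight approaches 1.
assert abs(optimal_beta_e(r=1.0, sigma_e2=0.0, gamma_e=1.0, d2=d2) - 1.0) < 1e-12
```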
Selecting the Right Programs and Allocating Sufficient Effort
If returns to effort vary by development program, then, in step 2 of Figure 4, for a given set of
β's, researchers will select among programs anticipating the effort that they will allocate in step 3.
Technically, we incorporate this effect by using all the details of Equations 6 and 7 to redo the analyses
that led to Figure 4 and Equation 8. For each potential program, the optimal values of effort do not
depend upon the realized value of the research scope because, in our model, m̃_j and ẽ_j are
independently distributed (and researchers are constantly risk averse). A more complete analysis could
determine the optimal metrics (β's).⁸ However, this more complicated analysis does not change the
qualitative lessons that can be derived from our simpler analyses.
Implications for Practice
Our simple analyses seem to conform to practice. Development metrics do appear to be based
on both market-outcome metrics and "effort indicator" metrics. In particular, many firms use metrics
such as patents, publications, citations, citations to patents, and peer review. Such metrics have proven
to be correlates of incremental value and, by implication, of scientific, engineering, and process effort.
See Griliches (1990), Koenig (1983), Miller (1992), Stahl and Steger (1977), and Tenner (1991).
Indeed, if more than one such measure of effort is available, the firm can do better by using a linear
combination of measures (Holmstrom 1989). When the measures are independent indicators, the
"optimal" weights are inversely proportional to the variance of the measures (see appendix for
equations). Thus, when metrics can be found that are indicators of development effort, the firm should
weigh these metrics more heavily than market-outcome metrics. If these indicators can be observed
before market outcomes (Γ_j^m < Γ_j^e) and if the measures are less uncertain from the perspective of
development researchers, then effort-indicator metrics help to avoid distortions due to short-termism
and risk aversion.
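The inverse-variance weighting of independent effort indicators can be sketched as follows (the indicator readings and variances are hypothetical); the combined measure is less noisy than either indicator alone:

```python
def combine(indicators):
    """Combine independent noisy measures of the same underlying effort.

    indicators: list of (value, variance) pairs. Weights are inversely
    proportional to each measure's variance, so the combined estimate has
    the smallest variance among linear combinations.
    """
    weights = [1.0 / var for _, var in indicators]
    total = sum(weights)
    estimate = sum(w * x for (x, _), w in zip(indicators, weights)) / total
    variance = 1.0 / total   # variance of the inverse-variance-weighted mean
    return estimate, variance

# hypothetical readings of the same underlying effort
patents = (3.0, 1.0)        # noisier indicator
peer_review = (2.5, 0.25)   # more precise indicator

est, var = combine([patents, peer_review])
assert var < min(patents[1], peer_review[1])   # combining reduces noise
assert abs(est - (1.0 * 3.0 + 4.0 * 2.5) / 5.0) < 1e-12
```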
Our analysis is contrary to calls in the popular press for greater market accountability of
development and is contrary to many of the schemes advocated (but not yet fully evaluated) in the
R,D&E literature.⁸ We predict that a simple comparison of market outcomes and research costs (e.g.,
McGrath and Romeri 1994) will lead researchers to avoid long-term and/or risky programs. (Indeed,
one senior manager, who indicated to us that his firm uses these measures, found that the measures
increased for a few years but now appear to be decreasing.)

⁸The profits that result from optimal β's will be less than the ("first-best") profits the firm could obtain if it had the
knowledge and capabilities to dictate program choice. The metrics-based profits are lower because the firm must reimburse
the researcher for the risk costs that the development metrics impose. Future authors might reduce the risk costs with a
non-linear reward system to obtain "second-best" profits (optimizing over all potential linear or non-linear functions).
In addition to combining market-outcome and effort-indicator metrics, the firm can also
attempt to develop metrics that measure directly the ability of researchers to choose the right programs.
For example, some firms reward development researchers for "strategic vision" and for decisions that
are aligned with the firm's goals (Steele 1987).
We summarize our analyses with some testable implications.
IMPLICATION 2. Development programs (tier 2) should be evaluated on market outcome metrics such
as profits, revenues, sales, or business-unit evaluations, but the weight on those metrics should be
small. Otherwise, researchers favor short-term programs with less risk. On the other hand, metrics
such as publications, citations, patents, citations to patents, and peer review should have a much higher
weight (1) if these metrics correlate with the amount of value-enhancing scientific, engineering, and
process effort and (2) if they can be observed sooner and with less uncertainty than market outcomes.
Tier 1 – Basic Research Explorations: The Role of Research Tourism
We now focus on basic research explorations (tier 1) which provide the raw material for
development programs. The uncertainty and time lag for these explorations are even larger than those for
development programs, and line managers must rely even more on the specialized knowledge of tier 1
managers and researchers. Many of the lessons from previous tiers apply to tier 1. For example, effort-
indicator metrics should be given a higher weight than market-outcome metrics. The additional
challenge in tier 1 is to provide the right incentives so that tier 1 researchers and managers explore a
sufficiently broad set of new ideas, concepts, technology, and science.
Qualitative Ideas
We found that basic research (tier 1) is more likely than the other tiers to be funded from
corporate coffers; more likely to be located in central laboratories; and more likely to focus on long-
term concepts. (One of our interviewees, the CEO of a $2 billion company, said that one of his main
responsibilities was to protect the basic research budget from his business unit managers.) See also
Chester (1994), Krause and Liu (1993), Mansfield (1981), Mechlin and Berg (1980), Reynolds (1965),
and Szakonyi (1990). Tier 1 is organized more often by scientific discipline than by markets served
(see also Chester 1994). It accounts for roughly 5-15% of R,D&E spending, but appears to be the seed
for new ideas, concepts, technology, and science.
Our interviewees stressed the need to maintain the best, most creative basic researchers (see
also Steele 1988). We observed that management provided these people with sufficient protected space
and discretion in which to innovate. This included special privileges, such as "Research Fellows" at
IBM and 3M or "Man on the Job" at the US Army, that are not unlike the tenure system at research
universities. However, judging the best people was difficult because the success of a research
exploration depends, in part, on as-yet-undiscovered natural phenomena. Indeed, some researchers
provide value to the firm by identifying which directions not to explore. As a result, basic researchers
are often judged by the quality of the research that they, themselves, perform (Platt 1964). Fame,
recognition, and salary appear to depend more on that which a researcher originates than on ideas,
concepts, technology, and science that are "arbitraged" from outside sources.
In contrast, many of the most profitable new ideas, concepts, technology, and science come
from outside the firm. Our interviewees stressed the need to maintain expertise in the scientific
disciplines in order to identify ideas from universities, from other firms in the industry, and from other
industries. They called this activity "research tourism." One of our interviewees stressed that his firm’s
competitive advantage was to identify and develop outside ideas better than anyone else in the industry.
Research tourism opens "new fishing grounds" for corporate development (Griliches 1990) and
spillovers can be quite large (Acs, Audretsch and Feldman 1992, Bernstein and Nadiri 1989, Griliches
1992, Jaffe 1989, Ward and Dranove 1995). In an econometric study of 1700 firms, Jaffe (1986)
suggests that, while the direct effect of R,D&E spending by competitive firms lowers profitability, the
indirect effect of spillovers is sufficiently large to make the net effect positive.
However, encouraging research tourism is not easy. A common problem at many research
laboratories is a "Not Invented Here (NIH)" attitude (Griffin and Hauser 1996). The outputs of internal
explorations are easier to measure, hence it is tempting to evaluate researchers based on that which they
originate rather than the total number of ideas, concepts, technology, and science that they bring into
the firm. This is perpetuated by evaluation systems (e.g., Galloway 1971) that trace successful new
products back to their idea source. Other firms encourage work within the organization to avoid
"buying" technological results (Roussel, Saad, and Erickson 1991). EIRMA (1995) suggests that the
inability to incorporate spillovers and spin-offs appears to be one of the weaknesses of the evaluation
systems used by European firms.
The Right Reward System Encourages Research Tourism; the Wrong Reward System Encourages NIH
We focus on how the firm should evaluate researchers so that they have incentives to seek out
the “right amount” of ideas, concepts, technology, and science. For ease of exposition, we refer to
these outputs simply as "ideas." (Previously, we addressed how researchers and managers choose which
"idea" to develop as a tier 2 program.) By the "right amount" we mean the number of ideas that maximizes
the value of the ideas minus the cost of obtaining them. Some ideas are better than others, but for the
purpose of this section we treat all ideas equally.
Our interviews and the literature (e.g., Cohen and Levinthal 1989) suggest that more and better
internal research provides a greater ability to identify and use outside ideas. Let h be the number of
internal explorations and assume that each exploration leads to an "idea." Suppose that, for each
internal idea identified, the basic researcher can also identify µ ideas from the outside. Thus, the total
number of ideas, n, will be equal to h + µh. Let κ_i be the cost of exploring an internal idea and let κ_o be
the cost of exploring each external idea. (The subscripts are mnemonic for inside and outside,
respectively.) Naturally, κ_i > κ_o. Let V(n) be the value of n total ideas (appropriately discounted). We
assume that V is a concave function of n. For example, V(n) might be the maximum of n draws from a
normal distribution. (The effects of risk and differential discounting on V will be similar to those
covered in the previous section. In this section, we focus on the implications of choosing either n or h
as the metric. Therefore, we treat V(n) as if it imposes no risk and no time lag on the researchers.)
The potential for spillovers (µ > 0) decreases the cost per idea; hence, for concave V, the optimal
number of ideas increases when spillovers are possible. However, even though spillovers make internal
explorations more efficient, this efficiency might imply fewer internal explorations. In the appendix we
show formally that the optimal number of internal explorations might actually decrease.
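This effect can be illustrated with an assumed concave value function V(n) = a√n (the functional form and all parameter values are ours, chosen for illustration, not the paper's). A grid search over h shows total ideas rising but internal explorations falling when spillovers become available:

```python
import math

def optimal_h(mu, a=10.0, kappa_i=1.0, kappa_o=0.9):
    """Profit-maximizing number of internal explorations h, found by grid
    search, when h internal explorations yield n = (1 + mu) * h total ideas
    at cost (kappa_i + kappa_o * mu) * h and ideas are worth V(n) = a*sqrt(n)."""
    def profit(h):
        n = (1.0 + mu) * h
        return a * math.sqrt(n) - (kappa_i + kappa_o * mu) * h
    grid = [0.01 * k for k in range(1, 10001)]   # h in (0, 100]
    return max(grid, key=profit)

h_no_spill = optimal_h(mu=0.0)   # no spillovers available
h_spill = optimal_h(mu=3.0)      # each internal idea unlocks 3 outside ideas

# Total ideas increase with spillovers (possible because kappa_i > kappa_o)...
assert (1 + 3.0) * h_spill > h_no_spill
# ...yet the number of internal explorations decreases (Implication 3b).
assert h_spill < h_no_spill
```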
We summarize this analysis as testable implications.
IMPLICATION 3. When spillovers are possible, (a) the optimal number of explorations increases but (b)
the optimal number of internal explorations might decrease.
Implication 3 suggests why tier 1 researchers might adopt an NIH attitude. If a researcher's (or
research manager's) status is based on the number of internal explorations that the firm funds, then
seeking spillovers might shrink this internal empire. To illustrate the phenomenon more formally,
suppose that the firm can evaluate researchers either on internal ideas alone (the size of the research
"empire") or on the total number of ideas that are identified, whether or not they originate internally.
That is, the firm evaluates tier 1 researchers based either on h or on n. We call these evaluation
functions g_h(h) and g_n(n). Suppose that the researcher's rewards, either explicit or implicit, are based on
these evaluations.
Tier 1 researchers can choose whether or not to seek spillovers. We model this ability by
allowing them to choose how many external ideas they explore. That is, they choose a value µ_o from the
set [0, µ̄] such that the total number of ideas they explore is h + µ_o h. Let µ* be the value they choose (in
their own best interests). If µ* = 0, then this is equivalent to NIH; if µ* = µ̄, then this is equivalent to
research tourism.
We now examine how the choice of metric affects the researchers' reactions to the evaluation
system. To make the comparison meaningful, we select functions such that the researcher would earn
the same reward whenever he or she acts in the best interests of the firm. We choose evaluation
functions that accurately reflect the value to the firm of the ideas that the researcher explores. These
assumptions imply that g_h(h) = V[(1 + µ̄)h] and g_n(n) = V(n). The firm would choose this g_h(h) if it fully
expected researchers to explore spillovers and rewarded them accordingly, but did not anticipate that
the choice of a research metric affects the researchers' choice of µ_o. (We might also assume that the
firm can anticipate the value of µ_o that researchers will choose. If the firm were restricted to using h but
could anticipate µ_o, it would choose g_h(h) = V(h); if it were allowed to use n, it would choose g_n(n) = V(n)
as the reward function. We obtain similar results for these assumptions.⁹)
The formal results are derived in the appendix. We provide the intuition here. When researchers
are evaluated on the metric n, the evaluation structure for researchers is similar to that by which the
firm evaluates its profits. The cost per idea decreases with µ_o; thus researchers, like the firm, will find it
in their own best interests to set µ_n* = µ̄. Their objectives will parallel those of the firm and they will
choose the optimal number of explorations. However, when researchers are evaluated on the
metric h, the cost per unit gain in g_h(h) increases as µ_o increases; hence the researchers will want to keep
µ_o small. With µ̄ > 0 and µ_h* = 0, researchers are rewarded as if there were spillovers, but they incur
costs as if there were no spillovers. Because rewards are concave, this leads to more internal
explorations. It does not necessarily imply more ideas; that depends upon the relative costs of
internal and external explorations. We state these testable results as Implication 4.¹⁰

⁹We could analyze this as a formal agency problem, in which case the firm could obtain maximal profits by paying tier 1
researchers via V(n) + w_o + [(κ_i + κ_o µ)/(1 + µ)]n* − V(n*). Because we have abstracted from risk in this section (it is covered in previous
sections), this makes tier 1 researchers the residual claimants. Alternatively, we could restrict the firm to rewards of the form
g(h) + constant. In this case, the optimal rewards would be g(h) = V(h). This case is analyzed in the appendix. It provides
similar, but not identical, results. In the text we have chosen to compare the two reward systems that we feel represent
practice. We leave analysis with risk aversion and differential discounting to future extensions.
IMPLICATION 4. (a) If tier 1 researchers are evaluated on all ideas, new concepts, new technology, and
new science, including those identified outside the firm, they will set µ_n* = µ̄ and invest in the "optimal"
number of explorations for the firm. (b) If researchers are evaluated on the results of internal
explorations only, they will adopt an NIH attitude by setting µ_h* = 0. They will work on more internal
explorations and may develop fewer ideas, new concepts, new technology, and new science than would
be "optimal" for the firm.
Summary and Implications for Basic Research Metrics
Our analysis of spillovers suggests that the common practice of rewarding basic researchers for
original ideas leads them to (1) ignore ideas that were "not invented here" and (2) build "research
empires" by undertaking too many internal explorations. This may lead to fewer ideas. The firm can be
more profitable if it encourages research tourism by evaluating researchers for ideas generated
internally and for ideas identified from sources outside the firm. Fortunately, progress is being made.
The recent vision statement adopted by General Motors includes the phrase “Develop more highly
valued innovations, no matter their source, than any other enterprise.” (Underline added. Vision
statement obtained by private communication to the author.)
Summary and Future Research
Arthur Chester (1995), Senior Vice President for Research and Technology for GM Hughes
Research Laboratories, states that "measuring and enhancing R&D productivity or R&D effectiveness
... has gained the status of survival tactics for the R&D community." R,D&E evaluation is an important
policy issue in Japan (Irvine 1988) and Europe (EIRMA 1995). Erickson and Jacobson (1997) provide
evidence that there are no supranormal returns to R&D spending, but that "obtaining a comparative
advantage … depends crucially on the specific nature of the expenditure and how it interacts with the
firm's asset and skill base." CEOs and CTOs use metrics to evaluate and manage people, objectives,
programs, and projects. In many ways, metrics determine whether or not a firm's R,D&E activities are
well managed. While the identification of specific measures for each firm is an empirical question
beyond the scope of this paper, we have attempted to identify the properties of those metrics that enable
firms to manage R,D&E effectively.

¹⁰If tier 1 researchers are evaluated on g(h) = V(h), then the equivalent result is that researchers will develop fewer ideas
and may work on fewer internal explorations.
First, it is clear that metrics must vary by tier. Market-outcome metrics make sense for applied
engineering projects that provide relatively predictable and immediate returns. However, the incentives
of business-unit managers may not be aligned with those of the firm. Thus, the cost of applied projects
should be subsidized to adjust for short-termism, risk aversion, and scope. Ideally, these subsidies
should vary by project and by business unit.
Contrary to popular wisdom, market-outcome metrics should be given less weight for
development programs with longer-term and more-risky payoffs. Indeed, too great a stress on market-
outcome metrics will encourage managers and researchers to avoid long-term, risky programs that have
high profit potential. Instead, the firm should place a small weight on market outcomes and a larger
weight on effort-indicator metrics such as publications, citations, patents, citations to patents, and peer
review. This combination of metrics provides managers and researchers with the incentives to choose
the right programs and allocate the right amount of value-enhancing scientific, engineering, and process
effort.
Basic research is even further from the market, hence more difficult for line managers and
business unit managers to evaluate. As a result, the firm relies more heavily on the judgment of basic
research managers and scientists. It often seeks indicators of the quality of these people and the quality
of their work. Unfortunately, many organizations evaluate these people based only on the ideas,
concepts, technology, or science that they originate. Such evaluations encourage them to do only
internal explorations and build research empires that are too large. The firm can do better by
encouraging “research tourism.” It should reward research managers and scientists for the “ideas” that
they originate and for the “ideas” that they identify from outside the firm. Problems with “not invented
here” result from the wrong evaluation system. They can be avoided with the right evaluation system.
Our analyses can be extended. For example, Draper Laboratories (Pien 1997) has begun to use
these insights to identify a set of metrics that provides researchers and managers with the right
incentives throughout the tiers of R,D&E. Pien’s metrics also show promise in predicting the success
of tier 1 explorations. We have begun other research to test whether the US Army’s Government-
Performance-and-Results-Act (GPRA) metrics affect people in the ways predicted in this paper. We
are sending questionnaires to CTOs to determine whether the metrics used by a cross-section of
organizations have the predicted properties.
Other research directions include the integration of R,D&E metrics with internal customer
evaluation systems and/or customer satisfaction measures (Hauser, Simester, and Wernerfelt 1994,
1996), the exploration of self-selection on risk aversion (Holmstrom 1989), strategic behavior to
withhold information or support (Rotemberg and Saloner 1995), internal patent systems and research
tournaments (Taylor 1995), product platforms (Utterback 1994), and the role of R,D&E as a crucible
for "growing technical managers."
Finally, there are personal and cultural issues in a research community. Many scientists are
driven by an inherent need to know and many scientists believe strongly in a research culture.
We hope that our analyses complement these sociological and anthropological approaches to
R,D&E management.
References
Abt, R., M. Borja, M. M. Menke and J. P. Pezier (1979), "The Dangerous Quest for Certainty in Market Forecasting," Long Range Planning, 12, 2, (April).
Acs, Zoltan J., David B. Audretsch and Maryann P. Feldman (1992), "Real Effects of Academic Research: Comment," American Economic Review, (March), 363-67.
Adler, P. S., D. W. McDonald and F. MacDonald (1992), "Strategic Management of Technical Functions," Sloan Management Review, Winter, 33, 2.
Allio, Robert J. and Desmond Sheehan (1984), "Allocating R&D Research Effectively," Research Management, (July-Aug.), 14-20.
Bachman, Paul W. (1972), "The Value of R&D in Relation to Company Profits," Research Management, 15, (May), 58-63.
Bean, Alden S. (1995), "Why Some R&D Organizations are More Productive than Others," Research Technology Management, (Jan-Feb), 25-29.
Bernstein, Jeffery L. and M. Ishaq Nadiri (1989), "Research and Development and Intra-Industry Spillovers: An Empirical Application of Dynamic Duality," Review of Economic Studies, (April), 249-269.
Block, Z. and O. A. Ornati (1987), "Compensating Corporate Venture Managers," Journal of Business Venturing, 2, 41-51.
Boblin, Nils H., Herman J. Vantrappen and Alfred E. Wechsler (1994), "The Chief Technology Officer as an Agent of Change," Prism, (Fourth Quarter), 75-85.
Boschi, Roberto A. A., Hans Ulrich Balthasar and Michael M. Menke (1979), "Quantifying and Forecasting Research Success," Research Management, (Sept.), 14-21.
Boulding, William, Ruskin Morgan and Richard Staelin (1997), "Pulling the Plug to Stop the New Product Drain," Journal of Marketing Research, 34, (February), 164-176.
Braunstein, David M. and Miren C. Salsamendi (1994), "R&D Planning at ARCO Chemical," Research Technology Management, (Sept-Oct), 33-37.
Brown, John Seely (1991), "Research that Reinvents the Corporation," Harvard Business Review, (Jan-Feb), 102-111.
Chester, Arthur N. (1995), "Measurements and Incentives for Central Research," Research Technology Management, (July-Aug), 14-22.
Chester, Arthur N. (1994), "Aligning Technology with Business Strategy," Research Technology Management, (Jan-Feb), 25-32.
Cohen, W. M. and D. A. Levinthal (1989), "Innovation and Learning: The Two Faces of R&D," Economic Journal, 99, 569-596.
Cooper, Robert G. and Elko J. Kleinschmidt (1995), "Benchmarking the Firm's Critical Success Factors in New Product Development," Journal of Product Innovation Management, 12, 374-391.
Corcoran, Elizabeth (1994), "The Changing Role of US Corporate Research Labs," Research Technology Management, (July-Aug), 14-20.
David, Herbert A. (1970), Order Statistics, (New York: John Wiley and Sons, Inc.).
Drake, Alvin W. (1967), Fundamentals of Applied Probability Theory, (New York: McGraw-Hill).
Edwards, S.A. and M. W. McCarrey (1973), "Measuring the Performance of Researchers," Research Management, 16, 1, (Jan), 34-41.
Erickson, Gary and Robert Jacobson (1992), "Gaining Comparative Advantage Through Discretionary Expenditures: The Returns to R&D and Advertising," Management Science, 38, 9, (September), 1264-1279.
European Industrial Research Management Association (1995), Evaluation of R&D Projects, Working Group Report No. 47.
Frohman, Alan L. (1980), "Managing the Company's Technological Assets," Research Management, (May-June), 20-24.
Galloway, E.C. (1971), "Evaluating R&D Performance – Keep it Simple," Research Management, (March), 50-58.
Grabowski, H. G. and J. Vernon (1990), "A New Look at the Returns and Risks to Pharmaceutical R&D,"Management Science, 36, 804-821.
Griffin, Abbie and John R. Hauser (1996), "The Marketing/R&D Interface," Journal of Product Innovation Management, 13, 3, (May).
Griliches, Zvi (1990), "Patent Statistics as Economic Indicators: A Survey," Journal of Economic Literature, 28, 4, 1661-1707.
Griliches, Zvi (1992), "The Search for R&D Spillovers," The Scandinavian Journal of Economics, 94,Supplement, 29-47.
Gross, Irwin (1972), "The Creative Aspects of Advertising," Sloan Management Review, 14, 1, (Fall), 83-109.
Gumbel, E. J. (1958), Statistics of Extremes, (New York: Columbia University Press).
Hauser, John R. (1996), "Metrics to Value R&D: An Annotated Bibliography," Working Paper, International Center for Research on the Management of Technology, MIT Sloan School, Cambridge, MA 02142 (March). Also available from the Marketing Science Institute.
______, Duncan I. Simester and Birger Wernerfelt (1996), "Internal Customers and Internal Suppliers," Journal of Marketing Research, 33, 3, (August), 268-280.
______, ______ and ______ (1994), "Customer Satisfaction Incentives," Marketing Science, 13, 4, (Fall), 327-350.
Henderson, Rebecca and Iain Cockburn (1996), "Scale, Scope, and Spillovers: The Determinants of Research Productivity in Drug Discovery," Rand Journal of Economics, 27, 1, (Spring), 32-59.
Hodder, James E. and Henry E. Riggs (1985), "Pitfalls in Evaluating Risky Projects," Harvard Business Review, (Jan.-Feb.), 128-136.
Holmstrom, Bengt (1989), "Agency Costs and Innovation," Journal of Economic Behavior and Organization, 12, 3, 305-327.
Hultink, Erik Jan and Henry S. J. Robben (1995), "Measuring New Product Success: The Difference that Time Perspective Makes," Journal of Product Innovation Management, 12, 392-405.
Irvine, John (1988), Evaluating Applied Research: Lessons from Japan, (London: Pinter Publishers).
Jaffe, Adam B. (1986), "Technological Opportunity and Spillovers of R&D: Evidence from Firms' Patents, Profits, and Market Value," American Economic Review, (December), 984-1001.
______ (1989), "Real Effects of Academic Research," American Economic Review, (December), 957-970.
Keeney, Ralph L. and Howard Raiffa (1976), Decisions with Multiple Objectives: Preferences and Value Tradeoffs, (New York: John Wiley & Sons).
Koenig, M.E.D. (1983), "A Bibliometric Analysis of Pharmaceutical Research," Research Policy, 12, 15-36.
Krause, Irv and John Liu (1993), "Benchmarking R&D Productivity: Research and Development; Case Study," Planning Review, 21, 1, (January), 16-?.
Krogh, Lester C., Julianne H. Prager, David P. Sorensen and John D. Tomlinson (1988), "How 3M Evaluates Its R&D Programs," Research Technology Management, (Nov-Dec), 10-14.
Mansfield, Edwin (1981), "Composition of R&D Expenditures: Relationship to Size of Firm, Concentration, and Innovative Output," Review of Economics and Statistics, (Nov).
______ (1980), "Basic Research and Productivity Increase in Manufacturing," American Economic Review, (December).
______ (1982), "How Economists See R&D," Research Technology Management, (July), 23-29.
McGrath, Michael E. and Michael N. Romeri (1994), "The R&D Effectiveness Index: A Metric for Product Development Performance," Journal of Product Innovation Management, 11, 213-220.
Mechlin, George F. and Daniel Berg (1980), "Evaluating Research – ROI is Not Enough," Harvard Business Review, 59, (Sept-Oct), 93-99.
Miller, Roger (1992), "The Influence of Primary Task on R&D Laboratory Evaluation: A Comparative Bibliometric Analysis," R&D Management, 22, 1, (January), 3.
Mitchell, Graham R. and William F. Hamilton (1988), "Managing R&D as a Strategic Option," Research Technology Management, (May-June), 15-22.
Negroponte, Nicholas (1996), "Where Do New Ideas Come From," Wired, (January), 204.
Nelson, R. (1959), "The Simple Economics of Basic Scientific Research," Journal of Political Economy, 67,297-306.
Pappas, Richard A. and Donald S. Remer (1985), "Measuring R&D Productivity," Research Management, (May-June), 15-22.
Patterson, W. (1983), "Evaluating R&D Performance at Alcoa Laboratories," Research Management, (March-April), 23-27.
Pien, Homer (1997), "Competitive Advantage through Successful Management of R&D," Master's Thesis, Management of Technology Program, MIT, Cambridge, MA 02139.
Platt, John (1964), “Strong Inference,” Science, 146, 3642, (October 16), 347-353.
Ransley, D. L. and J. L. Rogers (1994), "A Consensus on Best R&D Practices," Research Technology Management, (Mar-Apr), 19-26.
Reynolds, William B. (1965), "Research Evaluation," Research Management, (March), 117-125.
Rotemberg, Julio J. and Garth Saloner (1995), "Overt Interfunctional Conflict (and its Reduction Through Business Strategy)," Rand Journal of Economics, 26, 4, (Winter), 630-653.
Roussel, Philip A., Kamal N. Saad and Tamara J. Erickson (1991), Managing the Link to Corporate Strategy: Third Generation R&D, (Boston, MA: Harvard Business School Press).
Schainblatt, A. H. (1982), "How Companies Measure the Productivity of Engineers and Scientists," Research Management, 25, 5, (May).
Schmitt, Roland W. (1987), "R&D in a Competitive Era," Research Management, (Jan-Feb), 15-19.
Sen, Falguni and Albert H. Rubenstein (1989), "External Technology and In-house R&D's Facilitative Role," Journal of Product Innovation Management, 6, 2, 123-138.
Shapira, R. and S. Globerson (1983), "An Incentive Plan for R&D Workers," Research Management, (Sept-Oct), 17-20.
Stahl, Michael J. and Joseph Steger (1977), "Measuring Innovation and Productivity – A Peer Rating System," Research Management, (January).
Steele, Lowell W. (1987), "Selling Technology to Your Chief Executive," Research Management, 30, 1, (Jan-Feb).
______ (1988), "What We've Learned: Selecting R&D Programs and Objectives," Research Technology Management, (March-April), 1-36.
Stigler, George (1961), "The Economics of Information," Journal of Political Economy, 60, (June), 213-225.
Szakonyi, Robert (1990), "101 Tips for Managing R&D More Effectively – I," Research Technology Management, (July-Aug), 31-36 and (Nov-Dec), 41-46.
Taylor, Curtis (1995), "Digging for Golden Carrots: An Analysis of Research Tournaments," American Economic Review, 85, 4, (September), 872-890.
Tenner, Arthur R. (1991), "Quality Management Beyond Manufacturing," Research Technology Management, (Sept-Oct), 27-32.
Tipping, James W. (1993), "Doing a Lot More with a Lot Less," Research Technology Management, (Sept-Oct), 13-14.
______, Eugene Zeffren and Alan R. Fusfeld (1995), "Assessing the Value of Your Technology," Research Technology Management, 22-39.
Utterback, James M. (1994), Managing the Dynamics of Innovation, (Boston, MA: Harvard Business School Press).
Vest, Charles M. (1995), "Drift Toward Mediocrity in Science," The MIT Report, 23, 7, (Sept-Oct), 3-4.
von Hippel, Eric (1988), The Sources of Innovation, (New York: Oxford University Press).
Ward, Michael and David Dranove (1995), "The Vertical Chain of R&D in the Pharmaceutical Industry," Economic Inquiry, 33, (January), 1-18.
Whelen, J. M. (1976), "Project Profile Reports Measure R&D Effectiveness," Research Management,(September).
Zettelmeyer, Florian and John R. Hauser (1995), "Metrics to Value R&D Groups, Phase I: Qualitative Interviews," Working Paper, International Center for Research on the Management of Technology, MIT Sloan School, Cambridge, MA 02142 (March).
Table 1. Managers Interviewed
(A total of 43 managers and researchers were interviewed. This table lists some of the titles.)

Chevron Petroleum Technology: President, Head of Strategic Research, R&D Portfolio Manager
Hoechst Celanese ATG: President, VP Technology, VP Commercial Development, VP Technology & Business Assessment, Director Innovations
AT&T Bell Laboratories: VP Administrative Systems, Director of R&D Programs, Director of Information Applications Architecture
Bosch GmbH: Senior VP for Strategic Planning, Head of Corporate Research
Schlumberger Measure. & Systems: VP Director of R&D, Director of Engineering Process Development, Director of European Tech. Cooperation
Electricite de France: Associate Director R&D, Director of Division
Cable & Wireless plc: Federal Development Director, Director of Technology (HK), Group Strategic Development Advisor
Polaroid Corporation: CEO, Director of Research
US Army Missile RDEC and Army Research Laboratory: Associate Director for Science and Technology, Associate Director for Systems, Deputy Assistant Secretary for Research and Technology/Chief Scientist
Varian Vacuum Products: VP, General Manager