Non-dogmatic social discounting

By Antony Millner∗

The long-run social discount rate has an enormous effect on

the value of climate mitigation, infrastructure projects, and

other long-term public policies. Its value is however highly

contested, in part because of normative disagreements about

social time preferences. I develop a theory of ‘non-dogmatic’

social planners, who are insecure in their current norma-

tive judgments and entertain the possibility that they may

change. Although each non-dogmatic planner advocates an

idiosyncratic theory of intertemporal social welfare, all such

planners agree on the long-run social discount rate. Non-

dogmatism thus goes some way towards resolving normative

disagreements, especially for long-term public projects.

JEL: H43,D61,D90

Keywords: Social discount rate, normative uncertainty, inter-

dependence, cost-benefit analysis.

‘I take the problem of discounting for projects with payoffs in the far fu-

ture...to be largely ethical.’ – Kenneth Arrow (1999)

∗ Department of Economics, University of California, Santa Barbara, CA 93106, USA, [email protected]. I am grateful to Geir Asheim, Partha Dasgupta, Marc Fleurbaey, Simone Galperti, Ben Groom, Geoff Heal, Derek Lemoine, Lucija Muehlenbachs, Frikk Nesje, Bruno Strulovici, the audiences of numerous seminars and conferences, four anonymous referees, and the editor, for helpful comments and discussions. This work was carried out at the Grantham Research Institute, London School of Economics and Political Science. I gratefully acknowledge financial support from the ESRC Centre for Climate Change Economics and Policy and the Grantham Foundation for the Protection of the Environment during my time at LSE.


2 THE AMERICAN ECONOMIC REVIEW

‘[T]he list of axioms we use as a basis for our ethical theory can never be

more than a tentative list, one always open to possible revision.’

– John Harsanyi (1977)

The social discount rate (SDR) converts the future consequences of public

projects into present values, and is thus a critical input to public cost-

benefit analysis. Small changes in the SDR can have an enormous effect on

the estimated value of public projects with long-run consequences such as

infrastructure investments, climate change mitigation measures, and nuclear

waste management (Arrow et al., 2013). Yet despite almost a century of

economic research on intertemporal public decision-making, opinion is still

divided on how costs and benefits that occur more than a few decades in

the future should be discounted.

SDRs are related to intertemporal marginal rates of substitution, which

quantify the social value of consumption changes in the future. Much of

the disagreement about SDRs stems from normative disagreements about

the social time preferences that determine these marginal rates of substitu-

tion (Drupp et al., 2018a; Dasgupta, 2008). That such disagreements occur

should not be surprising – specifying social time preferences requires dif-

ficult normative judgments about, for example, the appropriate degree of

social impatience and aversion to intertemporal consumption inequalities,

and there is no silver bullet specification that is immune to criticism (see

Greaves (2017) for a discussion of the arguments). Conversely, as MacAskill

(2016) observes, ‘for almost any ethical view, there seems to be something


to be said in its favour.’1

This paper develops a theory of ‘non-dogmatic’ social planners’ time pref-

erences, starting from the premise that no single normative theory of in-

tertemporal social welfare is unassailable, and devotees of all theories should

thus exhibit a degree of insecurity in their normative judgments.2 Although

non-dogmatic planners favour a particular theory of intertemporal social

welfare, they admit the possibility that they may be persuaded of the virtues

of an alternative theory in the future. Non-dogmatic planners anticipate

these possible changes, and internalize the preferences of their future selves.

The non-dogmatism of such planners thus manifests both through their will-

ingness to entertain the possibility of a change in their views, and through

their unwillingness to act as pure dictators with respect to their future

selves.3 Since non-dogmatic planners are always insecure, their normative

preferences at any future time τ reflect uncertainty about future preferences

at times greater than τ . Persistent normative insecurity coupled with inter-

nalization of future preferences thus results in a recursively defined sequence

of time preferences, in which current planners’ preferences depend on their

1 An alternative tradition in the literature, often termed the ‘positive approach’, identifies SDRs with observed market interest rates, and thus does not directly engage with normative reasoning. Arrow, Dasgupta and Maler (2003) however remind us that ‘using market observables to infer social welfare can be misleading in imperfect economies. That we may have to be explicit about welfare parameters...in order to estimate marginal rates of substitution in imperfect economies is not an argument for pretending that the economies in question are not imperfect after all.’ Market imperfections are particularly salient for long-run SDRs. Gollier and Hammitt (2014), for example, explain that ‘the positive approach cannot be applied for time horizons exceeding 20 or 30 years, because there are no safe assets traded on markets with such large maturities.’

2 Throughout the paper I use ‘planner’ and ‘theory’ roughly interchangeably. A planner’s time preferences are equivalent to a normative theory of intertemporal social welfare. They are thus distinct from consumer preferences that are inferred by revealed preference, but are an intertemporal analogue of the social welfare functions used in, for example, optimal tax theory.

3 Non-dogmatic planners still rank consumption streams using their current preferences, but current preferences depend in part on future preferences.


possible future preferences, each of which is in turn recursively defined.

Crucially, non-dogmatic planners still make idiosyncratic judgments about

all the contested normative aspects of intertemporal welfare functions (IWFs),4

including utility functions and pure time discount factors. Nevertheless, I

show that all non-dogmatic IWFs yield the same value of the long-run SDR.

Thus, adopting a model of intertemporal evaluation in which planners ex-

hibit some insecurity in their normative judgments ends up resolving dis-

agreements about the evaluation of long-run public projects. The intuition

for this finding is developed in an example below. It is a consequence of the

fact that each non-dogmatic IWF depends in part on other non-dogmatic

IWFs in future periods. Disagreements about long-run SDRs wash out when

this nested sequence of interdependent valuations is unravelled backwards

to the present, since IWFs mix repeatedly over time.

Formally, the model extends and reinterprets existing models of ‘purely

altruistic’ intergenerational time preferences (Ray, 1987; Bergstrom, 1999;

Saez-Marti and Weibull, 2005; Galperti and Strulovici, 2017). In these mod-

els a single representative agent in the current generation internalizes the

preferences of future generations, assuming that each future generation does

the same, and that preferences are time invariant. Leading philosophers have

long drawn an analogy between present generations’ concerns for future

generations, and present selves’ concerns for future selves.5 Parfit (1984,

p. 319), for example, argues that ‘Like future generations, future selves

have no vote, so their interests need to be specially protected.’ The present

4 IWFs are functions that represent planners’ normative preferences over consumption streams.

5 Analogies between intergenerational and intrapersonal choice have also proved fruitful in economics (Phelps and Pollak, 1968; Laibson, 1997). See Ray, Vellodi and Wang (2018) for a behavioural model in which agents exhibit concern for future selves.


paper formalizes this analogy in a model of social planners whose norma-

tive judgments may change over time. Unlike the intergenerational work

cited above, the internalization of future preferences occurs intrapersonally

in my model.6 Hori (2009) has previously observed in a static framework

that increased interdependence between heterogeneous agents who internal-

ize others’ preferences can cause their values to get ‘closer together’. This

paper shows that in any fixed model of heterogeneous social planners’ time

preferences internalization leads to complete convergence on the most con-

tested quantity in public cost-benefit analysis: the long-run SDR.

A related literature has studied social choice theoretic approaches to aggregating time preferences7 (Gollier and Zeckhauser, 2005; Heal and Millner,

2014; Millner and Heal, 2018; Chambers and Echenique, 2018; Feng and

Ke, 2018), and the aggregation of heterogeneous opinions on SDRs (Weitz-

man, 2001; Freeman and Groom, 2015).8 Unlike this work, this paper does

not specify an aggregation rule that is applied unilaterally by an external

analyst. Non-dogmatic planners may disagree on all the normative issues

that are sources of contention in discussions of social discounting, and priv-

ilege their own theory of intertemporal social welfare. Nevertheless, non-

dogmatism causes each planner to account to some extent for alternative

theories, so that some aggregation occurs internally in each theory. I show

that this is enough to generate consensus on the long-run SDR.

6 An alternative interpretation of the model that retains an interpersonal flavour is possible, see footnote 15.

7 A utilitarian aggregation approach leads to representative discount rates that are dominated by the preferences of the most patient agent for long maturities (Gollier and Zeckhauser, 2005). This only occurs in a very special case of the model I develop; in general consensus long-run discount rates are determined by a non-trivial mixture of all non-dogmatic planners’ IWFs.

8 The papers that aggregate SDRs directly do not disentangle heterogeneous beliefs about facts (i.e., consumption growth rates) from disagreements about values. This paper, like the social choice literature, focusses on values.


It is important to emphasize that the model I present is normative: I sug-

gest that planners should exhibit some insecurity in their normative judg-

ments, propose a method for them to do so, and use a calibrated version

of the model to show how observed disagreements on SDRs might change

if they did. The model does not claim to describe the observed behaviour

of governments, or the recommendations of ‘experts’ on social discounting.

Like much normative work, the paper is an exercise in persuasion. It shows

that if advocates of alternative theories of intertemporal social welfare ad-

mitted some insecurity in their normative judgments, but were unwilling to

give them up entirely, a lot of progress could still be made.

I. A motivating example

The essential features of the model can be illustrated in a simple ex-

ample. Suppose that there are only two plausible normative theories of

intertemporal social welfare, and let planner i ∈ {1, 2} be a devotee of the-

ory i. To establish a baseline model, begin by assuming that at time τ

planner i’s normative preferences over infinite annual consumption streams

$C_\tau = (c_\tau, c_{\tau+1}, c_{\tau+2}, \ldots)$ can be represented by an IWF $V^i_\tau$ of the following familiar form:

(1) $V^i_\tau = U^i(c_\tau) + \beta^i V^i_{\tau+1}$,

where $U^i(c)$ is a utility function and $\beta^i \in (0, 1)$ is a pure time discount factor. These IWFs have the following equivalent representation:

(2) $V^i_\tau = \sum_{s=0}^{\infty} (\beta^i)^s U^i(c_{\tau+s})$.


Now consider evaluating a marginal public project with a sequence of annual

payoffs $\boldsymbol{\pi} = (\pi_0, \pi_1, \ldots)$. Standard results (Dasgupta, Sen and Marglin, 1972; Gollier, 2012) show that the project is welfare improving according to planner i if and only if its net present value is positive, where the net present value of $\boldsymbol{\pi}$ is defined as:

(3) $NPV^i(\boldsymbol{\pi}) = \sum_{s=0}^{\infty} \pi_s e^{-r^i(s)s}$,

and the social discount factor at maturity s is given by the marginal rate of substitution between consumption at times τ + s and τ, denoted $MRS^i_s$:

(4) $e^{-r^i(s)s} = MRS^i_s = (\beta^i)^s \frac{(U^i)'(c_{\tau+s})}{(U^i)'(c_\tau)}$.

The social discount rate at maturity s according to planner i is

(5) $r^i(s) = -\frac{1}{s} \ln MRS^i_s$.

This fundamental quantity tells us how planner i converts safe marginal payoffs at maturity s to present values. Since each planner has an idiosyncratic utility function $U^i(c)$ and pure time discount factor $\beta^i$, the planners will not generically agree on any part of the term structure of SDRs $r^i(s)$.

Equation (5) can be made more intuitive by assuming that utility functions are iso-elastic (i.e., $(U^i)'(c) = c^{-\eta^i}$), writing $\beta^i = e^{-\rho^i}$, and defining compound annual consumption growth rates $g_s$ via $c_{\tau+s} = c_\tau e^{g_s s}$. Substituting these assumptions into (4–5) we find the famous Ramsey rule:

(6) $r^i(s) = \rho^i + \eta^i g_s$.

The first term in this expression is planner i’s pure rate of social time preference, and the second term captures her aversion to intertemporal consumption inequalities, which depends on her elasticity of marginal utility $\eta^i$. Planners’ adopted values for these parameters constitute primitive normative judgments about how society should trade off consumption at different points in time (Gollier and Hammitt, 2014).
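The step from (4–5) to the Ramsey rule (6), and the use of the resulting rates in the NPV formula (3), can be checked numerically. The sketch below is illustrative only: the parameter values (a 1% pure rate of time preference, an elasticity of 1.5, constant 2% growth), the two-payoff project, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math

def mrs(beta, eta, c_now, c_future, s):
    """Equation (4) with iso-elastic marginal utility U'(c) = c**(-eta)."""
    return beta**s * (c_future / c_now) ** (-eta)

def sdr(beta, eta, c_now, c_future, s):
    """Equation (5): r(s) = -(1/s) * ln(MRS_s), defined for s >= 1."""
    return -math.log(mrs(beta, eta, c_now, c_future, s)) / s

def npv(payoffs, rate_at):
    """Equation (3): sum over s of pi_s * exp(-r(s) * s)."""
    total = payoffs[0]  # the payoff at s = 0 is undiscounted
    for s in range(1, len(payoffs)):
        total += payoffs[s] * math.exp(-rate_at(s) * s)
    return total

# Hypothetical parameter values, chosen only for illustration.
RHO, ETA, G = 0.01, 1.5, 0.02
BETA = math.exp(-RHO)
C0 = 1.0

def consumption(s):
    """Consumption path with constant compound growth rate G."""
    return C0 * math.exp(G * s)

def rate_at(s):
    return sdr(BETA, ETA, C0, consumption(s), s)

ramsey_rate = RHO + ETA * G   # right-hand side of equation (6)
r_50 = rate_at(50)            # left-hand side, computed from (4)-(5)

# Hypothetical project: pay 1 today, receive 10 in year 50.
payoffs = [-1.0] + [0.0] * 49 + [10.0]
project_npv = npv(payoffs, rate_at)

print(f"r(50) from (4)-(5): {r_50:.6f}, Ramsey rule: {ramsey_rate:.6f}")
print(f"NPV of the hypothetical project: {project_npv:.4f}")
```

With constant growth the rate is the same at every maturity, so the two printed rates coincide up to floating-point error; with a non-constant path `consumption(s)`, `rate_at` would trace out a full term structure.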

Now consider a variation on the time preferences in (1). Suppose that

each planner is a little insecure in her normative judgments, and entertains

the possibility that she may be persuaded of the alternative theory of in-

tertemporal social welfare in the future. For concreteness, suppose that the

probability of the planners’ judgments remaining unaltered next year is w,

and the probability of them changing is 1 − w. How should the planners

account for their insecurity today? One answer is that they should simply

forget about it. This is perfectly coherent, but amounts to a dogmatic impo-

sition of current preferences on future selves, despite the planners’ insecurity

about their current, possibly transitory, normative judgments. Normative

insecurities have no consequences for SDRs in this case.9 A second option is

for the planners to adjust their ‘raw’ preferences by aggregating them with

the alternative theory of intertemporal social welfare. However, this places

9This is the approach often taken in models of time inconsistent preferences. Sophis-ticated agents in these models anticipate the actions of their future selves, and reactoptimally to them, but do not incorporate future preferences into their own rankings ofconsumption streams – they are dogmatic. See Galperti and Strulovici (2017) for furtherdiscussion of the relationship between time consistency and preference internalization.


current and possible future preferences on an equal conceptual footing today,

even though the planners are currently devotees of only one theory. We are

after a model in which current planners can put all their eggs in one basket.

A third option – the one I will pursue – is for the planners to use their cur-

rent preferences to rank consumption streams, but for those preferences to

internalize the preferences of future selves. In this interpretation equation

(1) is thought of as saying that self τ ’s IWF is an additive combination of

current utility and the IWF of self τ + 1. If the self at τ admits the possi-

bility that the self at τ + 1 may be persuaded of the alternative theory of

intertemporal social welfare, a natural analogue of (1) is:

(7) $V^1_\tau = U^1(c_\tau) + \beta^1\left(w V^1_{\tau+1} + (1-w) V^2_{\tau+1}\right)$,

(8) $V^2_\tau = U^2(c_\tau) + \beta^2\left((1-w) V^1_{\tau+1} + w V^2_{\tau+1}\right)$,

where w ∈ (0, 1).

In this model planners’ insecurity in their current normative judgments

causes them to avoid imposing their current preferences on their future self

(they only care about the self one year ahead in this example). Current

planners account for their future self’s preferences directly, and do not just

dogmatically value future consumption streams using their current prefer-

ences, which may be obsolete by the time next year rolls around. Moreover,

normative insecurity is persistent: planners’ preferences at time τ + 1 them-

selves reflect uncertainty about preferences at τ + 2, ad infinitum. Planners

whose IWFs are of the form in (7–8) will be called ‘non-dogmatic’ – I provide

a formal definition below.10 Note that the IWFs defined by (7–8) still admit

10 The literature on intergenerational altruism uses the terms ‘non-paternalistic’ or ‘pure’ to describe agents who internalize others’ preferences. I use ‘non-dogmatic’ both


arbitrary idiosyncratic pure time discount factors and utility functions.

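To analyze the coupled system of time preferences in (7–8), define

$\vec{V}_\tau = \begin{pmatrix} V^1_\tau \\ V^2_\tau \end{pmatrix}; \quad \vec{U}_\tau = \begin{pmatrix} U^1(c_\tau) \\ U^2(c_\tau) \end{pmatrix}; \quad \mathbf{F} = \begin{pmatrix} \beta^1 w & \beta^1(1-w) \\ \beta^2(1-w) & \beta^2 w \end{pmatrix}.$

Then we can write (7–8) as:

(9) $\vec{V}_\tau = \vec{U}_\tau + \mathbf{F}\vec{V}_{\tau+1} = \sum_{s=0}^{\infty} \mathbf{F}^s \vec{U}_{\tau+s}$.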

Planners’ attitudes to consumption changes in the distant future depend on the behaviour of $\mathbf{F}^s$ for large s. Since w ∈ (0, 1), the matrix $\mathbf{F}$ is strictly positive. The Perron-Frobenius theorem (see Sternberg, 2014) then tells us that there is a 2 × 2 matrix $\mathbf{A}$, with elements $a_{ij} > 0$, such that

(10) $\lim_{s\to\infty} \frac{\mathbf{F}^s}{\mu^s} = \mathbf{A}$,

where µ ∈ (0, 1) is the largest eigenvalue of $\mathbf{F}$. Thus when s is large both planners’ weights on future utilities are proportional to $\mu^s$, where µ is a non-trivial mixture of both planners’ discount factors.11
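The limit in (10) can be illustrated numerically for the two-planner matrix. The sketch below uses hypothetical values for the two discount factors and for w, computes the largest eigenvalue from the trace and determinant of the 2 × 2 matrix, and checks that successive powers of the matrix grow by that factor element by element. The function names are assumptions made for this example.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def largest_eigenvalue_2x2(m):
    """Largest eigenvalue of a 2 x 2 matrix from its trace and determinant."""
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return half_trace + math.sqrt(half_trace**2 - det)

# Hypothetical values: two discount factors and the probability w of
# keeping the current theory next year.
BETA1, BETA2, W = 0.99, 0.95, 0.9
F = [[BETA1 * W, BETA1 * (1 - W)],
     [BETA2 * (1 - W), BETA2 * W]]
mu = largest_eigenvalue_2x2(F)

# Compute F^200 and F^201 by repeated multiplication.
S = 200
power = F
for _ in range(S - 1):
    power = matmul(power, F)
next_power = matmul(power, F)

# If F^s is approximately mu^s * A, then F^(s+1) / F^s is approximately mu
# in every element, and F^s / mu^s approximates the positive matrix A.
growth = [[next_power[i][j] / power[i][j] for j in range(2)] for i in range(2)]
a_approx = [[power[i][j] / mu**S for j in range(2)] for i in range(2)]

print("mu =", mu)
print("element-wise growth factors:", growth)
print("approximate A:", a_approx)
```

Because the rows of the matrix sum to the two discount factors, the largest eigenvalue necessarily lies between them, which is one way to read the statement that µ is a mixture of the planners' discount factors.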

To understand the intuition for this result notice that current planners at

τ only care about utilities at future times τ+1, τ+2, . . . indirectly through a

mixture F of their possible IWFs at τ+1. Planners at τ+1 in turn only care

about utilities at times τ +2, τ +3, . . . through a mixture F of their possible

to distinguish my model of intrapersonal internalization from this literature, and because this term is a better fit to the context of this paper, in which planners contend with many plausible theories of intertemporal welfare.

11 In this example $\mu = \frac{w(\beta^1+\beta^2)}{2} + \sqrt{\frac{w^2(\beta^1+\beta^2)^2}{4} - \beta^1\beta^2(2w-1)}$.


IWFs at time τ +2. Thus we see that current planners’ attitudes to utilities

at time τ+s are obtained by iterating their possible IWFs at τ+s backwards

to τ , passing through their IWFs at times τ+s−1, τ+s−2, . . . , τ+1. With

each step back in this iteration the discount factors associated with different

theories of intertemporal welfare are mixed by the matrix F. As the number

of mixing operations grows (i.e., as s increases), planners’ discount factors

become homogenized. For large s the mixing process converges, and both

planners’ long-run utility weights are proportional to a common factor $\mu^s$.

It is the fact that planners anticipate possible changes in their theories of

intertemporal social welfare, and form their current preferences with one eye

on their future selves, that delivers this result.

Substituting (10) into (9) we see that according to planner i, the marginal

rate of substitution between consumption at τ and consumption at distant

future times τ + s is

(11) $MRS^i_s = \frac{\mu^s\left[a_{i1}(U^1)'(c_{\tau+s}) + a_{i2}(U^2)'(c_{\tau+s})\right]}{(U^i)'(c_\tau)}$.

With a few assumptions we can simplify this expression further. Denote the long-run growth rate of consumption by g, i.e., $c_{\tau+s} = e^{gs}c_\tau$ for large s. In addition, define the long-run pure rate of social time preference $\rho = -\ln\mu$, and assume again that utility functions are iso-elastic (i.e., $(U^i)'(c) = c^{-\eta^i}$). Since $(U^i)'(c_{\tau+s}) \propto e^{-g\eta^i s}$ for large s, $MRS^i_s$ is dominated by the exponential term with the lowest value of $\eta^i g$. Substituting these assumptions into (11), we see that when s is large,

$MRS^i_s \propto e^{-(\rho + \min\{\eta^1 g, \eta^2 g\})s} \Rightarrow r^i(s) \to \rho + \min\{\eta^1 g, \eta^2 g\}$.


Thus, although the non-dogmatic planners may have arbitrary disagree-

ments about pure time discount factors and elasticities of marginal utility,

they both agree on the long-run SDR. I will show below that disagreements

may reduce substantially even for medium term maturities.
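The convergence claimed in this example can be traced out numerically. The sketch below computes both planners' SDRs at each maturity directly from the matrix powers in (9), without using the large-s approximation, and compares them with the limiting value. All parameter values are hypothetical, current consumption is normalised to 1, and growth is assumed constant.

```python
import math

# Hypothetical parameters for the two planners.
BETA = (0.99, 0.95)   # pure time discount factors
ETA = (2.0, 1.0)      # elasticities of marginal utility
W = 0.9               # probability of keeping the current theory
G = 0.02              # constant consumption growth rate

F = [[BETA[0] * W, BETA[0] * (1 - W)],
     [BETA[1] * (1 - W), BETA[1] * W]]

def marginal_utility(j, c):
    """Iso-elastic marginal utility of theory j."""
    return c ** (-ETA[j])

def sdr_term_structure(max_s):
    """Return {s: (r_1(s), r_2(s))}, with utility weights given by F^s."""
    rates = {}
    power = [[1.0, 0.0], [0.0, 1.0]]
    for s in range(1, max_s + 1):
        power = [[sum(power[i][k] * F[k][j] for k in range(2))
                  for j in range(2)] for i in range(2)]
        c_future = math.exp(G * s)  # current consumption normalised to 1
        rates[s] = tuple(
            -math.log(
                sum(power[i][j] * marginal_utility(j, c_future) for j in range(2))
                / marginal_utility(i, 1.0)
            ) / s
            for i in range(2)
        )
    return rates

# Largest eigenvalue of F and the limiting SDR from the example.
half_trace = (F[0][0] + F[1][1]) / 2
det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
mu = half_trace + math.sqrt(half_trace**2 - det)
limit = -math.log(mu) + min(eta * G for eta in ETA)

# Dogmatic benchmark: each planner's Ramsey rate when w = 1.
dogmatic = tuple(-math.log(b) + e * G for b, e in zip(BETA, ETA))

rates = sdr_term_structure(2000)
for s in (1, 10, 50, 200, 2000):
    r1, r2 = rates[s]
    print(f"s={s:5d}  r1={r1:.5f}  r2={r2:.5f}  gap={abs(r1 - r2):.5f}")
print(f"limit={limit:.5f}  dogmatic rates={dogmatic}")
```

In this sketch the gap between the two rates shrinks roughly like 1/s at long maturities, since the planners' marginal rates of substitution differ only by a constant factor once the common exponential term dominates.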

II. The model

I now extend the results above to an arbitrary number of planners, each

of whom may account for the preferences of future selves into the indefinite

future. Assume that there are N > 1 plausible normative theories of in-

tertemporal social welfare. As before I identify planner i with theory i, and

denote i’s IWF at time τ by $V^i_\tau$. The vector of IWFs at time τ is denoted by $\vec{V}_\tau = (V^1_\tau, V^2_\tau, \ldots, V^N_\tau)$. We will say that the time preferences defined by the sequence $\{\vec{V}_\tau\}_{\tau\in\mathbb{N}}$ are non-dogmatic if for all $i = 1 \ldots N$, $\tau \in \mathbb{N}$,

(12) $V^i_\tau = U^i(c_\tau) + \sum_{s=1}^{\infty}\sum_{j=1}^{N} f^{ij}_s V^j_{\tau+s}$,

where $f^{ij}_s \geq 0$ for all $s \geq 1$, and there exists a $t \geq 1$ such that $f^{ij}_t > 0$ for all $i, j = 1 \ldots N$. Lemma 1 in the Appendix shows that (12) defines a unique bounded set of time preferences that are non-decreasing in all utilities if

(13) $\max_i \left\{ \sum_{s=1}^{\infty}\sum_{j=1}^{N} f^{ij}_s \right\} < 1$.

I assume this condition from now on.12

The definition in (12) encodes three assumptions. First, planners’ time

12 The condition in (13) is sufficient, but not necessary, for the required properties to hold. A necessary and sufficient condition is provided in the appendix; however, as this condition is difficult to check in practice we will work with (13). None of the results depend on this simplification.


preferences are forward looking and time invariant; this captures the persis-

tence of normative insecurity, and implies that preferences do not depend on

the history of consumption. Second, current planners internalize the prefer-

ences of possible future selves, and assign non-zero weight to each plausible

theory when imagining what their future preferences might be. Third, pref-

erences are additively time separable. A set of IWFs satisfies these three

assumptions if and only if it is of the form in (12).13

The intertemporal weight $f^{ij}_s$ in (12) is the product of two terms: the pure time discount factor of planner i at time τ on the IWF of self τ + s (denoted by $\beta^i_s$), and the weight i places on theory j in year τ + s (denoted by $w^{ij}_s$):

$f^{ij}_s = \beta^i_s w^{ij}_s$,

where $\sum_{j=1}^{N} w^{ij}_s = 1$. In the normative application we consider it is natural to require some parity between the weights $w^{ij}_s$ of different planners, as in

the simple example above.14 This ensures that the model delivers a set of

theories that are ‘equally non-dogmatic’, but since this is not required for

the main result I do not insist on it in the definition.

There is nothing in the representation (12) that requires us to think of the

weights $w^{ij}_s$ as probabilities – at present these weights are merely parameters

of the preference representation.15 If, however, we do interpret these weights

13 Forward looking time invariant IWFs that internalize future preferences are of the form $V^i_\tau = F^i(c_\tau, V^1_{\tau+1}, \ldots, V^N_{\tau+1}, V^1_{\tau+2}, \ldots, V^N_{\tau+2}, \ldots)$. Galperti and Strulovici (2017) show that IWFs of this kind are time separable if and only if they are of the form in (12).

14 Equation A.32 in the appendix gives a more sophisticated example of ‘parity’ between planners’ intertemporal weights.

15 With some modification (12) could be interpreted as a positive model of a set of altruistic agents, each of whom cares about everyone else’s total wellbeing in every future period. This would require the arguments of utility functions to be idiosyncratic private consumption variables, rather than aggregate social consumption – in this case each agent would have N consumption discount rates at each maturity. If, however, these agents


as beliefs, it is natural to require that those beliefs be consistent. Consistency requires that current planners’ beliefs about which theories they

may adopt in the future cohere with their future selves’ beliefs about their

own chances of switching from their preferred theory. Let $\mathrm{Prob}_\tau(i \to j; s)$ denote the probability that planner i at time τ assigns to a switch to theory j after s years. Beliefs are consistent iff

$\mathrm{Prob}_\tau(i \to j; s) = \sum_{k=1}^{N} \mathrm{Prob}_\tau(i \to k; t)\,\mathrm{Prob}_{\tau+t}(k \to j; s-t)$,

for all $\tau \in \mathbb{N}$, $s \geq 2$, $1 \leq t < s$. Lemma 2 in the Appendix shows that non-dogmatic planners have consistent beliefs iff there exists an N × N stochastic matrix $\mathbf{P}$ such that

(14) $w^{ij}_s = (\mathbf{P}^s)_{i,j}$,

for all $s \geq 1$. We use this restriction on the weights $w^{ij}_s$ in Section IV below,

but the main results do not require it.
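The consistency condition can be checked mechanically for any candidate family of weights. The sketch below, which assumes time-invariant beliefs and a hypothetical 2 × 2 stochastic matrix, confirms that weights given by matrix powers as in (14) pass the check, while weights that stay constant across maturities fail it. The function names are assumptions made for this example.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(p, s):
    """s-th power of a square matrix by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(s):
        result = matmul(result, p)
    return result

def beliefs_consistent(weights, max_s, tol=1e-12):
    """Check w_s = w_t * w_(s-t) for all 2 <= s <= max_s and 1 <= t < s.

    `weights(s)` returns the N x N matrix of switching probabilities at
    horizon s; beliefs are assumed not to depend on the calendar date.
    """
    for s in range(2, max_s + 1):
        for t in range(1, s):
            lhs = weights(s)
            rhs = matmul(weights(t), weights(s - t))
            n = len(lhs)
            if any(abs(lhs[i][j] - rhs[i][j]) > tol
                   for i in range(n) for j in range(n)):
                return False
    return True

# Hypothetical one-year switching probabilities (rows sum to one).
P = [[0.9, 0.1],
     [0.2, 0.8]]

markov_ok = beliefs_consistent(lambda s: matpow(P, s), max_s=8)
constant_ok = beliefs_consistent(lambda s: P, max_s=8)

print("weights of the form P^s consistent:", markov_ok)
print("horizon-independent weights consistent:", constant_ok)
```

The second case fails because horizon-independent weights would require the matrix to equal its own square, which does not hold for the matrix chosen here.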

III. Results

As in the example above, $V^i_\tau$ in (12) has an equivalent representation in terms of sums of future utilities, which may be determined by solving the infinite system of equations (12) (see appendix). We write the solution of this system as

(15) $V^i_\tau = \sum_{s=0}^{\infty}\sum_{j=1}^{N} a^{ij}_s U^j(c_{\tau+s})$,

derived their utility from consumption of a public good, (12) would apply unchanged.


where $a^{ij}_s \geq 0$ for all $i, j = 1 \ldots N$, $s \in \mathbb{N}$. The SDR at maturity s according to planner i is:

(16) $r^i(s) = -\frac{1}{s}\ln MRS^i_s = -\frac{1}{s}\ln\left(\frac{1}{(U^i)'(c_\tau)}\sum_{j=1}^{N} a^{ij}_s (U^j)'(c_{\tau+s})\right)$.

Define the elasticity of planner i’s marginal utility function as

(17) $\eta^i(c) = -\frac{c\,(U^i)''(c)}{(U^i)'(c)}$.

If $\eta^i(c)$ is uniformly larger than $\eta^j(c)$, planner i is more averse to intertemporal consumption inequalities than planner j. I assume that $\eta^i(c) \geq 0$, is bounded for all c, and that $\lim_{c\to\infty}\eta^i(c) > 0$ and $\lim_{c\to 0}\eta^i(c) > 0$ for all i (I assume that all limits exist). In addition, define the long-run growth rate of consumption to be

$g = \lim_{s\to\infty} \frac{1}{s}\ln\left(\frac{c_{\tau+s}}{c_\tau}\right)$

and let

(18) $\eta = \begin{cases} \min_i \{\lim_{c\to\infty}\eta^i(c)\} & \text{if } g > 0 \\ \max_i \{\lim_{c\to 0}\eta^i(c)\} & \text{if } g < 0. \end{cases}$

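Finally, let $\mathbf{F}_s$ be an N × N matrix with elements $(\mathbf{F}_s)_{i,j} = f^{ij}_s$, let $\mathbf{1}_N$ be the N × N identity matrix, and define the NM × NM matrix

(19) $\Phi_M = \begin{pmatrix} \mathbf{F}_1 & \mathbf{F}_2 & \cdots & \mathbf{F}_{M-1} & \mathbf{F}_M \\ \mathbf{1}_N & 0 & \cdots & 0 & 0 \\ 0 & \mathbf{1}_N & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \mathbf{1}_N & 0 \end{pmatrix}$.

Let $\mu(M) \in (0, 1)$ be the largest eigenvalue of $\Phi_M$.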

With these definitions in place the main result can be stated.

PROPOSITION 1: All non-dogmatic planners agree on the long-run SDR:

(20) $\lim_{s\to\infty} r^i(s) = \rho + \eta g, \quad \forall i = 1, \ldots, N$,

where $\rho = -\lim_{M\to\infty} \ln\mu(M)$.

The proof of this proposition shows that the requirement in (12) that each

planner place positive weight on all theories in some future period is stronger

than is needed for the result.16 The formula in (20) can also be extended

to the case where consumption growth is uncertain (see the appendix). The

proposition also provides a practical procedure for approximating ρ: com-

pute − lnµ(M) for increasingly large values of M .17
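The approximation procedure just described can be sketched as follows. The weights used here are hypothetical (geometric discounting combined with switching probabilities given by powers of a stochastic matrix) and are chosen so that condition (13) holds. The largest eigenvalue is estimated by power iteration, which is an assumption of this sketch rather than a method prescribed by the paper; it exploits the block structure of the matrix in (19) so that the matrix itself is never stored.

```python
import math

# Hypothetical weights: f^{ij}_s = GAMMA[i] * ALPHA[i]**s * (P^s)_{ij}.
GAMMA = (0.1, 0.2)
ALPHA = (0.9, 0.8)
P = [[0.9, 0.1],
     [0.2, 0.8]]
N = 2

# Rows of P^s sum to one, so the row sums in condition (13) are geometric series.
row_sums = [g * a / (1 - a) for g, a in zip(GAMMA, ALPHA)]
assert max(row_sums) < 1, "condition (13) must hold"

def build_weights(max_m):
    """Return [F_1, ..., F_max_m] as N x N nested lists."""
    weights = []
    p_pow = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
    for s in range(1, max_m + 1):
        p_pow = [[sum(p_pow[i][k] * P[k][j] for k in range(N))
                  for j in range(N)] for i in range(N)]
        weights.append([[GAMMA[i] * ALPHA[i] ** s * p_pow[i][j]
                         for j in range(N)] for i in range(N)])
    return weights

def largest_eigenvalue(f_list, max_iter=100000, tol=1e-13):
    """Power iteration for the largest eigenvalue of Phi_M, M = len(f_list).

    The vector is stored as M blocks of length N. Multiplying by Phi_M maps
    (x_1, ..., x_M) to (F_1 x_1 + ... + F_M x_M, x_1, ..., x_{M-1}).
    """
    m = len(f_list)
    x = [[1.0] * N for _ in range(m)]
    estimate = 0.0
    for _ in range(max_iter):
        top = [sum(f_list[s][i][j] * x[s][j]
                   for s in range(m) for j in range(N)) for i in range(N)]
        y = [top] + x[:-1]
        new_estimate = sum(map(sum, y)) / sum(map(sum, x))
        scale = sum(map(sum, y))
        x = [[v / scale for v in block] for block in y]
        if abs(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate

all_weights = build_weights(20)
m_values = (1, 2, 5, 10, 20)
mus = [largest_eigenvalue(all_weights[:m]) for m in m_values]
for m, mu in zip(m_values, mus):
    print(f"M={m:3d}  mu(M)={mu:.6f}  -ln mu(M)={-math.log(mu):.6f}")
```

In this sketch the printed values of the negative log eigenvalue fall as M grows, in line with the monotonicity noted in footnote 17; how large M must be for a given accuracy depends on how quickly the weights decay with s.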

Proposition 1 provides a simple characterization of the consensus long-

run elasticity of marginal utility η. The consensus long-run pure rate of

social time preference ρ is, however, a much more complex quantity, which

16 It is sufficient for each planner to place positive weight on some other theory in some future period, in such a way that if we look far enough ahead, all planners’ preferences influence each other. Planner i need not place positive weight on theory j directly.

17 The appendix shows that $-\ln\mu(M)$ decreases monotonically to ρ as M increases.


depends on the full set of intertemporal weights $f^{ij}_s$. The appendix provides

further discussion of ρ, including some comparative statics results. We will

content ourselves with describing two intuitive properties of ρ here.

PROPOSITION 2: 1) ρ is decreasing in $f^{ij}_s$ for all i, j, s.

2) Suppose that the intertemporal weights $f^{ij}_s$ are given by

$f^{ij}_s(\epsilon) = \begin{cases} f^{ii}_s & j = i \\ h^{ij}_s(\epsilon) & j \neq i, \end{cases}$

where the functions $h^{ij}_s(\epsilon)$ are continuous, $h^{ij}_s(\epsilon) > 0$ for ε > 0, and $h^{ij}_s(0) = 0$. Let $\rho^i$ be planner i’s idiosyncratic long-run rate of pure time preference when ε = 0, and let ρ(ε) be the consensus long-run rate of pure time preference when ε > 0. Then

(21) $\lim_{\epsilon\to 0^+} \rho(\epsilon) = \min_i \rho^i$.

The first part of the proposition is intuitive. Any increase in $f^{ij}_s$ increases the weight planner i places on future utilities. Since all planners’ IWFs depend on planner i’s IWF, all planners are less impatient if $f^{ij}_s$ increases. Thus the consensus long-run rate of time preference decreases if $f^{ij}_s$ increases. The

second part of the proposition shows that if all planners assign arbitrarily

small, but positive, weight to alternative theories, their consensus long-

run rate of time preference is the lowest of all of their ‘dogmatic’ rates.

To understand the intuition for this finding, note that although planner i

places arbitrarily small weight on theories that do not coincide with her

current theory as ε → 0, each theory still enters into her current IWF $V^i_\tau$ for all ε > 0. When ε = 0 planner j’s weights on future utilities decline like $e^{-\rho^j s}$ as s → ∞. Thus the planner with the lowest value of $\rho^j$ will

place exponentially more weight on distant future utilities than any more

impatient planner as s → ∞ when ε = 0. Since the most patient planner’s

preferences are part of each planner’s preferences for ε > 0, by continuity

the consensus long-run rate of time preference must be given by the most

patient planner’s dogmatic long-run rate of time preference as ε→ 0.18

Part 2 of Proposition 2 is reminiscent of related findings on the aggregation of

opinions on SDRs (Weitzman, 2001; Freeman and Groom, 2015), and on

the utilitarian aggregation of time preferences (e.g. Gollier and Zeckhauser,

2005). In each of these cases averaging over a distribution of discount factors

leads to a ‘certainty equivalent’ discount rate, or a representative discount

rate, that declines to the lowest rate as the time horizon tends to infinity.

Proposition 2 differs from these results as it pertains to the long-run SDR

in each theory, rather than an external analyst’s average across preferences

or real discount rates. The proposition also shows that ρ is only determined

by the most patient planner in a very special case of the model, i.e., when

planners are ‘minimally’ non-dogmatic. In all other cases, ρ is a non-trivial

mixture of the intertemporal weights of all theories.
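Both parts of Proposition 2 can be illustrated in the simplest possible setting. The sketch below assumes two planners who look only one year ahead, hold their own-theory weights fixed at hypothetical discount factors, and place a common weight ε on the other theory. In that case the consensus rate is the negative log of the largest eigenvalue of a 2 × 2 matrix, which has a closed form.

```python
import math

# Hypothetical own-theory weights (the dogmatic discount factors).
BETA = (0.97, 0.95)

def consensus_rho(eps):
    """-ln of the largest eigenvalue of [[beta1, eps], [eps, beta2]]."""
    b1, b2 = BETA
    mu = (b1 + b2) / 2 + math.sqrt((b1 - b2) ** 2 / 4 + eps ** 2)
    return -math.log(mu)

# Dogmatic rates of pure time preference, obtained at eps = 0.
dogmatic_rhos = [-math.log(b) for b in BETA]
min_dogmatic = min(dogmatic_rhos)

# Values of eps are kept small enough that row sums stay below one,
# so that condition (13) holds.
eps_values = [0.02, 0.01, 0.001, 1e-5, 1e-8]
assert all(max(BETA) + e < 1 for e in eps_values)
rhos = [consensus_rho(e) for e in eps_values]

for e, r in zip(eps_values, rhos):
    print(f"eps={e:8.1e}  rho(eps)={r:.8f}")
print(f"lowest dogmatic rate: {min_dogmatic:.8f}")
```

As ε shrinks the printed consensus rate rises towards the lowest dogmatic rate, consistent with both parts of the proposition: larger cross-weights lower the consensus rate, and vanishing cross-weights leave the most patient planner's rate.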

IV. Consequences for cost-benefit analysis

While Proposition 1 emphasizes the emergence of a consensus on the

long-run SDR, this result implies a more general phenomenon that has

18 More technically, a matrix that determines planners’ pure time discount factors, call it Φ(ε), separates into N independent components at ε = 0, each of which has a dominant eigenvalue that corresponds to the long-run pure time discount factor of one of the dogmatic planners. The largest eigenvalue of Φ(0) is simply the largest of these N dominant eigenvalues. When ε > 0, Φ(ε) is primitive and has only one component. Since eigenvalues are continuous functions of matrix elements, the largest eigenvalue of Φ(ε) converges to the largest of the N dogmatic long-run discount factors as ε → 0.


relevance for cost-benefit analysis. As (3) shows, calculations of the net

present value of public projects depend on the full term structure of SDRs.

Since non-dogmatic planners’ SDRs $r^i(s)$ converge completely as maturity

s → ∞, they must also exhibit partial convergence at finite maturities.

Non-dogmatism may thus reduce disagreement about project NPVs by act-

ing through the entire term structure of the SDR. In this section, I illustrate

the effect of non-dogmatism on cost-benefit analysis of public projects in a

calibrated numerical model, and also show how quickly disagreements about

SDRs may decline with maturity. The results in this section constitute a

normative counterfactual: I have argued that advocates of all theories of

intertemporal welfare should be non-dogmatic, and this section illustrates

what might happen to disagreements if they were.

To enable this analysis I will work with data on economists’ opinions on

the normatively appropriate IWF, collected by Drupp et al. (2018a,b).19 Al-

though there is no deep reason why economists’ normative judgments should

be seen as representative of the distribution of plausible theories, they do

arguably have an advantage in understanding the quantitative implications

of different recommendations for cost-benefit analysis. Rawls (1971), in his

notion of ‘reflective equilibrium’, argues that this is an essential feature

of good normative reasoning. For my purposes these economists’ opinions

merely provide an interesting distribution of informed views on these mat-

ters.

19I assume that the opinions expressed in the survey data do not already account for non-dogmatism. Drupp et al. (2018a) state explicitly that ‘we structure the survey around a well-known framework for inter-temporal welfare evaluations: Time Discounted Utilitarianism’, and work with an iso-elastic utility function. Only two respondents objected to the survey’s request for a constant pure rate of time preference, and none objected to the request for a constant elasticity of marginal utility (personal communication). Both of these quantities would vary with maturity if the respondents were non-dogmatic.


The Drupp et al. (2018a) survey contains 173 complete responses from

scholars who have published papers on social discounting. Each respondent

gave an opinion on the appropriate values of the parameters of a discounted

utilitarian IWF with iso-elastic utility function. The 5-95% ranges of opin-

ions on the pure rate of social time preference and elasticity of marginal

utility were [0,3.85%/yr] and [0.2,3] respectively. To calibrate the model I

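assume that the intertemporal weights $f^{ij}_s = \beta^i_s w^{ij}_s$ in (12) take the following form:

$$\beta^i_s = \gamma_i(\alpha_i)^s, \qquad w^{ij}_s = (P^s)_{i,j}, \qquad \text{where } P_{i,j} = \begin{cases} x & i = j \\ \dfrac{1-x}{N-1} & i \neq j. \end{cases} \tag{22}$$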

The parameter x ∈ [0, 1] is the probability that planners stick to their cur-

rent theory next year.20 Conditional on switching, planners assign an equal

probability to all other theories. By (14), planners’ beliefs are consistent

in this model. The values of γi and αi are calibrated so that when x → 1

non-dogmatic planners’ IWFs are a close approximation to a discounted

utilitarian IWF, and consistent with the values for the pure rate of social

time preference that survey respondents reported. Utility functions $U^i(c)$

are taken to be iso-elastic, with the elasticity of marginal utility calibrated

to respondents’ reported values.
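The transition structure in (22) is easy to reproduce numerically. The following is a minimal sketch, not the calibration code used for the paper: it builds P for hypothetical values N = 4 and x = 0.9, computes the belief weights as matrix powers of P, and compares the diagonal entries with the closed form reported in footnote 20. All function names are illustrative.

```python
def transition_matrix(n, x):
    """P from (22): keep the current theory with probability x, else switch uniformly."""
    off = (1.0 - x) / (n - 1)
    return [[x if i == j else off for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def belief_weights(p, s):
    """Matrix of weights w_s = P^s for an integer maturity s >= 1."""
    w = p
    for _ in range(s - 1):
        w = matmul(w, p)
    return w


def closed_form_diagonal(n, x, s):
    """Diagonal weight w^{ii}_s as written in footnote 20."""
    lam = (n * x - 1.0) / (n - 1)
    return x * lam ** (s - 1) + (1.0 - lam ** (s - 1)) / n


N, X = 4, 0.9  # hypothetical values, chosen only for illustration
for s in (1, 5, 30):
    w = belief_weights(transition_matrix(N, X), s)
    print(s, round(w[0][0], 6), round(closed_form_diagonal(N, X, s), 6))
```

In this sketch the two columns agree at every maturity, and the diagonal weight decays towards 1/N as s grows, which is the sense in which a planner's long-run beliefs forget her current theory.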

One subtlety of the calibration procedure is worth pointing out. I have

chosen to present the model with an annual time step as discount rate

20In this model, $w^{ii}_s = x\left(\frac{Nx-1}{N-1}\right)^{s-1} + \frac{1}{N}\left(1 - \left(\frac{Nx-1}{N-1}\right)^{s-1}\right)$ for all i, and $w^{ij}_s = \frac{1 - w^{ii}_s}{N-1}$ for j ≠ i. However, the appendix explains that the numerical results in this section are robust to alternative specifications of the weights $w^{ij}_s$ for s ≥ 2. Note that these are weights on future IWFs, and not on future utilities; utility weights are determined by the solution of the entire system (12). No matter what IWF planner i adopts in year s, it is discounted using her current intertemporal weight $\gamma_i(\alpha_i)^s$.


schedules are usually presented at this temporal resolution in practice. If,

however, we change the time step we also have to change the values of the

dynamical parameters in the model, i.e., consumption growth rates, rates of

pure time preference, and in particular the transition probability matrix P,

to reflect the change of units. This procedure is less straightforward in this

model of interdependent preferences than in more familiar dynamic models,

but can be accomplished. The appendix provides further details on this

point, and a full description of the calibration procedures.

Given this calibration methodology, the distribution of term structures for

the SDR can be computed for different values of the parameter x. Figure 1a

depicts the results of this exercise, assuming a constant consumption growth

rate of 2%/yr. The figure shows that disagreements about the SDR could

reduce substantially even at medium term maturities if planners were non-

dogmatic. Reductions in disagreement are greatest at longer maturities, but

are substantial even for maturities of 30 years. When planners’ judgments

are highly persistent (i.e., x is close to 1) the range of opinions on SDRs

expands, but for any x < 97.5% disagreement is reduced by more than

a factor of three at maturities greater than 50 years. In the appendix I

demonstrate that the rapid reduction in disagreement depicted in Figure 1a

is largely driven by non-dogmatic planners’ elasticities of marginal utility

(i.e., the analogue of the consumption growth term $\eta g_s$ in (6)), and not by

their rates of pure social time preference (i.e., the analogue of ρ in (6), which

depends only on the intertemporal weights $f^{ij}_s$). Disagreements about the

consumption growth term in the Ramsey rule are significantly larger than

disagreements about the pure rate of time preference, but decay rapidly with

maturity if planners are non-dogmatic. Disagreements about the pure rate


of time preference are smaller, but decay much more slowly with maturity.

More than 90% of the variation in r(s) is attributable to variation in the

pure rate of time preference alone for maturities s > 60 years, for all the

values of x depicted in Figure 1a.

Figure 1b illustrates how non-dogmatism reduces disagreements about the

net present values of payoff sequences, as defined by (3). The figure depicts

five payoff sequences $\boldsymbol{\pi}$.21 To quantify the reduction in disagreement about NPVs let $\sigma(\{NPV(\boldsymbol{\pi}; x)\})$ denote the standard deviation of the set of net present values of $\boldsymbol{\pi}$ according to non-dogmatic planners with weight x in (22), and compute the following ratio for each sequence $\boldsymbol{\pi}$:

$$\Gamma(\boldsymbol{\pi}; x) = \frac{\sigma(\{NPV(\boldsymbol{\pi}; x)\})}{\sigma(\{NPV(\boldsymbol{\pi}; 1)\})}. \tag{23}$$

This ratio captures the reduction in disagreement about NPVs, relative to

the dogmatic benchmark at x = 1. Figure 1b shows that non-dogmatism

could substantially reduce disagreements about the value of projects whose

payoffs are concentrated at maturities of 30 years or greater, even if x =

97.5%. Reductions in disagreement increase strongly as payoffs move further

into the future. For the project on the far right, whose benefits largely occur

more than 60 years in future, disagreements are reduced by more than a

factor of 5 even if x = 97.5%.
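The ratio in (23) is straightforward to compute once each planner's term structure is known. The sketch below is purely illustrative and does not reproduce the calibrated model: it assumes that NPVs are formed as the sum of payoffs weighted by exp(−r(s)s), it uses three hypothetical dogmatic planners with constant rates, and it replaces the model's non-dogmatic term structures with a made-up `converging` schedule that drifts towards a common rate. The function names and all numbers are assumptions.

```python
import math
import statistics


def npv(payoffs, sdr):
    """NPV of payoffs {year: amount} under a term structure sdr(s), in 1/yr."""
    return sum(p * math.exp(-sdr(s) * s) for s, p in payoffs.items())


def gamma_ratio(payoffs, nondogmatic, dogmatic):
    """Spread of NPVs across planners, relative to the dogmatic benchmark."""
    spread = statistics.pstdev([npv(payoffs, r) for r in nondogmatic])
    benchmark = statistics.pstdev([npv(payoffs, r) for r in dogmatic])
    return spread / benchmark


def constant(rate):
    """Dogmatic planner: flat term structure."""
    return lambda s: rate


def converging(rate, common=0.03, persistence=0.95):
    """Hypothetical term structure that moves towards a common rate with maturity."""
    return lambda s: common + (rate - common) * persistence ** s


rates = [0.01, 0.03, 0.05]   # toy opinions, not survey data
project = {60: 1.0}          # a single payoff 60 years from now
ratio = gamma_ratio(project,
                    [converging(r) for r in rates],
                    [constant(r) for r in rates])
print(round(ratio, 3))
```

Because the same number of planners appears in the numerator and the denominator, the choice between population and sample standard deviation does not affect the ratio.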

21These may be thought of as public projects with equal up-front costs but different temporal profiles of benefits. The undiscounted sum of benefits for each project is normalized to 1.

[Figure 1, panel (a). Horizontal axis: maturity s (yrs), 0 to 200. Vertical axis: 5-95% range of $r^i(s)$ (%/yr), 0 to 7. Curves: x = 80%, x = 90%, x = 95%, x = 97.5%, x = 1.]

(a) Simulated 5-95% range for non-dogmatic planners’ SDRs $r^i(s)$. The solid black curve corresponds to x = 1 in (22), dotted purple x = 97.5%, dash-dotted yellow x = 95%, dashed red x = 90%, and solid blue x = 80%. Consumption growth is a constant 2%/yr.

[Figure 1, panel (b). Horizontal axis: years (s), 0 to 120. Vertical axis: project payoffs, 0 to 1.]

(b) Reduction in disagreement about NPVs when planners are non-dogmatic. Each curve in the figure denotes a time sequence of payoffs. The markers centered on each curve denote the values of $\Gamma(\boldsymbol{\pi}; x)$, defined in (23), for this payoff sequence. ◦, +, ×, � denote values of $\Gamma(\boldsymbol{\pi}; x)$ when x = 97.5%, 95%, 90%, 80% respectively in (22).

Figure 1: Consequences of non-dogmatism for cost-benefit analysis.

The rate of dissipation of disagreement with maturity depicted in Figure 1 clearly depends on the value of the parameter x. The values x = 80%, 90%, 95%, 97.5% used in this figure correspond to a change of normative views roughly once every 5, 10, 20, and 40 years respectively, on average.

Are these plausible values? In order to answer this question we must first

recognise that x can be interpreted either as a positive or a normative ob-

ject. On the one hand, any ethical observer’s insecurity in their normative

judgments could be construed as a subjective matter; in this interpretation

assessing the chance of those views changing is a positive question about

that observer’s state of mind. There is some suggestive evidence for the oc-

casional changes of heart that are needed to support the paper’s conclusions

under this interpretation, as professional philosophers’ convictions have been

shown to correlate with their age (Bourget and Chalmers, 2014).22 On the

other hand if we accept the persuasive motivation for the model, i.e., that it

is designed to nudge planners into forming their normative judgments in a

new way, then x plays a normative role. Tweaking this lever shows planners

how their normative views on IWFs should adjust away from the standard

discounted utilitarian framework, given a degree of non-dogmatism that is

‘normatively required’. Although both interpretations are consistent with

the model, the latter is more in keeping with the ethos of this paper.
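The correspondence between x and the average time between changes of view quoted above can be checked directly. Under (22) a planner keeps her current theory in any given year with probability x, so the number of years T until her first change of view is geometrically distributed:

$$\Pr(T = t) = x^{t-1}(1-x), \qquad E[T] = \sum_{t=1}^{\infty} t\,x^{t-1}(1-x) = \frac{1}{1-x}.$$

Setting x = 80%, 90%, 95%, and 97.5% gives E[T] = 5, 10, 20, and 40 years respectively.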

Given this, it is reasonable to ask how much non-dogmatism is ‘norma-

tively required’. That is something of a meta-ethical question, and readers

will doubtless have their own views on it. Requiring planners to admit the

possibility of a change in their convictions roughly once every 10 or 20 years

does not seem like an excessively burdensome prescription (recall that they

still have the freedom to discount the preferences of future selves as they see

fit). Indeed, the argument that uncertainty or insecurity should play a non-

22This finding chimes with a witticism often attributed to Georges Clemenceau: ‘Not to be a socialist at twenty is proof of want of heart; to be one at thirty is proof of want of head.’


negligible role in normative judgments is not unique to this paper. Catholic

theology has grappled with related issues since the 16th century, when the

doctrine of ‘probabilism’ was introduced as a guide to action in the face of

moral uncertainty (Harty, 1913). Normative uncertainty is currently also

a central topic in philosophy, precisely because many have grown weary of

old debates that pit ethical theories against one another in a zero-sum game

(see e.g. Bostrom, 2009; MacAskill, 2016; MacAskill and Ord, 2018).

V. Conclusion

This paper introduced a normative model of social planners’ time pref-

erences based on a principle of ‘non-dogmatism’. This principle requires

advocates of alternative theories of intertemporal social welfare to exhibit

a degree of humility when forming their normative judgments: They admit

the possibility of a change in their views, and refrain from imposing their

current normative judgments on their future selves. The formalism allows

advocates of each theory the freedom to express idiosyncratic judgments

about all the contested normative aspects of social time preferences. In

spite of this, all non-dogmatic theories yield the same value of the long-run

social discount rate. As the appropriate value of this quantity has been

widely contested and has a powerful influence on the evaluation of pub-

lic projects with long-run consequences, this analysis may prove useful for

policy applications.

REFERENCES

Arrow, Kenneth J. 1999. “Discounting, morality, and gaming.” In Discounting and Intergenerational Equity, ed. Paul R. Portney and John P. Weyant. Resources for the Future.


Arrow, Kenneth J., Partha Dasgupta, and Karl-Goran Maler. 2003. “Evaluating

Projects and Assessing Sustainable Development in Imperfect Economies.” Environ-

mental and Resource Economics, 26(4): 647–685.

Arrow, K., M. Cropper, C. Gollier, B. Groom, G. Heal, R. Newell, W. Nord-

haus, R. Pindyck, W. Pizer, P. Portney, T. Sterner, R. S. J. Tol, and M.

Weitzman. 2013. “Determining Benefits and Costs for Future Generations.” Science,

341(6144): 349–350.

Bergstrom, Theodore C. 1999. “Systems of Benevolent Utility Functions.” Journal of

Public Economic Theory, 1(1): 71–100.

Bostrom, Nick. 2009. “Moral uncertainty - towards a solution?” http://www.overcomingbias.com/2009/01/moral-uncertainty-towards-a-solution.html, accessed on 20/05/2019.

Bourget, David, and David J. Chalmers. 2014. “What do philosophers believe?”

Philosophical Studies, 170(3): 465–500.

Chambers, Christopher P., and Federico Echenique. 2018. “On Multiple Discount

Rates.” Econometrica, 86(4): 1325–1346.

Dasgupta, Partha. 2008. “Discounting climate change.” Journal of Risk and Uncer-

tainty, 37(2): 141–169.

Dasgupta, Partha, Amartya Sen, and Stephen A. Marglin. 1972. Guidelines for project evaluation. New York: United Nations.

Drupp, Moritz, Mark C. Freeman, Ben Groom, and Frikk Nesje. 2018a. “Dis-

counting Disentangled.” American Economic Journal: Economic Policy, 10(4): 109–

134.

Drupp, Moritz, Mark C. Freeman, Ben Groom, and Frikk Nesje. 2018b. “Dis-

counting Disentangled: Dataset.” American Economic Journal: Economic Policy,

https://doi.org/10.1257/pol.20160240.

Feng, Tangren, and Shaowei Ke. 2018. “Social Discounting and Intergenerational

Pareto.” Econometrica, 86(5): 1537–1567.


Freeman, Mark C., and Ben Groom. 2015. “Positively Gamma Discounting: Com-

bining the Opinions of Experts on the Social Discount Rate.” The Economic Journal,

125(585): 1015–1024.

Galperti, Simone, and Bruno Strulovici. 2017. “A Theory of Intergenerational Al-

truism.” Econometrica, 85(4): 1175–1218.

Gollier, Christian. 2012. Pricing the Planet’s Future: The Economics of Discounting

in an Uncertain World. Princeton University Press.

Gollier, Christian, and James K. Hammitt. 2014. “The Long-Run Discount Rate

Controversy.” Annual Review of Resource Economics, 6(1): 273–295.

Gollier, Christian, and Richard Zeckhauser. 2005. “Aggregation of Heterogeneous

Time Preferences.” Journal of Political Economy, 113(4): 878–896.

Greaves, Hilary. 2017. “Discounting for public policy: A survey.” Economics and Phi-

losophy, 33(3): 391–439.

Harsanyi, John C. 1977. “Rule utilitarianism and decision theory.” Erkenntnis,

11(1): 25–53.

Harty, John M. 1913. “Probabilism.” In Catholic Encyclopedia, Volume 12. The Ency-

clopedia Press.

Heal, Geoffrey M., and Antony Millner. 2014. “Agreeing to disagree on climate

policy.” Proceedings of the National Academy of Sciences, 111(10): 3695–3698.

Hori, Hajime. 2009. “Nonpaternalistic altruism and functional interdependence of social

preferences.” Social Choice and Welfare, 32(1): 59–77.

Laibson, David. 1997. “Golden Eggs and Hyperbolic Discounting.” The Quarterly Jour-

nal of Economics, 112(2): 443–478.

MacAskill, William. 2016. “Normative Uncertainty as a Voting Problem.” Mind,

125(500): 967–1004.

MacAskill, William, and Toby Ord. 2018. “Why Maximize Expected Choice-Worthiness?” Nous (doi: 10.1111/nous.12264): 1–27.


Millner, Antony, and Geoffrey Heal. 2018. “Discounting by committee.” Journal of

Public Economics, 167: 91–104.

Parfit, Derek. 1984. Reasons and Persons. Oxford University Press, USA.

Phelps, E. S., and R. A. Pollak. 1968. “On Second-Best National Saving and Game-

Equilibrium Growth.” The Review of Economic Studies, 35(2): 185–199.

Rawls, John. 1971. A theory of justice. Harvard University Press.

Ray, Debraj. 1987. “Nonpaternalistic intergenerational altruism.” Journal of Economic

Theory, 41(1): 112–132.

Ray, Debraj, Nikhil Vellodi, and Ruqu Wang. 2018. “Backward discounting.”

Working paper.

Saez-Marti, Maria, and Jorgen W. Weibull. 2005. “Discounting and altruism to

future decision-makers.” Journal of Economic Theory, 122(2): 254–266.

Sternberg, Shlomo. 2014. Dynamical Systems. Dover Publications.

Weitzman, Martin L. 2001. “Gamma Discounting.” The American Economic Review,

91(1): 260–271.

Non-dogmatic Social Discounting

Online Appendix

Antony Millner∗1

1Department of Economics, University of California, Santa Barbara.

Contents

A Proof of Lemma 1

B Proof of Lemma 2

C Proof of Proposition 1

D Consensus long-run SDRs under uncertainty

E Proof of Proposition 2

F Comparative statics of the consensus long-run pure rate of social time preference

G Details of calibration

H Changing the model’s time step

I Decomposing non-dogmatic SDRs

J References

∗Millner: [email protected].


A Proof of Lemma 1

Lemma 1. The system (12) defines a unique bounded set of time preferences, which are

non-decreasing in all utilities, if

$$\max_i \left\{ \sum_{s=1}^{\infty} \sum_{j=1}^{N} f^{ij}_s \right\} < 1.$$

Proof. The system of time preferences (12) can be written as a single matrix equation as

follows:

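$$
\begin{pmatrix} V^1_\tau \\ \vdots \\ V^N_\tau \\ V^1_{\tau+1} \\ \vdots \\ V^N_{\tau+1} \\ \vdots \end{pmatrix}
=
\begin{pmatrix} U^1(c_\tau) \\ \vdots \\ U^N(c_\tau) \\ U^1(c_{\tau+1}) \\ \vdots \\ U^N(c_{\tau+1}) \\ \vdots \end{pmatrix}
+
\begin{pmatrix}
\vec{0}_N & f^{11}_1 & \dots & f^{1N}_1 & f^{11}_2 & \dots & f^{1N}_2 & \dots \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \\
\vec{0}_N & f^{N1}_1 & \dots & f^{NN}_1 & f^{N1}_2 & \dots & f^{NN}_2 & \dots \\
\vec{0}_N & \vec{0}_N & f^{11}_1 & \dots & f^{1N}_1 & f^{11}_2 & \dots & \dots \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \\
\vec{0}_N & \vec{0}_N & f^{N1}_1 & \dots & f^{NN}_1 & f^{N1}_2 & \dots & \dots \\
\vdots & \vdots & \vdots & & \vdots & \vdots & &
\end{pmatrix}
\begin{pmatrix} V^1_\tau \\ \vdots \\ V^N_\tau \\ V^1_{\tau+1} \\ \vdots \\ V^N_{\tau+1} \\ \vdots \end{pmatrix}
$$

where $\vec{0}_N$ is a 1 × N vector of zeros. Letting $\vec{X}_\tau$ denote the vector on the left hand side of this expression, $\Lambda$ the infinite dimensional square matrix on the right hand side, and $\vec{U}_\tau$ denote the vector of Us on the right hand side, we have

$$\vec{X}_\tau = \vec{U}_\tau + \Lambda \vec{X}_\tau \;\Rightarrow\; \vec{X}_\tau = (1_\infty - \Lambda)^{-1}\vec{U}_\tau,$$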

where 1∞ is the infinite dimensional identity matrix, and we have assumed that the relevant

matrix inverse exists.

In general infinite dimensional matrices do not have unique inverses. However, Lemma

1 in Bergstrom (1999) shows that 1∞−Λ has a unique bounded inverse with non-negative

elements if and only if 1∞ − Λ is a dominant diagonal matrix. A denumerably infinite

matrix 1∞ − Λ with Λ ≥ 0 is said to be dominant diagonal if there exists a bounded

diagonal matrix D ≥ 0 such that the infimum of the row sums of (1∞ −Λ)D is positive.

Clearly, a sufficient condition for $1_\infty - \Lambda$ to be dominant diagonal is that $\sum_{s=1}^{\infty}\sum_{j=1}^{N} f^{ij}_s < 1$ for all i.

Although this lemma focusses on providing a sufficient condition that is easy to check,

the proof also provides a necessary and sufficient condition: 1∞ − Λ must be dominant

diagonal. This is equivalent to requiring the spectral radius of the linear operator Λ to be

less than 1, as this guarantees that the series $(1_\infty - \Lambda)^{-1} = 1_\infty + \Lambda + \Lambda^2 + \dots$ converges

(Duchin & Steenge, 2009). Checking this condition is however difficult in practice given

the infinite dimensionality of Λ. I will thus work with the simpler sufficient condition

throughout, but the results do not depend on this simplification. The proof of the main

proposition in Appendix C only requires the spectral radius of Λ to be bounded above by

1.
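As a worked illustration of the sufficient condition in Lemma 1, consider the calibrated weights in (22) of the main text, read here as $f^{ij}_s = \gamma_i(\alpha_i)^s (P^s)_{i,j}$ with $\alpha_i \in (0,1)$ (this reading of the index placement is an assumption). Every power of a stochastic matrix has rows that sum to one, so the double sum collapses to a geometric series:

$$\sum_{s=1}^{\infty}\sum_{j=1}^{N} f^{ij}_s = \gamma_i \sum_{s=1}^{\infty} \alpha_i^s \sum_{j=1}^{N} (P^s)_{i,j} = \gamma_i \sum_{s=1}^{\infty} \alpha_i^s = \frac{\gamma_i \alpha_i}{1 - \alpha_i}.$$

Under this reading the condition holds for planner i exactly when $\gamma_i \alpha_i < 1 - \alpha_i$, whatever the value of x.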

B Proof of Lemma 2

We wish to prove that non-dogmatic planners with preferences (12) have consistent beliefs

iff the intratemporal weights $w^{ij}_s$ satisfy (14). In the notation established in the text,

Lemma 2.

$$\mathrm{Prob}_\tau(i \to j; s) = \sum_{k=1}^{N} \mathrm{Prob}_\tau(i \to k; t)\,\mathrm{Prob}_{\tau+t}(k \to j; s-t) \tag{A.1}$$

for all τ ∈ N, s ≥ 2, 1 ≤ t < s if and only if there exists an N × N stochastic matrix P such that

$$w^{ij}_s = (P^s)_{i,j}.$$

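Let the beliefs of planners at time τ about the probability of a future self who subscribes to theory i at time τ+s−1 switching to theory j at time τ+s be $T^{ij,(\tau)}_s$. Denote the matrix of these transition probabilities by $\mathbf{T}^{(\tau)}_s$. Let $\mathbf{W}^{(\tau)}_s$ be the matrix of time τ planners’ beliefs about which theory they will subscribe to at time τ+s, whose i, j element is $\mathrm{Prob}_\tau(i \to j; s)$. Then we have

$$\mathbf{W}^{(\tau)}_s = \mathbf{T}^{(\tau)}_s \mathbf{T}^{(\tau)}_{s-1} \cdots \mathbf{T}^{(\tau)}_1.$$

Using this relation, (A.1) can be written as the requirement that

$$\mathbf{T}^{(\tau)}_s \mathbf{T}^{(\tau)}_{s-1} \cdots \mathbf{T}^{(\tau)}_1 = \mathbf{T}^{(\tau+t)}_{s-t} \mathbf{T}^{(\tau+t)}_{s-t-1} \cdots \mathbf{T}^{(\tau+t)}_1 \mathbf{T}^{(\tau)}_t \mathbf{T}^{(\tau)}_{t-1} \cdots \mathbf{T}^{(\tau)}_1, \tag{A.2}$$

for all τ, t, s. It is clear that a sufficient condition for this to be satisfied is

$$\mathbf{T}^{(\tau)}_s = \mathbf{P}$$

for all τ, s, where P is an N × N stochastic matrix. To prove necessity, put s = 2, t = 1 in (A.2) to find

$$\mathbf{T}^{(\tau)}_2 \mathbf{T}^{(\tau)}_1 = \mathbf{T}^{(\tau+1)}_1 \mathbf{T}^{(\tau)}_1$$

which implies

$$\mathbf{T}^{(\tau)}_2 = \mathbf{T}^{(\tau+1)}_1. \tag{A.3}$$

Putting s = 3, t = 1 in (A.2), we find

$$\mathbf{T}^{(\tau)}_3 \mathbf{T}^{(\tau)}_2 \mathbf{T}^{(\tau)}_1 = \mathbf{T}^{(\tau+1)}_2 \mathbf{T}^{(\tau+1)}_1 \mathbf{T}^{(\tau)}_1 \;\Rightarrow\; \mathbf{T}^{(\tau)}_3 \mathbf{T}^{(\tau)}_2 = \mathbf{T}^{(\tau+1)}_2 \mathbf{T}^{(\tau+1)}_1,$$

and using (A.3) this reduces to

$$\mathbf{T}^{(\tau)}_3 = \mathbf{T}^{(\tau+1)}_2.$$

Repeating this process of substitution, we find that a necessary condition for (A.2) to be satisfied is

$$\mathbf{T}^{(\tau)}_{s+1} = \mathbf{T}^{(\tau+1)}_s.$$

Since non-dogmatic planners’ preferences are time invariant, it must be the case that

$$\mathbf{T}^{(\tau+1)}_s = \mathbf{T}^{(\tau)}_s.$$

Substituting this relation into the previous equation shows that

$$\mathbf{T}^{(\tau)}_{s+1} = \mathbf{T}^{(\tau)}_s$$

for all τ, s. This implies that the matrix of planners’ beliefs $\mathbf{W}^{(\tau)}_s$ must be of the form

$$\mathbf{W}^{(\tau)}_s = (\mathbf{P})^s$$

for all τ.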
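The consistency requirement (A.1) can also be checked numerically for a given family of belief matrices. The sketch below is illustrative only: it assumes time-invariant beliefs, so that (A.1) reads $W_s = W_t W_{s-t}$ in matrix form, and it uses a hypothetical 2 × 2 stochastic matrix. It confirms that beliefs built as powers of P pass the check, while beliefs that reuse P at every horizon do not. Function names are assumptions.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def powers(p, horizon):
    """Return [None, P, P^2, ..., P^horizon], so that index s holds W_s."""
    out = [None, p]
    for _ in range(horizon - 1):
        out.append(matmul(out[-1], p))
    return out


def is_consistent(weights, tol=1e-12):
    """Check W_s = W_t W_{s-t} for all 2 <= s <= horizon and 1 <= t < s."""
    horizon = len(weights) - 1
    for s in range(2, horizon + 1):
        for t in range(1, s):
            product = matmul(weights[t], weights[s - t])
            for row_p, row_w in zip(product, weights[s]):
                if any(abs(u - v) > tol for u, v in zip(row_p, row_w)):
                    return False
    return True


P = [[0.9, 0.1], [0.2, 0.8]]               # hypothetical stochastic matrix
print(is_consistent(powers(P, 6)))         # beliefs of the form W_s = P^s
print(is_consistent([None, P, P, P]))      # beliefs that ignore compounding
```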

C Proof of Proposition 1

We prove a more general version of the result in Proposition 1. The proof has two main steps. First we find conditions under which all planners’ utility weights $a^{ij}_s$ are proportional to a common discount factor $\mu^s$ for large s. We then show that when these conditions are satisfied all non-dogmatic planners’ long-run SDRs are the same.

STEP 1:

Begin by defining the sequence of N ×N matrices

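$$\mathbf{F}_s := \begin{pmatrix} f^{11}_s & f^{12}_s & \dots & f^{1N}_s \\ f^{21}_s & f^{22}_s & \dots & f^{2N}_s \\ \vdots & \vdots & & \vdots \\ f^{N1}_s & f^{N2}_s & \dots & f^{NN}_s \end{pmatrix} \tag{A.4}$$

and the sequences of N × 1 vectors

$$\vec{V}_\tau = \begin{pmatrix} V^1_\tau \\ V^2_\tau \\ \vdots \\ V^N_\tau \end{pmatrix}, \qquad \vec{U}_\tau = \begin{pmatrix} U^1(c_\tau) \\ U^2(c_\tau) \\ \vdots \\ U^N(c_\tau) \end{pmatrix}. \tag{A.5}$$

Our general model (12) can be written as:

$$\vec{V}_\tau = \vec{U}_\tau + \sum_{s=1}^{\infty} \mathbf{F}_s \vec{V}_{\tau+s}. \tag{A.6}$$

We seek an equivalent representation of this system of the form

$$\vec{V}_\tau := \sum_{s=0}^{\infty} \mathbf{A}_s \vec{U}_{\tau+s}, \tag{A.7}$$

where $\mathbf{A}_s$ is a sequence of N × N matrices of the form,

$$\mathbf{A}_s := \begin{pmatrix} a^{11}_s & a^{12}_s & \dots & a^{1N}_s \\ a^{21}_s & a^{22}_s & \dots & a^{2N}_s \\ \vdots & \vdots & & \vdots \\ a^{N1}_s & a^{N2}_s & \dots & a^{NN}_s \end{pmatrix} \tag{A.8}$$

where $a^{ij}_s$ is the weight planner i at time τ assigns to theory j’s utility function at time τ + s, i.e., $U^j(c_{\tau+s})$.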

We now prove the following:

Proposition A.I. Assume that the condition (13) is satisfied, and that $f^{ii}_s > 0$ for all i = 1 . . . N, s = 1 . . . ∞. Construct a directed graph G with N nodes labelled 1, 2, . . . , N. Draw an edge from node i to node j ≠ i iff $f^{ij}_s > 0$ for at least one s ≥ 1. If G contains a directed cycle of length N, then there exists a µ ∈ (0, 1) such that

$$\lim_{s\to\infty} \frac{a^{ij}_s}{\mu^s} = K_{ij} > 0$$

where the $K_{ij}$ are finite constants.

Notice that the definition of non-dogmatic time preferences in (12) automatically im-

plies that the directed cycle condition in this proposition is satisfied (the graph G is com-

plete in this case, i.e., all edges exist). However, the directed cycle condition itself is

considerably weaker than is assumed in this definition.

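Proof. Substitute (A.7) into (A.6) to find

$$\sum_{s=0}^{\infty} \mathbf{A}_s \vec{U}_{\tau+s} = \vec{U}_\tau + \sum_{p=1}^{\infty} \mathbf{F}_p \left( \sum_{q=0}^{\infty} \mathbf{A}_q \vec{U}_{\tau+p+q} \right) \tag{A.9}$$

Equating coefficients of $\vec{U}_{\tau+s}$ in this expression, we see that $\mathbf{A}_s$ must satisfy

$$\mathbf{A}_0 = 1_N \tag{A.10}$$

$$\mathbf{A}_s = \sum_{p=1}^{s} \mathbf{F}_p \mathbf{A}_{s-p} \quad \text{for } s > 0, \tag{A.11}$$

where $1_N$ is the N × N identity matrix. The solution of this recurrence relation determines the utility weights $a^{ij}_s$. It will be convenient to split this matrix recurrence relation into a set of N vector recurrence relations as follows. Let $\vec{A}^j_s$ be the j-th column vector of $\mathbf{A}_s$, i.e.,

$$\vec{A}^j_s = \begin{pmatrix} a^{1j}_s \\ a^{2j}_s \\ \vdots \\ a^{Nj}_s \end{pmatrix}. \tag{A.12}$$

Define $\vec{e}_j$ to be the unit vector with elements

$$(\vec{e}_j)_i = \begin{cases} 0 & i \neq j \\ 1 & i = j \end{cases}$$

Then (A.11) is equivalent to the N vector recurrence relations

$$\vec{A}^j_0 = \vec{e}_j, \qquad \vec{A}^j_s = \sum_{p=1}^{s} \mathbf{F}_p \vec{A}^j_{s-p} \quad \text{for } s > 0, \tag{A.13}$$

for j = 1 . . . N.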
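The recurrence (A.10)-(A.11) can be iterated directly. The following sketch uses a hypothetical second-order model with two planners (the entries of `F` are toy numbers chosen so that every row of weights sums to less than one). It computes the utility weight matrices up to s = 200 and prints the ratio of successive weights, which in this example settles on a single value shared by all pairs (i, j). Function names are assumptions.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    """Entrywise sum of two matrices."""
    return [[u + v for u, v in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def utility_weights(F, horizon):
    """Return [A_0, ..., A_horizon] with A_0 = I and A_s = sum_p F_p A_{s-p}.

    F is the list [F_1, ..., F_M]; F_p is taken to be zero for p > M.
    """
    n = len(F[0])
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    A = [identity]
    for s in range(1, horizon + 1):
        total = [[0.0] * n for _ in range(n)]
        for p in range(1, min(s, len(F)) + 1):
            total = matadd(total, matmul(F[p - 1], A[s - p]))
        A.append(total)
    return A


# Toy weights: F_1 has positive off-diagonal entries, F_2 is diagonal.
F = [[[0.5, 0.1], [0.1, 0.4]],
     [[0.2, 0.0], [0.0, 0.2]]]
A = utility_weights(F, 200)
ratios = [[A[200][i][j] / A[199][i][j] for j in range(2)] for i in range(2)]
print(ratios)
```

The common limiting ratio plays the role of the discount factor µ in Proposition A.I for this toy example.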

The proof now has the following steps. We consider finite order models, i.e., $\mathbf{F}_{M'} = 0$ for all M′ greater than some finite M. We show that if a certain augmented matrix constructed from the matrices $\mathbf{F}_1, \dots, \mathbf{F}_M$ is primitive, all planners will have a common long-run pure time discount factor. A square matrix B is primitive if there exists an integer k > 0 such that $B^k > 0$. We then extend this result to infinite order models by taking an appropriate

limit of finite order models. Finally, we show that primitivity of the required matrices in

the infinite order case is ensured by the graph theoretic condition in the statement of the

proposition.

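Begin with the finite order case. Let $M = \max\{s \mid \exists\, i, j \;\; f^{ij}_s > 0\} < \infty$. In this case, for all s > M, (A.13) reduces to

$$\vec{A}^j_s = \sum_{p=1}^{M} \mathbf{F}_p \vec{A}^j_{s-p}. \tag{A.14}$$

Define the NM × NM matrix

$$\Phi_M = \begin{pmatrix} \mathbf{F}_1 & \mathbf{F}_2 & \dots & \mathbf{F}_{M-1} & \mathbf{F}_M \\ 1_N & 0 & \dots & 0 & 0 \\ 0 & 1_N & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \dots & 1_N & 0 \end{pmatrix} \tag{A.15}$$

where $1_N$ is the N × N identity matrix. In addition, define the ‘stacked’ vector

$$\vec{Y}^j_s = \begin{pmatrix} \vec{A}^j_s \\ \vec{A}^j_{s-1} \\ \vdots \\ \vec{A}^j_{s-M+1} \end{pmatrix}$$

Then we can rewrite the Mth order recurrence (A.14) as a first order recurrence as follows:

$$\vec{Y}^j_s = \Phi_M \vec{Y}^j_{s-1} \;\Rightarrow\; \vec{Y}^j_{M+s} = (\Phi_M)^s \vec{Y}^j_M. \tag{A.16}$$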

We now assume that ΦM is a primitive matrix. By the Perron-Frobenius theorem for

primitive matrices (Sternberg, 2014), this implies

1. ΦM has a positive eigenvalue, which we label as µ(M).

2. All other eigenvalues of ΦM have complex modulus strictly less than µ(M).

3. There exists a matrix C > 0 such that

$$\lim_{s\to\infty} \frac{\Phi_M^s}{[\mu(M)]^s} = C$$

4. µ(M) increases when any element of $\Phi_M$ increases.

5. 
$$\mu(M) < \max_i \sum_j \phi_{ij}, \tag{A.17}$$

where $\phi_{ij}$ is the ijth element of $\Phi_M$.

Since the first N elements of $\vec{Y}^j_s$ coincide with $a^{ij}_s$, the third of these conclusions implies that

$$\forall i, j, \quad \lim_{s\to\infty} \frac{a^{ij}_s}{[\mu(M)]^s} = C\,\vec{Y}^j_M > 0.$$

To bound the value of µ(M), note that from point 5 of the Perron-Frobenius theorem in (A.17), and the definition of $\Phi_M$ in (A.15), we have

$$\mu(M) < \max_i \left\{ \sum_{s=1}^{M} \sum_{j=1}^{N} f^{ij}_s \right\} \tag{A.18}$$

Thus, if

$$\sum_{s=1}^{\infty} \sum_{j=1}^{N} f^{ij}_s < 1 \tag{A.19}$$

for all i, µ(M) < 1, and hence $\lim_{s\to\infty} a^{ij}_s = 0$. Thus (13) guarantees that the time

preferences (12) are complete (i.e., finite on bounded consumption streams, and hence able

to rank arbitrary pairs of bounded consumption streams) for all finite M . This concludes

the finite M case.
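The finite order construction can be illustrated numerically. The sketch below assembles the block matrix of (A.15) for a hypothetical second-order, two-planner model (toy entries whose row sums are below one) and estimates its dominant eigenvalue by power iteration, a standard method that converges for primitive matrices. It is an illustration under these assumptions, not part of the proof, and the function names are hypothetical.

```python
def companion(F):
    """Stack F_1, ..., F_M (each N x N) into the NM x NM matrix of (A.15)."""
    M, N = len(F), len(F[0])
    size = N * M
    phi = [[0.0] * size for _ in range(size)]
    for p, block in enumerate(F):        # first block row holds F_1 ... F_M
        for i in range(N):
            for j in range(N):
                phi[i][p * N + j] = block[i][j]
    for r in range(N, size):             # identity blocks below the diagonal
        phi[r][r - N] = 1.0
    return phi


def dominant_eigenvalue(phi, iterations=2000):
    """Estimate the Perron-Frobenius eigenvalue by power iteration."""
    v = [1.0] * len(phi)
    mu = 0.0
    for _ in range(iterations):
        w = [sum(row[k] * v[k] for k in range(len(v))) for row in phi]
        mu = sum(w) / sum(v)             # growth factor of the iterate
        total = sum(w)
        v = [x / total for x in w]       # renormalise to avoid underflow
    return mu


# Toy weights: each planner's weights sum to 0.8 and 0.7 respectively.
F = [[[0.5, 0.1], [0.1, 0.4]],
     [[0.2, 0.0], [0.0, 0.2]]]
mu = dominant_eigenvalue(companion(F))
print(mu, mu < 1.0)
```

In this example the estimated eigenvalue lies strictly below one, in line with the conclusion that the utility weights vanish at long maturities when the sufficient condition holds.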

We now extend this result to the case of infinite M. Assume that there exists an M′ > 0 such that the matrix $\Phi_M$, defined in (A.15), is primitive for all M > M′. For M > M′, define

$$\vec{V}_\tau(M) = \vec{U}_\tau + \sum_{s=1}^{M} \mathbf{F}_s \vec{V}_{\tau+s}(M)$$

and let

$$\vec{V}_\tau = \lim_{M\to\infty} \vec{V}_\tau(M).$$

Define the equivalent representations of these preferences by

$$\vec{V}_\tau(M) = \sum_{s=0}^{\infty} \mathbf{A}_s(M)\, \vec{U}_{\tau+s} \tag{A.20}$$

$$\vec{V}_\tau = \sum_{s=0}^{\infty} \mathbf{A}_s \vec{U}_{\tau+s} \tag{A.21}$$

In addition, let µ(M) be the Perron-Frobenius eigenvalue of $\Phi_M$. We begin by proving that:

Lemma 3.

$$\mu := \lim_{M\to\infty} \mu(M) \text{ exists.} \tag{A.22}$$

Proof. Consider the eigenvalue µ(M + 1), where M > M′. This is the Perron-Frobenius eigenvalue of $\Phi_{M+1}$. The M-th order preferences $\vec{V}_\tau(M)$ are equivalent to an (M + 1)-th order model, with $\mathbf{F}_{M+1} = 0$. The matrix $\Phi_M$, which controls the asymptotic behavior of $\vec{V}_\tau(M)$, can thus be thought of as an N(M + 1) × N(M + 1) matrix, where the last N rows and columns are zeros. Call this matrix $\tilde{\Phi}_{M+1}$. The matrix $\Phi_{M+1}$, associated with the asymptotic behavior of $\vec{V}_\tau(M + 1)$, has entries that are strictly larger than those of $\tilde{\Phi}_{M+1}$ in at least some elements. Thus, by point 4 in our statement of the Perron-Frobenius theorem, µ(M + 1) > µ(M). We also know that µ(M) < 1 for all M. Since the sequence µ(M) is increasing and bounded above, the monotone convergence theorem implies that µ exists.

We have thus proved that if the matrices $\Phi_M$ are primitive for M > M′,

$$\lim_{M\to\infty} \lim_{s\to\infty} \frac{a^{ij}_{s+1}(M)}{a^{ij}_s(M)} = \lim_{M\to\infty} \mu(M) = \mu. \tag{A.23}$$

Note that since (A.17) and (A.19) are strict inequalities, µ < 1. We now wish to know whether it is also true that:

$$\lim_{s\to\infty} \lim_{M\to\infty} \frac{a^{ij}_{s+1}(M)}{a^{ij}_s(M)} = \mu. \tag{A.24}$$

That is, can we change the order of the limits in (A.23)? For limit operations to be interchangeable we require the sequence of functions they operate on to be uniformly convergent. The functions in question here are $V^i_\tau(M)$ and $V^i_\tau$, which we can think of as linear functions from the infinite dimensional space $\mathbb{R}^\infty \times \mathbb{R}^N = \{(\vec{U}_\tau, \vec{U}_{\tau+1}, \vec{U}_{\tau+2}, \dots)\}$ to $\mathbb{R}$. If the sequence of functions $V^i_\tau(M)$ converges uniformly to $V^i_\tau$ on any bounded subset of $\mathbb{R}^\infty \times \mathbb{R}^N$, then (A.24) will be satisfied. We now prove a second lemma:

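Lemma 4. Let B be a compact subset of $\mathbb{R}^\infty \times \mathbb{R}^N$, and assume that (13) is satisfied. Then $V^i_\tau(M)$ converges uniformly to $V^i_\tau$ on B.

Proof. Equation (A.13) shows that for all s ≤ M, $a^{ij}_{\tau+s}(M) = a^{ij}_{\tau+s}$. Let $\bar{U} = \max_j\{\sup_s\{U^j(c_{\tau+s})\}\}$ be the largest component of any $\vec{U} \in B$. For any $\vec{U} \in B$,

$$\sup_{\vec{U}\in B} \left| V^i_\tau(M) - V^i_\tau \right| = \sup_{\vec{U}\in B} \left| \sum_{s=1}^{\infty} \sum_{j=1}^{N} a^{ij}_{\tau+M+s}(M)\, U^j(c_{\tau+M+s}) - \sum_{s=1}^{\infty} \sum_{j=1}^{N} a^{ij}_{\tau+M+s}\, U^j(c_{\tau+M+s}) \right| \le \sum_{s=1}^{\infty} \sum_{j=1}^{N} \left[ \left| a^{ij}_{\tau+M+s}(M) \right| + \left| a^{ij}_{\tau+M+s} \right| \right] \bar{U}$$

By Lemma 3, µ < 1 also implies µ(M) < 1 for all M, so we know that $\lim_{M\to\infty} a^{ij}_{\tau+M+s}(M) = 0 = \lim_{M\to\infty} a^{ij}_{\tau+M+s}$ for all i, j. Thus

$$\lim_{M\to\infty} \sup_{\vec{U}\in B} \left| V^i_\tau(M) - V^i_\tau \right| = 0.$$

Hence $V^i_\tau(M)$ converges uniformly to $V^i_\tau$.

This concludes the infinite order case.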

The final step of the proof is to show that if the graph G, defined in the statement of

the proposition, has a directed cycle of length N , then there exists an M ′ > 0 such that for

all M > M ′ the matrix ΦM is primitive. We demonstrate this using a graphical argument.

Consider an arbitrary R × R matrix $B_{ij}$, and form a directed graph H(B) on nodes 1 . . . R, where there is an edge from node i to node j iff $B_{ij} > 0$. The matrix $B_{ij}$ is primitive if there exists an integer k ≥ 1 such that there is a path of length k from each node i to every other node j in H(B). If H(B) is strongly connected, i.e., there exists a path from every node to every other node, then a sufficient condition for $B_{ij}$ to be primitive is for there to be at least one node that is connected to itself.

Now consider our NM × NM matrices $\Phi_M$. To construct the directed graph $H(\Phi_M)$ associated with $\Phi_M$ in a convenient form, use the following procedure: Construct an M × N grid of nodes (where N is the number of planners), with node (m, n) representing planner n at time τ + m. For all m > 1, n, construct a directed edge from node (m, n) to node (m − 1, n). In addition, construct a directed edge from node (1, n) to node (m′, n′) if $f^{nn'}_{m'} > 0$.

As an example, take the case M = N = 3, i.e., a third order model with three planners. In this case $\Phi_M$ is a 9 × 9 matrix. Assume that $f^{ii}_s > 0$ for all i, s = 1 . . . 3, that $f^{12}_1, f^{23}_1, f^{31}_1 > 0$, and that $f^{ij}_s = 0$ otherwise. Figure F.1 represents the directed graph associated with the matrix $\Phi_3$ in this case.

Examination of the figure shows that since $f^{ii}_s > 0$, each of the ‘column’ subgraphs {(m, 1)}, {(m, 2)}, {(m, 3)}, m = 1 . . . 3 is strongly connected. Moreover, the cycle between columns (the red dashed edges) connects the columns to each other, and causes the entire graph to be strongly connected. Since each node in the first row is connected to itself, the matrix $\Phi_3$ in this example is primitive.
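The primitivity claim in this example can be verified mechanically. The sketch below builds the zero/one pattern of the 9 × 9 matrix described above and tests whether some power of it is strictly positive, searching up to Wielandt's bound of (n − 1)² + 1 on the exponent for an n × n primitive matrix. Only the sign pattern matters for primitivity, so the positive weights are all set to one; the helper names are hypothetical.

```python
def is_primitive(b):
    """True if some power of the non-negative square matrix b is strictly positive."""
    n = len(b)
    pattern = [[b[i][j] > 0 for j in range(n)] for i in range(n)]
    power = pattern
    for _ in range((n - 1) ** 2 + 1):    # Wielandt's bound on the exponent
        if all(all(row) for row in power):
            return True
        power = [[any(power[i][k] and pattern[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return False


def example_pattern(n, m, cross_edges):
    """Sign pattern of Phi_M when f^{ii}_s > 0 for all s, plus the given f^{ij}_1 > 0."""
    size = n * m
    phi = [[0] * size for _ in range(size)]
    for p in range(m):
        for i in range(n):
            phi[i][p * n + i] = 1        # f^{ii}_s > 0 for s = 1 ... M
    for i, j in cross_edges:
        phi[i][j] = 1                    # off-diagonal entries of F_1
    for r in range(n, size):
        phi[r][r - n] = 1                # identity blocks of (A.15)
    return phi


cycle = [(0, 1), (1, 2), (2, 0)]         # f^{12}_1, f^{23}_1, f^{31}_1 > 0
print(is_primitive(example_pattern(3, 3, cycle)))
print(is_primitive(example_pattern(3, 3, [])))   # no links between planners
```

Dropping the edges between planners leaves three disconnected columns, so the second matrix is not primitive, which illustrates why the directed cycle condition is needed.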

Returning to the general case, suppose that $f^{ii}_s > 0$ for all i and s. From the example in Figure F.1 it is clear that this implies that for each fixed i the subgraph $\{(m, i) \mid m = 1 \dots \infty\}$ is strongly connected, with each of the nodes (1, i) connected to itself. Thus, if there is a directed cycle between all of the ‘columns’ of the graph $H(\Phi_{M'})$ for some M′, then for all M > M′, $H(\Phi_M)$ is strongly connected, and contains nodes that are connected to themselves. Hence for all M > M′, $\Phi_M$ is a primitive matrix. This concludes the proof.

STEP 2:

We now show that when the conditions of Proposition A.I are satisfied, all non-dogmatic theories yield the same long-run SDR, and we compute an explicit formula for this consensus discount rate.

Figure F.1: The directed graph $H(\Phi_3)$ associated with the matrix in our example. The vertical black edges arise from the identity matrices in the definition of $\Phi_M$ (see (A.15)). The dashed blue edges arise from $f^{ii}_s > 0$, and the dashed red edges from $f^{12}_1, f^{23}_1, f^{31}_1 > 0$.

Begin by defining

ρ = − ln µ,

where µ is defined in (A.22). When the conditions of Proposition A.I hold we know that

$$a^{ij}_s \sim K_{ij}(s)\, e^{-\rho s} \tag{A.25}$$

where ∼ denotes asymptotic behaviour as s → ∞, and the multiplicative factors $K_{ij}(s)$ satisfy $\lim_{s\to\infty} \frac{1}{s} \ln K_{ij}(s) = 0$.

Now integrate the definition of $\eta^j(c)$ in (17) to find1

$$(U^j)'(c) = \exp\left( -\int_0^c \frac{\eta^j(x)}{x}\, dx \right).$$

Make the change of variables $x = c_\tau e^{g s'}$ in the integral in the exponent (recall that g is the long-run consumption growth rate), and evaluate $(U^j)'(c)$ at $c = c_\tau e^{g s}$ to find

$$(U^j)'(c_\tau e^{g s}) = \exp\left( -g \int_0^s \eta^j(c_\tau e^{g s'})\, ds' \right).$$

1In other words, solve the differential equation $-c\,(U^j)''/(U^j)' = \eta^j(c)$ for $(U^j)'(c)$.

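Defining

$$\eta^j = \begin{cases} \lim_{c\to\infty} \eta^j(c) & g > 0 \\ \lim_{c\to 0} \eta^j(c) & g < 0 \end{cases} \tag{A.26}$$

we see that the s → ∞ asymptotic behaviour of marginal utility is given by

$$(U^j)'(c_\tau e^{g s}) \sim L_j(s)\, e^{-g \eta^j s} \tag{A.27}$$

for some functions $L_j(s)$ that satisfy $\lim_{s\to\infty} \frac{1}{s} \ln L_j(s) = 0$. Combining (A.25) and (A.27),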

we find

ri(s) = −1

sln

(1

(U i)′(cτ )

N∑j=1

aijs (U j)′(cτ+s)

)

∼ −1

sln

(∑j

Kij(s)Lj(s)e−ρse−η

jgs

)

∼ ρ− 1

sln

(∑j

Kij(s)Lj(s)e−ηjgs

)

Define Kij(s) = Kij(s)Lj(s), and let q be the index of the planner with the lowest (highest)

value of ηj when g > 0 (g < 0). Then∑j

Kij(s)Lj(s)e−ηjgs =

∑j

Kij(s)e−ηjgs

= Kiq(s)e−ηqgs

(1 +

∑j 6=q

Kij(s)

Kiq(s)e−(ηj−ηq)gs

)

Since ηj − ηq > 0 for all j 6= q when g > 0, and ηj − ηq < 0 for all j 6= q when g < 0,∑j

Kij(s)Lj(s)e−ηjgs ∼ Kiq(s)e

−ηgs,

13

where η is given by (18). Thus

ri(s) ∼ ρ− 1

sln(Kiq(s)e

−ηgs)

⇒ lims→∞

ri(s) = ρ+ ηg.
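The limit derived above can be illustrated numerically. The sketch below is not part of the original appendix: it assumes iso-elastic marginal utilities with $c_\tau = 1$, constant prefactors standing in for $K^{ij}(s)L^j(s)$, and made-up parameter values, and simply evaluates $r^i(s)$ at increasing maturities.

```python
import math

def sdr(s, rho, etas, g, prefactors):
    """r(s) = -(1/s) ln( sum_j K_j e^{-rho s} e^{-eta_j g s} ), with c_tau = 1."""
    total = sum(k * math.exp(-rho * s) * math.exp(-eta * g * s)
                for k, eta in zip(prefactors, etas))
    return -math.log(total) / s

# All values below are hypothetical and chosen only for illustration.
rho = 0.01                     # consensus long-run pure rate of time preference
g = 0.02                       # long-run consumption growth rate (g > 0)
etas = [0.5, 1.0, 2.0]         # long-run elasticities of marginal utility
prefactors = [0.2, 0.5, 0.3]   # constant stand-ins for K^{ij}(s) L^j(s)

limit = rho + min(etas) * g    # rho + eta_hat * g when g > 0
rates = {s: sdr(s, rho, etas, g, prefactors) for s in (10, 100, 1000, 10000)}
for s, r in rates.items():
    print(s, round(r, 5), "limit:", limit)
```

With $g > 0$ the term with the smallest elasticity dominates the sum, so the computed rate approaches $\rho + \min_j \hat\eta^j g$; the remaining gap shrinks like $1/s$ because of the constant prefactor.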

D Consensus long-run SDRs under uncertainty

It is straightforward to extend the proof of Proposition 1 to the case where future consumption is uncertain. If consumption is uncertain, non-dogmatic planners’ IWFs are simply the expectation over their deterministic IWFs, i.e.,

$$V^i_\tau = E_{c_{\tau+1}, c_{\tau+2}, \dots} \sum_{s=0}^{\infty} \sum_{j=1}^{N} a^{ij}_s\, U^j(c_{\tau+s}),$$

where $E_{c_{\tau+1}, c_{\tau+2}, \dots}$ denotes the expectation over future consumption values, and the coefficients $a^{ij}_s$ are determined by the dynamical system in (A.11), as in the deterministic case.

The analysis of the consensus long-run SDR now proceeds in close analogy to the second part of the proof of Proposition 1. The consensus long-run pure rate of social time preference is unchanged; however, examination of the proof shows that we need to account for the effect of expectations on the growth terms in the Ramsey rule.

Under uncertainty planners’ marginal rates of substitution between consumption today and consumption $s$ years from now are given by:

$$e^{-r^i(s) s} = MRS^i_s = \frac{\sum_{j=1}^N a^{ij}_s\, E_{c_{\tau+s}} (U^j)'(c_{\tau+s})}{(U^i)'(c_\tau)} \tag{A.28}$$

Define a planner specific ‘certainty equivalent’ long-run growth rate $\hat g^j$ by requiring that

$$(U^j)'(e^{\hat g^j s} c_\tau) \equiv E_g (U^j)'(e^{g s} c_\tau) \tag{A.29}$$

as $s \to \infty$, i.e.,

$$\hat g^j \equiv \lim_{s\to\infty} \frac{1}{s} \log\left[\left((U^j)'\right)^{-1}\left(E_g (U^j)'(e^{g s} c_\tau)\right)\right]. \tag{A.30}$$

The long-run consumption growth rate $g$ is uncertain in this expression, and $E_g$ denotes expectations over the value of $g$. In analogy with (A.26), define

$$\hat\eta^j(\hat g^j) = \begin{cases} \lim_{c\to\infty} \eta^j(c) & \hat g^j > 0 \\ \lim_{c\to 0} \eta^j(c) & \hat g^j < 0. \end{cases}$$

Then for large $s$, we know from (A.27) that

$$E_{c_{\tau+s}} (U^j)'(c_{\tau+s}) = (U^j)'(e^{\hat g^j s} c_\tau) \sim e^{-\hat g^j \hat\eta^j(\hat g^j) s}$$

where $\sim$ denotes $s \to \infty$ asymptotic behaviour, as before.

As in the deterministic case, we see from (A.28) that planner $i$’s long-run elasticity of marginal utility is determined by the term that dominates the sum

$$\sum_{j=1}^N a^{ij}_s\, E_{c_{\tau+s}} (U^j)'(c_{\tau+s}) \sim \sum_j a^{ij}_s\, e^{-\hat g^j \hat\eta^j(\hat g^j) s}$$

as $s \to \infty$. This sum is dominated by the exponential with the minimum value of $\hat g^j \hat\eta^j(\hat g^j)$ (which may be negative), for all $i$. We thus conclude that the consensus long-run SDR under uncertainty is given by

$$\rho + \min_i \{\hat g^i \hat\eta^i(\hat g^i)\} \tag{A.31}$$

As an example of the application of this formula suppose that planners’ utility functions are iso-elastic with elasticities of marginal utility $\eta_i$, i.e., $(U^i)'(c) = c^{-\eta_i}$. In addition, assume that consumption growth is asymptotically log-normally distributed, i.e.,

$$\log g \sim \mathcal{N}(\mu, \sigma^2).$$

From (A.29) planner $i$’s certainty equivalent long-run growth rate $\hat g^i$ is thus defined by requiring that at large $s$,

$$e^{-\eta_i \hat g^i s} (c_\tau)^{-\eta_i} \equiv E_g\, e^{-\eta_i g s} (c_\tau)^{-\eta_i} = e^{-(\eta_i \mu - \frac{1}{2}\eta_i^2 \sigma^2) s} (c_\tau)^{-\eta_i} \;\Rightarrow\; \hat g^i = \mu - \frac{1}{2}\eta_i \sigma^2.$$

Since elasticities of marginal utility are constant by assumption we know that $\hat\eta^i(\hat g^i) = \eta_i$, and thus the consensus long-run SDR in this example is given by

$$\rho + \min_i \left\{\mu \eta_i - \frac{1}{2}\eta_i^2 \sigma^2\right\}.$$
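The closed form in this example is easy to evaluate. The following sketch is an illustration rather than the paper’s code: the function names and all parameter values are assumptions. It computes each planner’s certainty-equivalent growth rate $\mu - \frac{1}{2}\eta_i\sigma^2$ and the resulting consensus rate.

```python
def certainty_equivalent_growth(eta, mu, sigma):
    """g_hat_i = mu - 0.5 * eta_i * sigma^2 for iso-elastic utility."""
    return mu - 0.5 * eta * sigma ** 2

def consensus_sdr_lognormal(rho, etas, mu, sigma):
    """rho + min_i { mu * eta_i - 0.5 * eta_i^2 * sigma^2 }."""
    return rho + min(eta * certainty_equivalent_growth(eta, mu, sigma)
                     for eta in etas)

# Hypothetical inputs: rho = 1%/yr, mu = 2%/yr, sigma = 3%/yr, two planners.
rho, mu, sigma = 0.01, 0.02, 0.03
etas = [1.0, 2.0]
consensus = consensus_sdr_lognormal(rho, etas, mu, sigma)
print(consensus)
```

Because the growth term $\mu\eta_i - \frac{1}{2}\eta_i^2\sigma^2$ is not monotone in $\eta_i$, the planner who determines the consensus rate need not be the one with the smallest elasticity once $\sigma$ is large.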


E Proof of Proposition 2

Part 1 of the proposition is immediate from point 4 in our statement of the Perron-Frobenius theorem in Proposition A.I. Part 2 of the proposition follows from the fact that the eigenvalues of a matrix are continuous in its entries. Consider a set of $N$ ‘dogmatic’ models, in which each planner assigns weight only to her own theory in future periods. This set of $N$ independent planners’ time preferences can be represented as a single non-dogmatic set of $N$ planners as in (12), but where $f^{ij}_s = 0$ if $j \neq i$. As in the proof of Proposition A.I, begin by considering a model of finite order $M$, so that no planner places any weight on any IWF more than $M$ years ahead. Equation (A.16) shows that the asymptotic behaviour of such a model can be described by first order difference equations of the form:

$$\vec Y^j_s = \Phi^0_M \vec Y^j_{s-1}.$$

In this case however, the matrix $\Phi^0_M$, defined in (A.15), is reducible. The largest eigenvalue of $\Phi^0_M$ is the rate of decline of the utility weights of the most patient dogmatic planner in the long-run. As $M \to \infty$, the set of eigenvalues of $\Phi^0_M$ contains $\mu^i_1$, the long-run utility discount factor of planner $i$, and all eigenvalues of $\Phi^0_M$ are less than or equal to $\max_i \{\mu^i_1\}$.

Now consider the continuous set of models with weights $f^{ij}_s(\epsilon)$, where $\epsilon > 0$. Let $\Phi_M(\epsilon)$ be the corresponding $\Phi_M$ matrix for this set of models, where by assumption $\lim_{\epsilon \to 0^+} \Phi_M(\epsilon) = \Phi^0_M$. The consensus long-run discount factor in model $\epsilon$ of order $M$, denoted $\mu_1(\epsilon, M)$, is the largest eigenvalue of $\Phi_M(\epsilon)$. Define

$$\mu_1(\epsilon) = \lim_{M\to\infty} \mu_1(\epsilon, M).$$

We know that this limit exists, due to the proof of Proposition A.I. Since the matrix $\Phi_M(\epsilon)$ is continuous in $\epsilon > 0$, and in the limit as $M \to \infty$ the largest eigenvalue of $\Phi_M(0) = \Phi^0_M$ is equal to $\max_i \{\mu^i_1\}$, we must have

$$\lim_{\epsilon \to 0^+} \mu_1(\epsilon) = \max_i \{\mu^i_1\}.$$

Since $\rho(\epsilon) = -\ln \mu_1(\epsilon)$ by definition, the result follows.
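The continuity argument can be seen in a toy case. The sketch below is an assumption-laden illustration, not the matrix $\Phi_M$ of the paper: it takes two planners with hypothetical dogmatic discount factors, couples them through a small off-diagonal weight $\epsilon$, and uses the closed-form largest eigenvalue of a $2 \times 2$ matrix.

```python
import math

def largest_eigenvalue_2x2(a, b, c, d):
    """Largest eigenvalue of [[a, b], [c, d]]; real whenever b * c >= 0."""
    half_trace = 0.5 * (a + d)
    return half_trace + math.sqrt((0.5 * (a - d)) ** 2 + b * c)

def consensus_rate(mu_1, mu_2, eps):
    """-ln of the largest eigenvalue when each planner shifts weight eps to the other.

    The perturbation (row i scaled by mu_i, weight 1 - eps on self, eps on
    the other planner) is a hypothetical choice made for this example.
    """
    top = largest_eigenvalue_2x2(mu_1 * (1 - eps), mu_1 * eps,
                                 mu_2 * eps, mu_2 * (1 - eps))
    return -math.log(top)

mu_1, mu_2 = 0.97, 0.95                 # hypothetical dogmatic discount factors
dogmatic_rate = -math.log(max(mu_1, mu_2))
for eps in (1e-1, 1e-2, 1e-4, 1e-8):
    print(eps, consensus_rate(mu_1, mu_2, eps), dogmatic_rate)
```

As $\epsilon$ shrinks, the computed rate approaches the rate of the most patient planner, which is the content of Part 2 in this two-planner toy case.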


F Comparative statics of the consensus long-run pure rate of social time preference

It is naturally of interest to ask how the consensus long-run pure rate of social time preference $\rho$ depends on the intertemporal weights $f^{ij}_s$. Unfortunately strong comparative statics results on this question are likely out of reach. Technically, we need to understand how the spectral radius (i.e., largest eigenvalue) of the matrices $\Phi_M$ from Proposition A.I behaves when we spread out or contract the distribution of weights $f^{ij}_s$. In order to sign the effect of a spread in the weights we require something akin to a convexity property for the spectral radius. Unfortunately, it is known that the spectral radius of a matrix is a convex function of its diagonal elements, but not of the off-diagonal elements (Friedland, 1981).²

This section describes a special case of the model in which clean comparative statics are possible. Assume that planner $i$’s intertemporal weights $f^{ij}_s$ depend on a parameter $\lambda_i \in \mathbb{R}_+$, i.e., $f^{ij}_s = f^{ij}_s(\lambda_i)$. Let $\vec\lambda = (\lambda_1, \dots, \lambda_N)$ be the vector of planners’ $\lambda$ parameters, and assume that $\vec\lambda$ takes values in a convex subset of $\mathbb{R}^N_+$. Using the notation of Proposition A.I we write the matrix of weights $f^{ij}_s$ at a fixed value of $s$ as $F_s(\vec\lambda)$, where we now emphasize the dependence of these weights on the parameter vector $\vec\lambda$. We will say that preferences are symmetric in $\vec\lambda$ iff for all permutation matrices³ $P$,

$$F_s(P\vec\lambda) = P F_s(\vec\lambda) P^T \tag{A.32}$$

for all $s$, where $P^T$ is the transpose of $P$. Intuitively, if preferences are symmetric in $\vec\lambda$, switching any two planners’ values of $\lambda$ is equivalent to switching their entire set of intertemporal weights, as this induces a permutation of the weight matrix $F_s(\vec\lambda)$. The parameters $\lambda_i$ are thus ‘sufficient statistics’ for planners’ intertemporal weights, and switching $\lambda_i \leftrightarrow \lambda_j$ is equivalent to relabelling $i \leftrightarrow j$.

As an example of preferences that are symmetric in $\vec\lambda$ consider the following:

$$f^{ij}_s = \begin{cases} \beta(s, \lambda_i)\, x_s & j = i \\ \beta(s, \lambda_i)\, \dfrac{1 - x_s}{N - 1} & j \neq i \end{cases} \tag{A.33}$$

where $x_s \in [1/N, 1)$ for all $s = 1, \dots, \infty$, and $\sum_{s=1}^{\infty} \beta(s, \lambda) < 1$ for all $\lambda \in I \subset \mathbb{R}_+$. In this model the time dependence of planners’ intertemporal weights $f^{ij}_s$ has a common functional form, given by a discount function $\beta(s, \lambda)$ on the IWF of selves $s$ years in the future, where $\lambda > 0$ is a parameter. Variations in planners’ attitudes to time are solely due to differences in their values of $\lambda$. The parametric model defined in (22), which we used in Section IV of the paper, is of this form if $\gamma_i = \gamma$ for all $i$.

² Similarly, it is not possible to sign the effect of premultiplying $\Phi_M$ by a doubly stochastic matrix, as the spectral radius of a product of two matrices is not sub-multiplicative in general. Gelfand’s formula shows that the spectral radius of a matrix product is sub-multiplicative if the matrices in question commute, but this is not much use for our purposes.

³ A square matrix is a permutation matrix if each of its rows and each of its columns contains exactly one entry of 1, and zeros elsewhere.
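The symmetry condition (A.32) can be checked directly for the weights in (A.33). The sketch below is illustrative only: the exponential discount function, the value of $x_s$, the $\lambda$ values and the helper names are all assumptions, and a permutation is represented as an index list rather than a matrix.

```python
def weight_matrix(lams, beta, x):
    """F_s at one fixed s under (A.33): row i is scaled by beta(lambda_i)."""
    n = len(lams)
    return [[beta(lams[i]) * (x if i == j else (1 - x) / (n - 1))
             for j in range(n)] for i in range(n)]

def permute_vector(vec, perm):
    """(P v)[i] = v[perm[i]] for the permutation matrix with P[i][perm[i]] = 1."""
    return [vec[p] for p in perm]

def conjugate(matrix, perm):
    """(P M P^T)[i][j] = M[perm[i]][perm[j]] for the same permutation matrix."""
    n = len(perm)
    return [[matrix[perm[i]][perm[j]] for j in range(n)] for i in range(n)]

s = 3                                        # hypothetical maturity
beta = lambda lam: (1 + lam) ** (-s)         # hypothetical discount function
lams = [0.01, 0.03, 0.05]                    # hypothetical lambda vector
perm = [2, 0, 1]                             # a cyclic relabelling of planners

lhs = weight_matrix(permute_vector(lams, perm), beta, x=0.9)   # F_s(P lambda)
rhs = conjugate(weight_matrix(lams, beta, x=0.9), perm)        # P F_s(lambda) P^T
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
print(max_gap)
```

The two matrices coincide because row $i$ of $F_s$ depends on $\vec\lambda$ only through $\lambda_i$ and treats all other planners identically.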

Let $\rho(\vec\lambda)$ be the consensus long-run pure rate of time preference in a model that is characterized by the parameter vector $\vec\lambda$.

Proposition A.II. Assume that planners’ time preferences are symmetric in $\vec\lambda$ and that $f^{ij}_s(\lambda)$ is strictly log-convex in $\lambda > 0$ for all $i, j, s$. Then if the parameter vector $\vec\lambda_A$ majorizes⁴ $\vec\lambda_B$,

$$\rho(\vec\lambda_A) < \rho(\vec\lambda_B).$$

In words, this result says that if preferences are symmetric in $\vec\lambda$, intertemporal weights are log-convex functions of $\lambda$, and planners in group A disagree more about the parameter $\lambda$ than planners in group B, the consensus long-run pure rate of time preference will be lower in group A than in group B.

I will provide some interpretation of the log-convexity condition in examples below, but first we turn to the proof.

Proof. The proof relies on the following result due to Kingman (1961): Let $b_{ij}(\theta) \geq 0$ be the elements of a non-negative matrix $B$, where $\theta \in \mathbb{R}$ is a parameter. If $b_{ij}(\theta)$ is log-convex in $\theta$ for all $i, j$, the spectral radius of $B$ is a log-convex function of $\theta$. Remark 1.3 in Nussbaum (1986) observes that Kingman’s result can be extended as follows: Let $\vec\theta$ be a vector of parameters that takes values in a convex set, and assume that the elements $b_{ij}(\vec\theta) \geq 0$ of a matrix $B$ are log-convex functions of $\vec\theta$. Then the spectral radius of $B$ is log-convex in $\vec\theta$.

We will employ the usual trick of working with finite order models first (i.e., setting $f^{ij}_s$ to zero for $s > M$), and taking a limit as $M \to \infty$ at the end. The consensus long-run pure rate of time preference in a model of order $M$ is determined by the largest eigenvalue of $\Phi_M$, defined in (A.15). Denote this eigenvalue by $\mu_M(\vec\lambda)$. If the matrix elements $f^{ij}_s(\lambda)$ are log-convex functions of the scalar variable $\lambda$, then $f^{ij}_s(\vec\lambda) = f^{ij}_s(\lambda_i)$ is also a log-convex function of the vector of parameters $\vec\lambda$. Thus, if $f^{ij}_s(\lambda)$ is log-convex (or identically zero) for all $i, j, s$, $\mu_M(\vec\lambda)$ is a log-convex function of $\vec\lambda$.

The final step of the proof is to observe that because of the symmetry of the set of intertemporal weights in (A.32) the spectral radius must be a symmetric function of $\vec\lambda$, i.e., any permutation of the elements of $\vec\lambda$ will leave the spectral radius unchanged. This follows since the eigenvalues of a matrix are invariant under the permutations (A.32). Since $\mu_M(\vec\lambda)$ is a log-convex, symmetric function of $\vec\lambda$, its log is Schur-convex. Since $\mu_M(\vec\lambda) = e^{-\rho_M(\vec\lambda)}$, this implies that $\rho_M(\vec\lambda)$ is Schur-concave in $\vec\lambda$. Thus by the properties of Schur-concave functions, if $\vec\lambda_A$ majorizes $\vec\lambda_B$ we must have

$$\rho_M(\vec\lambda_A) < \rho_M(\vec\lambda_B).$$

The final result follows by taking the limit as $M \to \infty$.

⁴ $\vec\lambda_A$ majorizes $\vec\lambda_B$ iff there exists a doubly stochastic matrix $H$ such that $\vec\lambda_B = H\vec\lambda_A$. Intuitively, the elements of $\vec\lambda_A$ are ‘more spread out’ than those of $\vec\lambda_B$, and the sums of their elements are equal. See e.g. Marshall (2010) for a discussion of majorization and its relationship to e.g. stochastic orders and inequality measures.

As an initial example of the application of this result, consider a model in which the discount function $\beta(s, \lambda)$ in the example in (A.33) declines exponentially, i.e.,

$$\beta(s, \lambda) = (1 + \lambda)^{-s}.$$

This discount function satisfies $\log \beta(s, \lambda) = -s \log(1 + \lambda)$, which is strictly convex in $\lambda$. Thus the result applies: more disagreement about the parameter $\lambda$ decreases the consensus long-run pure rate of social time preference.
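A two-planner example makes the comparative static concrete. The sketch below rests on simplifying assumptions that are not in the text: the model is truncated to first order (only next period’s IWFs receive weight, so the coefficients evolve as $a_s = F_1 a_{s-1}$ and the long-run discount factor is the spectral radius of $F_1$), weights follow (A.33) with $\beta(1, \lambda) = (1 + \lambda)^{-1}$, and all numbers are hypothetical.

```python
import math

def largest_eigenvalue_2x2(a, b, c, d):
    """Largest eigenvalue of [[a, b], [c, d]]; real whenever b * c >= 0."""
    half_trace = 0.5 * (a + d)
    return half_trace + math.sqrt((0.5 * (a - d)) ** 2 + b * c)

def consensus_rate(lam_1, lam_2, x):
    """-ln(spectral radius of F_1) for two planners with weights as in (A.33)."""
    b1, b2 = 1.0 / (1.0 + lam_1), 1.0 / (1.0 + lam_2)
    # With N = 2 the off-diagonal share (1 - x)/(N - 1) is simply 1 - x.
    mu = largest_eigenvalue_2x2(b1 * x, b1 * (1 - x), b2 * (1 - x), b2 * x)
    return -math.log(mu)

x = 0.9
rate_spread = consensus_rate(0.01, 0.05, x)   # lambda_A = (0.01, 0.05)
rate_equal = consensus_rate(0.03, 0.03, x)    # lambda_B = (0.03, 0.03)
print(rate_spread, rate_equal)
```

Here $\vec\lambda_A$ majorizes $\vec\lambda_B$ (same sum, more spread), and the computed consensus rate is lower for the more dispersed vector, in line with Proposition A.II.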

We can extend this finding to a more general class of models by assuming that $\beta(s, \lambda) = \beta(\lambda s)$, i.e., the parameter $\lambda$ acts to rescale the time variable $s$. Following Prelec (2004) we will say that $\beta(s)$ exhibits decreasing impatience if $\log \beta(s)$ is a convex function of $s$ for $s > 0$. Discount functions that exhibit decreasing impatience have the form $\beta(s) = e^{-h(s)}$ where $h(s)$ is a concave function. The rate of increase of $h(s)$ (which measures impatience) slows as the time horizon $s$ increases.

Corollary 1. Assume that $\beta(s)$ exhibits decreasing impatience, and that the parameter vector $\vec\lambda_A$ majorizes $\vec\lambda_B$. Then

$$\rho(\vec\lambda_A) < \rho(\vec\lambda_B).$$

Thus, for example, in a hyperbolic model (see e.g. Prelec, 2004) we would have

$$\beta(s) = (1 + s)^{-(1+p)} \;\Rightarrow\; \beta(s, \lambda) = \beta(\lambda s) = (1 + \lambda s)^{-(1+p)}$$

where $p > 0$ is a parameter. $\beta(s)$ is log-convex in $s$, so more disagreement about $\lambda$ reduces the consensus long-run pure rate of time preference in this model.

G Details of calibration

The data I use to calibrate the model and generate the results in Figures 1a and 1b are taken from a recent survey by Drupp et al. (2018). They surveyed expert economists who have published papers on social discounting, asking for their opinions on, amongst other things, the appropriate values of the pure rate of social time preference and the elasticity of marginal social utility. The distribution of respondents’ views on these two parameters is plotted in Figure F.2.

The calibration assumption I use is that the data in Figure F.2 correspond to ‘dogmatic’

views on the IWF, and in particular that these data correspond to the parameters of a

discounted utilitarian IWF with iso-elastic utility function. This assumption is consistent

both with the survey authors’ description of what they aim to elicit in their survey, and

with the participants’ responses. See footnote 19 of the main text for further explanation.

The calibration is made slightly delicate by the fact that there is no version of the

model in (12) in which planners place non-zero weight on all future selves that reduces

to a discounted utilitarian IWF. I calibrate the parametric model in (22) so that when

the weight on own preferences x = 1, planners’ time preferences can be represented by a

function that is a close approximation to a discounted utilitarian IWF, but still assigns

non-zero weight to all future selves.

To calibrate the values of $\gamma_i, \alpha_i$ in (22), I use the fact that when $x = 1$ the model reduces to a set of $N$ independent intertemporal preferences of the form:

$$V^i_\tau = U^i(c_\tau) + \gamma_i \sum_{s=1}^{\infty} (\alpha_i)^s V^i_{\tau+s}, \tag{A.34}$$

where $\alpha_i \in (0, 1)$ and $\gamma_i \in \left(0, \frac{1 - \alpha_i}{\alpha_i}\right)$. These time preferences have been studied by Saez-Marti & Weibull (2005), and axiomatized by Galperti & Strulovici (2017). It is straightforward to show that they have the following equivalent representation:

$$V^i_\tau = U^i(c_\tau) + \sum_{s=1}^{\infty} \kappa_i^s \left(\frac{1 + \gamma_i}{\gamma_i}\right)^{s-1} U^i(c_{\tau+s}), \quad \text{where } \kappa_i = \alpha_i \gamma_i. \tag{A.35}$$
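The equivalence of (A.34) and (A.35) can be checked numerically by substituting (A.34) into itself. In the sketch below (an illustration with hypothetical parameter values and function names), the weight $b_s$ on $U^i(c_{\tau+s})$ implied by the recursion satisfies $b_0 = 1$ and $b_s = \gamma_i \sum_{k=1}^{s} (\alpha_i)^k b_{s-k}$, and is compared with the closed form in (A.35).

```python
import math

def weights_recursive(alpha, gamma, horizon):
    """Utility weights obtained by repeatedly substituting (A.34) into itself."""
    b = [1.0]
    for s in range(1, horizon + 1):
        b.append(gamma * sum(alpha ** k * b[s - k] for k in range(1, s + 1)))
    return b

def weights_closed_form(alpha, gamma, horizon):
    """Utility weights read off from (A.35), with kappa = alpha * gamma."""
    kappa = alpha * gamma
    return [1.0] + [kappa ** s * ((1 + gamma) / gamma) ** (s - 1)
                    for s in range(1, horizon + 1)]

# Hypothetical calibration: rho_i = 2%/yr and 1/gamma_i = 0.1%.
gamma = 1000.0
alpha = math.exp(-0.02) / gamma
recursive = weights_recursive(alpha, gamma, 20)
closed = weights_closed_form(alpha, gamma, 20)
print(recursive[:4])
print(closed[:4])
```

The two sequences agree term by term, and the ratio of successive weights settles at $\kappa_i (1 + \gamma_i)/\gamma_i$ from the second period onwards.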

[Figure F.2: scatter plot. Horizontal axis: Pure Rate of Social Time Preference, %/yr ($\rho_i$). Vertical axis: Elasticity of Marginal Utility ($\eta_i$).]

Figure F.2: Experts’ recommended values for the pure rate of social time preference ($\rho_i$), and the elasticity of marginal utility ($\eta_i$) for appraisal of long-run public projects, from the Drupp et al. (2018) survey. 173 responses were recorded. The dashed box depicts data points that fall inside the 5-95% ranges of both parameters. The red cross indicates the location of the median values of $\rho_i$ and $\eta_i$.

Writing out the sequence of intertemporal utility weights in this model explicitly,

$$1,\; \kappa_i,\; \left(\frac{1 + \gamma_i}{\gamma_i}\right)\kappa_i^2,\; \left(\frac{1 + \gamma_i}{\gamma_i}\right)^2 \kappa_i^3,\; \left(\frac{1 + \gamma_i}{\gamma_i}\right)^3 \kappa_i^4,\; \dots, \tag{A.36}$$

it is clear that if we take the limit as $\gamma_i \to \infty$ of this model holding $\kappa_i$ fixed, we recover discounted utilitarian time preferences with discount factor $\kappa_i$. For any finite $\gamma_i$ the preferences in (A.35) are quasi-hyperbolic, with a short run pure time discount factor given by $\kappa_i$, and a long-run pure time discount factor given by $\left(\frac{1 + \gamma_i}{\gamma_i}\right)\kappa_i$.

Recall that the data in Figure F.2 correspond to the parameters of a discounted utilitarian IWF, and that our calibration assumption is that these data correspond to the $x \to 1$ limit of the non-dogmatic model (22). The sequence in (A.36) shows that to ensure consistency with the calibration assumption we must calibrate $\kappa_i$ so that

$$\kappa_i = e^{-\rho_i}, \tag{A.37}$$

where $\rho_i$ is survey respondent $i$’s recommended value for the pure rate of social time preference. In addition, we must choose $\gamma_i$ sufficiently large that the model closely approximates discounted utilitarian time preferences. Notice from (A.36) that the discount factor of planner $i$ for $s > 1$ is given by

$$(1 + \gamma_i^{-1})\kappa_i \approx e^{-(\rho_i - \gamma_i^{-1})}$$

when $\gamma_i^{-1}$ is small. Thus $\gamma_i^{-1} = 1\%$, for example, corresponds to a discount rate on the long-run future that differs from the short run discount rate $\rho_i$ by 1%/yr (since $1 + \gamma_i^{-1} > 1$, the long-run rate is the lower of the two). Thus if $\gamma_i^{-1}$ is too large, the model will provide a poor fit to a discounted utilitarian IWF when $x = 1$, since non-dogmatic planners will exhibit sharply quasi-hyperbolic time preferences in this case. To ensure that the model is a close approximation to discounted utilitarianism when $x = 1$, but also that all planners place non-zero weight on all future selves’ IWFs (which requires $\gamma_i$ be finite), we must pick $\gamma_i^{-1}$ to be small but non-zero for all $i$, i.e., $\gamma_i^{-1} \approx 0.1\%$. The numerical results presented in the paper are robust to heterogeneity in $\gamma_i^{-1}$, provided that none of these parameters is too large relative to respondents’ pure rates of social time preference. As stated, $\gamma_i^{-1}$ must be small if the calibrated model is to provide a good approximation to discounted utilitarian IWFs at $x = 1$.
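The calibration step described above amounts to a few lines of arithmetic. The sketch below is a hypothetical helper, not the author’s code: it maps a respondent’s $\rho_i$ and a chosen $\gamma_i^{-1}$ into $\kappa_i$, $\gamma_i$ and $\alpha_i$, checks the parameter restrictions stated with (A.34), and reports the discount rate implied by the per-period factor $(1 + \gamma_i^{-1})\kappa_i$ that applies for $s > 1$ in (A.36).

```python
import math

def calibrate(rho, gamma_inv=0.001):
    """Map a surveyed rho_i into (kappa_i, gamma_i, alpha_i) for the x = 1 model.

    The default gamma_inv = 0.1% follows the text; the function name and
    return format are assumptions made for this illustration.
    """
    gamma = 1.0 / gamma_inv
    kappa = math.exp(-rho)                 # (A.37)
    alpha = kappa / gamma                  # from kappa_i = alpha_i * gamma_i
    # Restrictions stated with (A.34); they hold iff rho > ln(1 + gamma_inv).
    if not (0.0 < alpha < 1.0 and gamma < (1.0 - alpha) / alpha):
        raise ValueError("alpha and gamma violate the restrictions in (A.34)")
    long_run_factor = (1.0 + gamma_inv) * kappa   # weight ratio for s > 1 in (A.36)
    return {"kappa": kappa, "gamma": gamma, "alpha": alpha,
            "long_run_rate": -math.log(long_run_factor)}

params = calibrate(0.02)
print(params)
```

For a given $\gamma_i^{-1}$ the restriction $\gamma_i < (1 - \alpha_i)/\alpha_i$ binds when $\rho_i$ is very close to zero, which is why this sketch raises an error rather than silently returning parameters in that case.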

In addition, I assume in line with Drupp et al. (2018) that planners’ utility functions are iso-elastic, i.e.,

$$U^i(c) = \frac{c^{1 - \eta_i}}{1 - \eta_i} \tag{A.38}$$

for some $\eta_i > 0$. This implies that the elasticity of marginal utility is constant and equal to $\eta_i$, and I simply calibrate $\eta_i$ to be each respondent’s preferred value of this elasticity.

The requirement that the calibrated model provide a close approximation to discounted utilitarian IWFs in an appropriate ‘dogmatic’ limit implies that the results depicted in Figure 1a are robust to alternative specifications of the weights $w^{ij}_s$ for $s > 1$. The reason for this is that, as discussed above (and as is evident from (A.36)), in order for the model to closely approximate discounted utilitarian IWFs at $x = 1$, the calibrated values of $\gamma_i$ must be large, which in turn implies that the values of $\alpha_i$ must be correspondingly small since $\kappa_i = e^{-\rho_i} = \gamma_i \alpha_i$, where $\rho_i$ is the observed pure time preference rate recommendation of respondent $i$. Now notice that the models in (22) can be written as

$$V^i_\tau = U^i(c_\tau) + \gamma_i \left[\alpha_i \sum_{j=1}^N w^{ij}_1 V^j_{\tau+1} + (\alpha_i)^2 \sum_{j=1}^N w^{ij}_2 V^j_{\tau+2} + O((\alpha_i)^3)\right].$$

Since $(\alpha_i)^s \ll \alpha_i$ for all $s \geq 2$ if $\alpha_i \ll 1$, it does not much matter how the weights $w^{ij}_s$ behave for $s \geq 2$. Even if a weight $x$ is given to current preferences at every future maturity, i.e.,

$$w^{ij}_s = \begin{cases} x & i = j \\ \dfrac{1 - x}{N - 1} & i \neq j \end{cases} \tag{A.39}$$

for all $s \geq 1$, the results of the simulations hardly change.⁵

H Changing the model’s time step

This section of the appendix describes how to transform the parameters of the model used in Figure 1 when the time step is changed.

For the version of the model in question planners’ time preferences took the form

$$V^i_\tau = U^i(c_\tau) + \gamma_i \left[\alpha_i \sum_{j=1}^N (P)_{i,j} V^j_{\tau+1} + (\alpha_i)^2 \sum_{j=1}^N (P^2)_{i,j} V^j_{\tau+2} + O((\alpha_i)^3)\right]$$

where $P$ is the annual transition probability matrix defined in (22), which depends on the parameter $x$, i.e., the chance that a planner’s preferences do not change in a year (so that $1 - x$ is the chance of a preference change).

If the model’s time step is changed from 1 year to $\Delta T > 0$ years the values of all its dynamical parameters must change as well. Consumption growth rates are multiplied by $\Delta T$, and, as in the calibration methodology set out in Section G above, the values of $\alpha_i$ and $\gamma_i$ must be recalibrated so that:

$$\kappa_i = \alpha_i \gamma_i = e^{-\rho_i \Delta T}, \qquad \gamma_i^{-1} \approx 0.1\% \times \Delta T.$$

Transforming the matrix $P$ is more complex. To make the version of the model with time step $\Delta T$ comparable to the original annual model, we need to find a stochastic matrix $Q$ such that

$$Q = P^{\Delta T}. \tag{A.40}$$

When $\Delta T$ is not a positive integer (e.g., if $\Delta T = 1/12$ for a monthly time step) such matrix equations may have no solution, or multiple non-negative solutions. However, in our case the structure of the model ensures that there is a natural ‘$\Delta T$th power’ of $P$ for any $\Delta T > 0$, and for all interesting values of the parameter $x$.

Begin by observing that the eigenvalues of $P$ are 1 (with algebraic multiplicity 1) and $\frac{Nx - 1}{N - 1}$ (with algebraic multiplicity $N - 1$), and are thus positive provided that $x > 1/N$.⁶ Matrices with positive eigenvalues have a unique ‘principal power’ that satisfies the equation (A.40) and itself has positive eigenvalues (see e.g., Horn & Johnson, 2013). It is essential that transforming the time step of the model does not change the signs of the eigenvalues of the model’s transition probability matrix. If this were not the case the qualitative dynamics of preference change would not be preserved under a change of time step. One could, for example, find that planners’ intertemporal weights $w^{ij}_s$ oscillate with maturity $s$, where no such behaviour existed before.

⁵ Planners with beliefs (A.39) do not obey the consistency condition (14), but this has no relevance for this discussion.

Since $P$ is diagonalizable, it can be written as

$$P = V D V^{-1}$$

where

$$V = \begin{pmatrix} 1 & -1 & -1 & \cdots & -1 \\ 1 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 0 & 0 & \cdots & 1 \end{pmatrix}$$

is a matrix whose $j$th column corresponds to the $j$th eigenvector of $P$, and $D$ is a diagonal matrix of corresponding eigenvalues, i.e., $(D)_{1,1} = 1$, $(D)_{j,j} = \frac{Nx - 1}{N - 1}$ for $j \neq 1$. The principal $\Delta T$th power of $P$ is given by

$$Q = V D^{\Delta T} V^{-1}$$

for any $\Delta T > 0$.
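The principal power can be computed without an explicit eigendecomposition. The sketch below is an illustration under stated assumptions: it takes $P$ to have $x$ on the diagonal and $(1 - x)/(N - 1)$ elsewhere (consistent with the eigenvalues given above), and uses the algebraic simplification $V D^{\Delta T} V^{-1} = \lambda^{\Delta T} I + (1 - \lambda^{\Delta T}) J / N$, where $\lambda = \frac{Nx - 1}{N - 1}$ and $J$ is the matrix of ones. That simplification and the function names are not from the text.

```python
def transition_matrix(n, x):
    """Annual matrix P: x on the diagonal, (1 - x)/(n - 1) off the diagonal."""
    off = (1.0 - x) / (n - 1)
    return [[x if i == j else off for j in range(n)] for i in range(n)]

def principal_power(n, x, dt):
    """Principal dt-th power of P via its two distinct eigenvalues (1 and lam)."""
    lam = (n * x - 1.0) / (n - 1)
    if lam <= 0.0:
        raise ValueError("requires x > 1/n so that all eigenvalues are positive")
    lam_dt = lam ** dt
    off = (1.0 - lam_dt) / n
    return [[lam_dt + off if i == j else off for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def integer_power(m, k):
    """m multiplied by itself k times (k >= 1)."""
    result = m
    for _ in range(k - 1):
        result = matmul(result, m)
    return result

# Hypothetical example: N = 3 planners, x = 90% per year, monthly time step.
P = transition_matrix(3, 0.9)
Q = principal_power(3, 0.9, 1.0 / 12.0)
Q12 = integer_power(Q, 12)
max_error = max(abs(Q12[i][j] - P[i][j]) for i in range(3) for j in range(3))
print(Q[0], max_error)
```

The monthly matrix `Q` is itself a stochastic matrix of the same form as $P$, with a diagonal entry closer to one, and its twelfth power reproduces the annual matrix up to rounding error.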

Consider the case $\Delta T = 1/12$, corresponding to a model with a monthly time step. It is clear from the definition in the previous equation that raising $Q$ to the twelfth power yields the original matrix $P$, and that $Q$ has positive eigenvalues. The matrix $Q$ is the only 12th root of $P$ that has these properties.⁷

⁶ The case $x < 1/N$ is not plausible.

[Figure F.3: line plot. Horizontal axis: Maturity $s$ (yrs). Vertical axis: 5-95% range of $r^i(s)$ (%/yr). Series: $x$ = 80%, 90%, 95%, 97.5% (annual), and $x = 1$.]

Figure F.3: Replication of Figure 1a in the paper for a monthly time step. To facilitate comparison with Figure 1a monthly discount rates have been converted to annual equivalents (vertical axis), and the horizontal axis is scaled to years, rather than months.

Figure F.3 presents an analogue of Figure 1a in the paper; this time, however, I have calibrated the model with a monthly time step using the procedure outlined above. The figure shows that there is no appreciable difference between versions of the model defined at different time steps provided that the model parameters are adjusted to reflect the change in time step.

Finally, I note that any version of the model defined with a discrete time step can be thought of as an approximation to an underlying continuous model. Preferences could change at any instant, and there is some underlying infinitesimal transition probability matrix that could describe this continuous Markov process. But any discrete approximation of this process, at any temporal resolution, is legitimate: any behaviour of the continuous process, when aggregated up to a discrete time step $\Delta T$ by exponentiating the infinitesimal transition matrix, can be replicated by an ‘ab initio’ discrete model with time step $\Delta T$. We lose nothing (at resolution $\Delta T$) in this discrete approximation, although the entries of the discrete transition probability matrix (and hence the weight $x$) will differ according to the magnitude of $\Delta T$.

⁷ Other solutions of (A.40) have the same basic form as $Q$; however, we may replace any of the entries on the diagonal of $D^{1/12}$ with any of the twelve complex roots of the corresponding eigenvalue of $P$. As there is only one way of choosing these roots so that they are all positive (and real), there is a unique ‘principal power’ of $P$.

I Decomposing non-dogmatic SDRs

This section studies the resolution of disagreement about the two components of the SDR (pure time preference and the consumption growth/inequality aversion term) separately. It shows that much of the rapid convergence of SDRs with maturity shown in Figure 1a is due to exponential convergence in the consumption growth term.

Section C of the appendix showed that the set of IWFs consistent with (12) can be represented by

$$V^i_\tau = \sum_{s=0}^{\infty} \sum_{j=1}^{N} a^{ij}_s\, U^j(c_{\tau+s}),$$

where the coefficients $a^{ij}_s$ are determined by the difference equations in (A.11), and $a^{ii}_0 = 1$, $a^{ij}_0 = 0$ if $i \neq j$. Planner $i$’s SDR at maturity $s$ is

$$r^i(s) = -\frac{1}{s} \ln\left(\frac{\sum_{j=1}^N a^{ij}_s\, (U^j)'(c_{\tau+s})}{(U^i)'(c_\tau)}\right)$$

We decompose this expression into a pure time preference term and a consumption growth term. Defining

$$\rho^i(s) = -\frac{1}{s} \ln\left(\sum_{j=1}^N a^{ij}_s\right) \tag{A.41}$$

$$G^i(s) = -\frac{1}{s} \ln\left(\frac{\sum_{j=1}^N a^{ij}_s\, (U^j)'(c_{\tau+s})}{\left(\sum_{j=1}^N a^{ij}_s\right) (U^i)'(c_\tau)}\right), \tag{A.42}$$

we have

$$r^i(s) = \rho^i(s) + G^i(s). \tag{A.43}$$

To understand the meaning of $\rho^i(s)$, notice that $\sum_{j=1}^N a^{ij}_s$ is the total weight on utilities at maturity $s$ in IWF $i$, i.e., it is a pure time discount factor. Hence $\rho^i(s)$ is IWF $i$’s pure rate of social time preference at maturity $s$. To interpret $G^i(s)$ it is helpful to consider the case where the utility functions $U^i(c)$ are iso-elastic as in (A.38). Denoting the compound annual consumption growth rate at maturity $s$ by $g_s$, we have⁸

$$G^i(s) = -\frac{1}{s} \ln\left(\frac{\sum_{j=1}^N a^{ij}_s\, e^{-\eta_j g_s s}}{\sum_{j=1}^N a^{ij}_s}\right). \tag{A.44}$$

Consider a hypothetical case in which planners have no normative insecurity, i.e., $a^{ij}_s = 0$ for all $j \neq i$; in this case we see that $G^i(s) = \eta_i g_s$, and we recover the familiar consumption growth term in the Ramsey rule. $G^i(s)$ is the generalization of this term to the non-dogmatic case, i.e., it is the contribution to the discount rate from consumption growth and inequality aversion. Figure F.4 plots the range of values for $\rho(s)$ and $G(s)$ as a function of maturity for the model calibration described in Section G of the appendix. The figure shows two important things. First, disagreements about the consumption growth term are significantly larger, and thus quantitatively more important, than disagreements about the pure rate of social time preference.⁹ Second, although the range of values for $G(0)$ is larger than that for $\rho(0)$, disagreements about this term reduce substantially faster as maturity $s$ increases. The expression for $G^i(s)$ in (A.44) suggests why this occurs. The argument of the log in this expression is a weighted sum of exponential functions, and thus converges exponentially fast to $e^{-\min_j\{\eta_j g_s\} s}$ as $s$ increases. For example, if consumption growth is a constant 2%/yr, and we take $\eta = 2$ as a modal value of $\eta$, and $\eta = 0.05$ as the smallest value of $\eta$, at a maturity of 50 years we have $e^{-2 \times 0.02 \times 50} = 0.13$, and $e^{-0.05 \times 0.02 \times 50} = 0.95$. Thus values of $\eta_i g_s$ that differ substantially from $\min_j\{\eta_j g_s\}$ receive little weight at long maturities, causing the values of $G(s)$ to converge rapidly.
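The decomposition in (A.41)-(A.44) is straightforward to compute for iso-elastic utilities. In the sketch below, which is an illustration only, the coefficients $a^{ij}_s$ for a single planner at one maturity are made-up numbers rather than solutions of (A.11), and current consumption is normalised to $c_\tau = 1$ as in the text’s calculation.

```python
import math

def decompose_sdr(a_row, etas, g, s):
    """Return (rho_i(s), G_i(s), r_i(s)) for iso-elastic utilities, c_tau = 1."""
    total_weight = sum(a_row)
    discounted = sum(a * math.exp(-eta * g * s) for a, eta in zip(a_row, etas))
    rho = -math.log(total_weight) / s                   # (A.41)
    growth = -math.log(discounted / total_weight) / s   # (A.44)
    r = -math.log(discounted) / s                       # planner i's SDR
    return rho, growth, r

etas = [0.5, 1.0, 2.0]        # hypothetical elasticities of the N = 3 planners
g, s = 0.02, 50               # constant 2%/yr growth, 50-year maturity
a_row = [0.05, 0.30, 0.10]    # hypothetical a^{ij}_s for one planner i

rho, growth, r = decompose_sdr(a_row, etas, g, s)
dogmatic = decompose_sdr([0.0, 0.4, 0.0], etas, g, s)   # all weight on own theory
print(rho, growth, r)
print(dogmatic[1])
```

The first call confirms that the two components add up to the planner’s SDR, and the second recovers the Ramsey growth term $\eta_i g_s$ when all weight sits on the planner’s own utility function.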

⁸ For convenience in this calculation we have chosen units so that current consumption $c_\tau = 1$. This is without loss of generality.

⁹ The reader may wonder why the ranges for $\rho(s)$ and $G(s)$ depicted in Figure F.4 do not sum to the range for $r(s)$ in Figure 1a. The answer is that the ranges in Figure F.4 are properties of the marginal distributions of $\rho(s)$ and $G(s)$, while the range of their sum $r(s)$ depends on the joint distribution of these two quantities. Figure F.4 demonstrates how disagreements about these two independently meaningful quantities reduce as a function of maturity.

[Figure F.4: two line plots. Left panel: 5-95% range of $\rho^i(s)$ (%/yr) against Maturity $s$ (yrs). Right panel: 5-95% range of $G^i(s)$ (%/yr) against Maturity $s$ (yrs). Series in both panels: $x$ = 80%, 90%, 95%, 97.5%, and $x = 1$.]

Figure F.4: Range of values for the two components of the SDR, the pure rate of social time preference ($\rho(s)$, left) and the consumption growth term ($G(s)$, right), as a function of maturity $s$. The model calibration is the same as in Figure 1a.

To relate variation in the components $\rho(s)$ and $G(s)$ back to variation in the SDR $r(s) = \rho(s) + G(s)$, we make use of the fact that

$$\mathrm{Var}\, r(s) = \mathrm{Var}\, \rho(s) + \mathrm{Var}\, G(s) + 2\,\mathrm{Cov}\{\rho(s), G(s)\}. \tag{A.45}$$

Figure F.5a breaks the total variance in $r(s)$ into each of these three components at each maturity, for the illustrative case $x = 97.5\%$. This figure confirms that much of the variation in $r(0)$ derives from variation in the growth term $G(0)$, but that as maturities increase disagreements about this term rapidly evaporate. Figure F.5b plots the ratio $\mathrm{Var}\,\rho(s) / \mathrm{Var}\, r(s)$ as a function of $s$ for a range of values of $x$, showing that for all these parameter values almost all the remaining variation in $r(s)$ for $s > 50$ is attributable to variation in $\rho(s)$: we have almost complete convergence on the dominant $G(s)$ term at these maturities.
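The variance accounting in (A.45) can be reproduced for any cross-section of planners. The sketch below uses made-up values of $\rho^i(s)$ and $G^i(s)$ at a single maturity (not the survey data) and population moments; the function names are assumptions.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(u, v):
    """Population covariance; covariance(u, u) is the population variance."""
    mu, mv = mean(u), mean(v)
    return mean([(a - mu) * (b - mv) for a, b in zip(u, v)])

def variance_decomposition(rho_vals, growth_vals):
    """Return Var r, Var rho, Var G and 2 Cov(rho, G) across planners."""
    r_vals = [p + q for p, q in zip(rho_vals, growth_vals)]
    return (covariance(r_vals, r_vals),
            covariance(rho_vals, rho_vals),
            covariance(growth_vals, growth_vals),
            2.0 * covariance(rho_vals, growth_vals))

# Hypothetical cross-section of four planners at one maturity (rates per year).
rho_vals = [0.000, 0.005, 0.010, 0.030]
growth_vals = [0.021, 0.020, 0.022, 0.020]

var_r, var_rho, var_g, two_cov = variance_decomposition(rho_vals, growth_vals)
share_rho = var_rho / var_r     # the ratio plotted in Figure F.5b
print(var_r, var_rho, var_g, two_cov, share_rho)
```

Note that the share $\mathrm{Var}\,\rho(s)/\mathrm{Var}\,r(s)$ can exceed one when the covariance term is negative, which is consistent with the vertical axis of Figure F.5b extending above 1.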

[Figure F.5, panel (a): Var $r(s)$ ($\times 10^{-4}$) against Maturity $s$ (yrs), with series Var $\rho(s)$, Var $G(s)$ and 2Cov($\rho(s)$, $G(s)$). Panel (b): Var $\rho(s)$ / Var $r(s)$ against Maturity $s$ (yrs), with series $x$ = 80%, 90%, 95%, 97.5%, and $x = 1$.]

(a) Components of the variance of $r(s)$ (see equation (A.45)). $x = 97.5\%$.

(b) Share of the variance of $r(s)$ due to the variance of $\rho(s)$.

Figure F.5: Decomposition of the variance of $r(s)$.

J References

Bergstrom, Theodore C. 1999. “Systems of Benevolent Utility Functions.” Journal of Public Economic Theory, 1(1): 71–100.

Drupp, Moritz, Mark C. Freeman, Ben Groom, and Frikk Nesje. 2018. “Discounting Disentangled.” American Economic Journal: Economic Policy, 10(4): 109–134.

Duchin, F., and B. Steenge. 2009. “Mathematical Models in Input-Output Economics.” In W. Zhang ed., Mathematical Science, UNESCO Encyclopedia of Life Support Systems.

Friedland, Shmuel. 1981. “Convex spectral functions.” Linear and Multilinear Algebra, 9(4): 299–316.

Galperti, Simone, and Bruno Strulovici. 2017. “A Theory of Intergenerational Altruism.” Econometrica, 85(4): 1175–1218.

Horn, Roger A., and Charles R. Johnson. 2013. “Matrix Analysis.” Cambridge University Press, New York, 2nd edition.

Kingman, J.F.C. 1961. “A convexity property of positive matrices.” Q J Math, 12(1): 283–284.

Marshall, Albert W., Ingram Olkin, and Barry C. Arnold. 2010. “Inequalities: Theory of Majorization and its applications.” Springer, New York, 2nd edition.

Nussbaum, Roger D. 1986. “Convexity and log-convexity for the spectral radius.” Linear Algebra and its Applications, 73: 59–122.

Prelec, Drazen. 2004. “Decreasing Impatience: A Criterion for Non-stationary Time Preference and “Hyperbolic” Discounting.” Scandinavian Journal of Economics, 106(3): 511–532.

Saez-Marti, Maria, and Jorgen W. Weibull. 2005. “Discounting and altruism to future decision-makers.” Journal of Economic Theory, 122(2): 254–266.

Sternberg, Shlomo. 2014. Dynamical Systems. Dover Publications.
