
Innovation, Firm Size Distribution, and Gains from

Trade∗

Yi-Fan Chen† Wen-Tai Hsu‡ Shin-Kun Peng§

February 12, 2020

Abstract

Unlike most trade models with firm heterogeneity which assume power-law distri-

butions, we derive power laws in a rather general environment in a general-equilibrium

trade model. We do so by allowing firms to determine their productivities in an inno-

vation stage in an otherwise standard Melitz (2003) model. We show that equilibrium

productivity and firm-size distributions exhibit power-law tails under general condi-

tions on demand and innovating technology. Moreover, the emergence of the power

laws is consistent with general underlying primitive heterogeneity among firms. We

investigate the model’s welfare implications and conduct a quantitative analysis of

welfare gains from trade. We find that conditional on the same trade elasticity and

values of the common parameters, our model yields 40% higher welfare gains from

trade than a standard model with exogenously given productivity distribution.

JEL Codes: F12, F13, F41.

Keywords: power law, gains from trade, firm heterogeneity, regular variation, innovation

∗This paper was previously circulated under the title “Productivity Investment, Power Laws, and Welfare Gains from Trade”. We are grateful for the detailed and insightful comments by Mathieu Parenti and Mark Razhev. For helpful comments, we also thank Thomas Chaney, Pao-Li Chang, Davin Chor, Jen-Feng Lin, Peter Neary, Jacques-Francois Thisse, and Jonathan Vogel and the seminar participants at Baptist University of Hong Kong, Chinese University of Hong Kong, National Taiwan University, the University of Melbourne, the 2018 Midwest Trade and Theory Meetings, the 2018 SMU-NUS-Paris Joint Trade Workshop, the 2019 SMU-INSEAD Joint Trade Workshop. The authors gratefully acknowledge the financial support from Academia Sinica (Academia Sinica Investigator Award #IA-105-H04) and Ministry of Science and Technology (MOST 105-2410-H-001-001-MY3).

†School of Economics, Singapore Management University, Singapore. E-mail: [email protected].

‡School of Economics, Singapore Management University, Singapore. Institute of Economics, Academia Sinica. E-mail: [email protected].

§Institute of Economics, Academia Sinica, Taipei, Taiwan. E-mail: [email protected].

1 Introduction

In the last two decades of the development of the trade literature on heterogeneous firms,

the source of heterogeneity has been mostly exogenous, e.g., the exogenously given pro-

ductivity distribution in Bernard, Eaton, Jensen, and Kortum (2003), Eaton and Kortum

(2002), Melitz (2003), Melitz and Ottaviano (2008), and the large literature following these

work-horse models. Trade affects which parts of the productivity distribution in each coun-

try are utilized via either firm selection or comparative advantage. Nevertheless, empirical

evidence shows that trade affects productivity at the level of individual firms, making the

distribution of productivity endogenous. For example, see Pavcnik (2002), Fernandes

(2007), Lileeva and Trefler (2010), Bustos (2011), and Aghion, Bergeaud, Lequien, and

Melitz (2018). Furthermore, Lileeva and Trefler (2010), Bustos (2011) and Aghion et al.

(2018) show that innovation is a plausible mechanism through which trade affects firms’

productivity.1 These studies explain their empirical results using models in which firm productivity is a function of innovation activities.

Following the above-mentioned studies, we study a variant of the Melitz (2003) model

in which we embed an innovation mechanism. We find that this incorporation allows us

to microfound the power laws in firm productivity and firm size under a rather general

general-equilibrium model of international trade with weak restrictions on underlying firm

heterogeneity. This paper also gauges the importance of this innovation mechanism by

analyzing the welfare gains from trade in comparison with a Melitz model through the lens

of the framework proposed by Arkolakis, Costinot, and Rodríguez-Clare (2012; henceforth

ACR).

Power laws in both productivity and firm size are widely documented empirical regularities (see, for example, Axtell 2001, Luttmer 2007, and Nigai 2017). Moreover, it

has been shown that these power laws provide microfoundation for the gravity equations

1 Aghion et al. (2018) document that innovation activities are strongly and positively correlated with export. Exports of innovating firms are approximately 10 times larger than those of non-innovating firms. Their empirical analysis also shows that export demand shocks have a significant positive impact on innovation activities for large firms, especially for firms in the top deciles.


(Arkolakis, Costinot, Donaldson, and Rodríguez-Clare 2019 and Chaney 2018) and that

the few very large firms may be what matters the most for macroeconomic performance,

i.e., granular economies (Gabaix 2011). Furthermore, the power-law coefficients are of-

ten tightly connected with welfare evaluation (as suggested by ACR and Arkolakis et al.

2019). Thus, it is important to understand the circumstances under which these power laws

may emerge.

Formally, a distribution is said to exhibit a power law if its tail probability at the upper

tail is asymptotically given by a power function, i.e., $\Pr[X \ge x] \sim \alpha x^{-\zeta}$ as $x \to \infty$, for some positive constants $\alpha$ and $\zeta$. The Pareto and Fréchet are two power-law distributions that are often

assumed in trade models of firm heterogeneity;2 the former case is popularized by Melitz

and Ottaviano (2008), as well as Chaney (2008) who assumes the Pareto in a Melitz model;

the latter is used in Bernard et al. (2003) and Eaton and Kortum (2002), and has now

become a standard machinery in quantitative trade models. Unlike these models of firm

heterogeneity which assume power-law distributions, we derive power laws in a rather

general environment in a general-equilibrium trade model.

Our model starts with a simple relation between productivity, innovation effort, and

a firm’s capability. More innovation leads to higher productivity, and the higher a firm’s

capability, the less innovation effort is needed to attain the same level of productivity. One

feature of this relation is that productivity is determined by effort and capability jointly in a

multiplicative manner, and we microfound this feature by an R&D process in which firms

decide the complexity of their production procedures and conduct Bernoulli trials (experi-

ments) to improve the performance of each procedure. Firms differ by their probabilities of

failure in these Bernoulli trials. We then embed this relation into a standard Melitz model

(with a CES preference) as an innovation stage after entry and before production. As a

first cut, we show that if the distribution of failure probability has a finite and positive density near zero, then the power laws emerge. This is a rather weak condition, as it requires

little on the functional form of underlying firm heterogeneity.

2The Fréchet and Pareto are tail-equivalent as their tail probabilities are asymptotically proportional.


Using the tool of regular variation, all functions that serve as primitives of this model can be

substantially generalized. First, the preference/demand side can be generalized from the

CES to a regularly varying demand, which is much broader as it allows for non-homothetic

preference and variable markups. Several non-CES, non-homothetic preferences studied

by Mrázová and Neary (2017) in fact entail regularly varying demand functions. Second,

both the innovation-productivity relation and the distribution of failure probability can also

be generalized to be regularly varying functions. In particular, this implies that the power

laws continue to hold even if the density of the distribution of failure probability near zero

is zero or infinite, provided that the distribution is regularly varying at the left tail. As we

will explain in Section 2, this includes many well-known and widely-used distributions.

All of the above-mentioned results are shown in a closed economy. We then go on

to show that these results continue to hold in a very general open-economy environment

with all model parameters allowed to be country-specific. Interestingly, the tail indices of

both productivity and firm size distributions of each country depend on the market with the

largest competitiveness (largest price elasticities). As a result, opening up to trade (weakly)

fattens the right tails of both productivity and firm size distributions for each country.

We also analyze how productivity distribution is affected by trade liberalization. We

show that a lower variable trade cost increases (decreases) exporters’ (non-exporters’) in-

novation effort. On the one hand, a lower trade cost implies a larger effective market size

facing the exporters. Hence, the exporters’ marginal benefit of having a higher productiv-

ity increases, leading them to innovate more. On the other hand, the non-exporters face

more import competition and make less profit as the prices of imported goods are reduced

not only because of a lower variable trade cost but also due to the fact that these foreign

exporters become more productive. Consequently, a lower trade cost negatively affects the

productivities of non-exporters.

We conduct a quantitative analysis to clarify how innovation affects welfare gains from

trade. Despite some slight differences from the class of models characterized by ACR,

the welfare gains from trade still follow their formula, i.e., $d\ln W = \frac{1}{\varepsilon}\, d\ln\lambda$, where $W$


is welfare, λ is the expenditure share on domestic goods, and ε is the trade elasticity. We

refer to this formula as the local ACR formula as it deals with small changes in trade cost.

However, the (global) ACR formula $W'/W = (\lambda'/\lambda)^{1/\varepsilon}$ for large changes in trade cost

does not apply here because the trade elasticity ε in our model is a variable. Nevertheless,

one can obtain the welfare changes for large changes in trade cost by integrating over the

local ACR formula.
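The integration step can be illustrated numerically. The following is a minimal Python sketch, not the paper's calibration: the two elasticity functions, their numerical values, and the path of λ from 1.0 to 0.8 are hypothetical choices made only for illustration. It integrates the local ACR formula over ln λ and compares the result with the global formula.

```python
import math

def log_welfare_change(lam_start, lam_end, elasticity, steps=10_000):
    """Integrate d ln W = (1/eps) d ln(lambda) with the midpoint rule in ln(lambda)."""
    lo, hi = math.log(lam_start), math.log(lam_end)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        lam_mid = math.exp(lo + (i + 0.5) * h)
        total += h / elasticity(lam_mid)
    return total

def constant_eps(lam):
    # Assumed constant trade elasticity (hypothetical value).
    return -4.0

def variable_eps(lam):
    # Hypothetical variable elasticity: larger in absolute value as lambda falls.
    return -4.0 - 2.0 * (1.0 - lam)

lam_start, lam_end = 1.0, 0.8  # assumed domestic expenditure shares before and after
gain_const = log_welfare_change(lam_start, lam_end, constant_eps)
gain_global = math.log(lam_end / lam_start) / -4.0  # global ACR formula in logs
gain_var = log_welfare_change(lam_start, lam_end, variable_eps)
print(gain_const, gain_global, gain_var)
```

Under these assumed numbers, the integral with a constant elasticity coincides with the global formula, whereas the integral with the variable elasticity differs from it, which is why the global formula cannot be applied directly when ε varies.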

To highlight the role played by innovation, we compare the welfare gains from trade

with the Melitz (2003) model with an exogenous Pareto productivity distribution, in which

both the global and local ACR formula hold. For this purpose, we focus on a symmetric-

country world with CES preference and a power function for innovation-productivity re-

lation. When firms’ R&D abilities are uniformly distributed, the resulting productivity

distribution has a Pareto right tail; thus such a parameterization is adopted. We calibrate

the model to match the same trade elasticity, domestic expenditure share, and the share of

exporters. Following Melitz and Redding’s (2015) approach of comparing across models

by fixing common parameters, we compare the two models conditional on the same trade

elasticity and values of the common parameters. Our quantitative analysis finds that the

model with innovation entails larger welfare gains from trade than Melitz-Pareto by about

40%. The intuition is as follows. As mentioned, exporters innovate more and non-exporters

innovate less when facing trade liberalization, thus creating a larger productivity advantage

of exporters over non-exporters. Compared with the Melitz model with exogenous produc-

tivity distribution, the above-mentioned effect entails larger imports and exports, and by

the ACR formula, this implies larger welfare gains from trade. This also suggests the

importance of incorporating innovation and endogenous choice of productivity.

This paper is closely related to the literature on power laws in firm size. A popular ex-

planation for such power laws is based on firm-size dynamics that follow a random growth

process; see, for example, Luttmer (2007) and Acemoglu and Cao (2015). Also see Gabaix

(2009) for a survey of this random-growth approach. Recently, Chaney (2014, 2018) and

Geerolf (2017) have provided explanations for the power law in firm size via network


and firm hierarchy, respectively. Note that none of the models in the above-mentioned studies is free of functional-form assumptions or restrictions; for example, Luttmer (2007) and Acemoglu and Cao (2015) assume CES and constant-relative-risk-aversion (CRRA) preferences. Thus, our relaxation of demand from the CES to regularly varying functions should

be viewed as an advantage rather than a strong restriction. Most importantly, the common

theme of these studies and our work is that power laws emerge with weak restrictions on

the underlying firm heterogeneity. Our model differs from these studies in its economic

mechanism. Our mathematical mechanism is related to Geerolf (2017) as “the power law

change of variable technique” used in his work is also used here in our illustrative example.

This paper is also closely related to Yeaple (2005), Lileeva and Trefler (2010), Bus-

tos (2011), Bas and Ledezma (2015), Aghion et al. (2018), and Bonfiglioli, Crinò, and

Gancia (2018), who also model how innovation effort affects productivity. Whereas the

mechanism of our theory bears some similarity to these studies, our work differs at least in

the two following aspects: (1) we show that the concentration of innovation effort among

exporters and large firms results in power laws in both productivity and firm size under a

rather general environment; (2) we investigate the welfare effect of such innovation effort.

As mentioned, our theoretical and quantitative analyses on the welfare gains from trade

are closely related to ACR.3 Our approach in modeling innovation is similar in spirit to the

technological choice embedded in the ACR framework, but is different in form.4 Never-

theless, we show that the (local) ACR formula still holds in our model, despite a variable

trade elasticity. Atkeson and Burstein (2010) also incorporate process innovation and find

a result similar to the ACR in the sense that the new margins via firm heterogeneity do not

entail new gains from trade. As our model entails the local ACR formula, the innovation

mechanism here does not contribute to any extra gains from trade conditional on domestic

expenditure share and trade elasticity. However, our quantitative exercise is different because we adopt the approach of Melitz and Redding (2015) to compare the implications of

3 Since ACR, there is a revived and still growing literature on gains from trade. See Costinot and Rodríguez-Clare (2015) for a survey. Our work differs from this literature in that we focus on the effect of innovation.

4 As is made clear in Section 2, innovation effort is determined in the stage before production and consumption, whereas ACR assumes they are simultaneous.

different models conditional on the same parameters. Whereas Melitz and Redding show

that the heterogeneous-firm model adds additional gains from trade compared with homo-

geneous firms, we show that innovation further adds gains from trade compared with the

Melitz-Pareto model. Moreover, we show that such extra gains could be substantial.

The rest of the paper is organized as follows. Section 2 presents the model and shows

how power laws emerge. Section 3 provides comparative statics of productivity distribution

on trade costs and other parameters. Section 4 studies the properties pertaining to welfare

gains from trade and conducts a quantitative analysis. Section 5 concludes.

2 Power Laws in Productivity and Firm Size

We start with a closed-economy model to illustrate the mechanism of innovation. We

show how power laws for productivity and firm size emerge from such a model. Such

results easily extend to a general open-economy environment, as we show in Section 2.3.

2.1 Model Setup

There are N individuals in the economy, each of whom is endowed with one unit of labor.

All individuals are identical in their income earned from wages w, and they spend their

income on a continuum of varieties, each of which is indexed by υ. Assume that the

individual inverse demand function is given by $p = D(q(\upsilon); A)$ on $[\underline{q}, \infty)$ with $\underline{q} \ge 0$.

Namely, we assume that the inverse demand of a variety depends on all the other varieties

only through aggregate variables $A \in \mathbb{R}^n$. Assume that $D$ is twice differentiable and that the law of demand holds: $D' < 0$ (see footnote 5).

5 Such an inverse demand function can be generated by (but not limited to) maximizing an additively separable utility $U = \int_{\upsilon\in\Upsilon} u(q(\upsilon))\, d\upsilon$ subject to the budget constraint $\int_{\upsilon\in\Upsilon} p(\upsilon)\, q(\upsilon)\, d\upsilon = w$. The sub-utility $u(\cdot)$ is defined on $[\underline{q}, \infty)$ with $\underline{q} \ge 0$. Assume that $u' > 0$ and $u'' < 0$. The standard solution yields the inverse demand function $p = D(q(\upsilon); A) \equiv u'(q(\upsilon))/A$, where $A$ is the Lagrange multiplier of the consumer's problem and is a general-equilibrium object. Note that $u'' < 0$ implies that $D' < 0$. For other forms of $D(q(\upsilon); A)$ than $u'(q(\upsilon))/A$, see the discussion following Assumption 1 and Table 1.


On the production side, labor is the only input, and firms engage in monopolistic com-

petition. To enter, each entrant hires κe amount of labor, which allows the entrant to obtain

a distinct variety and a draw of an R&D parameter γ from a given distribution which we

explain shortly. For a firm to produce, κD units of labor are required as a fixed input. The productivity of a firm is endogenously determined and denoted as ϕ. Choosing labor as the numeraire, the wage equals 1, and the total cost of production as a function of output q

is q/ϕ + κD. As in Melitz (2003), a positive κD results in firm selection. As we will see,

whether there is selection or not (κD > 0 or κD = 0) is immaterial for the results on power

laws; we keep selection for generality and for the welfare comparison with the literature.

A firm’s profit from production is

$$\pi(\phi) = pq - \phi^{-1} q - \kappa_D. \tag{1}$$

Each firm determines its productivity by conducting R&D. The R&D efforts are in

terms of labor, and the labor requirement k for a γ-typed firm to acquire a productivity

level ϕ is given by the function

$$k = \gamma V(\phi), \tag{2}$$

where $V(\cdot)$ is strictly increasing and convex on $\mathbb{R}_+$, and $\lim_{\phi\to\infty} V(\phi) = \infty$. In this innovation cost function, γ is multiplicatively separable from V and serves as an inverse measure

for a firm’s R&D efficiency. This functional form can be microfounded by an R&D process

in which firms decide the complexity of their production procedures and conduct Bernoulli

trials (experiments) to improve the performance of each procedure. Firms differ by their

probabilities of failure γ ∈ (0, 1] in these Bernoulli trials. The c.d.f. and p.d.f. of the

distribution of γ are denoted as F (·) and f (·), respectively. Equation (2) suffices for our

purposes, and the microfoundation of this R&D process is given in the Appendix.

A γ-typed firm thus chooses an optimal productivity level ϕ that maximizes its total

profit

$$\Pi(\phi; \gamma) = \pi(\phi) - \gamma V(\phi), \tag{3}$$


and the resulting optimal choice of $\phi$ is denoted as $\phi^* = \phi(\gamma)$. As there may be firm

selection, the set of firms who actually produce, i.e. who pay V (ϕ) and κD, is denoted as

Ω. The free-entry condition is therefore

$$E(\Pi) \equiv \int_{\Omega} \Pi(\phi(\gamma); \gamma)\, dF(\gamma) = \kappa_e. \tag{4}$$

In sum, the model contains three stages as follows:

Stage 1. Entry Stage: Each potential entrant decides whether to enter the market. If an

entrant decides to enter, it pays the fixed entry cost κe and draws its type γ randomly

from the distribution f (γ).

Stage 2. Innovation Stage: Given γ, each firm decides whether to invest in productivity or

not, and if yes, how much to invest to determine its productivity level ϕ.

Stage 3. Production/Consumption Stage: Each firm decides whether to produce or not.

If yes, each firm pays κD and determines its output and price. Production and con-

sumption take place and markets clear.

2.2 Equilibrium and Power Laws

This subsection provides an exposition of how power laws for both productivity and firm

size emerge in a closed economy. We first provide an illustrative example which differs

from the Melitz model only by having an innovation stage in which the innovation cost

function V is given by a power function for tractability. We show how a weak restriction on

the underlying firm heterogeneity f allows the power laws in productivity and firm size to

emerge. A natural question is then how much the result depends on the power-function

assumptions on demand and innovation cost. We will then show how both assumptions

can be substantially generalized, as well as how the underlying firm heterogeneity can be

generalized even further, by using the tool of regular variation.


2.2.1 Illustrative Example with Melitz Model

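As in Melitz (2003), consider a CES demand: $p = \left(\frac{N}{P^{1-\sigma}}\right)^{\frac{1}{\sigma}} q^{-\frac{1}{\sigma}}$, where $\sigma > 1$. Also for tractability, we use a simple power function for the innovation cost: $k = \gamma\phi^{\beta}$, where $\beta > 1$. For any $\phi$, a firm's optimal output in the production stage is given by $q(\phi) = \frac{N}{P^{1-\sigma}} \left(\frac{\sigma-1}{\sigma}\right)^{\sigma} \phi^{\sigma}$. The operating profit is accordingly $\pi(\phi) = \frac{N}{P^{1-\sigma}} \frac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\sigma-1} \phi^{\sigma-1} - \kappa_D$.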

In the innovation stage, a γ-type firm decides its productivity level to maximize its total

profit (3). The resulting productivity as a function of γ is

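$$\phi(\gamma) \equiv \left[\frac{N}{P^{1-\sigma}} \frac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta}\right]^{\frac{1}{\beta-\sigma+1}} \gamma^{-\frac{1}{\beta-\sigma+1}}. \tag{5}$$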

It is readily verified that the second-order condition is satisfied if and only if β > σ − 1,

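i.e., the innovation cost function is convex enough, and this condition is imposed here. By inserting (5) into (3), a firm's total profit becomes

$$\Pi(\gamma) = \left(\frac{N}{P^{1-\sigma}}\right)^{\frac{\beta}{\beta-\sigma+1}} \left(\frac{\sigma-1}{\sigma}\right)^{\frac{\beta\sigma}{\beta-\sigma+1}} \beta^{-\frac{\beta}{\beta-\sigma+1}} \left(\frac{\beta-\sigma+1}{\sigma-1}\right) \gamma^{-\frac{\sigma-1}{\beta-\sigma+1}} - \kappa_D;$$

hence there is a unique cutoff

$$\gamma_D = \left[\kappa_D^{-1} \left(\frac{N}{P^{1-\sigma}}\right)^{\frac{\beta}{\beta-\sigma+1}} \left(\frac{\sigma-1}{\sigma}\right)^{\frac{\beta\sigma}{\beta-\sigma+1}} \beta^{-\frac{\beta}{\beta-\sigma+1}} \left(\frac{\beta-\sigma+1}{\sigma-1}\right)\right]^{\frac{\beta-\sigma+1}{\sigma-1}} \tag{6}$$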

such that Π (γ) ≥ 0 (and hence the firm operates) if and only if γ ≤ γD.
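The closed forms (5) and (6) can be checked numerically. The following minimal Python sketch assumes hypothetical parameter values and treats the aggregate N/P^{1−σ} as a given constant `A`, whereas in equilibrium it is pinned down by the price index and free entry. It verifies that the productivity in (5) maximizes total profit (3) and that total profit is zero at the cutoff (6).

```python
# Hypothetical parameter values; A stands in for N / P^(1 - sigma), taken as given.
A, sigma, beta, kappa_D = 10.0, 4.0, 5.0, 1.0
theta = beta - sigma + 1            # must be positive (second-order condition)
c = ((sigma - 1) / sigma) ** sigma  # the constant ((sigma - 1) / sigma)^sigma

def total_profit(phi, gamma):
    """Total profit (3): operating profit minus the innovation cost gamma * phi^beta."""
    operating = A * c / (sigma - 1) * phi ** (sigma - 1) - kappa_D
    return operating - gamma * phi ** beta

def phi_star(gamma):
    """Optimal productivity, equation (5)."""
    return (A * c / beta) ** (1 / theta) * gamma ** (-1 / theta)

def gamma_cutoff():
    """Zero-profit cutoff, equation (6)."""
    const = (A ** (beta / theta)
             * ((sigma - 1) / sigma) ** (beta * sigma / theta)
             * beta ** (-beta / theta)
             * theta / (sigma - 1))
    return (const / kappa_D) ** (theta / (sigma - 1))

gamma = 0.2
best = total_profit(phi_star(gamma), gamma)
nearby = [total_profit(m * phi_star(gamma), gamma) for m in (0.9, 0.99, 1.01, 1.1)]
g_D = gamma_cutoff()
profit_at_cutoff = total_profit(phi_star(g_D), g_D)
print(best, nearby, g_D, profit_at_cutoff)
```

With these assumed values β = 5 > σ − 1 = 3, so the second-order condition holds and the stationary point in (5) is a maximum.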

An equilibrium is defined by (5), (6), the price index $P^{1-\sigma} = M_e \int_0^{\gamma_D} \left[\frac{\sigma-1}{\sigma}\phi(\gamma)\right]^{\sigma-1} dF(\gamma)$,

where Me denotes the mass of entrants paying the entry cost, and the free-entry condition

(4). For the free-entry condition to hold, the expected profit E (Π) must be finite; we will

specify what condition guarantees this shortly.

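The firm size is characterized by the sales revenue $s \equiv pq$. Using $q(\phi) = \frac{N}{P^{1-\sigma}} \left(\frac{\sigma-1}{\sigma}\right)^{\sigma} \phi^{\sigma}$ and (5), firm size as a function of $\gamma$ is therefore

$$s(\gamma) \equiv \left[\frac{N}{P^{1-\sigma}} \left(\frac{\sigma-1}{\sigma}\right)^{\sigma}\right]^{\frac{\beta}{\beta-\sigma+1}} \beta^{-\frac{\sigma-1}{\beta-\sigma+1}} \frac{\sigma}{\sigma-1} \gamma^{-\frac{\sigma-1}{\beta-\sigma+1}}. \tag{7}$$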

Let $G$ and $G_s$ be the cumulative distribution functions of productivity ϕ and firm size s,


respectively; let $g$ and $g_s$ denote the corresponding density functions. Obviously, a distribution exhibiting a power law with degree ζ is equivalent to its density following a power

function with exponent −ζ − 1. It is oftentimes more convenient to work with the equiv-

alent definition in terms of density. By applying change of variable, the density functions

of productivity and firm size are given as follows:

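$$g(\phi) = \frac{f(\phi^{-1}(\phi))}{F(\gamma_D)} \frac{N}{P^{1-\sigma}} \frac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma} (\beta-\sigma+1)}{\beta} \phi^{-(\beta-\sigma+1)-1},$$

$$g_s(s) = \frac{f(s^{-1}(s))}{F(\gamma_D)} \left(\frac{N}{P^{1-\sigma}}\right)^{\frac{\beta}{\sigma-1}} \frac{\beta-\sigma+1}{\beta\sigma} \left(\frac{\sigma-1}{\sigma}\right)^{\beta} s^{-\frac{\beta-\sigma+1}{\sigma-1}-1}.$$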

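For the above distributions to exhibit power laws, $\lim_{\phi\to\infty} \frac{g(\phi)}{\phi^{-(\beta-\sigma+1)-1}}$ and $\lim_{s\to\infty} \frac{g_s(s)}{s^{-\frac{\beta-\sigma+1}{\sigma-1}-1}}$ need to be constants (invariant in $\phi$ and $s$). Observe from (5) and (7) that both $\phi$ and $s$ become arbitrarily large as $\gamma$ approaches 0. Hence, the above distributions exhibit power laws if $\lim_{\gamma\to 0} f(\gamma) = K > 0$:

$$\frac{g(\phi)}{\phi^{-(\beta-\sigma+1)-1}} \approx \frac{K}{F(\gamma_D)} \frac{N}{P^{1-\sigma}} \frac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta} (\beta-\sigma+1),$$

$$\frac{g_s(s)}{s^{-\frac{\beta-\sigma+1}{\sigma-1}-1}} \approx \frac{K}{F(\gamma_D)} \left(\frac{N}{P^{1-\sigma}}\right)^{\frac{\beta}{\sigma-1}} \frac{\beta-\sigma+1}{\beta\sigma} \left(\frac{\sigma-1}{\sigma}\right)^{\beta}.$$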

In other words, power laws emerge if the density of γ has a finite and positive limit at

zero. If γ is uniformly distributed, then the above distributions are both Pareto, a special

case of power-law distributions. Note that the expected profit $E(\Pi)$ is finite if and only if $\int_0^{\gamma_D} \gamma^{-\frac{\sigma-1}{\beta-\sigma+1}} f(\gamma)\, d\gamma < \infty$. It is readily verified that the condition $\lim_{\gamma\to 0} f(\gamma) = K > 0$ ensures that the expected profit is finite if $\beta > 2(\sigma-1)$.
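The uniform case can be illustrated by simulation. The sketch below is a minimal Python example under assumed values: the parameters, the cutoff `gamma_D`, the normalization `scale` of the bracketed constant in (5), the sample size, and the use of the Hill estimator are all choices made here for illustration and are not taken from the paper. It draws γ uniformly on (0, γ_D], maps it to productivity and sales, and estimates the two tail indices, which should be close to β − σ + 1 and (β − σ + 1)/(σ − 1).

```python
import math
import random

def hill_estimator(sample, k):
    """Hill estimate of the tail index based on the k largest observations."""
    top = sorted(sample, reverse=True)[: k + 1]
    threshold = top[k]
    return k / sum(math.log(x / threshold) for x in top[:k])

random.seed(0)
sigma, beta = 4.0, 5.0          # hypothetical values with beta > 2 * (sigma - 1) not required here
theta = beta - sigma + 1        # implied tail index of productivity
scale = 1.0                     # normalizes the bracketed constant in (5) (assumption)
gamma_D = 0.5                   # hypothetical cutoff
n, k = 200_000, 2_000

# Uniform draws on (0, gamma_D]; 1 - random() lies in (0, 1], which avoids gamma = 0.
gammas = [gamma_D * (1.0 - random.random()) for _ in range(n)]
phi = [scale * g ** (-1.0 / theta) for g in gammas]   # productivity, as in (5)
size = [p ** (sigma - 1) for p in phi]                # sales are proportional to phi^(sigma - 1)

hill_phi = hill_estimator(phi, k)
hill_size = hill_estimator(size, k)
print(hill_phi, theta, hill_size, theta / (sigma - 1))
```

Because sales are a power of productivity, the estimated tail index of sales equals that of productivity divided by σ − 1, mirroring the exponents in the two densities.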

The mechanism above is referred to as “power law change of variable close to the

origin”: if the density of a random variable x has a finite and positive limit at the origin,

and the interested variable y is related to x in a reciprocal way, then y becomes arbitrarily

large as x goes to 0 and the distribution of y exhibits a power law tail.6 Since productivity

ϕ is related to γ in a reciprocal way given by (5), the condition $\lim_{\gamma\to 0} f(\gamma) = K > 0$ entails

6 This technique has already been used in physics; see Jan et al. (1999), Sornette (2002), and Newman (2005). The name of the technique was given by Sornette (2006, Section 14.2.1).


a power law in the productivity distribution.
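The change-of-variable argument does not require a uniform density. As a minimal Python sketch, consider the hypothetical density f(γ) = 2(1 − γ) on (0, 1], which is not uniform but has the finite positive limit K = 2 at the origin. The density, the value of `theta`, the normalization `scale`, and the absence of selection (γ_D = 1) are assumptions made here for illustration. The tail probability of ϕ multiplied by ϕ^θ then converges to the constant K · scale^θ.

```python
theta = 2.0   # stands for beta - sigma + 1 (hypothetical value)
scale = 1.0   # normalizes the bracketed constant in (5) (assumption)
K = 2.0       # limit of the assumed density f(gamma) = 2 * (1 - gamma) at gamma = 0

def cdf_gamma(g):
    """C.d.f. of the assumed density 2 * (1 - g) on (0, 1]: F(g) = 2g - g^2."""
    g = min(max(g, 0.0), 1.0)
    return 2.0 * g - g * g

def tail_prob_phi(x):
    """Pr[phi >= x] = F((scale / x)^theta), because phi = scale * gamma^(-1/theta)."""
    return cdf_gamma((scale / x) ** theta)

xs = (2.0, 10.0, 100.0, 1000.0)
ratios = [tail_prob_phi(x) * x ** theta for x in xs]
limit = K * scale ** theta
print(ratios, limit)
```

The ratios approach the limit from below as x grows, so only the behavior of f near zero matters for the tail, as the text states.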

The above simple example illustrates how the addition of the innovation stage to the

Melitz model and a weak restriction on firm heterogeneity f can give rise to both power

laws in productivity and firm size. As mentioned, one naturally wonders how much the

result depends on the power-function assumptions on demand D and innovation cost V ,

and what if the density f tends to infinity or zero when γ → 0. We will show that the

conditions on D, V , and f can all be substantially generalized using regular variation.

2.2.2 Preliminaries: Regularly and smoothly varying functions

We first provide some preliminaries on regular variation. A function v (x) is regularly

varying if for some $\zeta \in \mathbb{R}$ and for all $t > 0$

$$\lim_{x\to\infty} \frac{v(tx)}{v(x)} = t^{\zeta}.$$

This implies that one can write $v(x) = x^{\zeta} l(x)$, where $l(x)$ is referred to as a slowly

varying function, i.e., a regularly varying function with ζ = 0. The definition of a smoothly

varying function is as follows (see e.g. Bingham et al. 1989).

Definition 1. A positive function v defined on some neighbourhood of infinity varies

smoothly with index $\zeta \in \mathbb{R}$ if for all $n \ge 1$

$$\lim_{x\to\infty} \frac{x^{n} v^{(n)}(x)}{v(x)} = \zeta(\zeta-1)\cdots(\zeta-n+1), \tag{8}$$

where $v^{(n)}(x)$ denotes the $n$-th derivative of $v(x)$.

Loosely speaking, a smoothly varying function is a regularly varying function that

does not oscillate too much. More importantly, any regularly varying function can be

approximated by a smoothly varying function asymptotically (Theorem 1.8.2 of Bingham

et al. 1989). Since we are concerned with the tail behavior of the productivity distribution

and operationally smoothly varying functions will be used, this theorem ensures that our

results also apply to regularly varying functions. Note that if l (x) is a smoothly and slowly


varying function, then Definition 1 implies that

$$\lim_{x\to\infty} \frac{x\, l'(x)}{l(x)} = \lim_{x\to\infty} \frac{x^{2}\, l''(x)}{l(x)} = 0. \tag{9}$$
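These definitions can be illustrated with a concrete function. The minimal Python sketch below uses v(x) = x^ζ ln x, an example chosen here and not taken from the paper, in which l(x) = ln x is slowly varying without converging to a constant. It evaluates the ratio v(tx)/v(x) against t^ζ and the two terms in (9) at increasingly large x.

```python
import math

zeta = -1.5   # hypothetical index of regular variation
t = 2.0

def l(x):
    """A slowly varying function (illustrative choice): l(x) = ln x."""
    return math.log(x)

def v(x):
    """A regularly varying function with index zeta: v(x) = x^zeta * l(x)."""
    return x ** zeta * l(x)

xs = [10.0 ** k for k in (2, 4, 8, 16, 32, 64)]

# Distance between v(tx)/v(x) and its limit t^zeta.
ratio_errors = [abs(v(t * x) / v(x) - t ** zeta) for x in xs]

# Terms in (9), using l'(x) = 1/x and l''(x) = -1/x^2 for l(x) = ln x.
first_terms = [x * (1.0 / x) / l(x) for x in xs]
second_terms = [abs(x ** 2 * (-1.0 / x ** 2) / l(x)) for x in xs]
print(ratio_errors, first_terms, second_terms)
```

Convergence is slow, at rate 1/ln x, which is typical of slowly varying corrections and is one reason the asymptotic statements concern the far tail only.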

We now formally state our assumption on the demand and innovation cost functions as

follows.

Assumption 1. The inverse demand function of each variety can be written as $p = D(q; A) \equiv q^{-\frac{1}{\sigma}} Q(q; A)$, where $\sigma > 1$ and $\lim_{q\to\infty} Q(q; A) = C_Q > 0$. The innovation cost function can be written as $k(\phi) = \gamma V(\phi) \equiv \gamma\phi^{\beta} L(\phi)$, where $\beta > 1$ and $\lim_{\phi\to\infty} L(\phi) = C_L > 0$.

Both Q and L are slowly varying functions because they have positive limits at infinity.

Assumption 1 thus implies that both the demand and the innovation cost functions are

regularly varying. As mentioned, we work with the smoothly varying representations of

these functions without loss of generality. Because, as we show shortly, there are one-to-one mappings at the tails between γ → 0 and ϕ → ∞ and between ϕ → ∞ and q → ∞, the requirement σ > 1 is needed to ensure that the demand is consistent with monopoly

pricing at these tails.

Assumption 1 essentially requires the demand to be asymptotically CES, but the ad-

missible class of demand is actually more general than it seems at first glance. Needless

to say, this includes the CES demand. As shown in Table 1, several important classes of

demand functions with variable demand elasticity also satisfy this assumption.7 For ex-

ample, Assumption 1 includes several demand classes that exhibit “manifold invariance”

(Mrázová and Neary 2017),8 including Bipower Direct demand, Bipower Inverse demand,

Pollak Family demand (Pollak 1971, which is equivalent to the HARA [Hyperbolic Ab-

solute Risk Aversion] preference [Merton 1971; Zhelobodko et al. 2012]), PIGL (Price-

Independent Generalized Linear) demand (Muellbauer 1975), QMOR (Quadratic Mean

7 The details are provided in the Online Appendix. See https://wthsu.weebly.com/.

8 A demand manifold depicts a relation between price elasticity and the curvature of the demand function, and the demand manifolds in these two classes are invariant to changes in general-equilibrium objects, making them powerful tools for inferring demand/welfare by micro-level information such as firm sales and markups.


https://wthsu.weebly.com/uploads/8/4/4/0/8440121/productivity_investment_10122019_onlineappendix.pdf

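| Demand class | Functional form | Inverse demand | Parameter restrictions |
|---|---|---|---|
| Bipower Direct | $q = a p^{-\nu} + a p^{-\sigma} \equiv q(p)$ | $p = q^{-\frac{1}{\sigma}} \left(a\,[q^{-1}(q)]^{\sigma-\nu} + a\right)^{\frac{1}{\sigma}}$ | $\sigma > 1$, $\sigma > \nu$, $a > 0$ |
| Pollak (HARA) | $q = a + a p^{-\sigma}$ | $p = q^{-\frac{1}{\sigma}}\, a^{\frac{1}{\sigma}} \left(1 - \frac{a}{q}\right)^{-\frac{1}{\sigma}}$ | $\sigma > 1$, $a > 0$ |
| PIGL | $q = a p^{-1} + a p^{-\sigma} \equiv q(p)$ | $p = q^{-\frac{1}{\sigma}} \left(a\,[q^{-1}(q)]^{\sigma-1} + a\right)^{\frac{1}{\sigma}}$ | $\sigma > 1$, $a > 0$ |
| QMOR | $q = a p^{r-1} + a p^{\frac{r}{2}-1} \equiv q(p)$ | $p = q^{\frac{1}{r-1}} \left(a + a\,[q^{-1}(q)]^{-\frac{r}{2}}\right)^{\frac{1}{1-r}}$ | $\sigma \equiv 1 - r > 1$, $a > 0$ |
| Bipower Inverse | $p = a q^{-\nu} + a q^{-\frac{1}{\sigma}}$ | $p = q^{-\frac{1}{\sigma}} \left(a q^{\frac{1}{\sigma}-\nu} + a\right)$ | $\sigma > 1$, $\nu > 1/\sigma$, $a > 0$ |
| CEMR (Inverse PIGL) | $p = a q^{-1} + a q^{-\frac{1}{\sigma}}$ | $p = q^{-\frac{1}{\sigma}} \left(a q^{\frac{1-\sigma}{\sigma}} + a\right)$ | $\sigma > 1$, $a > 0$ |
| CREMR | $p = \frac{a}{q} (q - a)^{\frac{\sigma-1}{\sigma}}$ | $p = q^{-\frac{1}{\sigma}}\, a \left(1 - \frac{a}{q}\right)^{\frac{\sigma-1}{\sigma}}$ | $\sigma > 1$, $a > 0$, $q > a\sigma$ |

Table 1: Examples of demands satisfying Assumption 1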

of Order r) expenditure function (Diewert 1976; Feenstra 2018), and CEMR (Constant

Elasticity of Marginal Revenue) demand. It also includes CREMR (Constant Revenue

Elasticity of Marginal Revenue) demand (Mrázová, Neary, and Parenti 2017).9 10
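The asymptotically CES property in Assumption 1 can be checked numerically for some of the rows in Table 1. The minimal Python sketch below computes Q(q) = p(q) · q^{1/σ} at a large q for the Bipower Inverse, Pollak, and CREMR forms. The coefficient names `a1` and `a2`, which keep the two coefficients of a row distinct, and all numerical values are assumptions introduced here for illustration.

```python
sigma = 3.0  # hypothetical value, sigma > 1

def bipower_inverse(q, a1=0.5, a2=2.0, nu=1.0):
    """p = a1 * q^(-nu) + a2 * q^(-1/sigma), with nu > 1/sigma."""
    return a1 * q ** (-nu) + a2 * q ** (-1.0 / sigma)

def pollak(q, a1=0.5, a2=2.0):
    """Inverse of q = a1 + a2 * p^(-sigma)."""
    return ((q - a1) / a2) ** (-1.0 / sigma)

def cremr(q, a=1.5):
    """p = (a / q) * (q - a)^((sigma - 1) / sigma), defined for q > a * sigma."""
    return (a / q) * (q - a) ** ((sigma - 1.0) / sigma)

def Q(inverse_demand, q):
    """The slowly varying part in Assumption 1: Q(q) = p(q) * q^(1/sigma)."""
    return inverse_demand(q) * q ** (1.0 / sigma)

# Limits C_Q implied by the functional forms and the assumed coefficients.
cases = [
    (bipower_inverse, 2.0),
    (pollak, 2.0 ** (1.0 / sigma)),
    (cremr, 1.5),
]
large_q = 1e12
gaps = [abs(Q(demand, large_q) - limit) for demand, limit in cases]
print(gaps)
```

In each case Q(q) settles at a positive constant, so the demand behaves like a CES demand with elasticity σ for large quantities even though its elasticity varies elsewhere.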

The assumption on the innovation cost function parallels that on the inverse demand

function. Obviously, simple power functions are included, but general polynomial functions are also included.

9 Mrázová, Neary, and Parenti (2017) have shown that CREMR is the only consistent demand class in a monopolistic competitive framework when both the productivity and sales distributions are required to be “general power functions”. As will be shown shortly, Assumption 1 leads to power laws for both productivity and sales distribution. Nevertheless, it is worth noting that distributions with power-law tails are not necessarily “general power functions”, whereas general power functions do not necessarily exhibit power laws in their tails. Thus, neither our framework nor Mrázová, Neary, and Parenti’s (2017) is a subset of the other.

10 Note that the CARA (Constant Absolute Risk Aversion) demand is excluded because its price elasticity tends to 0 when q goes to infinity, inconsistent with the requirement that σ > 1. To see this, observe that the CARA demand can be written as q = a − b ln p, where a > 0, b > 0. Its price elasticity equals b/q. Linear demand is also excluded because q is a finite value when p = 0. Put differently, the linear demand is inconsistent with power laws as it never generates unbounded firm sales.


2.2.3 Equilibrium quantity and productivity

Note that one key step in the illustrative example involves inverting (5). Once the inverse demand function D and the innovation cost function V are generalized, the issue is whether a unique solution for quantity given productivity, and one for productivity given firm type, exist, so that q (ϕ) and ϕ (γ) are well defined. By

dealing with these choice problems in the limit with regular variation, we now show that

these solutions are indeed unique at least for firms with small γ.

For any given ϕ, the first- and second-order conditions for an interior solution q from

(1) in the production stage are

$$p'q + p - \phi^{-1} = 0 \tag{10}$$

$$p''q + 2p' < 0. \tag{11}$$

With the law of demand, these imply that $|\varepsilon(q)| \equiv -p/(qp') > 1$ and $\mu(q) \equiv -(p''q)/p' < 2$. Namely, at the interior solution q, the demand elasticity needs to be greater than 1 so as

to be consistent with monopoly pricing, and the convexity of the demand curve needs to be

sufficiently small in order to satisfy the second-order condition.

Note that Assumption 1 only regulates the inverse demand p = D (q;A) for large

values of q. As there is no guarantee that the profit function will be strictly concave or

quasi-concave in the entire domain of q, there may exist corner solutions to the profit-

maximization problem or multiple solutions satisfying (10) and (11). As we are concerned

with the right tail of the firm size distribution (in terms of sales revenue s = D (q;A) q),

what is relevant is large values of q. This is because, by Assumption 1, $\lim_{q\to\infty} s(q) = \lim_{q\to\infty} q^{1-\frac{1}{\sigma}} Q(q;A) = \infty$.

With Assumption 1, (10) and (11) can be written as

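$$\phi = q^{\frac{1}{\sigma}}\left[Q \times \left(1 - \frac{1}{\sigma} + q\frac{Q'}{Q}\right)\right]^{-1} \tag{12}$$

$$q^{-\frac{1}{\sigma}-1}Q\left[-\frac{1}{\sigma}\left(1 - \frac{1}{\sigma}\right) + 2\left(1 - \frac{1}{\sigma}\right)q\frac{Q'}{Q} + q^{2}\frac{Q''}{Q}\right] \equiv \pi_{qq}(q, \phi) < 0. \tag{13}$$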

The following assumption rules out the corner solution, and we have the following lemma.

Assumption 2. The inverse demand function D is such that the revenues around $\underline{q}$ remain finite. Namely, $\lim_{q\to\underline{q}} s(q) < \infty$.

Lemma 1. Suppose that Assumption 1 holds. For any ϕ that is sufficiently large, there exists a unique solution to (12), which is denoted as q∗(ϕ). Moreover, q∗(ϕ) strictly increases in ϕ and $\lim_{\phi\to\infty} q^*(\phi) = \infty$. If, in addition, Assumption 2 holds, then q∗(ϕ) is the unique profit-maximizing quantity and $\lim_{\phi\to\infty} \pi(\phi) = \infty$.

Proof. Applying (9), $qQ'/Q$ tends to zero and Q tends to a constant when q → ∞. For a firm with an arbitrarily large ϕ, there exists a large q that satisfies (12) because the term in the bracket tends to a constant. However, there is a possibility that this firm with an arbitrarily large ϕ might choose a finite q such that the term $Q \cdot \left(1 - \frac{1}{\sigma} + q\frac{Q'}{Q}\right)$ tends to zero. Nevertheless, plugging (12) into (1) entails $\pi(\phi) = q^{1-\frac{1}{\sigma}}Q\left(\frac{1}{\sigma} - q\frac{Q'}{Q}\right) - \kappa_D$. Assumption 1 and (9) imply that when q becomes arbitrarily large as ϕ becomes arbitrarily large, then the profit also becomes arbitrarily large. However, if a finite q is chosen, then because this q is such that either $\frac{1}{\sigma} - q\frac{Q'}{Q}$ tends to one or Q tends to zero, the resulting profit must be finite. Thus, q∗ is unique and $\lim_{\phi\to\infty} q^*(\phi) = \infty$. As a result, when ϕ (and hence q) becomes arbitrarily large, the second-order condition (13) is satisfied because of (9). Applying the implicit function theorem on (12) and noting that $\pi_{qq}(q^*, \phi) < 0$, we have

$$\frac{dq^*}{d\phi} = -\frac{\phi^{-2}}{\pi_{qq}(q^*, \phi)} > 0. \tag{14}$$

Finally, the only concern that q∗ is not the profit-maximizing quantity is that it might be dominated by a corner solution at $\underline{q}$.¹¹ For this concern to be valid, it requires that the profit tends to infinity as $q \to \underline{q}$. This, in turn, requires that $\underline{q}$ forms an asymptote of the demand curve so that $\lim_{q\to\underline{q}} s(q) = \infty$.¹² This possibility is ruled out by Assumption 2, and thus q∗ is the unique profit-maximizing quantity.

¹¹ Note that it is impossible for a profit-maximizing quantity to be a finite $q_0 > \underline{q} \ge 0$ because this would imply that $\lim_{q\to q_0} p(q) = \infty$, which violates the law of demand.
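The content of Lemma 1 can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes a hypothetical slowly varying part Q(q) = 1 + 1/(1 + q) (so that C_Q = 1) and a hypothetical σ = 4, and solves the first-order condition (12) by bisection. The names `Q`, `Q_prime`, `marginal_revenue`, and `q_star` are introduced here for illustration only.

```python
import math

SIGMA = 4.0  # hypothetical demand order (sigma > 1)


def Q(q):
    """Hypothetical slowly varying part of demand; Q(q) -> C_Q = 1 as q grows."""
    return 1.0 + 1.0 / (1.0 + q)


def Q_prime(q):
    """Derivative of the hypothetical Q."""
    return -1.0 / (1.0 + q) ** 2


def marginal_revenue(q):
    """p'q + p for p = q**(-1/sigma) * Q(q); condition (12) sets this to 1/phi."""
    bracket = 1.0 - 1.0 / SIGMA + q * Q_prime(q) / Q(q)
    return q ** (-1.0 / SIGMA) * Q(q) * bracket


def q_star(phi, lo=1.0, hi=1e30):
    """Solve marginal_revenue(q) = 1/phi by bisection in logs.

    Assumes marginal revenue is decreasing on [lo, hi] and brackets 1/phi,
    which holds for this Q when phi is between roughly 2 and 1e7.
    """
    target = 1.0 / phi
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if marginal_revenue(mid) > target:
            lo = mid  # marginal revenue exceeds marginal cost: optimum is larger
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    for phi in (10.0, 100.0, 1000.0, 10000.0):
        q = q_star(phi)
        print(phi, q, q / phi ** SIGMA)
```

Under these assumptions the solution rises with ϕ, and the ratio q∗/ϕ^σ approaches [C_Q(1 − 1/σ)]^σ, which is the CES-like behavior in the tail that the regular-variation argument relies on.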

In the innovation stage, a firm chooses ϕ to maximize its profit. With the envelope

theorem, the first-order and second-order conditions can be written as

$$\gamma = \frac{q^*(\phi)}{\phi^{2}V'(\phi)}. \tag{15}$$

$$-2\phi^{-3}q^*(\phi) + \phi^{-2}\frac{\partial q^*(\phi)}{\partial\phi} - \gamma V''(\phi) < 0. \tag{16}$$

The innovation cost function must be sufficiently convex so that (16) holds. Lemma 2

below shows that the second-order condition holds for large ϕ if and only if β > σ − 1.

It is intuitive that a firm endowed with a higher R&D ability (smaller γ) invests more

and obtains a higher productivity; as γ tends to 0 then the productivity tends to infinity.

The following lemma establishes this intuition.

Lemma 2. Suppose that Assumptions 1 and 2 hold, and that β > σ − 1. For those firms

with sufficiently small γ, the optimal choice of ϕ exists and is unique. Such an optimal

choice is denoted as ϕ∗ = ϕ (γ). Moreover, ϕ∗ is strictly decreasing in γ, and thus the

inverse function exists and is denoted as γ(ϕ), with $\lim_{\phi\to\infty} \gamma(\phi) = 0$.

Proof. Plugging (12) into (15), the first-order condition can be written as

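$$\gamma = \frac{\left[Q(\phi)\left(1 - \frac{1}{\sigma} + q^*(\phi)\frac{Q'(\phi)}{Q(\phi)}\right)\right]^{\sigma}}{L(\phi)\left[\beta + \phi\frac{L'(\phi)}{L(\phi)}\right]}\,\phi^{-(\beta-\sigma+1)}. \tag{17}$$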

Using (12), (13), (14), and (17), the left-hand side of (16) becomes

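$$-Q^{\sigma}\left(1 - \frac{1}{\sigma} + q\frac{Q'}{Q}\right)^{\sigma}\phi^{\sigma-3}\left[2 + \frac{1 - \frac{1}{\sigma} + q\frac{Q'}{Q}}{-\frac{1}{\sigma}\left(1 - \frac{1}{\sigma}\right) + 2\left(1 - \frac{1}{\sigma}\right)q\frac{Q'}{Q} + q^{2}\frac{Q''}{Q}} + \frac{\beta(\beta-1) + 2\beta\phi\frac{L'}{L} + \phi^{2}\frac{L''}{L}}{\beta + \phi\frac{L'}{L}}\right]. \tag{18}$$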

¹² The Pollak demand with σ > 0, a > 0, and a > 0 is such an example. Here, the demand requires that q > a, with s(q) being increasing (concave) in q when $q > \frac{\sigma}{\sigma-1}a$ ($q > \frac{2\sigma}{\sigma-1}a$). However, the optimal output degenerates to a for all ϕ because $\lim_{q\to a}\pi(q) = \lim_{q\to a}\left(s(q) - \phi^{-1}q\right) = \infty$.

Assumptions 1 and 2 imply that Lemma 1 holds. Assumption 1, Lemma 1, and (9) imply

that for large values of ϕ, (18) converges to $-C_Q^{\sigma}\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}(\beta - \sigma + 1)\lim_{\phi\to\infty}\phi^{\sigma-3}$, which is

strictly negative if and only if β > σ − 1. As a result, for a firm with an arbitrarily small γ

there exists a large ϕ, denoted as ϕ∗, satisfying (17) and the second-order condition (16).

However, there is a possibility that this firm with an arbitrarily small γ might choose

a finite ϕ such that either $Q(\phi)\left(1 - \frac{1}{\sigma} + q^*(\phi)\frac{Q'(\phi)}{Q(\phi)}\right)$ tends to 0 or $\beta + \phi\frac{L'(\phi)}{L(\phi)}$ tends to infinity so that (17) holds. Note that V′ > 0 would be violated if $\lim_{\phi\to\phi_0} L(\phi) = \infty$ for some finite $\phi_0$. With (12) and (17), (3) becomes

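$$\Pi = Q^{\sigma}\cdot\left(1 - \frac{1}{\sigma} + q^*\frac{Q'}{Q}\right)^{\sigma}\left[\frac{\frac{1}{\sigma} - q^*\frac{Q'}{Q}}{1 - \frac{1}{\sigma} + q^*\frac{Q'}{Q}} - \frac{1}{\beta + \phi\frac{L'}{L}}\right]\phi^{\sigma-1} - \kappa_D.$$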

This implies that if $Q(\phi)\left(1 - \frac{1}{\sigma} + q^*(\phi)\frac{Q'(\phi)}{Q(\phi)}\right)$ tends to 0 or $\beta + \phi\frac{L'(\phi)}{L(\phi)}$ tends to infinity

at some finite ϕ, then the profit is finite. In contrast, the profit becomes arbitrarily large for

an arbitrarily large ϕ. Thus, a finite ϕ would not be the solution to (17) when γ becomes

arbitrarily small, and hence ϕ∗ is the unique solution and is denoted as ϕ (γ).

For large values of ϕ, it is readily verified that (16) implies that the derivative of the

right-hand side of (15) is negative. Hence, ϕ′ (γ) < 0 and the inverse function γ (ϕ) is

well-defined. Obviously, $\lim_{\phi\to\infty}\gamma(\phi) = 0$.

As in Melitz (2003), the existence of a fixed cost of production κD > 0 gives rise to

firm selection. This means that a successful entrant must be capable enough to obtain a

high enough productivity to survive. As Π (ϕ (γ) ; γ) = π (ϕ (γ))− γV (ϕ (γ)), dΠ/dγ =

−V < 0 by the envelope theorem. Thus, any firm produces if and only if γ ≤ γD, where

γD is defined by $\Pi(\phi(\gamma_D), \gamma_D) = \pi(\phi(\gamma_D)) - \gamma_D V(\phi(\gamma_D)) = 0$.¹³

¹³ The following definition of γD implicitly assumes continuity of Π in γ. Note that smooth variation guarantees that all relevant functions are continuous for large values of q and ϕ and small values of γ. However, even when Π is discontinuous in some large values of γ, a cutoff γD can still be well-defined as long as Π strictly decreases in γ.

2.2.4 Power laws for productivity and firm size

Now we are ready to show how the power laws for productivity and firm size arise. Observe

that the p.d.f. of productivity is

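$$g(\phi) = \frac{f(\gamma(\phi))}{F(\gamma_D)}\,|J(\phi)|,$$

where J(ϕ) is the Jacobian, and (see Appendix A.2)

$$
\begin{aligned}
|J(\phi)| &= \left|\frac{\partial\gamma(\phi)}{\partial\phi}\right| = \left|\frac{\partial}{\partial\phi}\frac{q^*(\phi)}{\phi^{2}V'(\phi)}\right| \\
&= \frac{Q^{\sigma}}{L}\,\frac{\left(1 - \frac{1}{\sigma} + q^*\frac{Q'}{Q}\right)^{\sigma}}{\beta + \phi\frac{L'}{L}}\cdot\left[2 + \frac{\beta(\beta-1) + 2\beta\phi\frac{L'}{L} + \phi^{2}\frac{L''}{L}}{\beta + \phi\frac{L'}{L}} + \frac{1 - \frac{1}{\sigma} + q^*\frac{Q'}{Q}}{-\frac{1}{\sigma}\left(1 - \frac{1}{\sigma}\right) + 2\left(1 - \frac{1}{\sigma}\right)q^*\frac{Q'}{Q} + (q^*)^{2}\frac{Q''}{Q}}\right]\cdot\phi^{-(\beta-\sigma+1)-1}.
\end{aligned}
\tag{19}
$$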

Proposition 1 is our key result.

Proposition 1. Suppose that Assumptions 1 and 2 hold. Also suppose that $f(\gamma) = \gamma^{\alpha}m(\gamma)$, where α > −1 and $\lim_{\gamma\to 0} m(\gamma) = C_m$, and that $\beta > \frac{\alpha+2}{\alpha+1}(\sigma - 1)$. Then, in equilibrium both the productivity and firm size distributions exhibit power laws with tail indices $(\alpha+1)(\beta-\sigma+1)$ and $\frac{(\alpha+1)(\beta-\sigma+1)}{\sigma-1}$, respectively.

Proof. We sketch the proof as follows; for the detailed proof, see Appendix A.2. Note that α > −1 and $\beta > \frac{\alpha+2}{\alpha+1}(\sigma - 1)$ ensure that β > σ − 1; hence, with Assumptions 1 and 2, Lemmas 1-2 hold. For the free-entry condition to hold, the expected profit must be finite, i.e., $\int_0^{\gamma_D}\Pi(\phi(\gamma);\gamma)\,dF(\gamma) < \infty$. Whether this integral is finite depends on small γ (that is, the high-capability firms), and what matters is essentially the orders of demand, σ, innovation cost function, β, and the distribution of failure probability around γ = 0. We show in Appendix A.2 that this is ensured when α > −1 and $\beta > \frac{\alpha+2}{\alpha+1}(\sigma - 1)$. Intuitively, the innovation cost function must be sufficiently convex. Observe (19). First note that Assumption 1, Lemmas 1-2, and (9) imply that Q(q;A) and L(ϕ) converge to some constants $C_Q$ and $C_L$, respectively, and that $q^*Q'/Q$, $\phi L'/L$, $(q^*)^{2}Q''/Q$, and $\phi^{2}L''/L$ all go to zero. Thus, |J(ϕ)| converges to a power function of ϕ with exponent −(β − σ + 1) − 1 < 0. The assumption on f(γ(ϕ)) allows us to write $g(\phi) = \frac{\gamma(\phi)^{\alpha}m(\gamma(\phi))}{F(\gamma_D)}|J(\phi)|$, and (17) implies that $\gamma(\phi)^{\alpha}m(\gamma(\phi))$ converges to a power function of ϕ with exponent −α(β − σ + 1). Thus, the density g(ϕ) behaves like a power function with exponent −(α + 1)(β − σ + 1) − 1, and the productivity distribution exhibits a power law with a tail index (α + 1)(β − σ + 1). Following the same procedure, the firm size distribution also exhibits a power law with a tail index $\frac{(\alpha+1)(\beta-\sigma+1)}{\sigma-1}$.
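The tail indices in Proposition 1 can be checked in the simplest special case. The sketch below is an illustration under stated assumptions rather than the general proof: it takes the CES and power-cost case, in which ϕ is proportional to γ^(−1/(β−σ+1)) and sales are proportional to ϕ^(σ−1), sets all scale constants to 1, ignores the selection cutoff, and assumes F(γ) = γ^(α+1) on (0, 1]. The function names and parameter values are hypothetical.

```python
import math


def tail_indices(alpha, beta, sigma):
    """Tail indices of productivity and firm size stated in Proposition 1."""
    if alpha <= -1 or beta <= (alpha + 2) / (alpha + 1) * (sigma - 1):
        raise ValueError("parameter restrictions of Proposition 1 are violated")
    theta = beta - sigma + 1
    return (alpha + 1) * theta, (alpha + 1) * theta / (sigma - 1)


def survival_productivity(x, alpha, beta, sigma):
    """P(phi > x) when phi = gamma**(-1/theta) and F(gamma) = gamma**(alpha + 1) on (0, 1]."""
    theta = beta - sigma + 1
    gamma_cut = min(1.0, x ** (-theta))  # phi > x  <=>  gamma < x**(-theta)
    return gamma_cut ** (alpha + 1)


def survival_size(s, alpha, beta, sigma):
    """P(size > s) when sales are proportional to phi**(sigma - 1) (scale set to 1)."""
    return survival_productivity(s ** (1.0 / (sigma - 1)), alpha, beta, sigma)


def loglog_slope(survival, x1, x2):
    """Slope of log survival against log x between two points in the tail."""
    return (math.log(survival(x2)) - math.log(survival(x1))) / (math.log(x2) - math.log(x1))


if __name__ == "__main__":
    a, b, s = 0.5, 8.0, 4.0  # hypothetical parameter values
    print(tail_indices(a, b, s))
    print(loglog_slope(lambda x: survival_productivity(x, a, b, s), 2.0, 4.0))
    print(loglog_slope(lambda x: survival_size(x, a, b, s), 2.0, 4.0))
```

In this special case the log-log slopes of the two survival functions equal minus the corresponding tail indices exactly; in the general case of Proposition 1 the same slopes are obtained only in the limit.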

Proposition 1 establishes how power laws can emerge from a general environment in a

standard general-equilibrium model. As discussed in Section 2.2.2, regularly varying de-

mand includes a large class of non-CES and non-homothetic preferences. The requirement

for the innovation cost function V to be regularly varying is also general as it includes all

polynomial functions that are increasing when ϕ goes to infinity and sufficiently convex so that $\beta > \frac{\alpha+2}{\alpha+1}(\sigma - 1)$.

The requirement that the distribution f(γ) be regularly varying around 0 with the slowly varying part converging to a constant is also more general than it seems. This includes many well-known, widely-used distributions such as Beta (which subsumes the uniform), Gamma, F, and Weibull.¹⁴ Table 2¹⁵ provides a list of examples in this class. Compared with Geerolf’s (2017) power-law result, Proposition 1 is more general, as his key condition is equivalent to $\lim_{\gamma\to 0} f(\gamma) = K > 0$, which is used in our illustrative example in Section 2.2.1 and is a special case here, i.e., α = 0.¹⁶

Proposition 1 connects with the empirical regularity in firm size and provides a microfoundation for assuming power-law distributions in the theoretical literature, e.g. the Pareto, Fréchet and Nigai’s (2017) two-piece distributions. We now turn to a general open economy to investigate whether and how power laws hold in that environment.

¹⁴ For the distributions that are defined on (0,∞), proper truncation to the right is needed as the distribution of γ is on (0, 1].

¹⁵ For 0 to be in the support of Generalized Pareto, we require µ ≤ 0 for ξ ≥ 0, and µ ≤ 0 ≤ µ − σ/ξ for ξ < 0. The parameters of other distributions are all positive.

¹⁶ It can be shown that if m(γ) is slowly varying around 0 (not necessarily converging to a constant), then both the productivity and firm size distributions are regularly varying. By applying Proposition 4.6 of Cooke, Nieboer, and Misiewicz (2014; pp. 53-54), the condition that $\beta > \frac{\alpha+2}{\alpha+1}(\sigma - 1)$ ensures that the expected profit is finite, and Proposition 1.5.7 from Bingham et al. (1989) ensures that m(γ(ϕ)) is slowly varying in ϕ to its right tail.

Distribution          f(γ) ∝
Beta                  $\gamma^{\alpha}(1-\gamma)^{b-1}$
Gamma                 $\gamma^{\alpha}e^{-\gamma b}$
F                     $\gamma^{\alpha}\left[b + 2(1+\alpha)\gamma\right]^{-\frac{2(1+\alpha)+b}{2}}$
Weibull               $\frac{1+\alpha}{b}\left(\frac{\gamma}{b}\right)^{\alpha}e^{-\left(\frac{\gamma}{b}\right)^{\alpha}}$
Kumaraswamy           $(1+\alpha)\,b\,\gamma^{\alpha}\left(1-\gamma^{1+\alpha}\right)^{b-1}$
Inverse Pareto        $b(1+\alpha)\gamma^{\alpha}$
Log-Logistic          $\left(\frac{1+\alpha}{b}\right)\left(\frac{\gamma}{b}\right)^{\alpha}\left[1+\left(\frac{\gamma}{b}\right)^{1+\alpha}\right]^{-2}$
Rayleigh              $\frac{\gamma}{a^{2}}e^{-\frac{\gamma^{2}}{2a^{2}}}$
Generalised Pareto    $\frac{1}{\sigma}\left(1+\xi\frac{\gamma-\mu}{\sigma}\right)^{-\left(1+\frac{1}{\xi}\right)}$

Table 2: Examples of distributions that are regularly varying around 0 with the slowly varying part converging to a constant

2.3 Power Laws in Open Economy

2.3.1 Model setup in open economy

There are n + 1 asymmetric countries with the asymmetry in possibly every aspect of the

model. Not only can all the trade cost, entry cost, and fixed cost of production parame-

ters vary across countries, but the inverse demand function Di, innovation cost function

ki, and the density function of failure probability fi can all be country-specific (and hence

σi, βi, αi can also be country-specific). Similar to the closed-economy case, Assump-

tions 1 and 2 are assumed to hold with CQ,i and CL,i also allowed to be country-specific.

The timing is the same as in the closed economy, and in the production stage each firm

can determine whether to export, and, if yes, the price and quantity of exported goods.

After paying the fixed cost of production $\kappa_{D,i}$, the profit of a firm located in country i obtained from selling to country j is $\pi_{ij}(\phi) = p_{ij}q_{ij} - \tau_{ij}w_i\phi^{-1}q_{ij} - \kappa_{ij}$, where $\tau_{ij} \ge 1$ denotes the variable trade cost, $\kappa_{ij}$ the fixed selling cost from i to j, and $w_i$ the wage in country i. A firm produces if and only if $\sum_j \pi_{ij}(\phi) \ge \kappa_{D,i}$.

2.3.2 Equilibrium and power laws for productivity and firm size

Given ϕ, the first-order condition for qij is similar to (12) and is given as follows:

$$\phi = w_i\tau_{ij}\,q_{ij}^{\frac{1}{\sigma_j}}\left[Q_j \times \left(1 - \frac{1}{\sigma_j} + q_{ij}\frac{Q_j'}{Q_j}\right)\right]^{-1}. \tag{20}$$

It is straightforward to see that Lemma 1 holds here. That is, we have $\lim_{\phi\to\infty} q_{ij}^*(\phi) = \infty$ and $\lim_{\phi\to\infty}\pi_{ij}(\phi) = \infty$. Note that when ϕ becomes arbitrarily large, the firm must sell to

every market j because the fixed selling cost κij is fixed while the gross profit also becomes

arbitrarily large. Observe that for a given γ, the first-order condition is

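$$\gamma = \frac{\sum_j I_{ij}\tau_{ij}q_{ij}^*(\phi)}{\phi^{2}V_i'(\phi)}, \tag{21}$$

where $I_{ij} \in \{0, 1\}$ is the indicator function that indicates whether the firm with γ at country i sells to country j. Combining (20) with (21), (21) becomes

$$\gamma = \frac{\sum_j I_{ij}\,\tau_{ij}^{1-\sigma_j}\,w_i^{-\sigma_j}\,Q_j^{\sigma_j}\cdot\left(1 - \frac{1}{\sigma_j} + q_{ij}^*\frac{Q_j'}{Q_j}\right)^{\sigma_j}\cdot\phi^{\sigma_j-\beta_i-1}}{L_i\cdot\left(\beta_i + \phi\frac{L_i'}{L_i}\right)}. \tag{22}$$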

Each component in the numerators of (22) is similar to those in the closed-economy case.

Thus, for an arbitrarily small γ, there exists a corresponding large ϕ such that (22) holds

with $I_{ij} = 1$ for all j. Following a similar procedure, it is also straightforward to verify that Lemma 2 also holds here. That is, if $\beta_i > \sigma_j - 1$ for all i and j, the optimal choice of ϕ exists and is unique for firms with sufficiently small γ. Denote this optimal choice as $\phi_i^* = \phi_i(\gamma)$; we have $\phi_i'(\gamma) < 0$ and $\lim_{\phi\to\infty}\gamma_i(\phi) = 0$.

Appendix A.3.1 shows that if $\alpha_i > -1$ and $\beta_i > \frac{\alpha_i+2}{\alpha_i+1}\left(\max_j\sigma_j - 1\right)$, then the expected profit of entrants in each country remains finite. Since we are concerned with the tail behavior of the productivity distribution, it suffices to focus on the right-most piece of the productivity distribution. The corresponding Jacobian is obtained by differentiating (21):

$$|J_i(\phi)| = \Biggl|\underbrace{\frac{\partial\gamma_i(\phi)}{\partial\phi}}_{-}\Biggr| = -\sum_{j=0}^{n}\frac{\partial}{\partial\phi}\frac{\tau_{ij}q_{ij}^*(\phi)}{\phi^{2}V_i'(\phi)}. \tag{23}$$

Obviously, each component of (23) is similar to (19), and converges to a power function of ϕ with exponent $-(\beta_i + 2 - \sigma_j)$. Following the same argument as for Proposition 1, Appendix A.3.2 shows that the productivity distribution exhibits a power law with the tail index $(\alpha_i + 1)(\beta_i + 1 - \max_j\sigma_j)$.

We now turn to the firm size distribution. Denote $s_{ij}$ as a firm’s sales from i to j, so that the firm size of the firms that export to all countries is $s \equiv \sum_{j=0}^{n}s_{ij}$. Noting that $\frac{\partial s}{\partial\phi} = \sum_{j=0}^{n}\frac{\partial s_{ij}}{\partial\phi} = \sum_{j=0}^{n}\frac{\partial s_{ij}}{\partial q_{ij}}\frac{\partial q_{ij}}{\partial\phi}$ and following a similar procedure to the proof of Proposition 1, Appendix A.3.3 shows that the firm size distribution also follows a power law with the tail index $\frac{(\alpha_i+1)(\beta_i+1-\max_j\sigma_j)}{\max_j\sigma_j-1}$. Thus, we have the following proposition.

Proposition 2. Suppose that Assumptions 1 and 2 hold. For all $i \in \{0, 1, 2, ..., n\}$, suppose that $f_i(\gamma) = \gamma^{\alpha_i}m_i(\gamma)$, where $\alpha_i > -1$ and $\lim_{\gamma\to 0}m_i(\gamma) = C_{m,i}$, and that $\beta_i > \frac{\alpha_i+2}{\alpha_i+1}\left(\max_j\sigma_j - 1\right)$. Then, the productivity distribution in each country i has a power law tail with a tail index of $(\alpha_i+1)(\beta_i+1-\max_j\sigma_j)$, and the distribution of firm size has a power law tail with a tail index of $\frac{(\alpha_i+1)(\beta_i+1-\max_j\sigma_j)}{\max_j\sigma_j-1}$.¹⁷

The tail indices of both the productivity and firm size distributions in each country i

are associated with its technology parameters αi and βi, as well as the largest σj among all

destination countries. As a larger σj generally implies a larger elasticity of substitution and

larger price elasticity, the destination with the largest σj entails the largest responsiveness

of firm sales to productivity changes. Thus, the destination with the largest σj plays the

dominant role in determining the tail indices of every source country. The same logic

applies analogously for the firm size distribution. Proposition 2 implies that opening up to trade causes the tails of both productivity and firm-size distributions in each country to (weakly) fatten.

¹⁷ Note that the statement about tail indices here resembles the well-known theorem that the tail index of a sum of independent Pareto random variables is the minimum of the tail indices of these random variables. However, the different components of (23) are not literally independent random variables.
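The role of the largest destination σ can be made concrete with a small helper that evaluates the tail indices stated in Proposition 2. This is only a sketch: the function name and the parameter values in the usage lines are hypothetical, and the closed economy is represented as the case in which the only destination is the home market.

```python
def open_economy_tail_indices(alpha_i, beta_i, sigmas):
    """Tail indices for source country i given the demand orders sigma_j of all destinations."""
    sigma_max = max(sigmas)
    if alpha_i <= -1 or beta_i <= (alpha_i + 2) / (alpha_i + 1) * (sigma_max - 1):
        raise ValueError("parameter restrictions of Proposition 2 are violated")
    productivity = (alpha_i + 1) * (beta_i + 1 - sigma_max)
    size = productivity / (sigma_max - 1)
    return productivity, size


if __name__ == "__main__":
    # Hypothetical values: home sigma is 3, two trading partners have sigma 4 and 3.5.
    print(open_economy_tail_indices(0.0, 7.0, [3.0]))            # home market only
    print(open_economy_tail_indices(0.0, 7.0, [3.0, 4.0, 3.5]))  # after opening to trade
```

With these illustrative numbers both indices fall once a destination with a larger σ is added, which is the sense in which the tails (weakly) fatten.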

In the trade model by di Giovanni, Levchenko, and Rancière (2011), productivity and

firm size distributions are assumed to be Pareto and their tail indices are exogenous and

thus not affected by trade. They show that trade may cause the empirical estimates of tail

indices to be lower than the true ones. However, Proposition 2 here implies a very different

message from theirs because the true tail indices in our model can be affected by trade.

3 The Effect of Trade on Productivity Distribution

This section analyzes the effects of trade on productivity distribution. For tractability, we

follow Melitz (2003) by assuming n + 1 symmetric countries and CES demand in this

and the next sections. In particular, for the welfare analysis in the next section, the CES

demand is needed to be comparable with the ACR formula. Also for tractability, we assume

a power function for the innovation cost: $k = \gamma\phi^{\beta}$. We allow the distribution of γ to be

general until Section 4.2 where we need to generate a Pareto productivity distribution for

comparison purposes. Given the functional-form assumptions on the inverse demand and

innovation cost, Assumption 1 is satisfied. The profit-maximizing solutions q∗(ϕ) and ϕ(γ) are then interior, unique, and given by the relevant first- and second-order conditions.

Thus, Assumption 2 is no longer needed.

To solve the model, we start with the production stage. Given their productivities, the

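profits of a non-exporting firm and an exporting one are $\Pi_D(\phi) = \pi_D(\phi) - \gamma\phi^{\beta}$ and $\Pi_X(\phi) = \pi_D(\phi) + n\pi_X(\phi) - \gamma\phi^{\beta}$, respectively, where the domestic market is denoted by subscript D and each foreign market is denoted by subscript X, and

$$\pi_D(\phi) = \frac{N}{P^{1-\sigma}}\,\frac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\sigma-1}\,\phi^{\sigma-1} - \kappa_D$$

$$\pi_X(\phi) = \tau^{1-\sigma}\,\frac{N}{P^{1-\sigma}}\,\frac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\sigma-1}\,\phi^{\sigma-1} - \kappa_X.$$

In the innovation stage, a firm decides its productivity level according to whether it serves the foreign market or not. The productivity level is given by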

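$$
\phi(\gamma) =
\begin{cases}
\left(\dfrac{N}{P^{1-\sigma}}\right)^{\frac{1}{\beta-\sigma+1}}\left[\dfrac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta}\right]^{\frac{1}{\beta-\sigma+1}}\gamma^{-\frac{1}{\beta-\sigma+1}} & \text{for non-exporting firms} \\[2ex]
\varphi\left(\dfrac{N}{P^{1-\sigma}}\right)^{\frac{1}{\beta-\sigma+1}}\left[\dfrac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta}\right]^{\frac{1}{\beta-\sigma+1}}\gamma^{-\frac{1}{\beta-\sigma+1}} & \text{for exporting firms}
\end{cases}
, \tag{24}
$$

where $\varphi \equiv \left(1 + n\tau^{1-\sigma}\right)^{\frac{1}{\beta-\sigma+1}}$. Since exporting decisions are made after the firm has invested in its productivity, the firm chooses a higher productivity level if it plans to export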

afterward. The ratio φ can thus be interpreted as the productivity advantages of the export-

ing firms versus the non-exporting ones. Since countries are symmetric, the argument in

Appendix A.3.1 implies that (24) is optimal if and only if β > σ − 1. The above leads to

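$$\Pi_D(\gamma) = \left(\frac{N\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta P^{1-\sigma}}\right)^{\frac{\beta}{\beta-\sigma+1}}\frac{\beta-\sigma+1}{\sigma-1}\,\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}} - \kappa_D \tag{25}$$

$$\Pi_X(\gamma) = \left(\frac{N\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta P^{1-\sigma}}\right)^{\frac{\beta}{\beta-\sigma+1}}\left[\frac{\beta\left(1+n\tau^{1-\sigma}\right)}{\sigma-1}\varphi^{\sigma-1} - \varphi^{\beta}\right]\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}} - \kappa_D - n\kappa_X. \tag{26}$$

Observe that the gross profits are proportional to $\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}}$. The cutoffs are given by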

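$$\gamma_D = \left[\frac{1}{\kappa_D}\left(\frac{N\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta P^{1-\sigma}}\right)^{\frac{\beta}{\beta-\sigma+1}}\frac{\beta-\sigma+1}{\sigma-1}\right]^{\frac{\beta-\sigma+1}{\sigma-1}} \tag{27}$$

$$\gamma_X = \left\{\frac{1}{n\kappa_X}\left(\frac{N\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta P^{1-\sigma}}\right)^{\frac{\beta}{\beta-\sigma+1}}\left[\frac{\beta\left(1+n\tau^{1-\sigma}\right)\varphi^{\sigma-1}}{\sigma-1} - \varphi^{\beta} - \frac{\beta-\sigma+1}{\sigma-1}\right]\right\}^{\frac{\beta-\sigma+1}{\sigma-1}}, \tag{28}$$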

such that ΠD (γ) ≥ 0 if and only if γ ≤ γD and ΠX (γ) ≥ ΠD (γ) if and only if γ ≤ γX .

From (27) and (28), we have

$$\delta \equiv \frac{\gamma_X}{\gamma_D} = \left(\frac{\kappa_D}{n\kappa_X}\right)^{\frac{\beta-\sigma+1}{\sigma-1}}\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}} - 1\right]^{\frac{\beta-\sigma+1}{\sigma-1}}. \tag{29}$$

Notice that, if γD ≤ γX, then all operating firms are exporters, which is counter-factual. Thus, similar to the literature, we consider only the case of γX < γD, i.e., δ < 1, which requires trade frictions κX or τ to be sufficiently large relative to the fixed cost of production κD.
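The step from (27) and (28) to (29) can be verified numerically. The sketch below evaluates the two cutoffs directly and compares their ratio with the closed form for δ. The function names and the parameter values in the usage lines are hypothetical and chosen only so that β > σ − 1 and δ < 1 hold.

```python
def productivity_advantage(sigma, beta, n, tau):
    """varphi in (24): productivity of an exporter relative to a non-exporter of the same type."""
    theta = beta - sigma + 1
    return (1 + n * tau ** (1 - sigma)) ** (1 / theta)


def cutoffs(N, P, sigma, beta, n, tau, kappa_D, kappa_X):
    """Selection and exporting cutoffs (gamma_D, gamma_X) from (27) and (28)."""
    theta = beta - sigma + 1
    varphi = productivity_advantage(sigma, beta, n, tau)
    base = (N * ((sigma - 1) / sigma) ** sigma / (beta * P ** (1 - sigma))) ** (beta / theta)
    gamma_D = (base / kappa_D * theta / (sigma - 1)) ** (theta / (sigma - 1))
    bracket = (beta * (1 + n * tau ** (1 - sigma)) * varphi ** (sigma - 1) / (sigma - 1)
               - varphi ** beta - theta / (sigma - 1))
    gamma_X = (base / (n * kappa_X) * bracket) ** (theta / (sigma - 1))
    return gamma_D, gamma_X


def cutoff_ratio(sigma, beta, n, tau, kappa_D, kappa_X):
    """delta = gamma_X / gamma_D from (29); it does not depend on N or P."""
    theta = beta - sigma + 1
    growth = (1 + n * tau ** (1 - sigma)) ** (beta / theta) - 1
    return (kappa_D / (n * kappa_X) * growth) ** (theta / (sigma - 1))


if __name__ == "__main__":
    params = dict(sigma=4.0, beta=8.0, n=3, tau=2.0, kappa_D=1.0, kappa_X=0.6)
    gamma_D, gamma_X = cutoffs(N=1.0, P=1.0, **params)
    print(gamma_X / gamma_D, cutoff_ratio(**params))
```

The two printed numbers coincide up to floating-point error because the bracket in (28) simplifies to (β − σ + 1)/(σ − 1) times φ^β − 1 once 1 + nτ^(1−σ) is written as φ^(β−σ+1).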

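The free-entry condition is $E(\Pi) = \int_0^{\gamma_X}\Pi_X(\gamma)\,dF(\gamma) + \int_{\gamma_X}^{\gamma_D}\Pi_D(\gamma)\,dF(\gamma) = \kappa_e$. An equilibrium is accordingly defined by (24), (27), (28), the free-entry condition and the price index:

$$
\begin{aligned}
P^{1-\sigma} &= M_e\left[\int_{\gamma_X}^{\gamma_D}\left(\frac{\sigma-1}{\sigma}\right)^{\sigma-1}\phi(\gamma)^{\sigma-1}dF(\gamma) + \int_0^{\gamma_X}\left(\frac{\sigma-1}{\sigma}\right)^{\sigma-1}\phi(\gamma)^{\sigma-1}dF(\gamma)\right] \\
&\quad + nM_e\int_0^{\gamma_X}\tau^{1-\sigma}\left(\frac{\sigma-1}{\sigma}\right)^{\sigma-1}\phi(\gamma)^{\sigma-1}dF(\gamma),
\end{aligned}
\tag{30}
$$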

where Me denotes the mass of entrants paying the entry cost. The price index is composed

of three terms. The first and second terms are associated with the prices charged by do-

mestic non-exporting and exporting firms, respectively. The third term is associated with

the imported goods. Note that by (24), there is a jump in the function ϕ (γ) at γX .

The following proposition establishes the unique existence of equilibrium.

Proposition 3. Suppose that $f(\gamma) = \gamma^{\alpha}m(\gamma)$, where α > −1 and m(γ) is slowly varying around the origin, $\beta > \frac{\alpha+2}{\alpha+1}(\sigma - 1)$, and δ < 1, where δ is defined by (29). Then, E(Π) is a strictly increasing function of γD. If $\kappa_e \in \left(0, E(\Pi)|_{\gamma_D=1}\right)$, then a unique equilibrium

exists, and there are both exporters and non-exporters in the economy. Also, the power

laws in productivity and firm size hold.

Proof. That the power laws hold follows immediately from Proposition 1. For the rest, see

Appendix A.4.

We explore how the iceberg cost τ affects the productivity distribution. The results

are summarized in the following proposition, and are illustrated in Figure 1. The proof is

relegated to Appendix A.5.

Proposition 4. Assume that the conditions of Proposition 3 hold. An increase in τ results in a higher γD and a lower γX. Productivity ϕ increases (decreases) for any non-exporting (exporting) firm which remains non-exporting (exporting) after the shock. Productivity decreases for any firm which switches from exporting to non-exporting after the shock.

Figure 1: The effect of increasing τ

To see the intuition behind how τ affects selection and exporting cutoffs, first note that

an increase in τ makes exporting more difficult so that firms must be more efficient in inno-

vation to become exporters. Therefore, the exporting cutoff γX decreases. Because fewer

exporters entails less import competition faced by the firms in the domestic market, the

selection of firms becomes more lenient and the surviving cutoff γD increases accordingly.

Rearranging (27), we have $P \propto \gamma_D^{1/\beta}$. Thus, a higher γD induced by a higher τ raises the price index, which reflects the fact that differentiated goods are more expensive in units of labor when trade frictions are larger. Because of weaker import competition, non-exporting firms that remain non-exporting have more incentive to acquire a higher productivity, as is evident from (24) and (25). For exporting firms that remain exporting, domestic profits also benefit from weaker import competition, but as their productivity advantage φ shrinks with larger trade frictions, their effective market sizes may shrink (see (24) and (26)). The latter force dominates the former, and hence their productivities actually fall. A lower γX implies that some firms switch from exporting to not exporting. For these firms, productivity decreases because of the loss of the foreign market.

In other words, trade liberalization increases (decreases) exporters’ (non-exporters’) innovation effort and productivity; it also makes more firms choose to export, but selection becomes tougher.

4 Welfare Gains from Trade

This section analyzes the properties of welfare gains from trade in our model and then carries out a corresponding quantitative analysis. As is standard, welfare in both our model and the ACR framework is measured by $W_j = w_jN_j/P_j$. We are concerned with the welfare gains from trade, d ln W/d ln τ. ACR show that under CES demand and certain macro restrictions, the welfare change from a small change in the trade cost, d ln τ, is given by $\frac{1}{\varepsilon}d\ln\lambda$, where λ is the expenditure share on domestic goods, and $\varepsilon = \partial\ln(X_{ij}/X_{jj})/\partial\ln\tau$ with i ≠ j is the trade elasticity. We refer to this result as the local ACR formula. If the trade elasticity is invariant in τ, then the welfare change from a large change in the trade cost can be expressed as $W'/W = (\lambda'/\lambda)^{1/\varepsilon}$, and this is referred to as the global ACR formula. As the trade elasticity depends on λ, the main message from ACR is that trade flows provide a sufficient statistic for the welfare gains from trade.

In ACR (2012), technology choice is also incorporated, and the choice is made simultaneously with production and sales, and thus the technology innovation is multiplicative in the overall fixed cost of production and exporting. Our model is different from the ACR framework because innovation occurs after entry and before production and sales. But, as will be shown shortly, it turns out that the welfare gains from trade in our model still follow the local ACR formula. However, the trade elasticity varies with τ, and hence the global ACR formula is not applicable.¹⁸

4.1 Welfare Formula and Trade Elasticity

Appendix A.4 shows that the free-entry condition can be written as

$$\kappa_D\gamma_D^{\frac{\sigma-1}{\beta-\sigma+1}}\left\{\Gamma_D + \left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}} - 1\right]\Gamma_X\right\} - \kappa_DF(\gamma_D) - n\kappa_XF(\gamma_X) = \kappa_e, \tag{31}$$

where $\Gamma_z \equiv \int_0^{\gamma_z}\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}}dF(\gamma)$ for z ∈ {D, X}. With (29), (31) determines equilibrium γD. Note that $\Gamma_z/F(\gamma_z)$ is proportional to the average productivity of firms in (0, γz). Therefore, Γz measures the contribution of the productivities in (0, γz) to the expected profit of an entrant. Let ηz denote the elasticity of Γz to the cutoff γz.

¹⁸ One can, of course, obtain the gains from trade under large changes in τ by integrating over the local formula.

The following proposition shows that, for any distribution of γ, the local ACR formula

holds with a variable trade elasticity. Under the symmetric country setting with wages

normalized to 1, the expenditure share on the product imported from a foreign country

equals (1 − λ)/n and the trade elasticity equals $\varepsilon = d\ln\left(\frac{1-\lambda}{n\lambda}\right)/d\ln\tau$.

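Proposition 5. Suppose that the conditions of Proposition 3 hold. For a general distribution of γ, F(.), the welfare gains from trade follow the local ACR formula:

$$\frac{d\ln W}{d\ln\tau} = \frac{1}{\varepsilon}\frac{d\ln\lambda}{d\ln\tau} = -(1-\lambda). \tag{32}$$

The trade elasticity is given by

$$
\begin{aligned}
\varepsilon &= (\sigma-1)\frac{\Gamma_D - \Gamma_X}{\Gamma_D + \left(\varphi^{\sigma-1}-1\right)\Gamma_X}\frac{d\ln\varphi}{d\ln\tau} + (1-\sigma) \\
&\quad + \frac{\Gamma_D}{\Gamma_D + \left(\varphi^{\sigma-1}-1\right)\Gamma_X}\,\eta_X\frac{d\ln\frac{\gamma_X}{\gamma_D}}{d\ln\tau} + \frac{\Gamma_D}{\Gamma_D + \left(\varphi^{\sigma-1}-1\right)\Gamma_X}\left(\eta_X - \eta_D\right)\beta(1-\lambda),
\end{aligned}
\tag{33}
$$

where the domestic expenditure share is given by

$$\lambda = \frac{\Gamma_D + \left(\varphi^{\sigma-1}-1\right)\Gamma_X}{\Gamma_D + \left[\left(1+n\tau^{1-\sigma}\right)\varphi^{\sigma-1}-1\right]\Gamma_X}.$$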

Proof. See Appendix A.6.

Note that the ACR local formula holds regardless of the distribution of γ or produc-

tivity. As shown in the ACR paper, the ACR formula holds for the Melitz model with an

(exogenous) Pareto productivity distribution under a general asymmetric-country setting.

Here we show that under a symmetric-country setting, the distributional form assumption

is dispensable.

It is readily verified with a numerical example that the trade elasticity is variable in

τ and depends on the distribution of γ. Section 3 shows that trade costs affect firm-level


productivities, as well as selection and exporting cutoffs. In particular, the productivity

schedule across firm types has a jump at the exporting cutoff γX , and trade cost affects

the productivities of exporters and non-exporters in opposite ways. Following the same

procedure in Appendix A.6, one can show that if κX = 0, in which case every surviving

firm is an exporter, the trade elasticity equals 1 − σ. That is, when all surviving firms are

exporters, the productivity schedule no longer has a jump, and trade costs affect firm-level

productivities in similar ways. As a result, the trade cost affects trade flows only through

the intensive margin in a multiplicative manner, and the trade elasticity becomes a constant.
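A numerical example of the kind mentioned above can be sketched as follows. The code assumes that γ is uniformly distributed, in which case $\Gamma_X/\Gamma_D = \delta^{1-(\sigma-1)/(\beta-\sigma+1)}$ and the domestic share depends on τ only through φ and δ, and it borrows the parameter values reported in Section 4.2 (σ implied by a markup of 1.3, β = 7.838, n = 3, κX/κD = 0.572). The trade elasticity is then approximated by a central finite difference of ln[(1 − λ)/(nλ)] in ln τ. The function names are hypothetical.

```python
import math

SIGMA = 1.3 / 0.3       # implied by a markup of 1.3: sigma / (sigma - 1) = 1.3
BETA = 7.838            # innovation-cost exponent reported in Section 4.2
N_FOREIGN = 3           # number of foreign countries n
KAPPA_RATIO = 0.572     # kappa_X / kappa_D reported in Section 4.2


def domestic_share(tau):
    """Domestic expenditure share lambda under uniformly distributed gamma."""
    theta = BETA - SIGMA + 1
    x = 1 + N_FOREIGN * tau ** (1 - SIGMA)
    phi_pow = x ** ((SIGMA - 1) / theta)  # varphi ** (sigma - 1)
    delta = ((x ** (BETA / theta) - 1) / (N_FOREIGN * KAPPA_RATIO)) ** (theta / (SIGMA - 1))
    r = delta ** (1 - (SIGMA - 1) / theta)  # Gamma_X / Gamma_D for uniform gamma
    return (1 + (phi_pow - 1) * r) / (1 + (x * phi_pow - 1) * r)


def trade_elasticity(tau, h=1e-5):
    """Central finite difference of ln((1 - lambda) / (n * lambda)) with respect to ln(tau)."""
    def log_relative_imports(t):
        lam = domestic_share(t)
        return math.log((1 - lam) / (N_FOREIGN * lam))

    up = log_relative_imports(tau * math.exp(h))
    down = log_relative_imports(tau * math.exp(-h))
    return (up - down) / (2 * h)


if __name__ == "__main__":
    for tau in (1.8, 2.097, 2.5):
        print(tau, trade_elasticity(tau))
```

Under these assumptions the elasticity is negative in this sign convention, differs across the three values of τ, and at τ = 2.097 is close in absolute value to the calibration target of 4.63 used in Section 4.2.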

4.2 Quantitative Analysis of Welfare Gains from Trade

In this subsection we conduct a quantitative analysis of welfare gains from trade. In par-

ticular, to assess the role of innovation quantitatively, we compare with the Melitz model

with an exogenous Pareto productivity distribution (henceforth MP), as both our model and

MP satisfy the (local) ACR formula but differ only in how the productivity distribution is

generated. Formally, the density function of the productivity distribution in the MP model

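is denoted as $g^{MP}(\phi) = \theta^{MP}\phi^{-\theta^{MP}-1}$, where $\theta^{MP} > \sigma - 1$ is the tail index. The trade elasticity in MP is $\varepsilon^{MP} = -\theta^{MP}$. Note that in general the price index can be written as $P^{1-\sigma} = P_D^{1-\sigma} + nP_X^{1-\sigma}$, where $P_D^{1-\sigma}$ and $P_X^{1-\sigma}$ are the components of $P^{1-\sigma}$ in which the goods are from domestic and foreign firms, respectively. Thus, $\lambda \equiv P_D^{1-\sigma}/P^{1-\sigma}$. In MP,

$$\lambda^{MP} = \left[1 + n\tau^{1-\sigma}\left(\frac{\phi_X}{\phi_D}\right)^{\sigma-\theta^{MP}-1}\right]^{-1} = \left[1 + n\tau^{-\theta^{MP}}\left(\frac{\kappa_X}{\kappa_D}\right)^{\frac{\sigma-\theta^{MP}-1}{\sigma-1}}\right]^{-1}. \tag{34}$$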

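As mentioned, $\varepsilon = d\ln\left(\frac{1-\lambda}{n\lambda}\right)/d\ln\tau$ under the symmetric country setting, and thus

$$\frac{d\ln W^{MP}}{d\ln\tau} = \frac{1}{\varepsilon^{MP}}\frac{d\ln\lambda^{MP}}{d\ln\tau} = -\left(1 - \lambda^{MP}\right).$$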

We now turn to our model, which is referred to as IN (innovation) from now on. To

single out the effect of innovation, we assume that γ is uniformly distributed so that the

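resulting productivity distribution is similar to the Pareto distribution with the tail index θ ≡ β − σ + 1, except that there is a jump at γX when κX > 0. The domestic expenditure share in our model is

$$\lambda = \frac{1 + \left[\varphi^{\sigma-1} - 1\right]\left(\frac{\gamma_X}{\gamma_D}\right)^{1-\frac{\sigma-1}{\theta}}}{1 + \left[\left(1+n\tau^{1-\sigma}\right)\varphi^{\sigma-1} - 1\right]\left(\frac{\gamma_X}{\gamma_D}\right)^{1-\frac{\sigma-1}{\theta}}}. \tag{35}$$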

To quantify the model, we calibrate the values of σ, β, n, τ , and κX/κD. We calibrate

these parameters from the viewpoint of the US in 2002. Feenstra and Weinstein (2017)

report that the median of markups in the US is 1.3. Taking this median as a representative

for our constant-markup model, σ ≈ 4.33. Under the uniform distribution of γ, δ (see

[29]) is the fraction of exporters among all (surviving) firms. As documented by Bernard,

Jensen, Redding, and Schott (2007), this fraction in the US in 2002 equals 0.18.

Denote domestic absorption and imports as DA and M . By definition, λ then equals

(DA−M) /DA. Using data from Penn World Table 9.0 (PWT 9.0), λ is 0.853 in 2002

for the US.19 To better fit our symmetric-country model, the number of countries, n + 1,

is computed as the ratio of the world GDP to that of the US. Also using PWT 9.0, this

number equals 4.41. We therefore set n = 3. We adopt the estimate of the trade elasticity

in Simonovska and Waugh (2014), which is 4.63.20 This implies that θMP = 4.63. We

calibrate β, κX/κD, and τ to match λ = 0.853, δ = 0.18, and ε = 4.63 using (29), (33),

and (35). The results are β = 7.838, κX/κD = 0.572, and τ = 2.097. This, in turn, implies

that the tail index θ = 4.51, which is rather close to θMP .

Given the calibrated parameters, we compute the local welfare gains for both IN and

MP models. We also compare the welfare gains by moving from autarky (τ → ∞) to the

current level of trade cost τ for both models. As the global ACR formula applies to the MP

model, WMP

WMPτ→∞

=(λMP

)− 1

θMP . The global ACR formula does not apply to our IN model,

19We also use the US’s Input-Output Table (obtained from OECD-IOT) as our alternative data set tocompute λ. We compute DA by subtracting the net exports from the total value added across industries. Withthis alternative data set, λ equals 0.862 and is similar to that computed with PWT 9.0.

20See their Table 7.


but by combining (27), (29) and (31), some algebraic manipulations yield

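$$
P^{\beta} = \frac{\kappa_e}{\kappa_D}\,\frac{\theta-\sigma+1}{\sigma-1}\,\kappa_D^{\frac{\theta}{\sigma-1}}\left[\frac{\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}}{\beta}\right]^{-\frac{\beta}{\sigma-1}}\left(\frac{\theta}{\sigma-1}\right)^{-\frac{\theta}{\sigma-1}} N^{-\frac{\beta}{\sigma-1}}\left(1+\delta n\frac{\kappa_X}{\kappa_D}\right)^{-1}.
$$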

And thus,

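$$
\frac{W}{W_{\tau\to\infty}} = \frac{P_{\tau\to\infty}}{P} = \left\{1 + n^{1-\frac{\theta}{\sigma-1}}\left(\frac{\kappa_X}{\kappa_D}\right)^{1-\frac{\theta}{\sigma-1}}\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\theta}} - 1\right]^{\frac{\theta}{\sigma-1}}\right\}^{\frac{1}{\beta}}.
$$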

From autarky to the calibrated τ, IN and MP entail welfare gains of 3.5% and 2.5%, respectively. Hence, the gains from trade in IN are 40% larger than those in MP. The welfare elasticities with respect to trade cost, d ln W/d ln τ, are −0.147 and −0.108 for IN and MP, respectively. This implies that for small changes in τ, the welfare gains in IN are 36.1% higher than those in MP; this is quite similar to the comparison with autarky.
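The reported gains can be approximately reproduced from the closed-form expression for W/W_{τ→∞} above. The sketch below is illustrative only: the IN formula uses the calibrated values from the text, while for the MP model it assumes that λ_MP can be backed out from the reported welfare elasticity as λ_MP = 1 − 0.108 (using |d ln W/d ln τ| = 1 − λ), since λ_MP itself is not reported. Function names are hypothetical.

```python
# Illustrative recomputation of gains from trade relative to autarky.

def gains_in(sigma, beta, n, tau, kx_over_kd):
    """W / W_autarky in the IN model with uniform gamma (theta = beta - sigma + 1)."""
    theta = beta - sigma + 1.0
    expo = theta / (sigma - 1.0)
    open_term = 1.0 + n * tau ** (1.0 - sigma)
    inner = (n * kx_over_kd) ** (1.0 - expo) * (open_term ** (beta / theta) - 1.0) ** expo
    return (1.0 + inner) ** (1.0 / beta)


def gains_mp(lam_mp, theta_mp):
    """Global ACR formula for the MP model: lambda^(-1 / theta)."""
    return lam_mp ** (-1.0 / theta_mp)


g_in = gains_in(sigma=4.33, beta=7.838, n=3, tau=2.097, kx_over_kd=0.572)
g_mp = gains_mp(lam_mp=1.0 - 0.108, theta_mp=4.63)   # lambda_MP is an inferred value
relative = (g_in - 1.0) / (g_mp - 1.0)               # ratio of net welfare gains
print(f"IN: {100 * (g_in - 1):.1f}%  MP: {100 * (g_mp - 1):.1f}%  ratio: {relative:.2f}")
```

Under these assumptions the two gains should come out near 3.5% and 2.5%, with a ratio of roughly 1.4.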

Figure 2 shows the welfare elasticity (in absolute value) under different values of τ

given the calibrated β and κX/κD. The result is plotted with the horizontal axis being 1/τ ,

of which the lower bound, 0, corresponds to autarky and the upper bound corresponds to

δ → 1 (recall that we restrict δ < 1). The welfare elasticity (actually given by 1 − λ as

in Proposition 5) increases when there is more trade openness in both IN and MP models.

The gains from trade (compared with autarky) are simply the area under the curve of welfare elasticity. It is clearly seen from Figure 2 that the IN model entails higher gains from trade than the MP model because λ < λMP at every value of τ. To see why this is the case, recall from Proposition 4 that a reduction in trade cost induces exporters to invest more and become more productive and non-exporters to invest less and become less productive. This means that the productivity advantage of exporters over non-exporters widens with trade liberalization at a larger rate than in the MP model. Thus, the rate of increase in 1 − λ (the expenditure share on imports) is larger in the IN model.


Figure 2: $|d\ln W / d\ln\tau|$

5 Conclusion

As highlighted by both Arkolakis et al. (2019) and Chaney (2018), the power law for pro-

ductivity or firm size is instrumental for the gravity equation. Also evidenced is the fact that

the performance of top firms is what matters the most for the aggregate economies (Gabaix

2011). Thus, understanding these power laws is of first-order importance. This paper has

demonstrated that with an innovation stage added to a standard general-equilibrium model

of trade, power laws for both productivity and firm size could emerge in a rather general

environment. The conditions on regularly varying demand, innovation cost, and density

are all more general than it may seem: the demand class includes various non-CES and

non-homothetic preferences; the class of innovation cost functions includes all polynomial

functions that are increasing when productivity goes to infinity and sufficiently convex;

the density of firm heterogeneity includes many well-known, widely-used distributions.

All of these results hold in a very general open economy in which all parameters can be

country-specific and all bilateral trade costs can be country-pair-specific.

Conditional on the same trade elasticity and values of the common parameters, quan-

titatively our model yields 40% higher welfare gains from trade than the Melitz-Pareto

model. This suggests the importance of incorporating innovation in a trade model because

innovation naturally reacts to changes in trade cost. The economics is fundamentally a


market-size effect that works differently for exporters and non-exporters.

As shown by the welfare analysis, welfare gains from trade critically depend on the tail indices of these power laws, which reflect how granular the economy is. In this model,

the tail indices depend on the price elasticities and how costly it is to conduct innovation.

Interestingly, trade plays an important role because the market with the largest competi-

tiveness (largest price elasticities) dominates and determines the tail index. This provides

an important angle to comprehend trade wars. For example, the Trump administration’s

sharp increase in tariffs against Chinese products, regardless of whether it benefits or hurts

the US or global economy, will certainly have a strong negative impact on the Chinese ag-

gregate economy and welfare because the US tends to be the largest and most competitive

market, and thus affects the top Chinese firms the most.

References

1. Acemoglu, D., and D. Cao (2015), “Innovation by Entrants and Incumbents,” Journal of Economic Theory, 157, 255-294.

2. Aghion, P., A. Bergeaud, M. Lequien, and M. J. Melitz (2018), “Exports and Innovation: Theory and Evidence,” Harvard University manuscript.

3. Arkolakis, C., A. Costinot, and A. Rodríguez-Clare (2012), “New Trade Models,

Same Old Gains?,” American Economic Review, 102, 94-130.

4. Arkolakis, C., A. Costinot, D. Donaldson, and A. Rodríguez-Clare (2019), “The

Elusive Pro-competitive Effects of Trade,” Review of Economic Studies, 86, 46–80.

5. Atkeson, A., and A. T. Burstein (2010), “Innovation, Firm Dynamics, and International Trade,” Journal of Political Economy, 118, 433-484.

6. Axtell, R. L. (2001), “Zipf Distribution of U.S. Firm Sizes,” Science, 293, 1818-

1820.

7. Bas, M., and I. Ledezma (2015), “Trade Liberalization and Heterogeneous Technol-

ogy Investments,” Review of International Economics, 23, 738-781.


8. Bernard, A. B., J. Eaton, J. B. Jensen, and S. Kortum (2003), “Plants and Productivity

in International Trade,” American Economic Review, 93, 1268-1290.

9. Bernard, A. B., J. B. Jensen, S. J. Redding, and P. K. Schott (2007), “Firms in International Trade,” Journal of Economic Perspectives, 21, 105-130.

10. Bingham, N. H., C. M. Goldie, and J. L. Teugels (1989), Regular Variation, Cam-

bridge University Press.

11. Bonfiglioli, A., R. Crinò, and G. Gancia (2018), “Betting on Exports: Trade and Endogenous Heterogeneity,” Economic Journal, 128, 612-651.

12. Bustos, P. (2011), “Multilateral Trade Liberalization, Exports and Technology Up-

grading: Evidence on the Impact of MERCOSUR on Argentinean Firms,” American

Economic Review, 101, 304-340.

13. Chaney, T. (2008), “Distorted Gravity: The Intensive and Extensive Margins of In-

ternational Trade,” American Economic Review, 98, 1707-1721.

14. Chaney, T. (2014), “The Network Structure of International Trade,” American Eco-

nomic Review, 104, 3600-3634.

15. Chaney, T. (2018), “The Gravity Equation in International Trade: An Explanation,”

Journal of Political Economy, 126, 150-177.

16. Cooke, R. M., D. Nieboer, and J. Misiewicz (2014), Fat-tailed Distributions: Data,

Diagnostics, and Dependence, John Wiley & Sons.

17. Costinot, A., and A. Rodríguez-Clare (2015), “Trade Theory with Numbers: Quantifying the Consequences of Globalization,” in E. Helpman, K. Rogoff, and G. Gopinath (Eds.), Handbook of International Economics (Vol. 4, pp. 197-261). North-Holland: Elsevier.

18. Di Giovanni, J., A. A. Levchenko, and R. Rancière (2011), “Power Laws in Firm Size

and Openness to Trade: Measurement and Implications,” Journal of International

Economics, 85, 42-52.

19. Diewert, W. E. (1976), “Exact and Superlative Index Numbers,” Journal of Econo-

metrics, 4, 115-145.


20. Eaton, J., and S. Kortum (2002), “Technology, Geography, and Trade,” Economet-

rica, 70, 1741-1779.

21. Feenstra, R. C. and D. E. Weinstein (2017), “Globalization, Markups, and US Wel-

fare,” Journal of Political Economy, 125, 1040-1074.

22. Feenstra, R. C. (2018), “Restoring the Product Variety and Pro-competitive Gains

from Trade with Heterogeneous Firms and Bounded Productivity,” Journal of Inter-

national Economics, forthcoming.

23. Fernandes, A. (2007), “Trade Policy, Trade Volumes and Plant-level Productivity

in Colombian Manufacturing Industries,” Journal of International Economics, 71,

52-71.

24. Gabaix, X. (2009), “Power Laws in Economics and Finance,” Annual Review of

Economics, 1, 255-294.

25. Gabaix, X. (2011), “The Granular Origins of Aggregate Fluctuations,” Econometrica, 79, 733-772.

26. Geerolf, F. (2017), “A Theory of Pareto Distributions,” UCLA manuscript.

27. Jan, N., L. Moseley, T. Ray, and D. Stauffer (1999), “Is the Fossil Record Indicative

of a Critical System,” Advances in Complex Systems, 2, 137-141.

28. Lileeva, A., and D. Trefler (2010), “Improved Access to Foreign Markets Raises

Plant-level Productivity... For Some Plants,” Quarterly Journal of Economics, 125,

1051-1099.

29. Luttmer, E. G. (2007), “Selection, Growth, and the Size Distribution of Firms,” Quarterly Journal of Economics, 122, 1103-1144.

30. Merton, R. C. (1971), “Optimum Consumption and Portfolio Rules in a Continuous-

Time Model,” Journal of Economic Theory, 3, 373-413.

31. Melitz, M. J. (2003), “The Impact of Trade on Intra-Industry Reallocations and Ag-

gregate Industry Productivity,” Econometrica, 71, 1695-1725.

32. Melitz, M. J. and G. I. Ottaviano (2008), “Market Size, Trade, and Productivity,”

Review of Economic Studies, 75, 295-316.


33. Melitz, M. J., and S. J. Redding (2015), “New Trade Models, New Welfare Implica-

tions,” American Economic Review, 105, 1105-1146.

34. Mrázová, M., and J. P. Neary (2017), “Not So Demanding: Demand Structure and

Firm Behavior,” American Economic Review, 107, 3835-3874.

35. Mrázová, M., J. P. Neary, and M. Parenti (2017), “Sales and Markup Dispersion:

Theory and Empirics,” CEPR Discussion Papers DP12044.

36. Muellbauer, J. (1975), “Aggregation, Income Distribution and Consumer Demand,”

Review of Economic Studies, 42, 525-543.

37. Newman, M. E. J. (2005), “Power Laws, Pareto Distributions and Zipf’s Law,” Con-

temporary Physics, 46, 323-351.

38. Nigai, S. (2017), “A Tale of Two Tails: Productivity Distribution and the Gains from

Trade,” Journal of International Economics, 104, 44-62.

39. Pavcnik, N. (2002), “Trade Liberalization, Exit and Productivity Improvement: Evi-

dence from Chilean Plants,” Review of Economic Studies, 69, 245-276.

40. Pollak, R. A. (1971), “Additive Utility Functions and Linear Engel Curves,” Review

of Economic Studies, 38, 401-414.

41. Simonovska, I., and M. E. Waugh (2014), “Trade Models, Trade Elasticities, and the

Gains from Trade,” NBER Working Paper No. 20495.

42. Sornette, D. (2002), “Mechanism for Powerlaws without Self-Organization,” Inter-

national Journal of Modern Physics C, 13, 133-136.

43. Sornette, D. (2006), Critical Phenomena in Natural Sciences - Chaos, Fractals, Self-Organization and Disorder, Springer Verlag.

44. Sutton, J. (1991), “The Analytical Framework II: Endogenous Sunk Costs,” Sunk

Costs and Market Structure, MIT Press, 45-82.

45. Yeaple, S. (2005), “A Simple Model of Firm Heterogeneity, International Trade, and

Wages,” Journal of International Economics, 65, 1-20.

46. Zhelobodko, E., S. Kokovin, M. Parenti, and J.-F. Thisse (2012), “Monopolistic Competition in General Equilibrium: Beyond the Constant Elasticity of Substitution,” Econometrica, 80, 2765-2784.

A Appendix

A.1 Microfoundation for Innovation Cost Function (2)

Each entrant can determine its productivity level by engaging in R&D activities in the

following manner. The production process involves a continuum of procedures, and the

entrant can choose the size of the continuum, k. How well the firm can perform in each

procedure (which we term the quality of the procedure) depends on the outcome of a se-

quence of experiments that the firm conducts. For each procedure, every firm is endowed

with one quality unit to begin with. When the first experiment is successful, then the firm

obtains one additional quality unit for this procedure, and can continue to conduct the sec-

ond experiment. Recursively, every successful experiment results in one additional quality

unit and the chance to conduct the next experiment. But if the experiment fails, no more

experiments will be performed and the quality of the procedure is finalized. Firms differ

in their probabilities of failure, γ ∈ (0, 1]. The probability of obtaining quality y = 1, 2, ... for a procedure is therefore $(1-\gamma)^{y-1}\gamma$, i.e., y is geometrically distributed. The process is illustrated in Figure 3.

Each procedure requires a worker, say a research assistant, to perform the experiments.

Therefore, the mass of research assistants employed by the firm equals the size of the

continuum of procedures, k. The productivity ϕ is a function of the total quality of all k

procedures, kE (y):

$$
\varphi \equiv B\left(kE(y)\right) = B\left(k\sum_{y=1}^{\infty}(1-\gamma)^{y-1}\gamma\,y\right) = B\left(\frac{k}{\gamma}\right).
$$

Figure 3: A sequence of Bernoulli trials

The function B(·) is strictly increasing and concave on R+, and $\lim_{k\to\infty} B(k/\gamma) = \infty$. The concavity of B(·) reflects the management burden for the firm to manage these research assistants. Inverting the above equation yields (2), where $V \equiv B^{-1}$ is strictly increasing and convex in ϕ and $\lim_{\varphi\to\infty} V(\varphi) = \infty$.
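The microfoundation above can be illustrated numerically. The sketch below checks that the expected quality of a procedure is 1/γ and inverts a hypothetical concave technology B(x) = √x (an assumption chosen only for illustration; the paper requires only that B be strictly increasing and concave) to recover the research staff k = γV(ϕ) with V(ϕ) = ϕ².

```python
# Illustrative check of E(y) = 1 / gamma and of the inversion k = gamma * V(phi).
# B(x) = sqrt(x) is a hypothetical example of a concave technology.
import math


def expected_quality(gamma, terms=5000):
    """Truncated sum of y * (1 - gamma)**(y - 1) * gamma over y = 1..terms."""
    return sum(y * (1.0 - gamma) ** (y - 1) * gamma for y in range(1, terms + 1))


def productivity(k, gamma):
    """phi = B(k / gamma) with the hypothetical B(x) = sqrt(x)."""
    return math.sqrt(k / gamma)


def research_staff(phi, gamma):
    """Inverse relation k = gamma * V(phi), where V = B^{-1}, i.e. V(phi) = phi**2."""
    return gamma * phi ** 2


print(expected_quality(0.25))                 # close to 1 / 0.25
print(productivity(k=2.0, gamma=0.5))         # B(4) under the assumed B
print(research_staff(phi=2.0, gamma=0.5))     # recovers k
```

A lower failure probability γ raises the expected quality per procedure, so the same productivity is reached with fewer research assistants, which is why γ enters the innovation cost multiplicatively.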

A.2 Proof of Proposition 1

We show this proposition in the following three steps. In the first step, we show that $\beta > \frac{\alpha+2}{\alpha+1}(\sigma-1)$ must hold for the expected profit to be finite, which ensures the existence of equilibrium. In the second and third steps we show that both productivity and firm size distributions exhibit power laws.

Step 1: We require $\int_0^{\gamma_D}\Pi(\varphi(\gamma);\gamma)\,dF(\gamma) < \infty$ for the free-entry condition to be well-defined. Since Π(ϕ(γ); γ) is finite for all γ > 0, the expected profit can explode only when γ is close to 0. Note that using (12) and (17) we can write

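$$
\Pi(\varphi(\gamma);\gamma) = \left(\frac{1}{L}\,\frac{1}{\beta+\varphi\frac{L'}{L}}\right)^{\frac{\sigma-1}{\beta-\sigma+1}}\left(1-\frac{1}{\sigma}+q^{*}\frac{Q'}{Q}\right)^{\frac{(1+\beta)(\sigma-1)}{\beta-\sigma+1}} Q^{\frac{\beta\sigma}{\beta-\sigma+1}}\left(\frac{1}{\sigma}-q^{*}\frac{Q'}{Q}-\frac{1-\frac{1}{\sigma}+q^{*}\frac{Q'}{Q}}{\beta+\varphi\frac{L'}{L}}\right)\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}}-\kappa_D.
$$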

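The expected profit is finite if $\int_0^{\gamma_D}\left[\Pi(\varphi(\gamma);\gamma)+\kappa_D\right]\gamma^{\alpha}m(\gamma)\,d\gamma < \infty$. Assumption 1 and $\lim_{\gamma\to 0}m(\gamma) = C_m$ imply that

$$
\lim_{\gamma\to 0}\frac{\Pi(\varphi(\gamma);\gamma)+\kappa_D}{\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}}}\,m(\gamma) = \frac{C_m C_Q^{\frac{\sigma\beta}{\beta-\sigma+1}}\left(\frac{\sigma-1}{\sigma}\right)^{\frac{\sigma\beta}{\beta-\sigma+1}}}{C_L^{\frac{\sigma-1}{\beta-\sigma+1}}\beta^{\frac{\beta}{\beta-\sigma+1}}}\left(\frac{\beta-\sigma+1}{\sigma-1}\right),
$$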

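and hence for any ω > 0 there exists a $\bar{\gamma} > 0$ such that for any $\gamma < \bar{\gamma}$,

$$
\frac{\Pi(\varphi(\gamma);\gamma)+\kappa_D}{\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}}}\,m(\gamma) < \frac{C_m C_Q^{\frac{\sigma\beta}{\beta-\sigma+1}}\left(\frac{\sigma-1}{\sigma}\right)^{\frac{\sigma\beta}{\beta-\sigma+1}}}{C_L^{\frac{\sigma-1}{\beta-\sigma+1}}\beta^{\frac{\beta}{\beta-\sigma+1}}}\left(\frac{\beta-\sigma+1}{\sigma-1}\right)+\omega.
$$

By picking a sufficiently small $\bar{\gamma}$ and noting that $\bar{\gamma} < 1$, we have

$$
\int_0^{\bar{\gamma}}\left[\Pi(\varphi(\gamma);\gamma)+\kappa_D\right]f(\gamma)\,d\gamma < \left[\frac{C_m C_Q^{\frac{\sigma\beta}{\beta-\sigma+1}}\left(\frac{\sigma-1}{\sigma}\right)^{\frac{\sigma\beta}{\beta-\sigma+1}}}{C_L^{\frac{\sigma-1}{\beta-\sigma+1}}\beta^{\frac{\beta}{\beta-\sigma+1}}}\left(\frac{\beta-\sigma+1}{\sigma-1}\right)+\omega\right]\int_0^{1}\gamma^{\alpha-\frac{\sigma-1}{\beta-\sigma+1}}\,d\gamma.
$$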

It follows that $\int_0^{1}\gamma^{\alpha-\frac{\sigma-1}{\beta-\sigma+1}}\,d\gamma < \infty$ if α > −1 and $\beta > \frac{\alpha+2}{\alpha+1}(\sigma-1)$.
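The last step can be spelled out as follows (a short derivation sketch, taking α > −1 and β − σ + 1 > 0 as given). Since $\int_0^1\gamma^{a}\,d\gamma$ is finite if and only if a > −1, the condition on the exponent is

```latex
\begin{align*}
\alpha-\frac{\sigma-1}{\beta-\sigma+1} > -1
&\iff (\alpha+1)(\beta-\sigma+1) > \sigma-1 \\
&\iff \beta > (\sigma-1)+\frac{\sigma-1}{\alpha+1} = \frac{\alpha+2}{\alpha+1}(\sigma-1).
\end{align*}
```

In the uniform case (α = 0) this reduces to β > 2(σ − 1).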

Step 2:

Starting from the definition of the Jacobian and using (14), we have

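$$
|J(\varphi)| = \left|\frac{\partial\gamma(\varphi)}{\partial\varphi}\right| = \frac{q^{*}(\varphi)}{\varphi^{2}V'(\varphi)}\left(2\varphi^{-1}+\frac{V''(\varphi)}{V'(\varphi)}\right)+\frac{1}{\varphi^{2}V'(\varphi)}\,\frac{\varphi^{-2}}{\pi_{qq}\left(q^{*}(\varphi),\varphi\right)}.
$$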

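Then, by (12), (13), and Assumption 1, we can replace V′(ϕ), V″(ϕ), q∗(ϕ), and π_qq(q∗(ϕ), ϕ) to obtain (19) as

$$
|J(\varphi)| = \frac{Q^{\sigma}}{L}\,\frac{\left(1-\frac{1}{\sigma}+q^{*}\frac{Q'}{Q}\right)^{\sigma}}{\beta+\varphi\frac{L'}{L}}\cdot\left[2+\frac{\beta(\beta-1)+2\beta\varphi\frac{L'}{L}+\varphi^{2}\frac{L''}{L}}{\beta+\varphi\frac{L'}{L}}+\frac{1-\frac{1}{\sigma}+q^{*}\frac{Q'}{Q}}{-\frac{1}{\sigma}\left(1-\frac{1}{\sigma}\right)+2\left(1-\frac{1}{\sigma}\right)q^{*}\frac{Q'}{Q}+(q^{*})^{2}\frac{Q''}{Q}}\right]\cdot\varphi^{-(\beta-\sigma+1)-1}.
$$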

By Assumption 1, Lemmas 1-2, and (9), we have

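$$
\lim_{\varphi\to\infty}\frac{|J(\varphi)|}{\varphi^{-(\beta-\sigma+1)-1}} = \frac{C_Q^{\sigma}\left(\frac{\sigma-1}{\sigma}\right)^{\sigma}(\beta-\sigma+1)}{C_L\,\beta}.
$$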

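The p.d.f. of productivity is therefore

$$
\begin{aligned}
g(\varphi) &= \frac{f(\gamma(\varphi))}{F(\gamma_D)}\,|J(\varphi)| = \frac{f(\gamma(\varphi))}{F(\gamma_D)}\,\frac{|J(\varphi)|}{\varphi^{-(\beta-\sigma+1)-1}}\,\varphi^{-(\beta-\sigma+1)-1}\\
&= \frac{m(\gamma(\varphi))}{F(\gamma_D)}\left[\frac{Q^{\sigma}}{L}\,\frac{\left(1-\frac{1}{\sigma}+q\frac{Q'}{Q}\right)^{\sigma}}{\beta+\varphi\frac{L'}{L}}\right]^{\alpha}\frac{|J(\varphi)|}{\varphi^{-(\beta-\sigma+1)-1}}\,\varphi^{-(1+\alpha)(\beta-\sigma+1)-1}.
\end{aligned}
$$

As ϕ becomes arbitrarily large, m(γ(ϕ)), the bracketed term, and $|J(\varphi)|/\varphi^{-(\beta-\sigma+1)-1}$ converge to constants. It thus follows that the productivity distribution exhibits a power law with a tail index (1 + α)(β − σ + 1).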
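The tail index can be illustrated in the simplest special case. The sketch below assumes constant Q and L (so that γ(ϕ) = c·ϕ^{−(β−σ+1)} for some hypothetical constant c) and a pure power-form distribution F(γ) = (γ/γ_D)^{α+1}; both are simplifications for illustration, not the general setting of the proposition. The log-log slope of the survival function of ϕ then equals −(1 + α)(β − σ + 1).

```python
# Illustrative special case: closed-form survival function of productivity.
import math


def survival(x, alpha, beta, sigma, c=1.0, gamma_d=1.0):
    """P(phi > x) when gamma(phi) = c * phi**-(beta - sigma + 1) and
    F(gamma) = (gamma / gamma_d)**(alpha + 1) on (0, gamma_d]."""
    theta = beta - sigma + 1.0
    gamma_x = c * x ** (-theta)          # firms with gamma < gamma_x have phi > x
    return min(1.0, (gamma_x / gamma_d) ** (alpha + 1.0))


def loglog_slope(x1, x2, **params):
    """Slope of log survival against log productivity between x1 and x2."""
    s1, s2 = survival(x1, **params), survival(x2, **params)
    return (math.log(s2) - math.log(s1)) / (math.log(x2) - math.log(x1))


slope = loglog_slope(2.0, 4.0, alpha=0.5, beta=7.838, sigma=4.33)
print(slope, -(1 + 0.5) * (7.838 - 4.33 + 1))   # the two numbers should coincide
```

Note that the density g(ϕ) decays one power faster than the survival function, which is why its exponent in the text is −(1 + α)(β − σ + 1) − 1.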

Step 3:

By (12) and Lemma 1, firm size in terms of sales s is a function of ϕ:

$$
s = \varphi^{\sigma-1}Q^{\sigma}\left(1-\frac{1}{\sigma}+q^{*}\frac{Q'}{Q}\right)^{\sigma-1}. \tag{36}
$$

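By Assumption 1 and Lemmas 1 and 2, there are one-to-one mappings at the tails between s → ∞ and ϕ → ∞, and between ϕ → ∞ and γ → 0, such that $\lim_{\varphi\to\infty}s = \infty$. Let s(ϕ) denote the firm size with productivity ϕ as defined by (36); ϕ(s) denotes its inverse function. Combining (17) and (36), we have

$$
\gamma(\varphi(s)) \equiv \gamma(s) = \frac{Q^{\sigma}}{L}\,\frac{\left(1-\frac{1}{\sigma}+q\frac{Q'}{Q}\right)^{\sigma}}{\beta+\varphi\frac{L'}{L}}\left[Q^{\sigma}\left(1-\frac{1}{\sigma}+q\frac{Q'}{Q}\right)^{\sigma-1}\right]^{\frac{\beta-\sigma+1}{\sigma-1}}s^{-\frac{\beta-\sigma+1}{\sigma-1}},
$$

which converges to a power function of s with exponent $-\frac{\beta-\sigma+1}{\sigma-1}$ under Assumption 1.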

Using (12) and (13), we have

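$$
\frac{\partial s(\varphi)}{\partial\varphi} = \frac{\partial s}{\partial q^{*}}\frac{\partial q^{*}}{\partial\varphi} = \frac{1-\frac{1}{\sigma}+q^{*}\frac{Q'}{Q}}{\frac{1}{\sigma}\left(1-\frac{1}{\sigma}\right)-2\left(1-\frac{1}{\sigma}\right)q^{*}\frac{Q'}{Q}-(q^{*})^{2}\frac{Q''}{Q}}\,q^{*}\varphi^{-2} > 0. \tag{37}
$$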

Using (12), (36), and (37), we obtain the Jacobian |Js (s)|:

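$$
\begin{aligned}
|J_s(s)| &= \left|\frac{\partial\gamma(s)}{\partial s}\right| = \left|\frac{\partial\gamma(\varphi)}{\partial\varphi}\frac{\partial\varphi(s)}{\partial s}\right| = \frac{|J(\varphi)|}{\varphi^{-(\beta-\sigma+1)-1}}\,\varphi^{-(\beta-\sigma+1)-1}\left(\frac{\partial s(\varphi)}{\partial\varphi}\right)^{-1}\\
&= \frac{|J(\varphi)|}{\varphi^{-(\beta-\sigma+1)-1}}\cdot\frac{\frac{1}{\sigma}\left(1-\frac{1}{\sigma}\right)-2\left(1-\frac{1}{\sigma}\right)q^{*}\frac{Q'}{Q}-(q^{*})^{2}\frac{Q''}{Q}}{\left(1-\frac{1}{\sigma}+q^{*}\frac{Q'}{Q}\right)^{-(\beta-\sigma+1)}Q^{-\frac{\sigma(\beta-\sigma+1)}{\sigma-1}}}\;s^{-\frac{\beta-\sigma+1}{\sigma-1}-1}.
\end{aligned}
$$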

The density of firm size distribution gs (s) can be written as

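$$
g_s(s) = \frac{m(\gamma(s))\left[\gamma(s)\right]^{\alpha}}{F(\gamma_D)}\,|J_s(s)| = \frac{m(\gamma(s))}{F(\gamma_D)}\,\frac{|J_s(s)|}{s^{-\frac{\beta-\sigma+1}{\sigma-1}-1}}\left(\frac{\gamma(s)}{s^{-\frac{\beta-\sigma+1}{\sigma-1}}}\right)^{\alpha}s^{-\frac{(\alpha+1)(\beta-\sigma+1)}{\sigma-1}-1}.
$$

By Assumption 1, Lemmas 1-2, and (9), we know that $|J_s(s)|/s^{-\frac{\beta-\sigma+1}{\sigma-1}-1}$, $\gamma(s)^{\alpha}/s^{-\alpha\frac{\beta-\sigma+1}{\sigma-1}}$, and m(γ(s)) converge to constants as s tends to infinity. Therefore, the firm size distribution exhibits a power law with a tail index $\frac{(\alpha+1)(\beta-\sigma+1)}{\sigma-1}$.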
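As a worked illustration, plugging in the calibrated values reported in the quantitative section (uniform γ, so α = 0, with β = 7.838 and σ = 4.33) gives the following approximate tail indices; this is only an arithmetic check, not an additional result:

```latex
\begin{align*}
\text{productivity:}\quad & (1+\alpha)(\beta-\sigma+1) = 7.838-4.33+1 = 4.508 \approx 4.51,\\
\text{firm size:}\quad & \frac{(\alpha+1)(\beta-\sigma+1)}{\sigma-1} = \frac{4.508}{3.33} \approx 1.35.
\end{align*}
```

The firm size tail is thicker than the productivity tail whenever σ − 1 > 1, because sales scale with ϕ^{σ−1} in the tail.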


A.3 Derivation for Section 2.3

A.3.1 Finiteness of Expected Profit

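Letting $k \equiv \arg\max_j \sigma_j$, we can rewrite (22) by extracting $\varphi^{\sigma_k-\beta_i-1}$ as follows:

$$
\gamma = \left[\sum_j\frac{I_{ij}\tau_{ij}^{1-\sigma_j}w_i^{-\sigma_j}Q_j^{\sigma_j}\left(1-\frac{1}{\sigma_j}+q_{ij}^{*}\frac{Q_j'}{Q_j}\right)^{\sigma_j}}{L_i\left(\beta_i+\varphi\frac{L_i'}{L_i}\right)}\,\varphi^{\sigma_j-\sigma_k}\right]\varphi^{\sigma_k-\beta_i-1}. \tag{38}
$$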

We can also rewrite (20) as

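$$
q_{ij} = Q_j^{\sigma_j}\times\left(1-\frac{1}{\sigma_j}+q_{ij}\frac{Q_j'}{Q_j}\right)^{\sigma_j}w_i^{-\sigma_j}\tau_{ij}^{-\sigma_j}\varphi^{\sigma_j}. \tag{39}
$$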

Using Assumption 1, (38), and (39), the total operating profit becomes

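$$
\begin{aligned}
\sum_j\left(\pi_{ij}+\kappa_{ij}\right) &= \sum_j\left(p_{ij}q_{ij}-\tau_{ij}w_i\varphi^{-1}q_{ij}\right) = \left[\sum_j\frac{\frac{1}{\sigma_j}-q_{ij}\frac{Q_j'}{Q_j}}{1-\frac{1}{\sigma_j}+q_{ij}\frac{Q_j'}{Q_j}}\,Q_j^{\sigma_j}w_i^{1-\sigma_j}\tau_{ij}^{1-\sigma_j}\varphi^{\sigma_j-\sigma_k}\right]\varphi^{\sigma_k-1}\\
&= \frac{\displaystyle\sum_j\frac{\frac{1}{\sigma_j}-q_{ij}\frac{Q_j'}{Q_j}}{1-\frac{1}{\sigma_j}+q_{ij}\frac{Q_j'}{Q_j}}\,Q_j^{\sigma_j}w_i^{1-\sigma_j}\tau_{ij}^{1-\sigma_j}\varphi^{\sigma_j-\sigma_k}}{\left[\displaystyle\sum_j\frac{I_{ij}\tau_{ij}^{1-\sigma_j}w_i^{-\sigma_j}Q_j^{\sigma_j}\left(1-\frac{1}{\sigma_j}+q_{ij}^{*}\frac{Q_j'}{Q_j}\right)^{\sigma_j}}{L_i\left(\beta_i+\varphi\frac{L_i'}{L_i}\right)}\,\varphi^{\sigma_j-\sigma_k}\right]^{\frac{\sigma_k-1}{\sigma_k-\beta_i-1}}}\;\gamma^{\frac{\sigma_k-1}{\sigma_k-\beta_i-1}}.
\end{aligned}
$$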

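As ϕ(γ) and q∗(ϕ) exist and are unique, Assumption 1 and (9) imply that $\sum_j(\pi_{ij}+\kappa_{ij})/\gamma^{\frac{\sigma_k-1}{\sigma_k-\beta_i-1}}$ converges to a constant as γ becomes infinitesimal. For small enough $\bar{\gamma}$,

$$
\begin{aligned}
\int_0^{\bar{\gamma}}\Pi_i\,dF(\gamma) &= \int_0^{\bar{\gamma}}\sum_j\left(\pi_{ij}+\kappa_{ij}\right)dF(\gamma)-\int_0^{\bar{\gamma}}\sum_j\kappa_{ij}\,dF(\gamma)-\kappa_{D,i}\\
&= \int_0^{\bar{\gamma}}\frac{\sum_j\left(\pi_{ij}+\kappa_{ij}\right)}{\gamma^{\frac{\sigma_k-1}{\sigma_k-\beta_i-1}}}\,\gamma^{\alpha_i-\frac{\sigma_k-1}{\beta_i-\sigma_k+1}}\,m_i(\gamma)\,d\gamma-\text{Constant}.
\end{aligned}
$$

Since α_i > −1 for all i, the same procedure as in Appendix A.2 implies that E(Π_i) < ∞ if $\beta_i > \frac{\alpha_i+2}{\alpha_i+1}(\sigma_k-1)$.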

A.3.2 Productivity Distribution

Following the same procedure as in Appendix A.2, (20) and (21) yield

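$$
\begin{aligned}
\frac{\partial}{\partial\varphi}\frac{\tau_{ij}q_{ij}^{*}(\varphi)}{\varphi^{2}V_i'(\varphi)} &= -\frac{\tau_{ij}q_{ij}^{*}(\varphi)}{\varphi^{2}V_i'(\varphi)}\left[2\varphi^{-1}+\frac{V_i''(\varphi)}{V_i'(\varphi)}\right]+\frac{\tau_{ij}}{\varphi^{2}V_i'(\varphi)}\frac{\partial q_{ij}^{*}(\varphi)}{\partial\varphi}\\
&= -\frac{w_i^{-\sigma_j}\tau_{ij}^{1-\sigma_j}Q_j^{\sigma_j}}{L_i(\varphi)}\,\frac{\left(1-\frac{1}{\sigma_j}+q_{ij}\frac{Q_j'}{Q_j}\right)^{\sigma_j}}{\beta_i+\varphi\frac{L_i'(\varphi)}{L_i(\varphi)}}\cdot\left[2+\frac{\beta_i(\beta_i-1)+2\beta_i\varphi\frac{L_i'(\varphi)}{L_i(\varphi)}+\varphi^{2}\frac{L_i''(\varphi)}{L_i(\varphi)}}{\beta_i+\varphi\frac{L_i'(\varphi)}{L_i(\varphi)}}\right.\\
&\qquad\left.+\frac{1-\frac{1}{\sigma_j}+q_{ij}\frac{Q_j'}{Q_j}}{-\frac{1}{\sigma_j}\left(1-\frac{1}{\sigma_j}\right)+2\left(1-\frac{1}{\sigma_j}\right)q_{ij}\frac{Q_j'}{Q_j}+q_{ij}^{2}\frac{Q_j''}{Q_j}}\right]\varphi^{-(\beta_i-\sigma_j+1)-1}\\
&\equiv -J_{ij}\,\varphi^{-(\beta_i-\sigma_j+1)-1}.
\end{aligned} \tag{40}
$$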

Using (23), we have

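$$
|J_i(\varphi)| = \left|\sum_{j=0}^{n}\frac{\partial}{\partial\varphi}\frac{\tau_{ij}q_{ij}^{*}(\varphi)}{\varphi^{2}V_i'(\varphi)}\right| = \left(\sum_j J_{ij}\varphi^{\sigma_j-\sigma_k}\right)\varphi^{-(\beta_i-\sigma_k+1)-1}. \tag{41}
$$

As J_ij converges to a constant, $\sum_j J_{ij}\varphi^{\sigma_j-\sigma_k}$ converges to a constant because $\varphi^{\sigma_j-\sigma_k}\to 0$ for all j ≠ k and $\varphi^{\sigma_j-\sigma_k} = 1$ for j = k.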

As a result, the productivity distribution is given by

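$$
\begin{aligned}
g_i(\varphi) &= \frac{m_i(\gamma(\varphi))}{F_i(\gamma_D)}\,\gamma(\varphi)^{\alpha_i}\,|J_i(\varphi)|\\
&= \frac{m_i(\gamma(\varphi))}{F_i(\gamma_D)}\left(\frac{\gamma(\varphi)}{\varphi^{\sigma_k-\beta_i-1}}\right)^{\alpha_i}\left(\sum_j J_{ij}\varphi^{\sigma_j-\sigma_k}\right)\varphi^{-(1+\alpha_i)(\beta_i-\sigma_k+1)-1}.
\end{aligned}
$$

From Assumption 1, (9), (38), (40), and $\lim_{\gamma\to 0}m_i(\gamma) = C_{m,i}$, the distribution of ϕ exhibits a power law with a tail index $(\alpha_i+1)(\beta_i+1-\max_j\sigma_j)$.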

A.3.3 Firm Size Distribution

For firms with sufficiently large ϕ, the firm size s is defined as the sum of export revenue $s \equiv \sum_j s_{ij}$. As ϕ(γ) and $q_{ij}^{*}(\varphi)$ exist and are unique and $s_{ij} = p_{ij}q_{ij} = q_{ij}^{1-\frac{1}{\sigma_j}}Q_j$, the functions s_ij(γ), s_ij(ϕ), s_ij(q_ij) and their inverse functions exist. Moreover, s_ij is decreasing in γ and increasing in both ϕ and q_ij. When γ becomes arbitrarily small, both ϕ and s_ij become arbitrarily large. By (20) and $s = \sum_j q_{ij}^{1-\frac{1}{\sigma_j}}Q_j$, we have

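$$
\varphi = \frac{s^{\frac{1}{\sigma_k-1}}}{\left\{\displaystyle\sum_j\left[w_i^{1-\sigma_j}\tau_{ij}^{1-\sigma_j}\left[Q_j\times\left(1-\frac{1}{\sigma_j}+q_{ij}\frac{Q_j'}{Q_j}\right)\right]^{\sigma_j-1}Q_j\,\varphi^{\sigma_j-\sigma_k}\right]\right\}^{\frac{1}{\sigma_k-1}}}. \tag{42}
$$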

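Deriving $\partial q_{ij}^{*}/\partial\varphi$ using (20) and applying the product rule to $s_{ij} = q_{ij}^{1-\frac{1}{\sigma_j}}Q_j$ yield

$$
\begin{aligned}
\frac{\partial s_{ij}(\varphi)}{\partial\varphi} = \frac{\partial s_{ij}}{\partial q_{ij}^{*}}\frac{\partial q_{ij}^{*}}{\partial\varphi} &= \frac{1-\frac{1}{\sigma_j}+q_{ij}^{*}\frac{Q_j'}{Q_j}}{\frac{1}{\sigma_j}\left(1-\frac{1}{\sigma_j}\right)-2\left(1-\frac{1}{\sigma_j}\right)q_{ij}^{*}\frac{Q_j'}{Q_j}-\left(q_{ij}^{*}\right)^{2}\frac{Q_j''}{Q_j}}\,w_i\tau_{ij}\varphi^{-2}q_{ij}\\
&= \frac{\left(1-\frac{1}{\sigma_j}+q_{ij}^{*}\frac{Q_j'}{Q_j}\right)\left[Q_j\times\left(1-\frac{1}{\sigma_j}+q_{ij}^{*}\frac{Q_j'}{Q_j}\right)\right]^{\sigma_j}}{\frac{1}{\sigma_j}\left(1-\frac{1}{\sigma_j}\right)-2\left(1-\frac{1}{\sigma_j}\right)q_{ij}^{*}\frac{Q_j'}{Q_j}-\left(q_{ij}^{*}\right)^{2}\frac{Q_j''}{Q_j}}\,w_i^{1-\sigma_j}\tau_{ij}^{1-\sigma_j}\varphi^{\sigma_j-2}.
\end{aligned} \tag{43}
$$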

As a result,

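$$
\frac{\partial s(\varphi)}{\partial\varphi} = \sum_{j=0}^{n}\frac{\partial s_{ij}(\varphi)/\partial\varphi}{\varphi^{\sigma_j-2}}\,\varphi^{\sigma_j-2} = \left(\sum_{j=0}^{n}\frac{\partial s_{ij}(\varphi)/\partial\varphi}{\varphi^{\sigma_j-2}}\,\varphi^{\sigma_j-\sigma_k}\right)\varphi^{\sigma_k-2}. \tag{44}
$$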

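Using (23), (41), (42), (43), and (44), the absolute value of the Jacobian term |J_{s,i}(ϕ)| is thus

$$
\begin{aligned}
|J_{s,i}(\varphi)| &\equiv \left|\frac{\partial\gamma(s)}{\partial s}\right| = \left|\frac{\partial\gamma(\varphi)}{\partial\varphi}\frac{\partial\varphi(s)}{\partial s}\right| = |J_i(\varphi)|\left(\frac{\partial s(\varphi)}{\partial\varphi}\right)^{-1}\\
&= \frac{\sum_j J_{ij}\varphi^{\sigma_j-\sigma_k}}{\sum_{j=0}^{n}\frac{\partial s_{ij}(\varphi)/\partial\varphi}{\varphi^{\sigma_j-2}}\,\varphi^{\sigma_j-\sigma_k}}\left(\frac{\varphi}{s^{\frac{1}{\sigma_k-1}}}\right)^{-\beta_i}s^{-\frac{\beta_i-\sigma_k+1}{\sigma_k-1}-1}.
\end{aligned}
$$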

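As a result, the above equation along with (38) yields the firm size distribution g_{s,i}(s):

$$
\begin{aligned}
g_{s,i}(s) &= \frac{m_i(\gamma(s))}{F_i(\gamma_D)}\,\gamma(s)^{\alpha_i}\,|J_{s,i}(\varphi)|\\
&= \frac{m_i(\gamma(s))}{F_i(\gamma_D)}\,\frac{\sum_j J_{ij}\varphi^{\sigma_j-\sigma_k}}{\sum_{j=0}^{n}\frac{\partial s_{ij}(\varphi)/\partial\varphi}{\varphi^{\sigma_j-2}}\,\varphi^{\sigma_j-\sigma_k}}\left(\frac{\varphi}{s^{\frac{1}{\sigma_k-1}}}\right)^{-\beta_i+\alpha_i(\sigma_k-\beta_i-1)}\left(\frac{\gamma(\varphi)}{\varphi^{\sigma_k-\beta_i-1}}\right)^{\alpha_i}s^{-\frac{(\alpha_i+1)(\beta_i-\sigma_k+1)}{\sigma_k-1}-1}.
\end{aligned}
$$

From (9), (38), (41), (42), (43), Assumption 1, and $\lim_{\gamma\to 0}m_i(\gamma) = C_{m,i}$, each of the multiplicative terms besides $s^{-\frac{(\alpha_i+1)(\beta_i-\sigma_k+1)}{\sigma_k-1}-1}$ converges to a constant. Thus, the firm size distribution exhibits a power law with a tail index $\frac{(\alpha_i+1)(\beta_i+1-\max_j\sigma_j)}{\max_j\sigma_j-1}$.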


Applying the symmetric-country assumption to Proposition 2 implies that E(Π) < ∞ under α > −1 and $\beta > \frac{\alpha+2}{\alpha+1}(\sigma-1)$. Using (25)-(27), the definition of φ, and recalling that γX/γD = δ < 1, we can restate the expected profit as a function of γD:

$$
E(\Pi) = \kappa_D\gamma_D^{\frac{\sigma-1}{\beta-\sigma+1}}\left\{\Gamma_D+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\Gamma_X\right\}-\kappa_D F(\gamma_D)-n\kappa_X F(\gamma_X),
$$

where $\Gamma_z \equiv \int_0^{\gamma_z}\gamma^{-\frac{\sigma-1}{\beta-\sigma+1}}\,dF(\gamma)$ for z ∈ {D, X}. In equilibrium, γD solves the free-entry condition. Using (29), it is then readily verified that

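$$
\frac{\partial E(\Pi)}{\partial\gamma_D} = \kappa_D\,\frac{\sigma-1}{\beta-\sigma+1}\,\gamma_D^{\frac{\sigma-1}{\beta-\sigma+1}-1}\left[\left(\Gamma_D-\Gamma_X\right)+\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X\right] > 0.
$$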

Note that both ΓD and ΓX are positive and increasing in γD; thus $\lim_{\gamma_D\to\infty}\gamma_D^{\frac{\sigma-1}{\beta-\sigma+1}}\Gamma_D = \lim_{\gamma_D\to\infty}\gamma_D^{\frac{\sigma-1}{\beta-\sigma+1}}\Gamma_X = \infty$. Since both F(γD) and F(γX) are less than 1, it follows that $\lim_{\gamma_D\to\infty}E(\Pi) = \infty$. Since E(Π) is bounded from above by $E(\Pi)|_{\gamma_D=1}$, for any $\kappa_e\in(0, E(\Pi)|_{\gamma_D=1})$ there exists a unique γD such that the free-entry condition holds, which establishes the uniqueness of equilibrium.


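We first derive the effect of τ on the surviving cutoff γD. Totally differentiating E(Π) with respect to τ yields $\frac{d\gamma_D}{d\tau} = -\frac{\partial E(\Pi)/\partial\tau}{\partial E(\Pi)/\partial\gamma_D}$. We have obtained ∂E(Π)/∂γD earlier. Partially differentiating the expected profit with respect to τ yields

$$
\frac{\partial E(\Pi)}{\partial\tau} = -\frac{(\sigma-1)\beta}{\beta-\sigma+1}\,\frac{n\tau^{-\sigma}}{1+n\tau^{1-\sigma}}\,\kappa_D\gamma_D^{\frac{\sigma-1}{\beta-\sigma+1}}\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X.
$$

The above then leads to

$$
\frac{d\gamma_D}{d\tau} = \frac{\beta\gamma_D n\tau^{-\sigma}}{1+n\tau^{1-\sigma}}\,\frac{\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X}{\left(\Gamma_D-\Gamma_X\right)+\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X} > 0.
$$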

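The effect of τ on the exporting cutoff γX is defined by $\frac{d\gamma_X}{d\tau} = \delta\frac{d\gamma_D}{d\tau}+\frac{d\delta}{d\tau}\gamma_D$. It is readily checked that

$$
\frac{d\gamma_X}{d\tau} = \frac{\delta\beta\gamma_D n\tau^{-\sigma}}{1+n\tau^{1-\sigma}}\left[\frac{\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X}{\left(\Gamma_D-\Gamma_X\right)+\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X}-\frac{\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}}{\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1}\right].
$$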

The first and second terms in the bracket are less and greater than 1, respectively. We thus

conclude that dγX/dτ < 0.

Combining (27) and (24), we obtain equilibrium productivity:

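$$
\varphi(\gamma) = \begin{cases}
\kappa_D^{\frac{1}{\beta}}\gamma_D^{\frac{\sigma-1}{\beta(\beta-\sigma+1)}}\left(\frac{\beta-\sigma+1}{\sigma-1}\right)^{-\frac{1}{\beta}}\gamma^{-\frac{1}{\beta-\sigma+1}} & \text{if } \gamma\in(\gamma_X,\gamma_D]\\[2ex]
\phi\,\kappa_D^{\frac{1}{\beta}}\gamma_D^{\frac{\sigma-1}{\beta(\beta-\sigma+1)}}\left(\frac{\beta-\sigma+1}{\sigma-1}\right)^{-\frac{1}{\beta}}\gamma^{-\frac{1}{\beta-\sigma+1}} & \text{if } \gamma\in[0,\gamma_X]
\end{cases}. \tag{45}
$$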

For the effect on productivity, taking derivatives to (45) yields

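$$
\frac{d\varphi(\gamma)}{d\tau} = \begin{cases}
\dfrac{\kappa_D^{\frac{1}{\beta}}n\tau^{-\sigma}\left(1+n\tau^{1-\sigma}\right)^{\frac{\sigma-1}{\beta-\sigma+1}}\Gamma_X\,\gamma_D^{\frac{\sigma-1}{\beta(\beta-\sigma+1)}}}{\left(\Gamma_D-\Gamma_X\right)+\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X}\left(\dfrac{\beta-\sigma+1}{\sigma-1}\right)^{-\frac{\beta+1}{\beta}}\gamma^{-\frac{1}{\beta-\sigma+1}} > 0 & \text{for non-exporting firms}\\[3ex]
-\phi\,\dfrac{\kappa_D^{\frac{1}{\beta}}n\tau^{-\sigma}\left(1+n\tau^{1-\sigma}\right)^{\frac{\sigma-1}{\beta-\sigma+1}}\Gamma_X\,\gamma_D^{\frac{\sigma-1}{\beta(\beta-\sigma+1)}}}{\left(\Gamma_D-\Gamma_X\right)+\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}\Gamma_X}\left(\dfrac{\beta-\sigma+1}{\sigma-1}\right)^{-\frac{\beta+1}{\beta}}\gamma^{-\frac{1}{\beta-\sigma+1}} < 0 & \text{for exporting firms.}
\end{cases}
$$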

The claims on the comparative statics of ϕ thus follow.


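We can write the price index as $P^{1-\sigma} = P_D^{1-\sigma}+nP_X^{1-\sigma}$, where $P_D^{1-\sigma}$ and $nP_X^{1-\sigma}$ are the components of $P^{1-\sigma}$ in which the goods are from domestic and foreign firms, respectively. Therefore, the expenditure share on domestic products is defined by $\lambda \equiv P_D^{1-\sigma}/P^{1-\sigma}$, and the expenditure share on goods from a foreign country is defined by $\lambda_X = P_X^{1-\sigma}/P^{1-\sigma} = (1-\lambda)/n$. Plugging (24) into (30) and using the definitions above yields

$$
\lambda = \frac{\Gamma_D+\left(\phi^{\sigma-1}-1\right)\Gamma_X}{\Gamma_D+\left[\left(1+n\tau^{1-\sigma}\right)\phi^{\sigma-1}-1\right]\Gamma_X} \tag{46}
$$

$$
\lambda_X = \frac{\phi^{\sigma-1}\tau^{1-\sigma}\Gamma_X}{\Gamma_D+\left[\left(1+n\tau^{1-\sigma}\right)\phi^{\sigma-1}-1\right]\Gamma_X}. \tag{47}
$$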

From (27) and (28) we have

d ln γD =βd lnP (48)

d ln γX =βd lnP + d ln δ. (49)

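Note that the assumption of Proposition 3 ensures that E(Π) < ∞, hence ΓD and ΓX are both finite. We further define a short-hand notation $\eta_z \equiv \gamma_z^{1-\frac{\sigma-1}{\beta-\sigma+1}}f(\gamma_z)\,\Gamma_z^{-1}$, where z ∈ {D, X}.

The welfare is defined as the real income W ≡ N/P. It thus follows from (48) that $\frac{d\ln W}{d\ln\tau} = -\frac{1}{\beta}\frac{d\ln\gamma_D}{d\ln\tau}$. Rearranging (31) yields

$$
\gamma_D^{\frac{\sigma-1}{\beta-\sigma+1}} = \frac{\kappa_e+\kappa_D F(\gamma_D)+n\kappa_X F(\gamma_X)}{\kappa_D\left\{\Gamma_D+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\Gamma_X\right\}}.
$$

Log-differentiating the above equation yields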

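$$
\begin{aligned}
\frac{\sigma-1}{\beta-\sigma+1}\,d\ln\gamma_D &= d\ln\left(\kappa_e+\kappa_D F(\gamma_D)+n\kappa_X F(\gamma_X)\right)-d\ln\left\{\Gamma_D+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\Gamma_X\right\}\\
&= \frac{\kappa_D F(\gamma_D)}{\kappa_e+\kappa_D F(\gamma_D)+n\kappa_X F(\gamma_X)}\,\frac{\gamma_D f(\gamma_D)}{F(\gamma_D)}\,d\ln\gamma_D+\frac{n\kappa_X F(\gamma_X)}{\kappa_e+\kappa_D F(\gamma_D)+n\kappa_X F(\gamma_X)}\,\frac{\gamma_X f(\gamma_X)}{F(\gamma_X)}\,d\ln\gamma_X\\
&\quad-\frac{1}{1+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}\,\eta_D\,d\ln\gamma_D-\frac{\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}{1+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}\,\eta_X\,d\ln\gamma_X\\
&\quad-\frac{\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}{1+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}\,d\ln\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right].
\end{aligned} \tag{50}
$$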

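Note that (31) and the definitions of ηD and ηX imply the following:

$$
\frac{\kappa_D F(\gamma_D)}{\kappa_e+\kappa_D F(\gamma_D)+n\kappa_X F(\gamma_X)}\,\frac{\gamma_D f(\gamma_D)}{F(\gamma_D)} = \frac{1}{1+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}\,\eta_D,
$$

$$
\frac{n\kappa_X F(\gamma_X)}{\kappa_e+\kappa_D F(\gamma_D)+n\kappa_X F(\gamma_X)}\,\frac{\gamma_X f(\gamma_X)}{F(\gamma_X)} = \frac{\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}{1+\left[\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}}-1\right]\frac{\Gamma_X}{\Gamma_D}}\,\eta_X.
$$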

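Noting that $\left(1+n\tau^{1-\sigma}\right)^{\frac{\beta}{\beta-\sigma+1}} = \left(1+n\tau^{1-\sigma}\right)\phi^{\sigma-1}$, we can rearrange (50) to obtain d ln γD = β(1 − λ) d ln τ. The welfare elasticity with respect to trade cost is accordingly

$$
\frac{d\ln W}{d\ln\tau} = -\frac{1}{\beta}\,\frac{d\ln\gamma_D}{d\ln\tau} = \lambda-1. \tag{51}
$$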

Next, we show that (51) is consistent with the ACR formula. Note that trade elasticity is defined as d ln(λX/λ)/d ln τ; thus under the symmetric country assumption we can restate the ACR formula as

$$
\frac{d\ln W}{d\ln\tau} = \frac{1}{\varepsilon}\,\frac{d\ln\lambda}{d\ln\tau} = \frac{d\ln\lambda/d\ln\tau}{d\ln\left(\lambda_X/\lambda\right)/d\ln\tau} = \frac{d\ln\lambda/d\ln\tau}{d\ln\left(\frac{1-\lambda}{\lambda}\right)/d\ln\tau} = \lambda-1,
$$

which is equivalent to (51).
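The final equality follows from a one-line computation, sketched here for completeness (n is constant, so λX/λ = (1 − λ)/(nλ) has the same log-derivative as (1 − λ)/λ):

```latex
\begin{align*}
d\ln\!\left(\frac{1-\lambda}{\lambda}\right)
= -\frac{d\lambda}{1-\lambda}-\frac{d\lambda}{\lambda}
= -\frac{d\lambda}{\lambda(1-\lambda)}
= -\frac{d\ln\lambda}{1-\lambda}
\quad\Longrightarrow\quad
\frac{d\ln\lambda}{d\ln\!\left(\frac{1-\lambda}{\lambda}\right)} = -(1-\lambda) = \lambda-1 .
\end{align*}
```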

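For the trade elasticity, recall that d ln δ = d ln(γX/γD) by (48) and (49), and d ln γD/d ln τ = β(1 − λ). Log-differentiating (46) and (47) with respect to τ thus yields

$$
\begin{aligned}
d\ln\frac{\lambda_X}{\lambda} &= d\ln\left(\phi^{\sigma-1}\tau^{1-\sigma}\frac{\Gamma_X}{\Gamma_D}\right)-d\ln\left[1+\left(\phi^{\sigma-1}-1\right)\frac{\Gamma_X}{\Gamma_D}\right]\\
&= (1-\sigma)\,d\ln\tau+(\sigma-1)\,\frac{1-\frac{\Gamma_X}{\Gamma_D}}{1+\left(\phi^{\sigma-1}-1\right)\frac{\Gamma_X}{\Gamma_D}}\,d\ln\phi\\
&\quad+\frac{1}{1+\left(\phi^{\sigma-1}-1\right)\frac{\Gamma_X}{\Gamma_D}}\,\eta_X\,d\ln\delta+\frac{1}{1+\left(\phi^{\sigma-1}-1\right)\frac{\Gamma_X}{\Gamma_D}}\left(\eta_X-\eta_D\right)d\ln\gamma_D.
\end{aligned}
$$

Recalling the definition $\varepsilon = \frac{d\ln(\lambda_X/\lambda)}{d\ln\tau}$ and that δ = γX/γD, the above equation leads to (33).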
