Welfare Gains from Optimal Pollution Regulation∗

Jose Miguel Abito†

November 5, 2012

Abstract

Successful implementation of pollution regulation often requires redistributing a portion of the benefits back

to firms who incur abatement costs. When firms have private information on their costs, they have an incentive

to overstate these costs and demand higher compensation. Optimal pollution regulation in this environment

sacrifices allocative efficiency to reduce information rents. I measure the gains from optimal pollution regulation

by empirically examining the effect of sulfur dioxide emissions regulation on electric utilities. These electric

utilities also face economic regulation, and I exploit this institutional detail.

I derive estimates of marginal abatement costs from the cost of jointly producing electricity and emissions,

allowing for time-varying unobserved heterogeneity to capture cost efficiency. Cost efficiency consists of exogenous

(intrinsic type) and endogenous (managerial effort) components which are private information of the firm.

To separately identify these components, I model economic regulation as a signaling game of auditing. I show

that a particular equilibrium exists where the firm does not exert effort during the “rate case”, but it exerts a

positive level of effort afterwards. I provide empirical evidence for the plausibility of this equilibrium using cost

and rate case data. This equilibrium generates exclusion restrictions that are used to estimate parameters of

the cost function and disutility of effort. I show that the type distribution can be nonparametrically identified

using deconvolution methods, and estimate this distribution via a smoothed discrete approximation. Finally,

I conduct counterfactual welfare simulations.

I find that annual welfare gains from optimal pollution regulation relative to a uniform emission standard

range from $32 million to $155 million per electric utility, or about 10% to 47% of combined electricity generation

and abatement costs. Implementing the optimal form of regulation is difficult, if not impossible, so I examine

simpler regulatory regimes. A class of regimes with uniform emission taxes captures 52% to 80% of these gains.

∗Job Market Paper. I would like to thank my advisors Aviv Nevo, David Besanko and Robert Porter for all of their

help and guidance. I also thank Mark Chicu, Daniel Diermeier, Igal Hendel, Matt Masten, Tiago Pires, Mike Powell, Min

Ren, William Rogerson, Kosuke Uetake, Michael Whinston and seminar participants at Northwestern. Data acquired from

SNL Financial was partly funded by the TGS Graduate Research Grant and by the Center for the Study of Industrial

Organization (CSIO) at Northwestern University.

†Department of Economics, Northwestern University. Email: [email protected].

1 Introduction

Successful implementation of pollution regulation often requires redistributing a portion of the benefits

back to firms who incur abatement costs. For example, in the US Acid Rain Program about $600

million to $1.8 billion worth of emission permits were given to electric utilities for free, instead of being

auctioned. This type of redistribution is not without welfare costs. By giving away permits for free, the

policy-maker forgoes revenues that can be used to reduce distortionary taxes or fund other productive

activities (Goulder et al, 1997). A further policy constraint is that firms may have private information

about their abatement cost. The firm can exploit this informational advantage and extract information

rents by overstating their costs and demanding higher compensation.

Laffont (1994) uses the framework of incentive regulation (e.g. Laffont and Tirole, 1993) to characterize

the optimal form of pollution regulation in this environment.1 The key insight is that when information

rents are costly, it may be optimal to distort allocative efficiency to decrease information rents. Thus,

the optimal form of regulation may involve abatement levels that do not equate the marginal damages

from emissions with marginal abatement costs. Despite the simplicity of this insight, policies inspired by

incentive regulation have rarely been implemented in pollution regulation, and in economic regulation in

general. The design and implementation of such mechanisms require a lot on the part of the regulator in

terms of information gathering, rigorous auditing and sophisticated analyses (Joskow, 2008; Kahn, 1988).

Moreover, uncertainty over the actual benefits and costs of these policies, and the subsequent negative

political and economic consequences in cases where such attempts are unsuccessful, make it difficult to

convince policy-makers to adopt untested mechanisms. My paper addresses the following questions. How

much do we gain by implementing optimal pollution regulation relative to a uniform emission standard?

Can more practical alternatives approximate these gains?

The paper focuses on sulfur dioxide (SO2) emissions regulation of electric utilities in the US to

empirically answer these questions. An interesting institutional feature of my setting is that polluting

sources were facing both pollution and economic regulation. This feature offers an excellent setting to

study issues of redistribution and asymmetric information. Pollution regulation comes in the form of the

Acid Rain Program which is administered by the Environmental Protection Agency at the federal-level.

Economic regulation on the other hand is implemented by state-level public utility commissions in charge

of regulating the price of electricity. Since state utility commissions are directly responsible for providing

adequate compensation to electric utilities, commissions care about the impact of pollution regulation on

the cost of producing electricity. Commissions can then let this concern be heard by state legislators and

1 See also Lewis (1996). Spulber (1998) shows that when information rents are too large, the policy that maximizes

allocative efficiency may not even be implementable.

influence the design of pollution regulation. Although some of the windfall gains from pollution regulation

may be passed on to consumers through lower electricity prices (Schmalensee and Stavins, 2012), the part

of excess payments due to information rents does not get passed on if the economic regulator does not have

the same information as the firm.

In computing welfare under optimal pollution regulation, I consider a social planner who is in charge

of both pollution and economic regulation. Economic regulation makes explicit the need to design a

pollution regulatory regime that adequately compensates the firm. I use the static regulatory framework

of Laffont (1994) to characterize optimal pollution regulation. The size of distortions from allocative

efficiency depends on the distribution of marginal abatement costs across the possible unobserved types

of the firm. If, given the same level of abatement, differences in marginal abatement costs are large, the

incentives for low cost firms to claim to be of high cost rise much faster as abatement is increased. In this

case, large first order gains in welfare are achieved by inducing high cost types to abate less compared

to the allocatively efficient level. These first order gains are achieved at the expense of second order

losses. Thus, the gains from optimal pollution regulation relative to other regulatory regimes depend on

the distribution of marginal abatement costs. My main task is to estimate the distribution of marginal

abatement costs from the data.

I estimate marginal abatement costs of electric utilities using data from 1988-1999. My focus is on the

cost of fuel-switching, which was the popular mode of abatement during the time period. Fuel-switching

directly impacts the cost of producing electricity and marginal abatement costs can be measured as the

increase in the cost of producing electricity from an incremental decrease in emission rates. Thus, I

can study and use data on the cost of producing electricity to infer what marginal abatement costs are.

Formally, the main object of analysis is a multiproduct cost function which captures the cost of jointly

producing electricity and emissions (or abatement).

In estimating firms’ multiproduct cost functions, I allow for time-varying unobserved heterogeneity

to capture unobserved cost efficiencies. I model the firm’s cost efficiency as having a component that is

exogenous (intrinsic type) and a component that is endogenous (managerial effort), and these are private

information of the firm. While it is possible to estimate the firm’s cost efficiency solely using cost and

operations data, this is not enough to decompose cost efficiency into its type and effort components. A

firm with high realized cost can either be a firm that is intrinsically inefficient or a firm that did not exert

effort. We need additional information that explicitly links a firm’s observed cost with its unobserved

intrinsic type and chosen effort. I use a model of economic regulation (rate regulation) to provide this

link. Although my paper’s focus is pollution regulation, I exploit rate regulation to link firms’ observed

behavior with primitives. Rate regulation affects firms’ incentives to manage their electricity generation

costs, which directly ties with abatement costs through the multiproduct cost function.

I model rate regulation as a signaling game of auditing, where the firm provides information about its

costs in a rate case, and the regulator decides on the firm’s allowed revenues based on this information.

I show that there exists an equilibrium where the firm has no incentive to exert effort during the rate

case, and a positive optimal level of effort once the case concludes.2 Therefore under this equilibrium,

the effort component does not appear in the firm’s cost efficiency during the rate case. The wedge

between cost efficiencies during and after the rate case reveals the firm’s chosen effort. I can then infer

the firm’s “disutility” from exerting effort from the chosen level of effort after the rate case. I provide

empirical evidence to support the plausibility of this particular equilibrium using cost and rate case data.

First, I find that costs and heat rates (i.e. amount of fuel burned per unit of electricity produced) are

higher during the rate case. Second, I provide evidence that the regulator’s auditing strategy under this

equilibrium obtains in the data.

I impose parametric assumptions on firms’ cost function and disutility of effort in my empirical

model. In computing welfare under different regulatory regimes, I need to know what the underlying

costs and disutilities are for arbitrary values of emission rates and effort levels. However, I do not

impose distributional assumptions on firms’ unobserved intrinsic types. The distribution of unobserved

types determines the distribution of marginal abatement cost and therefore is an important ingredient

in the welfare analysis. Because effort is chosen by the firm and cost efficiency is unobserved by the

econometrician, there is an endogeneity problem when estimating the parameters of the empirical model.3

The equilibrium mentioned earlier provides information on what cost efficiency is during different time

periods and events (i.e. rate case and non-rate case years). I can then use similar techniques from the

dynamic panel literature to identify and estimate the parameters. I pose the problem of identifying the

unobserved type distribution as a measurement error problem with repeated measurements and apply

the result of Kotlarski (1967) to establish nonparametric identification. Finally, I estimate the type

distribution using the smoothed discrete approximation developed by Hausdorff (1923) and applied by

Beran and Hall (1992).
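The repeated-measurements idea can be illustrated with a small simulation. This is only a sketch under assumed values, not the paper's estimator: the two-point type distribution, the error standard deviation and the sample size are all hypothetical. With two noisy measurements of the same type and mutually independent mean-zero errors, the covariance of the measurements recovers the variance of the type, whereas the variance of a single measurement overstates it. Kotlarski's result goes further and identifies the whole type distribution, which this sketch does not attempt.

```python
import random

def simulate(n, seed=0):
    """Draw n pairs of noisy measurements of a common unobserved type."""
    rng = random.Random(seed)
    y1, y2 = [], []
    for _ in range(n):
        theta = rng.choice([0.0, 1.0])          # hypothetical two-point type, Var = 0.25
        y1.append(theta + rng.gauss(0.0, 0.5))  # first measurement, independent error
        y2.append(theta + rng.gauss(0.0, 0.5))  # second measurement, independent error
    return y1, y2

def covariance(a, b):
    """Sample covariance (dividing by n) of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

y1, y2 = simulate(50_000)
var_theta_hat = covariance(y1, y2)  # estimates Var(theta)
naive_var = covariance(y1, y1)      # estimates Var(theta) + Var(error)
print(var_theta_hat, naive_var)
```

The first number should be close to the assumed type variance of 0.25 and the second close to 0.50, since the single-measurement variance also contains the error variance.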

I examine welfare under different regulatory regimes given the estimated primitives. Welfare gains

from optimal pollution regulation are computed relative to the uniform emission standard that maximizes

allocative efficiency. Optimal pollution regulation can be theoretically implemented using type-dependent

transfers and type-dependent emission tax rates. Because this is difficult to implement in practice especially

when firms are sufficiently heterogeneous, I estimate welfare from a uniform emission tax regime and

2 Incentives to exert effort after the rate case are a common feature in models with a regulatory lag, e.g. Baumol and

Klevorick (1970), Bailey and Coleman (1971), and Pint (1992). Regulatory lag here refers to the time between rate cases

rather than the duration of the case.

3 Firms’ intrinsic type may also be correlated with electricity output and prices of procured fuel. This potential correlation

is another source of endogeneity.

a hybrid regime to see how much of the welfare gains from optimal pollution regulation can be captured

by these simpler alternatives. The hybrid regime is an emission tax regime that allows firms to opt-out

and join a uniform emission standard. While the hybrid regime sacrifices allocative efficiency, it allows

the social planner to decrease information rents. If the increase in welfare due to lower information rents

outweighs the loss due to distortions in allocative efficiency, then opt-out improves welfare.

When damages from SO2 emissions are valued at $100 per ton, the welfare gains from optimal pollution

regulation relative to an efficient uniform emission standard are about $32 million per firm, or 10% of

the combined variable cost of electricity generation and abatement. Welfare gains rise when abatement

is valued more. When damages are $1000 per ton, annual welfare gains rise to $155 million per firm.

Finally, I find that simpler alternatives capture a large part of these gains. The uniform emission tax

and hybrid regimes capture from 52% to 80% of the welfare gains from optimal pollution regulation. The

hybrid regime outperforms the uniform emission tax regime when the cost of public funds is high, i.e.

when reducing information rents is relatively more valuable.
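As a quick arithmetic check on the magnitudes above, the reported gains and cost shares can be used to back out the implied cost base per firm. This is a sketch only: the function name is hypothetical, the percentages are rounded in the text, and pairing the 47% share with the $155 million figure is read off the parallel ranges reported in the abstract.

```python
def implied_cost_base(gain, share_of_cost):
    """Back out the combined generation and abatement cost implied by a
    welfare gain and the share of that cost it represents."""
    return gain / share_of_cost

# Damages valued at $100 per ton: $32 million is about 10% of combined costs.
base_low = implied_cost_base(32e6, 0.10)
# Damages valued at $1000 per ton: $155 million is about 47% of combined costs
# (pairing assumed from the abstract).
base_high = implied_cost_base(155e6, 0.47)

print(round(base_low / 1e6), round(base_high / 1e6))  # cost base in $ millions
```

Both cases imply a combined cost base of roughly $320 million to $330 million per firm, so the larger percentage at the higher damage value reflects larger gains rather than a smaller cost base.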

The paper is organized as follows. The next section provides a background of the institutions. In

section 3, I start with the definition of welfare to lay out the things we need to perform the welfare

comparisons. I then discuss the model of rate regulation and characterize its equilibria. Section 4

describes the data. I also present evidence to support the particular equilibrium that will be useful for

identification and estimation. Section 5 is the main empirical section of the paper and it starts with the

empirical model set-up. Identification is tackled in subsection 5.1, followed by estimation and a discussion

of the results. Section 6 contains the counterfactual welfare exercise. The final section concludes.

Related literature My paper is most related to the line of empirical regulation literature pioneered

by Wolak (1994). Wolak (1994) and Brocas et al (2006) use the normative models of Baron and Myerson

(1982) and Besanko (1985) to provide a link between observed behavior and the firm’s private information.

This approach assumes that the actual regulatory institutions can be modeled “as if” the optimal form

of regulation was being implemented by the regulator. The optimal mechanism characterizes a mapping

between the firm’s private information and observed regulatory variables (e.g. price and rate of return)

which can then be inverted to identify and estimate the firm’s primitives. Perrigne and Vuong (2011)

formalize this identification strategy for the normative model of Laffont and Tirole (1986). One issue

with using a normative model is that it assumes a highly sophisticated regulator that can design and

commit to the optimal mechanism.4 For example, in order to derive the optimal mechanism in the Laffont

and Tirole (1986) model, the regulator needs to know the exact functional form for the effort disutility

4 Although Perrigne and Vuong (2011) allow observed regulatory variables to deviate from the one specified by the optimal

mechanism, this deviation should be unsystematic, i.e. unrelated to the firm’s primitives.

function. The regulator then designs and offers a set of contracts, and it is assumed the regulator can

commit to these.5 My approach is to directly model the rate case regulatory institution to provide the

link between observed behavior and the firm’s primitives. I build a signaling model of regulation where

the regulator takes an action after the firm provides information. Thus, I do not require the regulator to

design and commit to a particular mechanism before the firm moves.

Gagnepain and Ivaldi (2002) do not rely on a normative model and instead exploit variation in actual

regulatory regimes to estimate welfare in the French urban transport industry. My paper differs from

their identification strategy in two ways. First, the firms in their setting either face a fixed-price or a

cost-plus contract. Under the assumption that the assignment to a regulatory regime is exogenous, the

variation in regimes in the data allows identification of firms’ type and disutility of effort. In my setting,

firms face the same regulatory regime. I exploit the induced equilibrium behavior of firms across time to

get the variation I need. Second, I do not impose distributional assumptions on the type distribution.

The type distribution is nonparametrically identified and flexibly estimated.

The paper contributes to the empirical literature on pollution regulation. The closest paper is Carlson

et al (2000). They estimate the cost-savings from Phase I of the Acid Rain Program (ARP) relative to

command-and-control regimes (e.g. uniform emission standard). The sample of electric utilities I study

own the set of plants that were under Phase I. Similar to their paper, I estimate marginal abatement costs

from fuel-switching by estimating a multiproduct cost function. However, Carlson et al (2000) ignore

economic regulation in estimating marginal abatement costs which may lead to biased estimates (Wolak,

1994).6 Moreover, my focus is on welfare and optimal regulation rather than cost-savings alone.

My identification and estimation strategy for the empirical model’s parameters has its roots in the

dynamic panel literature (see Arellano and Honoré (2001) and Arellano (2003)). The key idea is to model

how unobserved heterogeneity evolves and to use transformations of the data so that the unobserved

heterogeneity does not appear in the estimating equations. The equilibrium I characterize generates

restrictions on the evolution of unobserved heterogeneity.

I use deconvolution techniques to nonparametrically identify the distribution of intrinsic types. Deconvolution

methods have been applied in measurement error models (e.g. Li and Vuong (1998) and

Schennach (2004)), in panel data and error components models (e.g. Horowitz and Markatou (1996);

Evdokimov (2008, 2010); Bonhomme and Robin (2010); and Arellano and Bonhomme (2012)), and in

5 Baron and Besanko (1984) introduce auditing in the Baron and Myerson (1982) model which brings the model closer to

what happens in a rate case. The commitment assumption is crucial in this model; otherwise the regulator does not have an

incentive to audit the firm and the optimal auditing policy breaks down.

6 Fowlie (2010) provides evidence that rate regulation induces firms to choose more capital-intensive abatement options in

the context of NOx emissions regulation. I look at the effect of rate regulation on abatement costs rather than the choice of

abatement method.

the auctions literature (e.g. Li et al (2000); Asker (2010); and Krasnokutskaya (2011)). In contrast to

this literature, I do not use an inverse Fourier transform to estimate the type distribution. Instead, I rely

on the smoothed discrete approximation developed by Hausdorff (1923) to solve the classical problem

of moments (Shohat and Tamarkin, 1943). The idea is to approximate the underlying distribution by a

discrete distribution whose probability mass is a linear combination of the moments of the underlying

distribution. Beran and Hall (1992) apply Hausdorff’s (1923) approximation to estimate the distribution

of random coefficients without imposing distributional assumptions on the error term.
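The discrete approximation described above can be sketched in a few lines. The sketch makes several assumptions that are not taken from the paper: the distribution is supported on [0, 1] (a rescaling would be needed otherwise), exact moments are supplied rather than estimated ones, and no smoothing step is applied. For a grid of n + 1 points j/n, the mass at j/n is E[C(n, j) X^j (1 - X)^(n - j)], which expands into a linear combination of the moments of X. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def hausdorff_masses(moments, n):
    """Probability masses on the grid j/n, j = 0..n, built from moments.

    moments[k] must equal E[X^k] for k = 0..n, with X supported on [0, 1].
    Mass at j/n is C(n, j) * sum_i (-1)^i * C(n - j, i) * moments[j + i].
    """
    masses = []
    for j in range(n + 1):
        total = sum(
            (-1) ** i * comb(n - j, i) * moments[j + i]
            for i in range(n - j + 1)
        )
        masses.append(comb(n, j) * total)
    return masses

# Example with exact moments of the uniform distribution on [0, 1]: E[X^k] = 1/(k+1).
n = 6
uniform_moments = [Fraction(1, k + 1) for k in range(n + 1)]
masses = hausdorff_masses(uniform_moments, n)
print(masses)
```

For the uniform example every grid point receives the same mass, 1/(n + 1), which is a convenient sanity check. With estimated moments the alternating sum can be numerically unstable, which is one reason a smoothed version may be preferred in practice.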

The welfare exercise I perform is similar to the exercise in the empirical price discrimination literature,

e.g. Leslie (2004), Miravete (2007), Villas-Boas (2009), Hendel and Nevo (2012), and Lazarev (2011),

where the fully optimal pricing strategy is compared to simpler ones. Finally, the hybrid regime I construct

can be seen as a binary menu in the spirit of Rogerson (2003) and Chu and Sappington (2007). These

two papers use numerical examples to examine the performance of the simpler binary menu relative to

the fully optimal menu. My paper does this exercise empirically.

2 Institutional background

I first provide an overview of the investor-owned electric utility and how the utility simultaneously produces

electricity and emissions. I then briefly discuss the history of SO2 emissions regulation. Finally, I

describe rate regulation and what goes on in a rate case. Although the paper is about pollution regulation,

accounting for the existing form of economic regulation is an integral part of my research strategy.

Electric utilities are vertically-integrated monopolists regulated by the State Public Utility Commission

(PUC). They own and operate the generation, transmission and distribution of electricity within a

given service area (typically within a state but can sometimes cross state boundaries). The generation

sector is composed of multiple plants that transform energy sources such as fossil-fuels and nuclear energy

into electricity. The transmission sector is responsible for moving electricity from plants to local distribution

centers using high-voltage power lines. The distribution sector is then responsible for delivering

electricity to end-users. My paper focuses on the operating expenses related to generating electricity from

fossil-fuels, which are about 40% of total operating expenses.

An electric utility owns multiple plants and these plants differ depending on the type of fuel they

burn. The electric utilities I consider all own coal, oil and natural gas plants.7 Coal plants are typically

baseload plants since these plants run continuously and cost-effectively meet some minimum level of

7 Electricity output of the utility can also come from nuclear plants and from other plants not owned by the utility

(purchased power). I exclude these sources from my cost measure and only focus on output from coal, oil and natural gas

plants.

Figure 1: Coal-fired power plant. Source: http://edu.glogster.com/glog.php?glog_id=15469719

electricity demand that the utility expects. In contrast, natural gas plants are peaking plants which are only

turned on and utilized during times when demand is high and baseload plants are inadequate to meet

demand. The electric utilities in my sample primarily rely on coal to produce electricity. The average

ratio of coal consumption to total fuel burned is 92%.

Figure 1 illustrates the electricity generation process in a coal-fired power plant. First, coal is fed

into mills and pulverized into fine powder. This fine powder is mixed with air and then blown into the

boiler’s furnace and burned. At the same time, water flows through tubes inside the boiler. The burned

coal releases heat which then turns water inside the boiler into high pressure steam. The high pressure

steam rotates the turbine blades and the attached generator converts mechanical energy into electrical

energy. The coal-burning process also produces by-products such as ash and emissions. Ash is collected

while emissions flow through the plant’s stacks and into the atmosphere.

Coal-fired plants account for 65% of SO2 emissions (Environmental Protection Agency, 2001). Coal

contains sulfur and SO2 is released to the atmosphere as a by-product when the coal is burned. Sulfur

content ranges from about 0.2 pounds per million Btu of heat input (lbs/MMBtu) to about 7 lbs/MMBtu (Perry

et al, 1997) and coal used for fuel is generally categorized either as bituminous or sub-bituminous.

Bituminous coal tends to have a higher heat content but also a higher sulfur content than sub-bituminous

coal. There is typically a tradeoff between heat and sulfur content so absent pollution

regulation, plants tend to burn coal with higher sulfur content. Distance of the plant from coal mines

is another factor that determines coal choice since transportation costs are a significant component of

delivered prices. The dirtiest plants in terms of SO2 are those that are located far from sources of lower

sulfur coal.8

Two primary forms of SO2 emissions abatement are fuel-switching (or blending), and installation of

a flue-gas desulfurization (FGD) unit or scrubber. Fuel-switching involves using coal with lower sulfur

content or blending different types of coal with varying sulfur contents. This form of compliance has a

direct impact on electricity production costs. Lower sulfur coal produces less heat, hence more coal has

to be burned to produce the same quantity of electricity. As a second form of compliance, a plant can

install an FGD which is an end-of-pipe control technology installed near the plant’s emission stacks. The

plant can still burn high sulfur coal, and the FGD will scrub SO2 from the emissions stream. Although

installing a scrubber can also affect the cost of producing electricity by lowering fuel efficiency of the

plant (Fabrizio et al, 2007), capital and installation costs are the main components of abatement cost

and are less well captured by the cost of producing electricity.

I focus on fuel-switching as an abatement strategy and measure marginal abatement cost as the

increase in the cost of producing electricity for an incremental reduction in emission rates. If C (q, s) is

the cost of producing electricity q given an emission rate of s, then the marginal abatement cost per unit

of the emission rate, which is measured in lbs per million British thermal units (MMBtu), is9

$$\mathrm{MAC} = \frac{\partial C(q, s)}{\partial s}.$$
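To make the measure concrete, the sketch below evaluates it numerically for a made-up cost function and converts the result to per-ton terms. Everything in it is an assumption for illustration: the functional form of `cost`, its coefficients, and the output and heat input values are hypothetical and are not the paper's specification. Following the verbal definition, the code reports the increase in cost from an incremental reduction in the emission rate, that is, the magnitude of the derivative with respect to s. The per-ton conversion holds heat input fixed and uses 2000 lbs per short ton.

```python
def cost(q, s):
    """Hypothetical cost ($) of producing q MWh at emission rate s (lbs/MMBtu).
    Cost rises as s falls, mimicking a premium for lower sulfur coal."""
    return q * (20.0 + 4.0 / s)

def mac_per_rate(q, s, h=1e-4):
    """Cost increase per unit reduction in s, in $ per (lb/MMBtu),
    approximated with a central finite difference."""
    return (cost(q, s - h) - cost(q, s + h)) / (2.0 * h)

def mac_per_ton(q, s, heat_input_mmbtu):
    """Convert to $ per ton of SO2: emissions in lbs equal s times heat input,
    so one unit of s corresponds to heat_input_mmbtu / 2000 tons."""
    return mac_per_rate(q, s) * 2000.0 / heat_input_mmbtu

q = 1e6              # hypothetical annual output, MWh
s = 2.0              # hypothetical emission rate, lbs/MMBtu
heat_input = 1e7     # hypothetical fuel burned, MMBtu

print(mac_per_rate(q, s), mac_per_ton(q, s, heat_input))
```

With these illustrative numbers the analytic values are 4q/s^2 = $1 million per lb/MMBtu and $200 per ton, and the finite-difference results should be very close to both.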

Fuel-switching is the popular abatement method during my sample period (1988-1999). In my sample,

there are only 15 plants out of about 150 that newly installed an FGD. Plants with FGDs represent only

20% of all the plants. This number includes plants that installed FGDs to satisfy SO2 regulations that

were in place before Title IV of the Clean Air Act Amendments of 1990. The share of abatement from

fuel-switching during this period ranged from 54% to 60% (Ellerman and Montero, 2007, Table 5).

Utility-level differences in productivity and cost efficiency depend on the portfolio of plants each utility owns

and the manpower involved to run these plants. While I focus on overall utility-level cost efficiency, an

important driver of differences in cost efficiencies across firms is the individual efficiencies of the plants they

own. Because fuel expenses make up 75% of operating expenses (excluding capital), an important aspect

of plant-level efficiency is fuel efficiency. More importantly, fuel efficiency directly impacts abatement

costs when a significant part of emission reductions comes from fuel-switching.

Differences in fuel effi ciency can be driven by factors related to manpower. At the plant-level, Bushnell

and Wolfram (2007) document differences in plant operator skill and effort levels that lead to significant

8 Rail deregulation and falling delivered prices of sub-bituminous coal from the Powder River Basin (PRB) made this

type of coal more competitive. However Ellerman et al (1990, p. 89) note that although the competitiveness of PRB coal

led to an overall decrease in contracted prices of coal, long-term contracts continued delivering high sulfur coal.

9 This measure of marginal abatement cost can be converted to per-ton terms by using information on the amount of fuel

burned (in MMBtu).

differences in plant efficiency. While some processes are automated, activities such as controlling the rate

at which coal mills feed pulverized fuel to burners, adjusting the mix of air and fuel in the mills, and

operating soot blowers in boilers crucially depend on the plant operator’s skill and effort levels, especially

at coal-fired plants. Despite the impact on plant efficiency of the “operator effect”, salaries of plant

operators are not commensurate with the cost differences induced by plant efficiency, and managers have

rarely instituted personnel policies directly aimed at improving operator efficiency. The authors remark

that one reason for such a lack of policies is that existing economic regulation does not provide adequate

incentives to the firm and its managers to improve efficiency.

Another dimension where “effort” can affect operating costs is via fuel procurement. H. S. Chan et

al (2012) find evidence that restructuring lowered fuel procurement costs by about 6%. The idea is that

rate regulation may not be providing enough incentives for the firm’s managers to find the best price or

to renegotiate long-term contracts.

2.1 SO2 emissions regulation

SO2 produces sulfates when emitted in the atmosphere and these particles can lead to heart and lung

disease (EPA, 2009). SO2 is also a precursor of acid rain which has adverse effects on the eco-system.10

Ellerman et al (1990, Ch 2) provide a detailed summary of the political history of SO2 emissions

regulation. I highlight some interesting points in what follows. The traditional form of pollution regulation

is command-and-control where the regulator either sets a fixed uniform upper bound on the emission rate

of firms (uniform emission standard) or requires firms to install specific control technologies (technology

mandate). The Clean Air Act Amendments (CAAA) of 1970 established the New Source Performance

Standards (NSPS) as a direct form of SO2 emissions regulation. NSPS required new coal-fired plants

to have an emission rate below 1.2 lbs/MMBtu which can be met by burning lower sulfur fuel. Older

plants were not subjected to this requirement but it was expected that these plants would be retired

in the near future. The CAAA was further amended in 1977 and essentially required new plants to

install scrubbers despite already meeting the NSPS emission rate. Old plants were again shielded from

this requirement. However the expected retirements never materialized. By 1985, 83% of emissions from

power plants came from these exempted old plants.

Concerns about the adverse effects of acid rain on the eco-system served as impetus to enlarge

the scope of SO2 emissions regulations to coal-fired plants that were not subject to NSPS. Recognizing

10 Acid rain is formed when SO2 is emitted in the atmosphere and mixed with water, oxygen and oxidants to form acidic

compounds that eventually fall back to the earth (National Acid Precipitation Assessment Program, 2005). Acid rain

increases the acidity of lakes and other bodies of water, leads to the degradation of forests and soil quality, and damages

structures (EPA, 2007).

that plants have heterogeneous abatement capabilities and that firms have better information on what these

capabilities are, policy-makers have moved from the one-size-fits-all regime to a decentralized, market-

based regime. This led to the creation of the Acid Rain Program (ARP) under Title IV of the Clean Air

Act Amendments of 1990. Firms were required to hold emission permits for each ton of emissions and

these permits can be traded in a market.

While generally lauded as a success (G. Chan et al, 2012), the legislative history of ARP illustrates

that implementation of the program largely hinged on the ability to redistribute the benefits of abatement

and compensate affected polluting sources via freely allocated initial permits (Joskow and Schmalensee,

1998; Ellerman et al, 1990 Ch 3; G. Chan et al, 2012). Around 6 million permits were grandfathered

(Joskow and Schmalensee, 1998), which had a value of about $600M to $1.8B. This type of redistribution

has its own costs since forgone revenues from grandfathered permits could have been “recycled” and

used to reduce distortionary taxes elsewhere in the economy (Goulder et al, 1997). This issue leads to

debates on whether the government should grandfather emission permits or sell them in an auction (see

for example, Cramton and Kerr (2002)).

2.2 Rate case

The traditional form of economic regulation is rate regulation.11 Rate regulation is primarily conducted

within a rate case. The rate case is a quasi-judicial proceeding whose main goal is to set the revenue

requirement, which forms the basis for the regulated prices to charge consumers. The revenue requirement

is the total amount that needs to be collected from consumers to compensate the firm for providing

services. It is the sum of operating expenses and the return on the rate base (RRB), which is the

monetary value assigned to the firm’s invested capital (rate base) multiplied by an allowed rate of return.

RRB can be thought of as the utility’s profit over and above its operating costs.

The rate case serves as a platform for the firm to provide information about its operating cost and

environment to the regulator (public utility commission or PUC), who then decides on what revenue

requirement to authorize. The case is typically initiated by the firm although the regulator, urged by

consumer groups, can also initiate a case. A hearing takes place where the firm and concerned parties (e.g.

consumer interest groups) participate and provide testimony on the rationale of the proposed changes and

the potential impacts these may have on consumer welfare. The firm (and its experts), consumer groups,

and commission staff testify to support their position and to refute opposing arguments. A discovery

phase also occurs where bodies of facts and data are presented. If a settlement between concerned parties

is not reached, the PUC commissioners decide on the case. The decision consists of the approved revenue

requirement which often differs from the initial proposal of the firm.

11The traditional form of regulation is also sometimes called rate-of-return regulation or cost-of-service regulation.


In theory, the debate and disagreement in rate cases revolve around these three elements: operating

expenses, the rate base, and the rate of return. In practice, major rate cases focus on the determination of

the rate base and especially the rate of return. Reported operating expenses are typically passed through

as long as these abide by certain accounting rules.

To give a flavor of what goes on in a rate case, I summarize a few rate cases in the appendix. These

cases come from written reports prepared by the Regulatory Research Associates (RRA). Consistent with

Alt’s (2006, p. 27) guide to major rate cases, most of the disallowances in expenses are actually accounting-

related adjustments. A typical expense that is disallowed concerns depreciation of the firm’s fixed assets.

Presumably, it is harder to find strong, admissible evidence that the firm operated inefficiently, while

deviations from accounting adjustment rules are just more tangible.

In terms of the rate base, the PUC may disallow certain assets if they do not satisfy the “used and

useful” criterion. For example, in the case involving Gulf Power and the Florida PUC in the appendix,

the firm’s stake in a plant was disallowed because the PUC concluded that the firm already has enough

capacity.

The sample cases in the appendix provide examples of how the authorized rates of return are reached.

The firm starts with a proposed rate of return, predicated on a proposed capital structure, cost of debt,

and return on equity. The firm presents witnesses to support its proposal. The PUC staff performs its

own research and presents what the rates should be based on its findings. Typically the commission staff

reports a range of rates of return. The PUC commissioners examine the firm’s and staff’s arguments and

finally vote on what rate to authorize.

The PUC can punish the firm for “unethical or illegal” activities by imposing a deduction on the firm’s rate of return (see Gulf Power case in the appendix). Thus the PUC can potentially use the authorized rate of return as an incentive device for the firm to operate efficiently, and the model I present in the next section allows for this. Whether the regulator actually uses this device is an empirical question (see section 4).

3 Model

This section presents the model of rate regulation. Specifically, I model the rate case as a signaling game

of auditing. Before discussing the model of rate regulation, I take a step back and talk about welfare in

the social planner’s problem. Ultimately I want to compute welfare under the optimal form of pollution

regulation, which is a counterfactual. The definition of welfare tells us what elements are necessary for

this computation. The purpose of the rate regulation model is to rationalize the observed data, which

allows me to back out these elements.


Consider a social planner whose responsibility encompasses both pollution and economic regulation.

The social planner is the combination of the pollution regulator (Environmental Protection Agency) and

the economic regulator (Public Utility Commission). The planner faces a population of electric utilities, each endowed with type (θ,R), which is distributed according to the joint distribution F. The variable θ affects the firm’s operating cost of producing electricity and emissions (abatement), while R is the firm’s capital cost. I assume $\theta \in [0, \theta^U]$ and $R \in [0, R^U]$, where $\theta^U$ and $R^U$ are finite upper bounds.

The goal of the social planner is to maximize welfare and I assume regulation is static. Following

Laffont (1994), I define social welfare as

$$W = \int \left[ V(q(\theta,R)) - D(s(\theta,R)) - (1+\lambda)\,t(\theta,R) + \Pi(\theta,R) \right] dF \tag{1}$$

where V (q (θ,R)) is the gross consumer surplus from electricity produced by firm (θ,R), i.e. q (θ,R);

D (s (θ,R)) is the pollution damage given emission rate s (θ,R); t (θ,R) is a lump-sum transfer paid to

the firm; λ is the social cost of public funds; and Π (θ,R) is the firm’s profit. Thus welfare is the sum of

consumer and producer surplus, taking into account that transfers to the firm are funded by distortionary

taxes to consumers. The planner decides on quantities and transfers to maximize welfare.

The profit of firm (θ,R) is given by

Π (θ,R) = t (θ,R)− [exp (θ − e (θ,R))C (q (θ,R) , s (θ,R)) + ψ (e (θ,R)) +R] . (2)

The term in square brackets is the firm’s total economic cost, which is composed of three elements. First,

the operating cost of producing electricity q and emission rate s is given by exp (θ − e)C (q, s) where

e ≥ 0 is managerial effort. Second, ψ (e) captures the disutility from managerial effort. I assume the firm

and its managers are one entity so ψ (e) appears in the firm’s total cost. Finally R is the firm’s capital

cost.

In order to evaluate welfare under different counterfactual regulatory regimes (including the optimal

one), I need to figure out how firms behave when facing regulation. The elements required are the

distribution of types F , the disutility function ψ (·) and the (baseline) cost function C (q, s). To identify

these elements from the data, I exploit the fact that electric utilities were subject to economic regulation.

Although the focus of the paper is pollution regulation, the model of economic regulation allows me to

back out the required primitives from firms’ observed behavior, and estimate the distribution of marginal

abatement costs. It is therefore important to use a model that realistically captures the actual form of

regulation.

3.1 Rate regulation

As discussed in section 2.2, rate regulation is carried out through the rate case, where the electric utility and the regulator (i.e. public utility commission or PUC) set the revenue requirement. The revenue requirement is the amount of revenues the firm is allowed to collect from consumers and regulated prices

are based on this amount. The revenue requirement is the sum of firm expenses and the return on the

rate base (RRB). RRB is equal to a rate of return multiplied by the rate base, which is the value of

invested capital that the firm is allowed to earn profits on. In the model, the firm’s RRB is represented

by R.

Ideally the regulator would like to set the revenue requirement equal to the firm’s total economic

costs so that the firm earns zero economic profits. As seen from equation (2), economic profits depend

on (θ,R) and e which I assume the regulator does not observe. The hidden type θ and hidden effort e

are standard components of private information in regulatory models in the spirit of Laffont and Tirole

(1986). As in Laffont and Tirole (1986), I assume the regulator observes exp (θ − e)C but not θ and e

separately. Thus a firm with high operating cost may either be a high cost type or a firm that did not

exert effort.

The second dimension of unobserved type is the required RRB. Much of the debate in rate cases is

on what the fair rate of return should be and what should be included in the rate base (i.e. prudently

incurred investments that are used and useful). Instead of separately modeling these two components, I

assume that R is private information of the firm.12 Owners of the firm are likely to have better information

about what investments need to be pursued and what outside investment opportunities exist.

The rate case acts as a platform for the regulated firm to propose a revenue requirement and share

relevant information, and for the regulator to use this information and decide on what revenue requirement

to authorize. I assume that there is a fixed and known level of output that the firm has to meet, and

the rate case is about how to compensate the firm for producing this output. I model the rate case as

a signaling game. The purpose of the model is to provide predictions on how the firm behaves during a

rate case and immediately after it. Thus the full model is a three period (year) model: an initial period

where the firm draws its type; a second period where the rate case actually occurs; and a third period

which is the year after the rate case. I first present the timing of the full game and then discuss the firm

and regulator’s payoffs and optimization problems after. The timing of the game is as follows:

• t = 0 (Initial)

– Firm draws (θ,R) from a joint distribution F.

• t = 1 (Rate case)

– Firm produces the required output by exerting effort $e_1$. Firm proposes the return on the rate base (RRB), denoted by $\tilde{R}$, and reports operating cost $\tilde{C} = \exp(\theta - e_1)C$.

– The regulator observes $(\tilde{C}, \tilde{R})$ and fully passes through reported operating cost, i.e. authorized expense is equal to $\tilde{C}$. To determine authorized RRB, the regulator decides on auditing intensity α ∈ [0, 1] and incurs auditing cost A(α).

– An auditing technology leads to an authorized RRB denoted by $\hat{R}$. The authorized revenue requirement is thus $\tilde{C} + \hat{R}$, and the firm is allowed to collect this amount at the end of the period.

• t = 2 (Post-rate case)

– Given the authorized revenue requirement, the firm produces the required output by exerting effort $e_2$.

12 I plan to explore in the future the case where the firm can use capital as a signal.

The formal signaling part of the game occurs when the firm proposes a return on the rate base (RRB) $\tilde{R}$. The regulator observes the reported operating cost and the proposal, $(\tilde{C}, \tilde{R})$, and determines the revenue requirement as follows. First, the regulator fully passes through observed operating cost by setting authorized expense equal to $\tilde{C}$. Second, to determine the authorized RRB, the regulator audits the firm. In particular, the regulator chooses auditing intensity α ∈ [0, 1] and this determines how close the authorized RRB $\hat{R}$ is to the true R. Larger values of α reflect tougher auditing but this entails a nonlinear cost A(α). I describe the auditing technology in the discussion of the regulator’s problem.

I make the following assumptions. First I assume that message spaces are $[0, \tilde{R}^U]$ for the proposed RRB $\tilde{R}$, and $(0, \exp(\theta)C]$ for the reported cost $\tilde{C}$. I assume $\tilde{R}^U \ge R^U$, which means I allow firms to propose an RRB that is larger than the highest possible R.13 As for the message space of $\tilde{C}$, I allow the firm to report almost zero costs, which it can do if it exerts an extremely high level of effort. Since a firm with type θ generates the report $\tilde{C}$ through production, the largest possible $\tilde{C}$ is when it exerts zero effort. Second, I assume auditing cost is strictly increasing and strictly convex in the auditing intensity: A′ > 0 and A′′ > 0. If auditing is viewed as a kind of information-gathering process, then this assumption means that it is cheap to gather information at the start but as the regulator exhausts the pool of useful information, new useful information is harder to come by.

3.2 Regulator’s problem

Define V as the sum of gross consumer surplus from output during t = 1 and 2. The revenue requirement is the payment to the firm collected from consumers and so this reduces consumer surplus. Authorized expense is equal to the observed operating cost during the rate case, i.e. $\tilde{C}$. The authorized RRB is determined via auditing and is equal to $\hat{R}$. Thus the authorized revenue requirement is $\tilde{C} + \hat{R}$. I assume that auditing costs A(α) are shouldered by consumers so this reduces consumer surplus. Finally I assume that the regulator only cares about consumer surplus and thus welfare is14

$$W_{PUC} = V - 2\left(\tilde{C} + \hat{R}\right) - A(\alpha).$$

13 This assumption is not necessary although it helps provide a clean characterization of equilibrium. The assumption basically allows a fully-separating equilibrium to exist. Without the assumption, the fully-separating equilibrium becomes a partially-pooling equilibrium with R-types in the upper edge of the type space pooling on the signal $\tilde{R}^U$.

The regulator is required to authorize a rate of return that is fair from the point of view of the firm, given prudently incurred investment. The law does not provide specific guidance as to how the fair rate of return is determined except that the regulator, in determining the rate, “has made a reasonable attempt to ensure that the results of its actions are not confiscatory or unfairly burden any of the parties to the proceeding” (Joskow, 1974). To eliminate potential expropriation by the auditing technology, I assume that auditing is biased in the sense that it always produces an authorized RRB at or above the true R, i.e. $\hat{R} \ge R$.

There is no obvious way to model the auditing technology. Banks (1992) and Baron and Besanko (1984) have modeled this as a perfect technology, i.e. the regulator chooses the probability of audit and auditing perfectly reveals R. The problem with such an interpretation is that there is no direct link between the authorized RRB in the data and the model.

I instead model auditing as a result of a technology with the property that greater auditing intensity leads to an authorized RRB that is “nearer” the true R. Since auditing becomes more expensive as intensity increases, the regulator faces a tradeoff between authorizing an RRB that is closer to the true R and paying a higher auditing cost, or authorizing an RRB that is closer to the proposed $\tilde{R}$ and paying a lower auditing cost. I assume

$$\hat{R} \equiv \alpha R + (1-\alpha)\tilde{R}. \tag{3}$$

Thus an increase in auditing intensity puts more weight on the true RRB.

A formal example of how to interpret the technology is as follows. Imagine that auditing intensity generates a distribution with support $[R, \tilde{R}]$, where R is the true RRB and $\tilde{R}$ the proposed RRB, and this distribution is decreasing in α, in the first-order stochastic sense. Assume that the regulator authorizes an RRB equal to the mean of this distribution. Thus higher values of α would lead to authorized RRBs that are closer to R. A distribution that generates (3) as its mean is a four-parameter beta distribution with shape parameters (1 − α) and α, and bounds R and $\tilde{R}$.
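The beta-distribution interpretation can be checked numerically. The sketch below is only an illustration: the function names, the seed and the number of draws are assumptions, not part of the paper. It draws from a Beta(1 − α, α) distribution rescaled to lie between the true and the proposed RRB and compares the simulated mean with the weighted average α·(true RRB) + (1 − α)·(proposed RRB).

```python
import random


def authorized_rrb_mean(true_rrb, proposed_rrb, alpha):
    """Weighted average alpha * true + (1 - alpha) * proposed."""
    return alpha * true_rrb + (1.0 - alpha) * proposed_rrb


def simulate_audit_mean(true_rrb, proposed_rrb, alpha, n_draws=100_000, seed=0):
    """Monte Carlo mean of a Beta(1 - alpha, alpha) draw rescaled to [true, proposed].

    Both shape parameters must be positive, so alpha has to lie strictly
    between 0 and 1. The seed and n_draws are arbitrary illustrative choices.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    rng = random.Random(seed)
    width = proposed_rrb - true_rrb
    total = 0.0
    for _ in range(n_draws):
        total += true_rrb + width * rng.betavariate(1.0 - alpha, alpha)
    return total / n_draws
```

With a hypothetical true RRB of 100 and a proposal of 120, raising α moves the mean toward 100, which is the sense in which tougher auditing brings the authorized RRB closer to the truth.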

The regulator knows how the auditing technology works. However it does not know the true R and hence the regulator has to form a conjecture when choosing α. I denote this “belief” as $\varrho$.15 Finally note that the only observable that explicitly enters equation (3) is the firm’s proposed RRB, $\tilde{R}$.16 Nonetheless the belief $\varrho$ is allowed to be a function of the reported operating costs so it implicitly enters equation (3) through α.

14 This assumption is not critical for the results. It suffices to have the regulator put a higher weight on consumer surplus relative to producer surplus. Notation becomes more complicated when welfare is defined as a weighted sum of consumer and producer surplus since the regulator has to apply its beliefs on the firm’s profit.

Putting all these together, the regulator’s problem is to choose auditing intensity α ∈ [0, 1] to maximize

$$W_{PUC} = V - 2\left(\tilde{C} + \alpha\varrho + (1-\alpha)\tilde{R}\right) - A(\alpha)$$

after observing the reported cost and the proposed RRB, $(\tilde{C}, \tilde{R})$.

An interior optimal auditing strategy satisfies

$$A'(\alpha) = 2\left(\tilde{R} - \varrho\right). \tag{4}$$

Given $\tilde{R}$ and the belief $\varrho(\tilde{R}, \tilde{C})$, the regulator chooses auditing intensity such that the marginal cost of auditing is just equal to the marginal benefit. The marginal benefit of auditing is the amount that the firm gets from overstating its RRB when the regulator forgoes auditing.
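To make condition (4) concrete, the following sketch solves the regulator's problem under an assumed quadratic auditing cost A(α) = κα²/2. The functional form, the parameter κ (`kappa`) and all function names are illustrative assumptions; the paper only requires A to be strictly increasing and strictly convex.

```python
def optimal_audit_intensity(proposed_rrb, belief, kappa):
    """Solve A'(alpha) = 2 * (proposed - belief) for the assumed A(alpha) = kappa * alpha**2 / 2.

    The interior solution is clipped to [0, 1], the admissible range of alpha.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive for a strictly convex cost")
    interior = 2.0 * (proposed_rrb - belief) / kappa
    return min(1.0, max(0.0, interior))


def regulator_payoff(alpha, surplus, reported_cost, proposed_rrb, belief, kappa):
    """Regulator's objective under the assumed quadratic cost.

    The expected authorized RRB is alpha * belief + (1 - alpha) * proposed,
    and the revenue requirement is paid in both periods.
    """
    expected_authorized = alpha * belief + (1.0 - alpha) * proposed_rrb
    audit_cost = kappa * alpha ** 2 / 2.0
    return surplus - 2.0 * (reported_cost + expected_authorized) - audit_cost
```

A grid search over α with `regulator_payoff` should recover the same maximizer as `optimal_audit_intensity`, which is a quick way to confirm the first-order condition for any chosen numbers.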

3.3 Firm’s problem

I define U as the sum of the firm’s profit during and after the rate case, i.e. $U = \Pi_1 + \Pi_2$. The firm incurs operating cost $\tilde{C} = \exp(\theta - e_1)C$ during the rate case but receives the authorized revenue requirement $\tilde{C} + \hat{R}$ at the end of the period. After the rate case the firm incurs operating cost $\exp(\theta - e_2)C$ and receives the authorized revenue requirement again. Thus the firm’s total profit is given by

$$U = 2\left[\exp(\theta - e_1)C + \hat{R}\right] - \left[\exp(\theta - e_1)C + \psi(e_1) + \exp(\theta - e_2)C + \psi(e_2)\right] - 2R$$

and the firm chooses $e_1$, $e_2$ and the proposal $\tilde{R}$ to maximize U given the firm’s conjecture about the regulator’s auditing strategy.

15 Formally, a belief is a probability distribution $\mu\left((\theta,R) \mid (\tilde{R}, \tilde{C})\right)$. What I call “belief” in the body of the paper is actually

$$\varrho(\tilde{R}, \tilde{C}) = E_\mu(R) = \int R \, d\mu.$$

16 I have analyzed the more general model

$$\hat{R} \equiv \alpha R + (1-\alpha)\tilde{R} - \alpha X(\tilde{C}, \theta)$$

where $X(\tilde{C}, \theta)$ is a punishment for exerting less effort during the rate case compared to the “first best”. I still find that there is an equilibrium where $e_1 = 0$ (similar to Proposition 1 except $\alpha_{\tilde{C}} > 0$) and the data is consistent with this equilibrium.

An interior optimal proposal $\tilde{R}$ equates the marginal benefit from increasing the proposal with its marginal cost:

$$(1-\alpha) = \alpha_{\tilde{R}} \cdot (\tilde{R} - R)$$

where $\alpha_{\tilde{R}} = \partial\alpha/\partial\tilde{R}$. For a dollar increase in the proposed RRB, the authorized RRB increases by (1 − α). However the increase in $\tilde{R}$ affects the regulator’s auditing intensity. If the dollar increase makes auditing more intense, then $\alpha_{\tilde{R}} \cdot (\tilde{R} - R)$ reflects the loss of the firm from a tougher audit. Thus any interior solution requires $\alpha_{\tilde{R}} > 0$; otherwise there is no cost to proposing larger values of $\tilde{R}$. Finally the marginal cost of increasing the proposal $\tilde{R}$ is decreasing in R. This feature allows sorting of types based on the proposed RRB.
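The condition for the optimal proposal can be traced back to the firm's objective. As a sketch, write $\tilde{R}$ for the proposed RRB, $\hat{R}$ for the authorized RRB and $\alpha_{\tilde{R}} = \partial\alpha/\partial\tilde{R}$ (the accents are notation assumed here for readability). Substituting the auditing technology (3) into U and differentiating with respect to the proposal, holding effort fixed, gives:

```latex
\begin{align*}
U &= 2\left[\exp(\theta - e_1)C + \alpha R + (1-\alpha)\tilde{R}\right]
     - \left[\exp(\theta - e_1)C + \psi(e_1) + \exp(\theta - e_2)C + \psi(e_2)\right] - 2R, \\
\frac{\partial U}{\partial \tilde{R}}
  &= 2\left[\alpha_{\tilde{R}} R + (1-\alpha) - \alpha_{\tilde{R}}\tilde{R}\right] = 0, \\
\Longrightarrow\quad (1-\alpha) &= \alpha_{\tilde{R}}\,(\tilde{R} - R).
\end{align*}
```

Only the authorized RRB depends on the proposal, both directly with weight (1 − α) and indirectly through the auditing intensity, which is why the two sides of the condition read as a marginal benefit and a marginal cost.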

The optimal proposal $\tilde{R}$ can be seen as a markup over R where the markup reflects the elasticity of auditing intensity with respect to a change in the proposal:

$$\tilde{R} = R + \frac{1-\alpha}{\alpha_{\tilde{R}}}.$$

The markup is larger the less sensitive auditing intensity is to increases in the proposed RRB.

The firm operates for two periods: during the rate case and after it. The optimal choice of effort after

the rate case satisfies

ψ′ (e2) = exp (θ − e2)C.

It equates the marginal disutility of effort with the cost-reduction due to effort. Because the authorized

revenue requirement is already fixed, the firm is the residual claimant to all cost-savings due to effort and

so the marginal benefit of effort is equal to exp (θ − e2)C.
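As an illustration of the post-rate-case effort condition, the sketch below solves ψ′(e2) = exp(θ − e2)C by bisection under an assumed quadratic disutility ψ(e) = γe²/2, so that ψ′(e) = γe. The quadratic form, the parameter γ (`gamma`) and the function name are assumptions made for this example only; the paper estimates ψ rather than assuming it.

```python
import math


def post_rate_case_effort(theta, base_cost, gamma, n_iter=200):
    """Solve gamma * e = exp(theta - e) * base_cost for e >= 0 by bisection.

    The left side rises from zero and the right side falls from
    exp(theta) * base_cost, so the root is unique and lies below
    exp(theta) * base_cost / gamma.
    """
    if gamma <= 0 or base_cost <= 0:
        raise ValueError("gamma and base_cost must be positive")
    lo, hi = 0.0, math.exp(theta) * base_cost / gamma
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if gamma * mid < math.exp(theta - mid) * base_cost:
            lo = mid  # marginal benefit still exceeds marginal disutility
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Under this assumed form, a higher θ (a less efficient type) raises the marginal benefit of effort at every effort level, so the solved effort increases with θ.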

Optimal effort during the rate case satisfies the inequality

$$\psi'(e_1) \ge \left[2\alpha_{\tilde{C}} \cdot (\tilde{R} - R) - 1\right]\exp(\theta - e_1)C$$

where $\alpha_{\tilde{C}} = \partial\alpha/\partial\tilde{C}$, $\tilde{C}$ is the reported operating cost and $\tilde{R}$ is the proposed RRB. If this inequality is strict, then the firm does not exert effort. The term on the right-hand side is the marginal benefit from exerting effort. The firm does not benefit from cost-reductions during the rate case because these are fully passed through to consumers. Moreover, exerting effort reduces next period’s revenues, creating a further disincentive to exert effort. However if auditing is sufficiently increasing in the firm’s operating cost, then the firm may have incentives to exert a positive level of effort. Thus a necessary condition for a positive optimal level of effort is that auditing becomes tougher when the regulator observes larger operating costs.
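The corner logic for rate-case effort can be sketched as follows. The example again assumes a quadratic disutility ψ(e) = γe²/2, and it treats the audit sensitivity to reported cost (`alpha_c`) and the gap between the proposed and true RRB as given numbers, although in the model both are equilibrium objects. All names are hypothetical.

```python
import math


def rate_case_effort(theta, base_cost, gamma, alpha_c, proposed_rrb, true_rrb, n_iter=200):
    """Effort during the rate case under the assumed psi(e) = gamma * e**2 / 2.

    wedge is the bracketed term 2 * alpha_c * (proposed - true) - 1. If it is
    not positive, the marginal benefit of effort is not positive and the
    firm stays at the corner e1 = 0.
    """
    wedge = 2.0 * alpha_c * (proposed_rrb - true_rrb) - 1.0
    if wedge <= 0.0:
        return 0.0
    # Interior case: solve gamma * e = wedge * exp(theta - e) * base_cost by bisection.
    lo, hi = 0.0, wedge * math.exp(theta) * base_cost / gamma
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if gamma * mid < wedge * math.exp(theta - mid) * base_cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Setting `alpha_c` to zero reproduces the case where the regulator ignores reported cost, and the function returns zero effort.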


3.4 Equilibrium

I use Weak Perfect Bayesian Equilibrium (PBE) as my equilibrium concept. In my context, a PBE is defined as follows. Note that instead of specifying a probability distribution $\mu\left((\theta,R) \mid (\tilde{R}, \tilde{C})\right)$ for the beliefs, where $\tilde{R}$ is the proposed RRB and $\tilde{C}$ the reported operating cost, I directly use

$$\varrho(\tilde{R}, \tilde{C}) = \int R \, d\mu$$

and refer to this object as the regulator’s “beliefs”. Finally, I restrict attention to differentiable auditing strategies in the equilibrium definition. This allows me to characterize equilibria using partial derivatives of α, which I denote as $\alpha_{\tilde{R}}$ and $\alpha_{\tilde{C}}$.

Definition 1 A Weak Perfect Bayesian Equilibrium of the game is characterized by a set of strategies $\tilde{R}(\theta,R)$, $e_1(\theta,R)$ and $e_2(\theta,R)$ for the firm; a differentiable strategy $\alpha(\tilde{R}, \tilde{C})$ and “beliefs” $\varrho(\tilde{R}, \tilde{C})$ for the regulator, such that

1. Given $\alpha(\tilde{R}, \tilde{C})$ and $\varrho(\tilde{R}, \tilde{C})$, the functions $\tilde{R}(\theta,R)$, $e_1(\theta,R)$ and $e_2(\theta,R)$ maximize the firm’s profit U for each (θ,R);

2. Given any $\tilde{R}$ and $\tilde{C}$, $\alpha(\tilde{R}, \tilde{C})$ maximizes welfare $W_{PUC}$ under the belief $\varrho(\tilde{R}, \tilde{C})$;

3. Beliefs $\varrho(\tilde{R}, \tilde{C})$ are updated via Bayes’ rule, whenever possible.

As in most signaling games, the rate case game has multiple equilibria. One approach to reducing the

set of equilibria is to adopt an equilibrium refinement.17 Instead of applying an equilibrium refinement, I

focus on a particular separating equilibrium. I then check whether the data is consistent with predictions

of this equilibrium in the next section.

The equilibrium I will be focusing on has the following interesting feature. The regulator ignores the firm’s operating cost during the rate case when deciding on its auditing intensity. This then eliminates any incentive for the firm to exert effort during the rate case. The firm effectively uses its proposed RRB to signal its true RRB and in equilibrium, the regulator’s beliefs are correct, i.e.

$$\varrho\left(\tilde{R}(\theta,R), \tilde{C}(\theta,R)\right) = R$$

and the regulator successfully sorts types R based on the proposal $\tilde{R}$ (the regulator does not care about θ when deciding on auditing as long as R is already “known”). The regulator then chooses auditing intensity based on this correct belief. Note that although the regulator correctly infers R, it still needs to produce admissible evidence to support R, which is done via auditing. This is the same assumption made in Banks’ (1992) auditing model. I characterize the equilibrium in the following proposition.

17 For example, Banks’ (1992) auditing model uses the Universal Divinity equilibrium refinement of Banks and Sobel (1987) to reduce the set of equilibria to a singleton. Besanko and Spulber (1992) adopt the same equilibrium refinement in their model of investment of regulated firms.

Proposition 1 Suppose $\tilde{R}^U$ is sufficiently high.18 The following fully-separating equilibrium exists: The firm exerts zero effort during the rate case,

$$e_1 = 0.$$

After the rate case, the firm chooses the “first-best” effort, i.e. $e_2$ solves

$$\psi'(e_2) = \exp(\theta - e_2)C.$$

In this equilibrium, the firm proposes an RRB $\tilde{R}$ such that

$$\tilde{R} = R + \frac{1-\alpha}{\alpha_{\tilde{R}}}.$$

The regulator ignores the operating cost signal $\tilde{C}$ in its auditing strategy $\alpha(\tilde{R}, \tilde{C})$, and in particular,

$$\alpha_{\tilde{C}} = 0.$$

Finally, the regulator’s auditing strategy is increasing in the proposed RRB and is the solution to

$$\int \frac{A'(\alpha)}{1-\alpha}\, d\alpha = 2\tilde{R}.$$

Proof. See appendix.
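The auditing schedule in Proposition 1 can be illustrated with an assumed quadratic auditing cost A(α) = κα²/2. Then A′(α)/(1 − α) = κα/(1 − α), whose antiderivative is κ(−α − ln(1 − α)). The sketch below additionally assumes the constant of integration is zero, so that a zero proposal receives zero auditing; this normalization, the cost function and the function names are illustrative assumptions rather than the paper's specification.

```python
import math


def audit_intensity(proposed_rrb, kappa, n_iter=200):
    """alpha solving kappa * (-alpha - log(1 - alpha)) = 2 * proposed_rrb by bisection.

    Assumes A(alpha) = kappa * alpha**2 / 2 and a zero constant of integration.
    """
    if proposed_rrb < 0 or kappa <= 0:
        raise ValueError("proposed_rrb must be nonnegative and kappa positive")
    target = 2.0 * proposed_rrb
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if kappa * (-mid - math.log1p(-mid)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def implied_true_rrb(proposed_rrb, kappa):
    """True RRB implied by a proposal when beliefs are correct.

    Condition (4) with belief equal to the true RRB gives
    A'(alpha) = 2 * (proposed - true), so true = proposed - kappa * alpha / 2.
    """
    alpha = audit_intensity(proposed_rrb, kappa)
    return proposed_rrb - kappa * alpha / 2.0
```

Under these assumptions the schedule is increasing in the proposal, and its slope equals 2(1 − α)/(κα), which is exactly what the markup condition and condition (4) jointly require. Because of the arbitrary normalization, small proposals can imply a negative true RRB, so the numbers are only meaningful as a check of the algebra.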

The main equilibrium predictions that I will check in the data are the following. First, operating costs will tend to be higher during the rate case compared to the year after. I also check whether heat rates are higher during rate cases since short-run variation in heat rates is more likely due to changes in effort than changes in the firm’s capital or the skill level of manpower. Second, the regulator’s auditing strategy will be flat with respect to operating costs during the rate case. Third, the regulator’s auditing strategy is increasing in the proposed RRB. After providing empirical support for the plausibility of this equilibrium, I use the equilibrium’s prediction about optimal effort during and after the rate case to identify the distribution of types and the disutility function ψ(·).

18 What I need is for $\tilde{R}^U$ to be larger than a threshold $\bar{R}$, where $\bar{R}$ solves

$$\left(1 - \alpha(\bar{R}, \tilde{C})\right) = \alpha_{\tilde{R}}(\bar{R}, \tilde{C})\,(\bar{R} - R^U).$$

That is, the optimal proposal of the largest type $R^U$ is still an interior proposal. Note that in this equilibrium α is not a function of $\tilde{C}$ so $\bar{R}$ does not depend on $\tilde{C}$ as well.

An interesting question is whether there are equilibria that provide incentives to the firm to exert

effort during the rate case. The following proposition characterizes a particular one. The complete

characterization and the proof are in the appendix.

Proposition 2 The following equilibrium exists: The firm’s rate case effort $e_1$ is positive and is the solution to

$$\left[2\alpha_{\tilde{C}} \cdot (\tilde{R}^U - R) - 1\right]\exp(\theta - e_1)C = \psi'(e_1).$$

After the rate case, the firm chooses the “first-best” effort, i.e. $e_2$ solves

$$\psi'(e_2) = \exp(\theta - e_2)C.$$

In this equilibrium, every type (θ,R) proposes the largest possible RRB, i.e. $\tilde{R} = \tilde{R}^U$. However the regulator ignores the proposed RRB in its auditing strategy, and in particular,

$$\alpha_{\tilde{R}} = 0.$$

The regulator’s strategy is strictly increasing in the reported operating cost $\tilde{C}$, i.e.

$$\alpha_{\tilde{C}} > 0.$$

Thus the regulator uses the firm’s operating cost $\tilde{C}$ as an informative signal about what R is, albeit imperfectly since different groups of types pool at different values of $\tilde{C}$.

4 Data and evidence

I construct a list of generating units affected by Phase I of the Acid Rain Program using compliance

data from EPA’s Air Markets Program. The compliance data includes all generating units that were part

of Phase I. For each unit in this list, I get unit-level data on net electricity generation and nameplate

capacity from the Energy Information Administration’s (EIA) Form 767 for the period 1988-1999. I

aggregate the data to the plant-level and get data on emissions, fuel consumption (coal, oil and natural

gas), and on whether the plant has a flue gas desulfurization (FGD) unit installed. I then aggregate

these measures at the utility level so that I can match these to the regulatory and rate case data. Utility-level fuel prices are constructed from the Federal Energy Regulatory Commission’s (FERC) Form 423

by averaging delivered prices across a utility’s plants for each fuel type. Finally I match these utilities

to the regulatory database of SNL Financial and extract data on fuel expense and non-fuel operations

and maintenance expense related to electricity generation, excluding expenses from nuclear plants. I also get average monthly salaries of full-time employees involved in electricity generation. These comprise the operations and cost data.

Table 1: Summary statistics of operations and costs data

Variable          Units          Mean         Std dev      Min       Max
O&M var cost      $M             330          257          23        1198
Net generation    MWh            2.4 × 10^7   2.4 × 10^7   558739    9.9 × 10^7
Emission rate     lbs/MMBtu      1.77         1.04         0.23      7.22
Nameplate         MW             4999         480          232       23227
FGD dummy         dummy          0.33         .            0         1
Salary            $000/emp/mo    16.1         8.3          4.3       52.3
Price coal        $/ton          32.76        10.06        12.48     53.80
Price oil         $/barrel       23.29        59.17        10.06     51.39
Price gas         $/MMBtu        2.96         1.03         1.34      15.48

The rate case data comes from Regulatory Research Associates (RRA), a research and consulting

company owned by SNL Financial. These contain SNL utility codes that I use to match the rate case

data with operations and cost data. I get data on the year the rate case was proposed, the year it was

authorized, the test year, proposed and authorized rate base, and the proposed and authorized rate of

return (ROR). From these data I can construct the proposed and authorized return on the rate base

(RRB).
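The construction of the RRB and of the disallowance measure can be sketched in a few lines. The function names and the example figures used to exercise them are hypothetical; only the definitions (RRB as the rate base times the rate of return, disallowance as proposed minus authorized, expressed as a share of the proposal) come from the text.

```python
def return_on_rate_base(rate_base, rate_of_return_pct):
    """RRB in the same currency units as the rate base; the rate of return is in percent."""
    return rate_base * rate_of_return_pct / 100.0


def percent_disallowed(proposed_rrb, authorized_rrb):
    """Disallowance (proposed minus authorized) as a percentage of the proposed RRB."""
    if proposed_rrb <= 0:
        raise ValueError("proposed_rrb must be positive")
    return 100.0 * (proposed_rrb - authorized_rrb) / proposed_rrb
```

For a made-up case with a proposed rate base of 1000 at 10.0% and an authorized rate base of 950 at 9.5%, the two functions give the proposed RRB, the authorized RRB and the share disallowed in turn.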

I am able to identify 84 utilities that own at least one Phase I plant by matching the EPA data with

the EIA data. Of these I can match 69 utility codes to SNL’s regulatory data. My primary variables are

net generation from coal, oil and natural gas plants; emission rate; a dummy for whether the utility has

at least one plant with an FGD; total nameplate capacity; and average prices for coal, oil and gas. The

number of utilities with nonmissing data and with at least two rate cases during 1988-1999 goes down

to 38. Table 1 contains summary statistics for these utilities. The number of firm-year observations is

363. The O&M variable cost measure is the sum of fuel expense and non-fuel O&M expense related

to electricity generation. Fuel expense accounts for about 75% of this cost on average. Moreover on average, coal accounts for about 92% of total fuel consumption (in MMBtu), while oil and natural gas account for about 5% and 3%, respectively.

Table 2 contains rate case summary statistics for these utilities. On average, a rate case lasts just over a year and can extend for 3 years. The number of years from the time a rate case is authorized to the time a new rate case is proposed is 3 on average but can be as long as 6 years. A utility in my sample experienced between 2 and 3 rate cases during 1988-1999. The average RRB disallowance (proposed minus authorized) when measured as a percentage of proposed RRB is 7%. The percent disallowance in RRB ranges from 0%, i.e. no disallowance, to as high as 31%.

Table 2: Summary statistics of rate case data

Variable                Units            Mean    Std dev   Min    Max
Rate case duration      Years            1.2     0.7       1      3
Time between case       Years            3.0     2.1       1      6
Proposed RRB            $M               312     398       7      1868
Authorized RRB          $M               287     365       5      1644
Percent disallow RRB    % of Prop RRB    7       4         0      31
Proposed rate base      $M               2536    3251      73     15963
Authorized rate base    $M               2376    3054      66     14485
Proposed ROR            %                10.2    0.9       7.9    12.2
Authorized ROR          %                9.8     1.0       7.4    11.8

4.1 Preliminary analysis

Examining O&M variable costs in and outside of the rate case, I find that O&M variable costs are about

5% higher during a rate case. The basic regression involves regressing the log of variable O&M costs on

log of output19 (electricity and emissions), input prices (labor, coal, oil and gas) and capital (nameplate

and indicator if the firm has a scrubber), together with indicator variables for whether the observation

comes from years when the rate case is ongoing. I construct three indicator variables. The first dummy is

equal to one if the observation falls during the rate case, i.e. from proposed to authorized year, inclusive.

The second dummy is equal to one if the observation falls on the year immediately after the authorization

year. Finally the third dummy is equal to one if neither of the first two dummies is one. In the regression, the omitted dummy category is the second dummy, so dummy coefficients measure the % difference relative to the year after the rate case concludes.
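The three indicator variables can be constructed from the proposed and authorized years of each rate case. The sketch below is one possible coding: the function name and the tie-breaking rule (a year that follows one rate case but falls inside another is counted as a rate case year) are assumptions, since the text does not say how overlaps are handled.

```python
def classify_year(year, rate_cases):
    """Classify a firm-year as 'rate_case', 'year_after' or 'neither'.

    rate_cases is a list of (proposed_year, authorized_year) pairs for one
    utility. A year is a rate case year if it lies between the proposed and
    authorized years, inclusive. Overlaps are resolved in favor of
    'rate_case', which is an assumption of this sketch.
    """
    if any(proposed <= year <= authorized for proposed, authorized in rate_cases):
        return "rate_case"
    if any(year == authorized + 1 for _, authorized in rate_cases):
        return "year_after"
    return "neither"
```

In the regression described above, 'year_after' would be the omitted category, so the coefficients on the other two categories are read relative to the year immediately following authorization.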

Table 3 contains results from different specifications of the basic regression. I suppress the estimates for the other explanatory variables in this table. Focusing on the estimates for the rate case dummy, we see that average O&M variable costs are about 5% higher during a rate case compared to the year after. Moving to the “neither” dummy coefficient estimate, we find no statistically significant differences in O&M variable costs among non-rate case years. These results hold even when controlling for output, input prices, capital, and year effects, and also when looking at within-firm and within-firm-rate case variation.

Table 3: Regression results: O&M variable cost and rate case dummies.

log O&M var cost        (1)       (2)       (3)       (4)        (5)       (6)
Rate case               0.052*    0.053*    0.057**   0.047***   0.046**   0.044**
                        (0.028)   (0.027)   (0.022)   (0.017)    (0.019)   (0.021)
Neither Rate case       0.028     0.016     0.006     −0.011     −0.012    −0.015
nor Year after          (0.046)   (0.044)   (0.020)   (0.025)    (0.026)   (0.028)
Year Trend                        Yes       Yes       Yes        Yes       Yes
Firm                    No        No        Yes       No         No        No
Firm-Rate Case          No        No        No        Yes        Yes       Yes
IV for electricity      No        No        No        No         Yes       Yes
IV for emission rate    No        No        No        No         No        Yes
Num. Obs.               363       363       363       363        314       314

Notes: Standard errors are clustered at either the firm or the firm-rate case level. Regression via OLS except when indicated. Additional regressors are a dummy for FGD; the logs of electricity output, emission rate, input prices (labor, coal, oil and gas), and nameplate rating. I use log of state electricity demand as an IV for electricity output and regional prices for low (<1.2 lbs/MMBtu) and high sulfur coal for emission rates. Significance level: * 10%, ** 5%, *** 1%.

19 I include specifications where I use state-level electricity demand as an instrument for electricity output and regional prices for low and high sulfur coal as instruments for emission rates. Low sulfur coal is defined as coal with sulfur content below 1.2 lbs/MMBtu. First stage F-statistics are 164 and 27 for electricity output and emission rates respectively.

This pattern of O&M variable cost implied by the regressions can be rationalized by the equilibrium

characterized in Proposition 1. In that equilibrium, the firm has no incentive to exert effort during the

rate case. However, once the revenue requirement is fixed, the firm becomes the residual claimant of cost-savings induced by effort. This provides incentives to exert effort once the rate case has concluded. This pattern is only suggestive and can be rationalized by other stories. For example, since rate cases are initiated by the firm, the firm might strategically initiate a rate case when it knows costs will be high in order to lock in the rates. In this story, exogenous differences in costs that the firm is aware of can explain the pattern. To further investigate whether the pattern is induced by the firm’s effort, I look at whether the same pattern arises for the heat rate, which is defined as the amount of fuel burned per unit of electricity produced.

Short-run variations in heat rates are more likely due to effort (either plant manager or operator) than

differences in equipment or skills. A higher heat rate means less efficient production since the firm burns more fuel to produce the same amount of electricity. I regress the log of heat rate on the log of electricity generated, log of capital, indicator for FGD, and the rate case dummies. State electricity demand is used as an instrument for generated electricity to account for potential simultaneity bias (Fabrizio et al, 2007).20 Table 4 contains results of this regression. Consistent with my earlier finding, heat rates are about 4 to 6% higher during rate cases and this is statistically significant at the 5% level. Interestingly, there are no significant differences in heat rates when I compare the year after the rate case and the succeeding non-rate case years. Thus the reduction in effort during the rate case only has an impact during the case and disappears after.

Table 4: Regression results: Heat rates and rate case dummies

log heat rate           (1)        (2)        (3)        (4)
Rate case               0.066∗∗    0.050∗∗    0.042∗∗    0.038∗∗
                        (0.028)    (0.020)    (0.020)    (0.018)
Neither Rate case       0.005      −0.018     −0.013     −0.006
nor Year after          (0.019)    (0.028)    (0.033)    (0.028)
Year                    Yes        Yes        Yes        Yes
Firm                    Yes        No         Yes        No
Firm-Rate Case          No         Yes        No         Yes
IV for electricity      No         No         Yes        Yes
Num. Obs.               363        363        314        314

Notes: Standard errors are either clustered at firm or firm-rate case level. Regression via OLS except when indicated. Additional regressors include a dummy for FGD and the logs of electricity output and nameplate. I use log of state electricity demand as an IV for electricity output. Significance level: * 10%, ** 5%, *** 1%.

20 First stage F-statistic is 168.

To further investigate the plausibility of the equilibrium characterized in Proposition 1, I check whether the characteristics of the regulator’s auditing strategy under this equilibrium are consistent with the data. One complication is that the auditing strategy is not observed. To work around this, I establish a link between observable disallowances in the return on the rate base (RRB) and the unobserved auditing strategy. I define disallowances as

∆ ≡ (proposed RRB) − (authorized RRB).

This disallowance is the difference between the proposed and authorized RRBs, both of which are observed. The key step is to interpret observed authorized RRB as if this were generated by equation (3). Substituting the expression for the authorized RRB gives

$$\Delta(R, C) = \alpha(R, C)\,[R - R(R, C)]$$

where (R, C) are the proposed RRB and the O&M variable cost chosen by the firm, and R(R, C) is the R-type of the firm that picks (R, C) in the data. This equation captures the link between ∆ that is observed by the econometrician, and the regulator’s auditing strategy α that is unobserved. The link does not rely on a specific equilibrium selection in the data. However, to proceed with the analysis I make the following assumption:

Assumption 1 A single equilibrium is played in the data. Moreover, in any equilibrium that involves

pooling, any set of R-types that pool on the same signal is an interval.

While I do not assume that a specific equilibrium is being played, I do assume that the same equilibrium is played throughout the sample. I also restrict the set of possible equilibria that can be played. The equilibria characterized in the previous section fall in this class, although I have not proven that a candidate equilibrium outside this class cannot be an equilibrium. The following proposition shows that I can infer the behavior of α from the behavior of ∆.

Proposition 3 Define

$$\Delta(R, C) = \alpha(R, C)\,[R - R(R, C)].$$

1. Under assumption 1, we have

$$\Delta(R, C) = \Delta(R, C') \;\Leftrightarrow\; \alpha(R, C) = \alpha(R, C')$$

for any C and C′ in the data, where C > C′.

2. Let RU be the largest possible proposed RRB. For any R and R′ in the data such that RU > R > R′, we have

$$\Delta(R, C) > \Delta(R', C) \;\Leftrightarrow\; \alpha(R, C) > \alpha(R', C).$$

Proof. See appendix.

The equilibrium characterized by Proposition 1 involves α_C = 0 and α_R > 0. Proposition 3 allows us to check these predictions using data on disallowances ∆, O&M variable cost C and proposed RRB R.


[Figure 2 plot: local polynomial smooth of Delta (vertical axis) against C_hat (horizontal axis) with 95% CI; kernel = epanechnikov, degree = 2, bandwidth = 27.78, pwidth = 41.67.]

Figure 2: Partial regression of ∆ on C

Figures 2 and 3 contain partial (local polynomial) regression plots of ∆ on C and R respectively. These

partial regression plots are constructed as follows. Consider the partial regression plot with respect to

C. I first regress ∆ on R, output, capacity, and firm-, year- and state-fixed effects. Then I get the

residual and normalize the location by adding back the mean of ∆. Next, I regress C on the same set

of explanatory variables, get the residual, and normalize the location. Finally I do a local polynomial

regression of the normalized ∆ residual on the normalized C residual. I do the same for R, with C taking the place of R in the set of explanatory variables. These partial regression plots provide support for α_C = 0 and α_R > 0 and hence the plausibility of the equilibrium characterized by Proposition 1. The next section discusses how Proposition 1 is used for identification.
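The residualize-and-recenter steps behind the partial regression plots can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, fixed effects would have to be supplied as dummy columns in `controls`, and the final local polynomial smoothing step is not implemented here.

```python
import numpy as np

def residualize(y, X):
    """Residual from an OLS regression of y on X (plus intercept),
    shifted back by the mean of y to normalize the location."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef + y.mean()

def partial_regression_pairs(delta, c, controls):
    """Return (normalized C residual, normalized Delta residual).

    Both variables are residualized on the same set of controls; the pair
    can then be passed to any local polynomial smoother to draw the plot.
    """
    return residualize(c, controls), residualize(delta, controls)
```

For the plot with respect to R, the same functions would be called with R in place of `c` and with C included among the controls.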

5 Empirical model

I estimate a multiproduct cost function to provide a measure of (marginal) abatement costs for electric

utilities.21 The cost of reducing emissions is reflected by the increase in the cost of producing electricity

due to changes in production methods, i.e. fuel-switching. I restrict attention to costs, output, emissions

and input choices related to coal, oil and gas plants. The analysis is done at the utility-level since rate

regulation and rate cases involve the firm as a whole. Moreover, Ellerman et al (2000, p. 301) remark that compliance decision-making is often made at the utility-level even if pollution regulation per se is at the unit-level.

21 Carlson et al (2000) estimate a similar multiproduct cost function to get abatement costs for fuel-switching plants.

[Figure 3 plot: local polynomial smooth of Delta (vertical axis) against R_hat (horizontal axis) with 95% CI; kernel = epanechnikov, degree = 2, bandwidth = 20.24, pwidth = 30.36.]

Figure 3: Partial regression of ∆ on R

I assume a stochastic specification for realized O&M variable costs of producing electricity and emissions. For firm i at time t, realized O&M variable cost is given by

$$C_{it} = \exp(\omega_{it})\, C(q_{it}, s_{it}, p_{l,it}, p_{f,it}, N_{it}, d_{FGD,it}; \beta)\, \exp(\varepsilon_{it}) \qquad (5)$$

where

$$\omega_{it} = \theta_{it} - e_{it}, \qquad p_{f,it} = (p_{c,it}, p_{o,it}, p_{g,it}),$$

$$C(q, s, p_l, p_f, N, d_{FGD}; \beta) = N^{\beta_N} \exp(\beta_{FGD}\, d_{FGD})\, q^{\beta_q}\, s^{\beta_s + \beta_{sd} d_{FGD}}\, p_l^{\beta_l}\, p_c^{\beta_c}\, p_o^{\beta_o}\, p_g^{\beta_g}.$$

The term exp(ω) = exp(θ − e) is the unobserved cost efficiency of the utility where θ is the firm’s intrinsic cost type and e is unobserved managerial effort. The utility knows θ and chooses e. The function C(q, s, pl, pf, N, dFGD; β) is the baseline cost function of the utility where q is net generated electricity, s is the SO2 emission rate, pl is the average salary for full-time employees related to electricity generation, pf is a vector composed of fuel prices22 for coal, oil and gas, averaged across the utility’s plants, N is the sum of nameplate ratings of the utility’s plants, and dFGD is a dummy equal to one if the utility has at least one plant with a flue-gas desulfurization (i.e. scrubber) unit installed. This baseline cost captures differences in O&M costs that can be explained by differences in input prices, outputs and capital. The vector β contains the parameters of the baseline cost function that need to be estimated. Finally ε is a mean zero stochastic error term that summarizes factors that affect realized costs. I assume ε is unanticipated by the firm when making its input choices and uncorrelated with the regressors.

22 Fuel prices are either spot or contracted prices. Managerial effort can affect the actual price the firm faces and hence introduces an endogeneity problem.

I assume the firm’s intrinsic type θit is a draw from the distribution Fθ. Ideally Fθ would be conditioned on variables such as firm’s capacity or portfolio of plants, but as a first step I assume Fθ is a function of the rate case year. Next, the reason why the firm’s unobserved type is indexed by t is that I allow θ to change across different rate cases. However I assume that θ remains constant between rate cases. Let tτ be the time index (year) for a specific firm’s rate case τ. For example, if firm i has three rate cases during the sample period, then τ ∈ {1, 2, 3}, which occur in years t1, t2 and t3 respectively. Formally, for all i, t and τ,

$$\theta_{it} = \begin{cases} \theta_{i,t_\tau} & \text{if } t \in [t_\tau, t_{\tau+1}) \\ \theta_{i,t_{\tau+1}} & \text{if } t = t_{\tau+1}. \end{cases}$$

I assume for each firm i, θit1 is a draw from Fθ, θit2 is a draw from Fθ|θit1, θit3 is a draw from Fθ|θit2, etc. In the next subsection, I discuss identification of the distribution of types Fθ, the disutility function ψ(·) and the baseline cost function parameters β.

5.1 Identification

There are two interrelated challenges for identification.23 The first challenge is an endogeneity problem in identifying the cost parameters β. The second challenge involves extracting the distribution of the unobserved type θ from the variation in realized costs that is unobserved by the econometrician, i.e. exp(θit − eit) exp(εit). The first challenge arises precisely because eit is an endogenous variable chosen by the firm. Moreover, cost efficiency (ωit = θit − eit) affects electricity output and potentially input prices. The firm’s baseline cost is the main variable that determines what level of effort to exert since this captures the cost reductions from effort. The firm’s cost efficiency affects electricity output since regulated electricity prices are based on reported expenses. Finally, plant managers in charge of fuel procurement may affect the actual price the firm pays for its fuel. If ωit were observed by the econometrician, then we could directly identify β. However ωit is not observed and therefore we need to find a way to control for it. Furthermore, I need to extract the distribution of θ from the unobserved variation exp(θit − eit) exp(εit).

23 I focus on identification of the distribution of θ and leave identification and estimation of the distribution of R in the appendix. Although the firm’s type is two-dimensional, the screening problem that I solve to derive the optimal mechanism is one-dimensional. The reason is that the regulator does not have any instrument to screen R and so every R-type reports the highest possible R. I plan to explore a richer model where R is an explicit function of installed capital and some unobserved type (e.g. rate of return). Thus capital can be a screening variable. The screening problem then becomes a nonseparable two-dimensional problem. Non-separability arises because capital enters operating costs.

My identification strategy involves two parts. First, to identify the parameters of the empirical

model, I use Proposition 1 to pin down ωit for different time periods. This allows me to take different

transformations of the data to eliminate ωit from the estimating equations. Second, to identify the

distribution of intrinsic types θ, I recast the problem under the framework of measurement error with

repeated measurements (e.g. Li and Vuong (1998)) and use the deconvolution result of Kotlarski (1967).

I briefly mention an alternative identification strategy at the end.

5.1.1 Identification of parameters

In the equilibrium characterized by Proposition 1, the firm does not exert effort during the rate case, hence

$$\omega_{i,t_\tau} = \theta_{i,t_\tau}.$$

After the rate case, i.e. at time t = tτ + 1, the firm exerts effort such that

$$\psi'(e_{i,t_\tau+1}) = \exp(\omega_{i,t_\tau+1})\, C(q_{i,t_\tau+1}, s_{i,t_\tau+1}, p_{l,i,t_\tau+1}, p_{f,i,t_\tau+1}, N_{i,t_\tau+1}, d_{FGD,i,t_\tau+1}; \beta).$$

To determine what ωit is after the rate case, I impose the following functional form24 for ψ(·):

Assumption 2 The disutility of effort is given by

$$\psi(e_{it}, \upsilon_{it}) = \frac{1}{\gamma} \exp(\gamma e_{it} + \upsilon_{it})$$

where γ is a parameter and υit’s are mean zero shocks that are uncorrelated with (qit, sit, plit, pfit, Nit, dFGDit) and iid across i and t.

Remark 1 I do not include a constant in the specification for the baseline cost function and also for

ψ (·). The reason is that these are not identified when I include a constant ρ0 later in the evolution of

θit across rate cases. That is, the means of ε and υ are subsumed in the mean of the error term in the

evolution of θ.

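24 Gagnepain and Ivaldi (2002) use a similar exponential form for the disutility function.

Assumption 2 allows me to express ωit as a linear function of θit, the log of the baseline cost function Cit(β) = C(qit, sit, plit, pfit, Nit, dFGDit; β) and the shock υit:

$$\omega_{it} = \frac{1}{1+\gamma}\left(\gamma\theta_{it} - \ln C_{it}(\beta) + \upsilon_{it}\right) \qquad (6)$$

for t = tτ + 1. Proposition 1 and the assumption that θit is constant within rate cases give expressions for realized costs during different “events”:

$$\ln C_{i,t_\tau} = \theta_{i,t_\tau} + \ln C_{i,t_\tau}(\beta) + \varepsilon_{i,t_\tau} \qquad (7)$$

$$\ln C_{i,t_\tau+1} = \frac{\gamma}{1+\gamma}\left(\theta_{i,t_\tau} + \ln C_{i,t_\tau+1}(\beta)\right) + \frac{1}{1+\gamma}\upsilon_{i,t_\tau+1} + \varepsilon_{i,t_\tau+1} \qquad (8)$$

for all rate cases τ. On the left-hand side, C without the argument β denotes realized cost, while C(β) denotes the baseline cost function. The first line is the realized cost during rate cases, while the second line is for the year after the case.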
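For reference, the algebra linking the first-order condition for effort to the post-rate-case cost expression can be written out as below. This is a sketch under Assumption 2, with time subscripts suppressed and C(β) standing for the baseline cost at t = tτ + 1; it is meant only as a check on how υ and γ enter equation (8).

```latex
\begin{aligned}
\psi'(e) = \exp(\gamma e + \upsilon) &= \exp(\theta - e)\, C(\beta) \\
\gamma e + \upsilon &= \theta - e + \ln C(\beta) \\
e &= \frac{\theta + \ln C(\beta) - \upsilon}{1 + \gamma} \\
\omega = \theta - e &= \frac{\gamma\theta - \ln C(\beta) + \upsilon}{1 + \gamma} \\
\omega + \ln C(\beta) + \varepsilon &= \frac{\gamma}{1 + \gamma}\bigl(\theta + \ln C(\beta)\bigr) + \frac{1}{1 + \gamma}\,\upsilon + \varepsilon
\end{aligned}
```

The last line is the log of realized cost in the year after the rate case, with υ entering with weight 1/(1 + γ) as in equation (8).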

Although θit is constant within rate cases for each firm i, I allow θit to vary across rate cases. I assume that θit follows a linear process across two rate cases:

Assumption 3 For each i and τ, intrinsic types across two rate cases τ and τ − 1 evolve according to

$$\theta_{i,t_\tau} = \rho_0 + \rho_1 \theta_{i,t_{\tau-1}} + \xi_{i,t_\tau}$$

where (ρ0, ρ1) are parameters and ξitτ’s are iid across i and tτ.

Assumption 3 provides a way to difference out cost efficiency ωit. Using assumption 3, I can quasi-difference equation (7) for two consecutive rate cases. This yields

$$\ln C_{i,t_\tau} - \rho_1 \ln C_{i,t_{\tau-1}} = \rho_0 + \ln C_{i,t_\tau}(\beta) - \rho_1 \ln C_{i,t_{\tau-1}}(\beta) + \eta_{1,i,t_\tau}$$

where

$$\eta_{1,i,t_\tau} = \xi_{i,t_\tau} + \varepsilon_{i,t_\tau} - \rho_1 \varepsilon_{i,t_{\tau-1}} \equiv \eta_{1,i,t_\tau}(\beta, \rho_0, \rho_1).$$

I can then construct moment conditions

$$E\left[\eta_{1,i,t_\tau}(\beta, \rho_0, \rho_1) \cdot z_{i,t_{\tau-1}}\right] = 0 \qquad (9)$$

where

$$z_{i,t_{\tau-1}} = \left(q_{i,t_{\tau-1}}, s_{i,t_{\tau-1}}, p_{l,i,t_{\tau-1}}, p_{f,i,t_{\tau-1}}, N_{i,t_{\tau-1}}, d_{FGD,i,t_{\tau-1}}\right)'.$$

These moment conditions hold because ξitτ and εitτ are iid across t and εitτ−1 is an unanticipated shock during tτ−1.

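Another way to difference out ωit is by looking at observations during and after the rate case. Specifically, consider the following quasi-difference across tτ + 1 and tτ:

$$\ln C_{i,t_\tau+1} - \frac{\gamma}{1+\gamma} \ln C_{i,t_\tau} = \frac{\gamma}{1+\gamma}\left(\ln C_{i,t_\tau+1}(\beta) - \ln C_{i,t_\tau}(\beta)\right) + \eta_{2,i,t_\tau}$$

where

$$\eta_{2,i,t_\tau} = \frac{1}{1+\gamma}\upsilon_{i,t_\tau+1} + \varepsilon_{i,t_\tau+1} - \frac{\gamma}{1+\gamma}\varepsilon_{i,t_\tau} \equiv \eta_{2,i,t_\tau}(\beta, \gamma).$$

I can rewrite this as

$$\ln C_{i,t_\tau+1} = \frac{\gamma}{1+\gamma}\left(\ln C_{i,t_\tau+1}(\beta) - \ln C_{i,t_\tau}(\beta)\right) + \frac{\gamma}{1+\gamma} \ln C_{i,t_\tau} + \eta_{2,i,t_\tau}$$

from which I can construct the moment condition

$$E\left[\eta_{2,i,t_\tau}(\beta, \gamma) \cdot C_{i,t_{\tau-1}}\right] = 0. \qquad (10)$$

Since Citτ is correlated with η2itτ through εitτ, I use Citτ−1 as an instrument for Citτ. Realized cost during the previous rate case is uncorrelated with the shock in the current rate case. Moreover, Citτ−1 will be correlated with Citτ as long as ρ1 ≠ 0.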

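Finally consider the following quasi-difference across tτ and tτ−1 + 1:

$$\frac{\gamma}{1+\gamma} \ln C_{i,t_\tau} - \rho_1 \ln C_{i,t_{\tau-1}+1} = \frac{\gamma}{1+\gamma}\rho_0 + \frac{\gamma}{1+\gamma}\left(\ln C_{i,t_\tau}(\beta) - \rho_1 \ln C_{i,t_{\tau-1}+1}(\beta)\right) + \eta_{3,i,t_\tau}$$

where

$$\eta_{3,i,t_\tau} = \frac{\gamma}{1+\gamma}\left(\xi_{i,t_\tau} + \varepsilon_{i,t_\tau}\right) - \rho_1\left(\frac{1}{1+\gamma}\upsilon_{i,t_{\tau-1}+1} + \varepsilon_{i,t_{\tau-1}+1}\right) \equiv \eta_{3,i,t_\tau}(\beta, \gamma, \rho_0, \rho_1).$$

I rewrite this as

$$\frac{\gamma}{1+\gamma} \ln C_{i,t_\tau} = \frac{\gamma}{1+\gamma}\rho_0 + \frac{\gamma}{1+\gamma}\left(\ln C_{i,t_\tau}(\beta) - \rho_1 \ln C_{i,t_{\tau-1}+1}(\beta)\right) + \rho_1 \ln C_{i,t_{\tau-1}+1} + \eta_{3,i,t_\tau}$$

and construct moment conditions

$$E\left[\eta_{3,i,t_\tau}(\beta, \gamma, \rho_0, \rho_1) \cdot \begin{pmatrix} 1 \\ C_{i,t_\tau+1} \end{pmatrix}\right] = 0. \qquad (11)$$

Notice that I have used Citτ+1 as an instrument for Citτ−1+1. Realized cost in t = tτ + 1 is uncorrelated with past shocks but is correlated with Citτ−1+1 through the evolution of θ.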

The parameters β, γ and (ρ0, ρ1) are identified as the solution to the moment conditions (9), (10) and

(11). Uniqueness of the solution can be seen by taking each equation one at a time. For example, given

(ρ0, ρ1), equation (9) is linear in β; given β, equation (10) is linear in γ/(1 + γ), which uniquely pins down γ; and given β and γ, equation (11) is linear in (ρ0, ρ1).

5.1.2 Identification of type distribution

Given the parameters and using assumption 3, I can rewrite realized cost during two consecutive rate cases as

$$\frac{\ln C_{i,t_\tau} - \ln C_{i,t_\tau}(\beta) - \rho_0}{\rho_1} = \theta_{i,t_{\tau-1}} + \frac{\xi_{i,t_\tau} + \varepsilon_{i,t_\tau}}{\rho_1}$$

$$\ln C_{i,t_{\tau-1}} - \ln C_{i,t_{\tau-1}}(\beta) = \theta_{i,t_{\tau-1}} + \varepsilon_{i,t_{\tau-1}}.$$

The problem of finding the distribution of θ can be recast in the framework of measurement error with repeated measurements. Let (ξitτ + εitτ)/ρ1 and εitτ−1 be the “measurement errors” while θitτ−1 is the latent variable. The two measurement errors and the latent variable are all mutually independent and this follows from the assumptions on ξitτ and the unanticipated cost shocks. Let φθ, φU1 and φU2 be the characteristic functions of θitτ−1, (ξitτ + εitτ)/ρ1 and εitτ−1 respectively. Assuming φθ, φU1 and φU2 have no real zeros25, Kotlarski’s (1967, Lemma 1) identification result implies26

$$\phi_\theta(t) = \exp\left(\int_0^t \frac{\partial \phi_Y(0, t_2)/\partial t_1}{\phi_Y(0, t_2)}\, dt_2\right), \qquad \phi_{U_1}(t) = \frac{\phi_Y(t, 0)}{\phi_\theta(t)}, \qquad \phi_{U_2}(t) = \frac{\phi_Y(0, t)}{\phi_\theta(t)}$$

where φY(·, ·) is the characteristic function of

$$\left(\frac{\ln C_{i,t_\tau} - \ln C_{i,t_\tau}(\beta) - \rho_0}{\rho_1},\ \ln C_{i,t_{\tau-1}} - \ln C_{i,t_{\tau-1}}(\beta)\right).$$

Since characteristic functions uniquely determine the distribution of random variables, we can therefore identify the distribution of θitτ−1 from the distribution and characteristic function of this pair of measurements.
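To make the deconvolution formula concrete, the sketch below evaluates an empirical version of the expression for φθ on simulated repeated measurements. It is not the estimator used in this paper (which relies on moments, as described in the estimation subsection); the function name, the grid size, and the trapezoid rule are choices made for this example, and the first measurement's error is assumed to have mean zero.

```python
import numpy as np

def kotlarski_cf(y1, y2, t, n_grid=101):
    """Empirical Kotlarski estimate of the latent variable's characteristic
    function at t, from two noisy measurements y1 and y2 of the same latent
    variable. Assumes the error in y1 has mean zero."""
    grid = np.linspace(0.0, t, n_grid)
    phase = np.exp(1j * np.outer(grid, y2))      # exp(i * t2 * Y2), shape (n_grid, n)
    numer = (1j * y1 * phase).mean(axis=1)       # d/dt1 of phi_Y evaluated at (0, t2)
    denom = phase.mean(axis=1)                   # phi_Y(0, t2)
    ratio = numer / denom
    # Trapezoid rule for the integral of the ratio from 0 to t.
    integral = np.sum((ratio[1:] + ratio[:-1]) * np.diff(grid)) / 2.0
    return np.exp(integral)

# Simulated check: latent variable is standard normal, so phi_theta(t) = exp(-t^2 / 2).
rng = np.random.default_rng(0)
theta = rng.normal(size=20000)
y1 = theta + 0.5 * rng.normal(size=20000)
y2 = theta + 0.5 * rng.normal(size=20000)
estimate = kotlarski_cf(y1, y2, 1.0)
```

With enough observations, `estimate` should be close to exp(−1/2) in this simulated design; in small samples the ratio can be noisy wherever the denominator is near zero, which is the practical counterpart of the no-real-zeros requirement.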

5.1.3 An alternative identification strategy

The functional form assumptions on the evolution of types across rate cases and the effort disutility function can be relaxed if one is willing to (1) make a timing assumption on the input choice decision of the firm; (2) assume that the firm chooses its inputs to minimize cost conditional on cost efficiency ωit; and (3) assume that this cost minimization problem leads to a cost function where cost efficiency enters multiplicatively.

Assume that natural gas is the only flexible input (i.e. other inputs are decided before observing electricity demand). Using Shephard’s lemma and the assumption that cost efficiency and the unanticipated cost shock enter multiplicatively in realized cost, we can construct the following estimating equation based on the expenditure share of natural gas:

$$\log\left[\frac{p_{g,it}\, x_{g,it}}{C_{it}}\right] = \log\left[\frac{p_{g,it}\, \left(\partial C_{it}(\beta)/\partial p_g\right)}{C_{it}(\beta)}\right] - \varepsilon_{it}$$

where xgit is the level of natural gas consumption, Cit on the left-hand side is realized cost, and Cit(β) is the baseline cost function. One can then identify the distribution of unanticipated cost shocks from this equation. This strategy is similar to the strategy recently developed by Gandhi et al (2011), which uses the revenue share of the flexible input and profit maximization behavior to identify the unanticipated output shock. I use the dual problem of cost minimization instead.

25 Arellano and Bonhomme (2012) provide intuition for this technical requirement. When the characteristic function of the measurement errors is zero at certain points or intervals, the characteristic function of the observed measurements is not informative about the latent variable. Evdokimov and White (2012) replace this assumption with weaker conditions.

26 See Rao (1992) and Li and Vuong (1998).

Given this distribution, the only unobservable left in the stochastic specification of realized cost (equation (5)) is cost efficiency ωit. During rate cases, ωit = θit. I can relax assumption 3 and instead have the more general assumption that θit evolves as a Markov process across rate cases, following the production function literature (e.g. Olley and Pakes (1996); Levinsohn and Petrin (2003); Ackerberg et al (2006); and Gandhi et al (2011)). One can then construct moment conditions to identify and estimate parameters by exploiting the orthogonality of explanatory variables from past rate cases with the error from predicting today’s intrinsic type θitτ (using θ from past rate cases).

Once the distribution of unanticipated cost shocks and parameters are identified, the distribution of θ can also be identified. Implied effort levels can then be generated by looking at cost efficiencies after the rate case. Proposition 1 can then be used to nonparametrically identify the disutility function using the generated effort levels and cost data.

5.2 Estimation

I discuss how I estimate the parameters of the cost function, disutility function and the evolution of types

across rate cases, and how I estimate the distribution of θitτ−1 . The appendix provides details on how I

estimate the auditing strategy α and the distribution of true return on the rate base R from rate case

data.

To estimate the parameters, I use the sample analog of the moment conditions given by equations (9),

(10) and (11). Ideally I would have a single estimating sample to construct the three moment conditions.

However these moment conditions taken together require each firm in the sample to have at least two

rate cases that are initiated and completed in the period 1988-1998. This leaves me with just 22 firms.

The vector β contains 9 elements and therefore I need to estimate 12 parameters in total. To increase

the number of firms, I treat the same firm in two different rate cases as if they were different firms.27 For example, I can define two different “firms” as (firm i, rate case τ) and (firm i, rate case τ + 1). Although there is dependence across these two firms, this dependence is captured by intrinsic types θ across rate cases. Thus differencing out θ’s essentially gives independent samples (conditional on observables z). To further alleviate the problem of a small sample size, I construct different samples for each of the moment conditions. Moment condition (9), which identifies the cost function parameters conditional on (ρ0, ρ1), only depends on rate case years so I include observations with rate cases initiated on or before 1999 even if they are concluded after 1999 in estimating this moment condition. Sample selection bias may arise because the timing of the rate case is partly controlled by the firm. Suppose a rate case is initiated by the firm at time t_{τ+1} when at time t ∈ (t_τ + 1, t_{τ+1}) realized costs are above some threshold. Whether the firm initiates a rate case at t_{τ+1} or not depends on the time t ∈ (t_τ + 1, t_{τ+1}) values of observables, cost efficiencies, unanticipated cost shocks ε, and the unobserved threshold that is unrelated to the shock ε at t_{τ+1} (otherwise this threshold provides information about that shock, which would then not be fully unanticipated by the firm). Selection bias thus arises because cost efficiencies ωit are unobserved by the econometrician and these are correlated across time through the firm’s intrinsic type θit. My estimating equations difference out ωit’s. Therefore, using different samples does not introduce sample selection bias of this nature. Finally, I use a bootstrap procedure that samples over the “firms” to compute standard errors since the moment conditions are based on different samples.

27 I make this assumption when estimating the parameters, but not when estimating the type distribution.
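A bootstrap that resamples whole “firms” could look like the sketch below. It is a generic illustration rather than the exact procedure used here: the function name, the list-of-blocks data layout, and the number of replications are assumptions, and `estimator` stands for whatever routine maps a resampled list of firm blocks to parameter estimates.

```python
import numpy as np

def firm_bootstrap_se(data_by_firm, estimator, n_boot=200, seed=0):
    """Bootstrap standard errors, resampling firm blocks with replacement.

    data_by_firm: list with one entry (e.g. an array of observations) per "firm".
    estimator: function taking such a list and returning a scalar or 1-D array.
    """
    rng = np.random.default_rng(seed)
    n_firms = len(data_by_firm)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_firms, size=n_firms)   # firms drawn with replacement
        draws.append(estimator([data_by_firm[i] for i in idx]))
    return np.std(np.asarray(draws), axis=0, ddof=1)
```

Because each draw keeps all observations of a sampled firm together, any within-firm dependence is preserved in the resampled data.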

To estimate the distribution of θitτ−1, I use the algorithm described in Beran and Hall (1992) which adapts the discrete approximation of Hausdorff (1923).28 The idea is to approximate the distribution of θitτ−1 by a discrete distribution that is constructed from estimated moments of θitτ−1. The algorithm is implemented as follows:

1. I estimate the first m = 15 moments of θitτ−1 using data on

$$\left(\frac{\ln C_{i,t_\tau} - \ln C_{i,t_\tau}(\beta) - \rho_0}{\rho_1},\ \ln C_{i,t_{\tau-1}} - \ln C_{i,t_{\tau-1}}(\beta)\right).$$

I only use data for the first two rate cases for each firm.

2. Define the k-th moment of θitτ−1 as µk. Following Beran and Hall (1992), I assume the distribution of θitτ−1, i.e. Fθ, is supported on the compact interval [−c, c] where

$$c = 5\sqrt{\mu_2}.$$

Define the transformed moment

$$\tilde{\mu}_k = \sum_{j=0}^{k} \binom{k}{j} (2c)^{-j}\, 2^{-(k-j)}\, \mu_j$$

for k = 0, 1, 2, ..., m where µ0 = 1.

3. Construct the discrete distribution over {0, 1/m, 2/m, ..., 1} with

$$\Pr\left(\frac{j}{m}\right) = \binom{m}{j} \Delta^{m-j} \tilde{\mu}_j$$

for j = 0, 1, 2, ..., m, where ∆^r is the r-th order difference operator defined as

$$\Delta^{r} \tilde{\mu}_k = \sum_{i=0}^{r} \binom{r}{i} (-1)^i\, \tilde{\mu}_{k+i}.$$

Hausdorff (1923) shows that this discrete distribution converges to Fθ (Shohat and Tamarkin, 1943, p. 93-94).

I construct an estimate of the discrete distribution by using the estimated moments of θitτ−1 in place of µk. To use this distribution in the counterfactual welfare simulations, I draw a random sample of size 50 from the cumulative distribution function (cdf) of the discrete distribution and use this sample. I fit a 6th order polynomial to the cdf29, then invert it to get the random draws.

28 An alternative procedure is to estimate the characteristic function of the pair of measurements in step 1, derive the characteristic function of θitτ, and then use an inverse Fourier transform to get the density of θitτ (see for example Li and Vuong (1998) and Krasnokutskaya (2011)). Li and Vuong (1998) note that the procedure of Beran and Hall (1992) is a special case of their estimation procedure since all moments of the distribution are used to estimate the distribution. Beran and Hall (1992) instead only use a finite number of moments and apply the discrete approximation of Hausdorff (1923). To the extent that the distribution of θitτ can be captured by a finite number of moments, the Beran and Hall (1992) procedure requires less data since this introduces less bias from (implicitly) estimated higher-order moments.
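Steps 2 and 3 of the algorithm can be sketched in Python as follows. This is an illustrative implementation under stated assumptions: the function names are hypothetical, the input `mu` is a list of raw moments with `mu[0] = 1`, and the sketch does not address the fact that, with estimated rather than exact moments, some implied weights may come out negative.

```python
from math import comb, sqrt

def transformed_moments(mu):
    """Moments of theta / (2c) + 1/2 on [0, 1], given raw moments mu[k] = E[theta^k]
    for k = 0..m (with mu[0] = 1) and support half-width c = 5 * sqrt(mu[2])."""
    c = 5.0 * sqrt(mu[2])
    m = len(mu) - 1
    return [
        sum(comb(k, j) * (2 * c) ** (-j) * 2.0 ** (-(k - j)) * mu[j] for j in range(k + 1))
        for k in range(m + 1)
    ]

def hausdorff_probs(mu_t):
    """Weights on the grid {0, 1/m, ..., 1} implied by transformed moments mu_t[0..m]."""
    m = len(mu_t) - 1
    probs = []
    for j in range(m + 1):
        # (m - j)-th order difference applied to the j-th transformed moment.
        diff = sum(comb(m - j, i) * (-1) ** i * mu_t[j + i] for i in range(m - j + 1))
        probs.append(comb(m, j) * diff)
    return probs
```

A grid point j/m on the unit interval corresponds to θ = 2c(j/m − 1/2) on the original scale. Draws for the simulations would then come from a smoothed version of the resulting cdf, as described above.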

5.3 Results

Table 5 presents the parameter estimates. The first two columns present results from the procedure described in the previous subsection. The coefficient on the emission rate implies that for a 10% decrease in emission rates, O&M variable cost increases by 3.6%, and this is significant at the 5% level. If the utility has at least one flue-gas desulfurization unit, the effect of decreasing emission rates goes down by about half. To interpret the coefficient on log electricity output, I compute a simple measure of single-output returns to scale using Nelson’s (1985) equation (7) for variable cost functions. My estimates imply returns to scale of 1.69, which is on the high side. For example, recent estimates of returns to scale range from 0.99 to 1.56 (Kleit and Terrell, 2001). Cost elasticities for the firm’s variable inputs imply cost shares of roughly 30%, 60%, 7% and 4% for labor, coal, oil and natural gas inputs.

To interpret the estimated disutility function, suppose the firm’s cost when it does not exert effort is $100M, while at the optimal level of positive effort, the firm reduces its cost by 5%. At the optimum, the marginal disutility is equal to the marginal cost reduction:

$$\exp(\gamma e^*) = -\left.\frac{\partial}{\partial e}\left[\exp(\theta - e)\, C(\beta)\right]\right|_{e^*} = \exp(\theta - e^*)\, C(\beta) = \$95\text{M}.$$

Thus

$$\frac{1}{\gamma}\exp(\gamma e^*) = \$19\text{M}$$

and so a 5% reduction from $100M incurs a level of disutility valued at $19M. My chosen effort disutility function is not a function of firm attributes. As a robustness check, I include the total nameplate capacity of the utility and the proportion of coal burned relative to total fuel. A firm with more or larger plants might be more difficult to manage. Moreover, in interviews with plant engineers and managers, Bushnell and Wolfram (2007) note that there is greater scope for an individual plant operator’s skill and effort to affect plant efficiency among coal plants. Thus monitoring the operator’s performance is likely to be more difficult in coal plants. Estimated coefficients on these variables are positive although only the coal ratio is significant (10% level). The estimated coefficient on effort, i.e. γ, is smaller but the 95% confidence interval still contains my previous estimate.

29 Beran and Hall (1992) use the polygonal approximant (see Feller (1971, p. 540)) to the cdf of the discrete distribution. Basically the polygonal approximant convolves a uniform distribution between two points of the discrete distribution. That is, one draws a line that connects two steps of the cdf.

The estimated evolution of intrinsic types shows strong persistence. The coefficient on the past rate case’s intrinsic type is 1.002 and this is statistically significant at the 1% level. An interesting question is whether a firm fixed effect would be sufficient to capture the unobserved heterogeneity in cost efficiencies given the high persistence of intrinsic types across rate cases. The last two columns of table 5 show the estimates from a regression model with firm fixed effects and year dummies. Focusing on the estimates of the coefficients on electricity output and emissions, we see that the estimates from the fixed effect model are attenuated. Although the firm fixed effect can capture the variation in cost efficiencies due to variation in intrinsic types across firms, the fixed effect fails to capture the effect of endogenous effort on cost efficiency. The upward bias in the coefficient on emission rates can be explained as follows. Think of effort as an omitted variable and imagine that emission rate is the only regressor. This omitted variable is negatively correlated with cost and negatively related to emission rates (because lower emission rates increase cost, which increases the marginal benefit from exerting effort). Thus there will be upward bias. An upward bias in the coefficient on emission rates leads to underestimated marginal abatement costs (MAC) since

$$\text{MAC} = -\frac{\partial}{\partial s}\left[\exp(\theta - e)\, C(\beta)\right] = |\beta_s| \exp(\theta - e)\, C(\beta)\, s^{-1}.$$
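The MAC expression is simple enough to evaluate directly. The sketch below is only an illustration of the formula for a utility without an FGD unit: the function name and arguments are hypothetical, the result is in cost units per unit of emission rate, and converting it to dollars per ton of SO2 would require heat input data that the sketch does not model.

```python
def marginal_abatement_cost(beta_s, expected_cost, emission_rate):
    """|beta_s| * exp(theta - e) * C(beta) / s, where expected_cost stands for
    exp(theta - e) * C(beta) evaluated at the given emission rate s."""
    return abs(beta_s) * expected_cost / emission_rate

# Illustrative numbers only: beta_s = -0.356, a cost of 100 and an emission rate of 2.5.
mac_example = marginal_abatement_cost(-0.356, 100.0, 2.5)
```

The formula makes the direction of the bias visible: a coefficient on the emission rate that is biased toward zero lowers |βs| and therefore lowers the implied MAC proportionally.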

Figure 4 plots the discrete approximation to the cumulative distribution of intrinsic type θ, and the fitted polynomial. I draw a sample of size 50 using this fitted polynomial. The distribution of θ implies a distribution of MACs and I plot the histogram of MACs in figure 5. In generating the distribution of MACs, I assume (i) all firms have an emission rate of 2.5 lbs per MMBtu, (ii) observable variables (electricity output, input prices and fuel burned) are at their median values, (iii) firms do not have FGDs installed (i.e. dFGD = 0), and (iv) firms exert optimal positive effort. The emission standard of 2.5 is the implicit emission standard under Phase I of the Acid Rain Program, so figure 5 reflects the distribution of MACs if SO2 regulation were implemented by a uniform emission standard. There is considerable heterogeneity in MACs. The median MAC is $182 per ton while the mean MAC is $325. The 75th percentile is $365 so most of the mass of the distribution is below $400. The 90th and 95th percentiles are $869 and $1405 respectively, so there is a nonnegligible mass of firms that have MACs above $800. A more flexible pollution regulatory regime takes advantage of the heterogeneity in cost efficiencies. For example, the regime that minimizes the total cost of achieving the same level of abatement can be implemented by setting a uniform emission tax equal to $113 per ton and letting firms decide their emission rates. Annual cost-savings under this regime are about $12M per firm.

Table 5: Parameter estimates

log O&M variable cost          Model                 FE
                           Est        SE        Est        SE
log emission rate         -0.356**   0.200    -0.210***   0.035
log emission rate*FGD      0.177*    0.134     0.161***   0.045
log Electricity output     0.694**   0.376     0.458***   0.044
log Price of labor         0.297***  0.121     0.056**    0.056
log Price of coal          0.595***  0.143     0.659***   0.062
log Price of oil           0.065     0.054     0.200***   0.052
log Price of gas           0.043     0.044     0.085*     0.044
log Nameplate             -0.174     4.515    -0.172*     0.102
FGD                       -1.371     3.97     -0.203***   0.047
Disutility (γ)             4.975**   2.423     .          .
Type evolution (ρ0)       -0.124     1.342     .          .
Type evolution (ρ1)        1.002***  0.277     .          .

Model SE computed via bootstrap. * 10%, ** 5%, *** 1%. FE = firm & year

6 Counterfactual welfare

The social planner’s responsibility encompasses both pollution and economic regulation. Pollution regulation is concerned with emission rates while economic regulation deals with how the firm will be paid for providing its services. I focus on emission rates as the regulatory variable, taking the quantity of electricity, capital and input prices as exogenously given. A regulatory regime is a direct revelation contract that specifies a bundle (s, e, t) for each type (θ,R). The bundle consists of an emission rate s, a level of managerial effort e and a lump-sum transfer t.³⁰ The lump-sum transfer should be sufficient to cover both the cost of producing electricity and abatement. Different regimes correspond to different mappings between types and bundles.

³⁰ The level of effort e can be part of the contract since the social planner observes the firm’s cost and can then recover what e is, assuming the contract is incentive compatible. An equivalent way of specifying the contract is (s, C, t) where C is the firm’s realized operating cost.


Figure 4: Estimated cdf of θ (horizontal axis: θ; series: discrete approximation and polynomial fit)

Figure 5: Histogram of marginal abatement costs in $/ton (uniform standard = 2.5 lbs/MMBtu)


The planner cares about social welfare, which is given by equation (1) and reproduced here:

$$W = \int \left[ V(q(\theta,R)) - D(s(\theta,R)) - (1+\lambda)\, t(\theta,R) + \Pi(\theta,R) \right] dF \tag{1}$$

Moreover, the planner faces constraints in designing the regime. First, the planner needs to satisfy individual rationality constraints which require leaving firms with nonnegative economic profits:

$$\Pi(\theta,R) = t(\theta,R) - \left\{ \exp[\theta - e(\theta,R)] \cdot C(s(\theta,R)) + \psi[e(\theta,R)] + R \right\} \ge 0 \tag{12}$$

for all (θ,R). As in Laffont (1994) and Laffont and Tirole (1986), I assume the social planner observes realized cost but not the firm’s type and effort. Thus the planner also faces an informational constraint. This informational constraint is captured by incentive compatibility constraints

$$\Pi(\theta,R) \ge t(\theta',R') - \left\{ \exp[\theta - e(\theta',R')] \cdot C(s(\theta',R')) + \psi[e(\theta',R')] + R \right\} \tag{13}$$

for all (θ,R) and θ′ ≠ θ or R′ ≠ R. These constraints ensure that a type (θ,R) does not have an incentive to pick some other type’s bundle.
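The individual rationality and incentive compatibility constraints can be checked mechanically for any finite menu. The sketch below follows the form of (12) and (13) with R dropped; the cost function, the disutility function and the two-type menu are hypothetical stand-ins chosen only to show that cost-covering transfers fail incentive compatibility.

```python
import math

def profit(theta_true, bundle, cost_fn, psi):
    """Payoff of a type theta_true firm taking bundle = (s, e, t), following (12)-(13) with R dropped."""
    s, e, t = bundle
    return t - (math.exp(theta_true - e) * cost_fn(s) + psi(e))

def violations(menu, cost_fn, psi, tol=1e-9):
    """Return (IR violations, IC violations) for a menu {theta: (s, e, t)}."""
    ir = [th for th, b in menu.items() if profit(th, b, cost_fn, psi) < -tol]
    ic = [(th, other)
          for th, b in menu.items()
          for other, ob in menu.items()
          if other != th and profit(th, ob, cost_fn, psi) > profit(th, b, cost_fn, psi) + tol]
    return ir, ic

# Hypothetical primitives, for illustration only.
def cost_fn(s):
    return 100.0 * s ** -0.356

def psi(e):
    return math.exp(0.5 * e)

def cost_covering_menu(allocations):
    """Transfers that exactly reimburse each type's own cost plus disutility."""
    return {th: (s, e, math.exp(th - e) * cost_fn(s) + psi(e))
            for th, (s, e) in allocations.items()}

menu = cost_covering_menu({-0.5: (2.0, 0.4), 0.5: (3.0, 0.2)})
ir_bad, ic_bad = violations(menu, cost_fn, psi)
print("IR violations:", ir_bad)
print("IC violations (type, mimicked type):", ic_bad)
```

The efficient type (θ = −0.5) gains by taking the inefficient type's bundle, which is the overstatement incentive that information rents are meant to offset.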

Although the firm’s type is two-dimensional, the screening problem can be reduced to a single-dimensional screening problem. Since there is no action to screen R, all of the firms will report the highest possible R. This holds in any regime and thus R’s do not play a role in comparing welfare (except for the full information regime). From here on I will just treat θ as the firm’s type and ignore R. I let Θ be the random sample of θ’s that I have drawn from the estimated distribution of θ. I have N = 50 types in total. I use the median values of electricity output, input prices, amount of fuel burned and capital to compute welfare. I also assume firms do not own a plant that has a flue-gas desulfurization unit.
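Drawing the sample Θ from an estimated cdf can be sketched with inverse-transform sampling. The fitted polynomial itself is not reproduced in the text, so the function `fitted_cdf` below is a hypothetical logistic stand-in, and the support of −6 to 6 is an assumption read off the axis of figure 4. A fitted polynomial need not be monotone everywhere, so an actual implementation would have to check this before inverting.

```python
import math
import random

def fitted_cdf(theta):
    """Hypothetical smooth stand-in for the fitted cdf of theta (a logistic curve)."""
    return 1.0 / (1.0 + math.exp(-theta))

def inverse_cdf(u, cdf, lo=-6.0, hi=6.0, tol=1e-10):
    """Invert a nondecreasing cdf on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def draw_types(n=50, seed=0):
    """Draw n types by pushing uniform draws through the inverse cdf."""
    rng = random.Random(seed)
    return sorted(inverse_cdf(rng.random(), fitted_cdf) for _ in range(n))

types = draw_types()
print(len(types), min(types), max(types))
```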

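Given a regulatory regime, I compute

$$W(p,\lambda) = \frac{1}{N} \sum_{\theta \in \Theta} \left[ -p \cdot S(s(\theta)) - (1+\lambda)\, t(\theta) + \Pi(\theta) \right]$$

where

$$\Pi(\theta) = t(\theta) - \left\{ \exp[\theta - e(\theta)] \cdot \Psi \cdot s(\theta)^{\beta_s} + \psi[e(\theta)] \right\}$$

$$\Psi = N^{\beta_N} \exp(\beta_{FGD}\, d_{FGD})\, q^{\beta_q}\, p_l^{\beta_l}\, p_c^{\beta_c}\, p_o^{\beta_o}\, p_g^{\beta_g}$$

$$\beta_s < 0.$$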

The linear function S(s(θ); Γj) converts an emission rate s(θ) to tons of SO2 emissions using the median amount of fuel burned. I impose a linear pollution damage function so p represents the constant marginal damage from a ton of pollution.³¹ The variable λ is the social cost of public funds. I treat (p, λ) as simulation parameters and I compute W for different combinations of (p, λ). Finally, the welfare metric W does not include the surplus from electricity consumption and thus I focus on W − W^UE, where W^UE is the corresponding welfare metric for the uniform emission standard regime. W − W^UE measures the welfare gain of a given regulatory regime relative to the uniform emission standard.
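The welfare metric can be computed directly once a regime assigns a bundle to each sampled type. The sketch below is a minimal illustration: the `Bundle` container, the conversion slope `tons_per_rate`, the disutility function and all numerical inputs are hypothetical and are not the values used in the simulations.

```python
import math
from dataclasses import dataclass

@dataclass
class Bundle:
    s: float  # emission rate
    e: float  # managerial effort
    t: float  # lump-sum transfer

def welfare(regime, p, lam, big_psi, beta_s, psi, tons_per_rate):
    """Average of -p*S(s) - (1+lam)*t + Pi over the sampled types.

    regime maps each theta to a Bundle; S(s) = tons_per_rate * s is the
    assumed linear conversion from emission rates to tons.
    """
    total = 0.0
    for theta, b in regime.items():
        operating_cost = math.exp(theta - b.e) * big_psi * b.s ** beta_s
        firm_profit = b.t - (operating_cost + psi(b.e))
        total += -p * tons_per_rate * b.s - (1.0 + lam) * b.t + firm_profit
    return total / len(regime)

def psi(e):
    """Hypothetical disutility of effort."""
    return math.exp(0.5 * e) - 1.0

params = dict(p=300.0, lam=0.3, big_psi=100.0, beta_s=-0.356, psi=psi, tons_per_rate=0.05)
uniform = {th: Bundle(2.5, 0.2, 90.0) for th in (-0.5, 0.0, 0.5)}
flexible = {-0.5: Bundle(2.0, 0.2, 90.0), 0.0: Bundle(2.5, 0.2, 90.0), 0.5: Bundle(3.0, 0.2, 90.0)}
gain = welfare(flexible, **params) - welfare(uniform, **params)
print(f"welfare gain relative to the uniform standard: {gain:.3f}")
```

The transfers in this toy example are arbitrary; in the regimes studied here they would be chosen to satisfy the individual rationality and incentive compatibility constraints.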

6.1 Regulatory regimes

I compute W under the following regulatory regimes:

Full-information. The planner observes θ and e so incentive compatibility constraints are not relevant. Define the first best allocation as the pair (s^FB(θ), e^FB(θ)) that solves

$$p \cdot \frac{dS(s(\theta);\Gamma_j)}{ds} = (-\beta_s)(1+\lambda)\exp[\theta - e(\theta)] \cdot \Psi(\Gamma_j) \cdot s(\theta)^{\beta_s - 1}$$

$$\psi'[e(\theta)] = \exp[\theta - e(\theta)] \cdot \Psi(\Gamma_j) \cdot s(\theta)^{\beta_s}.$$

The planner pays firms a transfer that is just enough to cover costs. Thus the full-information regime is characterized by

$$\theta \mapsto \left( s^{FB}(\theta),\; e^{FB}(\theta),\; \exp[\theta - e^{FB}(\theta)] \cdot C(s^{FB}(\theta)) + \psi[e^{FB}(\theta)] \right).$$
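The two first best conditions can be solved in closed form under additional functional-form assumptions. The sketch below assumes an exponential disutility ψ(e) = exp(γe) and a linear conversion S(s) = `tons_per_rate` · s; both are assumptions made here for illustration, as are all numerical inputs other than the coefficient values borrowed from table 5. Under these assumptions both conditions are linear in (log s, e).

```python
import math

def first_best(theta, p, lam, big_psi, beta_s, gamma, tons_per_rate):
    """Solve the first best conditions for (s, e), assuming psi(e) = exp(gamma * e)."""
    a = abs(beta_s)
    # Effort condition in logs:
    #   log(gamma) + gamma*e = theta - e + log(big_psi) + beta_s*log(s)
    k0 = theta + math.log(big_psi) - math.log(gamma)
    # Emission condition, after substituting the effort condition:
    #   log(p*tons_per_rate) + log(s) = log(a*(1+lam)*gamma) + gamma*e
    rhs = (math.log(a * (1.0 + lam) * gamma)
           - math.log(p * tons_per_rate)
           + gamma * k0 / (1.0 + gamma))
    log_s = rhs / (1.0 + gamma * a / (1.0 + gamma))
    e = (k0 + beta_s * log_s) / (1.0 + gamma)
    return math.exp(log_s), e

inputs = dict(p=300.0, lam=0.3, big_psi=100.0, beta_s=-0.356, gamma=4.975, tons_per_rate=0.05)
s_fb, e_fb = first_best(theta=0.3, **inputs)
print(f"s_FB = {s_fb:.3f}, e_FB = {e_fb:.3f}")
```

In this sketch a higher θ (a less efficient type) is assigned a higher first best emission rate, matching the statement that less abatement is required of inefficient types.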

Optimal regulation. The planner chooses (s, e, t) to maximize welfare subject to individual rationality and incentive compatibility constraints. The optimal mechanism is fully characterized in the appendix and is similar to the mechanism characterized in Proposition 2 of Laffont (1994). Allocations (s(θ), e(θ)) deviate from (s^FB(θ), e^FB(θ)) except for the most efficient type, because of the planner’s desire to reduce information rents. The most inefficient type earns zero profits while the rest earn strictly positive profits.

Uniform emissions standard (s = efficient standard). The planner requires s(θ) to be equal to the emissions standard s for all θ. The efficient uniform emission standard is the emission rate that maximizes allocative efficiency under the constraint that all firms have s(θ) = s. Given s, the planner induces the firms to choose effort e(θ) such that

$$\psi'[e(\theta)] = \exp[\theta - e(\theta)] \cdot \Psi(\Gamma_j) \cdot s^{\beta_s}$$

and offers transfers that satisfy individual rationality and incentive compatibility constraints.

³¹ Allowing for a more complicated nonlinear damage function necessitates sophisticated techniques to estimate marginal damages across sources. Fowlie and Muller (2012) perform welfare analysis for non-uniformly mixed pollutants by utilizing the method for computing marginal damages developed in Muller and Mendelsohn (2009). They do not touch on issues arising in regulation with asymmetric information and costly information rents which is my main focus.

Figure 6: Hybrid regime

Emission tax. The planner sets an emission tax of p/(1 + λ) per ton. This leads to firms choosing the allocation (s(θ), e(θ)) = (s^FB(θ), e^FB(θ)) which maximizes allocative efficiency. The planner chooses transfers such that individual rationality and incentive compatibility constraints are satisfied given the first best allocation. Transfers are allowed to depend on type.
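A short derivation suggests why a tax of p/(1 + λ) reproduces the first best conditions. It rests on two assumptions that the text does not spell out: tax payments enter the firm's profit as τ · S(s), and the firm treats its transfer t as fixed when choosing s and e.

```latex
\begin{align*}
\max_{s,\,e}\;& t - \left\{ \exp[\theta - e]\,\Psi\, s^{\beta_s} + \psi(e) \right\} - \tau\, S(s) \\
\text{FOC}_s:\;& \tau\, \frac{dS}{ds} = (-\beta_s)\exp[\theta - e]\,\Psi\, s^{\beta_s - 1} \\
\text{FOC}_e:\;& \psi'(e) = \exp[\theta - e]\,\Psi\, s^{\beta_s} \\
\tau = \frac{p}{1+\lambda} \;\Rightarrow\;& p\,\frac{dS}{ds} = (-\beta_s)(1+\lambda)\exp[\theta - e]\,\Psi\, s^{\beta_s - 1}
\end{align*}
```

The last line and the effort condition coincide with the two conditions that define the first best allocation, so under these assumptions the firm's private choice matches the full-information allocation.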

Hybrid: emission tax with opt-out. The planner offers firms two choices: IN or OUT. If the firm chooses IN, it is required to pay an emission tax of p/(1 + λ) per ton and in return will be provided a transfer t∗. The transfer does not depend on the firm’s type unlike in the previous emission tax regime. If the firm chooses OUT, it is required to set s(θ) equal to the first best emission rate of the most inefficient type which I define as θ̄. The firm is paid a transfer equal to the total cost (including disutility) of θ̄, i.e.

$$\exp[\bar{\theta} - e^{FB}(\bar{\theta})] \cdot C(s^{FB}(\bar{\theta})) + \psi[e^{FB}(\bar{\theta})].$$

Figure 6 summarizes the hybrid contract.

6.2 Results and analysis

I examine welfare gains under three different values for the constant marginal damage: p = 100, 300 and 1000. The range of emission permit prices during Phase I was about $60 to about $300 per ton. Moreover, the range of emission tax rates under the proposed Sulfur and Nitrogen Emissions Tax Act of 1987 (H.R. 2497) is $300 to $900 per ton. Thus these choices of constant marginal damages are reasonable approximations of what policy-makers had in mind with respect to marginal damage from SO2 emissions. Finally I look at two values for the cost of public funds: λ = 0.3 and 0.7. The value λ = 0.3 comes from estimates of the cost of public funds for the US in the public finance literature (Laffont, 2005; Ballard, Shoven and Whalley, 1985) while λ = 0.7 reflects an environment where taxes are difficult to collect.

Table 6: Welfare gains, W − W^UE ($M), emission standard = efficient

(λ, p)       (0.3, 100)  (0.3, 300)  (0.3, 1000)  (0.7, 100)  (0.7, 300)  (0.7, 1000)
Full Info       104.4       136.6       178.7        215.4       280.6       370.0
Opt Reg          32.1        54.1        72.9         55.8       108.8       155.4
Tax              25.4        41.4        54.6         33.3        73.9        97.3
Hybrid           23.0        40.3        53.1         29.1        74.9        98.6

Table 7: Mean emission rates and efficient uniform standard (lbs/MMBtu)

(λ, p)         (0.3, 100)  (0.3, 300)  (0.3, 1000)  (0.7, 100)  (0.7, 300)  (0.7, 1000)
Opt Reg           3.41        2.02        0.86         4.10        2.60        1.27
Tax               2.90        1.40        0.55         3.31        1.72        0.68
Hybrid            3.05        1.85        0.73         3.77        2.27        0.90
Efficient std     3.75        1.61        0.63         4.61        1.98        0.78

Tables 6 and 7 present the welfare gains and mean emission rates under the different regulatory regimes and parameter constellations. Welfare gains measure the improvement in welfare under the regime when compared to a uniform emission standard. These gains are in millions of 1995 dollars and are interpreted as the average annual gain per firm. The mean emission rates are in terms of lbs/MMBtu. All of these measures reflect averaging across types.

Annual welfare gains range from $32M to $155M per firm. These gains represent about 10% (i.e. 32/330) to 47% of the average O&M variable cost in my sample of electric utilities. Welfare gains increase with both the constant marginal damage parameter p and the cost of public funds λ. The main weakness of a uniform emission standard is its lack of flexibility in terms of emission allocations across heterogeneous firms. The gains from flexibility that optimal pollution regulation is able to achieve come from two sources. First, a more flexible emission allocation scheme increases allocative efficiency, i.e. the proper balance between marginal damages from emissions and marginal abatement costs. More inefficient firms have higher abatement costs so less abatement is required for these types. Second, flexibility allows the planner to reduce information rents by lowering abatement levels for types that have larger impacts on overall information rents. Lowering required abatement for inefficient types lowers the reward of more efficient types from claiming to be inefficient, hence lower information rents have to be paid. Figure 7 shows the division of welfare gains into these two sources. Almost all of the gains from flexibility come from reduction in information rents. Information rents are large under the uniform standard because inefficient types are required to abate the same level as efficient types. The more stringent the standard is, the larger are the information rents. The gain from allocative efficiency is bounded above by the difference in allocative efficiencies under the first best allocation and under the efficient uniform standard, i.e.

$$-D \cdot \frac{1 + \gamma(1+|\beta_s|)}{\gamma |\beta_s|} \cdot \left[ \left( \frac{1}{N} \sum_{\theta \in \Theta} s^{FB}(\theta) \right) - s \right]$$

where D > 0 is the equivalent marginal damage from an increase in emission rates. This upper bound only depends on the difference between the mean emission rate under the first best allocation and the efficient uniform standard. When this gap is small, the gains from allocative efficiency are also small.

Figure 7: Welfare gains: Efficiency & Information Rents

Laffont (1994) suggests that optimal pollution regulation can be implemented using differentiated emission taxes and transfers. For example, the transfer will be a function of the firm’s reported cost while the tax will be a function of reported emission rate. Each type reports different combinations of cost and emission rate, and in return receives different transfers and faces different tax rates. When the number of types is large, such a policy would be difficult to implement. An interesting question then is how well simpler regulatory regimes perform. I first look at a uniform emission tax regime that provides differentiated subsidies to firms. The welfare gains under this regime are the upper bound for the class of regimes with uniform emission taxes since this has the most flexible compensation scheme. Second, I consider a hybrid regime where firms can choose either to participate in the uniform emission tax regime or to opt out and join a lenient emission standard. If the firm decides to pay emission taxes, it receives a transfer that is not differentiated across types. If the firm opts out, then it will be required to have the first best emission rate of the most inefficient firm. In exchange it receives a transfer equal to the cost of the most inefficient firm.

Table 8: Percent of welfare gains from optimal regulation that is captured by simple contracts (%), emission standard = efficient

(λ, p)     (0.3, 100)  (0.3, 300)  (0.3, 1000)  (0.7, 100)  (0.7, 300)  (0.7, 1000)
Tax           79.1        76.5        74.9         59.7        67.9        62.6
Hybrid        71.7        74.5        72.8         52.2        68.8        63.4

Table 9: Opt-out emissions standard (lbs/MMBtu)

(λ, p)        (0.3, 100)  (0.3, 300)  (0.3, 1000)  (0.7, 100)  (0.7, 300)  (0.7, 1000)
Opt-out std      7.00        5.25        2.08         7.00        6.46        2.55

Table 8 shows how much of the welfare gains from optimal regulation is captured by simpler regimes. The opt-out emission rates are given in table 9. The emissions tax regime can capture from 60% to 80% of the welfare gains from optimal regulation. These numbers indicate that a uniform emission tax regime can yield welfare gains that are not far below those of the more complicated optimal mechanism.
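The shares in table 8 follow from table 6 by dividing each regime's gain by the gain under optimal regulation. The short check below copies the reported figures from the two tables and recomputes the ratios; the variable and function names are illustrative.

```python
# Welfare gains from Table 6 ($M), columns ordered as (lambda, p) =
# (0.3,100), (0.3,300), (0.3,1000), (0.7,100), (0.7,300), (0.7,1000).
TABLE6 = {
    "Opt Reg": [32.1, 54.1, 72.9, 55.8, 108.8, 155.4],
    "Tax":     [25.4, 41.4, 54.6, 33.3, 73.9, 97.3],
    "Hybrid":  [23.0, 40.3, 53.1, 29.1, 74.9, 98.6],
}
# Shares reported in Table 8 (%).
TABLE8 = {
    "Tax":    [79.1, 76.5, 74.9, 59.7, 67.9, 62.6],
    "Hybrid": [71.7, 74.5, 72.8, 52.2, 68.8, 63.4],
}

def share_captured(regime):
    """Percent of the optimal-regulation gain captured by regime, column by column."""
    return [100.0 * g / opt for g, opt in zip(TABLE6[regime], TABLE6["Opt Reg"])]

for regime in ("Tax", "Hybrid"):
    print(regime, [f"{x:.1f}" for x in share_captured(regime)])

# Columns where the hybrid regime yields a larger gain than the tax regime.
hybrid_beats_tax = [i for i, (h, t) in enumerate(zip(TABLE6["Hybrid"], TABLE6["Tax"])) if h > t]
print(hybrid_beats_tax)
```

The recomputed ratios agree with table 8 up to rounding, and the hybrid regime comes out ahead only in the two columns with λ = 0.7 and p ≥ 300.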

Although allocations are decentralized in the emission tax regime, it is complicated to implement since transfers are type-dependent. The hybrid regime is basically an emission tax regime with a type-independent transfer so it is a simpler alternative. A hybrid regime with 100% participation (no opt-out) is clearly welfare dominated by the tax regime with differentiated transfers because the planner leaves higher information rents in the former. The advantage of the hybrid regime is that it can lower information rents by allowing firms to opt out. However opt-out distorts allocative efficiency and so has a negative effect on welfare. It turns out that if the gains from lowering information rents are sufficiently large relative to the loss from allocative efficiency distortions, then the hybrid regime can do better than the uniform emission tax regime with differentiated transfers. One such case is when λ is large. Table 8 shows that when λ = 0.7 and p ≥ 300, the hybrid is actually better than the emission tax regime with differentiated transfers. When λ = 0.3, the hybrid regime is worse; however, the gap is not large.

The intuition for why the hybrid regime works is precisely the intuition for optimal regulation: there is a tradeoff between allocative efficiency and information rent extraction. The hybrid regime can be seen as a binary menu in the spirit of Rogerson (2003) and Chu and Sappington (2007). Figure 8 plots the cdf of the distribution of emission rates s(θ) under different regimes. The hybrid regime basically approximates the distribution under optimal regulation in a limited way.

Figure 8: CDF of emission rates under different regimes (λ = 0.7, p = 300). Horizontal axis: emission rate; series: Opt Reg, First best, Hybrid, Uniform standard.

7 Conclusion

Annual welfare gains from optimal pollution regulation relative to an efficient uniform emission standard range from $32 million to $155 million per electric utility, or about 10% to 47% of electricity generation costs. The optimal form of regulation can be theoretically implemented by designing a menu of type-dependent emission tax rates and transfers. A simpler and more practical way to allocate emission rates is through a uniform emission tax. A regime with a uniform emission tax and type-dependent transfers can capture from 60% to 80% of these gains. However this still requires the social planner to design transfers that depend on the firm’s type. I consider a hybrid regime where both the emission tax and transfer are uniform but firms are allowed to opt out and join a uniform emission standard. The hybrid regime captures from 52% to 75% of the welfare gains and can even do better than the more complicated emission tax regime if the cost of public funds is high.


I use a model of rate regulation to identify and estimate the firm’s hidden type and disutility from exerting effort. In the model and analysis, I did not explicitly model capital choice and how it can be used as a signal during the rate case. The primary mode of compliance during the time period I study was fuel-switching so capital-based compliance methods played a smaller role. However, more recent data reflect greater popularity of capital-based compliance methods; hence, explicitly modeling capital choice is important. Future research will deal with this more general case. The optimal mechanism in this case is the solution to a non-separable multidimensional screening problem. While this is a complicated problem to solve analytically, numerical methods can be used with a discretized type space.

Another avenue for future research is to compare my estimates with estimates from a normative model. The nonparametric identification strategy developed recently by Perrigne and Vuong (2011) can be used to estimate Laffont and Tirole’s (1986) normative model. A formal econometric test along the lines of Vuong (1989) and Smith (1992) can be used to assess which model is a better fit to the data.

Finally, regulation in my case is static. One reason for this is that Title IV, as originally conceived, is a long-term program. However, starting in 2005, the EPA decided to redesign the program to take into account the cross-state transport effects of SO2 emissions. In doing so, it updated the estimates of marginal abatement costs. Some firms and states sued the EPA and, to date, the future of this policy is largely uncertain. Because of the ability of the regulator to update its information about the firms and change the policy accordingly, it would be more expensive to incentivize firms to reveal their types. Thus, policies that reduce information rents would probably yield higher welfare than those that focus on allocative efficiency.

References

Ackerberg, D. A., K. Caves and G. Frazer (2006), “Structural Identification of Production Functions,” mimeo, UCLA.

Alt, L. E. (2006), Energy Utility Rate Setting: A Practical Guide to the Retail Rate-Setting Process for Regulated Electric and Natural Gas Utilities, Morrisville, NC: Lulu.com.

Arellano, M. (2003), Panel Data Econometrics, Oxford: Oxford University Press.

Arellano, M. and S. Bonhomme (2012), “Identifying Distributional Characteristics in Random Coefficients Panel Data Models,” Review of Economic Studies, 79:3, 987-1020.

Arellano, M. and B. Honoré (2001), “Panel data models: some recent developments,” in J.J. Heckman and E.E. Leamer (ed.), Handbook of Econometrics, Vol. 5, Amsterdam: Elsevier Science.


Armstrong, M. and D. E. M. Sappington (2007), “Recent Developments in the Theory of Regulation,” in M. Armstrong and R. Porter (ed.), Handbook of Industrial Organization, Vol. 3, Amsterdam: Elsevier Science.

Asker, J. (2010), “A Study of the Internal Organization of a Bidding Cartel,” American Economic Review, 100:3, 724-762.

Bailey, E. E. and R. D. Coleman (1971), “The Effect of Lagged Regulation in an Averch-Johnson Model,” Bell Journal of Economics, 2:1, 278-292.

Ballard, C. L., J. B. Shoven and J. Whalley (1985), “General Equilibrium Computations of the Marginal Welfare Costs of Taxes in the United States,” American Economic Review, 75:1, 128-138.

Banks, J. S. (1992), “Monopoly Pricing and Regulatory Oversight,” Journal of Economics & Management Strategy, 1:1, 203-233.

Banks, J. S. and J. Sobel (1987), “Equilibrium Selection in Signaling Games,” Econometrica, 55:3, 647-661.

Baron, D. P. and D. Besanko (1984), “Regulation, Asymmetric Information, and Auditing,” RAND Journal of Economics, 15:4, 447-470.

Baron, D. P. and R. B. Myerson (1982), “Regulating a Monopolist with Unknown Costs,” Econometrica, 50:4, 911-930.

Baumol, W. J. and A. K. Klevorick (1970), “Input Choices and Rate-of-Return Regulation: An Overview of the Discussion,” Bell Journal of Economics and Management Science, 1:2, 162-190.

Beran, R. and P. Hall (1992), “Estimating Coefficient Distributions in Random Coefficient Regressions,” Annals of Statistics, 20:4, 1970-1984.

Besanko, D. (1985), “On the Use of Revenue Requirements Regulation Under Imperfect Information,” in M. A. Crew (ed.), Analyzing the Impact of Regulatory Change in Public Utilities, Lexington, MA: Lexington Books, 39-55.

Besanko, D. and D. F. Spulber (1992), “Sequential-Equilibrium Investment by Regulated Firms,” RAND Journal of Economics, 23:2, 153-170.

Bonhomme, S. and J.-M. Robin (2010), “Generalized Nonparametric Deconvolution with an Application to Earnings Dynamics,” Review of Economic Studies, 77:2, 491-533.


Brocas, I., K. Chan and I. Perrigne (2006), “Regulation under Asymmetric Information in Water Utilities,” American Economic Review, Papers and Proceedings, 96, 62-66.

Bushnell, J. and C. Wolfram (2007), “The Guy at the Controls: Labor Quality and Power Plant Efficiency,” NBER working paper 13215.

Carlson, C. P., D. Burtraw, M. Cropper and K. Palmer (2000), “SO2 Control by Electric Utilities: What are the Gains from Trade,” Journal of Political Economy, 108:6, 1292-1326.

Chan, G., R. Stavins, R. Stowe and R. Sweeney (2012), “The SO2 Allowance-trading System and the Clean Air Act Amendments of 1990: Reflections on 20 Years of Policy Innovation,” National Tax Journal, 65:2, 419-452.

Chan, H. S., H. Fell, I. Lange and S. Li (2012), “Efficiency and Environmental Impacts of Electricity Restructuring on Coal-fired Power Plants,” mimeo, Univ of Maryland.

Chu, L. Y. and D. E. M. Sappington (2007), “Simple Cost-Sharing Contracts,” American Economic Review, 97:1, 419-428.

Cramton, P. and S. Kerr (2002), “Tradeable Carbon Permit Auctions: How and why to auction not grandfather,” Energy Policy, 30, 333-345.

Ellerman, A. D., P. L. Joskow, R. Schmalensee, J.-P. Montero and E. Bailey (2000), Markets for Clean Air: The U.S. Acid Rain Program, Cambridge, UK: Cambridge University Press.

Environmental Protection Agency (2001), Acid Rain Program: 2001 Progress Report, EPA-430-R-02-009.

Environmental Protection Agency (2007, June 8), What is Acid Rain? Retrieved from http://www.epa.gov/acidrain/what/index.html.

Environmental Protection Agency (2009, May 13), Effects of Acid Rain - Human Health, Retrieved from http://www.epa.gov/acidrain/effects/health.html.

Evdokimov, K. (2008), “Identification and Estimation of a Nonparametric Panel Data Model with Unobserved Heterogeneity,” mimeo, Yale University.

Evdokimov, K. (2010), “Nonparametric Identification of a Nonlinear Panel Model with Application to Duration Analysis with Multiple Spells,” mimeo, Princeton University.

Evdokimov, K. and H. White (2012), “Some Extensions of a Lemma of Kotlarski,” Econometric Theory, 28:4, 925-932.


Fabrizio, K., N. Rose and C. Wolfram (2007), “Do Markets Reduce Costs? Assessing the Impact of Regulatory Restructuring on US Electric Generation Efficiency,” American Economic Review, 97:4, 1250-1277.

Feller, W. (1971), An Introduction to Probability Theory and its Applications, Vol 2, New York: Wiley.

Fowlie, M. (2010), “Emissions Trading, Electricity Industry Restructuring, and Investment in Pollution Control,” American Economic Review, 100:3, 837-869.

Fowlie, M. and N. Muller (2012), “Market-based emissions regulation when damages vary across sources: What are the gains from differentiation?” mimeo, UC Berkeley.

Gagnepain, P. and M. Ivaldi (2002), “Incentive Regulatory Policies: The Case of Public Transit Systems in France,” RAND Journal of Economics, 33:4, 605-629.

Gandhi, A., S. Navarro and D. Rivers (2011), “On the Identification of Production Functions: How Heterogeneous is Productivity,” mimeo, Univ of Wisconsin-Madison.

Goulder, L. H., I. W. H. Parry and D. Burtraw (1997), “Revenue-raising versus Other Approaches to Environmental Protection: The Critical Significance of Preexisting Tax Distortions,” RAND

Hausdorff, F. (1923), “Momentprobleme für ein endliches Intervall,” Mathematische Zeitschrift, 16, 220-246.

Hendel, I. and A. Nevo (2012), “Intertemporal Price Discrimination in Storable Goods Markets,” mimeo, Northwestern Univ.

Horowitz, J. L. and M. Markatou (1996), “Semiparametric Estimation of Regression Models for Panel Data,” Review of Economic Studies, 63:1, 145-168.

Joskow, P. L. (1974), “Inflation and Environmental Concern: Structural Change in the Process of Public Utility Price Regulation,” Journal of Law and Economics, 17:2, 291-327.

Joskow, P. L. (2008), “Incentive Regulation and its Application to Electricity Networks,” Review of Network Economics, 7:4, 547-560.

Joskow, P. L. and R. Schmalensee (1998), “The Political Economy of Market-Based Environmental Policy: The U.S. Acid Rain Program,” Journal of Law and Economics, 41:1, 37-83.


Kahn, A. E. (1988), The Economics of Regulation: Principles and Institutions, Cambridge, MA: MIT Press.

Kotlarski, I. I. (1967), “On Characterizing the Gamma and Normal Distribution,” Pacific Journal of Mathematics, 20, 69-76.

Krasnokutskaya, E. (2011), “Identification and Estimation of Auction Models with Unobserved Heterogeneity,” Review of Economic Studies, 78:1, 293-327.

Laffont, J.-J. (1994), “Regulation of Pollution with Asymmetric Information,” in C. Dosi and T. Tomasi (eds.), Nonpoint Source Pollution Regulation: Issues and Analysis, Kluwer Academic Publishers, 39-66.

Laffont, J.-J. (2005), Regulation and Development, Cambridge, UK: Cambridge University Press.

Laffont, J.-J. and J. Tirole (1986), “Using Cost Observation to Regulate Firms,” Journal of Political Economy, 94, 614-641.

Laffont, J.-J. and J. Tirole (1993), A Theory of Incentives in Procurement and Regulation, Cambridge, MA: MIT Press.

Lazarev, J. (2011), “The Welfare Effects of Intertemporal Price Discrimination: An Empirical Analysis of Airline Pricing in U.S. Monopoly Markets,” mimeo, Stanford GSB.

Leslie, P. (2004), “Price Discrimination in Broadway Theater,” RAND Journal of Economics, 35:3, 520-541.

Levinsohn, J. and A. Petrin (2003), “Estimating Production Functions Using Inputs to Control for Unobservables,” Review of Economic Studies, 317-342.

Lewis, T. R. (1996), “Protecting the Environment when Costs and Benefits are Privately Known,” RAND Journal of Economics, 27:4, 819-847.

Miravete, E. J. (2007), “The Limited Gains from Complex Tariffs,” mimeo, Univ of Texas-Austin.

Li, T. and Q. Vuong (1998), “Nonparametric Estimation of the Measurement Error Model Using Multiple Indicators,” Journal of Multivariate Analysis, 65, 139-165.

Li, T., I. Perrigne and Q. Vuong (2000), “Conditionally Independent Private Information in OCS Wildcat Auctions,” Journal of Econometrics, 98, 129-161.


Muller, N. and R. Mendelsohn (2009), “Efficient Pollution Regulation: Getting the Prices Right,” American Economic Review, 99:5, 1714-1739.

National Acid Precipitation Assessment Program (2005), National Acid Precipitation Assessment Program Report to Congress: An Integrated Assessment, Retrieved from http://ny.water.usgs.gov/projects/NAPAP/NAPAPReport2005.pdf.

Nelson, R. A. (1985), “Returns to Scale from Variable and Total Cost Functions,” Economics Letters, 18, 271-276.

Olley, S. and A. Pakes (1996), “The Dynamics of Productivity in the Telecommunications Equipment Industry,” Econometrica, 64, 1263-1295.

Pint, E. (1992), “Price-Cap Versus Rate-of-Return Regulation in a Stochastic-Cost Model,” Rand

Perrigne, I. and Q. H. Vuong (2011), “Nonparametric Identification of a Contract Model With Adverse Selection and Moral Hazard,” Econometrica, 79:5, 1499-1539.

Perry, R. H., D. W. Green, J. O. Maloney (eds.) (1997), Perry’s Chemical Engineers’ Handbook, McGraw-Hill.

Rao, B. L. S. Prakasa (1992), Identifiability in Stochastic Models: Characterization of Probability Distributions, San Diego: Academic Press.

Rogerson, W. (2003), “Simple Menus of Contracts in Cost-Based Procurement and Regulation,” American Economic Review, 93:3, 919-926.

Schennach, S. M. (2004), “Estimation of Nonlinear Models with Measurement Error,” Econometrica, 72:1, 33-75.

Schmalensee, R. and R. N. Stavins (2012), “The SO2 Allowance Trading System: The Ironic History of a Grand Policy Experiment,” Harvard Kennedy School Faculty Working Paper Series, RWP12-030.

Shohat, J. A. and J. D. Tamarkin (1943), The Problem of Moments, Mathematical Surveys Number 1, Providence, RI: American Mathematical Society.

Smith, R. J. (1992), “Non-Nested Tests for Competing Models Estimated by Generalized Method of Moments,” Econometrica, 60:4, 973-980.

Spulber, D. F. (1998), “Optimal Environmental Regulation under Asymmetric Information,” Journal of Public Economics, 35:2, 163-181.


Villas-Boas, S. B. (2009), “An Empirical Investigation of the Welfare Effects of Banning Wholesale Price Discrimination,” RAND Journal of Economics, 40:1, 20-46.

Vuong, Q. H. (1989), “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses,” Econometrica, 57:2, 307-334.

Wolak, F. A. (1994), “An Econometric Analysis of the Asymmetric Information, Regulator-Utility Interaction,” Annales d’Economie et de Statistiques, 34, 13-69.

Appendix

Proof of propositions

Proof of Proposition 1

Since the equilibrium auditing strategy of the regulator is strictly increasing in $\tilde{R}$, the firm chooses $\tilde{R}$ according to the markup equation

$$\tilde{R} = R + \frac{1-\alpha}{\alpha_{\tilde{R}}}.$$

The regulator’s equilibrium auditing strategy is not a function of $\tilde{C}$ so I can write equilibrium $\tilde{R}$ as just $\tilde{R}(R)$. The equilibrium strategy is increasing in R since $U_{\tilde{R}R} > 0$ and U is supermodular. Next, since α is not a function of $\tilde{C}$, the firm does not have any incentive to exert effort during the rate case so e1 = 0 for all (θ,R). After the rate case, e2 equates the marginal disutility of effort with the marginal benefit. Notice that e2 is just a function of θ.

On the equilibrium path, $\varrho = R$, so the regulator’s first order condition becomes

$$A'(\alpha) = 2(\tilde{R} - R).$$

The firm’s equilibrium choice of $\tilde{R}$ satisfies the markup equation, so I can rewrite the regulator’s FOC as

$$A'(\alpha) = 2\left(\frac{1-\alpha}{\alpha_{\tilde{R}}}\right).$$

Since $\alpha$ is not a function of $\tilde{C}$, this is just a separable ordinary differential equation:

$$\int \frac{A'(\alpha)}{1-\alpha}\, d\alpha = 2\tilde{R}. \tag{14}$$

Off-equilibrium proposals can be grouped into two. The first group involves the firm still proposing the equilibrium $\tilde{R}(R)$ but some other $\tilde{C} \neq \exp(\theta)\bar{C}$. The second group involves the firm proposing a different $\tilde{R}$, i.e. type $R$ proposes $\tilde{R} \neq \tilde{R}(R)$. For the first group, since $\alpha$ is not a function of $\tilde{C}$, off-equilibrium proposals involving just $\tilde{C}$ do not change the regulator’s behavior. Given this, the firm does not have an incentive to deviate from exerting zero effort during the rate case.

For the second group, consider deviations $\tilde{R} \le \tilde{R}(R^U)$. This means that the type $R$ firm makes someone else’s proposal, say $\tilde{R}(R')$, and is audited as if it were type $R'$. This is not a profitable deviation for the firm since it is interior but does not satisfy the markup equation for type $R$. Now suppose the firm deviates by proposing some $\tilde{R} > \tilde{R}(R^U)$. The equilibrium auditing strategy defined in equation (14) allows $\tilde{R} > \tilde{R}(R^U)$. For these proposals, the auditing strategy treats the firm as if it were a type above $R^U$ and remains strictly increasing. Since the firm’s optimal proposal is increasing in $R$ and $R \le R^U$, this deviation is not profitable for type $R$.

Proof of Proposition 2 [[TBD]]

In this equilibrium, all types (θ,R) propose the highest possible RRB, RU and types differ only on their

reported operating cost, $\tilde{C}$. Given the regulator’s equilibrium auditing strategy $\alpha$, a type $(\theta, R)$ firm chooses $e_1 = e_1(\theta, R)$ such that

$$\left[2\alpha_{\tilde{C}} \cdot (R^U - R) - 1\right] \exp(\theta - e_1)\,\bar{C} = \psi'(e_1). \tag{15}$$

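The reported operating cost of $(\theta, R)$ is thus $\tilde{C}(\theta, R) = \exp(\theta - e_1(\theta, R))\,\bar{C}$. Since the cardinality of the type space $[0, \theta^U] \times [0, R^U]$ is larger than that of the message space of $\tilde{C}$, we necessarily have pooling of types at different values of $\tilde{C}$. Formally, the set of types that report a given $\tilde{C}$ in equilibrium is given by

$$T(\tilde{C}) = \left\{ (\theta, R) : \psi'\!\left(\theta - \ln\frac{\tilde{C}}{\bar{C}}\right) = \left[2\alpha_{\tilde{C}}(\tilde{C}) \cdot (R^U - R) - 1\right]\tilde{C} \right\}. \tag{16}$$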

To show that $e_1(\theta, R) > 0$ for any $(\theta, R)$, I assume $\psi(e) = \exp(e)$ for convenience. Note that I have used a similar exponential form for $\psi$ in the empirical part of the paper. Given this functional form, I can represent $T(\tilde{C})$ as “iso-cost” curves in $R$-$\theta$ space:

$$\theta = \ln\left( \left[2\alpha_{\tilde{C}}(\tilde{C})\,(R^U - R) - 1\right] \frac{\tilde{C}^2}{\bar{C}} \right). \tag{17}$$

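First, I want to show that the marginal benefit of exerting effort is strictly positive, i.e.

$$\left[2\alpha_{\tilde{C}}(\tilde{C}) \cdot (R^U - R) - 1\right] > 0$$

or equivalently,

$$R < R^U - \frac{1}{2\alpha_{\tilde{C}}(\tilde{C})}.$$

Second, I want to show that the solution $e_1(\theta, R)$ such that $\tilde{C} = \exp(\theta - e_1(\theta, R))\,\bar{C}$ and equation (15) holds is strictly positive. This is equivalent to

$$\theta > \ln\frac{\tilde{C}}{\bar{C}}.$$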

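To show the first requirement, recall that $\theta \ge 0$ and thus

$$\ln\left( \left[2\alpha_{\tilde{C}}(\tilde{C})\,(R^U - R) - 1\right] \frac{\tilde{C}^2}{\bar{C}} \right) \ge 0.$$

Define $R^*$ as the value of $R$ such that

$$\ln\left( \left[2\alpha_{\tilde{C}}(\tilde{C})\,(R^U - R^*) - 1\right] \frac{\tilde{C}^2}{\bar{C}} \right) = 0.$$

Note that $R \le R^*$ since the left-hand side of the inequality is decreasing in $R$. From this equation we get

$$R^* = R^U - \frac{1}{2\alpha_{\tilde{C}}(\tilde{C})} - \frac{\bar{C}}{2\alpha_{\tilde{C}}(\tilde{C})\,\tilde{C}^2} < R^U - \frac{1}{2\alpha_{\tilde{C}}(\tilde{C})},$$

hence

$$R < R^U - \frac{1}{2\alpha_{\tilde{C}}(\tilde{C})}.$$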
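A small numerical sketch of the iso-cost relation (17) and the threshold $R^*$ may help fix ideas. It assumes $\psi(e) = \exp(e)$ as above and treats the slope of the auditing strategy with respect to the reported cost as a given positive number. The function and argument names (`c_rep` for the reported cost, `c_base` for the baseline cost term that multiplies $\exp(\theta - e_1)$, `alpha_c` for the slope, `r_upper` for $R^U$) and all numerical values are hypothetical and are not taken from the paper.

```python
import math

def iso_cost_theta(r_type, c_rep, c_base, alpha_c, r_upper):
    """theta on the iso-cost curve (17):
    ln([2*alpha_c*(r_upper - r_type) - 1] * c_rep**2 / c_base)."""
    marginal_benefit = 2.0 * alpha_c * (r_upper - r_type) - 1.0
    if marginal_benefit <= 0.0:
        raise ValueError("marginal benefit of effort must be strictly positive")
    return math.log(marginal_benefit * c_rep ** 2 / c_base)

def r_star(c_rep, c_base, alpha_c, r_upper):
    """Value of R at which the iso-cost curve reaches theta = 0."""
    return r_upper - 1.0 / (2.0 * alpha_c) - c_base / (2.0 * alpha_c * c_rep ** 2)

# Hypothetical values, for illustration only.
c_rep, c_base, alpha_c, r_upper = 2.0, 1.0, 5.0, 1.0
r_s = r_star(c_rep, c_base, alpha_c, r_upper)
theta_at_r_star = iso_cost_theta(r_s, c_rep, c_base, alpha_c, r_upper)
theta_below = iso_cost_theta(0.5, c_rep, c_base, alpha_c, r_upper)
```

Under these placeholder values the curve reaches $\theta = 0$ exactly at $R^*$, and any type with $R < R^*$ sits at a strictly positive $\theta$ on the same curve, which is the monotonicity the argument relies on.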

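As for the second requirement, it suffices that

$$\tilde{C}(\theta^U, R^U) < \exp(\theta^U)\,\bar{C}$$

since the iso-cost curves radiate in the northeast direction as $\tilde{C}$ increases, under some regularity conditions.32 Note that there exists $(\theta^*, R^*) \in T(\tilde{C}(\theta^U, R^U))$ such that

$$\tilde{C}(\theta^U, R^U) = \exp\left(\theta^* - e^{FB}(\theta^*)\right)$$

since the regulator chooses $\alpha$ by taking expectations under the belief supported on $T(\tilde{C}(\theta^U, R^U))$. The first best level of effort is equal to $e^{FB}(\theta) = \frac{1}{2}(\theta + \ln \bar{C})$, hence

$$\exp\left(\theta - e^{FB}(\theta)\right) = \exp\left(\tfrac{1}{2}(\theta - \ln \bar{C})\right).$$

This expression is increasing in $\theta$ so we have

$$\tilde{C}(\theta^U, R^U) = \exp\left(\theta^* - e^{FB}(\theta^*)\right) \le \exp\left(\tfrac{1}{2}(\theta^U - \ln \bar{C})\right).$$

32 A sufficient condition for this is that $\alpha$ is not “too concave”: $\left[2\alpha_{\tilde{C}}(\tilde{C}) + \alpha_{\tilde{C}\tilde{C}}(\tilde{C})\,\tilde{C}\right] > \frac{1}{R^U}$.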

Therefore, a sufficient condition for the second requirement is

$$\theta^U > \ln \bar{C}^{-3}.$$
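The algebra behind this sufficient condition can be sketched as follows. The sketch takes as given the upper bound on the reported cost of the highest type, $\exp(\frac{1}{2}(\theta^U - \ln \bar{C}))$, and the requirement that this reported cost lie below $\exp(\theta^U)\bar{C}$, where $\bar{C}$ is shorthand (an assumed notation) for the baseline cost term that multiplies $\exp(\theta - e_1)$. It is enough that the bound itself satisfies the requirement:

```latex
\begin{align*}
\exp\!\Big(\tfrac{1}{2}\big(\theta^U - \ln \bar{C}\big)\Big) < \exp(\theta^U)\,\bar{C}
  &\iff \tfrac{1}{2}\theta^U - \tfrac{1}{2}\ln \bar{C} < \theta^U + \ln \bar{C} \\
  &\iff -\tfrac{3}{2}\ln \bar{C} < \tfrac{1}{2}\theta^U \\
  &\iff \theta^U > -3\ln \bar{C} = \ln \bar{C}^{-3}.
\end{align*}
```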

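I now discuss the regulator’s equilibrium auditing strategy. On the equilibrium path, the regulator chooses $\alpha$ such that

$$A'(\alpha) = \int 2(\tilde{R} - R)\, dF\big(\theta, R \mid (\theta, R) \in T(\tilde{C})\big).$$

Using equation (17) gives

$$A'(\alpha) = \int \frac{1}{\alpha_{\tilde{C}}(\tilde{C})} \left[\frac{\bar{C}}{\tilde{C}}\exp(\theta) + 1\right] dF\big(\theta, R \mid (\theta, R) \in T(\tilde{C})\big).$$

Let

$$\overline{\exp(\theta)} = \int \exp(\theta)\, dF\big(\theta, R \mid (\theta, R) \in T(\tilde{C})\big).$$

Then $\alpha$ is the solution to

$$A(\alpha) = \left[\overline{\exp(\theta)}\,\bar{C}\right] \ln \tilde{C} + \tilde{C} + \kappa$$

for some constant of integration $\kappa$. Off-equilibrium, $\overline{\exp(\theta)}$ is based on some arbitrary belief.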

Proof of Proposition 3

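1. Consider $\tilde{C}$ and $\tilde{C}'$ with $\tilde{C} > \tilde{C}'$. Let $R$ be the R-type that picks $(\tilde{R}, \tilde{C})$ in the data and similarly $R'$ be the R-type that picks $(\tilde{R}, \tilde{C}')$. Using the definition of $\Delta$ gives

$$\Delta(\tilde{R}, \tilde{C}) - \Delta(\tilde{R}, \tilde{C}') = \alpha(\tilde{R}, \tilde{C})\,[\tilde{R} - R] - \alpha(\tilde{R}, \tilde{C}')\,[\tilde{R} - R'] \tag{18}$$

$$= \left[\alpha(\tilde{R}, \tilde{C}) - \alpha(\tilde{R}, \tilde{C}')\right][\tilde{R} - R] - \alpha(\tilde{R}, \tilde{C}')\,[R - R'] \tag{19}$$

where the second line comes from adding and subtracting $\alpha(\tilde{R}, \tilde{C}')\,[\tilde{R} - R]$. Let

$$T = \{R : R \text{ picks } (\tilde{R}, \tilde{C})\}, \qquad T' = \{R : R \text{ picks } (\tilde{R}, \tilde{C}')\}.$$

Note $T$ and $T'$ can be nonsingleton sets. Let $\varrho$ and $\varrho'$ be the corresponding “beliefs” about $R$ for each signal. From the regulator’s optimal auditing strategy and strict convexity of $A(\cdot)$, we have $\alpha(\tilde{R}, \tilde{C}) > \alpha(\tilde{R}, \tilde{C}')$ if and only if $\varrho < \varrho'$. The inequality $\varrho < \varrho'$ is equivalent to $\max T < \min T'$ since pooling sets are intervals. Finally, $\max T < \min T'$ is equivalent to $R < R'$ since $R \in T$ and $R' \in T'$. Therefore $\alpha(\tilde{R}, \tilde{C}) > \alpha(\tilde{R}, \tilde{C}')$ if and only if $R < R'$. Applying this to equation (19) yields

$$\Delta(\tilde{R}, \tilde{C}) = \Delta(\tilde{R}, \tilde{C}') \iff \alpha(\tilde{R}, \tilde{C}) = \alpha(\tilde{R}, \tilde{C}').$$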

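2. The following lemma is useful for the proof:

Lemma 1 Suppose there exist two distinct R-types, $R$ and $R''$, that pick $(\tilde{R}, \tilde{C})$ in equilibrium. Then $\tilde{R} = R^U$.

Proof. Suppose $\tilde{R} < R^U$. From the FOCs of $R$ and $R''$ with respect to $\tilde{R}$, we have

$$\big(1 - \alpha(\tilde{R}, \tilde{C})\big) - \alpha_{\tilde{R}}(\tilde{R}, \tilde{C})\,[\tilde{R} - R] = 0 = \big(1 - \alpha(\tilde{R}, \tilde{C})\big) - \alpha_{\tilde{R}}(\tilde{R}, \tilde{C})\,[\tilde{R} - R''].$$

But this implies $R = R''$, so the two R-types cannot be distinct.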

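Consider $\tilde{R}$ and $\tilde{R}'$ with $R^U > \tilde{R} > \tilde{R}'$. Let $R$ be the R-type that picks $(\tilde{R}, \tilde{C})$ in the data and similarly $R'$ be the R-type that picks $(\tilde{R}', \tilde{C})$. Using the definition of $\Delta$ and adding and subtracting $\alpha(\tilde{R}', \tilde{C})\,[\tilde{R} - R]$ give

$$\Delta(\tilde{R}, \tilde{C}) - \Delta(\tilde{R}', \tilde{C}) = \left[\alpha(\tilde{R}, \tilde{C}) - \alpha(\tilde{R}', \tilde{C})\right][\tilde{R} - R] + \alpha(\tilde{R}', \tilde{C})\left[(\tilde{R} - R) - (\tilde{R}' - R')\right].$$

Using Lemma 1 and the fact that $R^U > \tilde{R} > \tilde{R}'$, we can conclude that $R$ is the only R-type that picks $(\tilde{R}, \tilde{C})$ and $R'$ is the only R-type that picks $(\tilde{R}', \tilde{C})$ in equilibrium. Thus the regulator’s belief pins down $R$ and $R'$, i.e. $\varrho = R$ and $\varrho' = R'$. Since $\alpha(\tilde{R}, \tilde{C}) > \alpha(\tilde{R}', \tilde{C})$ if and only if $\tilde{R} - \varrho > \tilde{R}' - \varrho'$, we have

$$\Delta(\tilde{R}, \tilde{C}) > \Delta(\tilde{R}', \tilde{C}) \iff \alpha(\tilde{R}, \tilde{C}) > \alpha(\tilde{R}', \tilde{C}).$$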

Identifying and estimating α and R

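If we knew the function $\alpha$, then we could get $R$ from the markup equation:

$$R = \tilde{R} - \frac{1-\alpha}{\alpha_{\tilde{R}}}.$$

The main task then is to identify and estimate $\alpha$. Consider the disallowance

$$\Delta = \tilde{R} - \hat{R},$$

the gap between the proposed and the allowed rate of return, which is part of the data. Using the definition of $\hat{R}$ in the model, we can link $\Delta$ with $\alpha$:

$$\Delta = \alpha \cdot (\tilde{R} - R).$$

The equilibrium $\tilde{R}$ of the firm satisfies

$$\tilde{R} - R = \frac{1-\alpha}{\alpha_{\tilde{R}}}.$$

Thus we have the following differential equation:

$$\Delta = \frac{\alpha(1-\alpha)}{\alpha_{\tilde{R}}}.$$

This is an ordinary differential equation since $\alpha$ is not a function of $\tilde{C}$ in this equilibrium. The solution to this ODE is

$$\alpha(\tilde{R}) = \left\{1 + \exp\left[-\Upsilon(\tilde{R})\right]\right\}^{-1}$$

where

$$\Upsilon(\tilde{R}) = \int \frac{1}{\Delta(\tilde{R})}\, d\tilde{R}.$$

For estimation, I approximate the function $\Delta(\tilde{R})$ by a linear function in $\tilde{R}$, i.e. $\Delta(\tilde{R}) = a_0 + a_1\tilde{R}$, so that I can compute $\Upsilon(\tilde{R})$ easily. To get the coefficients, I regress $\Delta$ on $\tilde{R}$ and firm, state and year effects.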
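As an illustration of this last step, the sketch below evaluates the closed-form solution under the linear approximation. If the disallowance is $a_0 + a_1 r$ in the proposed rate of return $r$, then the integral of its reciprocal is $\ln(a_0 + a_1 r)/a_1$ plus a constant, and combining the markup equation with the differential equation gives a type equal to $r$ minus the disallowance divided by $\alpha$. The function names, the integration constant `k` and the numerical values of `a0` and `a1` are hypothetical; they are not estimates from the paper.

```python
import math

def delta(r_prop, a0, a1):
    """Linear approximation of the disallowance: a0 + a1 * r_prop."""
    return a0 + a1 * r_prop

def upsilon(r_prop, a0, a1, k=0.0):
    """Antiderivative of 1/delta: ln(a0 + a1*r_prop)/a1 + k.
    Requires delta > 0 and a1 != 0."""
    return math.log(delta(r_prop, a0, a1)) / a1 + k

def alpha(r_prop, a0, a1, k=0.0):
    """Auditing strategy solving delta = alpha*(1 - alpha)/alpha_R."""
    return 1.0 / (1.0 + math.exp(-upsilon(r_prop, a0, a1, k)))

def implied_type(r_prop, a0, a1, k=0.0):
    """Type R recovered from delta = alpha * (r_prop - R)."""
    return r_prop - delta(r_prop, a0, a1) / alpha(r_prop, a0, a1, k)

# Hypothetical coefficients and proposal, for illustration only.
a0, a1 = 0.5, 1.0
r = 1.0
alpha_at_r = alpha(r, a0, a1)
type_at_r = implied_type(r, a0, a1)
```

The constant `k` is not pinned down by the differential equation alone, so in practice it would have to come from a boundary condition or normalization that this sketch does not attempt to choose.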

Characterization of optimal pollution regulation

The regulator maximizes welfare $W$ subject to individual rationality (IR) and incentive compatibility (IC) constraints. Although the original type space is two-dimensional, there is no instrument to screen R-types. Thus all firms will pool at the highest possible R. I solve the problem as a one-dimensional screening problem since R does not affect welfare comparisons (except for the full information regime).

The distribution of types is discrete so I adapt standard methods for continuous types (e.g. Laffont and Tirole, 1993; Laffont, 1994) to my setting. The first step is to reduce the set of IC constraints into upward local ICs. I solve the problem in terms of firms’ profits instead of transfers. For any types $\theta_i$ and $\theta_j$, IC requires

$$\Pi_i \ge \Pi_j + [\exp\theta_j - \exp\theta_i]\exp(-e_j)\,\Psi s_j^{\beta_s}$$

$$\Pi_j \ge \Pi_i - [\exp\theta_j - \exp\theta_i]\exp(-e_i)\,\Psi s_i^{\beta_s}.$$

Combining these, we get

$$\exp(-e_i)\,\Psi s_i^{\beta_s} \ge \exp(-e_j)\,\Psi s_j^{\beta_s}.$$

As long as the $(s, e)$’s satisfy this inequality, we can focus on upward local ICs. I solve the reduced problem and check this inequality ex-post.

By standard arguments, the IR of the most inefficient type will be binding while the ICs of the rest of the types will be binding. Thus $\Pi_N = 0$ and for $i = 1, 2, \ldots, N-1$,

$$\Pi_i = \Pi_{i+1} + [\exp\theta_{i+1} - \exp\theta_i]\exp(-e_{i+1})\,\Psi s_{i+1}^{\beta_s}.$$
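The recursion implied by the binding constraints can be traced numerically. The sketch below computes the profits of each type for a given allocation, starting from zero profit for the least efficient type and adding the rent increment type by type, and also checks the monotonicity inequality obtained from combining the IC constraints. The type grid, efforts, emission rates and the values used for Ψ and β_s are hypothetical placeholders, not estimates from the paper.

```python
import math

def cost_term(e_i, s_i, psi_scale, beta_s):
    """exp(-e_i) * Psi * s_i**beta_s, the term that appears in the IC constraints."""
    return math.exp(-e_i) * psi_scale * s_i ** beta_s

def information_rents(theta, e, s, psi_scale, beta_s):
    """Profits implied by Pi_N = 0 and binding upward local ICs.
    theta must be sorted from the most to the least efficient type."""
    n = len(theta)
    rents = [0.0] * n
    for i in range(n - 2, -1, -1):
        gap = math.exp(theta[i + 1]) - math.exp(theta[i])
        rents[i] = rents[i + 1] + gap * cost_term(e[i + 1], s[i + 1], psi_scale, beta_s)
    return rents

def satisfies_monotonicity(e, s, psi_scale, beta_s):
    """Check exp(-e_i)*Psi*s_i**beta_s >= exp(-e_j)*Psi*s_j**beta_s for adjacent i < j."""
    terms = [cost_term(ei, si, psi_scale, beta_s) for ei, si in zip(e, s)]
    return all(terms[i] >= terms[i + 1] for i in range(len(terms) - 1))

# Hypothetical allocation, for illustration only.
theta = [0.0, 0.1, 0.2]
e = [0.2, 0.2, 0.2]
s = [1.0, 1.2, 1.5]
rents = information_rents(theta, e, s, psi_scale=2.0, beta_s=-0.5)
monotone = satisfies_monotonicity(e, s, psi_scale=2.0, beta_s=-0.5)
```

Because each increment is positive whenever types are ordered by θ, rents accumulate toward the most efficient type, which is why distorting the allocation of less efficient types lowers the rents paid to everyone more efficient.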

Given these, I can rewrite the regulator’s objective function as

$$W = \frac{1}{N}\sum_{i=1}^{N}\Big\{ -D s_i - \psi(e_i) - \left[\exp\theta_i + (i-1)\lambda(\exp\theta_i - \exp\theta_{i-1})\right]\exp(-e_i)\,\Psi s_i^{\beta_s} \Big\}$$

where $D > 0$ is the marginal damage from an increase in the emission rate.

The first order condition with respect to $s_i$ is

$$-D = \beta\lambda\left[\exp\theta_i + (i-1)\lambda(\exp\theta_i - \exp\theta_{i-1})\right]\exp(-e_i)\,\Psi s_i^{\beta_s - 1}.$$

This FOC differs from the FOC for the first best emission rate because the regulator takes into account the effect of $s_i$ on the incentives of types $j = 1, 2, 3, \ldots, i-1$ to reveal their type. An increase in $s_i$ increases the required profits that the regulator has to give to all types that are more efficient than $\theta_i$ in the “second best” world. The first order condition with respect to $e_i$ is

$$\psi'(e_i) = \left[\exp\theta_i + (i-1)\lambda(\exp\theta_i - \exp\theta_{i-1})\right]\exp(-e_i)\,\Psi s_i^{\beta_s},$$

and the same comments apply.

Define

$$\Omega^{FB} = \exp\theta_i$$

$$\Omega^{OR} = \exp\theta_i + (i-1)\lambda(\exp\theta_i - \exp\theta_{i-1}).$$

Using the functional form33 of $\psi$ to compute optimal effort, the FOC with respect to $s_i$ becomes

$$-D = \beta\lambda\,(\Omega^{OR}\Psi)^{\frac{\gamma}{1+\gamma}}\, s_i^{\beta_s\frac{\gamma}{1+\gamma} - 1}.$$

For the first best emission rate, the FOC is

$$-D = \beta\lambda\,(\Omega^{FB}\Psi)^{\frac{\gamma}{1+\gamma}}\, s_i^{\beta_s\frac{\gamma}{1+\gamma} - 1}.$$

Since $\Omega^{FB} < \Omega^{OR}$ and $\beta_s < 0$, the first best emission rate is lower than the emission rate under optimal pollution regulation. Thus firms under-abate relative to the first best, except for the most efficient type.

33 I ignore the shock υ for simplicity.
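To see why the most efficient type is the exception, the sketch below computes the two cost multipliers defined above on a discrete type grid: the first best multiplier, exp(θ_i), and the multiplier under optimal regulation, which adds (i − 1)λ(exp θ_i − exp θ_{i−1}). The grid and the value of λ are hypothetical and chosen only for illustration.

```python
import math

def omega_fb(theta):
    """First best multiplier exp(theta_i) for each type."""
    return [math.exp(t) for t in theta]

def omega_or(theta, lam):
    """Multiplier under optimal regulation:
    exp(theta_i) + (i - 1) * lam * (exp(theta_i) - exp(theta_{i-1})), with i = 1..N.
    theta must be sorted from the most to the least efficient type."""
    out = []
    for idx, t in enumerate(theta):  # idx equals i - 1
        if idx == 0:
            distortion = 0.0  # the (i - 1) factor is zero for the most efficient type
        else:
            distortion = idx * lam * (math.exp(t) - math.exp(theta[idx - 1]))
        out.append(math.exp(t) + distortion)
    return out

# Hypothetical type grid and lambda, for illustration only.
theta = [0.0, 0.1, 0.2, 0.3]
fb = omega_fb(theta)
opt = omega_or(theta, lam=0.3)
wedge = [o - f for o, f in zip(opt, fb)]
```

The wedge is zero for the first type and positive for every other type whenever λ > 0 and θ is strictly increasing, so only the most efficient type faces the first best multiplier.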

Sample rate cases

Table 10: Gulf Power case

Case details
Gulf Power (FL)
Proposed: 12/15/1989
Authorized: 10/3/1990
Test: 12/31/1990
Last: 11/7/1984
Initiator: Firm
Why? Permanent rate increase requested

Expense
“Exclusion of certain economic development expenditures”

Rate Base
Firm’s 25% stake (212MW) in Plant Scherer 3 disallowed because company’s capacity deemed as adequate even without this stake

Rate of return
Firm sought ROE of 13%, with firm’s witness supporting as high as 13.5%; Commissioners authorized a 12.55% ROE. Using adopted capital structure leads to ROR of 8.1%, in contrast to proposed ROR of 8.34%.

Miscellaneous
Firm was imposed a 2-year 50 basis point deduction in ROE for “unethical/illegal activities”

Table 11: Georgia Power case

Case details
Georgia Power (GA)
Proposed: 4/2/1991
Authorized: 11/26/1991
Test: 4/30/1992
Last: 9/28/1989
Initiator: Firm
Why? Permanent rate increase requested

Expense
Revision in revenue forecast; depreciation represcription; adjustment in post-retirement benefits; nuclear O&M and decommissioning

Rate Base
Reduction in cash working capital

Rate of return
Firm sought an ROE of 13.25%; staff recommended an ROE in the range of 12% to 12.7%; Commissioners authorized an ROE of 12.25%; They also adopted a lower cost of long-term debt which further reduced the ROR to 10.7% compared to the proposed 11.17%.

Miscellaneous
(none)

Table 12: Baltimore Gas & Electric case

Case details
Baltimore Gas & Elec. (MD)
Proposed: 9/25/1992
Authorized: 4/23/1993
Test: 11/30/1992
Last: 12/17/1990
Initiator: Firm
Why? Permanent rate increase requested

Expense
Firm requested a portion of deferred fuel balances to be expensed but the PUC denied the request

Rate Base
Adjustments in recognition of accrued construction and phase-in costs

Rate of return
Firm proposed an ROE of 12.87% with staff proposing 11.61% which is the upper bound of the staff’s range of 10.61%-11.61%; Commissioners found flaws in both the firm’s and staff’s arguments but mostly sided with the staff, finally approving an ROE of 11.75%

Miscellaneous
(none)

Table 13: Ohio Edison case

Case details
Ohio Edison (OH)
Proposed: 8/1/1989
Authorized: 8/16/1990
Test: 12/31/1989
Last: 1/26/1988
Initiator: Firm
Why? Permanent rate increase requested

Expense
Adjustment in wage annualization, advertising expenses, amortization of deferred plant expenses

Rate Base
Adjustment to working capital and plant-in-service levels

Rate of return
Company’s witness used a DCF analysis incorporating a stock price of $18.85 to support a 14.32% ROE; PUC staff recommended an ROE range of 12.37%-13.39% based on a stock price of $19.31; PUC authorized a 13.21% ROE leading to an ROR of 11.2% compared to proposed ROR of 11.68%

Miscellaneous
(none)
