Quality Ratings and Premiums in the Medicare Advantage Market
Ian M. McCarthy∗
Department of Economics
Emory University
Michael Darden†
Department of Economics
Tulane University
January 2015
Abstract
We examine the response of Medicare Advantage contracts to published quality ratings. We
identify the effect of star ratings on premiums using a regression discontinuity design that exploits
plausibly random variation around rating thresholds. We find that 3, 3.5, and 4-star contracts
in 2009 significantly increased their 2010 monthly premiums by $20 or more relative to contracts
just below the respective threshold values. High quality contracts also disproportionately dropped
$0 premium plans or expanded their offering of positive premium plans. Welfare results suggest
that the estimated premium increases reduced consumer welfare by over $250 million among the
affected beneficiaries.
JEL Classification: D21; D43; I11; C51
Keywords: Medicare Advantage, Premiums, Quality Ratings, Regression Discontinuity
∗Emory University, Rich Memorial Building, Room 306, Atlanta, GA 30322, Email: [email protected]
†206 Tilton Memorial Hall, Tulane University, New Orleans, LA 70115. E-mail: [email protected]
1 Introduction
The role of Medicare Advantage (MA) plans in the provision of health insurance to Medicare bene-
ficiaries has grown substantially. Between 2003 and 2014, the share of Medicare eligible individuals
in an MA health plan increased from 13.7% to 30%.1 To better inform enrollees of MA quality, in
2007, the Center for Medicare and Medicaid Services (CMS) introduced a five-star rating system that
provided a rating of one to five stars to each MA contract – a private organization that administers
potentially many differentiated plans – in each of five quality domains.2 For the 2009 enrollment
period, CMS began aggregating the domain-level quality scores into an overall star rating for each
MA contract, such that each plan offered by a contract would display the contract’s overall star rating.
Since 2012, contracts have been incentivized to earn high quality star ratings through star-dependent
reimbursement and bonus schemes.
Early studies on the effects of the star rating program focus on the informational benefits to
Medicare beneficiaries. To this end, the program has been found to have a relatively small positive
effect on beneficiary choice, with heterogeneous effects across star ratings (Reid et al., 2013; Darden
& McCarthy, forthcoming). However, one area thus far overlooked concerns the supply-side response
to MA star ratings, where a natural consequence of the star rating program could be for contracts
to adjust premiums and other plan characteristics in response to published quality ratings.3 Indeed,
while the quality star program is often presented as a potential information shock to enrollees, the
program could also serve as an information shock to health insurance contracts, better informing them
of competitor quality and better informing contracts of their own signal of quality to the market. For
example, learning that its plans have the highest quality star rating in a market in 2009, a contract may
choose to price out its quality advantage in 2010 by raising plan premiums. Conversely, a relatively
low-rated contract may lower its 2010 premium in response to its 2009 quality star rating. More
generally, the extent to which policy may cause health insurance companies to adjust premiums is a
central question in health and public economics.4
The current paper provides a comprehensive analysis of 2010 premium adjustments to the 2009
publication of MA contract quality stars. We investigate the specific mechanisms by which contracts
can adjust their premiums in response to their quality ratings, and we calculate the corresponding wel-
fare effects. We adopt a regression discontinuity (RD) design that exploits plausibly random variation
around 2009 star thresholds, allowing us to separately identify the effect of reported quality on price
1Kaiser Family Foundation MA Update, available at http://kff.org/medicare/fact-sheet/medicare-advantage-fact-sheet/.
2For example, one domain on which contracts were rated was “Helping You Stay Healthy.”
3Preliminary evidence of a supply-side response to the publication of MA quality stars was found in Darden & McCarthy (forthcoming), albeit with a restricted sample of contract/plan/county/year observations.
4For example, see Pauly et al. (2014) on the effects of the Affordable Care Act on individual insurance premiums.
from the overall relationship between quality and price. Our data on contract/plan market shares,
reported contract quality, plan premiums, and other plan characteristics come from several publicly
available sources. Our results suggest strong premium adjustments following the 2009 star rating pro-
gram, with average to above average star-rated contracts significantly increasing premiums from 2009
to 2010. When we conduct our analysis at the contract level, we find that 3, 3.5, and 4-star contracts
increase their average premiums across existing plans by $33.60, $29.30, and $31.85, respectively, relative
to contracts with 2009 ratings just below the respective threshold values. At the plan level, we estimate
mean increases of $19.40, $41.99, and $31.52 for 3, 3.5, and 4-star contract/plans, respectively. These
effects are sizable compared to overall average premium increases of between $9 and $15. The results
are also broadly consistent across a range of sensitivity analyses, including consideration of alternative
bandwidths, falsification tests with counter-factual threshold values, and the exclusion of market-level
covariates.
While an MA contract may directly adjust its plans’ premiums in response to quality stars, the
contract may also adjust the mix of plans it offers within a market (county). For example, in response
to the published star ratings, a contract could alter the number of zero-premium plans; adjust the
number of plans that include Medicare Part D coverage; change the drug deductible in plans that offer
Part D coverage; or add/drop plans entirely. Indeed, our data show that nearly all of the regional
variation in plan premiums is due to selection of plan offerings by contracts, as opposed to contracts
charging different premiums in different areas of the country. We find that contracts just above the
3 and 3.5-star thresholds in 2009 are more likely to drop $0 premium plans in 2010, with 3.5-star
contracts also more likely to introduce positive premium plans into new markets. We find no such
disproportionate change in $0 or positive premium plans among contracts with a 4-star rating in 2009.
Meanwhile, low quality contracts (those just above the 2.5-star threshold in 2009) maintain their 2009
plan offerings at largely the same premium levels in 2010, while contracts just below the 2.5-star
threshold in 2009 are much more likely to exit the market altogether in 2010.
Overall, our results suggest that the star rating program in 2009 may have caused low quality
contracts to drop plans while generating large premium increases among contracts receiving 3-star
ratings and above. Adopting the consumer welfare calculations used in Town & Liu (2003) and
Maruyama (2011), our estimated increases in premiums imply a reduction in consumer surplus of over
$250 million among those beneficiaries enrolled in the relevant plans. To the extent that higher quality
plans are replacing low quality plans at reasonable premium levels, plan entry and exit behaviors
induced by the star-rating program may partially offset this welfare loss; however, given the number of
new plans estimated to have entered the market due to the star ratings, such offsets are likely relatively
small (Maruyama, 2011).
In what follows, we discuss the institutional details of Medicare Advantage and the recent star
rating program in Section 2. The data and methods are discussed in Sections 3 and 4, respectively.
We present our results in Section 5, with a series of robustness checks discussed in Section 6. Section
7 examines the potential mechanisms underlying our estimated premium adjustments, and Section 8
summarizes the welfare effects associated with our estimated premium increases. The final section
concludes.
2 Institutional Background
Since Medicare’s inception, beneficiaries have had the option to receive benefits through private health
insurance plans. The Balanced Budget Act of 1997 (BBA) classified all private Medicare health insur-
ance plans as Medicare Part C plans, and it allowed for additional types of business models including
Preferred Provider Organizations (PPOs), Provider-Sponsored Organizations (PSOs), Private fee-for-
service (PFFS) plans, and Medical Savings Accounts (MSAs). Later, in addition to the beneficiary
entitlement to prescription drug coverage, the Medicare Modernization Act of 2003 renamed Medi-
care Part C plans as Medicare Advantage (MA) plans. In each year since 2003, Medicare beneficiaries
choose to enroll in traditional fee-for-service (FFS) Medicare or an MA plan during an open enrollment
period from November 1st through December 31st. Enrollees in an MA plan must pay the Medicare
Part B premium in addition to any premium charged by the plan. In exchange,
MA plans provide at least (often more than) the services covered by traditional FFS Medicare. In
2009, 38% of MA plans charged no additional premium, while 77% of plans also offered prescription
drug coverage. Given the generosity of plan coverage at possibly no additional cost relative to traditional
Medicare FFS, the MA market has grown dramatically in recent years, with the share of Medicare eligible
individuals in an MA plan increasing from 13.7% in 2003 to 30% in 2014.5
Broadly, an MA contract is an agreement between a private insurance company and CMS whereby
the company agrees to insure Medicare beneficiaries in exchange for reimbursement. A contract is
approved by CMS to operate in specific counties, and an approved contract typically offers a menu
of MA plans that are differentiated by premium, prescription drug coverage, and, if covered, the
prescription drug deductible. Most MA contracts are required to offer at least one plan that includes
prescription drug coverage. For the 2015 enrollment year, 78% of all Medicare beneficiaries live in a
county with access to at least one plan that offers prescription drug coverage (MA-PD) and charges no
additional premium (above the Part B premium).6 In 2009, the mean number of MA plans available to
5Kaiser Family Foundation MA Update, available at http://kff.org/medicare/fact-sheet/medicare-advantage-fact-sheet/.
6http://kff.org/medicare/issue-brief/medicare-advantage-2015-data-spotlight-overview-of-plan-changes/
beneficiaries was roughly 11 per county.7 However, there exists considerable regional variation in
the availability of MA plans, and enrollments in MA plans are concentrated in a few national contracts.
Indeed, according to the Kaiser Family Foundation (KFF), 60% of all plans offered in 2015 are affiliated
with just seven health insurance companies.8
Starting in the 2007 enrollment year, CMS began collecting and distributing a one to five-star
quality rating in each of five quality domains (e.g., “Helping You Stay Healthy”). Each domain
was itself an aggregation of many individual quality metrics such as the percentage of enrollees with
access to an annual flu vaccine. These individual quality metrics are calculated based on data from a
variety of sources, including HEDIS, the Consumer Assessment of Healthcare Providers and Systems
(CAHPS), the Health Outcomes Survey (HOS), the Independent Review Entity (IRE), the Complaints
Tracking Module (CTM), and CMS administrative data. Starting in enrollment year 2009, CMS began
aggregating the domain level quality stars to an overall contract rating of between one and five stars
(in half-star increments).9 And since 2011, CMS constructs the contract-specific quality ratings as a
function of Part D coverage, when relevant. Our focus is on the 2009 and 2010 enrollment years:
the first two years of the overall contract star rating program and the years in which all contracts,
including those offering prescription drug coverage, were rated based on the same underlying quality
metrics.
The literature on the MA quality rating initiatives has generally focused on the enrollment effects.
Recently, Reid et al. (2013) find large effects of increases in star-ratings on enrollment that are homo-
geneous across the reported quality distribution, but results from that paper fail to disentangle the
effects of quality from quality reporting on enrollment. Attempting to disentangle these effects, Darden
& McCarthy (forthcoming) find heterogeneous effects of the quality star rating program on MA plan
enrollment in 2009 and no significant effect in 2010. At the plan level, they find that a marginally
higher rated contract at the lower end of the quality distribution (e.g., a 3 as compared to 2.5 starred
contract) realized a positive and significant enrollment effect equal to 4.75 percentage points relative
to traditional FFS Medicare in 2009 enrollments. This effect diminishes for higher rated contracts,
and vanishes for the 2010 enrollment year. The lack of an enrollment response to 2010 quality stars
suggests that the 2009 star ratings may have acted as a one-time informational event, or that there
was a supply-side response in 2010 based on the 2009 ratings.
Generally, the potential for supply-side responses to Medicare Advantage policy has received little
attention from researchers. One recent exception is Stockley et al. (2014), who examine how MA plan
premiums and benefits respond to variation in the benchmark payment rate, i.e., the subsidy received
7Author’s calculation. See Section 3 for a presentation of our data.
8See http://kff.org/medicare/issue-brief/medicare-advantage-2015-data-spotlight-overview-of-plan-changes/.
9For a complete discussion of the star rating program, see Darden & McCarthy (forthcoming).
by the MA contract for each enrollee. Those authors find that contracts do not adjust premiums
directly as a result of changes in benchmark payment rates, but rather contracts adjust the generosity
of plan benefits in response. Conversely, Darden & McCarthy (forthcoming) find that contract/plans
in 2010 raise premiums in response to higher 2009 contract-level quality star ratings. However, the
sample used to estimate the supply-side response of contracts in 2010 was restricted to just those
contract/plans with (a) 10 or more enrollees in both 2009 and 2010 and (b) nonmissing quality ratings
in 2010. Furthermore, that paper focuses only on direct premium increases, ignoring the possibility
of indirect premium adjustments such as changing the number of zero-premium plans or adjusting the
plan-mix within a county. The current paper provides a comprehensive examination of the supply-side
response to quality star ratings, examining the full population of approved MA contracts to evaluate
several potential response mechanisms as well as potential welfare consequences.
3 Data
We collect data on market shares, contract/plan characteristics, and market area characteristics from
several publicly available sources for calendar years 2009 and 2010.10 As a base, we use the Medicare
Service Area files to form a census of MA contracts that were approved to operate in each county
in the United States in 2009 and 2010. To these contract/county/year observations, we merge con-
tract/plan/county/year data on enrollment and other contract characteristics.11 To our market share
data, we merge further information on MA contract quality ratings, contract/plan premiums, county-
level MA market share, CMS benchmark rates, fee-for-service costs, hospital discharges, and census
data. The CMS quality information includes an overall summary star measure; star ratings for differ-
ent domains of quality (e.g., helping you stay healthy); as well as star ratings and continuous summary
scores for each individual metric (e.g., percentage of women receiving breast cancer screening and an
associated star rating). Data are not available for the overall continuous summary score (i.e., the score
rounded to generate an overall star rating), but we are able to replicate this variable by aggregating the
specific quality measures following CMS instructions. We explain this process thoroughly in Appendix
B. Hospital discharge data are from the annual Hospital Cost Reporting Information System (HCRIS),
and CMS benchmark rates and average FFS costs by county are publicly available from CMS. Finally,
county-level demographic and socioeconomic information are from the American Community Survey
(ACS).
10See Appendix C for a detailed discussion of our dataset and specific links.
11CMS suppresses enrollment counts for contract/plans with 10 or fewer enrollees, but we keep these observations and impute enrollment. The Service Area files are needed because the enrollment files do not account for migration. For example, it is possible for the enrollment file to contain a positive enrollment record for a contract/plan in a county even if that contract is not approved to operate in the county. See Appendix C for further details.
Our enrollment data are available monthly; however, there is little variation in enrollments across
months due to the nature of the open enrollment process at the end of each calendar year. Further-
more, all other variables of interest are specific to a calendar year. Therefore, we take the average
enrollment of each plan across months in a given year. The resulting unit of observation is the con-
tract/plan/county/year. Our analysis focuses only on health maintenance organizations (HMO), local
and regional preferred provider organizations (PPO), and private fee-for-service (PFFS) contracts.
We exclude all special needs plans and employer/union-specific plans (also known as 800-series plans),
and we drop all observations that pertain to United States Territories and Outlying Areas. Our final
sample includes 247,978 contract/plan/county/years.
Table 1 provides summary statistics for our final dataset at the plan, county, and contract level.
The data consist of 51,442 and 34,642 plan/county observations in 2009 and 2010, respectively, with
an increase in average MA enrollment per plan from 292 in 2009 to 361 in 2010.12 The county-level
summary statistics also reveal an increasing penetration of MA in the overall Medicare market, from
15.6% in 2009 to 16.5% in 2010, alongside a decrease in the number of plans offered per county,
an increase of just over $15 in average premiums, an increase in the percentage of plans offering
prescription drug coverage, and an increase in the proportion of HMO and PPO plans relative to
PFFS plans. Finally, the bottom panel of Table 1 illustrates a slight rightward shift in the distribution
of star ratings from 2009 to 2010, with 1.5-star contracts either improving in rating in 2010 or exiting
the market, and with a relative increase in the percentage of 4.5 and 5-star contracts in 2010.
Table 1
4 Methodology
Since star ratings are assigned to contracts (rather than specific plans operating within a contract),
our initial analysis follows Town & Liu (2003), Cawley et al. (2005), Dafny & Dranove (2008), Frakt
et al. (2012) and others in aggregating plan characteristics to the contract level by taking the mean
values across plans within a contract (in the same county). We then examine the relationship between
a contract’s quality star rating in 2009 and the contract’s premiums in 2010. Denoting the vector of
mean characteristics in market $m$ (county) for contract $c$ by $y_{cm} = \{y_{cm,1}, \ldots, y_{cm,K}\}$, we specify the
12As indicated in Table 1, enrollment data are not available for all plans as CMS does not provide enrollment counts for plans with 10 or fewer enrollments. As such, the mean enrollment figures presented are higher than the true mean as they exclude a large number of plans with missing enrollment data.
mean characteristic $k$ for contract $c$ as follows:

$$y_{cmk} = f(q_c, X_{cm}, W_m) + \varepsilon_{cmk}, \qquad (1)$$

where $q_c$ denotes the contract’s star rating in 2009, $X_{cm}$ denotes other contract characteristics, $W_m$ denotes 2010 market-level data on the age, race, and education profile of a given county, and $\varepsilon_{cmk}$ is an error term independently distributed across characteristics and markets.13 Given our focus on premiums, our plan characteristics of interest consist of the average premium and the proportion of the contract’s plans (in the same county) charging a $0 premium.14
The CMS quality rating system relies on a continuous summary score between 1 and 5 which is
rounded to the nearest half. A contract with a 2.24 summary score is therefore rounded down to a 2-star
rating, while a contract with a 2.26 summary score is rounded up to a 2.5-star rating. Intuitively, these
two contracts are essentially identical in quality but received different quality ratings. We propose to
exploit the nature of this rating system using a regression discontinuity (RD) design.15 More formally,
denote by Rc the underlying summary score, by R the threshold summary score at which a new star
rating is achieved (e.g., R = 2.25 when considering the 2.5 star rating), and by Rc = Rc − R the
amount of improvement necessary to achieve an incremental improvement in rating. We then limit our
analysis to contracts with summary scores within a pre-specified bandwidth, h, around each respective
threshold value, R. For example, to analyze the impact of improving from 2.0 to 2.5 stars, the sample
is restricted to contracts with summary scores of 2.25 ±h.
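The rounding rule and bandwidth restriction can be sketched in a few lines of code (a toy illustration of our own with made-up scores; exact CMS tie-breaking conventions are ignored, and halves are simply rounded up):

```python
import numpy as np

def star_rating(summary_score):
    """Round a continuous 1-5 summary score to the nearest half star
    (rounding halves up), mirroring the rounding rule described above."""
    return np.floor(np.asarray(summary_score) * 2 + 0.5) / 2

def rd_sample(scores, threshold, h=0.125):
    """Keep only contracts whose summary score lies within bandwidth h
    of a rating threshold (e.g., 2.25 for the 2.5-star rating)."""
    scores = np.asarray(scores)
    return scores[np.abs(scores - threshold) <= h]

scores = np.array([2.10, 2.24, 2.26, 2.40, 3.00])
print(star_rating([2.24, 2.26]))          # [2.  2.5]
print(rd_sample(scores, threshold=2.25))  # [2.24 2.26]
```

Note that the two nearly identical scores 2.24 and 2.26 receive different star ratings, which is exactly the variation the RD design exploits.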
To implement our approach, we specify plan/contract quality as follows:

$$q_c = \gamma_1 + \gamma_2 \times I(R_c > \bar{R}) + \gamma_3 \times \tilde{R}_c + \gamma_4 \times I(R_c > \bar{R}) \times \tilde{R}_c, \qquad (2)$$

where $\gamma_2$ is the main parameter of interest. Incorporating this RD framework into equation (1), and adopting a linear functional form for $f(\cdot)$, yields the final regression equation

$$y_{cmk} = \gamma_1 + \gamma_2 \times I(R_c > \bar{R}) + \gamma_3 \times \tilde{R}_c + \gamma_4 \times I(R_c > \bar{R}) \times \tilde{R}_c + \beta_c X_{cm} + \beta_m W_m + \varepsilon_{cmk}, \qquad (3)$$
where $W_m$ and $X_{cm}$ are as discussed previously. Our baseline analysis estimates equation (3) using ordinary least squares with a bandwidth of $h = 0.125$. We consider alternative bandwidths in Section 6, as well as a more traditional RD design with a triangular kernel (Imbens & Lemieux, 2008).
13We cluster standard errors by contract; however, the results are qualitatively unchanged when clustering standard errors at the county level.
14The overall plan type (e.g., HMO versus PPO) is typically contract-specific and therefore does not vary across plans within the same contract.
15See Imbens & Lemieux (2008) for a detailed discussion of the RD design and its application in economics.
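To make the estimating equation concrete, the following sketch (ours, not the authors' code) recovers $\gamma_2$ from simulated data around the 3.25 threshold, omitting the $X_{cm}$ and $W_m$ covariates and assuming a true jump of $30 purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated contracts around the 3.25 threshold (the 3.5-star cutoff).
# The true jump gamma2 = 30 ($/month) is an assumption of this sketch.
threshold, h, gamma2_true = 3.25, 0.125, 30.0
score = rng.uniform(threshold - h, threshold + h, size=2000)
r_tilde = score - threshold                  # distance from threshold
above = (score > threshold).astype(float)    # I(R_c > Rbar)
premium = 10 + gamma2_true * above + 5 * r_tilde \
          + 2 * above * r_tilde + rng.normal(0, 4, size=score.size)

# Design matrix for equation (3), without the X and W covariates:
# y = g1 + g2*I + g3*r_tilde + g4*I*r_tilde + eps
X = np.column_stack([np.ones_like(score), above, r_tilde, above * r_tilde])
coef, *_ = np.linalg.lstsq(X, premium, rcond=None)
print(f"estimated gamma2: {coef[1]:.2f}")   # close to 30
```

The coefficient on the indicator recovers the discontinuity in premiums at the threshold, separate from the smooth relationship between the summary score and premiums captured by the slope terms.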
Changes in mean premiums at the contract level can arise in several ways, most directly via changes
to premiums among specific plans. To investigate this possibility, we also estimate a regression of 2010
plan premiums as a function of the plans’ 2009 premiums, 2009 star ratings and other contract-level
variables, and 2009 county characteristics. This analysis is akin to estimating equation (3) but where
our analysis is at the plan level rather than aggregating to the contract level. For this analysis, we
examine only plans that were available in the same county in both 2009 and 2010.
5 Results
5.1 Average Premiums at the Contract Level
Table 2 presents the results of a standard OLS regression of mean contract characteristics in 2010
on the 2009 mean value, the contract’s 2009 star rating, as well as additional county and contract-
level covariates. To the extent that contract quality is already reflected in the contract’s mean plan
characteristics, we would expect the effects of increasing star ratings to be relatively small in magnitude.
This is the case in Table 2, where we see small decreases in average premiums among 2.5 and 4-star
contracts with small increases in premiums among 3 and 3.5-star contracts (relative to contracts with
one-half star lower ratings). Note that, in order to better reflect the premium charged to a given
enrollee in a specific contract, our analysis of average premiums at the contract level excludes plans
with 10 or fewer enrollments.16 Our analysis at the plan-level makes no such exclusion.
Table 2
The OLS results say little about the specific effects of an increase in reported quality on premiums.
To address this question directly, Table 3 presents the initial RD results at the contract level for a
bandwidth of h = 0.125. The results suggest a large premium increase for contracts receiving a 3, 3.5,
or 4 star rating in 2009, with these contracts increasing average premiums by between $29 and $34
per month from their 2009 levels relative to contracts with one-half star lower ratings. By contrast,
contracts receiving a 2.5-star rating showed no statistically significant increase in premiums. By virtue
of the RD design and the nature of the CMS star rating program, we argue that these estimates can be
interpreted as the causal effect of a one-half star increase in quality ratings separate from the quality
16Not surprisingly, low star-rated plans with 10 or fewer enrollments also charge much higher premiums relative to the same quality plans with higher enrollments. For example, in 2010, the average premium among 2.5-star plans with 10 or fewer enrollments was $63, compared to just $32 among 2.5-star plans with 11 or more enrollments. The results are nonetheless consistent when we include all plans and an indicator variable for missing enrollment data.
of the contract itself. For example, 3.5-star contracts of comparable “true” quality to 3-star contracts
were able to increase their premiums by an average of $29 per month. Looking purely at sample averages,
all other contracts receiving a 3.5-star rating in 2009 increased their premiums by an average of $12,
while 3-star contracts falling just below the 3.25 threshold increased their premiums by just over $3.
We provide extensive robustness and sensitivity analyses for these results in Section 6.
Table 3
5.2 Premiums at the Plan Level
Table 4 summarizes the RD results for 2010 plan premiums as a function of 2009 premiums, county-
level covariates, as well as the contract’s quality rating as specified in equation 2. This analysis
therefore estimates premium changes at the plan level (for the same plans offered in both 2009 and
2010), rather than analyzing average premiums at the contract level as in Table 3. For the same
plan/county/contract, the results again show a large and statistically significant increase in premiums
for 3, 3.5, and 4-star contracts, with premiums increasing by between $19 and $42 per month for the
same plans.
Table 4
6 Robustness and Sensitivity Analysis
The appropriateness of our proposed RD design depends critically on whether contracts can sufficiently
adjust their summary scores. Intuitively, it is unlikely that contracts can manipulate their scores
because the star ratings are calculated based on data two or three years prior to the current enrollment
period. Contracts would therefore not have the opportunity to manipulate other observable plan
characteristics in response to their same-year star ratings. To test this formally, we implement the
McCrary (2008) test for a discontinuity in the distribution of summary scores around the threshold values.
The resulting t-statistics range from 0.15 to 0.96, suggesting no evidence of a discontinuity in the
running variable at any of the threshold values. In the remainder of this section, we investigate the
sensitivity of our results along several other dimensions, including: 1) bandwidth selection; 2) inclusion
of covariates; and 3) falsification tests with counter-factual threshold values.
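The logic of the density check can be illustrated with a crude count comparison (our own back-of-the-envelope stand-in, not the actual McCrary local-linear density estimator, which smooths a binned density on each side of the threshold):

```python
import numpy as np

def density_jump_z(scores, threshold, h=0.125):
    """Compare the number of contracts just below vs. just above the
    threshold; absent manipulation the split should be roughly even.
    This is only a rough stand-in for the McCrary (2008) test."""
    scores = np.asarray(scores)
    below = np.sum((scores >= threshold - h) & (scores < threshold))
    above = np.sum((scores >= threshold) & (scores <= threshold + h))
    return (above - below) / np.sqrt(above + below)

grid = np.linspace(2.0, 2.5, 501)                     # smooth score distribution
bunched = np.concatenate([grid, np.full(100, 2.26)])  # manipulation just above
print(f"{density_jump_z(grid, 2.25):.2f}")      # near zero: no bunching
print(f"{density_jump_z(bunched, 2.25):.2f}")   # large: bunching flagged
```

If contracts could manipulate their scores, we would expect scores to bunch just above the thresholds, producing a large statistic as in the second example.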
6.1 Choice of Bandwidth
The choice of bandwidth is a common area of concern in the RD literature (Imbens & Lemieux, 2008;
Lee & Lemieux, 2010). To assess the sensitivity of our results to the choice of bandwidth, we replicated
the local linear regression analysis from Tables 3 and 4 for alternative bandwidths ranging from 0.1 to
0.25 in increments of 0.005. The results for mean plan premiums at the contract level (Table 3) are
illustrated in Figure 1, where each graph presents the estimated star-rating coefficient, γ2, along with
the upper and lower 95% confidence bounds. Similar results for plan-level premium adjustments are
presented in Figure 2. In general, our results are consistent across a range of alternative bandwidths.
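The bandwidth sweep can be sketched as follows (again on simulated data with an assumed jump of $30; this is our own illustration, not the paper's estimation code):

```python
import numpy as np

rng = np.random.default_rng(2)
threshold = 3.25
score = rng.uniform(threshold - 0.25, threshold + 0.25, size=4000)
above = (score > threshold).astype(float)
r = score - threshold
premium = 10 + 30 * above + 5 * r + rng.normal(0, 4, size=score.size)

def gamma2_hat(h):
    """Re-estimate the RD jump (gamma2) using only contracts within h."""
    keep = np.abs(r) <= h
    X = np.column_stack([np.ones(keep.sum()), above[keep],
                         r[keep], above[keep] * r[keep]])
    return np.linalg.lstsq(X, premium[keep], rcond=None)[0][1]

# Sweep bandwidths from 0.10 to 0.25 in steps of 0.005, as in the text.
for h in np.arange(0.10, 0.2501, 0.005):
    print(f"h = {h:.3f}: gamma2_hat = {gamma2_hat(h):.2f}")
```

Stability of the estimated jump across the sweep, as in this simulated example, is the pattern the figures are meant to convey.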
Figure 1
6.2 Inclusion of Covariates
The RD literature generally advises against including covariates in a standard RD design (Imbens
& Lemieux, 2008; Lee & Lemieux, 2010). The intuition for this advice is as follows: if treatment
assignment is random within the relevant bandwidth, then the covariates should also be randomly
assigned to the treated and control groups. However, in our setting, purely randomized quality scores
at the contract level would not necessarily imply randomization in county-level variables. As such, we
argue that county-level covariates belong in our analysis in order to control for geographic variation
influencing contract location and plan offerings.
Nonetheless, we assess the sensitivity of our analysis to the exclusion of these covariates by estimating
a more traditional RD model with only the right-hand-side variables presented in equation (2). We estimate
the effect of a one-half star increase in quality ratings with a triangular kernel and our preferred
bandwidth of h = 0.125. The results, summarized in Table 8, are generally consistent with our initial
findings in Tables 3 and 4, where we again see large increases in average premiums among 3, 3.5, and
4-star contracts relative to contracts just below the respective star-rating thresholds. One exception is
the estimated effect on individual plan premiums for 4-star versus 3.5-star contracts presented in the
bottom right of Table 8. In this case, unlike the estimates in Table 4, we find no significant increase
in premiums among 4-star contracts along with a reduction in the magnitude of the estimated effect.
This is perhaps not surprising given the location of higher rated contracts throughout the country,
where 4-star contracts are more concentrated in specific geographic areas relative to lower star-rated
contracts.
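A minimal version of the triangular-kernel estimator (our sketch; weights $w = \max(0, 1 - |\tilde{R}_c|/h)$ applied through weighted least squares) looks like:

```python
import numpy as np

def triangular_rd(r_tilde, y, h):
    """RD jump estimate via weighted least squares with a triangular
    kernel, in the spirit of Imbens & Lemieux (2008). Observations
    closer to the threshold receive more weight."""
    w = np.clip(1 - np.abs(r_tilde) / h, 0, None)
    keep = w > 0
    X = np.column_stack([np.ones(keep.sum()),
                         (r_tilde[keep] > 0).astype(float),
                         r_tilde[keep],
                         (r_tilde[keep] > 0) * r_tilde[keep]])
    sw = np.sqrt(w[keep])                     # WLS via rescaled OLS
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
    return coef[1]

rng = np.random.default_rng(3)
r = rng.uniform(-0.125, 0.125, size=2000)
y = 10 + 30 * (r > 0) + 5 * r + rng.normal(0, 4, size=r.size)
print(f"{triangular_rd(r, y, h=0.125):.2f}")  # close to the true jump of 30
```

Relative to the unweighted regression, the kernel downweights observations far from the cutoff, which is why estimates can differ somewhat from the baseline OLS results.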
Table 8
6.3 Falsification Tests
Finally, it is possible that the observed jumps at threshold values of 2.25, 2.75, etc. are driven by
specific contracts that happen to fall above or below the threshold rather than by the star rating system itself.
As a test, we therefore considered a series of counter-factual threshold values above and below the
true threshold values. Intuitively, we should not see any jumps in premiums around these thresholds.
Figure 3 presents the results of this analysis for mean premiums at the contract/county level, where
we estimated the effects just as we did for Figure 1 and Table 3. The results support 2.75 and 3.25 as
the true threshold values, with the largest premium increases occurring just above those thresholds.
The results for 2.25 and 3.75 thresholds are less conclusive, with apparent jumps in premiums for what
should be irrelevant thresholds such as 1.9, 3.65, and 3.85.
Figure 3
7 Mechanisms for Premium Adjustment
Comparing our contract-level (Table 3) and plan-level (Table 4) analysis, we see larger premium
increases at the plan level for 3.5-star contracts and smaller increases at the plan level for 3-star
contracts. These results suggest that increases in average premiums at the contract level do not arise
solely from increases in premiums of the same plans from 2009 to 2010. Rather, the results suggest that
contracts also alter their plan mix from one year to the next (e.g., dropping plans within a contract,
introducing new plans under the same contract, or expanding plans to new counties).
Table 5 summarizes the exit behaviors from 2009 to 2010 by star rating, where we see low quality
plans were significantly more likely to exit their respective markets than plans associated with higher
star ratings. In particular, we see almost all 1.5-star plans leave the market from 2009 to 2010, with
very little exit among 4 and 4.5-star plans.17 Regarding plan entry, Table 5 shows that of the contracts
receiving a 1.5-star rating in 2009 that still operate in 2010, 37% of the underlying plans entered into a
new county in 2010. Similarly, 55% of 2-star plans (in 2009) entered into a new county in 2010, while
higher rated contracts were relatively less likely to enter into new markets. Collectively, the exit and
17 The 1.5-star contracts that stayed in the market from 2009 to 2010 also had a marginally higher star rating in 2010. As such, there are no 1.5-star contracts remaining in 2010 (see Table 1).
entry figures reflect larger turnover in plan offerings among lower rated contracts relative to higher
rated contracts. This is perhaps expected as higher rated contracts may be more deliberate in their
market entry/exit decisions and less likely to quickly cycle through new plans from one year to the
next.
Table 5
7.1 Analysis of Plan Exit
To examine plan exit more directly, we follow Bresnahan & Reiss (1991), Cawley et al. (2005), Abraham
et al. (2007), and others in assuming that an insurance company will only offer a plan in a given county
if the plan positively contributes to the contract’s profit. Assuming profit is additively separable across
geographic markets (counties), our observed plan choice indicator becomes:
y_{c(j)m} =
  \begin{cases}
    1 & \text{if } \pi_{c(j)m} = g(W_m, X_{c(j)m}) + \varepsilon_{c(j)m} \geq 0 \\
    0 & \text{if } \pi_{c(j)m} < 0
  \end{cases}
(4)
where Wm again denotes county-level demographics, Xc(j)m denotes contract and plan characteristics
(including the contract’s 2009 quality, qc, plan premium, Part D participation, etc.), and εc(j)m is an
error term independently distributed across markets and plans.
We adopt a reduced form, linear profit specification with covariates including the benchmark CMS
payment rates, 2009 contract quality (qc), the plan’s enrollments in 2009, the number of physicians
in the county, the average Medicare FFS cost per beneficiary in the county, and plan characteristics
such as premiums, whether the plan offers prescription drug coverage, and indicators for HMO or PPO
plan type. Within this specification, we also consider the RD design from equation 2. We estimate
equation 4 with a linear probability model where yc(j)m = 1 indicates that the contract continued to
offer the plan in 2010 and yc(j)m = 0 indicates the plan was dropped. By definition, this analysis is
based on existing plans as of 2009.
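A minimal sketch of this estimation on simulated data: the latent-profit rule in equation 4 generates a binary keep/drop indicator, which is then regressed on covariates plus the RD terms by least squares. All variable names and parameter values below are illustrative stand-ins, not the CMS data fields or our estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4_000

# Illustrative plan-level data near the 2.75 cutoff (not CMS field names)
summary_score = rng.uniform(2.625, 2.875, n)
premium = rng.uniform(0, 100, n)
part_d = rng.integers(0, 2, n).astype(float)

s = summary_score - 2.75            # running variable, centered at the cutoff
above = (s >= 0).astype(float)      # one-half star higher rating

# Latent profit (equation 4 as a threshold-crossing rule) -> plan kept in 2010
latent = 0.5 + 0.15 * above + 0.002 * premium + 0.1 * part_d \
         + 2.0 * s + rng.normal(0, 0.3, n)
kept = (latent >= 0.6).astype(float)

# Linear probability model with RD terms: the coefficient on `above`
# is the estimated discontinuity in the probability of keeping a plan
X = np.column_stack([np.ones(n), s, above, s * above, premium, part_d])
beta, *_ = np.linalg.lstsq(X, kept, rcond=None)
rd_effect = beta[2]
```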
The results of our RD analysis of plan exit are summarized in Table 6. The top panel presents
results for all plans, while the remaining panels present results for plans with $0 premiums and plans
with positive premiums, respectively. Overall, we see that 2.5-star contracts are significantly less likely
to exit markets than 2-star contracts of similar overall quality. Relative to 2.5-star contracts, 3-star
contracts show no significant differences in exit behaviors, but they are significantly more likely to
drop their $0 premium plans and less likely to drop positive premium plans. Somewhat surprisingly,
contracts receiving a 3.5-star rating are more likely to drop plans overall; however, from the middle
panel of Table 6, we see that this result is entirely driven by 3.5-star contracts dropping their $0
premium plans. Finally, 4-star contracts are significantly less likely to exit overall, particularly for
their positive premium plans.18
Table 6
7.2 Analysis of Plan Entry
An important and relatively unique aspect of the MA market concerns the distinction between plan
and contract-level decisions. Specifically, contracts must obtain CMS approval in order to be offered
in a given county; however, conditional on receiving CMS approval, the decision of which plan(s) to
offer in a county is relatively less regulated. As a result, we argue that the fixed costs of entry are
primarily incurred at the contract level while the plan-level entry/exit decisions are based on the vari-
able profits per enrollee (i.e., regardless of market share). With regard to plan entry, this unique CMS
approval process alleviates many of the traditional econometric issues surrounding multiple equilibria
or endogeneity of other players’ actions in models of market entry with incomplete information (Berry
& Reiss, 2007; Bajari et al., 2010; Su, 2012). Conditional on plan characteristics, our entry analysis
therefore need only consider variable cost shifters and should be largely independent of the number or
type of competing plans in the county.19
The full set of plans available to a contract in a given market m is identified by taking all plans
offered under that contract across the entire state in the same year. All such plans are therefore
considered “eligible” to be operated in any given county, and the contract must choose which of those
plans to offer in each county, where yc(j)m = 1 indicates that the plan was added to the county (under
that contract) in 2010, and yc(j)m = 0 indicates that the plan was not offered. As with our analysis
of plan exit, we estimate the entry-equivalent to equation 4 using a standard linear probability model,
with entry considered as a function of 2010 county and plan characteristics as well as 2009 contract
quality as in equation 2.
Table 7 summarizes the results of our RD analysis for plan entry. Note that these results only
apply to markets in which the contracts previously operated (i.e., we do not consider the contract-level
18 The robustness of our plan exit results to bandwidth selection is summarized in Appendix D. The overall results (top panel of Table 6) at the 2.75 threshold appear relatively sensitive to bandwidth selection, with the statistical significance, magnitude, and sign of the point estimates changing within bandwidths from 0.1 to 0.2. In terms of hypothesis testing, we interpret this as evidence in favor of the null that the star rating has no effect on plan exit at the 2.75 threshold. As such, the qualitative findings from our point estimates in Table 6 are unchanged.
19 Results are robust when we weaken this assumption and allow predicted 2010 market shares to influence entry behaviors. The results are excluded for brevity but available upon request.
entry decisions and instead focus specifically on the plan-level entry of pre-existing contracts). The RD
results indicate that a one-half star improvement for 3 or 3.5-star contracts makes them significantly
more likely to expand their plans into new markets. The bottom panels of Table 7 further reveal that
the increase in probability of plan entry occurs for the positive premium plans, with 3.5-star contracts
significantly less likely to enter new markets with their $0 premium plans.20
Table 7
8 Welfare Effects
To examine the welfare effects of our estimated premium increases in Section 5, we follow Town &
Liu (2003) and Maruyama (2011) in estimating a standard Berry-type model of plan choice based on
market-level data (Berry, 1994). Specifically, let the utility of individual i from selecting Medicare
option c(j) in market area m be given as
U_{ic(j)m} = \delta_{c(j)m} + \xi_{c(j)m} + \zeta_{ig} + (1 - \sigma)\varepsilon_{ic(j)m},  (5)
where δc(j)m and ξc(j)m represent the mean level of utility derived from observed and unobserved
contract-plan-market area characteristics, respectively. We include in δc(j)m observed characteristics at
the contract and plan level, including premiums, plan type (HMO, PPO, or PFFS), and the underlying
summary score of the contract. Similar to Town & Liu (2003), we partition the set of Medicare options
into two groups: 1) MA plans that offer prescription drug coverage (MA-PD plans); and 2) MA plans
that do not offer prescription drug coverage (MA-Only). Traditional Medicare FFS is taken as our
outside option.
In addition to the i.i.d. extreme value error εic(j)m, individual preferences are allowed to vary
through group dummies ζig. This nested logit structure relaxes the independence of irrelevant al-
ternatives assumption and allows for differential substitution patterns between nests. The nesting
parameter, σ, captures the within-group correlation of utility levels.
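Under this nesting structure, choice probabilities take the standard two-level closed form: the share of an option is its within-nest share times the nest's share. A minimal sketch, treating the outside option as its own nest and folding the unobserved component ξ into the mean utility (function name and example values are ours):

```python
import math

def nested_logit_shares(delta, nests, sigma):
    """Nested logit choice probabilities (sketch of equation 5):
    P(j) = P(j | nest g) * P(g), with nesting parameter sigma.
    `delta` are mean utilities; pass the outside option as its own
    nest with delta = 0."""
    scale = 1.0 - sigma
    iv = {}                                   # inclusive value per nest
    for d, g in zip(delta, nests):
        iv[g] = iv.get(g, 0.0) + math.exp(d / scale)
    denom = sum(v ** scale for v in iv.values())
    return [
        (math.exp(d / scale) / iv[g]) * (iv[g] ** scale / denom)
        for d, g in zip(delta, nests)
    ]

# Two MA-PD plans, one MA-Only plan, and traditional FFS as the outside option
shares = nested_logit_shares(
    delta=[1.2, 0.8, 0.4, 0.0],
    nests=["MA-PD", "MA-PD", "MA-Only", "FFS"],
    sigma=0.5,
)
```

With σ = 0 the formula collapses to a plain multinomial logit, so the nesting parameter directly indexes the departure from independence of irrelevant alternatives.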
Following Berry (1994) and others, the parameters in equation 5 can be estimated using market-
level data on the relative share of MA plans. Specifically, our estimation equation is as follows:
\ln(S_{c(j)m}) - \ln(S_{0m}) = x_{c(j)m}\beta - \alpha F_{c(j)} + \sigma \ln(S_{c(j)m|g}) + \xi_{c(j)m},  (6)
20The robustness of our plan entry results to bandwidth selection is summarized in Appendix D.
where xc(j)m denotes observed plan/contract characteristics, and ξc(j)m denotes the mean utility de-
rived from unobserved plan characteristics. We estimate the parameters of equation 6 using two-stage
least squares (2SLS) due to the endogeneity of within-group shares, Sc(j)m|g, and plan premiums, Fc(j).
We take as instruments the number of contracts operating in a county, the number of hospitals in a
county, the Herfindahl-Hirschman Index (HHI) for hospitals in a county (based on discharges), and
the number of physicians in the county. The results of this regression are presented in Appendix D.
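The 2SLS estimator for equation 6 can be sketched as follows. For brevity the example instruments a single endogenous regressor (the premium) rather than both the premium and the within-group share; the data are simulated and all coefficient values are illustrative, not our estimates.

```python
import numpy as np

def two_sls(y, X_exog, X_endog, Z_excl):
    """Generic 2SLS sketch: regress the structural regressors on the full
    instrument set, then run OLS of y on the fitted (projected) regressors."""
    Z = np.column_stack([X_exog, Z_excl])    # exogenous regressors + excluded IVs
    X = np.column_stack([X_exog, X_endog])
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first stage
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)   # second stage
    return beta

# Simulated market data with an endogenous premium (all values illustrative)
rng = np.random.default_rng(2)
n = 2_000
quality = rng.normal(0, 1, n)                 # exogenous plan characteristic
z = rng.normal(0, 1, (n, 2))                  # excluded instruments
xi = rng.normal(0, 1, n)                      # unobserved quality
premium = 1.0 + z @ np.array([0.8, -0.5]) + 0.7 * xi + rng.normal(0, 1, n)
y = 0.5 * quality - 1.5 * premium + xi        # log share difference (eq. 6 form)

beta = two_sls(y, np.column_stack([np.ones(n), quality]), premium[:, None], z)
alpha_hat = -beta[2]                          # premium sensitivity, approx. 1.5
```

OLS on these data would be biased toward zero because the premium is correlated with ξ; the excluded instruments restore consistency.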
With estimates of the mean observed utility, δc(j)m, and the within-group correlation, σ, estimated
monthly consumer surplus for a representative MA beneficiary is then derived as follows (Manski &
McFadden, 1981; Town & Liu, 2003; Maruyama, 2011):
W_i = \frac{1}{\alpha(1-\sigma)} \ln\left[ \sum_{j \in J_m} \exp\left( \frac{\delta_{c(j)m} + \xi_{c(j)m}}{1-\sigma} \right) \right].  (7)
Our results yield an estimated $120 reduction in yearly consumer surplus per beneficiary for every
$10 increase in premiums (all else equal). In 2010, there were approximately 1,080,000 beneficiaries
enrolled in a 3, 3.5, or 4-star MA plan with a summary score just above the relevant threshold value.
Assuming a $20 increase in premiums from 2009 to 2010 (the smallest estimated effect in Tables 3 and
4), this yields a total reduction in consumer surplus of approximately $259 million.
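The aggregate figure follows from simple arithmetic over the numbers reported above:

```python
# Back-of-envelope check of the aggregate welfare number in the text
loss_per_10 = 120                  # yearly surplus loss per $10 premium increase
premium_increase = 20              # smallest estimated effect (Tables 3 and 4)
beneficiaries = 1_080_000          # enrollees just above the thresholds in 2010

loss_per_beneficiary = loss_per_10 * (premium_increase / 10)  # $240 per year
total_loss = loss_per_beneficiary * beneficiaries             # $259.2 million
```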
9 Discussion
The potential supply-side response of MA contracts to the CMS quality rating system is critical both
from a policy perspective as well as a consumer welfare perspective. If contracts can take advantage
of improved quality scores by increasing premiums (holding the contract’s true quality constant), then
this suggests a lack of competitiveness in the MA market with contracts raising prices without any
true improvement in quality. Building on the initial results of Darden & McCarthy (forthcoming),
the current paper finds strong evidence of such premium increases among average to above average
star-rated contracts.
Based on the results in Section 5 and the range of sensitivity analyses in Section 6, we conclude
that the increases in premiums for 3-star versus 2.5-star contracts (the 2.75 threshold) as well as 3.5-
star versus 3-star contracts (the 3.25 threshold) are not due to chance but are instead reflective of a
true increase in premiums following an increase in reported quality. Meanwhile, we find no consistent
changes in premiums for 2.5 relative to 2-star contracts. We find some initial evidence for increases
in premiums among 4-star contracts relative to 3.5-star contracts; however, this finding is sensitive to
bandwidth specification, and the effect does not persist in our falsification tests. Plan-level results for
4-star rated contracts are also sensitive to the inclusion of market-level covariates.
There are likely several reasons for a contract to increase 2010 premiums in response to its prior-
year quality ratings. One natural reason is pure rent extraction: contracts may seek to capitalize on
their high reported quality by charging higher prices to their existing customers. However, contracts
may also increase premiums in order to curb adverse selection. In this case, contracts of
higher reported quality but comparable true quality may want to price out certain customers from
the market, particularly if sicker beneficiaries are more likely to make decisions based in part on
the quality ratings. With market-level data, we cannot empirically distinguish these two effects.
Nonetheless, our results generally suggest that the perceived benefits of the star rating
program in terms of beneficiary decision-making are at least partially offset by the supply-side response
of higher premiums.
References
Abraham, Jean, Gaynor, Martin, & Vogt, William B. 2007. Entry and Competition in Local Hospital
Markets. The Journal of Industrial Economics, 55(2), 265–288.
Bajari, Patrick, Hong, Han, Krainer, John, & Nekipelov, Denis. 2010. Estimating static models of
strategic interactions. Journal of Business & Economic Statistics, 28(4).
Berry, Steven, & Reiss, Peter. 2007. Empirical models of entry and market structure. In: Armstrong,
M., & Porter, R. (eds), Handbook of industrial organization, vol. 3. Amsterdam: Elsevier.
Berry, Steven T. 1994. Estimating discrete-choice models of product differentiation. The RAND
Journal of Economics, 242–262.
Bresnahan, Timothy F, & Reiss, Peter C. 1991. Entry and competition in concentrated markets.
Journal of Political Economy, 977–1009.
Cawley, John, Chernew, Michael, & McLaughlin, Catherine. 2005. HMO participation in Medicare+
Choice. Journal of Economics & Management Strategy, 14(3), 543–574.
Dafny, L., & Dranove, D. 2008. Do report cards tell consumers anything they don't already know?
The case of Medicare HMOs. The RAND Journal of Economics, 39(3), 790–821.
Darden, M., & McCarthy, I. forthcoming. The Star Treatment: Estimating the Impact of Star Ratings
on Medicare Advantage Enrollments. Journal of Human Resources.
Frakt, Austin B, Pizer, Steven D, & Feldman, Roger. 2012. The Effects of Market Structure and
Payment Rate on the Entry of Private Health Plans into the Medicare Market. Inquiry, 49(1),
15–36.
Imbens, G.W., & Lemieux, T. 2008. Regression discontinuity designs: A guide to practice. Journal of
Econometrics, 142(2), 615–635.
Lee, David S, & Lemieux, Thomas. 2010. Regression Discontinuity Designs in Economics. Journal of
Economic Literature, 48, 281–355.
Manski, Charles F, & McFadden, Daniel. 1981. Structural Analysis of Discrete Data with Econometric
Applications. MIT Press, Cambridge, MA.
Maruyama, Shiko. 2011. Socially optimal subsidies for entry: The case of Medicare payments to
HMOs. International Economic Review, 52(1), 105–129.
McCrary, Justin. 2008. Manipulation of the running variable in the regression discontinuity design: A
density test. Journal of Econometrics, 142(2), 698–714.
Pauly, Mark, Harrington, Scott, & Leive, Adam. 2014. ‘Sticker Shock’ in Individual Insurance under
Health Reform. Tech. rept. National Bureau of Economic Research.
Reid, Rachel O, Deb, Partha, Howell, Benjamin L, & Shrank, William H. 2013. Association Between
Medicare Advantage Plan Star Ratings and Enrollment. JAMA, 309(3), 267–274.
Stockley, Karen, McGuire, Thomas, Afendulis, Christopher, & Chernew, Michael E. 2014. Premium
Transparency in the Medicare Advantage Market: Implications for Premiums, Benefits, and Effi-
ciency. Tech. rept. National Bureau of Economic Research.
Su, Che-Lin. 2012. Estimating discrete-choice games of incomplete information: Simple static exam-
ples. Quantitative Marketing and Economics, 1–41.
Town, Robert, & Liu, Su. 2003. The welfare impact of Medicare HMOs. RAND Journal of Economics,
719–736.
A Appendix A: Star Rating Metrics
The star rating system consists of five domains, with the names of each domain, the underlying metrics
in each domain, and the data sources for each metric changing over the years. The metrics and relevant
domains for 2009 are listed in Table 9.
Table 9
B Appendix B: Star Rating Calculations
Although the domains and individual metrics changed from year to year, the way in which overall star
ratings were calculated was consistent across years. The calculation proceeds in five steps, as described
in more detail in the CMS technical notes for the 2009, 2010, and 2011 star ratings:
1. Raw summary scores for each individual metric are calculated as per the definition of the metric
in question. As discussed in the text, these scores are derived from a variety of different datasets
including HEDIS, CAHPS, HOS, and others. The resulting summary scores are observed in our
dataset.
2. The summary scores in each metric are translated into a star rating. For most measures, the
star rating is assigned based on percentile rank; however, CMS makes additional adjustments in
cases where the distribution of scores is skewed high or low. Scores derived from CAHPS have
a more complicated star calculation, based on the percentile ranking combined with whether or
not the score is significantly different from the national average. The resulting stars for each
individual metric are observed in our dataset.
3. The star values from each metric are averaged within each respective domain to form domain-
level stars, provided a minimum number of metric-level scores are available for each domain.
For example, in 2009 and 2010, a domain-level star was only calculated if the contract had a star
value for at least 6 of the 12 individual measures. The domain-level star ratings are observed in
our dataset.
4. Overall Part C summary scores are then calculated by averaging the domain-level star ratings
and adding an integration factor (i-Factor). The i-Factor is intended to reward consistency in a
plan’s quality across domains, and is calculated as follows:
(a) Derive the mean and variance of all individual metric summary scores for each contract.
(b) Form the distribution of the mean and variance across contracts.
(c) Assign an i-Factor of 0.4 for low variance (below 30th percentile) and high mean (above 85th
percentile), 0.3 for medium variance (30th to 70th percentile) and high mean, 0.2 for low
variance and relatively high mean (65th to 85th percentile), and 0.1 for medium variance
and relatively high mean. All other contracts are assigned an i-Factor of 0.
5. Overall Part C star ratings are then calculated by rounding the overall summary score to the
nearest half-star value.
We do not observe the i-Factors in the data. We therefore replicated the CMS methodology, ultimately
matching the overall star ratings for 98.8% and 98.5% of the plans in 2009 and 2010, respectively. As
discussed in the text, plans for which we were unable to replicate star ratings were dropped from the
analysis. Note also that star ratings are based on data from at least the previous calendar year and
sometimes further back depending on ease of access from CMS. New plans therefore do not have a star
rating available, nor was a star rating for such plans provided to beneficiaries.
Tables 10 and 11 present example calculations of the overall summary score and resulting star
values for 5 contracts in 2009. The tables list the summary scores for the individual metrics along
with the corresponding star values, each of which is observed in the raw data. The high mean and
low mean thresholds for the i-Factor calculations were 3.6667 and 3.2381, respectively.
Similarly, the high variance and low variance thresholds were 1.3462 and 1.0362, respectively.
Tables 10 and 11
The calculations for each contract in Table 10 are discussed individually below:
1. Contract H0150: With a mean star value of 2.583 and a variance of 0.879, the contract received
an i-Factor of 0 (due to a low mean), which provided an overall summary score of 2.583 and a
star rating of 2.5.
2. Contract H0151: With a mean star value of 2.667 and a variance of 0.8, the contract received an
i-Factor of 0 (again from a low mean), which provided an overall summary score of 2.667 and a
star rating of 2.5, just 0.083 points away from receiving a 3-star rating.
3. Contract H1558: With a mean star value of 3.967 and a variance of 1.275, the contract received
an i-Factor of 0.3 (high mean and medium variance), which provided an overall summary score
of 4.267, just 0.0167 above the 4.25 threshold required to round up to a 4.5-star rating.
4. Contract H0755: With a mean star value of 3.5278 and a variance of 1.285, the contract re-
ceived an i-Factor of 0.1 (relatively high mean and medium variance), which provided an overall
summary score of 3.6278 and a star rating of 3.5.
5. Contract H1230: With a mean star value of 3.694 and a variance of 1.018, the contract received
an i-Factor of 0.4 (high mean and low variance), which provided an overall summary score of
4.094 and a star rating of 4.0.
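As a check on the worked examples above, steps 4 and 5 of the calculation can be sketched in code. The function names are ours, the percentile cutoffs are the 2009 values reported above, and we assume scores round to the nearest half star exactly as in the examples.

```python
def i_factor(mean, var, lo_var, hi_var, rel_hi_mean, hi_mean):
    """Step 4c: assign the integration factor from a contract's mean and
    variance of metric-level stars, given cross-contract percentile cutoffs."""
    low_var = var < lo_var                         # below 30th percentile
    med_var = lo_var <= var <= hi_var              # 30th to 70th percentile
    high_mean = mean > hi_mean                     # above 85th percentile
    rel_high_mean = rel_hi_mean < mean <= hi_mean  # 65th to 85th percentile
    if low_var and high_mean:
        return 0.4
    if med_var and high_mean:
        return 0.3
    if low_var and rel_high_mean:
        return 0.2
    if med_var and rel_high_mean:
        return 0.1
    return 0.0

def overall_star(mean, var, **cutoffs):
    """Steps 4-5: add the i-Factor, then round to the nearest half star."""
    return round((mean + i_factor(mean, var, **cutoffs)) * 2) / 2

# 2009 cutoffs reported above
cut = dict(lo_var=1.0362, hi_var=1.3462, rel_hi_mean=3.2381, hi_mean=3.6667)

# Contract H1558: i-Factor 0.3 pushes 3.967 to 4.267, which rounds to 4.5
star_h1558 = overall_star(3.967, 1.275, **cut)
```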
C Appendix C: Data
Our analysis merges publicly available data from several sources. As our starting point, we merge
together enrollment and contract information by month/year/contract id/plan id for all MA
contract/plans from June 2008 through December 2011.21 For a small number of
counties, CMS reports enrollment counts at the Social Security Administration (SSA) level.22 For
these observations, we aggregate enrollment to the county level, and, after limiting our focus to
HMO, PPO, and PFFS type contracts, we have a dataset of 50,269,123 observations at the contract
id/plan/county/month/year level.
The enrollment files alone cannot provide a valid census of the MA contracts operating in a given
market (county) because of beneficiary migration. For example, if contract A is approved to operate
in Orange County, North Carolina, and an enrollee in contract A moves to Miami-Dade County,
Florida, the enrollment files will report positive enrollment in contract A in Miami-Dade County
regardless of whether contract A is approved to operate there. To overcome this problem, CMS provides separate
service area files that list all contracts that are approved to operate in a given county.23 In addition to
the CMS service files, we merge our enrollment dataset to quality star data at the contract/year level24;
CMS contract/plan premium data25; Medicare Advantage market share data at the county/contract
id level26; and county-level census data from the American Community Survey for 2006-2010 in wide
format.
Given the size of the resulting data, we proceed in cleaning the data for 2009 and 2010 separately. In
what follows, we document our cleaning of the 2009 data, with the corresponding 2010 figures in parentheses. Our 2009 (2010)
data contain 19,290,326 (13,427,779) contract id/plan id/county/month observations. We begin by
dropping the 331,272 (204,355) observations from U.S. Territories and Outlying Areas. Next, we drop
all contract/plans that are specific to an employer or union-only group (these are also known as the
“800-series plans”). While the decision to eliminate these plans reduces our sample by 17,051,609
(11,988,547) observations, these contract/plans are not available to the public and are not our primary
21 CMS records enrollment data in separate files from contract characteristic information. Data are available at http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/MCRAdvPartDEnrolData/Monthly-Enrollment-by-Contract-Plan-State-County.html.
22 The contract characteristic files contain a small number of duplicate observations, which we drop.
23 Data are available at http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/MCRAdvPartDEnrolData/MA-Contract-Service-Area-by-State-County.html. For the few counties that are subdivided by SSA, we aggregate to the county level.
24 Contract-level quality data available at http://www.cms.gov/Medicare/Prescription-Drug-Coverage/PrescriptionDrugCovGenIn/PerformanceData.html.
25 Data on plan premiums available at http://www.cms.gov/Medicare/Prescription-Drug-Coverage/PrescriptionDrugCovGenIn/index.html?redirect=/PrescriptionDrugCovGenIn/. County names and FIPS codes available at http://www.census.gov/popest/about/geo/codes.html.
26 MA penetration data available at http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/MCRAdvPartDEnrolData/MA-State-County-Penetration.html.
focus. Next, we drop the 231,655 (159,439) observations of special needs plans. Finally, we drop the
observations that did not merge perfectly between the CMS enrollment files and the service area files.
These reflect either contracts with positive enrollment in a month/year/county that were not approved
to operate in that county (due to migration) or contracts that were approved to operate in a county
but had no corresponding enrollment record. Our final 2009 (2010) sample consists of 1,422,887 (841,790)
contract id/plan id/county/month observations. We also collect hospital discharge data from the annual Hospital
Cost Reporting Information System (HCRIS) as well as CMS benchmark rates and average FFS costs
by county.27
27 Data are available at http://www.cms.gov/Research-Statistics-Data-and-Systems/Files-for-Order/CostReports/Cost-Reports-by-Fiscal-Year.html.
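The merge logic that resolves the migration problem can be sketched as follows; the record layout and identifiers are illustrative stand-ins for the CMS enrollment and service area files:

```python
# Illustrative records only; real CMS files carry many more fields
enrollment = [
    # (contract_id, county_fips, enrollees)
    ("H0001", "37135", 450),   # Orange County, NC - approved
    ("H0001", "12086", 12),    # Miami-Dade, FL - migration record, not approved
    ("H0002", "12086", 230),   # Miami-Dade, FL - approved
]

# Approved contract-county pairs from the CMS service area files
approved = {("H0001", "37135"), ("H0002", "12086")}

# A valid census keeps only enrollment records whose contract is approved
# to operate in that county, discarding migration-driven records
census = [r for r in enrollment if (r[0], r[1]) in approved]
```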
D Appendix D: Additional Analyses
D.1 Robustness Checks
Figure 2 illustrates the sensitivity of the plan-level RD analysis to our bandwidth selection. As should
be the case, the figure closely follows that of the contract-level analysis from Figure 1. Generally, Figure
2 suggests that the findings from the point estimates in Table 4 are relatively persistent across alter-
native bandwidths (provided the bandwidths are sufficiently narrow and include a sufficient number
of contracts).
Figure 2
Figures 4 and 5 present similar graphs for the analysis of plan exit and plan entry, respectively.
The figures generally support the robustness of the point estimates in Tables 6 and 7 to our bandwidth
selection. Our analysis of plan exit and entry at the 2.75 threshold (2.5 versus 3-star contracts) is one
possible exception, with the statistical significance, magnitude, and sign of the point estimates changing
within bandwidths from 0.1 to 0.2. In terms of hypothesis testing, we interpret this as evidence in
favor of the null that the star rating has no effect on plan exit or entry at the 2.75 threshold. As
such, the qualitative findings from our point estimates in Table 6 are unchanged, while the overall
findings from our analysis of plan entry (top panel of Table 7) are less definitive among 3.0 relative
to 2.5-star rated contracts.
Figures 4 and 5
D.2 Welfare Analysis
The results of estimating equation 6 with OLS and 2SLS are presented in Table 12 along with the
first-stage results for the 2SLS estimator.
E Tables and Figures
Table 1: Summary Statistics

                                       2009                2010
                                       Mean (S.D.)         Mean (S.D.)
Plan-level Data, n = 51,442 and 34,642
  Enrollment^a                         291.55 (1,413)      361.17 (1,600)
  Overall Share, %                     1.18                1.26
  Within-nest Share, %                 28.87               31.07
  Premium                              37.69 (42.23)       53.27 (52.97)
  Drug Coverage, %                     58.58               64.39
  HMO, %                               16.32               24.12
  PPO, %                               18.53               33.71
Market Characteristics, n = 3,139 and 3,094
  MA Penetration                       15.59 (11.03)       16.50 (12.12)
  Mean Number of Plans                 37.38 (22.31)       26.61 (17.58)
  Population > 65 in 1,000s            12.22 (34.90)       12.59 (35.74)
  Population > 85 in 1,000s            1.72 (5.11)         1.79 (5.34)
  Unemployed, %                        5.79                9.01
  White, %                             86.30               86.41
  Black, %                             9.11                9.18
  Female, %                            50.16               50.17
  College Graduates, %                 18.68               18.62
  South, %                             42.02               42.63
Contract-level Star Ratings, %, n = 252 and 295
  1.5                                  1.98                0.00
  2.0                                  9.92                4.07
  2.5                                  24.21               24.41
  3.0                                  28.97               29.83
  3.5                                  21.43               20.67
  4.0                                  11.11               12.20
  4.5                                  2.38                7.78
  5.0                                  0.00                1.02

^a Enrollment data available for 20,768 plans in 2009 and 17,334 plans in 2010. Remaining plans have 10 or fewer enrollments and specific enrollments are therefore not provided by CMS.
Table 2: OLS Results for Average Characteristics^a

Star Indicator            2.5         3.0         3.5         4.0
y = Average Premium
  γ2                     -5.18***     6.74***     6.15***    -8.84***
                         (1.55)      (1.39)      (1.54)      (2.52)
  N                       4,303       4,182       2,672       1,213
  R2                      0.52        0.66        0.71        0.75
y = Proportion of $0 Premium Plans
  γ2                      0.17***    -0.13***     0.03***    -0.03**
                         (0.02)      (0.01)      (0.01)      (0.01)
  N                       4,303       4,182       2,672       1,213
  R2                      0.36        0.70        0.75        0.63

^a OLS regression of the 2010 mean characteristics on the relevant 2009 mean characteristic and star ratings. Regressions estimated separately for each star rating, with γ2 denoting the estimated effect of a one-half star increase in quality ratings. Contract-level averages are based on all plans with more than 10 enrollments. Standard errors in parentheses are robust to clustering at the county level. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as contract-level variables including the number of counties in which the contract operated in 2009, whether the contract operates as an HMO or PPO, and the total number of enrollees under the contract in 2009. * p<0.1. ** p<0.05. *** p<0.01.
Table 3: RD Results for Average Characteristics^a

Star Threshold            2.25        2.75        3.25        3.75
y = Average Premium
  γ2                      4.81       33.60***    29.30***    31.85***
                         (4.27)      (7.27)      (6.12)      (6.38)
  N                       2,029       982         432         309
  R2                      0.39        0.72        0.69        0.92
y = Proportion of $0 Premium Plans
  γ2                     -0.14*      -0.16**      0.02       -0.13*
                         (0.08)      (0.06)      (0.04)      (0.07)
  N                       2,029       982         432         309
  R2                      0.21        0.90        0.72        0.55

^a Results based on OLS regressions with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Regressions estimated at the contract level, with dependent variables measured as the average value of each plan characteristic by contract (excluding plans with 10 or fewer enrollments). Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as contract-level variables including the number of counties in which the contract operated in 2009, whether the contract operates as an HMO or PPO, and the total number of enrollees under the contract in 2009. * p<0.1. ** p<0.05. *** p<0.01.
Table 4: RD Results for Plan-level Characteristics^a

Star Threshold            2.25        2.75        3.25        3.75
y = 2010 premium
  γ2                      5.00**     19.40***    41.99***    31.52***
                         (2.10)      (3.93)      (5.17)      (5.10)
  N                       4,912       6,894       1,024       1,082
  R2                      0.63        0.76        0.83        0.94
y = Indicator for $0 premium plan in 2010
  γ2                      0.04       -0.32***     0.02       -0.15***
                         (0.04)      (0.05)      (0.03)      (0.05)
  N                       4,912       6,894       1,024       1,082
  R2                      0.24        0.89        0.51        0.59

^a Results based on OLS regressions with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Regressions estimated at the plan level for all plans in the dataset. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as the plan's total number of enrollees in 2009 (set to 0 if missing), an indicator variable for missing number of enrollees (<10 enrollees in the plan), an indicator for HMO or PPO plan type, and the lagged dependent variable. * p<0.1. ** p<0.05. *** p<0.01.
Table 5: Summary of Plan Exit and Entry^a

2009 Rating      Exit (%)    Entry (%)
1.5 Star         99.49       36.51
2.0 Star         51.40       55.16
2.5 Star         53.58       52.79
3.0 Star         29.37       23.91
3.5 Star         25.97       17.20
4.0 Star          8.25       32.45
4.5 Star          8.24        7.72
All              49.77       38.20

^a Exit defined as the same plan-county-contract observation in 2009 no longer active in 2010.
Table 6: RD Results for Plan Exit^a

Star Threshold            2.25       2.75       3.25       3.75

Overall Results
γ2                        -0.83***   -0.07      0.12**     -0.25***
                          (0.06)     (0.09)     (0.06)     (0.06)
N                         10,791     9,806      1,177      1,435

Among Plans with Premiums = $0
γ2                        -0.84***   0.25**     1.07***    -0.07
                          (0.06)     (0.11)     (0.30)     (0.05)
N                         9,110      613        140        281

Among Plans with Premiums > $0
γ2                        -1.37***   -0.82***   0.04       -0.36***
                          (0.13)     (0.12)     (0.05)     (0.07)
N                         1,681      9,193      1,037      1,154
^a Results based on linear probability model with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, and the number of MA plans and contracts in the county, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as 2009 plan characteristics and enrollment. * p<0.1. ** p<0.05. *** p<0.01.
Table 7: RD Results for Plan Entry^a

Star Threshold            2.25       2.75       3.25       3.75

Overall Results
γ2                        0.06       -0.23***   0.18***    0.30***
                          (0.12)     (0.07)     (0.06)     (0.06)
N                         6,352      2,453      1,252      852

Among Plans with Premiums = $0
γ2                        -0.76***   -0.02      -1.80**    0.65***
                          (0.08)     (0.09)     (0.75)     (0.12)
N                         3,360      793        171        331

Among Plans with Premiums > $0
γ2                        2.34***    -1.28***   0.22***    0.20***
                          (0.16)     (0.19)     (0.07)     (0.06)
N                         2,992      1,660      1,081      521
^a Results based on linear probability model with RD approach and a bandwidth of h = 0.125. Robust standard errors in parentheses, clustered at the county level. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. Additional controls not in the table include county-level variables on the population over 65, population over 85, unemployment rate, percent white, percent black, percent female, regional dummy (south), percent graduating college, the CMS benchmark payment rate and average FFS cost, and number of physicians in the county, as well as plan characteristics (premium, Part D participation, and HMO versus PPO). * p<0.1. ** p<0.05. *** p<0.01.
Table 8: RD Results for Premiums without Covariates^a

Star Threshold            2.25       2.75       3.25       3.75

y = Mean Contract Premiums
γ2                        12.82      16.25***   28.58***   26.97***
                          (3.26)     (4.53)     (5.09)     (12.66)
N                         2,029      982        432        309

y = Individual Plan Premiums
γ2                        -4.34***   10.88***   31.27***   8.36
                          (1.59)     (2.31)     (3.42)     (7.23)
N                         4,912      6,894      1,024      1,082
^a Results based on RD with triangular kernel and a bandwidth of h = 0.125. Results were excluded for the 1.5 and 4.5 star ratings due to an insufficient number of contracts on the lower and upper ends of the 1.75 and 4.25 thresholds, respectively. * p<0.1. ** p<0.05. *** p<0.01.
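The estimates above rely on a local-linear RD estimator with a triangular kernel and bandwidth h = 0.125 around each threshold. As a rough illustration of that estimator (the paper's exact specification appears in the main text; all variable names and the simulated data below are ours, not the paper's), the jump coefficient γ2 can be recovered by weighted least squares on either side of the cutoff:

```python
import numpy as np

def rd_local_linear(score, y, cutoff, h=0.125):
    """Local-linear RD via weighted least squares with a triangular kernel.

    Returns the estimated jump in E[y | score] at the cutoff (gamma_2).
    Illustrative sketch only, not the paper's estimation code.
    """
    x = score - cutoff                       # centered running variable
    w = np.maximum(0, 1 - np.abs(x) / h)     # triangular kernel weights
    keep = w > 0
    x, y, w = x[keep], y[keep], w[keep]
    d = (x >= 0).astype(float)               # above-threshold indicator
    # Regressors: intercept, jump indicator, and a slope on each side
    X = np.column_stack([np.ones_like(x), d, x, d * x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[1]                           # coefficient on the indicator

# Simulated example: premiums jump by $20 at a 2.75 summary-score threshold
rng = np.random.default_rng(0)
score = rng.uniform(2.5, 3.0, 5000)
premium = 30 + 10 * (score - 2.75) + 20 * (score >= 2.75) + rng.normal(0, 5, 5000)
print(rd_local_linear(score, premium, cutoff=2.75))  # close to 20
```

Observations farther than h from the cutoff receive zero weight, so only contracts near the threshold identify the jump, which is why the bandwidth-sensitivity plots in the figures matter.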
Table 9: Domains, Metrics, and Data Sources for 2009 MA Star Rating Program^a

Staying Healthy:
  Breast Cancer Screening (HEDIS)
  Colorectal Cancer Screening (HEDIS)
  Cardiovascular Care - Cholesterol Screening (HEDIS)
  Diabetes Care - Cholesterol Screening (HEDIS)
  Glaucoma Testing (HEDIS)
  Appropriate Monitoring of Patients Taking Long-Term Medications (HEDIS)
  Annual Flu Vaccine (CAHPS)
  Pneumonia Vaccine (CAHPS)
  Improving or Maintaining Physical Health (HOS)
  Improving or Maintaining Mental Health (HOS)
  Osteoporosis Testing (HOS)
  Monitoring Physical Activity (HOS)

Getting Timely Care from Doctors:
  Access to Primary Care Doctor Visits (HEDIS)
  Follow-up Visit within 30 Days of Discharge after Hospital Stay for Mental Illness (HEDIS)
  Doctor Follow-up for Depression (HEDIS)
  Getting Needed Care without Delays (CAHPS)

Plan Responsiveness and Care:
  Getting Appointments and Care Quickly (CAHPS)
  Overall Rating of Health Care Quality (CAHPS)
  Overall Rating of Health Plan (CAHPS)
  Call Answer Timeliness (HEDIS)
  Doctors Who Communicate Well (CAHPS)
  Customer Service (CAHPS)

Managing Chronic Conditions:
  Osteoporosis Management (HEDIS)
  Diabetes Care - Eye Exam (HEDIS)
  Diabetes Care - Kidney Disease Monitoring (HEDIS)
  Diabetes Care - Blood Sugar Controlled (HEDIS)
  Diabetes Care - Cholesterol Controlled (HEDIS)
  Antidepressant Medication Management (HEDIS)
  Controlling Blood Pressure (HEDIS)
  Rheumatoid Arthritis Management (HEDIS)
  Testing to Confirm COPD (HEDIS)
  Continuous Beta Blocker Treatment (HEDIS)
  Improving Bladder Control (HOS)
  Reducing the Risk of Falling (HOS)

Handling of Appeals:
  Plan Makes Timely Decisions about Appeals (IRE)
  Reviewing Appeals Decisions (IRE)

^a Description of domains and additional details available at www.cms.gov. Data source for CMS calculations provided in parentheses.
Table 10: Star Rating Calculation Examples

                                                            Stars                           Raw Scores
Measure                                           H0150 H0151 H1558 H0755 H1230   H0150 H0151 H1558 H0755 H1230
Breast Cancer Screening                             2     2     5     4     5       59    57    87    75    87
Colorectal Cancer Screening                         2     3     4     5     4       35    45    62    71    59
Cardiovascular Care - Cholesterol Screening         3     3     4     4     5       79    81    93    90    96
Diabetes Care - Cholesterol Screening               3     2     4     4     4       77    74    88    92    94
Glaucoma Testing                                    3     3     5     5     4       60    60    84    76    73
Appropriate Monitoring for Long-term Medications    4     3     5     4     2       90    88    93    90    82
Annual Flu Vaccine                                  3     2     5     4     5       67    66    87    77    84
Pneumonia Vaccine                                   3     2     5     3     4       67    63    80    68    77
Improving or Maintaining Physical Health            3     3     3     3     3       60    54    59    60    55
Improving or Maintaining Mental Health              3     3     3     3     3       81    78    81    82    80
Osteoporosis Testing                                1     2     3     3     3       56    58    68    68    71
Monitoring Physical Activity                        3     3     3     3     3       46    41    44    48    51
Access to Primary Care Doctor Visits                4     4     5     5     4       94    92    99    97    89
Follow-up after Hospital Visit for Mental Illness   3     2     -     4     5       46    41    -     72    77
Doctor Follow-up for Depression                     1     1     -     1     2        5     3    -     20    22
Getting Needed Care without Delays                  3     5     5     3     3       83    88    90    86    83
Getting Appointments and Care Quickly               1     2     5     4     3       68    72    83    77    75
Overall Rating of Health Care Quality               3     3     5     4     3       84    85    90    86    85
Overall Rating of Health Plan                       4     4     5     3     4       86    87    92    86    87
Call Answer Timeliness                              4     2     4     3     5       83    72    84    81    96
Doctors Who Communicate Well                        3     4     5     4     4       90    91    93    91    91
Customer Service                                    3     3     5     3     3       88    87    92    88    86
Osteoporosis Management                             1     1     -     1     2       17    16    -     19    28
Diabetes Care - Eye Exam                            3     3     5     5     5       53    55    82    79    91
Diabetes Care - Kidney Disease Monitoring           3     3     3     4     5       76    77    77    85    97
Diabetes Care - Blood Sugar Controlled              2     2     4     5     4       53    55    82    87    83
Diabetes Care - Cholesterol Controlled              2     2     4     5     4       33    30    58    63    59
Antidepressant Medication Management                2     2     -     2     5       44    40    -     43    63
Controlling Blood Pressure                          1     2     4     5     4       33    51    63    68    62
Rheumatoid Arthritis Management                     2     3     -     3     3       68    71    -     73    75
Testing to Confirm COPD                             2     2     2     2     2       24    21    33    32    30
Continuous Beta Blocker Treatment                   3     2     -     2     4       79    69    -     73    85
Improving Bladder Control                           2     2     2     2     2       37    34    39    38    37
Reducing the Risk of Falling                        3     3     3     5     4       55    55    55    63    61
Plan Makes Timely Decisions about Appeals           4     4     1     4     5       86    88    43    91   100
Reviewing Appeals Decisions                         1     4     3     3     3       66    86    79    77    77
Table 11: Star Rating Calculation Examples, Cont.

                          H0150    H0151    H1558    H0755    H1230
Mean Summary Score        2.5833   2.6667   3.9667   3.5278   3.6944
Variance Summary Score    0.8786   0.80     1.2747   1.2849   1.0183
i-Factor                  0        0        0.3      0.1      0.4
Summary Score             2.5833   2.6667   4.2667   3.6278   4.0944
Star Rating               2.5      2.5      4.5      3.5      4
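Table 11 shows how the overall rating is assembled: the i-Factor reward is added to the mean of the individual measure stars to form the summary score, which is then rounded to the nearest half star. A minimal sketch of that arithmetic (the half-star rounding convention is our inference from the five examples in the table):

```python
def star_rating(mean_score, i_factor):
    """Summary score = mean of measure stars + i-Factor reward,
    rounded to the nearest half star (convention inferred from Table 11)."""
    summary = mean_score + i_factor
    return round(2 * summary) / 2, summary

# (mean summary score, i-Factor) pairs for the contracts in Tables 10-11
contracts = {"H0150": (2.5833, 0.0), "H0151": (2.6667, 0.0),
             "H1558": (3.9667, 0.3), "H0755": (3.5278, 0.1),
             "H1230": (3.6944, 0.4)}
for cid, (mean, i) in contracts.items():
    stars, summary = star_rating(mean, i)
    print(cid, round(summary, 4), stars)  # reproduces the last two table rows
```

For H1558, for example, 3.9667 + 0.3 = 4.2667, which lies closer to 4.5 than to 4.0, hence the 4.5-star rating.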
Table 12: Welfare Analysis^a

                          OLS        2SLS
Premium                   -0.00**    -0.04***
                          (0.00)     (0.01)
Within-group Share        0.71***    0.74***
                          (0.03)     (0.10)
HMO                       -0.03      -1.26**
                          (0.09)     (0.61)
PPO                       -0.21      -0.55
                          (0.13)     (0.38)
Part D                    1.19***    2.22***
                          (0.11)     (0.48)
Part D Cost               -0.00      -0.00
                          (0.00)     (0.00)
Summary Score             0.43***    1.96***
                          (0.10)     (0.63)
N                         20,738     18,300

First-stage Statistics
                          Premium    Within-group Share
Contract Count            0.07       -0.00
                          (0.30)     (0.01)
Hospital Inpatient HHI    -0.27      1.45***
                          (1.31)     (0.04)
Hospital Count            -0.43***   -0.00
                          (0.10)     (0.00)
Total Physicians          0.00       -0.00***
                          (0.00)     (0.00)
F-stat                    9.80       647.94
^a Robust standard errors in parentheses, clustered at the contract level. In the 2SLS estimation, premium and within-group share were instrumented with the number of contracts operating in a county, the number of hospitals in a county, the Herfindahl-Hirschman Index (HHI) for hospitals in a county (based on discharges), and the number of physicians in the county. * p<0.1. ** p<0.05. *** p<0.01.
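The 2SLS column treats premium and within-group share as endogenous, projecting them on the market-structure instruments in a first stage before estimating the demand equation. A generic two-stage least squares sketch on toy data (illustrative only, not the paper's estimation code; all names below are ours):

```python
import numpy as np

def two_sls(y, X_endog, X_exog, Z):
    """2SLS: regress the endogenous variables on instruments plus exogenous
    controls, then run OLS of y on the fitted values and the controls."""
    W = np.column_stack([Z, X_exog])             # first-stage regressors
    P, *_ = np.linalg.lstsq(W, X_endog, rcond=None)
    X_hat = W @ P                                # fitted endogenous variables
    X2 = np.column_stack([X_hat, X_exog])        # second-stage regressors
    beta, *_ = np.linalg.lstsq(X2, y, rcond=None)
    return beta

# Toy example: one endogenous regressor, one instrument, one confounder
rng = np.random.default_rng(1)
n = 10_000
z = rng.normal(size=(n, 1))                      # instrument
u = rng.normal(size=n)                           # unobserved confounder
x = (z[:, 0] + u + rng.normal(size=n)).reshape(-1, 1)  # endogenous regressor
const = np.ones((n, 1))
y = -0.04 * x[:, 0] + u + rng.normal(size=n)     # true coefficient is -0.04
beta = two_sls(y, x, const, z)
print(beta[0])  # close to the true -0.04; plain OLS would be badly biased
```

Because the confounder u enters both x and y, OLS on this toy data would be biased toward a positive coefficient, mirroring why the OLS premium coefficient in Table 12 is attenuated relative to 2SLS.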
Figure 1: Effect of Star Rating on Mean Contract Premium for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels (a. 2.25, b. 2.75, c. 3.25, d. 3.75), each plotting the star rating coefficient, γ2, against bandwidths ranging from 0.1 to 0.25.]
Figure 2: Effect of Star Rating on Plan Premiums for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels (a. 2.25, b. 2.75, c. 3.25, d. 3.75), each plotting the star rating coefficient, γ2, against bandwidths ranging from 0.1 to 0.5.]
Figure 3: Falsification Test: Effect of Star Rating on Mean Contract Premium around Counterfactual Thresholds
[Four panels plotting the star rating coefficient, γ2, against counterfactual thresholds: a. around the true 2.25 threshold (2.1 to 2.4), b. around the true 2.75 threshold (2.6 to 2.9), c. around the true 3.25 threshold (3.1 to 3.4), d. around the true 3.75 threshold (3.6 to 3.9).]
Figure 4: Effect of Star Rating on Plan Exit for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels (a. 2.25, b. 2.75, c. 3.25, d. 3.75), each plotting the star rating coefficient, γ2, against bandwidths ranging from 0.1 to 0.5.]
Figure 5: Effect of Star Rating on Plan Entry for Varying Bandwidths Around Thresholds 2.25, 2.75, 3.25 and 3.75
[Four panels (a. 2.25, b. 2.75, c. 3.25, d. 3.75), each plotting the star rating coefficient, γ2, against bandwidths ranging from 0.1 to 0.5.]