News, Copyright, and Online Aggregators
Preliminary
Lesley Chiou and Catherine E. Tucker
October 1, 2010
Abstract
This paper examines how the practices of online aggregators of news content affect consumers' search for news online. The aggregation of content by third-party websites has become a controversial digital copyright issue. On the one hand, aggregators argue that they provide a useful service to consumers and that the amount and substantiality of the news article used in their links is small enough to be considered fair use. On the other hand, content providers argue that such practices represent copyright infringement and that third-party aggregators' profits from advertising associated with content reduce the potential market for or value of copyrighted material. We empirically examine how the severing of a relationship between a major content provider and a major news aggregator affected consumers' search for online news. Specifically, we investigate how the removal of all hosted articles by The Associated Press from Google News at the end of 2009 (due to a dispute in licensing negotiations) affected which news sites consumers visited. Our empirical analysis suggests that the removal of The Associated Press's content was correlated with a decline in subsequent visits to traditional news sites immediately after visiting Google News compared to other news aggregators that continued to host Associated Press content.
Economics Department, Occidental College, CA.
MIT Sloan School of Management, MIT, Cambridge, MA.
We thank Christopher Hafer of Experian Hitwise.
1 Introduction
The Internet has reshaped the news and media industry by providing a wealth of information
and easy access to news, stories, and events that occur locally and globally. Over 74 million
users visit newspaper sites each month, accounting for more than a third of all web users
(Advertising Age, 2009). With the proliferation of search engines and news sites, consumers
face an array of options and sources for information. Search engines have responded to
consumers' demand for information by creating online news aggregators, such as Google
News and Yahoo! News. These news aggregators feature a collection of stories and headlines
from various online news sources, which are collated into a single site. As such, aggregators
offer a convenient place for users to consolidate their news reading.
We study empirically whether using a news aggregator shifts consumers' consumption of
news online and whether news aggregators are substitutes or complements for the primary
sources they feature. Surprisingly, there has been no empirical work that quantifies the
effect of aggregators on primary news sources, even though the US Copyright Office requires
proof of fair use and asks users to determine whether their use of the copyrighted material
reduces its potential market or value. On the one hand, news aggregators may "steal"
traffic from media sites if users rely solely on the aggregators' abbreviated descriptions of
the article and do not visit the source site. In fact, this accusation has been leveled by several
major media organizations, including The Associated Press and News Corporation. Rupert
Murdoch, chairman of News Corp., has accused news aggregators of stealing content and
violating copyrights (Sandoval, 2009). Mark Cuban, chairman of HDNet, even referred to
such practices as "vampiric," saying that "Newspapers are getting their blood sucked by Google
and content aggregators" (Kaplan, 2010). Moreover, some anecdotal evidence exists that a
significant fraction of readers scan only the headlines from Google News and do not visit
the source site (Sullivan, 2010). On the other hand, news aggregators may expose readers
to news sources and sites that they might otherwise not visit, and therefore generate traffic
for the primary news source. As argued by Arrington (2010), "When an aggregator puts up
a link to your site, they are doing you a favor by sending you traffic."
Therefore, the overall effect of news aggregators on consumers' news-seeking habits is
an empirical question. Our paper focuses on how the practices of news aggregators affect
consumers' search for news online. We use a rich dataset of consumers' online search behavior
and site visits to examine the effect of a change in a major online news aggregator's
policy on displaying content from a primary news source. The decision to host
news sources may be correlated with other factors that influence a consumer's consumption
of news, so we use a discontinuous event that altered the set of news sources provided by a
major news aggregator.
At the end of December 2009, after a breakdown in licensing negotiations, Google removed all news
articles by The Associated Press from its news aggregator (Haddad, 2010). We compare
consumers' site visits before and after this policy change relative to traffic from Yahoo! News, which
continued to provide Associated Press content in this period. Yahoo! News and Google News
play a large role in the online news market and are among the top 5 news sites visited by
readers. We find that after Associated Press content was removed from Google News, fewer
consumers subsequently visited traditional news sites (the sources for much of Associated
Press content) relative to consumers using Yahoo! News. We checked the robustness of the
result in a variety of ways. Our finding suggests that the aggregation of news content actually
complements the original content. In other words, users are more likely to be provoked to
seek the original source and read further when they come across a story summarized by an
aggregator, rather than being merely content with the summary.
Our paper builds on a growing literature that documents how the Internet has affected
the consumption of traditional news media. Gentzkow (2007) investigated the relationship
between offline and online newspapers. He found some evidence of complementarity but ultimately
concluded that this was an artifact of customer heterogeneity and that the provision
of online news was costly to newspaper print edition revenues. This finding was supported
in separate research by George (2008) and Filistrucchi (2005). Kaiser and Kongsted (2005)
find evidence of complementarity for online magazines, which suggests that substitution is
particularly important for news content rather than more magazine-type content. To our
knowledge, however, we are the first to study how the practice of online aggregation affects
online news consumption. Our distinction between traditional media, which is primarily
local, and news aggregators that have national reach builds on earlier research that has
documented the importance of the interaction between national and local news distribution
practices for understanding consumption of news (George and Waldfogel, 2006; Oberholzer-
Gee and Waldfogel, 2009). Our work also relates to research that has evaluated the conflict
between digitization and copyright. These studies have focused predominantly on the issues
relating to the piracy of film and musical content (Rob and Waldfogel, 2006; Oberholzer-Gee
and Strumpf, 2007; Danaher et al., 2010) by unauthorized distribution channels.
Our paper has several implications for media markets and public policy. First, a fierce
debate exists over intellectual property and copyrights for content posted online. Search en-
gines and aggregators accumulate information from primary sources, and controversy exists
over whether news aggregators violate existing copyrights and whether content can be pro-
vided freely by a third party. Secondly, policymakers have long stressed the importance of
the diversity of news consumption and how consumption of local news in particular encour-
ages civic engagement. Our results suggest that news aggregators actually provoke readers
to seek further news. Given that local news sites comprise a substantial fraction of online
news, aggregators may promote more diversity in consumption patterns.
2 Data and Institutional Setting
2.1 Contractual Dispute between Google and Associated Press
The Associated Press, founded in 1846, is one of the most powerful news agencies in the
world. Since the demise of United Press International, it is the only national news service
in the US, and its major competitors are now the United Kingdom-based Reuters and the
France-based Agence-France Presse. It is a cooperative that is owned by various newspapers
and radio and television stations in the United States. These stakeholders both contribute
stories to the Associated Press and use material written by its staff journalists.
During the past decade, The Associated Press has been at the forefront of efforts by copyright
holders to circumscribe fair use for digital content and protect copyright holders' rights. For
example, in June 2008, The Associated Press invoked the Digital Millennium Copyright Act
and insisted that various bloggers remove Associated Press content (Ardia, 2008).
Google News is ranked as the fifth most visited news website by Hitwise. Receiving 2.90%
of all news site visits, it is the second most popular news aggregator service after Yahoo!
News, which received 7.09% of all news site visits. Founded in April 2002, Google News
electronically aggregates different news sources based upon a proprietary algorithm. As of
December 2009, Google News claimed that it received news content from 25,000 publishers
across the world and that it sent 1 billion clicks to these publishers every month (Cohen,
2009). Google News has been supported by advertising revenues in the US since February
2009. Figure 1 provides a screenshot of Google News. Google News has two noticeable
features that distinguish it from traditional news sites. First, a variety of sources are
listed for each story. Second, the order of news is electronically determined based on users'
preferences, the recency of the story, and the interest it has received from other users.
Figure 1: Screenshot of Google News
Note: On June 30, 2010, the formatting of Google News changed somewhat and reduced the ability of users to customize the placement of the columns containing news. Therefore the screenshot above, which was produced after this formatting change, may be slightly different from what users viewed during the period that we study.
Since both The Associated Press and Google News are key players in the distribution
of news, it is not surprising that they have forged a partnership. Table 1 summarizes the
major events of their relationship. We study a discontinuity in this relationship, which was
engendered by negotiations surrounding the contract renewal at the end of January 2010. As
part of their existing contract, Google and The Associated Press agreed that Associated Press
content could be hosted by Google for a period of 30 days. Therefore, if the contract ended
in January 2010 and was not renewed, Google would have to stop posting new Associated
Press content 30 days prior to the end of the contract. Presumably, to make this clean
break a credible outside option, Google did indeed stop posting content for seven weeks
during these contract negotiations. We should emphasize that this is necessarily based on
the observations of industry outsiders, since both Google and the Associated Press signed
binding non-disclosure agreements, which prevented them from ever commenting on the
course or outcome of negotiations (Sullivan, 2010).
This removal of Associated Press content represents a useful natural experiment for
empirical researchers. Since the removal of content was provoked by the intricacies of contract
negotiation, its timing can be thought of as reasonably exogenous, as it was determined by the
expiration of the contract rather than any considerations of the popularity (or lack thereof)
for Associated Press content at that time. As detailed in Table 1, the dispute with the
Associated Press led Google to remove content by the Associated Press from December 23,
2009 to February 9, 2010. Fortunately for our purposes, Yahoo! News continued to host
Associated Press content without interruption during this time, which enables us to use its
web users' behavior as a control in our regressions. We compare which websites consumers
navigated to after visiting a news aggregator before and after the removal of content on
Google for both visitors to Google News and Yahoo! News.
Table 1: Timeline of negotiations between Google and Associated Press
Date                 Event
March 2005           Google is sued by Agence France Presse for copyright infringement after AFP content appeared on Google News.
August 2006          Google and Associated Press first sign contract to enable Associated Press content to appear on Google News for 30 day window.
December 24, 2009    Associated Press content no longer appears on Google. Industry press speculates that this is in preparation for the expiration of contract between Associated Press and Google in a month's time.
End January 2010     Associated Press and Google contract set to expire.
February 2010        Associated Press content returns to Google News.
Critics and supporters of news aggregators alike have proposed numerous arguments about
whether the removal of Associated Press content would benefit or hurt news websites.
On the one hand, if consumers are no longer able to obtain Associated Press news content,
they may be more likely to seek the news directly from the Associated Press member
organizations and newspapers. On the other hand, consumers may simply be less likely to seek
further information about news. In essence, this distinction can be boiled down to whether
consumers view news aggregators as a complement or substitute to original news sources. Do
they use news aggregators to identify news stories that they then pursue in greater depth,
or do they simply stop after reading the first news item? For instance, the Associated Press
ran a news story about economic depression in Michigan in August 2010. The screenshot
of how the story appeared on Google News is depicted in Figure 2. The links relating to
the Associated Press story that appear at the bottom of a typical story are also depicted
in Figure 2. After reading the Associated Press summary of the story, readers are free to
explore the issue further in local newspapers such as the Detroit News and Lansing State
Journal. The question we ask is whether the presence of the Associated Press content on
Google News makes it more or less likely that a news consumer would then trouble to visit
Detroit News or the Lansing State Journal, both of which are members of the Associated
Press Network.
Figure 2: Example screenshot of Associated Press article hosted on Google News
Note: Google News, August 1st 2010. Text of article has been slightly edited to fit on page.
Our analysis is focused on the period immediately prior to and during the removal of
Associated Press articles from Google News for two reasons. First, it is not immediately clear
at which point in February Google News and Associated Press resumed their relationship
and reached a new agreement. Second, it is not apparent whether the reinstatement during
this time consisted of the older, missing content or new content or whether Google changed
the presentation of AP articles afterwards. For example, it would be problematic if Google
decided to highlight Associated Press content after the contract negotiations were concluded,
perhaps as a sweetener to the deal. For these reasons, we focus on visits to news sites
during the months of December 2009 and January 2010.
2.2 Data Description
Our data derive from Experian Hitwise. Hitwise develops proprietary software that Internet
Service Providers (ISPs) use to analyze website logs created on their network. Once the ISP
aggregates the anonymous data, the data are provided to Hitwise. According to their website,
Hitwise collects these usage data from a geographically diverse range of ISP networks and
opt-in panels, representing all types of Internet usage, including home, work, education
and public access. Currently, Hitwise has usage data from a sample of 25 million people
worldwide.
We collected information on the sites that users visit immediately after navigating to
Google News or Yahoo! News. We use weekly data from the week ending December 5, 2009
to the week ending February 27, 2010 for the top 1500 sites navigated after Google News or
Yahoo! News. Hitwise reports the fraction of total traffic that arrives at each downstream
site immediately after a visit to Google News and Yahoo! News. We constructed a 2-
month panel where the unit of observation is the percentage of weekly clicks a downstream
website received from either Google News or Yahoo! News. Twenty-six percent of websites
received incoming traffic from both Google and Yahoo! News. The remainder of websites
were only visited after navigating to one particular aggregator. This may reflect internal
complementarities for these companies. For instance, someone using Google News is unlikely
to navigate to Yahoo! Mail, and similarly someone using Yahoo! News is unlikely to navigate
to Gmail.
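The overlap statistic described above (the share of downstream sites reached from both aggregators) can be illustrated with a minimal sketch. The records below are hypothetical and chosen only for illustration; they are not Hitwise data, and the function name is an assumption.

```python
from collections import defaultdict

# Hypothetical (downstream_site, aggregator) records; NOT Hitwise data.
records = [
    ("nytimes.com", "google"),
    ("nytimes.com", "yahoo"),
    ("gmail.com", "google"),
    ("mail.yahoo.com", "yahoo"),
    ("cnn.com", "google"),
    ("cnn.com", "yahoo"),
    ("weather.com", "yahoo"),
    ("youtube.com", "google"),
]


def share_reached_from_both(rows):
    """Fraction of distinct downstream sites that appear after both aggregators."""
    sources = defaultdict(set)
    for site, aggregator in rows:
        sources[site].add(aggregator)
    both = sum(1 for seen in sources.values() if {"google", "yahoo"} <= seen)
    return both / len(sources)


overlap_share = share_reached_from_both(records)
```

In this toy sample, two of the six distinct sites are reached from both aggregators. The paper's 26 percent figure is the analogous share in the actual data.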
We categorized the websites into two main classes: non-news (e.g., Yahoo! Mail,
myspace.com) and traditional news (e.g., newyorktimes.com, bostonherald.com). We applied
Hitwise's own categorization of news websites to identify traditional news media, but we
excluded weather sites and news aggregators from the 5 major search engines (such as
Yahoo! News, Google News, Huffington Post) from the category.1 We identified a site as an
aggregator based upon whether or not it produced its own original content.
We also constructed a separate category for international news (e.g., bbc.com/news,
hindustantimes.com), which we use in our robustness checks. We would expect the removal
of Associated Press content to affect traditional news media sites, but the removal should
not affect visits to international sites that tend to either generate their own content or rely
on non-American news agencies for their content.
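As a rough illustration of the categorization rules just described, the following sketch assigns a label to a downstream site. The category sets are hypothetical stand-ins for the Hitwise lists, which are not reproduced in the paper. The order of the checks, and the choice to treat excluded weather sites as non-news, are assumptions.

```python
# Hypothetical category lists; the actual lists come from Hitwise.
NEWS_AND_MEDIA = {
    "nytimes.com", "bostonherald.com", "weather.com",
    "news.google.com", "huffingtonpost.com", "hindustantimes.com",
}
WEATHER = {"weather.com"}
AGGREGATORS = {"news.google.com", "huffingtonpost.com"}
INTERNATIONAL = {"bbc.com/news", "hindustantimes.com"}


def classify_site(site):
    """Label a downstream site using the exclusion rules described in the text."""
    if site in INTERNATIONAL:
        return "international news"
    if site in AGGREGATORS:
        return "news aggregator"
    # Traditional news: in the news-and-media list, excluding weather sites.
    if site in NEWS_AND_MEDIA and site not in WEATHER:
        return "traditional news"
    return "non-news"


labels = {
    site: classify_site(site)
    for site in ["nytimes.com", "weather.com", "bbc.com/news",
                 "mail.yahoo.com", "news.google.com"]
}
```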
Table 2 reports the summary statistics for our data. It is striking that 20 percent of
the time, consumers navigate to a traditional news media website from the news aggregator.
Traditional news sites captured most traffic. International news received less traffic (5.5
percent of sites visited) than traditional news sites.
1 Hitwise reports the top 10,000 ranked news and media sites in November 2009.
Table 2: Summary statistics for downstream websites from Google News and Yahoo! News
                           Mean     Std Dev   Min   Max    Observations
% clicks                   0.016    0.19      0     18.3   100503
Google News                0.50     0.50      0     1      100503
PeriodDispute              0.67     0.47      0     1      100503
Traditional News Site      0.20     0.40      0     1      100503
News Aggregator Site       0.0011   0.033     0     1      100503
International News Site    0.055    0.23      0     1      100503
Observations               100503
Notes: This table reports statistics for websites visited immediately after Google News and Yahoo! News during December 2009 and January 2010. The period during which the dispute occurred between Associated Press and Google News was after December 23, 2009. Traditional news sites refer to news and media sites as defined by Hitwise, excluding weather sites, international news sites, and news aggregators from the top 5 search engines.
Table 3 displays the top 50 (traditional) news websites in our dataset and the average
percentage of downstream clicks they receive. Table 4 displays the top 50 non-news websites
in our dataset and the average percentage of downstream clicks they receive. As shown in
Table 4, the top non-news websites reflect the top website brands on the Internet. This is
suggestive evidence that users of news aggregator sites both have mainstream Internet tastes
and regard the sites as part of their normal Internet consumption.
Table 3: Top 50 news websites visited after Google News and Yahoo! News
Website                        Avg Visit Pct
abcnews.com                    2.11
associatedcontent.com          0.11
bleacherreport.com             0.17
bloomberg.com                  0.51
boston.com                     0.24
bostonherald.com               0.19
businessweek.com               0.15
cbsnews.com                    0.19
chron.com                      0.13
cnn.com                        1.85
csmonitor.com                  0.15
dallasnews.com                 0.11
drudgereport.com               0.64
edition.cnn.com                0.20
examiner.com                   0.65
foxnews.com                    1.13
foxnews.com/entertainment      0.082
foxnews.com/politics           0.20
freep.com                      0.13
gather.com                     0.34
latimes.com                    0.48
mcclatchydc.com                0.095
mercurynews.com                0.44
miamiherald.com                0.15
msnbc.com                      0.83
news.com                       0.12
nj.com                         0.11
npr.org                        0.16
nydailynews.com                1.59
nypost.com                     0.26
nytimes.com                    2.88
people.com                     0.39
philly.com                     0.15
politico.com                   0.53
radaronline.com                0.060
reuters.com                    0.69
seattlep-i.nwsource.com        0.11
seattletimes.nwsource.com      0.11
sfgate.com                     0.17
sportsillustrated.cnn.com      0.10
startribune.com                0.084
thedailybeast.com              0.17
theweek.com                    0.14
time.com                       1.16
upi.com                        0.093
usatoday.com                   0.72
usmagazine.com                 0.23
usnews.com                     0.082
voanews.com                    0.13
washingtonpost.com             1.74
wsj.com                        0.86
Table 4: Top 50 non-news websites visited after Google News and Yahoo! News
Website                        Avg Visit Pct
address.yahoo.com              0.12
amazon.com                     0.59
aol.com                        0.46
aralifestyle.com               0.14
ask.com                        0.19
bankofamerica.com              0.18
bing.com                       0.62
blogsearch.google.com          0.77
buzz.yahoo.com                 0.21
chase.com                      0.14
cosmos.bcst.yahoo.com          0.95
ebay.com                       1.00
education.yahoo.net            0.34
espn.com                       0.56
facebook.com                   6.23
fastflip.googlelabs.com        3.60
finance.google.com             0.36
finance.yahoo.com              0.60
games.yahoo.com                0.099
gmail.com                      1.55
google.com                     11.6
howlifeworks.com               1.04
huffingtonpost.com             0.96
images.google.com              0.50
latimesblogs.latimes.com       0.16
livescience.com                0.38
mail.live.com                  1.28
mail.yahoo.com                 9.94
maps.google.com                0.23
members.yahoo.com              0.29
movies.yahoo.com               0.13
msn.com                        1.03
my.yahoo.com                   0.67
myspace.com                    1.54
news.google.com                0.24
omg.yahoo.com                  0.32
rivals.com                     0.10
search.yahoo.com               2.20
shine.yahoo.com                0.13
space.com                      0.15
sports.yahoo.com               0.26
sports.yahoo.com/nfl           0.13
tmz.aol.com                    0.20
tv.yahoo.com                   0.12
video.google.com               0.27
weather.com                    0.67
weather.yahoo.com              0.39
wikipedia.org                  0.50
yahoo.com                      7.20
youtube.com                    2.47
Table 5: Demographic description of users
Measure        Yahoo! News   Google News   New York Times
Male           59.95         63.8          61.21
Age 18-24      12.12         13.89         6.17
Age 25-34      18.05         14.72         13.93
Age 35-44      19.03         17.08         12.98
Age 45-54      21.41         22.24         19.45
Age 55+        29.38         32.06         47.47
Income 150k    9.29          9.6           10.77
Source: Hitwise
Notes: This table reports the fraction of users of a particular website within each demographic category. Statistics are reported for users of Yahoo! News, Google News, and the New York Times website.
To verify that Yahoo! News could be considered an appropriate control group for Google
News, we checked that the users shared similar observable demographics. Table 5 reports the
fraction of users within each demographic category for a particular site. The users of Yahoo!
News and Google News do indeed look reasonably similar; they are skewed towards being
older, predominantly male, and wealthier than the general U.S. population. For comparison,
we also report demographics for users of the New York Times website. The users of the
New York Times site are similar to, though significantly older than, the average users of a news
aggregator. Table 5 also provides suggestive evidence of why the debate over ad revenues from
news content is so contentious. Users such as these are a remarkably attractive demographic
group from an advertiser's perspective.
3 Analysis
Figure 3 summarizes our main analysis. It illustrates the mean percentage of downstream
traffic for users who visited Google News and Yahoo! News during our period. As
seen in the graph, little change occurs in downstream site navigation for Yahoo! News. However,
news sites experience a decline in visits from Google News after the removal of Associated
Press content relative to the change in traffic from Yahoo! News.
Figure 4 extends this analysis to show how visit behavior varies for international news
sites as well. Once again, little change exists in user behavior for these additional types of
websites on either Yahoo! News or Google News. This is as expected: given the nature of their
content, international websites are unlikely to be affected by the removal of AP content. As
seen in Figure 5, no such change in clicks to news sites occurred in the prior year during the
same calendar months of December 2008 and January 2009.
Figure 3: Downstream sites visited after Google News and Yahoo! News
Notes: This figure shows the average percentage of clicks for news and non-news sites navigated to after visiting Google News and Yahoo! News before and after the removal of The Associated Press from Google News.
Figure 4: Downstream sites visited after Google News and Yahoo! News
Notes: This figure shows the average percentage of clicks for a variety of website types navigated to after visiting Google News and Yahoo! News before and after the removal of The Associated Press from Google News.
Figure 5: Downstream sites visited after Google News and Yahoo! News in prior year (December 2008 and January 2009)
Notes: This figure shows the average percentage of clicks for news and non-news sites navigated to after visiting Google News and Yahoo! News in December 2008 and January 2009, the year prior to the removal of The Associated Press from Google News.
To formalize the insights provided by Figures 3 and 4, we run a difference-in-differences
regression for the policy change and estimate the following regression for the percentage of
clicks to website i after visiting news aggregator j in week t:
\begin{equation}
\begin{aligned}
\%clicks_{ijt} = {} & \beta_0 + \beta_1 News_i \times Google_j \times PeriodDispute_t + \beta_2 News_i \times PeriodDispute_t \\
& + \beta_3 News_i \times Google_j + \beta_4 Google_j + \gamma_i + week_t + \epsilon_{ijt}
\end{aligned}
\tag{1}
\end{equation}
where News is an indicator variable equal to 1 if the website is a traditional news source,
Google is an indicator variable equal to 1 if the traffic originated after viewing Google News,
and PeriodDispute is an indicator variable equal to 1 for the weeks after the removal of
Associated Press from Google News. The controls $\gamma_i$ are downstream-website fixed effects.
The vector $week_t$ contains weekly fixed effects to capture national variation in the volume
and interest generated by news stories in that week. The coefficient on the interaction
term $News \times Google \times PeriodDispute$ captures the effect of the Associated Press removal
on visits to traditional news sites compared to non-news sites from Google News, with the
corresponding change in traditional news and non-traditional news sites on Yahoo! as a
control. We estimate this specification using ordinary least squares and cluster our standard
errors at the website level to avoid the downward bias reported by Bertrand et al. (2004).
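To make the interpretation of the triple interaction concrete, the sketch below computes a triple difference directly from eight cell means. The numbers are hypothetical and chosen only for illustration. In a fully saturated regression with no fixed effects, the coefficient on the triple interaction equals this quantity. The paper's specification also includes website and week fixed effects, so its estimate need not match a raw comparison of means.

```python
# Hypothetical mean % clicks by (site type, aggregator, period); NOT the paper's data.
cell_means = {
    ("news", "google", "before"): 0.050,
    ("news", "google", "during"): 0.044,
    ("news", "yahoo", "before"): 0.020,
    ("news", "yahoo", "during"): 0.021,
    ("non-news", "google", "before"): 0.015,
    ("non-news", "google", "during"): 0.016,
    ("non-news", "yahoo", "before"): 0.018,
    ("non-news", "yahoo", "during"): 0.018,
}


def change(means, site_type, aggregator):
    """Change in mean clicks from the pre-dispute period to the dispute period."""
    return (means[(site_type, aggregator, "during")]
            - means[(site_type, aggregator, "before")])


def triple_difference(means):
    """(Google minus Yahoo change for news) minus (same for non-news)."""
    did_news = change(means, "news", "google") - change(means, "news", "yahoo")
    did_non_news = (change(means, "non-news", "google")
                    - change(means, "non-news", "yahoo"))
    return did_news - did_non_news


estimate = triple_difference(cell_means)
```

A negative value, as in this toy example, corresponds to news sites losing downstream traffic from Google News relative to both Yahoo! News and non-news sites.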
Table 6 reports the results for various regression specifications, incrementally building
up to our full specification described by equation (1). Very little variation exists in the size
or precision of our coefficient of interest in each of the columns. The negative coefficient on
News X Google X PeriodDispute implies that during the dispute with Associated Press, Google
News users were less likely to visit traditional news websites after visiting Google News. This
suggests that the presence of Associated Press articles in Google News prompted users to
seek further information at traditional news sites and thereby encouraged more diversity in
news consumption.
Table 6: Downstream traffic from Google and Yahoo! News before and after the policy change
                                 (1)           (2)           (3)
                                 % clicks      % clicks      % clicks
PeriodDispute X Google X News    -0.00583      -0.00583      -0.00583
                                 (0.00271)     (0.00284)     (0.00284)
PeriodDispute X Google           0.00152       0.00152       0.00152
                                 (0.00219)     (0.00229)     (0.00229)
PeriodDispute                    -0.000514     -0.000514     -0.000560
                                 (0.000655)    (0.000686)    (0.00105)
Google                           -0.00372      -0.0115       -0.0115
                                 (0.00367)     (0.00617)     (0.00617)
News                             -0.000393
                                 (0.00356)
PeriodDispute X News             0.00143       0.00143       0.00143
                                 (0.000963)    (0.00101)     (0.00101)
News X Google                    0.0184        0.0324        0.0324
                                 (0.00600)     (0.00753)     (0.00753)
Website Fixed Effects            Yes           Yes           Yes
Week Fixed Effects               No            No            Yes
Observations                     100503        100503        100503
R-Squared                        0.000543      0.581         0.581
Robust standard errors clustered at website level. *p < 0.1, **p < 0.05, ***p < 0.01. The dependent variable is the fraction of traffic to websites after visiting Google News or Yahoo! News. The policy change is the removal of hosted articles by The Associated Press from Google News.
News sites on Google experience a 6 percentage point decrease in clicks after the removal
of Associated Press articles. Compared to the mean percentage share of 2.9 percent before
the policy change, this drop represents an approximately 20 percent decrease in traffic to
news sites after the removal of Associated Press articles from Google. If the claim in Cohen
(2009) is true that Google sends a billion clicks each month to its partner news providers, then
this percentage translates into a very large change in the number of clicks that traditional
news websites receive. While we do not know the international breakdown precisely, our
data from Hitwise suggest that 40 percent of all clicks before the policy change went to
traditional news media websites hosted in the US. Therefore, this 20 percent decrease could
imply a decrease of 80 million visits each month from Google News users to
traditional news media websites hosted in the US.
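The back-of-the-envelope calculation above can be retraced step by step. This is a sketch under stated assumptions: the Table 6 coefficient (-0.00583) and the pre-dispute mean share are taken to be on the same scale, with the 2.9 percent mean entered as 0.029, and the one billion monthly clicks figure is the claim attributed to Cohen (2009). Variable names are illustrative.

```python
# Inputs taken from the text; the common scale of the first two is an assumption.
coefficient = -0.00583            # triple-interaction estimate, Table 6
baseline_share = 0.029            # pre-dispute mean share (2.9 percent), assumed scale
monthly_clicks = 1_000_000_000    # clicks Google News claims to send each month
us_traditional_share = 0.40       # share of clicks going to US traditional news sites

# Relative change in traffic to news sites: roughly -20 percent.
relative_change = coefficient / baseline_share

# The text rounds the drop to 20 percent before scaling it up.
rounded_drop = 0.20
implied_lost_visits = monthly_clicks * us_traditional_share * rounded_drop
```

Under these inputs the relative change is about -0.20 and the implied monthly loss is 80 million visits, matching the figures in the text.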
Our results suggest that news aggregators complement the news sources that they fea-
ture by directing traffic to these news sites. The provision of content on news aggregators
encourages readers to seek further information from other news sources.
4 Robustness Checks
We conducted various robustness checks as reported in Table 7. Columns (1) and (2) check
the robustness of our results to alternative specifications. We apply a Tobit regression to
account for sites that receive zero clicks in a given week and also a semi-log regression.2 Both
regressions have similar signs for the coefficients of interest; news sites receive less traffic from
Google after the policy change.
Columns (3)-(5) check robustness of the results to alternative definitions of the con-
trol group. As described previously, users navigated to a variety of non-traditional news
sites after visiting a news aggregator. These sites included both non-traditional and non-
Associated Press sources of news. In columns (3) and (4), our robustness checks omit the
top news aggregators and international websites as part of the control group. These alter-
native definitions of the control group could be warranted if the removal of Associated Press
content also affected navigation to these sites directly (e.g., if Associated Press content had
previously encouraged people to visit international websites, or if the removal of Associated
Press content on Google altered people's perceptions of news aggregators). In column (5),
we check robustness to removing both news aggregators and international sites from our
data. Generally, the results are robust in sign.
2 For the semi-log regression, we use log(%clicks + 0.01) as the dependent variable.
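The role of the 0.01 offset in the semi-log dependent variable can be seen in a short sketch. The function name is an assumption, and the sample values are the minimum, mean, and one site-level average taken from Tables 2 and 3 purely for illustration.

```python
import math


def semi_log(pct_clicks, offset=0.01):
    """log(%clicks + offset); the offset keeps site-weeks with zero clicks defined."""
    return math.log(pct_clicks + offset)


# Illustrative inputs: a zero-click site-week, the sample mean, and nytimes.com's average.
values = [0.0, 0.016, 2.88]
transformed = [semi_log(v) for v in values]
```

Without the offset, the logarithm would be undefined for the site-weeks with zero clicks, which is the same feature of the data that motivates the Tobit specification.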
Table 7: Robustness checks: Downstream traffic to local news sites from Google News and Yahoo! News before and after the policy change
                                     (1)          (2)          (3)               (4)                 (5)
                                     Tobit        Semi-log     No Aggregators    No International    News vs Non-News
PeriodDispute X Google X All News    -0.0229      -0.0216      -0.00583          -0.00620            -0.00632
                                     (0.00924)    (0.0127)     (0.00284)         (0.00303)           (0.00306)
PeriodDispute X Google               0.00394      -0.00783     0.00152           0.00178             0.00179
                                     (0.00557)    (0.00572)    (0.00230)         (0.00249)           (0.00251)
PeriodDispute                        -0.00556     -0.00685     -0.000464         -0.000762           -0.000763
                                     (0.00446)    (0.00527)    (0.00108)         (0.00111)           (0.00112)
Google                               0.0248       -0.0207      -0.0114           -0.0153             -0.0155
                                     (0.00874)    (0.0123)     (0.00619)         (0.00674)           (0.00680)
All News                             0.112
                                     (0.0208)
PeriodDispute X All News             0.0152       0.0155       0.00142           0.00148             0.00154
                                     (0.00564)    (0.00876)    (0.00101)         (0.00104)           (0.00105)
All News X Google                    -0.0126      0.0785       0.0323            0.0364              0.0362
                                     (0.0142)     (0.0232)     (0.00754)         (0.00804)           (0.00812)
Website Fixed Effects                No           Yes          Yes               Yes                 Yes
Week Fixed Effects                   Yes          Yes          Yes               Yes                 Yes
Observations                         100503       100503       100395            94959               94203
R-Squared                                         0.684        0.580             0.581               0.581
Robust standard errors clustered at website level. *p