Does reputation hinder entry?
Study of statistical discrimination on a platform∗
Work in progress, please do not share without permission.
Xavier Lambin† & Emil Palikot‡
March 28, 2018
Abstract
Using a dataset collected on a popular ride-sharing platform we, firstly, document
that minorities achieve lower economic outcomes in terms of revenue, seats sold
and popularity of listings. Secondly, we show that minority drivers who have
persisted through their first several interactions despite the discrimination, and
built a reputation, achieve results similar to those of non-minorities. Finally,
we use the career concerns model developed by Holmström (1999) to study the
dynamics of effort.
1 Introduction
In many digital marketplaces, incomplete information plays a crucial role. Interactions
between users are multifaceted, and contracts cannot be exhaustive. In the words of
Joe Gebbia, co-founder of Airbnb,¹ a crucial element of the success of his business is
designing trust. What he means by this is that online platforms typically match people
who otherwise do not know each other to engage in a one-time activity with uncertain
outcomes. In order to address this incomplete information problem, many online
∗ We are grateful to Yassine Lefouili, Steven Tadelis, Daniel Garrett, Nicolas Pistolesi, Marc Ivaldi and Bruno Jullien for their valuable comments at various stages of the paper.
† Ph.D. candidate at Toulouse School of Economics, [email protected]
‡ Ph.D. candidate at Toulouse School of Economics, [email protected]. Emil Palikot gratefully acknowledges the financial support from the ERC grant no. 340903.
¹ https://www.youtube.com/watch?v=16cM-RFid9U, last accessed on Dec 5, 2017
marketplaces allow users to create profiles where, aside from product- or service-specific
information, users present personal information such as name, age, photo and links to
social media websites. This is meant to facilitate transactions by establishing trust. However,
it can also lead to discrimination.
Economic research has documented discrimination against minorities in house rentals
(Edelman and Luca (2014)), ride-sharing (Farajallah et al. (2016)) or taxis (Ge et al. (2016))
across several developed countries. An important common feature of online platforms
is the existence of a reputation system. After each interaction parties are encouraged
to leave a rating and possibly a written review. In principle, this information should
allow updating beliefs on the types of reviewed users, and potentially enable them to
escape discrimination.
In this paper, we ask whether the potential for overcoming discrimination through
a reputation system is realized. Firstly, we empirically investigate whether
minority drivers face discrimination on a popular ride-sharing platform; secondly, we
demonstrate that building a reputation does allow users to fight back against discrimination.
The reputation effect is shown using cross-sectional data, panel data and
coarsened matching. Finally, based on Holmström (1999), we develop a model of the
behavior of drivers and argue that it predicts well the effort exerted and the resulting grades.²
Relation to literature: firstly, this paper relates to the large literature on statistical
discrimination. While the concept has been described since the 1970s (see Phelps
(1972), Arrow (1973) and later Altonji and Pierret (2001)), with numerous interesting
applications to the labor market (see e.g. Coate and Loury (1993), Farber and Gibbons
(1996), Charles and Guryan (2011), Lang and Lehmann (2012)), online markets and
their relative anonymity lend themselves well to such discrimination. As a result, they
have also provided new terrain for researchers: Edelman and Luca
(2014), Edelman et al. (2017) and Laouenan and Rathelot (2017) study discrimination
on the short-term house rental platform Airbnb. Castillo et al. (2013), Goddard et al.
(2015) and Ge et al. (2016) show evidence of discrimination in transportation systems.
Close to our empirical analysis, Farajallah et al. (2016) show that minorities have a lower
success rate on a carpooling platform. Our paper also identifies discrimination. However,
it goes into more detail by showing that this discrimination is at least in part
² Results presented in this version of the project should be seen as preliminary. We are currently working on improvements to both theoretical and empirical aspects of this paper.
statistical, rather than taste-based (Schuessler and Becker (1958)),³ and in this, it is
closest to Laouenan and Rathelot (2017). Recent economic (and computer science) literature
has studied the effectiveness and design of reputation systems, some notable projects
being: Nosko and Tadelis (2015), Cabral et al. (2010), Bar-Isaac and Tadelis (2008),
Liu and Skrzypacz (2014), Livingston (2005), Jolivet et al. (2016), Bolton et al. (2004),
Mayzlin et al. (2014), Jullien and Park (2014) and Zervas et al. (2015). However, these
papers focus on understanding how consumers may react to the information provided.
They aim at improving the accuracy of the reputation system, for instance by reducing
fraud or providing adequate information. In contrast, our paper shows that an excess of
precision in reputation may prevent the entry of new users. Spagnolo (2012) and Butler et
al. (2017) show in lab experiments that a reputation system, if not designed wisely, may
hinder the entry of new participants. We provide a formal analysis and identify this
effect on an existing marketplace. Kovbasyuk and Spagnolo (2017) also show that a repeated
game with limited records maximizes the number of trades. Randomly changing types
provide the mechanism behind the identified effect. In our model, this effect is due to a
trade-off between revealing information and allowing the entry of new agents. Finally,
we consider a model of moral hazard, which is related to seminal works in the field
(Baron and Myerson (1982), Holmström (1999), and Laffont and Tirole (1986)), but also to
more recent works by Roger and Vasconcelos (2014) and Garrett and Pavan (2012).
The rest of the paper is organized as follows: Section 2 describes the functioning of the
car-pooling platform we focus on, as well as our data gathering process. Section
3 documents discrimination and shows the positive effect of reputation building
using a cross-section and a matching method. Sections 4 and 5 provide some insights
into the mechanics of reputation building by adapting seminal models of moral hazard
to our setting. Section 6 concludes.
2 Empirical context and data collection
Blablacar is an online marketplace for ride-sharing. It was established in 2006 in France,
and today operates in 22 countries, mostly in Europe, but also Mexico, India, and
Brazil. There are eight million active drivers and over 50 million passengers,⁴ which
makes Blablacar the biggest ride-sharing platform in Europe. The basic idea behind
³ Darity and Mason (1998) provide a more detailed review of economic theories of discrimination.
⁴ https://techcrunch.com/2017/04/10/how-blablacar-faced-growing-pains-and-had-to-change-its-focus/
Blablacar is to enable drivers to offer seats in their cars that would otherwise remain
empty. Hence, in principle Blablacar should not be used by "professional" drivers, but
by people who just happen to drive on a particular route. This is reflected in the pricing
mechanism: drivers receive a price recommendation based solely on distance, 0.062 EUR
per km, from which they can deviate; however, the maximum price is set at 0.082 EUR
per km. The idea is that Blablacar should allow drivers to cover their costs, but
not run a profitable business.
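The pricing rule just described can be illustrated with a short sketch. It assumes only the two per-km rates quoted in the text; the function names and the choice to reject prices above the cap are illustrative assumptions, not a description of the platform's actual implementation.

```python
SUGGESTED_EUR_PER_KM = 0.062  # per-km recommendation quoted in the text
MAX_EUR_PER_KM = 0.082        # per-km price cap quoted in the text


def suggested_price(distance_km: float) -> float:
    """Distance-based price recommendation, in EUR."""
    return SUGGESTED_EUR_PER_KM * distance_km


def max_price(distance_km: float) -> float:
    """Highest price a driver may post for this distance, in EUR."""
    return MAX_EUR_PER_KM * distance_km


def deviation_from_suggested(posted_price: float, distance_km: float) -> float:
    """Posted price minus suggested price (hypothetical helper).

    Raises ValueError when the posted price is above the cap, which is how
    this sketch models "may not exceed the maximum price".
    """
    if posted_price > max_price(distance_km) + 1e-9:
        raise ValueError("posted price exceeds the maximum price")
    return posted_price - suggested_price(distance_km)
```

For a 100 km trip, this sketch gives a recommendation of 6.2 EUR and a cap of 8.2 EUR, so a posted price of 8 EUR is a deviation of 1.8 EUR.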
After a trip, both passengers and the driver are encouraged to leave a review that
consists of a written comment and a number of stars from 1 to 5, which are later
publicly available. The review system has a simultaneous reveal feature, which means
that a user cannot access a review she has received unless she writes one herself.
After searching for a ride between a pair of cities, a potential passenger sees a
list of available drivers, with their photo, name, average grade, and price. To see more
details about a driver, including the history of reviews and the identity of reviewers, a
prospective passenger has to click on the listing. Examples of profile and listing pages are in
Appendix A.
Data collection. We collected our dataset using a web crawler on the blablacar.fr
website from 1.07.2017 to 08.03.2018. Crawling through the site, the program gathered
all the information available to prospective riders. It includes information on the ride
itself such as the price posted by the driver, time and date, destination, origin, type of
car, whether pets or large luggage are allowed, etc. It also collected detailed data on
the driver herself: her name, age, picture, a short biography, number of Facebook friends
and other variables. Most importantly, we collect the rating of the driver
and the history of ratings for each driver. The data also covers information related to
the listing itself such as the number of views of a given listing, how many seats have
already been sold etc.
In addition to this, we have matched this data with several other datasets: origins
of names and gender are established by matching with a French government index
complemented with other publicly available sources.⁵ In order to increase precision,
we have also used machine learning software⁶ to confirm the matching procedure based
⁵ https://www.data.gouv.fr/fr/datasets/liste-de-prenoms/, http://www.signification-prenom.net/, http://madame.lefigaro.fr/prenoms/origine/. The complete list of names and origins is available upon request.
⁶ www.kairos.com
on facial recognition. Passengers might have a preference for cars of higher quality;
we use the value of a car as a proxy. The market value of cars is approximated by
the average price of the same type of car posted on eBay in Germany. As cost may
be a significant driver of prices, the fuel efficiency of cars is calculated by matching
car names with a dataset of fuel consumption of cars over long distances (French
environment and energy management agency, ADEME). Distances and expected travel time
by car or public transportation are calculated using Google Maps. We also include the
suggested and maximum prices set by Blablacar, which follow a cost-based formula
reflecting the spirit of a not-for-profit activity. Drivers may deviate from the suggested
price, but may not exceed the maximum price. Finally, information specific to the city
of destination/departure is included, such as population, median income and a crime index
(French government statistics, INSEE). We have selected routes that either start or end
in Paris (or its close vicinity). The other end of the trip is one of the remaining 140
largest French cities located further than 20 km from Paris.⁷
In these preliminary empirical results, we define minority agents as those whose name
has an Arabic or African origin or connotation; in this, we follow most of the literature.
However, this extends the definition of minorities compared to prior
investigations of discrimination on this platform, which restrict minorities to drivers
with an Arabic-sounding name (see Farajallah et al. (2016)). Descriptive statistics of
selected variables are shown in Table 1.
The average price of a ride is 28 euros, and the average travel distance is 370 km in a
car worth 6,000 euros. The average driver is 38 years old and has posted (successfully
or not) almost 60 rides in the past. Most of the drivers are men (70%) and around
23% of drivers are from a minority. Our dataset contains about 240,000 observations;
drivers have a unique ID, so we can identify 48,000 drivers with on average
four listings each. Unfortunately, observations are missing for some
variables; therefore we will typically have far fewer observations in the estimated regressions.
Drivers' strategic decisions are, foremost, setting a price and the number of seats
to offer. Blablacar claims that its mission is to help drivers recoup some of the costs
related to driving the car, rather than to run a professional transportation business. This,
apart from being an attractive marketing slogan, also results in Blablacar suggesting
prices to drivers and setting a maximum price. Furthermore, passengers are informed
⁷ The exclusion of short rides is motivated by the fact that the core business of Blablacar lies in long-range transportation. In shorter ranges around Paris, public transport makes ride-sharing an unattractive option and pricing strategies may strongly differ from those observed in the core market of the platform.
Table 1: Descriptive statistics
Statistic                                N         Mean        St. Dev.    Min         Max
ride price                               234,769   27.72       13.54       6           68
price delta                              236,529   3.68        3.57        −6.54       18.42
driver's age                             238,690   37.22       12.60       18          68
number of reviews                        238,992   41.34       64.22       0           423
talkative                                241,410   2.21        0.48        1           3
published offers (#)                     238,986   59.29       95.61       2           661
number of views of a listing             238,982   18.50       23.58       0           145
empty seats (#)                          241,410   2.41        0.88        0           5
taken seats (#)                          241,410   0.29        0.60        0           4
average grade                            222,453   0.92        0.05        0.74        1.00
revenue                                  238,841   6.56        14.80       0           76
seniority (# months)                     238,819   43.10       26.71       1           113
hours until ride                         204,688   131.96      154.54      1.19        763.42
offer posted since                       238,989   5.06        7.45        0.00        52.92
posts per month                          241,410   1.85        3.21        0.01        29.84
picture                                  241,410   0.87        0.34        0           1
bio (# words)                            238,878   14.90       16.70       0           79
price of car                             200,975   6.07        5.10        0.60        27.80
consumption                              209,660   4.99        0.74        3.65        7.48
competition                              241,410   28.74       32.82       1           651
median revenue (city)                    224,634   18,935.67   2,096.17    13,060.00   30,904.50
duration public transport (# minutes)    235,022   3.69        2.22        0.14        15.24
km                                       236,062   368.78      178.71      55.14       858.49
time of travel (# minutes)               241,410   13.27       4.59        0           23
notice (# minutes)                       228,771   10.49       9.97        0.00        49.85
gender (male)                            238,079   0.73        0.44        0           1
minority                                 127,144   0.23        0.42        0           1
automatic acceptance                     241,410   0.53        0.50        0           1
whether the price offered deviates (up or down) from the suggested price by means of a
color code (green is a rather low price, red is a rather high price). Thus, the decision
of drivers is really whether, and by how much, to deviate from the suggested price; we call
this variable "price delta". Figure 1 shows its distribution in our data.
Figure 1: Distribution of Price Delta
Price delta is close to normally distributed with some skewness to the right, but
the vast majority of drivers set prices within a 5 EUR range around the mean (3.7
EUR). The suggested price is based on expected fuel consumption for an average car
and the distance to be covered. Therefore, deviations can to some extent be explained
by drivers having cars with higher or lower than average fuel consumption. Other than cost
elements, factors such as competition from other drivers on the platform as well as other
transportation options can influence pricing. Finally, and most importantly from our
perspective, there is the impact of minority status and reputation. A characteristic feature
of many online reputation systems is that most users leave very high reviews, and as a
result, there is little variation in the data. This is also the case with Blablacar. Drivers are
evaluated on a scale from 1 to 5, and there is a possibility of leaving a written comment.
Figure 2 shows the distribution of the average reputation of drivers. As is observed on many
online platforms, low ratings are extremely rare (Nosko and Tadelis (2015), Dellarocas
and Wood (2008)), with the vast majority of ratings being within the 4-5 range (i.e.,
between "very good" and "perfect").
Figure 2: Distribution of average rating per driver
We collect several measures of economic outcomes. Firstly, we have a proxy for
revenue, which in our case is the product of price and the number of seats taken as long
as not all of them are sold. Secondly, we have the number of seats reserved. Finally,
a potential passenger has to click on a listing before making a reservation; thus, in
our opinion, the number of views that an offer has received is a good measure of its
attractiveness. Our dataset may miss some very successful rides that are no longer
displayed when data is collected (see discussion in Appendix B). We assume this may
happen especially with rides posted by non-minority drivers. As a consequence, the
estimates presented in this paper should be understood as lower-bound estimates.
3 Can reputation solve the discrimination problem?
We are interested in investigating whether minority drivers, who may be discriminated
against when they enter the platform, can escape this discrimination by building a reputation.
As mentioned before, discrimination against minorities is a well-documented phenomenon,
and we also see it in the raw data: compared with non-minority listings, listings of
minority drivers receive fewer views (16.9 vs. 18.9), have fewer seats taken (0.285 vs.
0.291), and make lower revenue (5.9 vs. 6.7 EUR). Table 2 introduces various controls.
We can notice several patterns that are consistent across all our measures: younger,
Table 2: Economic outputs, regressed over driver and ride characteristics
Dependent variable:
                             number of views (1)              revenue (2)                      taken seats (3)
driver's age                 −0.058∗∗∗ (0.007)                −0.011∗∗ (0.005)                 −0.001∗∗∗ (0.0002)
number of reviews            0.020∗∗∗ (0.002)                 0.016∗∗∗ (0.001)                 0.001∗∗∗ (0.00005)
talkative                    0.723∗∗∗ (0.170)                 0.183 (0.121)                    0.008 (0.005)
minority                     −0.878∗∗∗ (0.214)                −0.631∗∗∗ (0.153)                −0.021∗∗∗ (0.006)
seniority (# months)         −0.024∗∗∗ (0.003)                −0.004∗ (0.002)                  −0.0002∗∗ (0.0001)
hours until ride             0.0005 (0.003)                   0.002 (0.002)                    0.0002∗∗ (0.0001)
posted since                 2.592∗∗∗ (0.075)                 0.808∗∗∗ (0.053)                 0.036∗∗∗ (0.002)
posts per month              −0.761∗∗∗ (0.039)                −0.175∗∗∗ (0.028)                −0.009∗∗∗ (0.001)
picture                      1.796∗∗∗ (0.537)                 0.633∗ (0.385)                   0.021 (0.015)
bio (# words)                0.029∗∗∗ (0.005)                 0.008∗∗ (0.004)                  0.0003∗∗ (0.0001)
male                         −1.393∗∗∗ (0.182)                −0.063 (0.130)                   0.003 (0.005)
car price                    −0.042∗∗∗ (0.016)                −0.023∗∗ (0.012)                 −0.001∗∗ (0.0005)
consumption                  0.645∗∗∗ (0.112)                 0.200∗∗ (0.081)                  0.012∗∗∗ (0.003)
competition                  0.020∗∗∗ (0.003)                 0.013∗∗∗ (0.002)                 0.001∗∗∗ (0.0001)
revenue median (city)        0.001∗∗∗ (0.0004)                0.001∗∗ (0.0003)                 0.00002∗∗ (0.00001)
duration public transport    −1.332 (1.272)                   −5.725∗∗∗ (0.911)                −0.112∗∗∗ (0.036)
km                           0.001 (0.007)                    0.025∗∗∗ (0.005)                 0.0002 (0.0002)
hour                         0.062∗∗∗ (0.019)                 0.012 (0.013)                    0.001∗∗ (0.001)
night                        0.955∗∗∗ (0.185)                 0.468∗∗∗ (0.133)                 0.018∗∗∗ (0.005)
day                          0.732∗∗ (0.298)                  −1.327∗∗∗ (0.213)                −0.069∗∗∗ (0.009)
notice                       −0.948∗∗∗ (0.073)                −0.457∗∗∗ (0.052)                −0.022∗∗∗ (0.002)
automatic acceptance         −2.397∗∗∗ (0.165)                1.824∗∗∗ (0.118)                 0.087∗∗∗ (0.005)
Constant                     −7.250 (5.629)                   0.546 (4.040)                    −0.055 (0.162)
Observations                 63,070                           62,945                           63,599
R2                           0.250                            0.087                            0.080
Adjusted R2                  0.248                            0.084                            0.076
Residual Std. Error          19.906 (df = 62827)              14.244 (df = 62702)              0.573 (df = 63356)
F Statistic                  86.746∗∗∗ (df = 242; 62827)      24.813∗∗∗ (df = 242; 62702)      22.611∗∗∗ (df = 242; 63356)
Note: Trip fixed effects not reported. ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
female and non-minority drivers receive better economic outcomes. Here we measure
reputation building by the number of reviews; it is highly statistically significant in all
regressions. After we control for the number of reviews, seniority on the platform has
a negative coefficient. Drivers whose profiles have longer descriptions and a
picture achieve better outcomes. All in all, these results corroborate earlier findings
that minority users are discriminated against in online marketplaces.
3.1 Reputation effect: OLS on subsamples by seniority
Now we investigate whether these patterns of discrimination differ depending on the
level of reputation. We divide our sample into three subsamples: drivers with no reputation
(at most three reviews), with some reputation (between 4 and 49 reviews), and with an
established reputation (50 or more reviews).⁸ Initially, minorities are discriminated against.
However, building a reputation allows them to overcome it. We see this pattern for all our
measures: discrimination disappears entirely in the regressions for the number of views and
taken seats, and is diminished for revenue. Table 3 presents the results with the number
of views as the dependent variable; other regressions are in Appendix C. Being a minority
has a significant impact at the beginning of the career: minority drivers receive 2.6 fewer
views (the mean is 18.5). Crucially, this effect decreases relatively fast and can be entirely
overcome once a reputation is built. Figure 3 illustrates this. It is worth noting that most
of the other driver-specific characteristics remain unchanged; the sign and economic
magnitude of driver's age are similar, and having a picture and a lengthy profile
description does not seem to matter much. Finally, discrimination could also be
gender-specific, and indeed we have shown that male drivers on average achieve lower economic
outcomes. In this case, reputation does not seem to play a crucial role:
the effect is slightly diminished for experienced drivers. However, we cannot draw any
clear-cut conclusions.
⁸ The choice of these thresholds is ad hoc. However, the reputation effect remains under local changes.
Table 3: Reputation effect, number of views as dependent variable
Number of reviews:
                             1:3                              4:49                             50+
driver's age                 −0.099∗∗∗ (0.018)                −0.070∗∗∗ (0.008)                −0.001 (0.013)
talkative                    1.305∗∗∗ (0.454)                 0.479∗∗ (0.210)                  1.063∗∗∗ (0.304)
minority                     −2.585∗∗∗ (0.565)                −1.190∗∗∗ (0.262)                0.032 (0.389)
seniority (# months)         −0.007 (0.010)                   −0.010∗∗ (0.004)                 −0.046∗∗∗ (0.007)
hours until ride             −0.006 (0.007)                   0.0002 (0.004)                   0.0002 (0.006)
posted since                 2.244∗∗∗ (0.183)                 2.776∗∗∗ (0.094)                 2.463∗∗∗ (0.154)
posts per month              −0.269∗∗ (0.136)                 −0.511∗∗∗ (0.055)                −0.755∗∗∗ (0.054)
picture                      2.065 (1.467)                    2.616∗∗∗ (0.770)                 1.393 (1.021)
bio (# words)                0.010 (0.015)                    0.023∗∗∗ (0.006)                 0.025∗∗∗ (0.009)
male                         −1.265∗∗ (0.495)                 −1.235∗∗∗ (0.221)                −0.532 (0.390)
car price                    −0.061 (0.042)                   −0.009 (0.020)                   −0.056∗∗ (0.028)
consumption                  0.416 (0.298)                    0.714∗∗∗ (0.140)                 0.882∗∗∗ (0.207)
competition                  0.014∗ (0.009)                   0.023∗∗∗ (0.004)                 0.022∗∗∗ (0.006)
median revenue (city)        0.0003 (0.001)                   0.002∗∗∗ (0.0005)                0.001∗∗ (0.001)
duration public transport    −2.376 (2.953)                   −0.617 (1.563)                   3.335 (2.815)
km                           0.007 (0.016)                    0.002 (0.009)                    −0.011 (0.013)
hour                         0.157∗∗∗ (0.049)                 −0.003 (0.024)                   0.087∗∗ (0.034)
night                        0.141 (0.505)                    0.865∗∗∗ (0.228)                 1.265∗∗∗ (0.334)
day                          1.548∗∗ (0.785)                  1.125∗∗∗ (0.381)                 −1.378∗∗∗ (0.528)
notice                       −0.656∗∗∗ (0.179)                −0.968∗∗∗ (0.091)                −0.985∗∗∗ (0.152)
automatic acceptance         −1.525∗∗∗ (0.442)                −2.211∗∗∗ (0.199)                −4.186∗∗∗ (0.319)
Constant                     10.639 (12.653)                  −14.894∗∗ (7.033)                −15.627 (11.805)
Observations                 8,684                            42,314                           17,871
R2                           0.259                            0.265                            0.266
Adjusted R2                  0.238                            0.261                            0.256
Residual Std. Error          19.487 (df = 8444)               19.875 (df = 42073)              19.463 (df = 17629)
F Statistic                  12.337∗∗∗ (df = 239; 8444)       63.170∗∗∗ (df = 240; 42073)      26.521∗∗∗ (df = 241; 17629)
Note: Trip fixed effects not reported. ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Figure 3: Coefficient of minority status from regressions in Table 3
3.2 Panel data
To be added.
3.3 Coarsened Exact Matching
This project, like most of the literature, uses non-experimental data to evaluate
the impact of minority status. Hence, estimates of the impact of being a
minority suffer from bias due to selection on unobservables. There is a growing,
mostly theoretical, literature on the use of matching techniques to address this issue.
Rosenbaum and Rubin (1983) and Heckman et al. (1997) demonstrate that this bias
can be greatly reduced by the use of various matching techniques. Some of their properties
are discussed by Abadie and Imbens (2006) and Abadie and Imbens (2016). A similar
methodology has been applied in Sarsons (2017).⁹
The objective of the matching exercise is to test the robustness of the results from the standard
OLS of Section ??. We will firstly estimate propensity scores for each of the observations
and discard those with extreme values. Secondly, we will match
the minority and non-minority subsamples on driver-specific variables. We will perform
both exact matching and coarsened matching. Finally, we will estimate the regression model on the
matched sample, controlling for listing-specific characteristics.
⁹ We use matching software developed by ?
The propensity score is estimated with a logistic regression with minority status as the dependent
variable and the following controls: the price of the car, driver's age, number of posts per month,
picture dummy, length of biography, gender, fuel consumption of the car and whether
the driver is talkative. Minority drivers are more likely to be young, male and to enjoy
conversations. They have on average more expensive cars that consume more fuel; their
profiles are also shorter. More detailed results are in Appendix D. We delete the 5%
smallest and 5% largest propensity scores; in this way we remove observations for which
we are unlikely to find a counterpart.
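The trimming step can be sketched as follows. This is a minimal illustration that takes already estimated propensity scores as given; the function name and the rounding of the 5% share down to a whole number of observations are assumptions, not details reported in the paper.

```python
def trim_extreme_scores(scores, share=0.05):
    """Return the indices of observations kept after dropping the `share`
    smallest and the `share` largest propensity scores.

    `scores` is a list of estimated propensity scores (one per observation).
    The number dropped in each tail is rounded down (an assumption).
    """
    n = len(scores)
    k = int(n * share)  # observations dropped in each tail
    order = sorted(range(n), key=lambda i: scores[i])  # indices by ascending score
    return sorted(order[k:n - k])
```

With 100 observations and the default share, five observations are dropped in each tail and 90 are kept for matching.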
Exact matching is performed on all driver characteristics included in the
logistic regression. In our sample, this means that we have 8,636 minority drivers
matched with 19,039 non-minority drivers. We label as entrants minority drivers
with fewer than ten reviews and as incumbents (experienced users) those with more than
30. From Table 4 we can see that even after the matching procedure, minority entrant
Table 4: Economic outcomes of entrants, exact matching
Dependent variable:
                        Number of views          Revenue                  Taken seats
minority                -1.4211∗∗ (0.002)        -0.8294∗∗ (0.007)        -3.00E-02∗∗ (0.005)
hours until ride        -0.0234∗∗ (0.009)        -0.0025 (0.686)          -9.20E-05 (0.672)
posted since            2.2056∗∗∗ (<2e-16)       0.6555∗∗∗ (9.00E-06)     2.40E-02∗∗∗ (5.00E-06)
competition             0.0461∗∗∗ (8e-10)        0.0029 (0.569)           1.10E-04 (0.559)
day                     0.7465 (0.124)           0.9003∗∗ (0.006)         2.80E-02∗ (0.018)
night                   1.9899∗∗ (0.006)         -1.1733∗ (0.018)         -4.90E-02∗∗ (0.006)
notice                  -0.3299 (0.122)          -0.3203∗ (0.027)         -1.20E-02∗ (0.023)
Matched Observations    9,555
Note: Trip fixed effects not reported. ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
drivers face discrimination. We repeat the same process for drivers with a reputation.
Table 5 reports the results. Again, we observe the reputation effect; minority
status is insignificant for the number of views and seats taken. In the case of revenue, there
is a positive relationship, which could suggest a selection effect.
Coarsened matching is a method to increase the number of matched observations.
We introduce bins within which we match non-binary covariates: age of the driver, the
price of the car, number of posts per month, length of bio and fuel consumption of the car.
The choice of cutoffs influences the precision of the matching procedure as well as the number
Table 5: Economic outcomes of incumbent drivers, exact matching
Dependent variable:
                        Number of views          Revenue                  Taken seats
minority                -0.0200 (0.971)          0.9449∗ (0.035)          0.02679 (0.142)
hours until ride        0.0091 (0.408)           0.0180∗ (0.044)          0.00110∗∗ (0.002)
posted since            3.2570∗∗∗ (<2e-16)       1.5431∗∗∗ (2e-12)        0.06835∗∗∗ (3e-14)
competition             0.0285∗∗ (0.002)         0.0237∗∗ (0.001)         0.00061∗ (0.041)
day                     1.2000∗ (0.041)          0.7038 (0.140)           0.04531∗ (0.020)
night                   0.8739 (0.357)           -1.8608∗ (0.016)         -0.10491∗∗∗ (8e-04)
notice                  -1.1991∗∗∗ (5e-06)       -0.9369∗∗∗ (1e-05)       -0.04648∗∗∗ (9e-08)
Matched Observations    5,316
Note: Trip fixed effects not reported. ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
of matched observations; we match within a quantile for each of the variables. In this
way, we match 14,146 minority drivers with 45,959 non-minority ones, which is almost
a twofold increase. We present only the coefficient on minority (Table 6). In the coarsely
Table 6: Economic outcomes of entrants and incumbents, coarsened matching
Dependent variable:
                                      Number of views       Revenue               Taken seats
minority (entrant)                    -0.9869∗∗ (0.005)     -0.52608∗ (0.029)     -0.02070∗ (0.016)
minority (incumbent)                  0.2507 (0.459)        0.2327 (0.390)        0.01684 (0.134)
Matched Observations (both models)    60,105
∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
matched sample, we also see a clear reputation effect. Minority entrants have lower
economic outcomes; however, after they build a reputation the effect disappears entirely.
These results depend on the cut-offs for labeling drivers as entrants or incumbents, as well as on the
selection of bins for coarsened matching; they are, however, robust to local changes.
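The coarsened matching idea can be sketched in a few lines. This is an illustration under stated assumptions, not the matching software used in the paper: continuous covariates are binned at empirical quantiles (the number of bins, `n_bins`, is a free parameter that the text does not report), binary covariates are matched exactly, and only strata containing both minority and non-minority drivers are kept. All function and field names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def quantile_cutoffs(values, n_bins=4):
    """Interior cut points splitting `values` into `n_bins` groups of roughly equal size."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * q) // n_bins] for q in range(1, n_bins)]


def coarsened_match(rows, coarsen, exact, treatment="minority", n_bins=4):
    """Keep rows that fall in strata containing both treatment groups.

    rows: list of dicts, one per driver; `treatment` must be coded 0/1.
    coarsen: names of continuous covariates to bin at quantiles.
    exact: names of covariates that must match exactly.
    """
    cutoffs = {c: quantile_cutoffs([r[c] for r in rows], n_bins) for c in coarsen}
    strata = defaultdict(list)
    for r in rows:
        # Stratum key: bin index of each coarsened covariate plus exact values.
        key = tuple(bisect_right(cutoffs[c], r[c]) for c in coarsen)
        key += tuple(r[c] for c in exact)
        strata[key].append(r)
    matched = []
    for members in strata.values():
        if {m[treatment] for m in members} == {0, 1}:
            matched.extend(members)
    return matched
```

Wider bins (a smaller `n_bins`) match more observations at the cost of comparing less similar drivers, which is the trade-off between precision and sample size described above.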
We have provided evidence of a reputation effect on all available measures of economic
outcomes: number of views, taken seats and revenue. Minority drivers face discrimination
at the beginning of their career; this effect, however, disappears as soon as they
develop a reputation. We have implemented a matching technique that makes minority
drivers more comparable with non-minorities. Still, the reputation effect remains visible.
4 Driver's Incentive Problems - very preliminary
We have seen that reviews are useful for a passenger in learning about the quality of a
driver. Moreover, we have provided evidence of updating of beliefs about expected quality.
Good reviews are valuable for drivers and informative for passengers. In this section,
we expand on this observation by adapting a moral hazard model. In a seminal paper,
Holmström (1999) studies the incentive problem of a manager in a moral hazard setting,
where the type of the manager is known neither to the market nor to the manager herself.
The manager exerts costly effort, which affects the outcome. We believe that
this set-up, with incomplete information on both sides, is a good representation of interactions on
Blablacar.
Let $\eta$ be a measure of the driver's quality; this can be seen as representing her driving skills
or how "likable" she is. $\eta$ is incompletely known to both the driver and potential
passengers; however, the driver's minority status is observed. Passengers and drivers know the
underlying distribution of $\eta$; specifically, we will assume that $\eta$ is distributed normally
with mean $m_m$ and precision (inverse of the variance) $h_m$ for minority drivers and $m_{nm}$,
$h_{nm}$ for non-minorities. With each interaction, a passenger leaves the driver a review
that is a measure of the quality of the service provided by the driver. In period $t$, this
quality is given by:
$$y_t = \eta + a_t + \varepsilon_t, \quad t = 1, 2, \dots$$
where $a_t \in [0, \infty)$ is the driver's effort and $\varepsilon_t$ is a stochastic noise term that we assume
to be normally distributed with zero mean and precision $h_\varepsilon$. $\varepsilon_t$ represents the error
in the rating, due either to approximate perception by the riders or to imprecision of the
reputation system itself. Drivers are assumed to be risk neutral with preferences given
by an additively time-separable utility function:
$$U(c, a) = \sum_{t=1}^{\infty} \beta^{t-1} \left[ c_t - g(a_t) \right] \tag{1}$$
The cost of exerting effort $g(\cdot)$ is, in general, assumed to be increasing and convex; the utility
function is publicly known. Consumption $c_t$ is assumed to be an increasing function of
expected quality. The solution to (1) is given by:¹⁰
$$\gamma_t \equiv \sum_{s=t}^{\infty} \beta^{s-t} \alpha_s = g'(a_t^*)$$
where $\alpha_t = h_\varepsilon / h_t$, $h_t = h_{m/nm} + t h_\varepsilon$, and
$a_t^*(y^{t-1})$ represents the equilibrium decision rule.
Since $\alpha_t$ declines to zero, $\gamma_t$ is a declining sequence that converges to zero, and so effort is a
declining sequence that asymptotically converges to zero. We can interpret this by noting
that in this set-up ability and effort are substitutes. As long as ability is unknown,
there are returns to supplying effort: output today raises the perceived ability
and increases the expected wage in the future. Importantly, the returns to
effort are larger the more uncertainty there is about ability. An entrant driver's reviews
are crucial in learning about her ability. Eventually, when the driver becomes an incumbent,
$\eta$ is revealed almost completely, and new observations have little impact on
beliefs. For very experienced drivers, there are no returns to trying to influence output
and labour supply goes to zero.
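The learning mechanism behind the declining weights can be made explicit with the standard formula for updating a normal prior with normal noise. The derivation below is a sketch that is not spelled out in the text: the symbol $z_t$ (the review net of equilibrium effort) and the notation $m_t$ for the posterior mean after $t$ reviews are introduced here for illustration, with $m_0$ and $h_0$ equal to the group-specific prior mean and precision.

```latex
% Effort-adjusted review: z_t = y_t - a_t^* = \eta + \varepsilon_t
\begin{align*}
h_t &= h_{t-1} + h_\varepsilon = h_{m/nm} + t\, h_\varepsilon, \\
m_t &= \frac{h_{t-1}\, m_{t-1} + h_\varepsilon\, z_t}{h_t}
     = (1 - \alpha_t)\, m_{t-1} + \alpha_t\, z_t,
\qquad \alpha_t = \frac{h_\varepsilon}{h_t}.
\end{align*}
```

Because $h_t$ grows linearly in $t$, the weight $\alpha_t$ placed on the newest review shrinks toward zero, which is why an additional review moves beliefs about an incumbent much less than beliefs about an entrant.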
We expect grades received by drivers to be relatively high at the beginning, when the
returns from providing effort are the highest, and to gradually decline before stabilizing
at the level corresponding to $\eta$. Figure 4 documents this using our dataset.
Figure 4: Distributions of grades at different levels of experience
We document changes in the distribution of grades. We restrict the sample to only
¹⁰ See Holmström (1999) for details.
those drivers for whom we have at least 25 reviews. Therefore these are reviews of the
same drivers at different stages of their career. First, we present the distribution of grades
from the first to the fifth review, then from the 6th to the 10th, and so forth. The vertical line is
the mean of the last distribution. We observe higher grades, on average, at
the beginning of the career and stabilization at the later stage. This is consistent with
the intuition provided in Holmström (1999).
This allows us to calculate moments of the distributions of types and errors¹¹. Thus,
$\eta_m \sim N(4.47, 0.15)$ and $\eta_{nm} \sim N(4.58, 0.12)$, $h_\varepsilon = 0.37$. We
assume that $g(a_t) = \frac{\mu}{2} a_t^2$. Now we can nonlinearly estimate the cost of
effort and the discount factor:
\[ a_t^* = \frac{1}{\mu} \left[ \sum_{s=t}^{\infty} \beta^{s-t} \alpha_s \right] \tag{2} \]
This estimation allows us to calculate the expected output.¹²
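To make the shape of the effort path in equation (2) concrete, the following Python sketch evaluates it by truncating the infinite sum. It is an illustration only, not the estimation procedure: the function name `effort_path` is hypothetical, the discount factor and cost parameter are placeholder values rather than estimates, and the prior precision is set to 1/0.15 on the assumption that 0.15 is a variance.

```python
def effort_path(h_prior, h_eps, beta, mu, periods, horizon=2000):
    """Effort a*_t = (1/mu) * sum_{s>=t} beta**(s-t) * alpha_s, with the sum truncated."""
    def alpha(s):
        # alpha_s = h_eps / h_s, where h_s = h_prior + s * h_eps as in the text
        return h_eps / (h_prior + s * h_eps)

    path = []
    for t in range(1, periods + 1):
        gamma_t = sum(beta ** (s - t) * alpha(s) for s in range(t, t + horizon))
        path.append(gamma_t / mu)
    return path


# h_eps = 0.37 comes from the text; h_prior, beta and mu are placeholders (assumptions).
efforts = effort_path(h_prior=1 / 0.15, h_eps=0.37, beta=0.9, mu=1.0, periods=25)
for t in (1, 5, 10, 25):
    print(f"t={t:2d}  effort={efforts[t - 1]:.4f}")
```

Under these assumptions the printed efforts fall with experience, which is the pattern the model predicts for grades.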
5 Reputation system: a barrier to or a mode of entry?
In this section, we propose an alternative model of moral hazard, in order to shed
further light on the effort choice of an entrant driver. It is a static version of a
model where a principal (a passenger in the context of a ride-sharing platform)
interacts with two agents (drivers): an incumbent who has used the service $n$ times
before and developed some reputation, and an entrant with no reputation yet. We
examine the impact of a reputation system on the principal's choice to interact with
one agent rather than the other. The reputation built over past interactions helps
the principal figure out whether she wants to interact with an experienced agent or hire
a new one with no reputation yet.¹³
¹¹ These results should be seen as an illustration; they will change as we fine-tune the model.
¹² Linking the optimal effort with expected output and using it as a proxy for the quality of
the driver is the next stage; it is currently under development.
¹³ Extending the model to a dynamic version of the game, in which agents internalize the impact
of their actions on their reputation, is our next step; we are currently working on it.
Each of the past $n$ interactions corresponds to a ride for which a principal needed an
agent. The agent's type has two dimensions: a private one, $\beta$, and a public one,
$\varepsilon$. In the context of ride-sharing, we think of $\varepsilon$ as belonging to a
minority or not, and of $\beta$ as the quality of the experience the driver actually
provides. The principal does not observe the exact type of the agent but has a prior on
the distribution of types. For a given
agent, the distribution depends on an observed attribute $\varepsilon$. The cumulative
distribution function of types in a given population $\varepsilon$ is denoted
$F_\varepsilon(\cdot)$. Hence, when a principal is matched with an agent, she observes
$\varepsilon$, the "population" from which the agent is selected, which is itself drawn
randomly from the CDF $G(\varepsilon)$. The principal decides whom to hire: she can keep
the incumbent (whose population we denote $\varepsilon_{old}$), or fire her and hire the
new agent ($\varepsilon_{new}$). She makes this decision based on her beliefs about the
current and new drivers' types and propensity to exert effort, and on her own ability to
elicit effort given the information at her disposal. The model is an extension of
Laffont and Tirole (1986), obtained by adding the incumbent. Alternatively, this game can
be seen as a repeated interaction in which users are myopic. Although users are
short-sighted, the platform has a degree of memory about previous interactions, conveyed
by a (potentially imperfect) reputation system. Hence, any information revealed and
recorded by the principal will be fully exploited at later stages of the game. The
principal designs a pricing and effort scheme, restricting attention to fully revealing
direct mechanisms. At each period $j$, agent $i$ is asked to report her type
$\hat\beta_i$. The agent is then paid a transfer $s_i$ and given an effort recommendation
$e_i$. Both the transfer and the effort depend on the reported type $\hat\beta_i$, the
population $\varepsilon_i$ of the agent and the beliefs of the principal, described below.
The principal enjoys a gross benefit of $\pi_i = \beta_i + e_i$, meaning a high-type agent
provides high benefits. The principal also wants to encourage effort from the agent, which
imposes a private cost on the agent. The principal observes $\pi_i$, but not $\beta_i$ and
$e_i$ separately, giving the agent the opportunity to pretend she is a low type in order
to save on effort. The principal aims at maximizing:
\[ \beta_i + e_i - s_i \]
The agent bears a cost of effort $\psi(e_i)$. Her outside option is 0. Her payoff if she
accepts the offer of the principal is therefore:
\[ u_i = s_i - \psi(e_i) \]
A key addition to standard models is that the principal only has imperfect recall of
previous disclosures. Assuming agent $i$ revealed herself to be of type $\beta_i$ over the
last $n$ periods, the principal believes the type of the agent is:
\[ \tilde\beta_{i,n+1} = (1 - \alpha_n)\beta_i + \alpha_n v_\varepsilon \tag{3} \]
where $v_\varepsilon$ is randomly drawn from the distribution $F_\varepsilon(\cdot)$, which
corresponds to the prior distribution of types in population $\varepsilon$, absent further
information. From the perspective of the principal and the agents, $\alpha \in [0, 1]$ is
exogenous¹⁴ and represents how well the principal remembers the information extracted in
previous periods. $\alpha = 1$ means she remembers nothing and starts from the same prior
each period. $\alpha = 0$ means there is perfect recall: once the type of the agent is
disclosed, the information is fully exploited, leaving the agent with no rent. At
intermediate levels of $\alpha$ the principal has an idea of the type previously disclosed,
but still relies on her prior to some extent. Another interpretation of this belief
formation is that the principal distrusts the information previously collected and still
clings to her prior. We could describe this behavior as a form of persistence in prejudice.
In practice, $\alpha$ represents the strength of the reputation system, which transmits
private information revealed in previous stages to subsequent stages. The smaller $\alpha$,
the more precise the reputation system. Each period, the principal forms her belief on
agent types and sets a menu of payments. We focus on revealing mechanisms that ensure full
participation. In more detail, the timing within each period is:
• Step 1: the principal establishes her prior on the current driver (equation 3)
• Step 2: the principal is matched with a new driver from a population drawn from $G(\varepsilon)$
• Step 3: the principal chooses whether to retain her current agent, or fire her and hire
the new one
• Step 4: the agent accepts or rejects participation, and reports her type $\hat\beta_i$
• Step 5: the mechanism prescribes a menu of payments; the agent exerts effort $e$ accordingly
• Step 6: the agent and the principal agree on a bad outcome if observed benefits do not equal
$\pi(\hat\beta_i) = \hat\beta_i + e(\hat\beta_i)$
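The belief rule in equation (3), used in Step 1, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the prior draw assumes that types are uniform on $[\varepsilon, 1 + \varepsilon]$, which is the specification the paper adopts later in assumption A3.

```python
import random


def principal_belief(beta_i, alpha_n, v_eps):
    """Equation (3): weight 1 - alpha_n on the disclosed type, alpha_n on a prior draw."""
    if not 0.0 <= alpha_n <= 1.0:
        raise ValueError("alpha_n must lie in [0, 1]")
    return (1.0 - alpha_n) * beta_i + alpha_n * v_eps


def draw_from_prior(eps, rng):
    """Prior draw v_eps, assuming types are uniform on [eps, 1 + eps]."""
    return eps + rng.random()


rng = random.Random(0)          # fixed seed so the illustration is reproducible
beta_i, eps = 0.8, 0.0          # illustrative disclosed type and population
for alpha_n in (0.0, 0.5, 1.0):
    belief = principal_belief(beta_i, alpha_n, draw_from_prior(eps, rng))
    print(f"alpha_n={alpha_n:.1f}  belief={belief:.3f}")
```

With `alpha_n = 0` the belief equals the disclosed type (perfect recall); with `alpha_n = 1` it is a fresh draw from the prior.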
5.1 Hiring and firing decisions
Expected benefit of a new driver In this subsection we review some features of
standard incentive compatible contracts, and study how these contracts change
when agents come from different populations. Whenever no confusion is possible, we
drop the subscript $i$ for the sake of concision.
¹⁴ Introducing a platform that sets $\alpha$ is an extension that is currently in progress.
The principal is matched with a driver of population $\varepsilon$. The pricing scheme
ensures full participation. We focus on direct mechanisms. Assuming the agent reports type
$\hat\beta$, she then has to exert the effort that replicates the total benefit
$\pi(\hat\beta)$ the principal expects to observe (otherwise the agent incurs a large
penalty). Therefore, she has to choose
\[ e(\beta, \hat\beta) = \pi(\hat\beta) - \beta \]
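Her payoff if her innate type is $\beta$ and she reports $\hat\beta$ is:
\[ u(\beta, \hat\beta) = s_i(\hat\beta) - \psi(\pi(\hat\beta) - \beta) \tag{4} \]
\[ \frac{\partial u}{\partial \beta}(\beta, \hat\beta) = \psi'(\pi(\hat\beta) - \beta) \]
Let $U(\beta)$ be the agent's payoff with truthful reporting. The envelope theorem yields:
\[ U(\beta) = U(\underline{\beta}_\varepsilon) + \int_{\underline{\beta}_\varepsilon}^{\beta} \psi'(e(s))\, ds \tag{5} \]
To minimize the rent left to the agent, the principal chooses a menu such that
$U(\underline{\beta}_\varepsilon) = 0$. Integrating by parts, we obtain:
\[ E(U(\beta)) = \int_{\underline{\beta}_\varepsilon}^{\bar{\beta}_\varepsilon} \frac{1 - F_\varepsilon(s)}{f_\varepsilon(s)}\, \psi'(e(s))\, f_\varepsilon(s)\, ds \tag{6} \]
The expected payoff of the principal is:
\begin{align}
E_{new}(\Pi) &= \int_{\underline{\beta}_\varepsilon}^{\bar{\beta}_\varepsilon} \left( \pi(s) - s_i(s) \right) f_\varepsilon(s)\, ds \tag{7} \\
&= \int_{\underline{\beta}_\varepsilon}^{\bar{\beta}_\varepsilon} \left( s + e(s) - \frac{1 - F_\varepsilon(s)}{f_\varepsilon(s)}\, \psi'(e(s)) - \psi(e(s)) \right) f_\varepsilon(s)\, ds \tag{8}
\end{align}
Maximizing over the effort recommendation yields:
\[ \psi'(e(\beta)) = 1 - \frac{1 - F_\varepsilon(\beta)}{f_\varepsilon(\beta)}\, \psi''(e(\beta)) \tag{9} \]
This means high types are required to exert high effort, while the effort of low types is
distorted downwards. This is a classical result of the theory of incentives.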
Assumption 1. For ease of exposition, we make the following assumptions:
A1 The difference in the distribution of types between populations, modeled by changes in
$\varepsilon$, corresponds to a shift of densities from left to right, meaning
$F_\varepsilon(\beta) = F_0(\beta - \varepsilon)$
A2 $\psi(e) = \frac{e^2}{2}$
A3 $F_\varepsilon(\cdot)$ is the uniform distribution over $[\varepsilon, 1 + \varepsilon]$
Assumption A1 means that the higher $\varepsilon$, the higher the expected type of an agent.
This has an impact on the effort recommendation:
\[ \psi'(e_\varepsilon(\beta)) = 1 - \frac{1 - F_0(\beta - \varepsilon)}{f_0(\beta - \varepsilon)}\, \psi''(e_\varepsilon(\beta)) \tag{10} \]
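As a worked illustration, assumptions A2 and A3 reduce equation (10) to a closed form for effort. The steps below are a sketch of that substitution; the result is the same expression, $e(s) = s - \varepsilon$, that Appendix E.1 uses for a new agent.

```latex
\begin{align*}
\psi'(e) &= e, \qquad \psi''(e) = 1
  && \text{(from A2)} \\
F_\varepsilon(\beta) &= \beta - \varepsilon, \qquad f_\varepsilon(\beta) = 1
  \quad \text{for } \beta \in [\varepsilon, 1 + \varepsilon]
  && \text{(from A3)} \\
e_\varepsilon(\beta) &= 1 - \bigl( 1 - (\beta - \varepsilon) \bigr) = \beta - \varepsilon
  && \text{(substituting into (10))}
\end{align*}
```

The top type $\beta = 1 + \varepsilon$ exerts effort 1, the undistorted level at which $\psi'(e) = 1$, while the lowest type $\beta = \varepsilon$ exerts none.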
Lemma 1. Under Assumption 1, take an agent of a given type $\beta$, with no reputation.
As the agent's prior signal $\varepsilon$ decreases:
1. the required effort increases
2. the surplus of the agent increases
3. the surplus of the principal derived from the agent increases
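Proof. Impact of $\varepsilon$ on effort: Define:
\[ h(e, \varepsilon) = -\psi'(e_\varepsilon(\beta)) + 1 - \frac{1 - F_\varepsilon(\beta)}{f_\varepsilon(\beta)}\, \psi''(e_\varepsilon(\beta)) \]
We use the implicit function theorem on $h(e, \varepsilon)$:
\[ \frac{\partial e^*(\varepsilon)}{\partial \varepsilon} = - \frac{1 + \frac{1 - F_\varepsilon(\beta)}{f^2(\varepsilon, \beta)} \frac{\partial f(\varepsilon, \beta)}{\partial \beta}}{\psi''(e) + \frac{1 - F_\varepsilon(\beta)}{f(\varepsilon, \beta)}\, \psi'''(e)} = -1 < 0 \]
Impact of $\varepsilon$ on agent surplus: The surplus of type $\beta$ from population $\varepsilon$ is:
\[ u(\beta, \varepsilon) = \int_{\underline{\beta}_\varepsilon}^{\beta} \psi'(e(s, \varepsilon))\, f(s, \varepsilon)\, ds = \int_{\varepsilon}^{\beta} (s - \varepsilon)\, ds = \frac{(\beta - \varepsilon)^2}{2} \quad \Rightarrow \quad \frac{\partial u(\beta, \varepsilon)}{\partial \varepsilon} < 0 \]
Impact of $\varepsilon$ on principal surplus:
\begin{align*}
\Pi(\beta, \varepsilon) &= \beta + e(\beta, \varepsilon) - \frac{1 - F_\varepsilon(\beta)}{f_\varepsilon(\beta)}\, \psi'(e(\beta, \varepsilon)) - \psi(e(\beta, \varepsilon)) \\
&= \beta + \beta - \varepsilon - (1 - \beta + \varepsilon)(\beta - \varepsilon) - \frac{(\beta - \varepsilon)^2}{2} \\
\Rightarrow \frac{\partial \Pi(\beta, \varepsilon)}{\partial \varepsilon} &= \varepsilon - \beta < 0
\end{align*}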
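The three comparative statics of Lemma 1 can be checked numerically. The sketch below is an illustration under assumptions A2 and A3 (quadratic cost, uniform types), with hypothetical function names and arbitrary values of $\beta$ and $\varepsilon$; it uses a finite difference rather than the analytical derivatives of the proof.

```python
def effort(beta, eps):
    """e_eps(beta) = beta - eps under A2-A3 (equation 10 with uniform types)."""
    return beta - eps


def agent_surplus(beta, eps):
    """u(beta, eps) = (beta - eps)^2 / 2."""
    return (beta - eps) ** 2 / 2


def principal_surplus(beta, eps):
    """Pi = beta + e - (1 - F)/f * psi'(e) - psi(e), with F = beta - eps and f = 1."""
    e = effort(beta, eps)
    hazard = 1 - (beta - eps)  # (1 - F_eps(beta)) / f_eps(beta)
    return beta + e - hazard * e - e ** 2 / 2


def slope(fn, beta, eps, step=1e-6):
    """Central finite difference of fn with respect to eps."""
    return (fn(beta, eps + step) - fn(beta, eps - step)) / (2 * step)


beta, eps = 0.9, 0.2  # illustrative values with eps < beta < 1 + eps
for name, fn in [("effort", effort), ("agent surplus", agent_surplus),
                 ("principal surplus", principal_surplus)]:
    print(f"d({name})/d(eps) = {slope(fn, beta, eps):.4f}")
```

All three slopes are negative at these values, so a lower public signal raises effort, agent surplus and principal surplus, as the lemma states.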
Albeit intuitive, Lemma 1 has significant implications for the principal's choice of
an agent. Absent any information other than the prior $F_\varepsilon(\cdot)$, a principal
will choose an agent from the highest-$\varepsilon$ population whenever she has the
choice. Agents from a relatively low-$\varepsilon$ population may thus be stuck in a
"reputation trap", whereby they are excluded from interaction even though they may have a
higher type than the expected type of members of a higher-signal population. This happens
even though, if selected, they would exert more effort than their counterparts from a
higher population (item 1 in Lemma 1) and would provide more surplus to the principal
(item 3 in Lemma 1). A reputation system is therefore necessary to convey information from
one period to the next and to avoid the reputation trap. As the next section will show,
the quality of the reputation system is vital to eliminating this issue and restoring
efficiency.
Expected benefit of retaining the incumbent The benefit of trading with the
incumbent lies in the reduced uncertainty due to her previous messages conveyed by
the reputation system. Assume agent $i$ is of true type $\beta_i$. The population of the
incumbent is indexed by $\varepsilon_{old}$. The principal forms her beliefs according to
equation (3). This means the principal may doubt that the report of the agent is accurate,
and thinks her type is drawn from a distribution $\tilde F_{\alpha,n,\varepsilon}(\cdot)$
such that:
\[ \tilde F_{\alpha,n,\varepsilon}(x) = F_0\left( \frac{x - (1 - \alpha_n)\beta_i}{\alpha_n} - \varepsilon \right) \tag{11} \]
We solve for the optimal incentive compatible contract for an agent whose type is
distributed according to (11). To do so, we follow steps analogous to the case of a new
driver analyzed above.
We find that the expected surplus of the principal is:
\[ E_{old}(\Pi) = E_{n,\beta_i}\left[ (1 - \alpha_n)\beta_i + \alpha_n v_\varepsilon + e_{n,\beta_i}(s) - \frac{1 - F_{\alpha,n,\varepsilon}(s)}{f_{\alpha,n,\varepsilon}(s)}\, \psi'(e_{n,\beta_i}(s)) - \psi(e_{n,\beta_i}(s)) \right] \tag{12} \]
Maximizing with respect to $e_{n,\beta_i}(s)$, we find that $e_{n,\beta_i}(s)$ is defined by:
\[ \psi'(e_{n,\beta_i}(s)) = 1 - \frac{1 - F_{\alpha,n,\varepsilon}(s)}{f_{\alpha,n,\varepsilon}(s)}\, \psi''(e_{n,\beta_i}(s)) \tag{13} \]
There are two important differences between retaining the incumbent (12) and hiring
an entrant (8). First, the principal has a more precise prior on the agent's expected
type (first two terms in 12). Second, the effort schedule is modified, again owing to a
better appreciation of the experienced agent's type. This is reflected both in the effort
recommendation and in the rent left to agents.
Principal choices with a reputation system When choosing whether to retain or fire a
driver, the principal compares her expected surplus from interacting with each of
them, given in Lemma 2. For ease of exposition, this lemma uses Assumption 1.
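Lemma 2. The expected benefit from hiring a new agent is:
\[ E_{new}(\Pi) = \frac{2}{3} + \varepsilon_{new} \tag{14} \]
and the expected benefit of trading with the incumbent is:
\[ E_{old}(\Pi) = (1 - \alpha_n)\beta_i + \alpha_n \varepsilon_{old} + \frac{1}{2} + \frac{\alpha_n^2}{6} \tag{15} \]
Proof. See Appendix E.1.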
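The closed forms in Lemma 2 can be cross-checked against the integral from which they are derived. The Python sketch below is an illustration with hypothetical function names and arbitrary parameter values; it compares equation (15) with a midpoint-rule evaluation of the integral in Appendix E.1, under assumptions A2 and A3.

```python
def expected_new(eps_new):
    """Equation (14): expected benefit from hiring a new agent."""
    return 2.0 / 3.0 + eps_new


def expected_old(beta_i, alpha_n, eps_old):
    """Equation (15): expected benefit of trading with the incumbent."""
    return (1 - alpha_n) * beta_i + alpha_n * eps_old + 0.5 + alpha_n ** 2 / 6


def expected_old_numeric(beta_i, alpha_n, eps_old, steps=20000):
    """Midpoint-rule version of the Appendix E.1 integral (requires alpha_n > 0)."""
    lo = (1 - alpha_n) * beta_i + alpha_n * eps_old   # lower end of the belief support
    width = alpha_n / steps                           # the support has length alpha_n
    total = 0.0
    for k in range(steps):
        s = lo + (k + 0.5) * width
        e = (1 - alpha_n) * (1 - beta_i) + s - eps_old * alpha_n  # recommended effort
        total += 0.5 * e ** 2 * (1 / alpha_n) * width             # e^2 / 2 times density
    return (1 - alpha_n) * beta_i + alpha_n * (0.5 + eps_old) + total


beta_i, alpha_n, eps_old = 0.6, 0.4, 0.1  # illustrative values
print(expected_old(beta_i, alpha_n, eps_old))
print(expected_old_numeric(beta_i, alpha_n, eps_old))
```

Setting `alpha_n = 1` in `expected_old` gives $2/3 + \varepsilon_{old}$, the same form as equation (14): with no memory, the incumbent is valued like a new agent from her population.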
The principal's hiring decisions can be illustrated with two examples.
First, let us compare agents that have traded exactly the same number of times
$n$. In that case we find from equation (15) that the principal will hire agent 1 instead
of agent 2 when
\[ \beta_1 \geq \beta_2 + \frac{\alpha_n}{1 - \alpha_n} (\varepsilon_2 - \varepsilon_1) \tag{16} \]
The relative importance of the revealed (true) type $\beta_i$ compared to the public signal
$\varepsilon_i$ decreases in $\alpha$: when comparing users that have been interacting
before, the principal prefers a very precise reputation system, so as to rely
on the real type instead of public signals and reach the optimal decision rule.
Second, let us compare the case where the principal needs to choose between
an experienced ("old") agent with $n > 0$ recorded interactions and a new one with
no reputation. The principal will retain the incumbent if and only if the expected
benefit from continued interaction with her, $E_{old}(\Pi)$, is greater than the expected
benefit from hiring a new agent, $E_{new}(\Pi)$. In this case, the principal uses both the
information revealed by the incumbent and the comparison of the expected types of the two
populations. For ease of exposition, we use Assumption 1 again. With these
assumptions, we obtain Proposition 1:
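Proposition 1. The old agent is retained if and only if
\[ E_{old}(\Pi) > E_{new}(\Pi) \iff \beta_{old} > \bar\beta \equiv \frac{\varepsilon_{new} - \alpha_n \varepsilon_{old}}{1 - \alpha_n} + \frac{1}{6} + \frac{\alpha_n}{6} \tag{17} \]
Proof. Follows from a comparison of (14) and (15).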
Several conclusions stem from Proposition 1. First, if $\alpha_n$ is close to 0 (precise
reputation system), the retention policy becomes efficient, as the principal retains the
agent if and only if the surplus she obtains from continuation exceeds the expected
surplus from interacting with a new agent. If $\alpha_n$ is close to 1 (i.e., the
principal has a poor memory, or the reputation system is weak), then condition (17) is met
if and only if $\varepsilon_{old} > \varepsilon_{new}$: the principal chooses whoever has
the highest public signal. In that case, agents with a low public signal are caught in a
"reputation trap". This situation is more likely to occur at early stages (small $n$) of
the relationship between the principal and the agent (since
$\partial \bar\beta / \partial \alpha_n > 0$). To study the welfare implications of such a
reputation system, we introduce a principal with perfect recall who internalizes agent
surplus (in other words, a benevolent social planner). Such a principal would be able to
elicit optimal effort from all agents she is matched with. In that case, the retention
policy is simpler:
\[ \bar\beta_{SP} = \frac{1}{2} + \varepsilon_{new} \tag{18} \]
The social planner keeps the agent if and only if her observed type is greater than the
expected type of the new population. We can then write the following lemma.
Lemma 3. Assume the principal is matched with a new agent from a higher population
than the incumbent agent. There exists $\alpha_{lim}$ such that if $\alpha < \alpha_{lim}$,
there is excess retention of the agent compared to the policy of a social planner. If
$\alpha > \alpha_{lim}$, there is excess exclusion of the agent.
Proof. There is excess retention if and only if
\[ \bar\beta < \bar\beta_{SP} \iff w(\alpha_n) \equiv \frac{\varepsilon_{new} - \alpha_n \varepsilon_{old}}{1 - \alpha_n} + \frac{1}{6} + \frac{\alpha_n}{6} - \frac{1}{2} - \varepsilon_{new} < 0 \]
$w(0) < 0$, $w(x) \to +\infty$ as $x \to 1$, and $\frac{\partial w}{\partial \alpha_n}(\alpha_n) > 0$.
The lemma follows from the intermediate value theorem.
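The argument in the proof of Lemma 3 can be illustrated numerically. The sketch below is a minimal example, not part of the paper's analysis: it codes the cutoffs (17) and (18) and locates $\alpha_{lim}$ by bisection, which is valid because $w$ is increasing and changes sign on $[0, 1)$ when $\varepsilon_{new} > \varepsilon_{old}$. Function names and parameter values are assumptions.

```python
def retention_cutoff(alpha_n, eps_new, eps_old):
    """Equation (17): the incumbent is retained iff beta_old exceeds this value."""
    if not 0.0 <= alpha_n < 1.0:
        raise ValueError("alpha_n must lie in [0, 1)")
    return (eps_new - alpha_n * eps_old) / (1 - alpha_n) + 1 / 6 + alpha_n / 6


def planner_cutoff(eps_new):
    """Equation (18): the social planner's cutoff."""
    return 0.5 + eps_new


def w(alpha_n, eps_new, eps_old):
    """Gap between the two cutoffs; negative values mean excess retention."""
    return retention_cutoff(alpha_n, eps_new, eps_old) - planner_cutoff(eps_new)


def alpha_lim(eps_new, eps_old, tol=1e-10):
    """Bisection for the root of w on [0, 1), assuming eps_new > eps_old."""
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if w(mid, eps_new, eps_old) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


eps_new, eps_old = 0.2, 0.0  # illustrative populations
print(alpha_lim(eps_new, eps_old))
```

Below the printed threshold the principal's cutoff lies under the planner's (excess retention); above it the cutoff is higher (excess exclusion).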
Figure 5 summarizes the findings of this section. It shows the retention policy per
type $\beta$, as a function of reputation system memory $\alpha_n$, when the incumbent is
from population $\varepsilon_{old} = 0$ and the principal is matched with a new agent from
population $\varepsilon_{new} > 0$. We see that if the principal distrusts past
observations or has a bad memory ($\alpha_n$ large), there is excess exclusion of the
incumbent agent. It may go as far as excluding all incumbent agents, regardless of their
type. On the contrary, if the memory is too good, the principal may engage in excess
retention: even though the incumbent may be of a relatively low type, the principal will
keep her. This is because the information revealed through the reputation system allows
the principal to reduce the necessary rent. Closed-form solutions for the threshold values
of $\alpha$ can be found in Appendix E.2.
From a social welfare perspective, it may therefore be desirable that the reputation
system has imperfect recall, to limit entrenchment of the incumbent (which arises because
asymmetric information is revealed during previous interactions). There should
be some recall though, to avoid populations being confined to a reputation trap.
Taking only the static game into account, the principal always finds it optimal to
have as good a memory as possible. A principal should therefore strive to have $\alpha$ as
low as possible. This result is unlikely to hold if she takes into account the following
two effects: (1) the effect of reputation on incentives to make long-lasting investments
in quality; (2) the effect of reputation on the entry of new agents, whose benefits
will be reaped in future periods. The analysis of the effect of future market expansion on
a platform's inclination to keep a precise history of previous transactions is currently
in progress.
Figure 5: Retention policy per type $\beta$, as a function of reputation system memory $\alpha_n$
6 Conclusion
While it has been documented on a number of online marketplaces that minority users
face discrimination, the role of reputation systems in overcoming it has been less studied.
Our empirical analysis uses unique data on listings on a popular online carpooling
platform. We show that minority users do face discrimination: their listings
are less popular, they sell fewer seats and they earn lower revenue. However, this effect
is concentrated in the first interactions on the platform. Minority drivers overcome
discrimination by building a reputation. A minority driver with several reviews achieves
economic outcomes similar to those of a non-minority one. We show this result using
cross-sectional data, as well as with coarsened matching.
Passengers are willing to change their minds about minority drivers as soon as they
see reviews. This observation highlights the importance of a well-designed reputation
system in inducing entry and market expansion. Early stages of reputation building are
particularly important. We are currently working on applying the seminal moral hazard
models of Holmström (1999) and of Laffont and Tirole (1988) to gain further
insights into the role played by online reviews.
As signaled before, this is an ongoing research project, and the results presented in the
paper will change in the future.
References
Abadie, A. and Imbens, G. W. (2006). Large sample properties of matching estimators
for average treatment effects. Econometrica, 74(1):235–267.
Abadie, A. and Imbens, G. W. (2016). Matching on the estimated propensity score.
Econometrica, 84(2):781–807.
Altonji, J. G. and Pierret, C. R. (2001). Employer Learning and Statistical
Discrimination. The Quarterly Journal of Economics, 116(1):313–350.
Arrow, K. (1973). The Theory of Discrimination.
Bar-Isaac, H. and Tadelis, S. (2008). Seller Reputation. Foundations and Trends® in
Microeconomics, 4(4):273–351.
Baron, D. P. and Myerson, R. B. (1982). Regulating a monopolist with unknown costs.
Econometrica: Journal of the Econometric Society, 50(4):911–930.
Bolton, G. E., Katok, E., and Ockenfels, A. (2004). How Effective Are Electronic
Reputation Mechanisms? An Experimental Investigation. Management Science,
50(11):1587–1602.
Cabral, L. and Hortaçsu, A. (2010). The dynamics of seller reputation:
Evidence from eBay. The Journal of Industrial Economics, LVIII(1):54–78.
Castillo, M., Petrie, R., Torero, M., and Vesterlund, L. (2013). Gender differences
in bargaining outcomes: A field experiment on discrimination. Journal of Public
Economics, 99:35–48.
Charles, K. K. and Guryan, J. (2011). Studying Discrimination: Fundamental
Challenges and Recent Progress. Annual Review of Economics, 3(1):479–511.
Coate, S. and Loury, G. C. (1993). Will Affirmative-Action Policies Eliminate Negative
Stereotypes? The American Economic Review, 83(5):1220–1240.
Darity, W. A. and Mason, P. L. (1998). Evidence on Discrimination in Employment:
Codes of Color, Codes of Gender. Journal of Economic Perspectives, 12(2):63–90.
Dellarocas, C. and Wood, C. A. (2008). The Sound of Silence in Online Feedback:
Estimating Trading Risks in the Presence of Reporting Bias. Management Science,
54(3):460–476.
Edelman, B., Luca, M., and Svirsky, D. (2017). Racial Discrimination in the Sharing
Economy: Evidence from a Field Experiment. American Economic Journal: Applied
Economics, 9(2):1–22.
Edelman, B. G. and Luca, M. (2014). Digital Discrimination: The Case of Airbnb.com.
SSRN Electronic Journal.
Farajallah, M., Hammond, R. G., and Pénard, T. (2016). What Drives Pricing Behavior
in Peer-to-Peer Markets? Evidence from the Carsharing Platform BlaBlaCar. SSRN
Electronic Journal.
Farber, H. S. and Gibbons, R. (1996). Learning and Wage Dynamics. Quarterly
Journal of Economics, 111(4):1007–1047.
Garrett, D. F. and Pavan, A. (2012). Managerial Turnover in a Changing World.
Journal of Political Economy, 120(5):879–925.
Ge, Y., Knittel, C., MacKenzie, D., and Zoepf, S. (2016). Racial and Gender
Discrimination in Transportation Network Companies. NBER Working Paper Series,
(22776):1–38.
Goddard, T., Kahn, K. B., and Adkins, A. (2015). Racial bias in driver yielding behavior
at crosswalks. Transportation Research Part F: Traffic Psychology and Behaviour,
33:1–6.
Heckman, J. J., Ichimura, H., and Todd, P. E. (1997). Matching as an econometric
evaluation estimator: Evidence from evaluating a job training programme. The Review
of Economic Studies, 64(4):605–654.
Holmström, B. (1999). Managerial incentive problems: A dynamic perspective. The
Review of Economic Studies, 66(1):169–182.
Iacus, S., King, G., Porro, G., et al. (2009). Cem: software for coarsened exact matching.
Journal of Statistical Software, 30(13):1–27.
Jolivet, G., Jullien, B., and Postel-Vinay, F. (2016). Reputation and prices on the
e-market: Evidence from a major French platform. International Journal of Industrial
Organization, 45:59–75.
Jullien, B. and Park, I. U. (2014). New, like new, or very good? Reputation and
credibility. Review of Economic Studies, 81(4):1543–1574.
Laffont, J.-J. and Tirole, J. (1986). Using Cost Observations to Regulate Firms. Journal
of Political Economy, 94(3):614–641.
Laffont, J.-J. and Tirole, J. (1988). The Dynamics of Incentive Contracts. Econometrica,
56(5):1153–1175.
Lang, K. and Lehmann, J.-Y. K. (2012). Racial Discrimination in the Labor Market:
Theory and Empirics. Journal of Economic Literature, 50(4):959–1006.
Laouenan, M. and Rathelot, R. (2017). Ethnic Discrimination on an Online Marketplace
of Vacation Rentals. Working paper.
Liu, Q. and Skrzypacz, A. (2014). Limited records and reputation bubbles. Journal of
Economic Theory, 151(1):2–29.
Livingston, J. A. (2005). How Valuable Is a Good Reputation? A Sample Selection
Model of Internet Auctions. Review of Economics and Statistics, 87(September):453–465.
Mayzlin, D., Dover, Y., and Chevalier, J. (2014). Promotional reviews: An empirical
investigation of online review manipulation.
Nosko, C. and Tadelis, S. (2015). The Limits of Reputation in Platform Markets: An
Empirical Analysis and Field Experiment. NBER Working Paper Series, page 20830.
Phelps, E. S. (1972). The Statistical Theory of Racism and Sexism. American Economic
Review, 62(4):659–661.
Roger, G. and Vasconcelos, L. (2014). Platform Pricing Structure and Moral Hazard.
Journal of Economics and Management Strategy, 23(3):527–547.
Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in
observational studies for causal effects. Biometrika, 70(1):41–55.
Sarsons, H. (2017). Interpreting signals in the labor market: Evidence from medical
referrals. Job Market Paper.
Schuessler, K. and Becker, G. S. (1958). The Economics of Discrimination. American
Sociological Review, 23(1):108.
Spagnolo, G. (2012). Reputation, competition, and entry in procurement. International
Journal of Industrial Organization, 30(3):291–296.
Zervas, G., Proserpio, D., and Byers, J. (2015). A First Look at Online Reputation on
Airbnb, Where Every Stay is Above Average. Where Every Stay is Above . . . , pages
1–22.
A Navigation on Blablacar.fr
First, users type in the origin, destination and date of the ride they are seeking. They
then see a list of rides meeting their request (figure 6). They may then click on specific
postings to see more details about the ride (figure 7). Finally, they may either view the
profile of the driver (figure 8) or proceed directly to payment. Blablacar service fees
are a function of the price posted by the driver. These fees are shown in figure 9.
Figure 6: Listing offered on a given route
B Oversampling of minorities for short-notice rides
Due to our scraping method, it cannot be excluded that our sample provides a slightly
biased representation of listings. Indeed, the program takes snapshots of listings
displayed on the website at a given point in time. However, rides that are already full
are no longer displayed on the platform. This means our data collection may undersample
the particularly attractive rides that sell out very fast, or those corresponding
to times when demand is much higher than supply. This would not be an issue if both
minorities and non-minorities were affected the same way by this sampling bias. However,
as we show in this paper, minority status does affect the attractiveness of
a given listing. Therefore, listings by minorities, which may be perceived as less
attractive, remain on display longer and may be over-represented in our sample.
Consequently, our minority gap estimates should be understood as lower bounds. Indeed,
minorities are compared to a pool of non-minorities that are not so good
as to have sold out their seats extremely fast. Figure 10 shows that minority drivers
represent an especially high share of rides that are posted on short notice, a possible
sign that non-minority drivers have sold their seats faster. For trips posted with more
notice, we believe our sample is representative of the actual participants on
Blablacar. Indeed, most of the rides, whether from minorities or not, still have more
than one empty seat, which means that most listings are indeed collected.
Figure 10: Share of minorities in sample as a function of number of days between posting and departure
This is true despite the fact that minorities tend to allow automatic confirmation
more frequently than non-minorities (12% of drivers with automatic confirmation are
minorities, while they represent only 8% of the drivers with manual confirmation).
C Reputation effect
Table 7: Taken seats, regressed over driver and ride characteristics
Number of reviews:
                           (1:3)                        (4:49)                       (50+)
driver's age               −0.001∗∗ (0.0004)            −0.001∗∗∗ (0.0002)           −0.0001 (0.0004)
talkative                  0.018 (0.011)                0.007 (0.006)                0.023∗∗ (0.010)
minority                   −0.037∗∗∗ (0.014)            −0.030∗∗∗ (0.007)            −0.012 (0.013)
seniority (# months)       −0.0004 (0.0003)             −0.00001 (0.0001)            −0.001∗∗∗ (0.0002)
hours until ride           −0.0003 (0.0002)             0.0002∗ (0.0001)             0.0002 (0.0002)
posted since               0.017∗∗∗ (0.004)             0.037∗∗∗ (0.003)             0.046∗∗∗ (0.005)
posts per month            0.002 (0.003)                −0.002 (0.002)               −0.011∗∗∗ (0.002)
picture                    0.031 (0.036)                0.020 (0.022)                0.003 (0.034)
bio (# words)              0.0002 (0.0004)              0.00004 (0.0002)             0.0001 (0.0003)
male                       −0.003 (0.012)               0.007 (0.006)                0.022∗ (0.013)
car price                  0.001 (0.001)                −0.002∗∗∗ (0.001)            −0.0001 (0.001)
consumption                0.012∗ (0.007)               0.018∗∗∗ (0.004)             0.011∗ (0.007)
competition                0.0004∗∗ (0.0002)            0.001∗∗∗ (0.0001)            0.001∗∗∗ (0.0002)
median revenue (city)      0.00003 (0.00002)            0.00002∗ (0.00001)           0.00003 (0.00002)
duration public transport  −0.043 (0.072)               −0.095∗∗ (0.044)             0.085 (0.094)
km                         −0.0004 (0.0004)             0.0003 (0.0002)              −0.0002 (0.0004)
hour                       0.001 (0.001)                0.001 (0.001)                0.002∗ (0.001)
night                      0.011 (0.012)                0.014∗∗ (0.006)              0.026∗∗ (0.011)
day                        −0.061∗∗∗ (0.019)            −0.061∗∗∗ (0.011)            −0.111∗∗∗ (0.017)
notice                     −0.006 (0.004)               −0.022∗∗∗ (0.003)            −0.028∗∗∗ (0.005)
automatic acceptance       0.073∗∗∗ (0.011)             0.084∗∗∗ (0.006)             0.090∗∗∗ (0.011)
Constant                   −0.031 (0.309)               −0.135 (0.197)               −0.456 (0.393)
Observations               8,762                        42,654                       18,013
R²                         0.074                        0.073                        0.097
Adjusted R²                0.048                        0.067                        0.085
Residual Std. Error        0.477 (df = 8522)            0.559 (df = 42413)           0.648 (df = 17771)
F Statistic                2.867∗∗∗ (df = 239; 8522)    13.818∗∗∗ (df = 240; 42413)  7.936∗∗∗ (df = 241; 17771)
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Table 8: Revenue, regressed over driver and ride characteristics
Number of reviews:
                           (1:3)                        (4:49)                       (50+)
driver's age               −0.025∗∗ (0.012)             −0.012∗∗ (0.006)             −0.006 (0.010)
talkative                  0.561∗ (0.301)               0.130 (0.151)                0.408∗ (0.230)
minority                   −0.943∗∗ (0.375)             −0.745∗∗∗ (0.189)            −0.633∗∗ (0.294)
seniority (# months)       −0.011 (0.007)               −0.004 (0.003)               −0.017∗∗∗ (0.005)
hours until ride           −0.007 (0.005)               0.003 (0.003)                0.003 (0.005)
posted since               0.428∗∗∗ (0.121)             0.887∗∗∗ (0.067)             0.948∗∗∗ (0.116)
posts per month            0.066 (0.090)                −0.063 (0.040)               −0.198∗∗∗ (0.041)
picture                    0.208 (0.971)                0.587 (0.554)                −0.059 (0.776)
bio (# words)              0.005 (0.010)                −0.001 (0.004)               0.003 (0.007)
male                       −0.173 (0.328)               −0.002 (0.159)               0.189 (0.295)
car price                  0.010 (0.028)                −0.044∗∗∗ (0.014)            0.002 (0.022)
consumption                0.285 (0.197)                0.368∗∗∗ (0.101)             −0.004 (0.156)
competition                0.012∗∗ (0.006)              0.014∗∗∗ (0.003)             0.014∗∗∗ (0.004)
median revenue (city)      0.001 (0.001)                0.001∗ (0.0003)              −0.00001 (0.001)
duration public transport  −3.229∗ (1.955)              −4.759∗∗∗ (1.126)            −8.297∗∗∗ (2.291)
km                         0.007 (0.010)                0.024∗∗∗ (0.006)             0.031∗∗∗ (0.010)
hour                       0.008 (0.032)                −0.002 (0.017)               0.011 (0.026)
night                      0.043 (0.334)                0.466∗∗∗ (0.164)             0.571∗∗ (0.253)
day                        −1.455∗∗∗ (0.517)            −1.329∗∗∗ (0.274)            −1.549∗∗∗ (0.398)
notice                     −0.146 (0.118)               −0.509∗∗∗ (0.065)            −0.555∗∗∗ (0.114)
automatic acceptance       1.597∗∗∗ (0.293)             1.753∗∗∗ (0.144)             1.737∗∗∗ (0.241)
Constant                   1.144 (8.377)                −3.628 (5.074)               17.921∗ (9.684)
Observations               8,688                        42,235                       17,787
R²                         0.080                        0.086                        0.120
Adjusted R²                0.054                        0.081                        0.108
Residual Std. Error        12.903 (df = 8448)           14.324 (df = 41994)          14.714 (df = 17545)
F Statistic                3.062∗∗∗ (df = 239; 8448)    16.533∗∗∗ (df = 240; 41994)  9.917∗∗∗ (df = 241; 17545)
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Table 9: Propensity score
Dependent variable: minority
car price                  0.020∗∗∗ (0.002)
driver's age               −0.036∗∗∗ (0.001)
posts per month            0.066∗∗∗ (0.003)
picture                    −17.857 (55.481)
bio (# words)              −0.014∗∗∗ (0.001)
gender                     1.065∗∗∗ (0.026)
consumption                0.134∗∗∗ (0.013)
talkative                  0.347∗∗∗ (0.019)
Constant                   15.415 (55.481)
Observations               79,440
Log Likelihood             −36,268.060
Akaike Inf. Crit.          72,554.120
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
D Matching
E Retention and exclusion decisions
E.1 Expected revenues from an agent
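Expected revenues of a new agent: Equation (14) is derived by plugging Assumption 1
into equation (12):
\[ E_{new}(\Pi) = E_{n,\beta_i}\left[ (1 - \alpha_n)\beta_i + \alpha_n v_\varepsilon + e_{n,\beta_i}(s) - \frac{1 - F_{\alpha,n,\varepsilon}(s)}{f_{\alpha,n,\varepsilon}(s)}\, \psi'(e_{n,\beta_i}(s)) - \psi(e_{n,\beta_i}(s)) \right] \]
Since the agent is new, we take $n = 0$ (no history):
\[ E_{new}(\Pi) = E_{0,\beta_i}\left[ v_\varepsilon + e_{0,\beta_i}(s) - \frac{1 - F_{\alpha,0,\varepsilon}(s)}{f_{\alpha,0,\varepsilon}(s)}\, e_{0,\beta_i}(s) - \frac{(e_{0,\beta_i}(s))^2}{2} \right] \]
$E_{0,\beta_i}[v_\varepsilon] = \frac{1}{2} + \varepsilon$. Profit maximization yields (10), or
$e_{0,\beta_i}(s) = s - \varepsilon$, so that:
\begin{align*}
E_{new}(\Pi) &= \frac{1}{2} + \varepsilon + \int_{\varepsilon}^{1 + \varepsilon} (s - \varepsilon) \left[ 1 - \frac{1 - (s - \varepsilon)}{1} - \frac{s - \varepsilon}{2} \right] ds \\
&= \frac{1}{2} + \varepsilon + \frac{1}{6} = \frac{2}{3} + \varepsilon
\end{align*}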
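Expected revenues of incumbent agent: Equation (15) is derived by plugging
Assumption 1 into equation (12):
\begin{align*}
E_{old}(\Pi) &= E_{n,\beta_i}\left[ (1 - \alpha_n)\beta_i + \alpha_n v_{\varepsilon_{old}} + e_{n,\beta_i}(s) - \frac{1 - F_{\alpha,n,\varepsilon_{old}}(s)}{f_{\alpha,n,\varepsilon_{old}}(s)}\, \psi'(e_{n,\beta_i}(s)) - \psi(e_{n,\beta_i}(s)) \right] \\
&= (1 - \alpha_n)\beta_i + \alpha_n \left( \frac{1}{2} + \varepsilon_{old} \right) + E_{n,\beta_i}\left[ e_{n,\beta_i}(s) - \frac{1 - F_{\alpha,n,\varepsilon_{old}}(s)}{f_{\alpha,n,\varepsilon_{old}}(s)}\, e_{n,\beta_i}(s) - \frac{(e_{n,\beta_i}(s))^2}{2} \right]
\end{align*}
Note that $F_{\alpha,n,\varepsilon_{old}}(s) = \frac{s - (1 - \alpha_n)\beta_i}{\alpha_n} - \varepsilon_{old}$
and is defined over $[(1 - \alpha_n)\beta_i + \alpha_n \varepsilon_{old},\ (1 - \alpha_n)\beta_i + \alpha_n (1 + \varepsilon_{old})]$.
$f_{\alpha,n,\varepsilon_{old}}(s) = \frac{1}{\alpha_n}$ over the same interval, 0 otherwise. Profit
maximization yields (13), or
$e_{n,\beta_i}(s) = 1 - \frac{1 - F_{\alpha,n,\varepsilon_{old}}(s)}{f_{\alpha,n,\varepsilon_{old}}(s)} = (1 - \alpha_n)(1 - \beta_i) + s - \varepsilon_{old}\alpha_n$.
It follows:
\begin{align*}
E_{old}(\Pi) &= (1 - \alpha_n)\beta_i + \alpha_n \left( \frac{1}{2} + \varepsilon_{old} \right) + \int_{(1 - \alpha_n)\beta_i + \alpha_n \varepsilon_{old}}^{(1 - \alpha_n)\beta_i + \alpha_n (1 + \varepsilon_{old})} e_{n,\beta_i}(s) \left( 1 - (1 - e_{n,\beta_i}(s)) - \frac{e_{n,\beta_i}(s)}{2} \right) \frac{1}{\alpha_n}\, ds \\
&= (1 - \alpha_n)\beta_i + \alpha_n \left( \frac{1}{2} + \varepsilon_{old} \right) + \frac{1}{2\alpha_n} \int_{(1 - \alpha_n)\beta_i + \alpha_n \varepsilon_{old}}^{(1 - \alpha_n)\beta_i + \alpha_n (1 + \varepsilon_{old})} \left( (1 - \alpha_n)(1 - \beta_i) + s - \varepsilon_{old}\alpha_n \right)^2 ds \\
&= (1 - \alpha_n)\beta_i + \alpha_n \left( \frac{1}{2} + \varepsilon_{old} \right) + \frac{1}{6\alpha_n} \left( 1 - (1 - \alpha_n)^3 \right) \\
&= (1 - \alpha_n)\beta_i + \alpha_n \left( \frac{1}{2} + \varepsilon_{old} \right) + \frac{1}{6} \left( 3 - 3\alpha_n + \alpha_n^2 \right) \\
&= (1 - \alpha_n)\beta_i + \alpha_n \varepsilon_{old} + \frac{1}{2} + \frac{\alpha_n^2}{6}
\end{align*}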
E.2 Threshold values for α
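Optimal retention: The decision to retain or exclude the current driver is optimal when
the cutoff values of $\beta$ found in (17) and (18) are equal:
\begin{align*}
\bar\beta = \bar\beta_{SP} &\iff \frac{\varepsilon_{new} - \alpha_{opt}\varepsilon_{old}}{1 - \alpha_{opt}} + \frac{1}{6} + \frac{\alpha_{opt}}{6} = \frac{1}{2} + \varepsilon_{new} \\
&\iff \alpha_{opt} = \frac{3 + 6(\varepsilon_{new} - \varepsilon_{old}) - \sqrt{(3 + 6(\varepsilon_{new} - \varepsilon_{old}))^2 - 8}}{2}
\end{align*}
It follows that if $\varepsilon_{new} > \varepsilon_{old}$, there always exists
$\alpha_{opt} \in [0, 1]$ such that retention by the principal corresponds to that of a
benevolent social planner.
Full exclusion: The condition for full exclusion of a population is that
$\bar\beta = 1 + \varepsilon_{old}$, the highest type in the incumbent population (that is,
$\bar\beta = 1$ when $\varepsilon_{old} = 0$, as in Figure 5). Define $\alpha_{exclusion}$ as
the solution to this equation:
\[ \alpha_{exclusion} = 3 - \sqrt{4 + 6(\varepsilon_{new} - \varepsilon_{old})} \]
Note that as soon as $\varepsilon_{new} > \varepsilon_{old}$, there exists
$\alpha_{exclusion} \in [0, 1]$ such that if $\alpha > \alpha_{exclusion}$,
all types from the incumbent population will be dismissed.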
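The two closed-form thresholds can be checked against the cutoff in equation (17). The Python sketch below is an illustration only: the function names are hypothetical, the population values are arbitrary, and $\varepsilon_{old} = 0$ is chosen to match the setting of Figure 5.

```python
import math


def alpha_opt(eps_new, eps_old):
    """Smaller root of alpha^2 - (3 + 6*d)*alpha + 2 = 0, with d = eps_new - eps_old."""
    b = 3 + 6 * (eps_new - eps_old)
    return (b - math.sqrt(b * b - 8)) / 2


def alpha_exclusion(eps_new, eps_old):
    """Closed form reported for the full-exclusion threshold."""
    return 3 - math.sqrt(4 + 6 * (eps_new - eps_old))


def retention_cutoff(alpha_n, eps_new, eps_old):
    """Equation (17): the incumbent is retained iff beta_old exceeds this value."""
    return (eps_new - alpha_n * eps_old) / (1 - alpha_n) + 1 / 6 + alpha_n / 6


eps_new, eps_old = 0.2, 0.0  # illustrative populations, eps_old = 0 as in Figure 5
a_opt = alpha_opt(eps_new, eps_old)
a_exc = alpha_exclusion(eps_new, eps_old)
# At alpha_opt the cutoff should match the planner's cutoff 1/2 + eps_new.
print(a_opt, retention_cutoff(a_opt, eps_new, eps_old))
# At alpha_exclusion the cutoff should reach the top incumbent type (1 when eps_old = 0).
print(a_exc, retention_cutoff(a_exc, eps_new, eps_old))
```

In this example `a_opt` is smaller than `a_exc`, so as memory deteriorates the policy moves from excess retention, through the planner's rule, to full exclusion of the incumbent population.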