Peer Evaluation and Team Performance: An Experiment on Complex Problem Solving
preliminary - please do not cite
John Morgan (a), Susanne Neckermann (b) and Dana Sisak (c)
(a) University of California, Berkeley
(b) University of Chicago & ZEW
(c) Erasmus University Rotterdam & Tinbergen Institute
Universität Innsbruck, March 28th, 2019
Morgan, Neckermann and Sisak, Guesstimations 1/31
Motivation
Organizations succeed when they are capable of solving complex, non-routine problems.
Often, these tasks are done by teams of individuals, usually after the individuals alone have had a chance to think through the issues and possibilities.
The interplay of incentives and performance on complex choices is not well understood, neither theoretically nor empirically.
In particular, an objective measure of performance is often not available, and thus less-studied incentives relying on subjective evaluation are needed.
We study incentives for individual and group performance in a novel complex and non-routine task: guesstimations.
Preview of Results
Sequential design: First subjects work individually, then decide on a final answer in the group.
Treatments: add group and individual incentives
  Group piece rate by closeness to truth
  Payoff-relevant peer evaluation
    Each individual votes for the most valuable group member.
    The vote is made on the basis of perceived performance (no performance feedback).
    The winner receives the biggest share of the group surplus.
    Pro: May help mitigate the free-rider problem by introducing individual incentives.
    Con: May encourage showing off, sabotage and other performance-reducing behaviors.
Preview of Results
Results:
Treatments did not affect performance but did affect process.
With (individual) incentives
  groups spend more time on the question,
  approaches are more creative/different,
  fewer individuals complete the answer sheet in the individual phase.
Creative/different approaches are related to a higher chance of being voted MVP in the peer evaluation.
Literature
We consider a complex and non-routine task
Groups vs. individuals (Blinder and Morgan 2005, 2008; Laughlin et al. 2006; Charness et al. 2015; Sniezek 1989; Thompson and Wilson 2015)
Optimal group composition (Hoogendoorn and van Praag 2012; Barrick et al. 1998; Williams and O’Reilly 1998; Bell 2007; Hamilton et al. 2003)
Group incentive schemes (Charness and Grieco 2014; Ramm et al. 2013; Englmaier et al. 2017)
Our contributions: (1) novel, complex task (2) study of peer evaluation (3) understanding a complex group production process
Experimental Design
The experiment was conducted
  at the Erasmus University Rotterdam
  in May and June 2014
  with a total of 231 students
  in three treatments (EQUAL: 93, MVP: 78, FLAT: 60)
Experimental Set-Up
Total duration: approx. 1.5 - 2 h
1. Briefing in plenum & individual Guesstimation
2. Group Guesstimation times three (groups of three, separate rooms, 5 + 10 minutes)
3. Elicitation of social preferences, personality and demographics as well as questionnaire in plenum
4. Payment
Guesstimations
Guesstimations are used in assessment centers and resemble tasks found, for example, in consulting jobs.
Individual “Ability”: How many dogs are there in the United States of America? (A: 73.4 million)
Group 1 “Toothpaste”: How many liters of toothpaste are used in the United Kingdom every year? (A: 46.3 million liters)
Group 2 “Weddings”: How many weddings were there in Germany in June 2006? (A: 49 500)
Group 3 “Cycling”: What is the total distance cycled in Amsterdam per day? (A: 2 million km)
Advantage: guesstimations have definite, known answers. It is possible to grade performance in an objective fashion.
Example of Guesstimation answer sheet
Grading and Group Reward Scheme
The maximum reward is 10 Euro for the individual guesstimation and 35 Euro per group for the group guesstimations.
We implement a piece rate by closeness to the right answer.
The piece-rate group reward is then split amongst group members according to treatment rules.

Group score   Construction
0             Guesstimation is more than +/− 80% off the true answer
0.2           Guesstimation is within +/− 80% of the true answer
0.4           Guesstimation is within +/− 60% of the true answer
0.6           Guesstimation is within +/− 40% of the true answer
0.8           Guesstimation is within +/− 20% of the true answer
1             Guesstimation is within +/− 10% of the true answer
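
The scoring bands can be sketched in a few lines of Python. This is an illustration, not the authors' grading code: the function names, the inclusive treatment of each band's upper bound, and the `max_reward` parameter (how the 35 Euro cap is spread over the three group questions is not stated here) are all assumptions.

```python
# (maximum relative error, score) pairs taken from the scoring table.
# Assumption: a guess exactly on a band's boundary still earns that band's score.
SCORE_BANDS = [(0.10, 1.0), (0.20, 0.8), (0.40, 0.6), (0.60, 0.4), (0.80, 0.2)]


def group_score(guess: float, truth: float) -> float:
    """Map a guess to a score in {0, 0.2, ..., 1} by its relative distance from the truth."""
    error = abs(guess - truth) / abs(truth)
    for max_error, score in SCORE_BANDS:
        if error <= max_error:
            return score
    return 0.0  # more than 80% away from the true answer


def piece_rate_reward(guess: float, truth: float, max_reward: float) -> float:
    """Hypothetical helper: the score read as a fraction of a maximum reward."""
    return group_score(guess, truth) * max_reward
```

For instance, a guess of 150 against a true answer of 200 is 25% off and falls into the 0.6 band.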
Individual Incentives and Treatments
FLAT: No incentive, just a flat rate per question.
EQUAL: Group piece rate by closeness to “truth”. Exogenous group sharing rule. Total payment is randomly allocated in shares of 50%, 30% and 20%.
MVP: Endogenous sharing through peer evaluation. The subject voted best by both team members receives 50%, the subject who receives one vote 30% and the subject with no vote 20%. Ties are broken randomly. Subjects are not informed about their performance at the time of the evaluation.
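
A minimal sketch of the two sharing rules, assuming votes are summarized as a count per group member. The function names and the shuffle-then-sort tie-breaking are illustrative choices, not the procedure used in the experiment.

```python
import random

SHARES = (0.5, 0.3, 0.2)  # shares named in the treatment description


def mvp_shares(votes, rng=None):
    """MVP: rank members by votes received, break ties randomly, assign 50/30/20."""
    rng = rng or random.Random()
    members = list(votes)
    rng.shuffle(members)  # random order first, so the stable sort breaks ties randomly
    ranked = sorted(members, key=lambda m: votes[m], reverse=True)
    return dict(zip(ranked, SHARES))


def equal_shares(members, rng=None):
    """EQUAL: allocate the 50/30/20 shares to members at random."""
    rng = rng or random.Random()
    order = list(members)
    rng.shuffle(order)
    return dict(zip(order, SHARES))


# Example: A is voted best by both teammates, B receives one vote, C none.
print(mvp_shares({"A": 2, "B": 1, "C": 0}))
```

With a 2-1-0 vote split the allocation is deterministic; only a three-way tie (one vote each) leaves the ranking to chance.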
Overview Analysis
Preliminaries
Part I: Incentives
  Group Performance
  Group Process
  MVP Vote
Part II: Extensions
  Group vs. Individuals
  Straddle vs. Non-Straddle
Summary Statistics: Guesses by Question
                      Cycling        Toothpaste        Weddings
                      (in 10,000)    (in 1,000,000)    (in 1,000)
# Observations        274            267               267
Mean                  652.82         1,147.74          886.12
Maximum               39,647.06      242,027.00        52,000.00
Minimum               3.75           0.00              0.05
Standard deviation    2,671.40       14,868.54         4,184.87
1st Percentile        6.00           0.00              1.10
5th Percentile        37.64          0.33              4.00
10th Percentile       54.00          6.21              12.50
25th Percentile       109.55         30.00             33.44
50th Percentile       190.56         63.74             70.00
75th Percentile       421.20         127.49            295.64
90th Percentile       1,008.00       340.15            1,181.20
95th Percentile       1,900.00       570.02            2,551.40
99th Percentile       9,175.52       5,344.09          20,230.00
True Answer           200.00         46.30             49.50
Note: Includes all group and individual guesses (77 group guesses and the rest individual guesses). Since not all individuals always made an individual guess, the number of observations is lower than 308 per question.
Summary Statistics and Balance Table
                                  FLAT       EQUAL      MVP
Observations                      60         93         78
Female                            0.367      0.337      0.436
                                  (0.482)    (0.473)    (0.496)
Age                               21.333     21.391     21.064
                                  (2.370)    (2.643)    (2.457)
Dutch                             0.700      0.685      0.782
                                  (0.458)    (0.465)    (0.413)
Economics Student                 0.767      0.685      0.731
                                  (0.423)    (0.465)    (0.444)
Econometrics Student              0.117      0.065      0.064
                                  (0.321)    (0.247)    (0.245)
Bachelor 1                        0.283      0.239      0.295
                                  (0.451)    (0.427)    (0.456)
Bachelor 2                        0.150      0.228      0.231
                                  (0.357)    (0.420)    (0.421)
Bachelor 3                        0.333      0.304      0.218
                                  (0.471)    (0.460)    (0.413)
Master                            0.233      0.228      0.256
                                  (0.423)    (0.420)    (0.437)
Previous Experience Task          0.117      0.097      0.128
                                  (0.321)    (0.296)    (0.334)
Number of Quantitative Classes    5.167      3.651∗     3.944
                                  (6.282)    (2.800)    (2.761)
Average Grade                     7.142      7.182      7.096
                                  (0.725)    (0.762)    (0.780)
Summary Statistics and Balance Table II
                            FLAT       EQUAL      MVP
Social Value Orientation
  Individual/Competitive    0.458      0.391      0.434
                            (0.498)    (0.488)    (0.496)
Big 5 Inventory
  Extraversion              6.983      6.793      7.192
                            (1.396)    (1.757)    (1.721)
  Agreeableness             7.200      7.293      7.333
                            (1.527)    (1.441)    (1.345)
  Conscientiousness         7.169      7.478      7.295
                            (1.544)    (1.593)    (1.691)
  Neuroticism               4.700      5.118      4.872
                            (2.011)    (2.141)    (2.134)
  Openness to Experience    6.417      6.882∗     6.923∗
                            (1.544)    (1.621)    (1.673)
Ability (Dog Question)
  Ability                   0.347      0.324      0.318
                            (0.329)    (0.299)    (0.323)
Overview Analysis
Preliminaries
Part I: Incentives
  Group Performance
  Group Process
  MVP Vote
Part II: Extensions
  Group vs. Individuals
  Straddle vs. Non-Straddle
Performance Measures
(Hypothetical) group payoffs (0-35 Euro)
Percentage error:
\[ \text{P.E.} = \frac{\lvert \text{Guess} - \text{Truth} \rvert}{\text{Truth}} \]
Note: Smaller numbers mean better performance.
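
A small sketch of the percentage error, with made-up guesses against the cycling answer of 2 million km. The function name is an assumption. The example also illustrates a property of the measure: for non-negative guesses, underestimates can never exceed an error of 1, while overestimates are unbounded.

```python
def percentage_error(guess: float, truth: float) -> float:
    """Absolute deviation of the guess from the truth, relative to the truth."""
    if truth == 0:
        raise ValueError("truth must be non-zero")
    return abs(guess - truth) / abs(truth)


CYCLING_TRUTH_KM = 2_000_000  # true answer reported for the cycling question

# Illustrative guesses, not data from the experiment.
print(percentage_error(1_500_000, CYCLING_TRUTH_KM))   # underestimate by a quarter
print(percentage_error(0, CYCLING_TRUTH_KM))           # worst possible underestimate
print(percentage_error(10_000_000, CYCLING_TRUTH_KM))  # fivefold overestimate
```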
Effects of Treatments on Group Performance
[Figure: distribution of the fraction of prize amount earned, shown separately by treatment (FLAT, EQUAL, MVP). x-axis: fraction of prize amount (0 to 1); y-axis: percent (0 to 40).]
Effects of Treatments on Group Performance
                                      Percentage Error         Payoff
                                      (1)        (2)           (3)        (4)
Group Incentives                      0.513      0.496         0.813      0.795
                                      (0.564)    (0.496)       (2.397)    (2.634)
Group Incentives x Peer Evaluation    -0.082     -0.419        0.246      0.248
                                      (0.784)    (0.595)       (1.402)    (1.947)
Additional Covariates                            Yes                      Yes
β1 + β2                               0.431      0.077         1.059      1.043
Observations                          231        231           231        231
Clusters                              15         15            15         15
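
The row labels and the β1 + β2 line suggest a specification of roughly the following form. This is a reading of the table rather than a formula stated on the slides; the dummy names GI and PE, the covariate vector X and the error term are notation introduced here for illustration.

```latex
\[
  y_i = \beta_0 + \beta_1\,\mathrm{GI}_i + \beta_2\,(\mathrm{GI}_i \times \mathrm{PE}_i)
        + X_i'\gamma + \varepsilon_i
\]
% Assumed coding of the treatment dummies:
%   FLAT:  GI = 0, PE = 0
%   EQUAL: GI = 1, PE = 0
%   MVP:   GI = 1, PE = 1
% Under this coding:
%   \beta_1           = EQUAL relative to FLAT
%   \beta_2           = MVP relative to EQUAL
%   \beta_1 + \beta_2 = MVP relative to FLAT
```

Under that coding, the β1 + β2 row is simply the sum of the two coefficients above it, for example 0.513 + (−0.082) = 0.431 in column (1).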
Effects of Treatments on Group Process
Three RAs coded answer sheets by the “creativity/uniqueness” of the steps used.
Correlations are relatively low, but positive, on the order of 0.3 to 0.4.
Define “Different” as an answer sheet flagged as different by at least two RAs.
Example: Answer took into account that there was a World Cup in June 2006 in Germany.
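
The "at least two RAs" rule amounts to a simple threshold on the raters' flags. A minimal sketch, with a hypothetical function name and made-up ratings:

```python
def is_different(ra_flags, min_raters=2):
    """True when at least `min_raters` raters flagged the answer sheet as different."""
    return sum(bool(flag) for flag in ra_flags) >= min_raters


# Made-up ratings from three RAs for three answer sheets.
sheets = {
    "sheet_1": (True, True, False),
    "sheet_2": (False, True, False),
    "sheet_3": (False, False, False),
}
flagged = {name: is_different(flags) for name, flags in sheets.items()}
print(flagged)
```

Setting `min_raters=1` would give a looser definition in which a single RA's flag is enough.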
Effects of Treatments on Group Process
[Figure: Percentage Different by treatment (FLAT, EQUAL, MVP). y-axis: 0 to .1.]
Effects of Treatments on Group Process
Probit (ME)
                                      Different
Group Incentives                      -0.009     0.020
                                      (0.045)    (0.020)
Group Incentives x Peer Evaluation    0.044      0.099**
                                      (0.046)    (0.047)
Additional Covariates                            Yes
β1 + β2                               0.035      0.119**
Observations                          231        231
Clusters                              15         15
Effects of Treatments on Group Process - Individual Phase
                               FLAT    EQUAL      MVP
Frequency missing guesses      0.08    0.17∗∗∗    0.24∗∗∗
Steps individual (all)         4.97    4.83       4.81
Steps individual (complete)    5.02    4.97       4.85
Steps individual (missing)     4.29    4.11       4.68
Payoff (all)                   7.74    8.54       6.55
Payoff (complete)              8.4     10.26∗     8.58
Different                      0.09    0.10       0.11
Different in group             0.23    0.25       0.26
⇒ Consistent with individuals spending more time preparing in individual phase under MVP.
Effects of Treatments on Group Process
How does this feed into the group phase?
                        FLAT      EQUAL       MVP
Steps group             5.05      4.51∗∗      4.64
Count method            1.1       1.2         1.08
Count guess             1.85      1.34∗∗      1.49
Worktime                7.97      8.37        8.83∗∗∗
Unrelated               68.75%    49.43%∗∗    49.33%∗∗
Speaking turns          9.71      10.45       11.89∗∗∗
All agree               87.5%     86.21%      82.67%
No dominant indiv(s)    37.5%     33.33%      37.33%
⇒ Groups spend more time on guesstimation in group phase under MVP.
Effects of Treatments on Group Atmosphere
[Figure: responses to questionnaire items by treatment (FLAT, EQUAL, MVP), on a scale from Fully disagree, Mostly disagree, Neither, Mostly agree to Fully agree. Items are grouped under Voice, Performance and Atmosphere:]
I felt that everyone had an opportunity to voice their ideas in a fair way
I felt that others dominated the discussion
All members of my group including me gave their most to solve as many problems as possible
Do you feel that competitiveness helped the group reach a better performance
I wanted to make a good impression on my group members
The atmosphere in the group was helpful
The atmosphere in the group was competitive
Most Valuable Person
Why did incentives not affect performance but did affect process?
MVP treatment: look at individual voting behavior.
If good individual performance is not rewarded, incentives for good performance are de facto absent.
The “best” individual was voted winner in only 21.5% of cases.
In 15.4% of cases, a tie was the group outcome.
Most Valuable Person
Probit (ME)
                     winner                           MVP
                     Full Sample    Non-Strategic     Full Sample    Non-Strategic
Best Guess           -0.122         -0.012            -0.160***      -0.067
                     (0.081)        (0.089)           (0.058)        (0.070)
Missing Guess        0.036          0.019             -0.007         0.019
                     (0.057)        (0.033)           (0.032)        (0.018)
Ind. Steps           0.061***       0.070***          0.050**        0.061**
                     (0.022)        (0.023)           (0.022)        (0.029)
Different            0.126***       0.164**           0.130***       0.166**
                     (0.016)        (0.066)           (0.019)        (0.077)
Leader               0.010          0.029             -0.044         0.031
                     (0.048)        (0.072)           (0.050)        (0.075)
Turns Share          0.654*         0.719**           0.450**        0.608*
                     (0.343)        (0.307)           (0.216)        (0.350)
Presented Guess      0.058          0.106             0.036          0.095
                     (0.067)        (0.116)           (0.057)        (0.102)
Presented Method     0.017          0.039             -0.004         0.051
                     (0.056)        (0.078)           (0.044)        (0.071)
Observations         219            131               219            131
Clusters             5              5                 5              5
Additional covariates: Yes in two of the four columns.
Most Valuable Person
“He thought about the situation in a different way and had a reasonable answer.”

“Both were really good, but he had a few good ideas such as the 06-06-06 bonus.”
“More simple logic, good innovative ideas, structured thinking.”
Summary Incentives
Incentives didn’t improve performance, but under MVP groups had more creative/unique steps.
Under MVP fewer individuals finish their individual answer sheet. This does not seem to be due to less effort, though the strict time limit makes it hard to say for certain.
Under MVP groups work longer.
Having a different approach is related to a higher chance of being voted MVP, while having the best guess is not.
Thank you!
Effects of Treatments on Group Process
                              Probit (ME)               Probit (ME)
                              Different (1RA)           Different (2RA)
                              (3)        (4)            (5)        (6)
Group Incentives              0.083      0.134*         -0.009     0.020
                              (0.094)    (0.072)        (0.045)    (0.020)
Group Incent. x Peer Eval.    -0.001     0.045          0.044      0.099**
                              (0.084)    (0.074)        (0.046)    (0.047)
Additional Covariates                    Yes                       Yes
β1 + β2                       0.082      0.179**        0.035      0.119**
Observations                  231        231            231        231
Clusters                      15         15             15         15