COMMENTS ON PEER REVIEW AND
RATING OF NRAO OBSERVING PROPOSALS
Frederic R. Schwab, Dana S. Balser, and Gareth C. Hunt
March 10, 2015
1. Introduction
Recently we were asked to comment on the NRAO proposal peer review process, and to focus primarily on the scoring procedure reviewers are instructed to follow and the algorithm used for aggregation of the reviewers’ scores. In particular, we were asked to comment on the robustness of the score aggregation process—for example, sensitivity to imbalance between reviewers’ score distributions—and to compare our proposal scoring system with those used by other institutions and other observatories (e.g., ESO, Arecibo, HST) in order to provide assurance that our practice is not out of the mainstream.
We begin by summarizing, in Section 2, the mechanics of the current review process for standard GBT, VLA, and VLBA proposals. There we describe the instructions provided to reviewers for scoring, the score normalization and averaging method that is used, and the peculiarities (particularly the imbalance) of reviewers’ observed raw score distributions for recent proposal review cycles. We continue with a discussion of possible deficiencies in our proposal scoring system and suggest a few simple remedies. In Section 3 we compare the NRAO process with what is in use at other observatories and at two federal agencies, NSF and NIH.
In Section 4 we discuss ranking, rating, and score aggregation methods that are based on pairwise score comparisons. In Section 5 we discuss two other methods that have appeared in recent literature, and in Section 6 we call attention to a recent paper advocating a distributed approach to telescope proposal peer review.
Conclusions are given in Section 7, along with a few more comments and suggestions.
2. Description of the Review Process
The NRAO Web pages give a useful overview of the review process, one intended both for reviewers and proposers.1
Also, a comprehensive description of the gritty, technical details of the process is given in a memorandum by Bryan Butler [1], titled “Requirements for the PST for the New NRAO Proposal Evaluation and Time Allocation Process”, dated October 13, 2010. We include that memorandum here as Appendix A. Most of the details there are still current. This process pertains to proposals for use of North American NRAO facilities—the GBT, VLA, and VLBA.
1See https://science.nrao.edu/observing/proposal-types .
TTA Report 1
2.1. Overall Summary. Each observing proposal pertains to one of five instrumental categories:
• GBT — Green Bank Telescope;
• VLA — Very Large Array;
• VLBA — Very Long Baseline Array;
• HSA — High-Sensitivity Array (utilizing the VLBA plus one or more among the following: GBT, Effelsberg 100-m, Arecibo, the phased VLA); or
• GMVA — Global 3 mm VLBI Array (utilizing the VLBA plus the GBT, Effelsberg, Pico Veleta, Plateau de Bure, Onsala, Yebes, and Metsähovi radio telescopes).
And each proposal is assigned to one of eight science categories, denoted by the acronyms defined below:
• AGN — Active Galactic Nuclei;
• EGS — Extragalactic Structure;
• ETP — Energetic Transients and Pulsars;
• HIZ — High Redshift and Source Surveys;
• ISM — Interstellar Medium;
• NGA — Normal Galaxies;
• SFM — Star Formation; and
• SSP — Sun, Stars, Planets, and Planetary Systems.
Associated with each science category is a six-member Science Review Panel (SRP). Each panel member is an expert in the subject discipline. One member is designated as the panel chairperson. Each panel member—except the chair—submits a score for each proposal, unless that member declares a conflict of interest. If one or more of those members are ineligible, then the panel chair does submit a score—except in cases where the chair also has a conflict of interest. (The procedure for identifying conflicts of interest is specified in §3.12 of Appendix A.)
The HSA and GMVA proposals are—like the GBT, VLA, and standalone VLBA proposals—reviewed by the SRPs. The GMVA proposals are, however, also reviewed by a European review panel, and their time allocation is determined by a different process than that of the standard NRAO TAC. The time allocation process used for HSA proposals also is different from the standard one.
There are two calls for proposals per year, for two six-month semesters, denoted A and B. The nominal start date for observing within a given semester is February 1 for Semester A and August 1 for Semester B. The proposal deadline for Semester A is the first of August of the preceding calendar year, and the deadline for Semester B is the first of February (the precise date depends on whether the first day of the month falls on a weekend). Proposal cycle semesters are designated by three-character alphanumeric strings of the form 12B, 13A, . . . to denote “2012 Semester B”, “2013 Semester A”, etc.
Review criteria which panel members are asked to consider are described in detail on the NRAO Science Review Panel Web page.2 Specific review criteria include scientific merit, justification for any extra resources that are requested (as for GMVA and HSA proposals), qualifications of the project team, their publication record from any past proposal submissions, the possibility of acquiring more appropriate data than requested (say, from an existing data archive), amount of resources requested vis-à-vis telescope time pressure, and student status, when relevant (e.g., whether a sidelight or a main focus of thesis research).
After all panel members have submitted their independently derived proposal scores, the entire SRP meets by teleconference for discussion, debate, and reconsideration of the
2See https://science.nrao.edu/observing/proposal-types/sciencereviewpanels .
scoring. During this stage of the process individual reviewers do not modify their scores, but the aggregate scores (and hence the aggregate rank order of preference from the panel) are subject to modification. The panel chair always has final say. Further details can be found in [1]. The final aggregate scores and rank order of proposals then are submitted to the Time Allocation Committee (TAC). The TAC then merges the rank orderings from all eight panels.3
2.2. Scoring and Score Aggregation. As noted above, proposal scoring within the Science Review Panel is done independently by panel members—without consultation—prior to a group meeting of the SRP. The permissible range of scores, from best to worst, is 0.1 to 9.9, in steps of 0.1.4
After scores have been submitted, reviewers’ score distributions each are shifted to have a common mean (equal to five) and linearly scaled to have a standard deviation of two. The scores, so normalized (or “standardized”), then are arithmetically averaged to obtain an aggregate score for each proposal.
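The shift-and-rescale step just described can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the PST’s actual code; the function name `standardize_scores` and the toy score vectors are our own.

```python
import numpy as np

def standardize_scores(raw, target_mean=5.0, target_sd=2.0):
    """Shift a reviewer's raw scores to a common mean (five) and linearly
    rescale them to a common standard deviation (two), per Section 2.2."""
    raw = np.asarray(raw, dtype=float)
    return target_mean + (raw - raw.mean()) / raw.std() * target_sd

# Hypothetical raw scores from two reviewers for the same five proposals.
reviewer_a = standardize_scores([1.0, 2.0, 2.5, 4.0, 8.0])
reviewer_b = standardize_scores([3.0, 3.5, 5.0, 6.0, 9.0])

# The aggregate score for each proposal is the arithmetic mean of its
# standardized scores across the (non-conflicted) reviewers.
aggregate = np.mean([reviewer_a, reviewer_b], axis=0)
```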
Note that some of the standardized scores may fall outside of the range [0.1, 9.9] (e.g., if the raw distribution has a smaller standard deviation than 2.0 and a significant positive or negative skew); this happens rather infrequently.5 It may also happen that the aggregate score falls outside of the range [0.1, 9.9]. (However, this has not occurred in the case of any of the averaged scores (1546 total) for proposal cycles 12B, 13A, 13B, and 14A.)
2.3. Observed Distributions of Reviewers’ Raw Scores. It is interesting and instructive to note the diversity among reviewers’ raw score distributions. As one example, Table 1 shows the raw scores submitted by the Cycle 13B AGN panel reviewers, versus proposal ID number.6 Reviewer 170’s scores all are integer values. Reviewer 136’s are integer and half-integer values. The other reviewers discriminate at the finest permissible granularity of 0.1. Reviewer 170 has ties in every row of the table—two-way ties at three scores, and multi-way ties in each other row. Reviewer 149 is the most discriminating, with only five (two-way) ties. Reviewer 151 is very lenient, compared to the others—with 31 out of 51 scores ≤ 1.0.
As to the degree of consensus among panel members in this example, let us consider proposal 8166: this proposal is rated last by two (out of five) reviewers (136 and 170), and second- or third-to-last by the three others (149, 150, and 151). Here we see a high degree of consensus. There is much less of a consensus regarding proposal 8225: Reviewer 125 rates it first, Reviewer 151 rates it in a two-way tie for last place, two others place it in their third quartiles, and one in their second quartile.
Histograms of reviewers’ scores for this example are shown in Figure 1, along with some descriptive statistics. Among the six reviewers, the mean score varies between 1.57 and 4.57, and the median between 1.0 and 4.5. Reviewer 151, with a mean assigned score of 1.57 (and median of 1.0), is very lenient (as noted before). The scales of these score distributions, as measured by their standard deviations, vary between 1.60 and 2.75. (The scaled median absolute deviation about the median value (MAD)—perhaps a better estimate of
3More details are given at https://safe.nrao.edu/wiki/bin/view/Software/PSTReviewCookbook .
4It seems peculiar that the range is [0.1, 9.9] rather than [0, 10]. I believe originally the scale 1 to 9 was in use. Then someone questioned, “Why not 0 to 10?” Someone replied that the code would already work for 0 to 10—except that 0 was not possible, because it was used as a flag to indicate a conflict of interest; so 0.1 and 9.9 (for the sake of symmetry) were adopted as limits. A better choice would have been to indicate a conflict of interest via an alphabetic character, a NaN flag, or somesuch. — F.R.S.
5In the four most recent proposal cycles (12B, 13A, 13B, and 14A) this happens for 47 out of 1546 raw scores (3.04% of the total).
6These proposal ID numbers have been randomly assigned, for reasons of confidentiality.
[Figure 1 panels: “AGN Panel Raw Score Distributions, Cycle 13B (50 proposals)”—one histogram per reviewer (125, 136, 149, 150, 151, 170), probability density vs. raw score on a 0–10 axis.]
Reviewer  nr  Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD   Sn    Qn
125       34  4.57   2.75      0.38      2.15     4.25     4.50      2.74     2.98  3.00
136       25  3.20   1.77      0.85      3.42     3.00     3.00      1.48     1.86  2.10
149       38  3.67   1.92     -0.01      2.34     3.85     3.70      1.85     2.03  2.02
150       42  3.25   2.41      0.95      3.01     2.80     3.00      2.45     2.15  2.04
151       48  1.57   1.62      1.97      6.68     1.00     1.25      0.82     0.83  1.03
170       47  3.77   1.60      0.29      2.18     4.00     4.00      1.48     1.22  2.16
Figure 1. Histograms of Cycle 13B AGN proposal raw scores, by reviewer. The ordinate in each case represents the probability density.
the “typical” scale of a distribution such as Reviewer 151’s—varies between 0.82 and 2.74.) Three of the distributions (those of Reviewers 136, 150, and 151) show significant positive skew.
Figures showing the raw score distributions from each of the eight review panels and the four most recent proposal cycles (12B, 13A, 13B, and 14A) are given in Appendix B (Figures B-1 through B-8). Like Figure 1, these show for each panel/cycle pair (in columns 1 through 6): the number of scores submitted by the given reviewer, the mean raw score, standard deviation, skewness, kurtosis, and median score. Columns 7 through 10 give a few ancillary statistics; these are defined in §2.4.
The reviewers’ raw score distributions shown in Appendix B are typically clustered toward the low end of the score range (i.e., higher ratings), and they often are positively skewed, with a few poor scores in the right-tail region. The average mean is 4.06 ± 0.85, the mean standard deviation is 1.85 ± 0.50, the mean skewness is 0.39 ± 0.48, and the mean kurtosis is 2.63 ± 0.98. (For reference, the kurtosis of a Gaussian, or normal, distribution is equal to three; and for a normal distribution truncated at ±2.5σ about the mean the kurtosis is 2.6242.) Histograms of these statistics are shown in Figure 2. Table 2 lists the number of proposals, by panel, for the four most recent proposal cycles. The typical proposal is rated by five reviewers; out of the 1529 non-GMVA proposals, 17 were rated by only two reviewers, 91 by three reviewers, 446 by four reviewers, 971 by five reviewers, and four by six reviewers (the corresponding percentages are 1.1, 6.0, 29.2, 63.5, and 0.3).
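The quoted reference kurtosis for the truncated normal can be checked numerically; here is a short verification using SciPy’s `truncnorm` (an assumption of this sketch is that SciPy is available; the truncation points are given in standard-deviation units):

```python
from scipy.stats import truncnorm

# Fisher (excess) kurtosis of a standard normal truncated at +/-2.5 sigma;
# adding 3 converts to the Pearson kurtosis convention used in the text.
excess = truncnorm(-2.5, 2.5).stats(moments='k')
kurtosis = float(excess) + 3.0  # approximately 2.6242
```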
Sometimes a reviewer’s score distribution is representative of a total ordering, or complete ranking of proposals, with no ties present. And sometimes reviewers appear to grade on a curve. On the AGN Panel, Cycle 14A (see Figure B-1), Reviewer 149 has a score distribution which is exactly symmetric and contains no ties—in ascending order the 51
Cycle 13B, AGN Panel Raw Scores
— Reviewer 125 —
{8225} 0.1
{8136} 0.5
{7943, 7965} 1.0
{8193} 1.5
{7939} 2.0
{8051} 2.1
{7905, 7924, 8055, 8164} 2.5
{7789} 3.0
{7709, 7880} 3.5
{7833, 8161, 8219} 4.0
{7874, 7993, 8178} 4.5
{7867} 4.8
{7827, 8176, 8231} 5.0
{8249} 5.5
{8061} 6.2
{8025} 7.0
{8247} 7.2
{8172} 7.3
{8112, 8138} 8.0
{7768, 7903, 8096} 8.5
{7859, 8163} 9.9
— Reviewer 136 —
{7956, 8010, 8025} 1.0
{7731, 7789, 8015} 1.5
{7903, 7972, 8133, 8247} 2.0
{7992, 8172, 8178} 2.5
{7756, 8248, 8249} 3.0
{8108, 8170} 3.5
{7770, 7859, 7867, 7943, 7965, 8123} 4.0
{8040} 5.0
{7768, 8176} 6.0
{8166} 8.0
— Reviewer 149 —
{7874, 7993} 0.5
{8161, 8219} 0.8
{8112, 8193} 1.0
{7709} 1.2
{8108} 1.4
{7770, 7897} 2.0
{7679} 2.5
{8136} 2.8
{8133} 3.0
{7924} 3.2
{7789} 3.3
{8015} 3.4
{7992} 3.5
{8055} 3.7
{8170} 3.8
{7965} 3.9
{7880} 4.0
{7903} 4.2
{8040} 4.4
{8225} 4.5
{7833} 4.6
{7905} 4.7
{8138} 4.8
{8051, 8123} 4.9
{8061} 5.0
{7859} 5.2
{8164} 5.3
{7939} 5.5
{8163} 5.7
{7827} 6.0
{8096} 6.5
{8166} 7.0
{8231} 8.0
— Reviewer 150 —
{7993, 8010} 0.1
{7874} 0.5
{7903, 7939, 7943, 8161, 8178, 8193, 8219, 8248} 1.0
{8040} 1.2
{8096} 1.5
{7731} 1.8
{7897, 8136, 8164, 8247, 8249} 2.0
{8138, 8170} 2.2
{7827, 8025} 2.5
{7770, 8015, 8172} 2.8
{7709, 8051} 3.0
{7679} 3.1
{8225} 3.2
{8231} 3.5
{7924, 7992} 3.8
{7833} 4.2
{8055} 4.5
{7768, 8061, 8112} 5.0
{7880} 5.2
{8176} 6.8
{8123, 8133, 8166} 8.0
{8163} 8.5
{7905} 9.0
— Reviewer 151 —
{8133, 8136} 0.1
{7956, 8015} 0.2
{7939, 7992} 0.3
{7709, 8193} 0.4
{7768, 7770, 7827, 8096, 8108, 8112, 8164, 8170} 0.5
{8055} 0.7
{7965, 8176} 0.8
{7859, 8025} 0.9
{7731, 7789, 7874, 7880, 7972, 7993, 8123, 8138, 8247, 8248} 1.0
{7756} 1.2
{7833, 8178} 1.5
{8249} 1.6
{7867, 7897, 7924, 8010, 8040, 8172} 2.0
{8161, 8163, 8219} 2.5
{7679, 7943} 3.0
{7905, 8231} 4.0
{8166} 5.5
{8061, 8225} 7.0
— Reviewer 170 —
{7874, 7993} 1.0
{7709, 7789, 7833, 7992, 8010, 8055, 8133, 8161, 8178, 8219, 8231, 8247, 8248} 2.0
{7897, 7903, 7924, 7956, 8108, 8136, 8225} 3.0
{7731, 7756, 7768, 7770, 7859, 7867, 7905, 7939, 7943, 7965, 8015, 8025, 8123, 8164, 8193, 8249} 4.0
{7827, 8170} 5.0
{7679, 7880, 8061, 8096, 8112, 8138, 8163, 8172} 6.0
{8040, 8166} 7.0
Table 1. Cycle 13B, AGN proposal raw scores, by reviewer. Proposal ID numbers are given in the left-hand columns, and the corresponding raw scores in the right-hand columns. Multi-element proposal ID lists correspond to a multi-way tie for the given score. Note the diversity among the reviewers’ personal score distributions. The quartile values for the six reviewers are: Reviewer 125, (2.5, 4.25, 7); Reviewer 136, (1.875, 3, 4); Reviewer 149, (2, 3.85, 4.9); Reviewer 150, (1.2, 2.8, 4.5); Reviewer 151, (0.5, 1, 2); and Reviewer 170, (2, 4, 4.75).
[Figure 2 panels: histograms of the Mean, Std Dev, Skewness, and Kurtosis of all raw score distributions.]
Figure 2. This figure shows histograms of the mean, standard deviation, skewness, and kurtosis statistics of all the reviewers’ raw score distributions combined, for proposal semesters 12B, 13A, 13B, and 14A. The average mean is 4.06 ± 0.85, the mean standard deviation is 1.85 ± 0.50, the mean skewness is 0.39 ± 0.48, and the mean kurtosis is 2.63 ± 0.98. Note that the distribution of the means is skewed to the left and that of the skewnesses is skewed to the right.
scores are: a few scores separated in steps of 0.5, then a few separated by 0.3, then by 0.2, then a bunch separated by 0.1, then the reverse of all that. Another example is ETP Panel Cycle 14A, Reviewer 114 (see Figure B-3). This score distribution, consisting of 21 scores, is an exact fit (by Cramér–von Mises and Anderson–Darling tests) to a triangular distribution over the range [0.5, 9.5].
2.4. Methodological Tools. In this section a few special methodological tools, such as ancillary statistical measures, kernel density estimates, and “smooth” histograms, are described.
2.4.1. Hodges–Lehmann estimator. In the case of a skewed or fat-tailed distribution, the sample median might be viewed as a better representative of the “most typical” value of an underlying (or parent) distribution than the sample mean. A similar estimator, also based on order statistics, is the so-called Hodges–Lehmann estimator of location [2]. It has a simple definition: it is the median of all the pairwise averages of x1, . . . , xn, i.e.,
\[
\operatorname{med}_{1 \le i,\, j \le n} \frac{x_i + x_j}{2} \, . \tag{1}
\]
For independent samples from a continuous symmetric distribution, the H–L estimator is, in fact, an estimator of the population median. For fat-tailed distributions especially, it has the advantage that it generally converges faster to the population median than does the sample median (i.e., it is often a more efficient estimator of the population median than the sample median itself). Also, the H–L estimator has a smooth influence function—i.e., if a sample value is varied continuously, this estimate of distribution centrality varies
AGN Panel
          12B  13A  13B  14A  Total
GBT         4    8    4    4     20
VLA        33   16   24   31    104
VLBA/HSA   18   35   22   25    100
GMVA        1    5    3    3     12
Total      56   64   53   63    236

EGS Panel
          12B  13A  13B  14A  Total
GBT        13   15   11   22     61
VLA        17   33   21   35    106
VLBA/HSA    3    0    1    2      6
GMVA        0    0    0    0      0
Total      33   48   33   59    173

ETP Panel
          12B  13A  13B  14A  Total
GBT        21   13   18   20     72
VLA        26   23   21   30    100
VLBA/HSA   10   11   12   11     44
GMVA        0    0    0    1      1
Total      57   47   51   62    217

HIZ Panel
          12B  13A  13B  14A  Total
GBT         9   10    7    8     34
VLA        35   34   42   51    162
VLBA/HSA    0    7    5    6     18
GMVA        0    2    0    0      2
Total      44   53   54   65    216

ISM Panel
          12B  13A  13B  14A  Total
GBT        19   25   22   26     92
VLA         7   35   19   35     96
VLBA/HSA    2    0    1    0      3
GMVA        0    0    0    0      0
Total      28   60   42   61    191

NGA Panel
          12B  13A  13B  14A  Total
GBT         6    2    2    9     19
VLA        22   39   30   35    126
VLBA/HSA    0    2    1    2      5
GMVA        0    0    0    0      0
Total      28   43   33   46    150

SFM Panel
          12B  13A  13B  14A  Total
GBT        13   12   11    2     38
VLA        37   48   32   56    173
VLBA/HSA    4    4    2    3     13
GMVA        0    0    0    0      0
Total      54   64   45   61    224

SSP Panel
          12B  13A  13B  14A  Total
GBT         1    6    9    5     21
VLA        19   25   25   24     93
VLBA/HSA    2    9    9    3     23
GMVA        0    1    0    1      2
Total      22   41   43   33    139
Table 2. Number of proposals, by panel, for recent review cycles 12B, 13A, 13B, and 14A. A total of 1546 proposals underwent review: 357 for GBT, 960 for VLA, 212 for VLBA (including HSA), and 17 for GMVA time. The GMVA proposals are also reviewed by a European review panel; the final ranking for this category of proposals, and the telescope time allocation, are done outside of the normal SRP and TAC.
continuously, unlike the sample median. Thus it can be viewed as a smooth version of the median (see Rousseeuw and Croux (1993) [3]). Figure 3 (left) shows the distribution of the ratio of the mean score to H–L location estimate, including all reviewers’ raw score distributions for the four most recent proposal cycles.
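A direct, if O(n²), implementation of Eq. (1) is straightforward. This sketch is our own (in Python rather than the Mathematica used for the memo’s figures); it includes the i = j terms, as the definition above does:

```python
import numpy as np

def hodges_lehmann(x):
    """Hodges-Lehmann location estimate: the median of all pairwise
    averages (x_i + x_j)/2 over 1 <= i, j <= n, per Eq. (1)."""
    x = np.asarray(x, dtype=float)
    pairwise = (x[:, None] + x[None, :]) / 2.0  # full n-by-n matrix
    return float(np.median(pairwise))
```

For the skewed sample [0, 0, 0, 10] the estimate is 0.0, whereas the sample mean is 2.5, illustrating the estimator’s resistance to a stray tail value.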
2.4.2. Alternative measures of scale (or dispersion). Also, for many non-Gaussian distributions (e.g., fat-tailed, skewed, or outlier-contaminated distributions) the standard deviation is an over-estimate of the “typical” dispersion of the distribution. The median absolute deviation about the median value (the so-called MAD), scaled by a constant c, is commonly used as an alternative measure of scale.7 However, the scaled MAD is only 37% efficient in the case of the Gaussian distribution. Rousseeuw and Croux (1993) [3] define two alternative auxiliary estimates of scale, which they term Sn and Qn, that achieve higher efficiency (58% and 81%, respectively) at the Gaussian distribution. Both of these are defined in terms of
7The value c = 1.4826 is chosen to make the scaled MAD asymptotically agree with the standard deviation in the case of a Gaussian distribution.
[Figure 3 panels: histograms of Mean/H–L Location (left) and Std. Dev./Qn (right).]
Figure 3. At left is shown a histogram of the ratio of mean score to Hodges–Lehmann location estimate, including all reviewers’ raw score distributions for the four most recent proposal semesters. At right is shown the histogram of ratios of standard deviation to the Qn estimator of scale. The mean values of these distributions are 1.024 ± 0.043 and 1.035 ± 0.202, respectively.
order statistics on the set of pairwise absolute differences, {|xi − xj| ; i < j}.8,9 Like the MAD, the influence function for Sn has discontinuities—whereas the influence function for Qn is smooth. Figure 3 (right) shows the distribution of the ratio of the standard deviation to the Qn estimate of scale, including all reviewers’ raw score distributions for the four most recent proposal cycles.
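The scaled MAD and Qn can be sketched in Python as follows. The consistency constants (c = 1.4826 and d = 1.0483) are taken from footnotes 7 and 9 of this memo; the small-sample bias corrections mentioned there are omitted from this sketch, so the values are illustrative rather than definitive.

```python
import numpy as np
from itertools import combinations

def scaled_mad(x, c=1.4826):
    """Median absolute deviation about the median, scaled so that it
    agrees asymptotically with the standard deviation for Gaussian data."""
    x = np.asarray(x, dtype=float)
    return c * float(np.median(np.abs(x - np.median(x))))

def qn(x, d=1.0483):
    """Qn scale estimate per footnote 9: the k-th order statistic of the
    pairwise absolute differences {|x_i - x_j|, i < j}, where
    k = C(h, 2) and h = floor(n/2) + 1.  Bias correction omitted."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    diffs = sorted(abs(a - b) for a, b in combinations(x, 2))
    h = n // 2 + 1
    k = h * (h - 1) // 2       # binomial coefficient C(h, 2)
    return d * diffs[k - 1]    # k-th order statistic (1-indexed)
```

Note that, unlike the standard deviation or the MAD, `qn` never references a location estimate (mean or median) of the sample.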
2.4.3. Kernel density estimates and “smooth histograms”. We show in Figure 1 and Appendices B and C, in addition to the traditional histogram, a smooth curve representing some putative, underlying probability density function (PDF) for each of the given raw score distributions. These smooth curves are based on the so-called kernel density estimator, as described by Silverman (1986) [4]. The method is easy to describe: given data samples x1, x2, . . . , xn, place a unit mass at the abscissa of each data sample, convolve the resulting sum of δ-distributions with a smooth, non-negative kernel (e.g., a Gaussian) of appropriately chosen width, and normalize so that the resulting smooth function integrates to unity. Expressed mathematically, the kernel density estimate is given by
\[
\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left( \frac{x - x_i}{h} \right) , \tag{2}
\]
where K is the kernel function and h is the bandwidth. The plot of a kernel density estimate is sometimes referred to as a “smooth histogram”.
This is the terminology used in Mathematica. For the curves shown in this memorandum we use the Mathematica implementation of kernel density estimates, with a Gaussian kernel. The kernel bandwidth is chosen adaptively. For bandwidth selection we use, variously, a method due to Scott or another due to Silverman; both are implemented within Mathematica. And for the traditional histogram, we select a bin width commensurate with the bandwidth of the adaptive kernel density estimate.
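For readers without Mathematica, Eq. (2) is easy to implement directly. The sketch below (our own) uses a Gaussian kernel and falls back on Silverman’s rule of thumb for the bandwidth; since the memo’s figures use Mathematica’s adaptive selectors, results will differ in detail.

```python
import numpy as np

def gaussian_kde(samples, h=None):
    """Kernel density estimate per Eq. (2) with a Gaussian kernel K.
    If no bandwidth h is supplied, use Silverman's rule of thumb,
    h = 0.9 * min(sd, IQR/1.34) * n**(-1/5)."""
    x = np.asarray(samples, dtype=float)
    n = len(x)
    if h is None:
        iqr = np.percentile(x, 75) - np.percentile(x, 25)
        h = 0.9 * min(x.std(ddof=1), iqr / 1.34) * n ** (-0.2)
    def f_hat(t):
        t = np.atleast_1d(np.asarray(t, dtype=float))
        z = (t[:, None] - x[None, :]) / h
        # (1/(n h)) * sum_i K((t - x_i)/h), with K the standard normal PDF
        return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))
    return f_hat
```

By construction the returned density integrates to unity, whatever the sample.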
The smooth histogram resulting from a kernel density estimator has a major advantage over its traditional counterpart, in that the traditional histogram can be overly sensitive to
8Sn is defined via Sn = c lomedi {himedj |xi − xj |}, where lomed is the ⌊(n + 1)/2⌋th order statistic, himed is the (⌊n/2⌋ + 1)st order statistic, ⌊·⌋ denotes the greatest integer (or floor) function, and c = 1.1926. Additionally, a bias correction is applied which, in the case of Gaussian-distributed data, makes this estimator unbiased for small sample sizes.
9Qn is defined via the kth order statistic of pairwise differences: Qn = d {|xi − xj | ; i < j}(k), where k = C(h, 2) = h(h − 1)/2, h = ⌊n/2⌋ + 1, and d = 1.0483. A bias correction is applied, as in the case of Sn.
choice of origin—or choice of bin locations, in general.10 The smooth histogram is sensitive to neither. The most powerful modern methods for tests of multi-modality are based on kernel density estimates [4]. One disadvantage of smooth histograms is that long-tailed distributional detail may not show up as well as with a traditional histogram (though a data-adaptive variable bandwidth choice, dependent on local density, may overcome this disadvantage).
2.4.4. Quantitative comparison of score distributions. Since the end product of panel review is a rank-ordered list of proposals based on final scores, quantitative rank-order comparisons of score distributions (e.g., one reviewer vs. another, or one scoring method vs. another) are very apt. For lists of equal length, the Kendall τ and Spearman ρ measures of rank correlation are commonly used. Definitions for these vary slightly; we use the tie-corrected versions that are implemented in the standard Mathematica library. The ρ and τ values can vary over the range [−1, 1] (+1 if the rank orderings are identical, −1 if they are the reverse of each other, 0 for no association).
Langville and Meyer [5] introduce a modified Kendall τ which can be used for comparison of partial lists (e.g., top-k lists). The so-called Spearman weighted footrule is defined analogously to Spearman’s ρ, but assigns higher weight to discrepancies at higher rank orders.
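In Python, SciPy’s `kendalltau` (which computes the tie-corrected τ-b variant by default) and `spearmanr` play the role of the Mathematica routines mentioned above. The score vectors here are hypothetical, chosen so that only one pair of proposals is ranked discordantly:

```python
from scipy.stats import kendalltau, spearmanr

# Two reviewers' raw scores for the same five proposals (hypothetical
# values); the first two proposals are ranked in opposite order.
scores_a = [0.5, 1.0, 2.5, 4.0, 8.0]
scores_b = [1.0, 0.5, 3.0, 5.0, 9.0]

tau, _ = kendalltau(scores_a, scores_b)   # tie-corrected tau-b
rho, _ = spearmanr(scores_a, scores_b)    # Spearman rank correlation
```

With one discordant pair out of ten, τ = 0.8 and ρ = 0.9.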
2.5. Comments and Recommendations. In this section we offer a few suggestions for minor modifications of the proposal scoring system.
1. Adjustment of scale of reviewers’ score distributions. In Appendix B we saw that the reviewers’ raw score distributions often are skewed. For such distributions, the standard deviation tends to be an over-estimate of the typical spread between scores. Perhaps the scaled MAD, Sn, or Qn should be used for normalization of the score distributions, in lieu of the standard deviation. We would suggest the use of Qn, which is based on pairwise differences of scores, has high efficiency and a smooth influence function, and—unlike the standard deviation or the MAD—is a scale estimate which does not depend on a prior estimate of location (e.g., mean or median).
If this change were made, however, there would be more occurrences of scores outside the range [0, 10], because the influence of scores in the tail of a narrow, skewed raw score distribution would increase. One might prefer to truncate the distribution of normalized scores to the [0, 10] range.
2. Shift of the reviewers’ score distributions. Likewise, we might consider use of the median or the H–L location estimate of each reviewer’s score distribution—rather than the sample mean—for alignment of reviewers’ score distributions in the score normalization process. But we believe the effect would be relatively minor compared with that of Suggestion #1, given the nature of the ratio distributions shown in Figure 3.
3. Use of data from all cycles for score normalization. Each reviewer ordinarily serves for two years (four consecutive semesters). In Appendix B we see—for a given reviewer—a fair degree of consistency between score distributions from consecutive semesters. Hence we might consider normalizing over previous semesters, as well as the immediate one, for calibrating reviewers’ scores. However, if reviewers were at some point to be given revised scoring guidelines, this idea would certainly be ruled out (except for following semesters).
4. Normalization of panel chair’s score distribution. One should note that, because the panel chair votes only when another member of the panel is conflicted, there often will be
10This can be especially the case if the data are quantized. (As they are, in our case.)
a paucity of scores from the panel chair, and therefore a relatively poorer calibration than for other reviewers. Incorporation of Suggestion #3 could help somewhat.
5. Should panel chairs always vote (except when conflicted)? If the panel chairs were to vote on all their panel’s proposals, then—obviously—there would be better calibration for their scores, and, typically, five or six reviewers per proposal, rather than four or five. The rationale for the current scheme is probably (1) that the panel chair should not have the additional burden, beyond his or her organizational duties, of scoring every proposal, and (2) possibly, that the chair should not be given additional power. Whatever the rationale, we feel it ought to be elucidated somewhere.
6. Reviewer effort expended in resolving minor score differences. Some reviewers, in spite of having a large number of proposals to review (50 or 60+ in some cases), produce raw score distributions which are either completely free of tie scores—equivalent to a total rank ordering of the proposals—or very nearly free of ties. It almost surely takes iteration to achieve this level of discrimination, and we do not believe it is worth the effort to discriminate this finely between similarly ranked proposals. Reviewers perhaps should be instructed not to worry about minor score differences—that differences with respect to the other panelists’ scores will likely outweigh the fine tuning of one’s own scores.
7. Point scale. Along this same vein of reasoning: if the point scale had coarser granularity—e.g., permitting only integer, or integer and half-integer, scores—then reviewers would not be able to discriminate so finely as now between similarly ranked proposals,11 and this restriction could possibly reduce the reviewers’ exertion of effort, lessening their burden.
8. Rating one’s own competence to review a given proposal. In some review systems, particularly for journal or conference paper refereeing, reviewers are asked to rate their own level of expertise in scoring the assigned paper or proposal (perhaps on a coarse scale of one to three). See Haenni (2008) [6]. These self-confidence levels can then be factored in, via appropriate weighting, when computing an aggregate score for each proposal. At first glance, since we have eight expert SRP panels rather than one, our expectation might be that every panel member is fully qualified to review every proposal. But perhaps this idea is worth consideration.
This could alleviate some of the burden on reviewers by relieving them of the need to agonize over score decisions in sub-specialty areas that they are not totally familiar with. It could lessen their embarrassment in the face-to-face panel review if they have been too squeezed for time to fully research every proposal. Also it could alleviate, somewhat, the need for score modifications in the SRP group meeting—and thus make the process more effective and objective, overall.
9. Grading on a curve. The current normalization scheme does not modify the shape of a reviewer’s score distribution (recall that it only shifts and linearly stretches or shrinks it). A simple alternative would be to grade on a curve: i.e., to modify each reviewer’s score distribution to match the percentiles of some target distribution, say, a truncated normal distribution with a mean of five and standard deviation equal to two.
Another possibility would be to choose as the target distribution the mean raw score distribution, averaging over all reviewers in the panel. That way, the scores of a “typical” reviewer (say, with a mean near 4.0, a standard deviation near 1.85, the typical positive skew, and few outliers) would be minimally altered. (See Section 5, below.)
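Percentile matching of this kind takes only a few lines of code. In this sketch (our own construction, not an existing NRAO procedure) the default target is a normal distribution with mean five and standard deviation two, truncated to the score range [0.1, 9.9]; ties in the raw scores are broken arbitrarily by the ranking step, which a real implementation would want to handle explicitly.

```python
import numpy as np
from scipy.stats import truncnorm

def grade_on_curve(raw, target=None):
    """Map a reviewer's raw scores onto a target distribution by matching
    percentiles (i.e., grading on a curve)."""
    if target is None:
        # Normal(mean 5, sd 2) truncated to the score range [0.1, 9.9];
        # truncnorm takes the cut points in standard-deviation units.
        target = truncnorm((0.1 - 5) / 2, (9.9 - 5) / 2, loc=5, scale=2)
    raw = np.asarray(raw, dtype=float)
    # Percentile of each score within the reviewer's own distribution,
    # using the Hazen plotting position so percentiles stay in (0, 1).
    ranks = raw.argsort().argsort() + 1
    pct = (ranks - 0.5) / len(raw)
    return target.ppf(pct)
```

Because only percentiles are used, the mapping is monotone: the reviewer’s rank ordering of proposals is preserved exactly.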
11For example, if the scores were restricted to half-integer scores in the range [.5, 9.5], then there would be only 19 distinct possible raw scores, as opposed to 99.
2.6. Discussion. At a recent NRAO scientific staff meeting, concerns were raised about the appropriateness of proposal review combining all three instrumental categories (GBT, VLA, VLBA), as opposed to having separate review panels for each of these categories. The consolidated review is likely more economical than the alternative, and it is more in accord with the prevailing “One Observatory” philosophy. But one might argue that each instrument ‘deserves’ its own dedicated review structure.12 A specific concern that was raised was that, given the premier capabilities of the GBT for pulsar studies, meritorious proposals for pulsar studies using unique capabilities of the VLA might be unfairly out-competed by GBT proposals in the consolidated review (within the ETP panel). In light of this we have taken a detailed look at the distribution of normalized scores, by instrument, for the four most recent review cycles. Histograms of these distributions, along with summary statistics, are shown in Appendix C. The cumulative distributions are shown in Appendix D, together with distributional two-sample test statistics (Kolmogorov–Smirnov, Cramér–von Mises, and Anderson–Darling P-values) showing pairwise comparisons: GBT vs. VLA, GBT vs. VLBA, and VLA vs. VLBA.
In Appendix E we compare, for each of the SRPs, the initial rank-order aggregate preferences vs. the rank order after SRP meeting score adjustments. (Since only the aggregated scores are adjusted, as opposed to individual reviewers' scores, c.d.f. comparisons like those of Appendix D are not possible.) We were surprised by the large number of score adjustments and rather extreme rank-order excursions which are seen in some cases (e.g., NGA panel, semester 13A).
3. Comparison with Procedures Used Elsewhere
[Note: We apologize that this section is incomplete. We will try to make an updated version available.]
European Southern Observatory. An ESO working group undertook a review of their proposal selection process in 2012. The group report, and an accompanying study of the growth of observing programs at ESO, were published in the December 2012 issue of the ESO Messenger [15, 16]. Their review system includes thirteen panels, with six members each, to cover four science categories: Cosmology (three panels); Galaxies and Galactic Nuclei (two panels); ISM, Star Formation, and Planetary Systems (four panels); and Stellar Evolution (four panels). Proposals for all the ESO telescopes, at Paranal and La Silla, as well as APEX, are reviewed by the same panels. They review typically 1000 proposals per semester, with an average of approximately 70–80 proposals per panel. Their review process is structured similarly to ours. The rating scale is 1 to 5 (low is best), with a granularity of 0.1. The bottom 30% of proposals are "triaged" (i.e., not considered in the post-rating panel discussion).13
Additional details are given in [17]. One notable difference from the NRAO review process is that, in committee, revised scores for proposals are submitted by each reviewer, and this is done by formal written ballot. These scores then are averaged to arrive at the group consensus (in contrast to the procedure in the NRAO SRPs, where there is no prescribed procedure for score adjustment). Also, apparently, separate votes are taken per telescope.
12. Similar concerns have been raised concerning ESO proposal review. See [15].
13. However, according to [17] there is a mechanism by which a panel may request that a triaged proposal be "resurrected".

ALMA. For ALMA proposals there are five science categories (Cosmology and the high-z universe; Galaxies and galactic nuclei; ISM, star formation and astrochemistry; Circumstellar disks, exoplanets, and solar system; and Stellar evolution and the Sun) [18]. There are eleven review panels (two for category 1; three each for categories 2 and 3; two for category 4; and one for category 5), with seven members per panel. Initially each proposal is rated by four reviewers. The range of scores is 1 to 10 (low score is best). Each reviewer's raw score distribution is normalized to a common mean and variance. The bottom 30% are triaged, and not considered further by the review committee. Otherwise, the initial scores are taken as recommendations only. A final rank ordering of proposals is arrived at by consensus of the panel members.
Arecibo Observatory. Arecibo has a similar panel structure and similar review procedure.They use a scale of 1 to 9; high numerical score is best.
Hubble Space Telescope.
NOAO.
National Science Foundation and National Institutes of Health. Proposal review procedures in use at NSF are not uniform across the various divisions of the agency, according to Robert L. Dickman and William E. Howard III (private communication). Based on information from the Web, it appears that relatively more uniform procedures have been adopted by NIH than by NSF.
According to Hal R. Arkes (2003) [20], in 1994 the GAO issued a report with an evaluation of the review procedures at NSF and NIH. With regard to NSF, the authors of this report "were concerned that the stated criteria by which NSF proposals should be evaluated were not, in fact, the only criteria by which such proposals were evaluated. . . . [and] the GAO 'stated that we found that unwritten or informal criteria were used by panels at all three agencies'." In the 1994–1995 time period, according to Arkes, both NSF and NIH were revamping their review processes, and he was involved with both efforts.
Arkes chose to examine one of the grant review processes, in which each reviewer (out of four, total) was asked to submit numerical scores on each of four specific criteria, and also to submit an overall score. He found, by regression analysis (for 70 proposals), that, while three out of four reviewers' overall scores agreed with their scores on the individual criteria (R² values between 0.80 and 0.95, accounting for a large proportion of the variance in the overall ratings), one reviewer was not consistently using these four criteria in generating an overall rating (R² = 0.28).
Apparently, in the mid-90s NSF and NIH review panels were not generally using score normalization. Arkes suggested using z-scores, i.e., standardizing each reviewer's scores to a mean of zero and a standard deviation equal to one. This is entirely equivalent to our normalization procedure.
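The z-score transformation is one line of arithmetic; a small illustrative sketch (with made-up scores) follows:

```python
import numpy as np

raw = np.array([2.0, 3.5, 3.0, 6.5, 4.0, 1.5])    # one reviewer's raw scores (invented)
z = (raw - raw.mean()) / raw.std(ddof=1)          # shift to mean 0, scale to sd 1
print(np.round(z, 3))
```

Any affine normalization to a common mean and standard deviation differs from this only by a fixed rescaling, which is why the two are equivalent for ranking purposes.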
His second suggestion was that proposals be "triaged" before panel discussion (similarly to the ESO procedure), i.e., to generate "cut-off" scores, so that panelists would not need to discuss proposals that had no chance of funding.
His final suggestion was to use what he termed "disaggregated" reviewers' ratings—i.e., have the reviewers score only on the individual criteria, and not submit an overall score—the argument being that this would make it more likely that solely official criteria would be used.
The NSF rejected all three of his recommendations (including score normalization). NIH around this time period utilized a 150-point scoring system. Arkes cites psychological research studies which he said show that "if points on rating scales extend beyond approximately seven", rater reliability either drops or fails to increase. Other consultants, he says, were also recommending that the scale be trimmed.
The only recommendation that NIH adopted was to explicitly request the reviewers to rate by each criterion. And they considered score normalization to be too difficult to implement.
From current Web documentation we find that NIH now asks reviewers to score on a scale of 1 to 9. They must provide both an overall impact score and scores on, typically, five specified criteria. The guidelines state specifically that the impact score is not intended to be an average of criterion scores. Reviewers may modify their initial scores during the panel review meeting.
4. Rating Aggregation Methods Based on Pairwise Score Comparisons
There is a great deal of current interest in rating and ranking methods, and a large, rapidly growing literature. A comprehensive survey on the subject is given by Amy N. Langville and Carl D. Meyer in a book titled Who's #1? The Science of Rating and Ranking, published by Princeton University Press in 2012 [5]. There the concentration is on algebraic and graph-theoretic methods (rather than classical or Bayesian statistical methods) which are widely used today in fields such as sports team ranking, e-commerce (e.g., Amazon, Netflix, most major retailers), and information search and retrieval (e.g., Google).14 Algorithmic techniques based on the same theory as these can be used for aggregating the scores of journal referees or proposal reviewers. On pp. 179–181 of Langville and Meyer there is an aside titled "Ranking NSF Proposals". Below we will show the result of applying their suggested method to the AGN Panel, Cycle 13B reviewers' scores.
The method they propose is based on the Perron–Frobenius theorem (which dates back to 1907–1912) on the eigenvalues of real, square, non-negative, irreducible15 matrices. The foundations for methods of this type were established around the early 1950s (see [7]): by John R. Seeley (1949); by T.-H. Wei (1952), a student of the famous statistician Maurice Kendall at Cambridge University, in a Ph.D. dissertation titled The Algebraic Foundations of Ranking Theory; and by Kendall (1955). The idea is that given n entities to compare and an n × n matrix M expressing by how much entity i is favored over entity j (or vice versa), for all pairs i and j, the normalized eigenvector corresponding to the dominant eigenvalue of M provides the correct ranking. So, for each of m reviewers one can construct a square matrix Mk of pairwise score differences; i.e., in row i, column j of the matrix one has the score differential |s(i) − s(j)| if proposal i is rated above proposal j, and 0 otherwise. One normalizes each Mk by dividing by the sum of all entries. (It does not matter if the reviewer has not scored all proposals.) Then one forms the average of these matrices and finds the dominant eigenvector. Assuming the matrix is irreducible (which it will be unless the review assignments are inadequate), the Perron–Frobenius theorem guarantees that the elements of the dominant eigenvector will be non-negative. These elements then represent the aggregated scores of the reviewers, according to the theory of Wei et al.
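A minimal sketch of this construction (our own illustration, not code from [5], with invented scores) is below. For the sketch only, we assume a higher raw score means a better proposal (NRAO raw scores, where low is best, would first be inverted, s ↦ 10 − s), and we add a tiny constant to every matrix entry as a standard regularization so that the matrix is irreducible even when all reviewers agree on the ordering:

```python
import numpy as np

def aggregate_scores(S, eps=1e-6, iters=500):
    """Dominant-eigenvector aggregation of reviewer scores (Wei-Kendall idea).
    S is an m x n array of raw scores, np.nan marking unreviewed proposals;
    here a higher raw score is assumed to mean a better proposal."""
    m, n = S.shape
    M = np.zeros((n, n))
    for s in S:
        Mk = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                d = s[i] - s[j]
                if np.isfinite(d) and d > 0:      # proposal i rated above j
                    Mk[i, j] = d
        if Mk.sum() > 0:
            M += Mk / Mk.sum()                    # normalize each reviewer's matrix
    M = M / m + eps                               # eps: crude irreducibility fix
    v = np.full(n, 1.0 / n)
    for _ in range(iters):                        # power iteration -> Perron vector
        v = M @ v
        v /= v.sum()
    return v

# Three hypothetical reviewers scoring four proposals (invented numbers).
S = np.array([[7.0, 3.0, 5.0, np.nan],
              [8.0, 2.0, 6.0, 4.0],
              [6.5, 3.5, np.nan, 5.0]])
r = aggregate_scores(S)
print(np.round(r, 4))
```

The components of the returned Perron vector play the role of the aggregated ratings; sorting them gives the consensus rank order.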
Results using the Langville–Meyer method. Figure 4 shows a comparison between our usual mean standardized scores and the scores obtained using the Langville–Meyer formulation of the method above. The comparison is by means of bipartite score plots. Proposal ID numbers are shown along the vertical axes. In general, the quartile memberships agree fairly well, but we were surprised by the number of relatively large jumps in rank (which we observed also for other cycle/panel pairs). In the case of Proposal 8051, we see a jump from the third quartile to the first, and for Proposal 7756 we see a similarly large jump—both of these proposals were rated by only three reviewers. In the first quartile, Proposal 7972
14Google’s CEO Larry Page (a computer scientist) has a net worth of around $31 billion, largely dueto the success of his PageRank algorithm which is at the heart of the Google search engine.
15. The definition is somewhat technical.
jumps eight places higher in ranking—this proposal was rated by only two reviewers. For Cycle 13B, AGN, the distribution of the number of reviewers per proposal is as follows: just one proposal had only two reviewers; three proposals had just three reviewers; seven proposals had four reviewers; and thirty-nine proposals had five reviewers. The percentages are: 2 reviewers, 2%; 3 reviewers, 6%; 4 reviewers, 14%; and 5 reviewers, 78%.
Results using a method due to Gleich and Lim. Another approach to the aggregation of ratings or scores is described by David Gleich and Lek-Heng Lim in [8] and Jiang et al. [9]. We summarize the Gleich and Lim process as follows: They begin by noting a connection to skew-symmetric matrices.16 Given a column vector of n scores s = (s1, . . . , sn)^T, the matrix Y of pairwise score differences, Yij = si − sj, is skew-symmetric. Assuming Y ≠ 0, Y is of rank two, since it can be written in the form
Y = s e^T − e s^T ,    (3)
where e is a column vector of n ones. Suppose one were given a measured version Ŷ of that matrix, contaminated by noise and perhaps missing some elements. In that case, one could solve for a low-rank, skew-symmetric approximation to Ŷ which is, in some well-defined sense,17 closest to Ŷ. (We denote the kth individual reviewer's pairwise score difference matrix by Y(k).)
This is an example of the so-called matrix completion problem, which is a bit of a hot topic these days, as it arises in contexts such as compressive sensing. Algorithms for matrix completion are discussed by Gleich and Lim in [8, Section 3]. They treat the special problem of matrix completion restricted to the class of skew-symmetric matrices. Suppose within a panel we have m reviewers of n proposals, whose scores—or ratings—are given by an m × n matrix R. We then can form an n × n matrix Ŷ whose elements represent the arithmetic means of reviewers' pairwise score differences, i.e.,

Ŷij = Σk (Rki − Rkj) / #{k | both Rki and Rkj exist} ,    (4)

where #{·} denotes the cardinality of the given set,18 the sum runs over that same set, and in the case of a denominator equal to 0, we set Ŷij = 0. In our case, Ŷ is not an exact pairwise difference matrix because not all m reviewers review all proposals.
In Section 3.1 of their paper, Gleich and Lim use the singular value projection algorithm of Jain et al. [10] to find a rank-2 skew-symmetric approximation Y* which is nearest to
16. An n × n real matrix M is skew-symmetric if M = −M^T.
17. As measured, say, by a matrix norm.
18. Gleich and Lim include a few other possibilities: Rather than the arithmetic mean of score differences, one might choose the (log) geometric mean of score ratios, i.e.,

Ŷij = Σk (log Rki − log Rkj) / #{k | both Rki and Rkj exist} ;

or binary comparison, in which case Y(k)ij = sign(Rkj − Rki) and

Ŷij = Prob_k(Rki > Rkj) − Prob_k(Rki < Rkj) ;

or the logarithmic odds ratio, with

Ŷij = log [ Prob_k(Rki ≥ Rkj) / Prob_k(Rki ≤ Rkj) ] .
Figure 4. Comparison between mean standardized scores and scores obtained using the Langville–Meyer algorithm [5, pp. 179 ff.] for the Cycle 13B, AGN Panel. In the plot at left, the vertical scale is linear, from minimum score (top) to maximum (bottom). (The Langville–Meyer dominant eigenvector scores have been scaled to the range [0.1, 9.9].) At right, the score distances are ignored; i.e., the comparison is solely by rank order. The first quartile scores are shown in green, the second in blue, etc. Dashed lines correspond to inter-quartile jumps.
Figure 5. Comparison between mean standardized scores and scores obtained using the Gleich–Lim algorithm [8] for the Cycle 13B, AGN Panel. Here we see essentially identical memberships in the first quartile (the only exception being proposals 7972 and 8178, which stay very close to the quartile boundary), and we see identical memberships in the fourth quartile. There are two pairs of exchanges between the second and third quartiles.
Ŷ in the sense of the so-called nuclear norm. The nuclear norm is the matrix equivalent of the discrete ℓ1 vector norm. It is also known as the trace norm, or in physics a Ky Fan norm. For a general real matrix A, the nuclear norm is simply the sum of the singular values of A; the singular values are equal to the square roots of the non-zero eigenvalues of A^T A. Given the minimum norm solution Y*, the aggregate scores from the algorithm are s = (1/n) Y* e (which we rescale to cover the range [0, 10]). A Matlab implementation of this algorithm can be found among the research codes available at David Gleich's home page.19 We used our own Mathematica implementation.
Figure 5 shows a comparison between the mean standardized scores and scores obtained using the Gleich–Lim algorithm for the Cycle 13B, AGN Panel. Here we see rather less extreme differences than in the comparison with Langville–Meyer scores (Fig. 4).
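The following sketch is a simplified, one-shot version of this computation (our own illustration with made-up scores, not the authors' code): it forms the mean pairwise-difference matrix of Equation 4, replaces the iterative singular value projection by a single best rank-2 SVD truncation, and then averages the rows. For illustration we take higher scores as better; NRAO raw scores would first be inverted.

```python
import numpy as np

def gleich_lim_sketch(R):
    """R is an m x n score array, np.nan marking unscored proposals.
    Returns aggregate scores s = (1/n) Y2 e, where Y2 is the rank-2
    truncated SVD of the mean pairwise-difference matrix (Eq. 4)."""
    m, n = R.shape
    Y = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            both = np.isfinite(R[:, i]) & np.isfinite(R[:, j])
            if both.any():                        # Eq. 4; else leave Y[i, j] = 0
                Y[i, j] = np.mean(R[both, i] - R[both, j])
    U, sig, Vt = np.linalg.svd(Y)
    Y2 = U[:, :2] @ np.diag(sig[:2]) @ Vt[:2, :]  # best rank-2 approximation
    return Y2 @ np.ones(n) / n                    # row means = aggregate scores

R = np.array([[7.0, 3.0, 5.0, np.nan],           # three hypothetical reviewers,
              [8.0, 2.0, 6.0, 4.0],              # four proposals (invented scores,
              [6.5, 3.5, np.nan, 5.0]])          # higher = better for this sketch)
s = gleich_lim_sketch(R)
print(np.round(s, 3))
```

Because the rank-2 truncation of a skew-symmetric matrix is again skew-symmetric, the resulting scores sum to zero; the full method replaces the one-shot truncation by the iterative projection of Jain et al.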
5. A Quadratic Programming Method and a Probabilistic Score Normalization Algorithm for Score Aggregation
In this section we briefly describe two approaches from the recent literature on calibrating and aggregating reviewers' scores. These algorithms assume that high scores are best, so we would need to additively invert the input raw scores (s ↦ 10 − s), and invert the output aggregate scores as well.
5.1. The Method of Roos et al. Two interesting papers on calibrating the scores of biased reviewers were published in 2011 and 2012 by Roos et al. [11, 12]. These authors first describe a standard linear modeling approach, referred to in the statistical literature as two-way cross-classification in the analysis of variance (ANOVA), that can be solved by linear least squares if all reviewers review all proposals. The score model

yij = µ + αi + βj + εij    (5)

is additive, consisting of the overall mean score µ, the mean difference αi between the scores of reviewer i and µ, the mean difference βj between the scores of proposal j and µ, and a random error εij. The errors are assumed to be independent and identically distributed (i.i.d.). The solution parameters are the αi, which can be thought of as reviewer lenience parameters, and the βj, which are estimates of intrinsic proposal quality. For ranking n proposals it suffices to have estimates of the quality differences with respect to one chosen proposal, say βj − β1. For the imbalanced case, in which reviewers score differing subsets of all proposals, a constrained least-squares solver is required, with the constraints Σi αi = 0 and Σj βj = 0. This algorithm is inadequate for our purposes, because it does not include a multiplicative scale factor.
The authors next describe a nonlinear model

yij = µ + γi (αi + βj + εij) ,    (6)

which does include scale factors, the γi. With the substitution γ̃i = 1/γi, the least-squares objective function becomes

Σi,j (yij γ̃i − µ γ̃i − αi − βj)² .    (7)
19. URL: https://www.cs.purdue.edu/homes/dgleich/codes/
Figure 6. Comparison between mean standardized scores and scores obtained using the Roos quadratic programming algorithm [11] for the Cycle 13B, AGN Panel.
Defining a vector x = (β1, . . . , βn, γ̃1, . . . , γ̃m, α1, . . . , αm), one ends up with the quadratic programming problem:

minimize (1/2) x^T Q x  subject to  A x ≥ b ,    (8)

where the matrix Q and the constraint matrix A both derive from Equation 7 and the condition that (1/m) Σ γ̃i = 1. The solution can be obtained using existing solvers for bound-constrained quadratic programming. Roos et al. used the Matlab MINQ; we used the Mathematica FindMinimum (which is actually a general-purpose solver). The solution is the maximum likelihood estimate if the εij are i.i.d. Gaussian. A sample result is shown in Figure 6.
5.2. Grading on a Curve: The Method of Fernandez, Vallet, and Castells. Fernandez et al. in 2006 published a paper [13] on the topic of probabilistic score normalization for rank aggregation. Their method consists in precisely matching the percentage points of each reviewer's raw score distribution to those of some common target distribution, then arithmetically averaging the scores so obtained. Thus it can be thought of as an extreme form of "grading on a curve".
In their own application, in information retrieval, these authors appear to have specific grounds to favor one target distribution over another, which are not relevant to our own application. However, it occurred to us that it might be interesting in our application to apply this method, choosing as target distribution the mean distribution of raw scores, averaging over all reviewers within the given SRP for the given semester. Our rationale is that, in this case, the "typical" reviewer would see relatively less difference with respect to his or her initial scores.20 Another thought would be to choose as target distribution a parametrized, bounded distribution (covering the range [0, 10]) with similar first four moments (mean, standard deviation, skewness, and kurtosis) to the mean score distribution shown in our Figure 2.
The mathematics of this method can be described succinctly: Let F denote the cumulative distribution function (c.d.f.) of the target distribution, let F^(−1) denote the inverse c.d.f., and let Fr denote the empirical c.d.f. of the reviewer's raw score distribution. Then the transformation from raw score to normalized score is

s_normalized = F^(−1)( Fr(s_raw) ) .    (9)
A sample result is shown in Figure 7.
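A sketch of transformation (9) with an empirical target follows (made-up scores; the pooled panel sample stands in for the target distribution, as in our suggested variant):

```python
import numpy as np

def grade_on_curve(raw, target):
    """Map raw scores through the reviewer's empirical c.d.f. F_r, then
    through the inverse c.d.f. of the target sample (Eq. 9)."""
    raw = np.asarray(raw, dtype=float)
    # Empirical c.d.f. value of each raw score (midpoint convention for ties).
    Fr = np.array([(raw < s).sum() + 0.5 * (raw == s).sum() for s in raw]) / len(raw)
    return np.quantile(target, Fr)     # inverse target c.d.f. = sample quantile

reviewer = np.array([1.0, 2.0, 2.5, 6.0, 8.0])                 # invented raw scores
panel = np.concatenate([reviewer, [3.0, 4.0, 4.5, 5.0, 7.0]])  # pooled "target" sample
curved = grade_on_curve(reviewer, panel)
print(np.round(curved, 2))
```

The mapping is monotone, so each reviewer's rank order is preserved exactly; only the spacing of the scores is changed to match the target.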
6. A Proposal by Merrifield and Saari for Distributed Peer Review
We would like to call attention to a 2009 paper by Michael Merrifield and Donald Saari [14], who advocate an alternative, distributed approach to peer review of telescope proposals. (Merrifield has served as a member of the ESO proposal review committee.) Their approach would spread the task of proposal review across the user community by requiring each PI to review a certain number, m, of other proposals. That number might be m = 10, for example. (For each additional proposal with the same PI, he or she would be given another m review assignments.) Conflicts of interest would be declared, as usual, in which case alternate assignments would be made. The major advantages are:
(1) that no one would be burdened by the task of reviewing a very large number of proposals;
(2) that the model is scalable: if the number of proposals increases, the number of review assignments, per reviewer, does not;
(3) that each proposal would be reviewed by the same number of reviewers—conflicts of interest would not reduce that number; and
(4) that instrument-by-instrument proposal review (GBT, VLA, and VLBA/HSA, each separately) would be more affordable than under the traditional review panel model.

20. And be less likely to grumble.
Each PI would be required to perform his or her full list of assignments; failure to do so would result in disqualification of the PI's own proposal(s). Thus, the workload would be distributed evenly, and, as the authors point out, "there is a disincentive to taking the lottery-ticket approach to telescope applications." And the views of the entire community would be taken into account. Consensus rank-order preferences would be assigned using a rank aggregation of the type discussed in Section 4 of this report. Various safeguards would be built in, in order to reward good refereeing.
Brinks et al. [15] report that the ESO working group ran a test of the method. NSF is sponsoring a pilot study within their Sensors and Sensing Systems program. The PI on that study, George Hazelrigg, an NSF official, reports that the response to the initial call for proposals was favorable (private communication).
7. Discussion
With regard to proposal scoring and score aggregation we conclude that:
(1) We are not out of the mainstream, with respect to the procedures used by other observatories and at the federal science agencies; however,
(2) Our score aggregation procedure is behind the current state of the art, as exemplified by the modern rating and ranking theories—developed by mathematicians and computer scientists—that are widely used in the corporate world.
In Section 2.5 we offered suggestions for minor modifications of the current score normalization and aggregation procedure. We believe these suggestions should be given careful consideration. However, we do not strongly advocate that the alternative score-aggregation procedures discussed in Section 4 should be adopted. This is because the practical difference from these might well be "in the noise," in comparison to the score adjustments that are made in the SRP Panel meetings. (We do note, however, that any of the alternative methods would be an easy plug-in replacement for the current score-aggregation module.)
On the other hand, if the External Review Committee were to recommend that a higher degree of reliance be placed on the initial, independently derived SRP scores, then these more "sophisticated" score-aggregation methods should be considered.
Figure 7. Comparison between mean standardized scores and scores obtained using the algorithm of Section 5.2 (Fernandez et al. [13], "grading on a curve") for the Semester 13B, AGN Panel proposals. The target distribution is the distribution of mean scores including all panel members. There are no inter-quartile jumps.
References

[1] Bryan Butler, "Requirements for the PST for the New NRAO Proposal Evaluation and Time Allocation Process", Version 2.10, NRAO, October 13, 2010; included here as Appendix A.
[2] J. L. Hodges, Jr. and E. L. Lehmann, "Estimates of location based on rank tests", Ann. Math. Stat., Vol. 34, No. 2, 1963, pp. 598–611.
[3] Peter J. Rousseeuw and Christophe Croux, "Alternatives to the median absolute deviation", J. Amer. Stat. Assoc., Vol. 88, No. 424, 1993, pp. 1273–1283.
[4] Bernard W. Silverman, Density Estimation for Statistics and Data Analysis, Monographs on Statistics and Applied Probability 26, Chapman & Hall/CRC, 1986.
[5] Amy N. Langville and Carl D. Meyer, Who's #1? The Science of Rating and Ranking, Princeton University Press, 2012.
[6] Rolf Haenni, "Aggregating referee scores: an algebraic approach", preprint, 2008; available on the Web at www.iam.unibe.ch/~run/papers/haenni08e.pdf.
[7] Sebastiano Vigna, "Spectral ranking", preprint, Nov. 8, 2013; arXiv:0912.0238v13.
[8] David F. Gleich and Lek-Heng Lim, "Rank aggregation via nuclear norm minimization", preprint, Feb. 23, 2011; arXiv:1102.4821v1.
[9] Xiaoye Jiang, Lek-Heng Lim, Yuan Yao, and Yinyu Ye, "Part 1: Rank aggregation via Hodge theory", NIPS Workshop on Advances in Ranking, Neural Information Processing Systems Foundation, Waikiki, HI, Dec. 2009; also arXiv:0811.1067v2.
[10] Raghu Meka, Prateek Jain, and Inderjit S. Dhillon, "Guaranteed rank minimization via singular value projection", preprint, 2009; arXiv:0909.5457.
[11] Magnus Roos, Jörg Rothe, and Björn Scheuermann, "How to calibrate the scores of biased reviewers by quadratic programming", in Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence, 2011, pp. 255–260.
[12] Magnus Roos, Jörg Rothe, Joachim Rudolph, Björn Scheuermann, and Dietrich Stoyan, "A statistical approach to calibrating the scores of biased reviewers: the linear vs. the nonlinear model", in Sixth Multidisciplinary Workshop on Advances in Preference Handling, 2012.
[13] Miriam Fernandez, David Vallet, and Pablo Castells, "Probabilistic score normalization for rank aggregation", in Advances in Information Retrieval: 28th European Conference on IR Research, ECIR 2006, Eds. M. Lalmas et al., Lecture Notes in Computer Science, Vol. 3936, Springer, Berlin/Heidelberg, 2006, pp. 553–556.
[14] Michael Merrifield and Donald Saari, "Telescope time without tears: a distributed approach to peer review", Astronomy & Geophysics, Vol. 50, Issue 4, Aug. 2009, pp. 4.16–4.20.
[15] Elias Brinks, Bruno Leibundgut, and Gautier Mathys, "Report of the ESO OPC Working Group", European Southern Observatory, The Messenger, Vol. 150, Dec. 2012, pp. 21–25.
[16] Ferdinando Patat and Gaitee Hussain, "Growth of observing programs at ESO", European Southern Observatory, The Messenger, Vol. 150, Dec. 2012, pp. 17–20.
[17] European Southern Observatory, ESO Period 90: A Step-by-Step Guide for OPC & Panel Members; available on the Web at www.vt-2004.org/public/about-eso/.../opc/docs/P90 step-by-step.pdf.
[18] Françoise Combes, "ALMA Proposal Review", Observatoire de Paris, slide presentation dated 12 November 2013; see www.asa2013.sciencesconf.org/file/56031.
[19] National Science Foundation, "Dear Colleague Letter: Information to Principal Investigators (PIs) Planning to Submit Proposals to the Sensors and Sensing Systems (SSS) Program October 1, 2013, Deadline", www.nsf.gov/publications/pub_summ.jsp?ods_key=nsf13096.
[20] Hal R. Arkes, "The non-use of psychological research at two federal agencies", Psychological Sci., Vol. 14, 2003, pp. 1–6.
[21] Neill Reid, "Behind the TAC Process", Space Telescope Science Institute, Feb. 13, 2014.
Appendix B
[Six histograms: Reviewers 2, 7, 8, 26, 125, and 133; abscissa: raw score, 0–10; ordinate: probability density.]
AGN Panel Raw Score Distributions, Cycle 12B (55 proposals)
Rev   nr  Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
  2   52  3.52     1.40      0.14      2.79    3.50     3.50        1.04  1.19  1.45
  7   48  3.06     1.18      0.63      2.87    3.00     3.00        1.48  1.19  1.03
  8   53  3.08     1.75      0.94      3.29    2.50     3.00        1.48  1.82  2.16
 26   17  3.62     1.68      0.38      2.12    3.50     3.50        2.22  1.89  2.05
125   48  5.45     2.73     -0.02      2.36    5.00     5.40        2.22  2.98  2.88
133   49  3.35     2.06      0.81      2.86    3.00     3.00        1.48  2.43  2.16
[Six histograms: Reviewers 2, 8, 125, 136, 139, and 140; abscissa: raw score, 0–10; ordinate: probability density.]
AGN Panel Raw Score Distributions, Cycle 13A (59 proposals)
nr Mean Std Dev Skewness Kurtosis Median H-L Loc Scaled MAD Sn Qn
2 57 3.97 1.50 0.03 2.13 4.00 4.00 1.48 1.58 1.738 56 3.26 2.18 0.85 3.09 2.95 3.05 2.00 2.03 1.87125 46 4.46 2.41 -0.05 2.33 4.10 4.50 2.82 2.39 2.46136 17 5.12 2.50 0.06 1.74 4.00 5.00 2.97 2.52 2.05139 49 4.17 3.09 0.33 1.72 3.60 4.20 3.56 3.77 3.02140 32 5.33 2.50 -0.09 1.85 5.00 5.25 3.34 2.98 1.99
Figure B-1. This figure shows AGN panel reviewers’ raw score distributions for proposal cycles 12B, 13A, 13B, and 14A. The mean, standard deviation, skewness, and kurtosis are given in tabular form, as are the sample median and the Hodges–Lehmann robust/resistant estimate of distribution centrality. Besides the standard deviation, three other estimates of distribution scale are shown: the (scaled) median absolute deviation (MAD) about the median, and the Sn and Qn scale estimators of Rousseeuw and Croux. GMVA proposal scores are excluded from these distributions. The ordinate in each case represents the probability density. (Continued on next page.)
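The robust estimators named in the caption are simple to compute. The sketch below is illustrative only (plain Python, not the memo's actual implementation); it shows the scaled MAD, the Hodges–Lehmann location estimate, and a naive version of the Rousseeuw–Croux Qn without its finite-sample correction factors:

```python
import statistics
from itertools import combinations

def scaled_mad(x):
    """MAD about the median, scaled by 1.4826 so that it estimates
    the standard deviation for Gaussian data."""
    m = statistics.median(x)
    return 1.4826 * statistics.median(abs(v - m) for v in x)

def hodges_lehmann(x):
    """Hodges-Lehmann location: median of all Walsh (pairwise)
    averages, each point also paired with itself."""
    return statistics.median((a + b) / 2 for i, a in enumerate(x) for b in x[i:])

def qn_naive(x):
    """Rousseeuw-Croux Qn in naive O(n^2) form, without the
    finite-sample correction: 2.2219 times the k-th smallest
    pairwise distance, k = h*(h-1)/2 with h = n//2 + 1."""
    h = len(x) // 2 + 1
    k = h * (h - 1) // 2
    dists = sorted(abs(a - b) for a, b in combinations(x, 2))
    return 2.2219 * dists[k - 1]

scores = [1, 2, 3, 4, 5]
print(statistics.median(scores))     # 3
print(round(scaled_mad(scores), 4))  # 1.4826
print(hodges_lehmann(scores))        # 3.0
```

The constants 1.4826 and 2.2219 are the usual Gaussian-consistency factors; production code (e.g., the Qn implementation of Rousseeuw and Croux) also applies small-sample corrections and an O(n log n) algorithm, both omitted here.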
AGN Panel Raw Score Distributions, Cycle 13B (50 proposals)
[Histogram panels for Reviewers 125, 136, 149, 150, 151, and 170; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
125       34   4.57    2.75     0.38      2.15     4.25     4.50      2.74      2.98  3.00
136       25   3.20    1.77     0.85      3.42     3.00     3.00      1.48      1.86  2.10
149       38   3.67    1.92    -0.01      2.34     3.85     3.70      1.85      2.03  2.02
150       42   3.25    2.41     0.95      3.01     2.80     3.00      2.45      2.15  2.04
151       48   1.57    1.62     1.97      6.68     1.00     1.25      0.82      0.83  1.03
170       47   3.77    1.60     0.29      2.18     4.00     4.00      1.48      1.22  2.16

AGN Panel Raw Score Distributions, Cycle 14A (60 proposals)
[Histogram panels for Reviewers 136, 149, 150, 151, and 170; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
136       21   2.95    1.42     0.87      2.80     2.50     2.75      1.48      1.25  1.04
149       51   5.00    1.97     0.00      2.87     5.00     5.00      1.93      2.06  1.95
150       53   4.62    2.02     0.53      3.29     4.50     4.50      1.48      1.82  2.16
151       58   2.11    1.86     1.33      4.58     1.50     1.90      1.48      1.55  1.46
170       57   4.54    2.16     0.28      1.85     4.00     4.50      1.48      2.42  2.17
Figure B-1 (Continued from previous page).
EGS Panel Raw Score Distributions, Cycle 12B (33 proposals)
[Histogram panels for Reviewers 1, 5, 6, 115, 116, and 126; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
1         33   4.49    1.28    -0.08      3.02     4.50     4.50      1.04      1.23  1.07
5         33   4.95    1.63    -0.20      1.95     5.00     4.95      1.93      1.84  1.92
6         32   5.06    2.40     0.05      1.93     5.00     5.00      2.97      2.39  1.99
115       27   2.89    1.78     0.67      2.87     3.00     2.50      1.48      2.47  2.11
116       29   4.53    2.32     0.09      2.54     4.50     4.50      2.22      2.46  2.12
126       10   2.35    0.94     0.35      2.13     2.25     2.25      1.11      1.19  0.81

EGS Panel Raw Score Distributions, Cycle 13A (48 proposals)
[Histogram panels for Reviewers 1, 5, 115, 116, 126, and 138; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
1         41   4.42    0.99     0.49      4.00     4.20     4.35      0.59      0.61  0.86
5         41   4.56    2.31    -0.08      1.90     4.50     4.55      2.97      2.56  2.58
115       46   4.28    1.44     0.13      2.33     4.00     4.25      1.48      1.79  2.05
116       39   4.23    1.95     0.43      2.15     4.00     4.25      2.22      2.44  2.14
126       16   3.41    1.11    -0.36      2.54     3.25     3.50      1.11      1.19  0.90
138       39   3.96    1.71     0.25      2.42     4.00     4.00      1.48      1.83  2.14
Figure B-2. Like Figure B-1 except showing the EGS panel reviewers’ raw score distributions. (Continued on next page.)
EGS Panel Raw Score Distributions, Cycle 13B (33 proposals)
[Histogram panels for Reviewers 115, 116, 126, 138, 145, and 146; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
115       33   4.05    2.01     0.11      1.52     4.00     4.00      2.97      2.45  2.13
116       30   4.57    2.08     0.15      2.42     5.00     4.50      1.48      2.39  1.97
126        8   3.69    1.85     0.86      2.43     3.00     3.25      0.74      1.20  1.49
138       30   5.33    2.49    -0.13      1.76     5.00     5.25      3.71      2.98  2.96
145       28   4.27    1.91     0.58      2.96     4.10     4.25      2.08      1.79  1.96
146       30   4.80    2.17     0.43      2.19     4.75     4.75      2.22      2.39  1.97

EGS Panel Raw Score Distributions, Cycle 14A (59 proposals)
[Histogram panels for Reviewers 115, 116, 126, 138, 145, and 146; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
115       58   4.06    1.88     0.27      2.27     3.60     4.00      2.08      1.79  2.09
116       51   4.81    2.08     0.14      2.32     5.00     4.75      2.97      2.43  2.16
126       11   4.77    2.03     0.05      2.08     5.00     4.75      1.48      2.60  1.97
138       52   4.67    1.89     0.17      2.48     5.00     4.50      1.48      2.39  2.07
145       56   3.36    2.23     0.96      3.68     3.00     3.25      2.22      2.39  2.08
146       59   4.97    2.24     0.22      1.81     4.50     5.00      2.97      2.42  2.17
Figure B-2 (Continued from previous page).
ETP Panel Raw Score Distributions, Cycle 12B (57 proposals)
[Histogram panels for Reviewers 31, 51, 53, 54, and 114; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
31        28   5.84    1.32     0.06      1.80     6.00     5.75      1.48      1.67  0.98
51        54   4.95    1.41    -0.13      2.32     5.05     4.95      1.63      1.67  1.45
53        55   4.95    2.06     0.04      2.17     5.00     5.00      2.22      2.42  2.17
54        54   3.56    2.08     1.02      3.53     3.00     3.50      1.48      1.79  2.08
114       32   3.12    1.78     0.61      3.07     2.90     3.05      1.85      1.79  1.79

ETP Panel Raw Score Distributions, Cycle 13A (47 proposals)
[Histogram panels for Reviewers 31, 51, 53, 54, and 114; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
31        26   4.23    2.07     0.24      2.48     4.50     4.25      2.22      2.39  1.94
51        46   3.60    0.91     0.06      2.44     3.65     3.60      1.11      1.07  0.82
53        44   4.99    1.91    -0.31      2.25     5.50     5.00      1.85      1.79  2.05
54        40   4.09    2.31     0.64      2.43     3.75     4.00      2.59      2.39  2.03
114       32   4.93    2.44     0.04      2.38     5.00     5.00      2.89      2.39  2.18
Figure B-3. Like Figure B-1 except showing the ETP panel reviewers’ raw score distributions. (Continued on next page.)
ETP Panel Raw Score Distributions, Cycle 13B (51 proposals)
[Histogram panels for Reviewers 31, 53, 114, 162, 163, and 164; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
31        25   2.61    1.62     1.12      3.73     2.00     2.50      1.48      1.24  1.05
53        45   4.77    1.91     0.25      2.39     4.50     4.75      2.22      1.83  2.15
114       33   4.50    2.16     0.42      2.61     4.20     4.40      2.08      2.08  2.13
162       33   4.09    1.77     0.75      2.63     4.00     4.00      1.48      1.23  2.13
163       44   3.29    0.61    -0.27      2.58     3.30     3.30      0.59      0.72  0.61
164       41   3.58    1.96     0.46      2.19     3.50     3.50      2.22      2.19  2.15

ETP Panel Raw Score Distributions, Cycle 14A (61 proposals)
[Histogram panels for Reviewers 53, 114, 162, 163, 164, and 186; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
53        28   4.25    1.51     0.19      3.13     4.00     4.25      1.48      1.19  1.96
114       42   5.00    1.88     0.00      2.51     5.00     5.00      1.78      1.91  2.04
162       39   4.15    1.90     0.96      2.72     3.50     4.00      1.48      1.22  1.07
163       57   2.57    0.69     0.98      3.93     2.50     2.50      0.59      0.61  0.65
164       46   4.31    2.20    -0.19      1.91     4.70     4.40      2.89      2.50  2.26
186       58   4.12    1.98     0.11      1.80     3.80     4.10      2.67      2.39  2.09
Figure B-3 (Continued from previous page).
HIZ Panel Raw Score Distributions, Cycle 12B (44 proposals)
[Histogram panels for Reviewers 39, 40, 89, 117, 118, and 129; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
39        37   4.66    2.65     0.05      1.76     4.50     4.75      3.71      3.06  3.21
40        38   4.51    1.60     0.36      2.84     4.50     4.50      1.48      1.19  2.02
89         8   4.25    1.83     0.67      2.08     3.50     4.00      1.11      1.20  1.49
117       40   3.84    1.72     0.10      2.20     3.65     3.78      2.00      1.79  2.03
118       40   4.98    2.57    -0.08      1.90     5.00     5.00      2.97      2.98  3.04
129       43   5.12    1.16     0.00      2.21     5.00     5.00      1.48      1.22  1.08

HIZ Panel Raw Score Distributions, Cycle 13A (51 proposals)
[Histogram panels for Reviewers 39, 40, 89, 117, 118, and 129; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
39        40   4.65    1.77    -0.24      2.15     5.00     4.75      1.85      1.79  2.03
40        43   4.62    1.26     0.05      2.29     4.50     4.65      1.48      1.58  1.08
89        11   4.91    1.81    -0.39      2.07     5.00     5.00      1.48      2.60  1.97
117       50   3.28    1.31     0.01      2.43     3.50     3.25      1.48      1.19  1.03
118       50   3.34    2.03     0.64      2.35     3.00     3.25      2.22      2.39  2.06
129       51   4.92    1.12     0.00      2.49     5.00     5.00      1.48      1.21  1.08
Figure B-4. Like Figure B-1 except showing the HIZ panel reviewers’ raw score distributions. (Continued on next page.)
HIZ Panel Raw Score Distributions, Cycle 13B (54 proposals)
[Histogram panels for Reviewers 89, 117, 118, 129, 154, and 155; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
89         5   6.40    2.07    -0.97      2.48     7.00     7.00      1.48      1.61  1.88
117       54   2.80    1.21     0.07      2.22     3.00     2.75      1.48      1.19  1.04
118       52   4.13    2.45     0.34      2.09     4.00     4.00      2.97      2.98  2.07
129       54   4.91    1.00     0.49      2.16     4.85     4.90      1.26      1.19  1.04
154       49   3.55    1.94     0.74      2.91     3.00     3.50      1.48      1.21  2.16
155        4   3.83    2.53     0.95      2.14     2.90     3.20      1.11      1.71  1.71

HIZ Panel Raw Score Distributions, Cycle 14A (65 proposals)
[Histogram panels for Reviewers 117, 118, 129, 154, 155, and 183; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
117       62   2.79    1.03    -0.03      1.87     3.00     2.75      1.48      1.19  1.05
118       21   2.95    1.88     0.93      2.80     2.50     2.75      1.48      1.87  2.08
129       63   4.97    1.10    -0.25      2.83     5.00     5.00      1.19      1.21  1.09
154       65   3.83    2.54     0.56      2.24     3.00     3.75      2.97      2.42  2.18
155       56   4.54    1.07    -0.13      3.19     4.50     4.50      0.74      1.19  1.04
183       45   4.43    2.40     0.43      2.25     4.00     4.25      2.97      2.43  2.15
Figure B-4 (Continued from previous page).
ISM Panel Raw Score Distributions, Cycle 12B (28 proposals)
[Histogram panels for Reviewers 28, 42, 43, 45, 46, and 77; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
28         5   4.80    2.17     0.28      2.36     5.00     5.00      1.48      1.61  1.88
42        26   4.92    1.90     0.04      1.83     5.00     5.00      2.97      2.39  1.94
43        25   3.94    1.32     1.08      4.79     4.00     3.75      1.48      1.24  1.05
45        27   4.55    2.31     0.01      2.09     4.50     4.50      2.22      2.47  2.11
46        28   3.48    1.18     0.68      2.48     3.00     3.50      0.74      1.19  0.98
77        27   4.65    1.75     0.50      2.44     4.50     4.50      2.22      1.85  2.11

ISM Panel Raw Score Distributions, Cycle 13A (60 proposals)
[Histogram panels for Reviewers 28, 42, 43, 45, 46, and 77; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
28        16   4.78    1.68     0.27      2.23     4.75     4.75      1.85      2.39  1.80
42        52   4.52    1.78     0.62      3.23     4.00     4.50      1.48      2.39  2.07
43        57   4.41    1.53     0.57      2.98     4.00     4.25      1.48      1.21  1.08
45        53   4.57    1.69     0.30      2.41     4.00     4.50      1.48      2.43  2.16
46        60   3.95    1.51     1.01      3.61     3.50     3.75      1.11      1.19  1.04
77        49   5.13    1.92     0.30      2.03     5.00     5.00      2.97      2.43  2.16
Figure B-5. Like Figure B-1 except showing the ISM panel reviewers’ raw score distributions. (Continued on next page.)
ISM Panel Raw Score Distributions, Cycle 13B (42 proposals)
[Histogram panels for Reviewers 156, 157, 158, 159, 160, and 168; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
156       34   4.10    1.80     0.70      2.84     4.00     4.00      1.48      1.79  2.00
157       41   2.86    1.89     0.41      2.45     3.00     2.75      1.48      2.44  2.15
158       39   3.50    1.87     0.40      1.95     3.00     3.50      2.22      2.44  2.14
159       42   4.25    1.83     1.06      3.63     3.80     4.00      1.48      1.43  1.63
160       23   5.07    2.45    -0.08      1.59     5.00     5.00      2.97      3.10  2.09
168       19   5.01    2.72    -0.19      1.85     5.00     5.00      2.97      3.63  3.93

ISM Panel Raw Score Distributions, Cycle 14A (61 proposals)
[Histogram panels for Reviewers 156, 157, 158, 159, 168, and 184; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
156       59   4.68    1.69     0.08      2.15     4.50     4.75      2.22      1.82  2.17
157       61   4.72    2.87     0.07      1.74     4.50     4.75      3.71      3.03  3.26
158       24   3.31    1.67     0.97      3.19     3.00     3.00      1.48      1.19  0.96
159       54   5.47    1.93     0.14      2.44     5.50     5.50      1.93      1.91  1.87
168       19   5.42    2.76    -0.19      1.80     5.00     5.50      2.97      3.76  4.14
184       47   3.80    1.42     0.88      4.11     4.00     3.75      1.48      1.70  1.08
Figure B-5 (Continued from previous page).
NGA Panel Raw Score Distributions, Cycle 12B (28 proposals)
[Histogram panels for Reviewers 34, 35, 36, 119, 120, and 121; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
34        26   3.71    1.33     0.89      3.05     3.50     3.60      1.26      1.19  1.16
35        26   4.29    1.96     0.17      2.14     4.25     4.25      1.85      2.39  1.94
36        10   3.35    1.52     0.76      3.04     3.00     3.20      1.11      1.19  1.61
119       23   4.53    1.46     0.70      3.19     4.00     4.40      1.19      1.24  1.47
120       26   4.46    2.09     0.26      2.89     4.65     4.40      2.00      2.39  1.94
121       27   4.21    2.14     0.12      2.16     4.00     4.25      2.22      2.47  2.11

NGA Panel Raw Score Distributions, Cycle 13A (43 proposals)
[Histogram panels for Reviewers 34, 35, 36, 119, 120, and 121; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
34        27   3.91    1.54    -0.18      2.90     4.00     4.00      1.48      1.85  1.69
35        43   4.09    2.15    -0.13      2.22     4.00     4.00      2.97      2.44  2.15
36        15   3.32    1.47     0.12      1.94     3.50     3.25      1.48      1.90  2.03
119       35   5.33    1.03     0.67      2.73     5.00     5.25      0.74      0.86  1.07
120       39   4.57    1.94    -0.22      2.38     5.00     4.60      1.78      1.95  1.93
121       41   6.28    2.04    -0.24      2.01     6.50     6.25      2.22      2.44  2.15
Figure B-6. Like Figure B-1 except showing the NGA panel reviewers’ raw score distributions. (Continued on next page.)
NGA Panel Raw Score Distributions, Cycle 13B (33 proposals)
[Histogram panels for Reviewers 39, 119, 120, 121, 152, and 153; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
39        13   3.88    2.06     0.41      2.30     3.50     3.75      2.22      2.56  2.01
119       25   4.31    1.15     0.38      2.33     4.50     4.25      1.48      1.24  1.05
120       31   4.39    1.40    -0.10      2.01     4.50     4.40      1.93      1.60  1.49
121       31   4.21    2.13     0.45      2.29     4.00     4.00      2.22      2.46  2.13
152       29   4.51    1.49     0.10      1.80     4.50     4.45      1.63      1.85  1.70
153       31   4.88    2.82     0.26      1.86     4.70     4.85      3.41      3.32  2.98

NGA Panel Raw Score Distributions, Cycle 14A (46 proposals)
[Histogram panels for Reviewers 39, 119, 120, 121, 152, and 153; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
39        27   4.26    2.44    -0.06      1.69     4.50     4.25      2.97      3.08  2.11
119       30   3.74    1.18     0.87      2.74     3.20     3.60      0.89      0.83  0.99
120       41   4.64    1.59    -0.15      2.65     4.70     4.70      1.33      1.59  1.50
121       41   3.43    1.48     0.09      2.11     3.50     3.50      1.48      1.83  1.07
152       35   4.46    1.37     0.24      1.76     4.30     4.50      1.78      1.59  1.50
153       45   4.77    2.74     0.17      1.88     4.00     4.75      2.97      3.04  3.02
Figure B-6 (Continued from previous page).
SFM Panel Raw Score Distributions, Cycle 12B (54 proposals)
[Histogram panels for Reviewers 78, 122, 123, 124, and 134; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
78        54   4.37    1.47     0.11      2.21     4.00     4.50      1.48      1.19  2.08
122       37   3.68    1.92     0.52      2.35     3.00     3.50      1.48      2.44  2.14
123       37   4.22    2.86     0.35      2.07     4.00     4.15      3.56      3.18  3.00
124       50   4.37    1.51     0.77      3.46     4.00     4.25      1.48      1.19  1.03
134       46   4.37    2.35     0.33      1.96     4.25     4.25      3.04      2.86  2.46

SFM Panel Raw Score Distributions, Cycle 13A (64 proposals)
[Histogram panels for Reviewers 30, 48, 78, 122, 123, and 124; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
30        30   4.28    1.83     0.34      2.51     4.00     4.25      1.48      1.79  1.97
48        59   3.00    1.41     0.26      2.49     3.00     3.00      1.48      1.21  1.09
78        61   4.15    1.32     0.09      2.31     4.20     4.15      1.48      1.33  1.30
122       60   3.28    1.73     0.43      2.16     3.00     3.25      1.85      1.79  2.09
123       45   4.01    2.15     0.46      2.88     4.00     3.90      2.08      2.19  2.15
124       48   4.05    2.13     1.06      3.20     3.50     3.75      1.48      1.79  1.85
Figure B-7. Like Figure B-1 except showing the SFM panel reviewers’ raw score distributions. (Continued on next page.)
SFM Panel Raw Score Distributions, Cycle 13B (45 proposals)
[Histogram panels for Reviewers 122, 123, 124, 161, 169, and 173; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
122       34   3.09    1.31     1.21      4.48     3.00     3.00      1.04      1.19  1.00
123       40   4.12    2.76     0.45      2.17     3.70     4.00      2.97      2.98  2.84
124       42   3.50    1.62     1.31      5.19     3.40     3.30      1.33      1.19  1.22
161       35   3.31    1.69     0.15      1.61     3.00     3.25      2.22      1.84  2.14
169       11   2.60    1.92     1.60      4.89     2.50     2.50      1.93      1.69  2.56
173       40   2.85    2.07     1.17      3.32     2.00     2.50      1.48      1.19  2.03

SFM Panel Raw Score Distributions, Cycle 14A (61 proposals)
[Histogram panels for Reviewers 122, 123, 161, 169, 173, and 185; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
122       41   3.88    1.48     0.88      3.93     4.00     3.75      1.48      1.83  1.07
123       54   4.34    2.41     0.18      2.08     4.35     4.30      2.82      2.62  2.49
161       31   3.34    1.92     0.33      1.86     3.00     3.50      2.97      2.46  2.13
169       15   4.00    1.13    -0.30      2.50     4.00     4.00      1.48      1.27  2.03
173       57   2.51    1.57     1.30      4.42     2.00     2.25      1.48      1.21  1.08
185       53   2.46    1.39     1.40      4.38     2.00     2.25      0.74      1.21  1.08
Figure B-7 (Continued from previous page).
SSP Panel Raw Score Distributions, Cycle 12B (22 proposals)
[Histogram panels for Reviewers 11, 12, 25, 74, 105, and 113; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
11        22   3.20    2.46     0.80      2.08     2.25     2.75      1.70      1.79  1.89
12        18   2.46    1.82     0.74      2.45     2.00     2.25      1.48      1.79  1.83
25        12   4.02    1.57     0.08      3.04     4.00     4.00      1.11      1.19  1.69
74        16   4.03    2.10    -0.18      1.91     4.25     4.00      2.37      2.39  2.33
105       19   3.13    1.79     0.18      1.64     3.00     3.00      2.97      2.50  2.07
113       19   5.27    3.22    -0.23      1.65     6.00     5.25      4.45      3.76  3.10

SSP Panel Raw Score Distributions, Cycle 13A (40 proposals)
[Histogram panels for Reviewers 11, 12, 25, 105, 113, and 137; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
11        39   1.68    1.10     0.77      2.43     1.50     1.50      1.48      1.22  1.07
12        36   2.03    1.64     2.48     10.51     1.50     1.75      0.74      1.19  1.00
25         8   3.88    1.38     0.15      1.79     3.75     3.88      1.85      1.80  1.49
105       39   3.14    1.98     0.77      2.54     2.50     3.00      2.22      1.83  2.14
113       38   3.66    2.63     0.90      2.78     2.60     3.45      1.70      1.91  2.02
137       36   3.81    1.62     1.10      4.32     3.50     3.75      1.11      1.19  2.01
Figure B-8. Like Figure B-1 except showing the SSP panel reviewers’ raw score distributions. (Continued on next page.)
SSP Panel Raw Score Distributions, Cycle 13B (43 proposals)
[Histogram panels for Reviewers 25, 105, 113, 137, 147, and 148; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
25        11   4.33    1.43     0.66      2.43     4.00     4.25      1.48      1.30  1.38
105       40   3.06    2.24     1.20      4.19     2.00     3.00      1.48      1.19  2.03
113       41   4.13    2.55     0.39      2.53     3.80     3.80      1.78      2.93  2.58
137       41   4.39    1.99     0.59      2.51     4.00     4.50      1.48      2.44  2.15
147       38   4.87    1.47     0.40      3.08     5.00     4.75      1.48      1.79  1.01
148       40   3.91    2.92     0.67      2.29     3.25     3.65      2.59      2.98  2.84

SSP Panel Raw Score Distributions, Cycle 14A (32 proposals)
[Histogram panels for Reviewers 105, 137, 147, 180, and 181; abscissa: raw score (0–10), ordinate: probability density.]

Reviewer   n   Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
105       32   2.56    1.75     1.29      4.21     2.00     2.50      1.48      1.19  1.99
137       32   4.56    2.24     0.12      1.89     4.25     4.50      2.59      2.39  1.99
147       32   4.91    1.43    -0.23      1.98     5.00     5.00      1.48      1.79  1.99
180       32   2.47    0.75     1.48      4.93     2.25     2.35      0.52      0.60  0.60
181       32   2.28    1.50     0.76      3.00     2.00     2.18      1.48      1.19  1.39
Figure B-8 (Continued from previous page).
Appendix C. Distribution of Normalized Scores, by Instrument,
for Proposal Review Cycles 12B, 13A, 13B, and 14A
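The plots that follow show normalized, not raw, scores. The memo's actual normalization procedure is defined in Section 2 and is not reproduced here; purely for illustration, the sketch below shows a common linear rescaling of one reviewer's raw scores to an assumed target mean of 5 and standard deviation of 2. Both targets are assumptions suggested by the summary statistics below, not a statement of NRAO's method:

```python
def normalize(scores, target_mean=5.0, target_sd=2.0):
    """Linearly rescale one reviewer's raw scores to a common mean and
    spread (illustrative only; the target values are assumptions)."""
    n = len(scores)
    mean = sum(scores) / n
    # Population standard deviation of this reviewer's raw scores.
    sd = (sum((s - mean) ** 2 for s in scores) / n) ** 0.5
    return [target_mean + target_sd * (s - mean) / sd for s in scores]

# The rescaled scores have mean 5.0 and standard deviation 2.0.
print(normalize([2.0, 4.0, 6.0]))
```

Any such affine map preserves each reviewer's ranking of proposals; it only removes differences in the location and spread of the reviewers' scales.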
AGN Panel, Cycle 12B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          20   4.30    2.37     0.54      2.88     3.91     4.24      2.44      2.73  2.53
VLA         159   5.00    1.94     0.58      2.75     4.66     4.87      1.61      1.89  1.93
VLBA/HSA     88   5.15    1.94     0.46      2.81     4.90     5.07      1.81      1.94  1.98

AGN Panel, Cycle 13A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          31   4.64    1.79     0.62      2.80     4.62     4.51      1.75      1.69  1.95
VLA          70   4.68    1.86     0.46      2.27     4.53     4.60      2.11      1.96  1.92
VLBA/HSA    156   5.21    2.05     0.06      2.23     5.04     5.20      2.37      2.17  2.14

AGN Panel, Cycle 13B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          17   4.95    2.10     0.66      2.52     4.87     4.78      2.40      2.21  2.38
VLA         115   4.99    2.10     0.85      3.27     4.59     4.79      1.90      1.94  1.90
VLBA/HSA    102   5.02    1.84     0.68      3.74     4.87     4.91      1.90      1.85  1.86

AGN Panel, Cycle 14A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          17   4.42    2.25     0.01      1.87     3.80     4.47      3.07      2.63  2.37
VLA         123   4.92    1.97     0.67      3.04     4.50     4.79      1.66      1.96  1.89
VLBA/HSA    100   5.19    1.95     0.68      3.42     4.88     5.08      1.95      1.92  1.87
Figure C-1. This figure shows plots of the distribution of AGN panel normalized scores, by instrumental category (GBT, VLA, and VLBA/HSA), together with summary statistics, for proposal review cycles 12B, 13A, 13B, and 14A. GMVA scores have been excluded from these distributions (and excluded from the score normalization).
EGS Panel, Cycle 12B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          64   5.02    1.83    -0.11      2.78     5.32     5.03      1.83      1.88  1.81
VLA          85   5.17    2.03     0.18      2.09     5.01     5.16      2.38      2.12  2.16
VLBA/HSA     15   3.93    2.00     0.40      3.28     4.23     3.79      1.41      2.06  1.87

EGS Panel, Cycle 13A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          73   4.94    1.79     0.26      2.38     4.76     4.90      2.07      1.89  1.82
VLA         149   5.03    2.07     0.17      2.55     4.76     4.99      2.07      2.18  2.21
VLBA/HSA      0   (no proposals)

EGS Panel, Cycle 13B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          55   4.70    1.83     0.36      2.09     4.45     4.68      1.97      2.12  1.89
VLA         100   5.22    2.00     0.19      2.17     5.18     5.18      2.51      2.20  2.09
VLBA/HSA      4   3.65    2.42     0.49      1.64     3.11     3.65      2.00      2.90  2.90

EGS Panel, Cycle 14A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT         109   5.02    1.83     0.43      2.94     5.02     4.95      1.99      1.96  1.96
VLA         168   5.05    2.06     0.29      2.31     4.82     5.00      2.31      2.28  2.15
VLBA/HSA     10   3.96    2.07     0.83      2.39     3.28     3.70      1.71      1.78  1.72
Figure C-2. Like Figure C-1 except showing the EGS panel normalized score distributions by instrument.
ETP Panel, Cycle 12B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          81   4.63    1.92     0.35      3.40     4.31     4.59      1.91      1.89  2.02
VLA         102   5.10    1.93     0.15      1.91     4.99     5.07      2.23      2.26  1.99
VLBA/HSA     40   5.50    2.14     0.52      2.45     5.16     5.36      2.22      2.29  2.01

ETP Panel, Cycle 13A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          52   4.93    2.14     0.47      2.40     4.97     4.80      2.48      2.52  2.12
VLA          92   5.00    1.92     0.00      2.54     5.01     5.01      2.32      2.10  2.00
VLBA/HSA     44   5.09    1.94    -0.18      2.07     5.45     5.14      2.44      2.05  2.10

ETP Panel, Cycle 13B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          84   4.76    1.82     0.45      3.03     4.67     4.68      1.53      1.87  1.85
VLA          95   5.24    1.94     0.41      2.03     4.90     5.18      2.15      2.05  1.84
VLBA/HSA     42   4.95    2.32     0.25      2.65     4.89     4.87      2.32      2.67  2.38

ETP Panel, Cycle 14A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          97   5.04    2.02     0.34      2.89     4.88     4.99      2.10      2.11  2.09
VLA         126   5.03    2.01     0.33      2.38     4.82     4.96      2.16      2.01  2.12
VLBA/HSA     47   4.83    1.87     0.45      2.90     4.53     4.75      1.76      1.81  1.86
Figure C-3. Like Figure C-1 except showing the ETP panel normalized score distributions by instrument.
HIZ Panel, Cycle 12B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          38   4.97    1.77     0.48      2.68     4.80     4.89      1.38      1.85  1.74
VLA         168   5.01    2.02     0.04      2.10     4.98     4.99      2.28      2.10  2.15
VLBA/HSA      0   (no proposals)

HIZ Panel, Cycle 13A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          44   4.84    2.05     0.37      2.51     4.84     4.74      2.04      2.22  2.06
VLA         166   5.09    1.91     0.02      2.19     5.29     5.10      2.39      2.12  1.98
VLBA/HSA     35   4.76    2.24     0.11      2.56     4.27     4.81      2.33      2.28  2.37

HIZ Panel, Cycle 13B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          28   4.38    1.85     0.73      2.56     3.73     4.30      1.90      1.65  1.67
VLA         166   5.09    1.98     0.28      2.28     5.08     5.04      2.15      2.29  2.09
VLBA/HSA     24   5.10    2.05     0.70      2.71     4.62     4.92      1.60      1.98  1.91

HIZ Panel, Cycle 14A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          40   5.02    2.37     0.27      2.60     4.64     4.93      2.13      2.45  2.29
VLA         242   5.05    1.87     0.16      2.27     4.93     5.03      2.16      2.13  1.98
VLBA/HSA     30   4.54    2.29     0.18      2.57     4.00     4.47      2.27      2.23  2.24
Figure C-4. Like Figure C-1 except showing the HIZ panel normalized score distributions by instrument.
ISM Panel, Cycle 12B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT          95   4.99    1.98     0.47      2.85     4.96     4.92      2.05      2.06  1.99
VLA          33   5.24    2.00     0.32      2.14     5.08     5.19      2.24      2.14  2.27
VLBA/HSA     10   4.26    1.58     0.38      2.33     4.22     4.21      1.79      1.61  1.94

ISM Panel, Cycle 13A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT         121   5.26    2.16     0.50      2.47     4.86     5.13      2.53      2.13  2.26
VLA         166   4.81    1.83     0.52      3.06     4.46     4.74      1.93      1.81  1.72
VLBA/HSA      0   (no proposals)

ISM Panel, Cycle 13B
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT         104   4.67    1.91     0.64      2.82     4.47     4.54      2.02      2.13  1.87
VLA          90   5.38    1.97     0.30      2.39     5.07     5.35      2.18      2.18  2.03
VLBA/HSA      4   4.91    2.59     1.07      2.27     3.90     4.00      0.74      1.13  1.13

ISM Panel, Cycle 14A
[Normalized-score density plot and per-instrument histograms; abscissa: normalized score (0–10).]
Instrument   n    Mean  Std Dev  Skewness  Kurtosis  Median  H-L Loc  Scaled MAD    Sn    Qn
GBT         110   4.82    1.82     0.17      2.28     4.71     4.79      1.90      1.95  1.94
VLA         154   5.13    2.08     0.31      2.52     4.93     5.07      2.46      2.22  2.18
VLBA/HSA      0   (no proposals)
Figure C-5. Like Figure C-1 except showing the ISM panel normalized score distributions by instrument.
[Histogram panels omitted: NGA panel normalized score distributions by instrument, Cycles 12B, 13A, 13B, and 14A.]

Summary statistics (columns: Mean, Std Dev, Skewness, Kurtosis, Median, H-L Loc, Scaled MAD, Sn, Qn):

Cycle 12B (GBT, n = 29; VLA, n = 109; VLBA/HSA, n = 0):
GBT       4.51  1.59  0.15  2.03  4.27  4.50  2.16  1.95  1.69
VLA       5.13  2.04  0.41  2.59  4.81  5.03  1.80  2.09  2.05

Cycle 13A (GBT, n = 8; VLA, n = 182; VLBA/HSA, n = 10):
GBT       4.61  2.80   0.19  1.98  4.91  4.46  3.22  4.02  2.89
VLA       4.98  1.89  -0.02  2.44  4.92  4.98  2.04  2.00  2.02
VLBA/HSA  5.74  2.70  -0.41  1.67  6.55  5.63  2.79  2.64  2.93

Cycle 13B (GBT, n = 10; VLA, n = 145; VLBA/HSA, n = 5):
GBT       4.66  1.64  0.32  1.89  4.33  4.66  1.41  1.69  1.85
VLA       5.06  2.00  0.17  2.03  4.94  5.04  2.32  2.11  2.12
VLBA/HSA  3.91  1.17  0.49  2.52  3.87  3.87  0.40  0.51  0.60

Cycle 14A (GBT, n = 37; VLA, n = 172; VLBA/HSA, n = 10):
GBT       4.81  1.72  0.20  2.60  4.77  4.77  1.78  1.88  1.81
VLA       5.13  2.01  0.14  2.02  5.08  5.11  2.61  2.33  2.12
VLBA/HSA  3.45  1.68  0.13  2.00  3.53  3.48  1.79  1.79  2.00

Figure C-6. Like Figure C-1 except showing the NGA panel normalized score distributions by instrument.
[Histogram panels omitted: SFM panel normalized score distributions by instrument, Cycles 12B, 13A, 13B, and 14A.]

Summary statistics (columns: Mean, Std Dev, Skewness, Kurtosis, Median, H-L Loc, Scaled MAD, Sn, Qn):

Cycle 12B (GBT, n = 47; VLA, n = 160; VLBA/HSA, n = 17):
GBT       5.00  1.92  0.23  2.17  4.60  4.97  2.00  1.89  2.25
VLA       5.02  1.97  0.34  2.20  4.51  4.95  1.98  1.93  2.05
VLBA/HSA  4.84  2.38  1.17  4.00  4.51  4.51  1.98  2.13  2.55

Cycle 13A (GBT, n = 56; VLA, n = 227; VLBA/HSA, n = 20):
GBT       4.99  2.06  0.49  2.34  4.99  4.90  2.37  2.35  2.03
VLA       4.99  1.95  0.37  2.61  4.77  4.92  1.86  1.97  1.99
VLBA/HSA  5.17  2.18  0.70  2.47  4.73  4.93  2.24  2.02  2.37

Cycle 13B (GBT, n = 49; VLA, n = 143; VLBA/HSA, n = 10):
GBT       5.69  2.38   0.75  2.74  5.24  5.51  2.57  2.41  2.18
VLA       4.68  1.76   0.90  3.42  4.18  4.51  1.43  1.59  1.62
VLBA/HSA  6.25  1.37  -0.22  2.35  6.24  6.24  1.35  1.49  1.52

Cycle 14A (GBT, n = 10; VLA, n = 227; VLBA/HSA, n = 14):
GBT       4.90  2.20  0.41  1.69  4.35  4.81  2.60  2.26  2.44
VLA       5.01  1.91  0.76  3.44  4.65  4.86  1.77  1.74  1.71
VLBA/HSA  4.97  2.94  1.04  2.72  3.72  4.41  1.71  1.50  1.82

Figure C-7. Like Figure C-1 except showing the SFM panel normalized score distributions by instrument.
[Histogram panels omitted: SSP panel normalized score distributions by instrument, Cycles 12B, 13A, 13B, and 14A.]

Summary statistics (columns: Mean, Std Dev, Skewness, Kurtosis, Median, H-L Loc, Scaled MAD, Sn, Qn):

Cycle 12B (GBT, n = 5; VLA, n = 91; VLBA/HSA, n = 10):
GBT       2.43  0.73  -1.34  3.06  2.62  2.62  0.28  0.31  0.35
VLA       5.06  1.87   0.32  2.16  4.84  5.01  2.14  2.04  1.98
VLBA/HSA  5.72  2.22  -0.37  1.73  6.33  5.74  2.76  2.67  2.16

Cycle 13A (GBT, n = 30; VLA, n = 121; VLBA/HSA, n = 45):
GBT       5.37  2.48  1.35  4.99  4.77  5.05  2.60  2.28  1.90
VLA       4.76  1.91  1.15  3.83  4.12  4.49  1.66  1.51  1.68
VLBA/HSA  5.39  1.70  0.67  2.43  4.86  5.31  1.50  1.45  1.47

Cycle 13B (GBT, n = 43; VLA, n = 124; VLBA/HSA, n = 44):
GBT       4.81  1.90  0.77  2.76  4.05  4.73  2.05  1.76  1.65
VLA       5.04  1.97  0.40  2.38  4.74  4.96  2.31  2.15  1.98
VLBA/HSA  5.08  2.11  1.13  4.02  4.73  4.84  1.74  1.88  1.86

Cycle 14A (GBT, n = 25; VLA, n = 120; VLBA/HSA, n = 15):
GBT       5.57  2.19  0.78  3.28  5.35  5.39  1.91  2.52  2.25
VLA       4.97  1.92  0.63  3.11  4.58  4.87  1.93  1.84  1.83
VLBA/HSA  4.28  1.91  0.65  2.21  3.79  4.10  1.65  2.09  1.79

Figure C-8. Like Figure C-1 except showing the SSP panel normalized score distributions by instrument.
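The robust location and scale estimates tabulated in the Appendix C figures (the Hodges-Lehmann location and the scaled MAD, alongside the ordinary mean and standard deviation) can be reproduced with standard tools. Below is a minimal Python sketch using SciPy; the input scores and the function name summary_stats are illustrative, and the Rousseeuw-Croux Sn and Qn estimators are omitted since SciPy has no built-in for them (statsmodels provides implementations).

```python
from itertools import combinations_with_replacement
from statistics import median

import numpy as np
from scipy import stats

def summary_stats(scores):
    """Illustrative recomputation of the per-panel summary statistics.

    Returns the mean, sample standard deviation, median, the
    Hodges-Lehmann location estimate (median of all pairwise Walsh
    averages), and the MAD scaled to be consistent with the standard
    deviation for Gaussian data.
    """
    x = np.asarray(scores, dtype=float)
    # Hodges-Lehmann estimator: median of (x_i + x_j)/2 over all pairs i <= j.
    walsh = [(a + b) / 2.0 for a, b in combinations_with_replacement(x, 2)]
    return {
        "mean": x.mean(),
        "std": x.std(ddof=1),
        "median": median(x),
        "hl_loc": median(walsh),
        # scale="normal" multiplies the raw MAD by ~1.4826 for
        # consistency with the Gaussian standard deviation.
        "scaled_mad": stats.median_abs_deviation(x, scale="normal"),
    }

# Hypothetical normalized scores on the 0-10 scale, not actual panel data.
print(summary_stats([2.1, 3.4, 4.0, 4.8, 5.5, 6.2, 9.0]))
```

The Hodges-Lehmann location and the scaled MAD are far less sensitive than the mean and standard deviation to a single outlying score, which is why they are reported alongside the classical moments.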
Appendix D. Comparisons of Cumulative Distributions of Normalized Scores, by Instrument, Including Distributional Two-Sample Test Statistics
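The three statistics tabulated throughout this appendix are standard two-sample distributional tests. As a minimal SciPy sketch, on synthetic stand-in samples rather than actual panel scores (note that SciPy's anderson_ksamp reports a significance level clipped to the range [0.001, 0.25]):

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for normalized panel scores, not actual data.
rng = np.random.default_rng(0)
gbt = rng.normal(5.0, 2.0, size=40)
vla = rng.normal(5.1, 1.9, size=160)

# Kolmogorov-Smirnov: sensitive to the largest gap between the two CDFs.
ks = stats.ks_2samp(gbt, vla)
# Cramer-von Mises: integrates the squared difference between the CDFs.
cvm = stats.cramervonmises_2samp(gbt, vla)
# Anderson-Darling k-sample test: weights the distribution tails more heavily;
# its reported significance level is clipped to [0.001, 0.25].
ad = stats.anderson_ksamp([gbt, vla])

print(ks.pvalue, cvm.pvalue, ad.significance_level)
```

A small P-value indicates that the two instruments' score distributions within a panel are unlikely to be identical; with both samples drawn from nearly the same distribution, as above, all three P-values are unremarkable.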
[CDF plots omitted: AGN panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of AGN panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA    K-S     C-vM     A-D
AGN,12B     20    159   0.1573  0.1102  0.0609
AGN,13A     31     70   0.8131  0.8658  0.9478
AGN,13B     17    115   0.9971  0.7588  0.9953
AGN,14A     17    123   0.6601  0.2640  0.2258

Comparison of AGN panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA    K-S     C-vM     A-D
AGN,12B     20       88      0.1074  0.0903  0.0434
AGN,13A     31      156      0.1818  0.0981  0.1383
AGN,13B     17      102      0.7916  0.6432  0.7900
AGN,14A     17      100      0.1544  0.1062  0.0696

Comparison of AGN panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA    K-S     C-vM     A-D
AGN,12B    159       88      0.8631  0.7407  0.8766
AGN,13A     70      156      0.1872  0.0564  0.0698
AGN,13B    115      102      0.6146  0.5187  0.5271
AGN,14A    123      100      0.4843  0.3952  0.4723

Figure D-1. The plots above show the empirical cumulative distribution functions of AGN panel normalized scores, categorized by instrument (GBT, blue; VLA, green; VLBA/HSA, red), for proposal review cycles 12B, 13A, 13B, and 14A. The tabulated statistics represent probabilities for the hypothesis of identical distributions for GBT versus VLA scores, GBT versus VLBA/HSA scores, and VLA versus VLBA/HSA scores, according to three standard statistical tests. GMVA scores have been excluded.
[CDF plots omitted: EGS panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of EGS panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA    K-S     C-vM     A-D
EGS,12B     64     85   0.4290  0.4971  0.4262
EGS,13A     73    149   0.8991  0.8376  0.7230
EGS,13B     55    100   0.4101  0.1792  0.1712
EGS,14A    109    168   0.7306  0.6384  0.4143

Comparison of EGS panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA     K-S       C-vM       A-D
EGS,12B     64       15       0.0263     0.0396    0.0435
EGS,13A     73        0      Undefined  Undefined  Undefined
EGS,13B     55        4       0.2917    Undefined  0.0466
EGS,14A    109       10       0.0946     0.0473    0.0298

Comparison of EGS panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA     K-S       C-vM       A-D
EGS,12B     85       15       0.0752     0.0461    0.0332
EGS,13A    149        0      Undefined  Undefined  Undefined
EGS,13B    100        4       0.2553    Undefined  0.0471
EGS,14A    168       10       0.1210     0.0480    0.0745

Figure D-2. Like Figure D-1, but comparing EGS panel score distributions.
[CDF plots omitted: ETP panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of ETP panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA    K-S     C-vM     A-D
ETP,12B     81    102   0.2484  0.1830  0.1485
ETP,13A     52     92   0.5656  0.5837  0.5916
ETP,13B     84     95   0.2175  0.2109  0.1368
ETP,14A     97    126   0.9433  0.9454  0.9685

Comparison of ETP panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA    K-S     C-vM     A-D
ETP,12B     81       40      0.2546  0.1221  0.0742
ETP,13A     52       44      0.7489  0.6901  0.6982
ETP,13B     84       42      0.7748  0.4237  0.4375
ETP,14A     97       47      0.8532  0.7182  0.7892

Comparison of ETP panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA    K-S     C-vM     A-D
ETP,12B    102       40      0.6052  0.5143  0.3130
ETP,13A     92       44      0.7589  0.7883  0.8951
ETP,13B     95       42      0.6177  0.4271  0.2398
ETP,14A    126       47      0.6832  0.7386  0.7911

Figure D-3. Like Figure D-1, but comparing ETP panel score distributions.
[CDF plots omitted: HIZ panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of HIZ panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA    K-S     C-vM     A-D
HIZ,12B     38    168   0.7403  0.5268  0.5643
HIZ,13A     44    166   0.3694  0.4244  0.5740
HIZ,13B     28    166   0.0325  0.0546  0.0928
HIZ,14A     40    242   0.5261  0.3714  0.3977

Comparison of HIZ panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA     K-S       C-vM       A-D
HIZ,12B     38        0      Undefined  Undefined  Undefined
HIZ,13A     44       35       0.9262     0.8711    0.9473
HIZ,13B     28       24       0.1904     0.2050    0.2221
HIZ,14A     40       30       0.8862     0.6955    0.7201

Comparison of HIZ panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA     K-S       C-vM       A-D
HIZ,12B    168        0      Undefined  Undefined  Undefined
HIZ,13A    166       35       0.6558     0.4051    0.4678
HIZ,13B    166       24       0.9683     0.5376    0.9306
HIZ,14A    242       30       0.1351     0.1253    0.1531

Figure D-4. Like Figure D-1, but comparing HIZ panel score distributions.
[CDF plots omitted: ISM panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of ISM panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA    K-S     C-vM     A-D
ISM,12B     95     33   0.7974  0.7170  0.8098
ISM,13A    121    166   0.1011  0.1491  0.1021
ISM,13B    104     90   0.0247  0.0117  0.0122
ISM,14A    110    154   0.5387  0.4232  0.3475

Comparison of ISM panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA     K-S       C-vM       A-D
ISM,12B     95       10       0.5209     0.2875    0.4848
ISM,13A    121        0      Undefined  Undefined  Undefined
ISM,13B    104        4       0.8291    Undefined  0.6542
ISM,14A    110        0      Undefined  Undefined  Undefined

Comparison of ISM panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA     K-S       C-vM       A-D
ISM,12B     33       10       0.4572     0.3248    0.3418
ISM,13A    166        0      Undefined  Undefined  Undefined
ISM,13B     90        4       0.2282    Undefined  0.2915
ISM,14A    154        0      Undefined  Undefined  Undefined

Figure D-5. Like Figure D-1, but comparing ISM panel score distributions.
[CDF plots omitted: NGA panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of NGA panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA    K-S     C-vM     A-D
NGA,12B     29    109   0.4710  0.2928  0.2599
NGA,13A      8    182   0.4290  0.2700  0.2479
NGA,13B     10    145   0.7742  0.5216  0.6761
NGA,14A     37    172   0.3631  0.3474  0.3868

Comparison of NGA panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA     K-S       C-vM       A-D
NGA,12B     29        0      Undefined  Undefined  Undefined
NGA,13A      8       10       0.5242     0.4272    0.4437
NGA,13B     10        5       0.9191    Undefined  0.7310
NGA,14A     37       10       0.1760     0.0493    0.0260

Comparison of NGA panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA     K-S       C-vM       A-D
NGA,12B    109        0      Undefined  Undefined  Undefined
NGA,13A    182       10       0.1252     0.0759    0.0531
NGA,13B    145        5       0.2007    Undefined  0.2137
NGA,14A    172       10       0.0558     0.0184    0.0066

Figure D-6. Like Figure D-1, but comparing NGA panel score distributions.
[CDF plots omitted: SFM panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of SFM panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA    K-S     C-vM     A-D
SFM,12B     47    160   0.9947  0.9157  0.9871
SFM,13A     56    227   0.6970  0.6930  0.8354
SFM,13B     49    143   0.0392  0.0161  0.0075
SFM,14A     10    227   0.7517  0.0248  0.6252

Comparison of SFM panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA    K-S     C-vM     A-D
SFM,12B     47       17      0.8763  0.7218  0.7565
SFM,13A     56       20      0.8917  0.8749  0.9564
SFM,13B     49       10      0.1852  0.1286  0.1341
SFM,14A     10       14      0.9479  0.8382  0.8738

Comparison of SFM panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA    K-S     C-vM     A-D
SFM,12B    160       17      0.8693  0.5574  0.8180
SFM,13A    227       20      0.9527  0.5790  0.8578
SFM,13B    143       10      0.0041  0.0012  0.0020
SFM,14A    227       14      0.2615  0.0153  0.0543

Figure D-7. Like Figure D-1, but comparing SFM panel score distributions.
[CDF plots omitted: SSP panel normalized score CDFs, Cycles 12B, 13A, 13B, and 14A.]

Two-sample test P-values (K-S = Kolmogorov-Smirnov; C-vM = Cramer-von Mises; A-D = Anderson-Darling):

Comparison of SSP panel GBT (blue) and VLA (green) normalized score distributions:
          n_GBT  n_VLA     K-S      C-vM      A-D
SSP,12B      5     91    0.0004  Undefined  0.0002
SSP,13A     30    121    0.3669   0.2538    0.3615
SSP,13B     43    124    0.4491   0.4973    0.6232
SSP,14A     25    120    0.4148   0.2947    0.3522

Comparison of SSP panel GBT (blue) and VLBA/HSA (red) normalized score distributions:
          n_GBT  n_VLBA/HSA     K-S      C-vM      A-D
SSP,12B      5       10       0.0040  Undefined  0.0044
SSP,13A     30       45       0.4471   0.4685    0.3129
SSP,13B     43       44       0.5701   0.6004    0.7310
SSP,14A     25       15       0.1240   0.0822    0.0600

Comparison of SSP panel VLA (green) and VLBA/HSA (red) normalized score distributions:
          n_VLA  n_VLBA/HSA    K-S     C-vM     A-D
SSP,12B     91       10      0.3515  0.2725  0.3378
SSP,13A    121       45      0.0356  0.0108  0.0098
SSP,13B    124       44      0.9465  0.8579  0.7471
SSP,14A    120       15      0.3027  0.1393  0.1487

Figure D-8. Like Figure D-1, but comparing SSP panel score distributions.
Appendix E. Comparisons of Initial Rank-Order Aggregate Preferences vs. Rank Order after SRP Score Adjustments
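The Kendall τ and Spearman ρ rank-correlation coefficients tabulated in this appendix can be computed directly from the two rank orderings of each panel's proposals. A small SciPy sketch with hypothetical ranks (two adjacent pairs swapped, standing in for SRP-meeting score adjustments):

```python
from scipy import stats

# Hypothetical ranks for eight proposals: the initial aggregate
# rank order, and the order after two adjacent pairs were swapped
# by score adjustments at the SRP meeting.
initial = [1, 2, 3, 4, 5, 6, 7, 8]
adjusted = [2, 1, 3, 4, 6, 5, 7, 8]

tau, _ = stats.kendalltau(initial, adjusted)
rho, _ = stats.spearmanr(initial, adjusted)

# Two discordant pairs out of 28 give tau = (26 - 2)/28 ~ 0.857;
# the squared rank differences give rho = 1 - 6*4/(8*63) ~ 0.952.
print(round(tau, 3), round(rho, 3))  # -> 0.857 0.952
```

Values near 1 indicate that the post-meeting adjustments left the aggregate ranking largely intact, which matches the magnitudes seen in the Appendix E tables.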
[Rank-order comparison charts omitted: AGN panel proposal rankings before and after SRP score adjustments, Cycles 12B, 13A, 13B, and 14A.]

                        AGN,12B  AGN,13A  AGN,13B  AGN,14A
# of proposals              55       59       50       60
# of score adjustments      15       20       19       22
percent changed            27.3     33.9     38.0     36.7
Kendall τ                 0.867    0.854    0.811    0.700
Spearman ρ                0.954    0.948    0.910    0.816

Figure E-1. Comparison of AGN panel initial rank-order preference vs. rank order after SRP meeting score adjustments, for semesters 12B through 14A (left to right). The first quartile is shown in light green, the second in light blue, etc. Dashed lines indicate inter-quartile jumps. The table shows the total number of proposals per semester, the number of scores that were adjusted, the percentage that were adjusted, and the Kendall τ and Spearman ρ measures of rank correlation between the initial aggregate scores and the final, adjusted scores.
[Rank-order comparison charts omitted: EGS panel proposal rankings before and after SRP score adjustments, Cycles 12B, 13A, 13B, and 14A.]

                        EGS,12B  EGS,13A  EGS,13B  EGS,14A
# of proposals              33       48       33       59
# of score adjustments       5        4        2       15
percent changed            15.2      8.3      6.1     25.4
Kendall τ                 0.938    0.980    0.989    0.817
Spearman ρ                0.981    0.998    0.999    0.894

Figure E-2. Like Figure E-1, but comparing EGS panel initial and final rank-order preferences.
[Rank-order comparison charts omitted: ETP panel proposal rankings before and after SRP score adjustments, Cycles 12B, 13A, 13B, and 14A.]

                        ETP,12B  ETP,13A  ETP,13B  ETP,14A
# of proposals              57       47       51       61
# of score adjustments      16        9       10       14
percent changed            28.1     19.1     19.6     23.0
Kendall τ                 0.799    0.883    0.842    0.864
Spearman ρ                0.896    0.948    0.914    0.952

Figure E-3. Like Figure E-1, but comparing ETP panel initial and final rank-order preferences.
[Rank-order comparison charts omitted: HIZ panel proposal rankings before and after SRP score adjustments, Cycles 12B, 13A, 13B, and 14A.]

                        HIZ,12B  HIZ,13A  HIZ,13B  HIZ,14A
# of proposals              44       51       54       65
# of score adjustments       8       18        4       14
percent changed            18.2     35.3      7.4     21.5
Kendall τ                 0.917    0.729    0.928    0.871
Spearman ρ                0.972    0.856    0.963    0.951

Figure E-4. Like Figure E-1, but comparing HIZ panel initial and final rank-order preferences.
[Rank-order comparison charts omitted: ISM panel proposal rankings before and after SRP score adjustments, Cycles 12B, 13A, 13B, and 14A.]

                        ISM,12B  ISM,13A  ISM,13B  ISM,14A
# of proposals              28       60       42       61
# of score adjustments       7        9        6        5
percent changed            25.0     15.0     14.3      8.2
Kendall τ                 0.889    0.950    0.887    0.879
Spearman ρ                0.968    0.991    0.946    0.916

Figure E-5. Like Figure E-1, but comparing ISM panel initial and final rank-order preferences.
[Rank-order comparison charts omitted: NGA panel proposal rankings before and after SRP score adjustments, Cycles 12B, 13A, 13B, and 14A.]

                        NGA,12B  NGA,13A  NGA,13B  NGA,14A
# of proposals              28       43       33       46
# of score adjustments      11       42        6        6
percent changed            39.3     97.7     18.2     13.0
Kendall τ                 0.796    0.491    0.871    0.953
Spearman ρ                0.910    0.668    0.948    0.988

Figure E-6. Like Figure E-1, but comparing NGA panel initial and final rank-order preferences.
70
Appendix E
[Figure E-7 plot panels (SFM panel, semesters 12B through 14A) not reproduced; extraction retained only the proposal-ID axis labels and the per-panel legend values (mean ± std. dev., TAC).]
                          SFM,12B  SFM,13A  SFM,13B  SFM,14A
  # of proposals               54       64       45       61
  # of score adjustments       21       20       13        7
  percent changed            38.9     31.2     28.9     11.5
  Kendall τ                 0.804    0.926    0.857    0.910
  Spearman ρ                0.903    0.986    0.954    0.964
Figure E-7. Like Figure E-1, but comparing SFM panel initial and final rank-order preferences.
[Figure E-8 plot panels (SSP panel, semesters 12B through 14A) not reproduced; extraction retained only the proposal-ID axis labels and the per-panel legend values (mean ± std. dev., TAC).]
                          SSP,12B  SSP,13A  SSP,13B  SSP,14A
  # of proposals               22       40       43       32
  # of score adjustments        5       10        5        0
  percent changed            22.7     25.0     11.6      0.0
  Kendall τ                 0.879    0.871    0.901    1.000
  Spearman ρ                0.957    0.940    0.944    1.000
Figure E-8. Like Figure E-1, but comparing SSP panel initial and final rank-order preferences.
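The Kendall τ and Spearman ρ values tabulated in Figures E-6 through E-8 measure the agreement between a panel's initial and final rank orderings of its proposals. As an illustration of how such statistics are computed — a minimal sketch with made-up rankings, not the script used to produce the figures — both coefficients for tie-free rankings reduce to short formulas:

```python
# Illustration (not the production script): the Kendall tau and Spearman rho
# rank-correlation statistics quoted in Figures E-6 through E-8.
from itertools import combinations

def kendall_tau(a, b):
    """Kendall tau for two tie-free rankings: (concordant - discordant) / #pairs."""
    n = len(a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def spearman_rho(a, b):
    """Spearman rho for two tie-free rankings: 1 - 6*sum(d^2)/(n(n^2-1))."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical initial and final rank orders for eight proposals (1 = best).
initial = [1, 2, 3, 4, 5, 6, 7, 8]
final   = [1, 3, 2, 4, 5, 8, 6, 7]
print(round(kendall_tau(initial, final), 3))   # → 0.786
print(round(spearman_rho(initial, final), 3))  # → 0.905
```

Values near 1 (as in most semesters tabulated above) indicate that the panel discussion left the preliminary rank order largely intact.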
Appendix F
[Figure F-1 plot panels not reproduced. Panel titles: "Adj. Linearized Score CDFs", Cycles 12B, 13A, 13B, and 14A; axes: Score s (0–10) vs. Fraction (0–1).]
Figure F-1. These plots show the cumulative distributions of adjusted, linearized SRP aggregate scores, by instrument. GBT proposal scores are shown in blue, VLA scores in green, and VLBA/HSA scores in red.
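The curves in Figures F-1 and F-2 are empirical cumulative distribution functions of the aggregate scores. A minimal sketch of how such an empirical CDF is tabulated — using made-up scores, not the actual cycle data:

```python
# Minimal sketch of an empirical CDF of the kind plotted in Appendix F.
# The score values below are made up for illustration, not actual cycle data.
def empirical_cdf(scores):
    """Return sorted (score, fraction of scores <= that score) pairs."""
    xs = sorted(scores)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

scores = [7.2, 3.1, 8.8, 5.0, 6.4]
for s, frac in empirical_cdf(scores):
    print(f"{s:4.1f}  {frac:.2f}")
```

Plotting these pairs as a step function, per instrument, yields curves like those shown; a curve lying to the right of another indicates systematically better (higher) scores.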
[Figure F-2 plot panels not reproduced. Panel titles: "Adjusted, Linearized Score CDFs", AGN and EGS panels, semesters 12B, 13A, 13B, and 14A; axes: score (0–10) vs. fraction (0–1).]
Figure F-2. The cumulative distributions of adjusted, linearized SRP aggregate scores, by instrument, for the AGN and EGS panels. (Continued on next page.)
[Figure F-2 plot panels not reproduced. Panel titles: "Adjusted, Linearized Score CDFs", ETP and HIZ panels, semesters 12B, 13A, 13B, and 14A; axes: score (0–10) vs. fraction (0–1).]
Figure F-2 (Continued). The cumulative distributions of adjusted, linearized SRP aggregate scores, by instrument, for the ETP and HIZ panels. (Continued on next page.)
[Figure F-2 plot panels not reproduced. Panel titles: "Adjusted, Linearized Score CDFs", ISM and NGA panels, semesters 12B, 13A, 13B, and 14A; axes: score (0–10) vs. fraction (0–1).]
Figure F-2 (Continued). The cumulative distributions of adjusted, linearized SRP aggregate scores, by instrument, for the ISM and NGA panels. (Continued on next page.)
[Figure F-2 plot panels not reproduced. Panel titles: "Adjusted, Linearized Score CDFs", SFM and SSP panels, semesters 12B, 13A, 13B, and 14A; axes: score (0–10) vs. fraction (0–1).]
Figure F-2 (Continued). The cumulative distributions of adjusted, linearized SRP aggregate scores, by instrument, for the SFM and SSP panels.
[Figure F-3 plot panels not reproduced. Panel titles: ETP pulsar proposal PDFs and CDFs, semesters 12B, 13A, 13B, and 14A; axes: score (0–10) vs. fraction.]
Figure F-3. Comparison of linearized, adjusted score distributions for ETP pulsar proposals, by instrument. GBT proposal scores are shown in blue, VLA scores in green, and VLBA/HSA scores in red. Compare with Figure F-4.
[Figure F-4 plot panels not reproduced. Panel titles: ETP triggered proposal PDFs and CDFs, semesters 12B, 13A, 13B, and 14A; axes: score (0–10) vs. fraction.]
Figure F-4. Comparison of linearized, adjusted score distributions for ETP triggered proposals, by instrument. GBT proposal scores are shown in blue, VLA scores in green, and VLBA/HSA scores in red. Compare with Figure F-3.
  Panel,Sem   Rev.1  Rev.2  Rev.3  Rev.4  Rev.5  Rev.6
  AGN,12B     0.544  0.621  0.598  0.681  0.579  0.555
  AGN,13A     0.638  0.672  0.700  0.549  0.670  0.670
  AGN,13B     0.564  0.699  0.691  0.647  0.494  0.665
  AGN,14A     0.550  0.535  0.672  0.568  0.557
  EGS,12B     0.493  0.638  0.680  0.613  0.660  0.520
  EGS,13A     0.594  0.669  0.685  0.633  0.625  0.507
  EGS,13B     0.684  0.616  0.563  0.671  0.520  0.640
  EGS,14A     0.576  0.565  0.617  0.629  0.632  0.586
  ETP,12B     0.574  0.571  0.585  0.594  0.705
  ETP,13A     0.683  0.597  0.632  0.634  0.740
  ETP,13B     0.513  0.575  0.702  0.575  0.660  0.599
  ETP,14A     0.582  0.676  0.600  0.589  0.605  0.608
  HIZ,12B     0.683  0.666  0.563  0.561  0.560  0.578
  HIZ,13A     0.561  0.633  0.600  0.526  0.565  0.512
  HIZ,13B     0.583  0.606  0.682  0.577  0.629  0.250
  HIZ,14A     0.587  0.691  0.642  0.652  0.579  0.552
  ISM,12B     0.583  0.497  0.583  0.376  0.556  0.681
  ISM,13A     0.617  0.609  0.661  0.620  0.636  0.638
  ISM,13B     0.585  0.545  0.626  0.583  0.617  0.456
  ISM,14A     0.579  0.570  0.458  0.579  0.622  0.575
  NGA,12B     0.544  0.577  0.740  0.659  0.686  0.624
  NGA,13A     0.610  0.682  0.580  0.603  0.553  0.264
  NGA,13B     0.464  0.631  0.606  0.742  0.721  0.610
  NGA,14A     0.560  0.600  0.587  0.581  0.552  0.588
  SFM,12B     0.669  0.706  0.586  0.617  0.540
  SFM,13A     0.702  0.622  0.674  0.688  0.673  0.602
  SFM,13B     0.673  0.545  0.639  0.641  0.533  0.688
  SFM,14A     0.511  0.608  0.594  0.518  0.585  0.650
  SSP,12B     0.665  0.605  0.639  0.773  0.611  0.489
  SSP,13A     0.616  0.651  0.813  0.617  0.525  0.634
  SSP,13B     0.783  0.712  0.488  0.618  0.440  0.539
  SSP,14A     0.598  0.725  0.625  0.566  0.605

[Smooth histogram of these Q values (horizontal axis 0.3–0.8) not reproduced.]
Figure F-5. The table above shows modified Borda counts for each reviewer in each of the eight SRPs, for semesters 12B, 13A, 13B, and 14A. These compare each individual reviewer's preliminary scores with the SRP-average, normalized preliminary scores, and correspond to the Qprelim values in Table 1, Line 1 of Neill Reid's Memorandum [21]. (We cannot compute the Qfinal scores, as he has done, because our panelists do not individually submit revised proposal scores; rather, they agree by acclamation on a consensus score for each proposal.) Also shown is a smooth histogram of the entire list of Qprelim values. The mean value, 0.605 ± 0.075, is very close to that of [21].
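Reid's memorandum [21] gives the precise definition of the Q statistics. Purely for illustration, the sketch below uses an assumed pairwise-concordance formulation — the fraction of proposal pairs on which a reviewer's preliminary ordering agrees with the panel-average ordering, with ties counted as half — which captures the spirit of such an agreement measure but is not necessarily the formula of [21]. The function name and the sample scores are hypothetical.

```python
# Hedged sketch of a pairwise-agreement statistic in the spirit of the modified
# Borda counts of Figure F-5. This definition (fraction of concordant proposal
# pairs, ties counted as half) is an ASSUMPTION for illustration, not the
# formula of Reid's memorandum [21].
from itertools import combinations

def pairwise_agreement(reviewer, panel):
    """Fraction of proposal pairs ordered the same way by both score lists."""
    agree = 0.0
    pairs = list(combinations(range(len(reviewer)), 2))
    for i, j in pairs:
        r = reviewer[i] - reviewer[j]
        p = panel[i] - panel[j]
        if r * p > 0:
            agree += 1.0      # both lists order the pair the same way
        elif r == 0 or p == 0:
            agree += 0.5      # a tie in either list counts as half agreement
    return agree / len(pairs)

# Hypothetical preliminary scores for five proposals (lower = better).
reviewer = [2.0, 4.5, 3.0, 6.0, 5.0]
panel    = [2.5, 4.0, 4.5, 5.5, 5.0]
print(round(pairwise_agreement(reviewer, panel), 3))  # → 0.9
```

Under this formulation a value of 0.5 corresponds to chance-level agreement, so a mean near 0.6, as observed both here and in [21], indicates modest but real concordance between individual reviewers and their panels.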