The future of the British RAE
The REF (Research Excellence Framework)
Jonathan Adams
Research Assessment Exercise - timeline
• 1980s - policy on concentration and selectivity
• 1986 - 1st Research Selectivity Exercise
• 1989 - modified and formalised as the RAE
• 1992 - Polytechnics access research funding, enter a streamlined RAE
• 1996 and 2001 - further cycles, higher quality thresholds for funding
• 2008 – new ‘Roberts’ profiling format
The shift to metrics
• Evolution
 – RAE = peer review of an evidence portfolio, including data on outputs, training and grants' funding
 – RAE2008 profiling adds emphasis to the data
• Discontinuity
 – The Treasury's 2007 announcement was disruptive, from many perspectives
• Compromise
 – HEFCE consultations shifted emphasis away from the gross simplification, and restored peer review
Research assessment must support the UK’s enhanced international research status
[Figure: UK share of world citations (%), rising from about 9% to 12% over 1981-2005; arrows indicate RAE years. Is the assessment dividend beginning to plateau? Has the RAE delivered all it can?]
[Figure: output growth relative to world (0.8-1.6), 1981-2006, for Denmark, France, Germany, the Netherlands and the UK]
If there is a shift to ‘metrics’, then disproportionate change should be avoided
Research performance - indicators, not metrics
[Diagram: inputs (funding, numbers) feed a research 'black box'; outputs (publications) emerge over time. Research quality is what we want to know; indicator data are what we have to use.]
How can we judge possible 'metrics'?
• Relevant and appropriate
 – Are metrics correlated with other performance estimates?
 – Do metrics really distinguish 'excellence' as we see it?
 – Are these the metrics the researchers would use?
• Cost effective
 – Data accessibility, coverage, cost and validation
• Transparent, equitable and stable
 – Is it clear what the metrics do?
 – Are all institutions, staff and subjects treated equitably?
 – How do people respond, and can they manipulate metrics?
 – "Once an indicator is made a target for policy, it starts to lose the information content that initially qualified it to play such a role"
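The first of these questions, whether a metric correlates with other performance estimates, can be tested directly once metric values and peer grades sit side by side. A minimal, stdlib-only sketch using Spearman rank correlation; the unit metric values and grades below are invented for illustration:

```python
# Illustrative sketch (not HEFCE's method): testing whether a candidate
# metric tracks another performance estimate, here hypothetical peer-review
# grades, via Spearman rank correlation.

def ranks(values):
    """Assign 1-based average ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied positions
        for k in range(i, j + 1):
            rank[order[k]] = avg
        i = j + 1
    return rank

def spearman(x, y):
    """Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical units: citation-impact metric vs. peer grade (1-7 scale)
metric = [0.8, 1.1, 0.9, 1.6, 2.1, 1.3, 2.5]
grade = [3, 4, 3, 5, 6, 5, 7]
rho = spearman(metric, grade)  # close to 1: metric tracks the grades
```

A high rank correlation is necessary but not sufficient: it says nothing about whether either measure captures 'excellence' as researchers see it.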
Three proposed data components
• Research funding
• Research training
• Research output
 – The key quality measure
• All have multiple components
• PLUS Peer Review
HEFCE favours bibliometrics: impact (1996-2000) is related to RAE2001 grade (data for UoA14 Biology)
[Figure: rebased impact (1996-2000), individual values and averages, for units graded 1, 2, 3b, 3a, 4, 5 and 5* at RAE2001; world average = 1.0]
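'Rebased impact' (RBI), used throughout what follows, is citations per paper divided by the world-average citations per paper for the same field and year, so that 1.0 means world average. A minimal sketch; the field baselines and citation counts are hypothetical:

```python
# Minimal sketch of 'rebased impact' (RBI): citations per paper divided by
# the world-average cites per paper for the same field and year, so 1.0 is
# the world average. The baselines below are invented for illustration.

world_baseline = {  # hypothetical world cites-per-paper, by (field, year)
    ("Biology", 1998): 10.0,
    ("Biology", 1999): 8.0,
}

def average_rbi(papers):
    """papers: list of (field, year, citation_count). Returns mean RBI."""
    rbis = [cites / world_baseline[(field, year)]
            for field, year, cites in papers]
    return sum(rbis) / len(rbis)

unit = [("Biology", 1998, 20), ("Biology", 1999, 4), ("Biology", 1999, 8)]
avg = average_rbi(unit)  # (2.0 + 0.5 + 1.0) / 3, just above world average
```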
Impact index is coherent across UK grade levels (data for core science disciplines, grade at RAE96)
[Figure: average normalised impact (world average = 1.0), 1991-2000, for units graded 4, 3A and 3B at RAE1996; annotated gaps of 16%, 12% and 17% separate the grade bands]
HEFCE favours bibliometrics: impact (1996-2000) is related to RAE2001 grade (data for UoA14 Biology)
[Figure repeated with individual unit values: rebased impact (1996-2000) against RAE2001 grade 1 to 5*; world average = 1.0]
The residual variance is very great
What is the right impact score?
• Correct counts
 – 25% of cites are to non-SCI outputs
• Proliferating versions
 – How do you collate?
• Collaboration vs fractional citations
 – Fractional citation counts would work against trends and policy
• Self citation – does it matter?
 – It is part of the sociology of research
• Normalisation strategies
• Clustering into subject groups
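The collaboration point can be made concrete: under whole counting a unit keeps every citation to its collaborative papers, while fractional counting divides credit by the number of partner institutions. A small sketch with invented figures:

```python
# Sketch of whole vs fractional citation counting for collaborative papers.
# Whole counting credits each participating institution with all of a
# paper's citations; fractional counting divides by the number of
# collaborating institutions. The data are invented for illustration.

papers = [
    # (citations, number of collaborating institutions)
    (40, 4),  # heavily cited international collaboration
    (10, 1),  # sole-institution paper
    (6, 2),
]

whole = sum(cites for cites, n_inst in papers)
fractional = sum(cites / n_inst for cites, n_inst in papers)
# whole = 56; fractional = 10 + 10 + 3 = 23. Fractional counting sharply
# discounts collaborative work, which is why it would cut against
# collaboration trends and policy.
```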
TOTAL INSTITUTIONAL OUTPUT
[Diagram: total institutional output divides into non-print material, unpublished or client-published reports etc., and publications. Institutional publications comprise books and chapters, conference proceedings, and journal articles (the latter appearing in WoS within 2-3 months). Journal articles split between journals covered by Thomson WoS and/or Scopus and articles in journals not covered, or not covered at the time of publication.]
INSTITUTIONAL PUBLICATIONS
[Diagram sequence: journals covered by Thomson WoS and/or Scopus, over a census period (2001-2007) ending at a census date. 'Papers by address' = papers published during the census period by staff while at the institution: all papers with an institutional address published by all staff and students employed or in training during 2001-2007, including papers by staff who left or retired before the census date. 'Papers by author' = papers published during 2001-2007 by staff present at the census date, including papers without that institutional address published by staff recruited during 2001-2007. The two sets differ by leavers and recruits.]
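The distinction between the two census constructions reduces to set logic: the author-based set drops leavers' papers from the address-based set and adds recruits' papers published elsewhere. A sketch with invented paper identifiers:

```python
# Set-logic sketch of the census diagrams: 'papers by address' (carrying
# the institutional address during the census period) versus 'papers by
# author' (written by staff present at the census date). Paper IDs are
# invented for illustration.

papers_by_address = {"p1", "p2", "p3", "p4"}  # published with our address
papers_by_leavers = {"p4"}                    # authors left before census date
recruits_papers_elsewhere = {"p5"}            # recruits' papers, other address

# Author-based counting removes leavers' papers and adds recruits' papers.
papers_by_author = (papers_by_address - papers_by_leavers) | recruits_papers_elsewhere
```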
Quality differentiation: do you assess total activity or selected papers? (data for UoA18 Chemistry)
[Figure: each HEI's five-year average RBI 1996-2000 for total output plotted against the RBI of its RAE2001 mapped articles, by grade (3b, 3a, 4, 5, 5*). Spearman r = 0.57, P < 0.001; ratio mapped/NSI = 1.93]
The average does not describe the profile
[Figure: percentage of output 1999-2003 by RBI category (uncited; >0-0.125; 0.125-0.25; 0.25-0.5; 0.5-1; 1-2; 2-4; 4-8; ≥8) for two units in the same field, with averages of 2.39 and 1.86]
Two units in the same field differ markedly in average normalised citation impact (2.39 vs. 1.86) because of an exceptionally high outlier in one group, but the groups have similar profiles
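A small numerical illustration of the point (values invented, not the units shown above): a single outlier paper moves the mean markedly while leaving the median, and the shape of the profile, essentially unchanged:

```python
# Why the average does not describe the profile: one exceptionally cited
# paper shifts a unit's mean normalised impact well above an otherwise
# similar unit's. The impact values are invented for illustration.

unit_a = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 9.7]  # one outlier at 9.7
unit_b = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 2.5]  # same shape, no outlier

mean_a = sum(unit_a) / len(unit_a)
mean_b = sum(unit_b) / len(unit_b)

def median(xs):
    """Middle value; odd-length lists only, for brevity."""
    return sorted(xs)[len(xs) // 2]

# The means differ markedly, yet the medians coincide: the profiles are
# near-identical apart from the single outlier.
```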
Distribution of data values - income
[Figure: frequency distribution of research income for units in UoA14 Biology at RAE2001, shown per FTE and gross, with minima and maxima marked: up to about £10m per unit and £250k per FTE]
Distribution of data values - impact
[Figure: frequency distribution of impact categories (normalised to world average) for UK Physics papers, 1995 (n = 2,323); world average and maximum marked]
The variables for which we have data are skewed and therefore difficult to picture in a simple way
Simplifying the data picture
• Scale data relative to a benchmark, then categorise
 – Could do this for any data set
• All journal articles
 – Uncited articles (take out the zeroes)
 – Cited articles
  • Cited less often than benchmark
  • Cited more often than benchmark
   – Cited more often but less than twice as often
   – Cited more than twice as often
    » Cited less than four times as often
    » Cited more than four times as often
Categorising the impact data
[Tree: all papers divide into uncited and cited; cited papers into those cited less often and more often than benchmark; successive doublings give the categories = 0, >0-0.125, 0.125-0.25, 0.25-0.5, 0.5-1, 1-2, 2-4, 4-8 and >8 times benchmark]
This grouping is the equivalent of a log 2 transformation. There is no place for zero values on a log scale.
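The categorisation can be sketched as a simple binning function. The bin edges follow the slides (lower bound inclusive), with uncited papers kept as their own class since zero has no place on a log scale:

```python
# Sketch of the log-2 impact categorisation: rebased-impact (RBI) values
# are binned on a doubling scale, with uncited papers (RBI = 0) as their
# own class. Bin boundaries are lower-inclusive, as on the slides.

BINS = ["= 0", ">0 - 0.125", "0.125 - 0.25", "0.25 - 0.5",
        "0.5 - 1", "1 - 2", "2 - 4", "4 - 8", "> 8"]

def rbi_category(rbi):
    """Map a rebased-impact value to its profile category."""
    if rbi == 0:
        return "= 0"
    edges = [(0.125, ">0 - 0.125"), (0.25, "0.125 - 0.25"),
             (0.5, "0.25 - 0.5"), (1, "0.5 - 1"),
             (2, "1 - 2"), (4, "2 - 4"), (8, "4 - 8")]
    for upper, label in edges:
        if rbi < upper:
            return label
    return "> 8"

def profile(rbis):
    """Percentage of a unit's output falling in each category."""
    counts = {label: 0 for label in BINS}
    for r in rbis:
        counts[rbi_category(r)] += 1
    return {label: 100 * n / len(rbis) for label, n in counts.items()}
```

For example, `profile([0, 1.5, 1.5, 3])` puts half the output in the '1 - 2' bin; summing any profile's percentages gives 100.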
UK ten-year profile: 680,000 papers
[Figure: percentage of UK output 1995-2004 by RBI category (= 0; >0-0.125; 0.125-0.25; 0.25-0.5; 0.5-1; 1-2; 2-4; 4-8; >8), with the mode, the mode of cited papers, the median, the average (RBI = 1.24) and a possible 'threshold of excellence' marked]
Profiles are informative and work well across institutions and subjects
[Figure: percentage of output 1995-2004 by RBI category for a leading research university, a big civic 'Robbins' type university and a former polytechnic; HEIs' 10-year totals, smoothed]
Absolute volume would add a further element for comparisons
[Figure: the same three HEIs' profiles restated as numbers of articles 1995-2004 per RBI category; 10-year totals by volume]
Normalisation strategy will affect the outcome (data for UoA13 Psychology)
[Figure: average impact for units rated 4, 5 and 5* at RAE2001 under four normalisations: average rebased impact; impact relative to journal average; relative to category average; relative to UoA average]
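The effect can be illustrated with one hypothetical paper scored against three baselines; all numbers are invented:

```python
# Sketch of why the normalisation baseline matters: the same citation
# count looks different relative to the paper's journal average, its
# subject-category average, or the whole UoA average. Numbers invented.

cites = 12.0  # citations to one hypothetical Psychology paper

baselines = {
    "journal": 15.0,   # a high-impact journal: the paper looks below par
    "category": 8.0,   # subject-category average: the paper looks strong
    "uoa": 10.0,       # whole unit-of-assessment average
}

normalised = {name: cites / base for name, base in baselines.items()}
# Same paper, three verdicts (0.8, 1.5, 1.2): rankings of units can
# change with the choice of baseline, which is the point of the slide.
```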
Subject clustering needs to fit UK research
[Dendrogram, similarity scale 0.00-0.64: this tree diagram illustrates similarity in the frequency with which journals were submitted to RAE1996, across units of assessment from Clinical Laboratory Sciences to Celtic Studies, grouped into Medical, Bio-Med, Environment, Physical, Maths, Engineering, Social, and Arts & Humanities clusters]
How should we map data to disciplines? i.e. what is Chemistry?
[Diagram:
 FUNDING – Research Council Chemistry grants committee (plus other funders)
 ACTIVITY – University School of Chemistry (plus other departments and other researchers)
 OUTPUT – Chemistry journals on Thomson's ISI index (plus other journals)]
How well do metrics respond to variation?
• Subject differences
 – Can we accept differences in criteria and balance between clusters?
 – What about divergence within clusters?
 – How do metrics support the growth of interdisciplinarity?
 – How can emerging (marginal?) research groups be recognised?
• Differences in mode
 – Where is the balance between basic and applied research?
• Differences in people
 – Career breaks, career development
How well do metrics represent different HEIs?
Output coverage by articles on Thomson Reuters' databases
[Figure: for each HEI, the number of articles submitted to RAE2001 and the proportion of those articles in ISI journals (0-1)]
What will it cost?
• Data costs
 – Core data – how much, from whom?
 – Data cleaning and validation
  • Pilot studies are elucidating this – and the task is big
• Requirements on institutions
 – Pilot studies will elucidate this
• System development
• System maintenance
• Will it cover institutional quality assurance?
Other issues
• Census period
 – What about synchrony and sequence?
• Weighting indicators
 – ERA will weight research training at '0'
 – Need to weight within types as well as between
• Interface between quantitative (indicators) and qualitative (peer review)
 – Role of panel members
 – Risk of mis-match
Do outputs hang together with income and training? We can tell you …
“You are the REF”
Check it out now at RAE2008.com
How can we judge possible metrics?
• Relevant and appropriate – YES
 – Technical 'correctness' of metrics is not a problem, but there is a lot of work to do in refining and comparing options
• Cost – MAYBE
 – Data accessibility is not a problem
 – But we have yet to scope full system requirements
• So is there a problem?
 – Are all subjects, HEIs, staff and modes treated equitably?
 – What will 50,000 intelligent people start to do?
 – Goodhart's Law – for how long will the metrics track excellence?
• Researchers must decide, not metricians (RMM, 1997)
 – The devil is in the detail: get involved
REF pilot projects
• 20+ institutions (July ’08)
• Collect and collate databases, reconciling authors to staff (Oct ’08)
• Compare Thomson and Scopus coverage
• Collate and normalise citation counts (Dec ’08)
• Run evaluations of alternative methodologies
• Disseminate outcomes and consult (Mar ’09)
New Zealand - bibliometric volume and impact
[Figure: NZ impact relative to world average (0.7-1.0, left axis) and total articles per five-year period (10,000-25,000, right axis), for 1981-1985, 1988-1992, 1995-1999 and 2002-2006]
Over 8,000 people participated in recent PBRF rounds (50,000 in the RAE). Thomson recorded fewer than 5,000 articles per year recently (100,000 for the UK). That is less than one article per NZ researcher per year.
Implications for Aotearoa New Zealand
• Relative data coverage
 – Balance of regional journals
• 'International' = trans-Atlantic
 – The relevance of citations
• Scale factors and relative load
 – Fixed costs
• Community size and anonymity
• Compatibility of stakeholder and researcher views on assessment outcomes