This article was downloaded by: [Florida Atlantic University]
On: 24 September 2013, At: 12:24
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
European Journal of Sport Science
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tejs20
Expected number of goals depending on intrinsic and extrinsic factors of a football player. An application to professional Spanish football league
Antonio Sáez Castillo a, José Rodríguez Avi a & José María Pérez Sánchez b
a Department of Statistics and Operational Research, University of Jaén, Jaén, Spain
b Department of Quantitative Methods, University of Granada, Granada, Spain
Published online: 14 Dec 2011.
To cite this article: Antonio Sáez Castillo, José Rodríguez Avi & José María Pérez Sánchez (2013) Expected number of goals depending on intrinsic and extrinsic factors of a football player. An application to professional Spanish football league, European Journal of Sport Science, 13:2, 127-138, DOI: 10.1080/17461391.2011.589473
To link to this article: http://dx.doi.org/10.1080/17461391.2011.589473
ORIGINAL ARTICLE
Expected number of goals depending on intrinsic and extrinsic factors of a football player. An application to professional Spanish football league
ANTONIO SAEZ CASTILLO1, JOSE RODRIGUEZ AVI1, & JOSE MARIA PEREZ SANCHEZ2
1Department of Statistics and Operational Research, University of Jaen, Jaen, Spain, and 2Department of Quantitative
Methods, University of Granada, Granada, Spain
Abstract
A Bayesian regression model for the number of goals scored by players in the Spanish football league during nine seasons is fitted. The model handles overdispersion in such a way that an individual footballer's scoring ability may be estimated regardless of the number of minutes played, the position on the field and the team in which he plays. Additionally, the posterior predictive distributions of the fitted model make it possible to estimate the performance of any player in each season with reference to the number of goals scored, locating that number as a quantile in the expected distribution of the goals scored by this player in this season. The results show how the model rewards defenders and midfielders who score goals, because it does not expect them to score many, and evaluates a forward more strictly in relation to the expected number of goals.
Keywords: Goal scoring ability, goal scoring performance, Bayesian analysis, count data, Poisson regression, negative
binomial regression, overdispersion
1. Introduction
The fact that a player scores a goal may depend on
readily measurable factors, such as the number of
minutes or matches played, the position on the pitch
and the team quality. But it also depends on other
factors which are more difficult to measure, such as
the individual characteristics of the player that make
him different from all the rest.
The main aim of this paper is to evaluate the goal
scoring ability and the goal scoring performance of
Spanish footballers using regression models that
describe the number of goals scored during the
current season, and subsequently update it with the
result obtained in the current season.
We employ hierarchical Bayesian regression mod-
els in which, given the individual characteristics of a
player, the number of goals scored in a season
follows a Poisson or a negative binomial distribution.
The mean of this distribution, which is unobserva-
ble, is modelled taking into account the player’s
environmental conditions (in a deterministic fash-
ion) and his individual conditions (in a probabilistic
way). Our purpose with this type of model is twofold:
• First, we try to quantify, in terms of its posterior distribution, the individual component that each player contributes to his scoring average through his own characteristics, regardless of the external conditions of his play.
• Secondly, we also want to estimate the posterior predictive distribution for the number of goals scored by each player in the last season, based on his background in the Spanish League competition. In this way, considering
Correspondence: Jose Marıa Perez Sanchez, Department of Quantitative Methods, Faculty of Business and Economics Sciences, University
of Granada, Campus universitario de Cartuja, s/n. 18011 Granada, Spain. E-mail: [email protected]
European Journal of Sport Science, 2013
Vol. 13, No. 2, 127-138, http://dx.doi.org/10.1080/17461391.2011.589473
© 2013 European College of Sport Science
the number of goals scored in this last season, the
models can provide an evaluation of the perfor-
mance of the player with respect to goal scoring,
locating the quantile within the posterior
distribution.
We focus on the count variable number of goals
scored by a player during a season. However, the
methodology could be applied to the analysis of
other outcomes that may be quantified by count data
variables, such as the number of yellow or red cards,
goal assists, intercepted passes, tackles, etc.
Previously published work developing predictive models for the number of goals scored has primarily focused on the final result of matches and has not been concerned with player performance. Many of these models are based on Poisson models or can be derived from them (e.g. see Crowder, Dixon, Ledford, & Robinson, 2002). Other studies have taken player behaviour into account, but have analysed the effectiveness of team strategies rather than individual performance (Ensum, Pollard, & Taylor, 2004; Hughes & Franks, 2005; Pollard & Reep, 1997). In turn, hierarchical Poisson models have been used to analyse different kinds of data using a range of approaches to deal with the overdispersion that frequently affects such data (Dey, Ghosh, & Mallick, 2000, pp. 94-96) or, more generally, to model characteristics of data that cannot be described by the Poisson model (see, for example, Ludwig & Osuna-Echavarría, 2006). Furthermore, negative binomial regression has been discussed in detail in Hilbe (2007). Negative binomial distributions have also been used for fitting data on goal scoring (see Baxter & Stevenson, 1988; Lee, 2002); these studies model scores in association football and rugby league, respectively. In the same way, a model based on an extension of the negative binomial distribution, known as the Generalized Waring Regression Model (GWRM), has been employed by Rodríguez-Avi, Conde-Sánchez, Sáez-Castillo, Olmo-Jiménez, and Martínez-Rodríguez (2009) for fitting the number of goals scored in relation to several covariates in the Spanish football league. The main drawback of the GWRM in this context is its frequentist point of view, which makes it impossible to estimate players' individual ability. Finally, Goddard (2005) studied both goals and match results by using regression models for forecasting purposes.
2. Models
We have employed two different models, taking into
account two different sampling distributions for the
number of goals. The first of them employs a Poisson
sampling which takes into account the individual
characteristics of the player through a Gamma dis-
tribution. In the second one, we start from a negative
binomial sampling, trying to consider the possibility
of overdispersion in the sampling, and we model
individual heterogeneity by a BetaII distribution.
The notation which we consider in both models is:
• $y_{i,j}$ is the number of goals scored by the $i$th player in the $j$th season.
• $t_{i,j}$ is the number of minutes that the $i$th footballer plays in the $j$th season, considered as an offset.
• $x_{i,j}$ is the vector of explanatory variables of the $i$th player in the $j$th season. Specifically, we have considered two explanatory variables:
1. The position of the $i$th player in the $j$th season (defence, midfielder or forward), according to the database from which the data were obtained. It has been considered as a factor, with two dummy variables and defence as the reference level.
2. The rank of the $i$th player's team in the $j$th season.
Thus, $x'_{i,j} = (1, \text{midfielder}_{i,j}, \text{forward}_{i,j}, \text{classification}_{i,j})$, where $\text{midfielder}_{i,j}$ and $\text{forward}_{i,j}$ are dummy variables and $\text{classification}_{i,j}$ is the rank of the $i$th player's team in the $j$th season, from 1 to 20.
It is important to take into account that most of
the players don’t play all the seasons, some of them
even play in non-consecutive seasons. In the analysis
of data, we have solved this problem considering in
each season only those players who participate in it.
The dataset has been extracted from the database of
the Spanish sports daily newspaper MARCA. This
newspaper has a daily readership of over 2,749,000,
the highest in Spain for a daily newspaper, and more
than half of sports readership. Regarding the covariates considered, we think the most controversial is the player's position on the pitch: it is not easy to assign a player a specific position (defence, midfielder or forward), and this position may not be constant throughout a season. Nevertheless, we have decided to treat the data as they appear in the original database.
2.1. Poisson model
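We consider a Poisson model in which
$$y_{i,j} \mid \lambda_{i,j} \sim \text{Poisson}(\lambda_{i,j}), \quad \text{where } \lambda_{i,j} = v_i \mu_{i,j},$$
so
$$\Pr(y_{i,j} = y \mid v_i, \mu_{i,j}) = e^{-v_i \mu_{i,j}} \frac{(v_i \mu_{i,j})^{y}}{y!}, \quad y = 0, 1, 2, \ldots,$$
where
$$\mu_{i,j} = t_{i,j} \exp(x'_{i,j} \beta).$$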
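As a minimal sketch of how the covariate part of the Poisson mean is assembled, the following Python fragment builds the design vector (1, midfielder, forward, classification) and evaluates the minutes offset times exp(x'β). The function names and the coefficient values in `beta_demo` are purely illustrative assumptions; they are not the fitted estimates reported in the paper, and the sketch ignores any covariate scaling applied before fitting.

```python
import math

POSITIONS = ("defence", "midfielder", "forward")


def design_vector(position, rank):
    """Return x = (1, midfielder, forward, classification), defence as reference."""
    if position not in POSITIONS:
        raise ValueError(f"unknown position: {position}")
    return [
        1.0,
        float(position == "midfielder"),
        float(position == "forward"),
        float(rank),
    ]


def covariate_mean(minutes, x, beta):
    """Covariate effect: minutes played (offset) times exp(x' beta)."""
    linear_predictor = sum(x_h * beta_h for x_h, beta_h in zip(x, beta))
    return minutes * math.exp(linear_predictor)


# Hypothetical coefficients (intercept, midfielder, forward, classification),
# chosen only to illustrate the arithmetic.
beta_demo = [-7.0, 1.0, 1.8, -0.02]

mu_defender = covariate_mean(3000, design_vector("defence", 1), beta_demo)
mu_forward = covariate_mean(3000, design_vector("forward", 1), beta_demo)
print(mu_defender, mu_forward)
```

With everything else equal, the ratio between the forward and the defender equals the exponential of the forward coefficient, which is how a log-linear model with an offset expresses position effects.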
In this way, $v_i$ represents the individual component of each player, that is, his own contribution to the scoring average regardless of the covariate effects, taking into account only those characteristics that make him unique. Because of this, we think that $v_i$ is related to the ability of the player in terms of goal scoring. Thus, high values of $v_i$ would indicate that the scoring average is greater than that determined by the covariates, and vice versa. On the other hand, the covariate effects appear in $\mu_{i,j}$, and they may change from one season to another.
To allow for heterogeneity not explained by the covariates, $v_i$ is considered as a Gamma($\delta$, $\delta$) random variable, with probability density function
$$p(v_i \mid \delta) = \frac{\delta^{\delta}}{\Gamma(\delta)} v_i^{\delta - 1} e^{-\delta v_i}.$$
By specifying a Gamma distribution for $v_i$ with equal shape and rate parameters, the negative binomial model is derived (see Cameron & Trivedi, 1998). The distribution of $v_i$ is kept constant throughout the seasons because it represents the individual ability that is intrinsic to the player. Note that, due to the presence of $\beta_0$, the mean of $v_i$ can be fixed at one, avoiding a nuisance parameter and controlling only its variance through $\delta$.
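The step from the Gamma-mixed Poisson to the negative binomial can be checked numerically. The sketch below assumes the standard result that mixing a Poisson mean over a Gamma variable with equal shape and rate (mean one, variance one over that parameter) gives a negative binomial whose variance exceeds its mean by the squared mean divided by the same parameter. The function names and the values of `mu` and `delta` are illustrative only.

```python
import math


def nb_pmf(y, mu, delta):
    """Marginal pmf of y when y | v ~ Poisson(v * mu) and v ~ Gamma(delta, delta)."""
    log_p = (
        math.lgamma(delta + y) - math.lgamma(delta) - math.lgamma(y + 1)
        + delta * math.log(delta / (delta + mu))
        + y * math.log(mu / (delta + mu))
    )
    return math.exp(log_p)


def nb_moments(mu, delta, upper=500):
    """Total mass, mean and variance by direct summation over 0..upper-1."""
    probs = [nb_pmf(y, mu, delta) for y in range(upper)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


# Illustrative values: covariate mean of 5 goals, heterogeneity parameter 2.
total, mean, var = nb_moments(mu=5.0, delta=2.0)
print(total, mean, var)  # variance should be close to 5 + 5**2 / 2
```

A smaller heterogeneity parameter means more spread in individual ability and therefore more overdispersion relative to a pure Poisson model.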
The regression coefficients $\beta_h$ ($h = 0, 1, 2, 3$) follow non-informative normal prior distributions $N(0, 100)$. We propose a flexible hierarchical prior structure for $\delta$, with $\delta \sim \text{Exp}(b)$ and $b \sim \text{Exp}(0.01)$, where the hyperparameter $b$ follows an exponential distribution with a large variance ($\text{Var}(b) = 10{,}000$), indicating the absence of prior information. Similar assumptions for the parameters of the hyperprior distributions can be found in Sáez-Castillo, Olmo-Jiménez, Pérez Sánchez, Negrín Hernández, Arcos-Navarro, and Díaz-Oller (2010).
Posterior distributions are obtained by applying
MCMC methods implemented by the WinBUGS
software (Lunn, Thomas, Best, & Spiegelhalter,
2000). The codes are provided in an Appendix section.
2.2. Negative binomial model. Empirical and full Bayes
versions
Assuming Poisson sampling may be too restrictive, so we have proposed a second model in which the negative binomial distribution may explain overdispersion in the sampling. Specifically, we consider $y_{i,j} \mid \lambda_{i,j} \sim \text{Poisson}(\lambda_{i,j})$, where $\lambda_{i,j} \sim \text{Gamma}(a_{i,j}, 1/v_i)$, or equivalently, $y_{i,j} \mid x_{i,j}, v_i \sim NB(a_{i,j}, v_i)$, with
$$\Pr(y_{i,j} = y \mid x_{i,j}, v_i) = \frac{\Gamma(a_{i,j} + y)}{\Gamma(a_{i,j})\, y!} \left( \frac{1}{1 + v_i} \right)^{a_{i,j}} \left( \frac{v_i}{1 + v_i} \right)^{y},$$
in such a way that $E(y_{i,j} \mid x_{i,j}, v_i) = a_{i,j} v_i$. Additionally,
• $v_i \sim \text{BetaII}(\rho, k)$, with probability density function
$$p(v_i) = \frac{\Gamma(k + \rho)}{\Gamma(k)\Gamma(\rho)} v_i^{k - 1} (1 + v_i)^{-(k + \rho)}.$$
• $a_{i,j} = t_{i,j}\, e^{x'_{i,j}\beta}\, \frac{\rho - 1}{k}$, where $t_{i,j}$ is the offset.
$v_i$ again represents the individual ability, that is, the contribution that the unique conditions of each player make to the scoring average. We have preferred to keep the same notation for this component as in the Poisson model, although the scale is not comparable between the two models. On the other hand, the covariate effects on the scoring average are now included in $a_{i,j}$.
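A short worked derivation may help to see why the factor $(\rho - 1)/k$ appears in $a_{i,j}$. Assuming $\rho > 1$, so that the BetaII mean exists, the prior mean of the individual component is

```latex
E(v_i) = \int_0^\infty v\, \frac{\Gamma(k+\rho)}{\Gamma(k)\Gamma(\rho)}\, v^{k-1} (1+v)^{-(k+\rho)}\, dv
       = \frac{B(k+1, \rho-1)}{B(k, \rho)}
       = \frac{k}{\rho - 1},
\qquad
E(y_{i,j} \mid x_{i,j}) = a_{i,j}\, E(v_i)
       = t_{i,j}\, e^{x'_{i,j}\beta}\, \frac{\rho-1}{k} \cdot \frac{k}{\rho-1}
       = t_{i,j}\, e^{x'_{i,j}\beta}.
```

Under this reading, the marginal scoring average depends only on the offset and the covariates, which parallels the unit mean imposed on $v_i$ in the Poisson model.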
If $k$, $\rho$ and $\beta$ have constant values, the model leads to a prior predictive distribution for $y_{i,j} \mid x_{i,j}$ given by a univariate generalized Waring distribution UGWD($a_{i,j}$, $k$, $\rho$) (Irwin, 1968; Xekalaki, 1983) with probabilities
$$\Pr(y_{i,j} \mid a_{i,j}, k, \rho) = \frac{\Gamma(a_{i,j} + \rho)\,\Gamma(k + \rho)}{\Gamma(\rho)\,\Gamma(a_{i,j} + k + \rho)} \, \frac{(a_{i,j})_{y_{i,j}} (k)_{y_{i,j}}}{(a_{i,j} + k + \rho)_{y_{i,j}}} \, \frac{1}{y_{i,j}!}, \quad y_{i,j} = 0, 1, \ldots,$$
where $(a)_x = \Gamma(a + x)/\Gamma(a)$ for any $a > 0$ and any integer $x$. Taking into account that the BetaII is conjugate under negative binomial sampling, the posterior predictive distribution for the number of goals scored by the $i$th player in the $(j+1)$th season, given the information about all the previous seasons and the covariates of the current season, is
$$y_{i,j+1} \mid \{(y_{i,l}, x_{i,l})\}_{l=1}^{j},\, x_{i,j+1} \sim \text{UGWD}\left( a_{i,j+1},\; k + \sum_{l=1}^{j} y_{i,l},\; \sum_{l=1}^{j} a_{i,l} + \rho \right).$$
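The generalized Waring probabilities and the conjugate update can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the parameter values in the usage note are invented, and log-gamma functions are used only to keep the ratios of Gamma functions numerically stable.

```python
import math


def log_pochhammer(a, x):
    """log of (a)_x = Gamma(a + x) / Gamma(a)."""
    return math.lgamma(a + x) - math.lgamma(a)


def ugwd_pmf(y, a, k, rho):
    """Probability of y goals under UGWD(a, k, rho)."""
    log_p = (
        math.lgamma(a + rho) + math.lgamma(k + rho)
        - math.lgamma(rho) - math.lgamma(a + k + rho)
        + log_pochhammer(a, y) + log_pochhammer(k, y)
        - log_pochhammer(a + k + rho, y)
        - math.lgamma(y + 1)
    )
    return math.exp(log_p)


def ugwd_cdf(y, a, k, rho):
    """Probability of scoring y goals or fewer."""
    return sum(ugwd_pmf(x, a, k, rho) for x in range(y + 1))


def posterior_predictive_params(a_next, k, rho, past_goals, past_a):
    """Conjugate update: UGWD(a_next, k + sum of goals, sum of past a + rho)."""
    return a_next, k + sum(past_goals), rho + sum(past_a)
```

For instance, with hypothetical values `k = 3.0`, `rho = 6.0`, past goals `[4, 7]` and past values of a `[1.5, 2.5]`, the updated parameters for a season with `a_next = 2.0` are `(2.0, 14.0, 10.0)`, and `ugwd_cdf(goals, 2.0, 14.0, 10.0)` locates an observed goal count as a quantile of that distribution.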
So, a first version of the model may be fitted by eliciting values for $k$, $\rho$ and $\beta$. We will call this version empirical Bayes; it is related to the GWRM from a frequentist point of view.
Nevertheless, the elicitation of $k$, $\rho$ and $\beta$ considering the same dataset (because there is no
previous information) may be controversial. For this reason, we assign weakly informative hyperprior distributions to these parameters. In this sense, we have proposed a full Bayes model to cope with the variability of the parameters.
Again, we propose
• $\rho \sim \text{Exp}(0.01)$,
• $k \sim \text{Exp}(0.01)$ and
• $\beta_h \sim N(0, 100)$, for $h = 0, 1, 2, 3$.
In terms of the goodness of the fits, this model has
been the most accurate. As in the Poisson model,
posterior distributions are obtained by applying
MCMC methods implemented by the WinBUGS
software (Lunn et al., 2000). Again, codes are
included in the Appendix section.
3. Estimation of goal scoring ability and
performance
Once both models have been fitted, we have employed the posterior distribution of each $v_i$ to establish a ranking of the individual ability of all the players, sorting by the posterior means. As we have commented previously, the $v_i$ scales are not comparable between the models, but the rankings may be compared; in fact, we will see that the classifications according to both models (Poisson and full Bayes negative binomial) are very close.
Additionally, we have also employed both models to evaluate the goal scoring performance of each player. Specifically, we consider the $i$th player, whose career is known until the $j$th season; in addition, at the end of the $(j+1)$th season we can estimate the expected distribution for his number of goals, $y_{i,j+1} \mid \{(y_{i,l}, x_{i,l})\}_{l=1}^{j},\, x_{i,j+1}$. Finally, we can locate the actual number of goals scored by this $i$th player in this last $(j+1)$th season in the context of its expected distribution, as a quantile, in such a way that this quantile is an evaluation of the goal scoring performance of this player in this last season, taking into account all his career and his individual ability. Thus, if the actual number of goals scored is over the median, we could say that his performance has been better than expected, and vice versa. Note that, in this way, the predictive distribution is not used for prediction, but only to assess where the actual number of goals scored falls within it.
It must be taken into account that we only have an explicit, closed-form expression for the distribution of $y_{i,j+1} \mid \{(y_{i,l}, x_{i,l})\}_{l=1}^{j},\, x_{i,j+1}$ in the empirical Bayes version of the negative binomial model (that is, a UGWD). In the case of the Poisson model and the full Bayes version of the negative binomial model, where there is no explicit expression for these probabilities, we have taken into account that they are the average of the conditional Poisson probabilities, that is
$$\Pr(y_{i,j+1} \le y) = \int_0^\infty \Pr(y_{i,j+1} \le y \mid \lambda_{i,j+1})\, f(\lambda_{i,j+1})\, d\lambda_{i,j+1} = E\left[ F(y \mid \lambda_{i,j+1}) \right],$$
where $F$ refers to the Poisson distribution function (in both models). So we have simulated large samples of the goal scoring average of each player, $\lambda_{i,j+1}$, and then we have estimated the probability of scoring at most the actual number of goals, $\Pr(y_{i,j+1} \le y)$, as the sample mean of the conditional Poisson probabilities $\Pr(y_{i,j+1} \le y \mid \lambda_{i,j+1})$, given the simulated $\lambda_{i,j+1}$ values.
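The averaging step described above can be sketched as follows. The sketch assumes that a list of simulated scoring averages is already available (for example, draws exported from an MCMC run); the function names and the sample values are hypothetical.

```python
import math


def poisson_cdf(y, lam):
    """Probability of y events or fewer for a Poisson with mean lam."""
    term = math.exp(-lam)  # probability of zero events
    total = term
    for x in range(1, y + 1):
        term *= lam / x  # recurrence: P(x) = P(x - 1) * lam / x
        total += term
    return total


def performance_quantile(goals, lambda_draws):
    """Monte Carlo estimate: mean of the conditional Poisson CDFs over the draws."""
    if not lambda_draws:
        raise ValueError("at least one simulated scoring average is required")
    return sum(poisson_cdf(goals, lam) for lam in lambda_draws) / len(lambda_draws)


# Hypothetical simulated scoring averages for one player and season.
draws_demo = [8.2, 9.5, 10.1, 11.3, 7.9]
print(performance_quantile(12, draws_demo))
```

A value close to one indicates that the observed goal count lies high in the player's expected distribution; a value close to zero indicates the opposite.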
4. Models fit
The statistical methods consisted of two steps: (1) estimation of regression models and (2) analysis of goodness of fit using the deviance information criterion (DIC). The final sample comprised 1599 players from the Spanish football league from the 2000/2001 to the 2008/2009 seasons (nine seasons).
Bayesian estimation of the Poisson and full Bayes negative binomial models was carried out by applying MCMC techniques. Four chains of 100,000 samples were recorded after a burn-in sample of 100,000 for both models. Several diagnostics were obtained to check the convergence of the simulations, using tests provided within the WinBUGS Convergence Diagnostics and Output Analysis software (CODA). We have used the following procedure to establish initial values for the full Bayes negative binomial model:
• First, we have estimated $k$, $\rho$ and $\beta$ by maximum likelihood in a GWRM by means of the GWRM package (Sáez-Castillo, Rodríguez-Avi, Conde-Sánchez, Olmo-Jiménez, & Martínez-Rodríguez, 2009) for R (R Development Core Team, 2009), pooling all the seasons into a single dataset. We must clarify that the GWRM proposes a general UGWD($a_i$, $k$, $\rho$) distribution for the number of goals scored, $y_i$, where $a_i = \exp(x'_i \beta)\, \frac{\rho - 1}{k}$, in such a way that the mean is $\mu_i = \exp(x'_i \beta)$, but it does not allow the season to be taken into account in the way we are considering and, additionally, it only proposes a general BetaII($\rho$, $k$) distribution for the individual heterogeneity (the same for all the players), which does not serve our purposes.
• Secondly, we have calculated the corresponding values of $a_{i,j}$, given $k$, $\rho$ and $\beta$.
So, in this way, we are considering the estimates of $k$, $\rho$, $\beta$ and $a_{i,j}$ in the empirical Bayes version of the
negative binomial model as initial values of the full Bayes version.
The most accurate model, in terms of the DIC value, was the full Bayes version of the negative binomial model (DIC = 11,260.1), so we are going to focus only on the results provided by this model. The DIC values for the Poisson model and the empirical Bayes version of the negative binomial model are 11,340.6 and 11,271.2, respectively.
Table I and Table II contain the $\beta$ estimates and the estimates of the remaining parameters of the models. We would like to highlight that the $\beta$ estimates are statistically relevant and quite similar in the three models.
5. Results
5.1. Estimation of the goal scoring ability. Ranking of the
most able players
To summarise the results that the full Bayes version of the negative binomial model provides about the ability of the players by means of $v_i$, we have highlighted only the best five players in each position on the field, sorting by the posterior mean of $v_i$; this mean is usually quite similar to the median, due to the symmetry of the distribution of most of the $v_i$. As we have commented before, this ranking is very close to that provided by the Poisson model, as shown in Figure 1, where the estimated posterior means of $v_i$ under both models appear as Cartesian coordinates.
Table III shows the results. The best five defenders are, in this order, Ezequiel Garay, Roberto Carlos, Campano, Cristian Alvarez and Larrazabal. In fact, Ezequiel Garay appears as the third best player in terms of goal scoring ability among all the players of the Spanish football league, and this may be due to several factors: first, he was responsible for penalty kicks in his two seasons at Racing de Santander (his signing for Real Madrid is not included in the dataset because it took place in the 2009/2010 season); secondly, he plays as a defender in a mid-table team, so his not very high scoring average is very well valued by the model. Roberto Carlos is a different case, because it is well known that in his career at Real Madrid he was an effective defender with a special talent for goal scoring, mainly due to his strong shot; in the general ranking he is the fifth best player.
The best midfielders are Rivaldo, Robert, Luis Cembranos, Mark Gonzalez and Mostovoi. They include players such as Rivaldo or Mostovoi, who are classified as midfielders but really played in very advanced positions, in such a way that many people consider them forwards. Robert is a player who did not play many minutes, but did so with a high performance, taking into account also that he played in a team in the medium-low part of the classification (R.C.D. Betis) in most of the seasons.
Finally, the most able forwards, according to this ranking, are Messi, Ronaldo, Makaay, Villa and Etoo. It is striking that they do not appear in very high global positions. In fact, apart from Messi, who is in the sixth position, the rest appear below the 15th position. That is because the model takes into account that they are forwards, so it expects that
Table I. β estimates
| Model | Node | Mean | S.D. | MC error | 2.5% | Median | 97.5% |
| Full Bayesian NB | β0 | -3.178 | 0.0522 | 0.00467 | -3.291 | -3.172 | -3.084 |
| | β1 | -0.2243 | 0.0271 | 0.00121 | -0.2796 | -0.224 | -0.173 |
| | β2 | 1.076 | 0.0634 | 0.00552 | 0.967 | 1.071 | 1.21 |
| | β3 | 1.879 | 0.0672 | 0.00591 | 1.752 | 1.879 | 2.017 |
| Poisson | β0 | -3.157 | 0.0468 | 0.0019489 | -3.248 | -3.1578 | -3.0628 |
| | β1 | -0.221 | 0.0263 | 3.1884E-4 | -0.27303 | -0.22087 | -0.16949 |
| | β2 | 1.071 | 0.0564 | 0.0021388 | 0.95919 | 1.0719 | 1.1806 |
| | β3 | 1.801 | 0.0605 | 0.0023942 | 1.6803 | 1.8025 | 1.9183 |
| Empirical Bayes NB | β0 | -3.1729 | | | | | |
| | β1 | -0.3014 | | | | | |
| | β2 | 1.0023 | | | | | |
| | β3 | 2.1096 | | | | | |

Table II. Other parameter estimates
| Model | Node | Mean | S.D. | MC error | 2.5% | Median | 97.5% |
| Full Bayesian NB | k | 3.699 | 0.4139 | 0.04058 | 3.133 | 3.629 | 4.853 |
| | ρ | 25.22 | 2.295 | 0.2226 | 20.0 | 25.33 | 29.26 |
| Poisson | b | 0.7426 | 0.5275 | 0.002998 | 0.09055 | 0.6205 | 2.077 |
| | s | 2.706 | 0.2138 | 0.003874 | 2.314 | 2.695 | 3.152 |
| Empirical Bayes NB | k | 19.640 | | | | | |
| | ρ | 31.987 | | | | | |
they score many goals, without this meaning that they have a special ability which cannot be explained by their position on the field; moreover, we have seen how the model rewards in the ranking the fact that defenders and midfielders score goals, because scoring is not their main objective on the field.
5.2. Estimation of the goal scoring performance
We have selected five players from those who appear in Table III to evaluate their goal scoring performance in the different seasons. We have also included Raul in this analysis, because he is an iconic player for many people in the Spanish football league. As we have described before, this estimation is made by locating the actual number of goals scored as a quantile of the expected distribution of the number of goals in this season. Recall that we need to approximate the related probabilities of the expected distributions as averages of the corresponding Poisson probabilities given simulated values of $\lambda_{i,j}$, for any player $i$ in each season $j$ involved, so the
Figure 1. Estimated posterior means of $v_i$ by the full Bayes version of the negative binomial model (x-axis) and the Poisson model (y-axis).

Table III. Ranking, according to the full Bayes negative binomial model, of the best players by position on the field. The global position in the general ranking and the number of seasons played are also shown
| Position | Name | Global position | Seasons |
| Defenders | Ezequiel Garay | 3 | 4 |
| | Roberto Carlos | 5 | 7 |
| | Campano | 7 | 6 |
| | Cristian Alvarez | 19 | 3 |
| | Larrazabal | 26 | 4 |
| Midfielders | Rivaldo | 1 | 2 |
| | Robert | 2 | 2 |
| | Luis Cembranos | 4 | 3 |
| | Mark Gonzalez | 8 | 4 |
| | Mostovoi | 9 | 4 |
| Forwards | Messi | 6 | 5 |
| | Ronaldo | 16 | 5 |
| | Makaay | 18 | 3 |
| | Villa | 22 | 6 |
| | Etoo | 24 | 9 |
computational effort is great. In this way, we are quantifying the performance of the player with reference to his goal scoring, taking into account his position on the field, the number of minutes played, the classification of his team and even his own ability. For example, imagine that a player scores 12 goals in a season and, considering the expected distribution for this player and this season, the probability of scoring 12 goals or fewer is 0.95: this means that the actual number of goals scored (12) is very high (95th percentile) in relation to the expected distribution, which takes into account his external (minutes played, team and position) and internal (ability) conditions, indicating a good performance. In particular, we want to comment on some cases.
Table IV shows, for each season and each player, the probability that the number of goals is less than or equal to the number of goals actually scored. The higher this probability, the better the performance of the player in the season. With respect to Etoo's evolution, we would like to emphasise his high performance in his last seasons (his worst result was in the 2001/2002 season, when he scored only six goals and his team, Real Mallorca, finished in the 16th position). If we focus on the season when he was signed by F.C. Barcelona, 2004/2005, we can see that previously he had played 2864 minutes for Real Mallorca (which finished in the 11th position), scoring 18 goals, while in his first season at F.C. Barcelona (which went on to win the league) he played a little more (3036 minutes), scoring 24 goals; nevertheless, the performance is very similar in both seasons. That is because the model is necessarily stricter with a forward who plays in the winning team than with a forward in a mid-table team.
Messi's evolution is also interesting. He shows a high goal scoring performance in his initial seasons, but a strong decrease can be detected in the 2007/2008 season. This decrease is due to the change in his recorded position: until then he was considered a midfielder and, from that season on, a forward. The model evaluates a forward more strictly in relation to the expected number of goals, in such a way that, even if the performance were similar, it is located in lower quantiles.
Finally, in the case of Ronaldo we think it must be taken into account that he was injured for most of his last season (2006/2007), so he could only play 329 minutes, in which he scored only one goal: that is the reason for his low performance. In the case of Villa, a decrease in the 2006/2007 season is also observed, but this can only be due to a worse performance, because he went from scoring 25 goals in his first season at Valencia F.C. (2005/2006) to scoring only 15 in his second season (2006/2007), playing approximately the same number of minutes.
References
Baxter, M., & Stevenson, R. (1988). Discriminating between the Poisson and negative binomial distributions: An application to goal scoring in association football. Journal of Applied Statistics, 15(3), 347-354.
Cameron, A. C., & Trivedi, P. K. (1998). Regression analysis of count data. Cambridge: Cambridge University Press.
Crowder, M., Dixon, M., Ledford, A., & Robinson, M. (2002). Dynamic modelling and prediction of English football league matches for betting. Journal of the Royal Statistical Society: Series D (The Statistician), 51(2), 157-168.
Dey, D., Ghosh, S. K., & Mallick, B. K. (2000). Generalized linear models: A Bayesian perspective. New York: CRC Press.
Ensum, J., Pollard, R., & Taylor, S. (2004). Applications of logistic regression to shots at goal in association football: Calculation of shot probabilities, quantification of factors and player/team. Journal of Sports Sciences, 22(6), 500-520.
Goddard, J. (2005). Regression models for forecasting goals and match results in association football. International Journal of Forecasting, 21(2), 331-340.
Hilbe, J. (2007). Negative binomial regression. Cambridge: Cambridge University Press.
Hughes, M., & Franks, I. (2005). Analysis of passing sequences, shots and goals in soccer. Journal of Sports Sciences, 23(5), 509-514.
Irwin, J. O. (1968). The generalized Waring distribution applied to accident theory. Journal of the Royal Statistical Society A, 131(2), 205-225.
Lee, A. (2002). Modelling rugby league data via bivariate negative binomial regression. Australian & New Zealand Journal of Statistics, 41(2), 141-152.
Ludwig, F., & Osuna-Echavarría, L. (2006). Structured additive regression for overdispersed and zero-inflated count data. Applied Stochastic Models in Business and Industry, 22(4), 351-369.
Lunn, D. J., Thomas, A., Best, N., & Spiegelhalter, D. (2000). WinBUGS: A Bayesian modelling framework: Concepts, structure and extensibility. Statistics and Computing, 10, 325-337.
Pollard, R., & Reep, C. (1997). Measuring the effectiveness of playing strategies at soccer. Journal of the Royal Statistical Society: Series D (The Statistician), 46(4), 541-550.
Table IV. Estimation of the performance of the selected players
| Player | 00/01 | 01/02 | 02/03 | 03/04 | 04/05 | 05/06 | 06/07 | 07/08 | 08/09 |
| Etoo | 0.27 | 0.07 | 0.45 | 0.61 | 0.61 | 0.71 | 0.59 | 0.86 | 0.87 |
| Messi | | | | | 0.91 | 0.85 | 0.92 | 0.20 | 0.62 |
| Roberto Carlos | 0.72 | 0.60 | 0.77 | 0.84 | 0.52 | 0.72 | 0.71 | | |
| Ronaldo | | | 0.77 | 0.73 | 0.47 | 0.58 | 0.39 | | |
| Villa | | | | 0.46 | 0.46 | 0.71 | 0.23 | 0.71 | 0.89 |
| Raul | 0.95 | 0.47 | 0.76 | 0.36 | 0.29 | 0.23 | 0.12 | 0.75 | 0.81 |
R Development Core Team. (2009). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Rodríguez-Avi, J., Conde-Sánchez, A., Sáez-Castillo, A. J., Olmo-Jiménez, M. J., & Martínez-Rodríguez, A. M. (2009). A generalized Waring regression model for count data. Computational Statistics and Data Analysis, 53(10), 3717-3725.
Sáez-Castillo, A. J., Rodríguez-Avi, J., Conde-Sánchez, A., Olmo-Jiménez, M. J., & Martínez-Rodríguez, A. M. (2009). GWRM: Generalized Waring regression models. R package version 1.1.
Sáez-Castillo, A. J., Olmo-Jiménez, M. J., Pérez Sánchez, J. M., Negrín Hernández, M. A., Arcos-Navarro, A., & Díaz-Oller, J. (2010). Bayesian analysis of nosocomial infection risk and length of stay in a department of general and digestive surgery. Value in Health, 13(4), 431-439.
Xekalaki, E. (1983). The univariate generalized Waring distribution in relation to accident theory: Proneness, spells or contagion? Biometrics, 39, 887-895.
Appendix
The WinBUGS code for the empirical Bayes model is:
model;
{
  # season 1
  for (i in 1:m1) {
    goals[players.season.1[i], 1] ~ dpois(lam[players.season.1[i], 1])
    lam[players.season.1[i], 1] ~ dgamma(a[players.season.1[i], 1], ratio[players.season.1[i]])
    a[players.season.1[i], 1] <- covars.scaled[players.season.1[i], 4*1-3] *
      exp(betas[1] + inprod(covars.scaled[players.season.1[i], (4*1-3+1):(4*1)], betas[2:4])) *
      (ro-1) / k
  }
  # season 2
  for (i in 1:m2) {
    goals[players.season.2[i], 2] ~ dpois(lam[players.season.2[i], 2])
    lam[players.season.2[i], 2] ~ dgamma(a[players.season.2[i], 2], ratio[players.season.2[i]])
    a[players.season.2[i], 2] <- covars.scaled[players.season.2[i], (4*2-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.2[i], (4*2-3+1):(4*2)], betas[2:4])) *
      (ro-1) / k
  }
  # season 3
  for (i in 1:m3) {
    goals[players.season.3[i], 3] ~ dpois(lam[players.season.3[i], 3])
    lam[players.season.3[i], 3] ~ dgamma(a[players.season.3[i], 3], ratio[players.season.3[i]])
    a[players.season.3[i], 3] <- covars.scaled[players.season.3[i], (4*3-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.3[i], (4*3-3+1):(4*3)], betas[2:4])) *
      (ro-1) / k
  }
  # season 4
  for (i in 1:m4) {
    goals[players.season.4[i], 4] ~ dpois(lam[players.season.4[i], 4])
    lam[players.season.4[i], 4] ~ dgamma(a[players.season.4[i], 4], ratio[players.season.4[i]])
    a[players.season.4[i], 4] <- covars.scaled[players.season.4[i], (4*4-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.4[i], (4*4-3+1):(4*4)], betas[2:4])) *
      (ro-1) / k
  }
  # season 5
  for (i in 1:m5) {
    goals[players.season.5[i], 5] ~ dpois(lam[players.season.5[i], 5])
    lam[players.season.5[i], 5] ~ dgamma(a[players.season.5[i], 5], ratio[players.season.5[i]])
    a[players.season.5[i], 5] <- covars.scaled[players.season.5[i], (4*5-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.5[i], (4*5-3+1):(4*5)], betas[2:4])) *
      (ro-1) / k
  }
  # season 6
  for (i in 1:m6) {
    goals[players.season.6[i], 6] ~ dpois(lam[players.season.6[i], 6])
    lam[players.season.6[i], 6] ~ dgamma(a[players.season.6[i], 6], ratio[players.season.6[i]])
    a[players.season.6[i], 6] <- covars.scaled[players.season.6[i], (4*6-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.6[i], (4*6-3+1):(4*6)], betas[2:4])) *
      (ro-1) / k
  }
  # season 7
  for (i in 1:m7) {
    goals[players.season.7[i], 7] ~ dpois(lam[players.season.7[i], 7])
    lam[players.season.7[i], 7] ~ dgamma(a[players.season.7[i], 7], ratio[players.season.7[i]])
    a[players.season.7[i], 7] <- covars.scaled[players.season.7[i], (4*7-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.7[i], (4*7-3+1):(4*7)], betas[2:4])) *
      (ro-1) / k
  }
  # season 8
  for (i in 1:m8) {
    goals[players.season.8[i], 8] ~ dpois(lam[players.season.8[i], 8])
    lam[players.season.8[i], 8] ~ dgamma(a[players.season.8[i], 8], ratio[players.season.8[i]])
    a[players.season.8[i], 8] <- covars.scaled[players.season.8[i], (4*8-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.8[i], (4*8-3+1):(4*8)], betas[2:4])) *
      (ro-1) / k
  }
  # season 9
  for (i in 1:m9) {
    goals[players.season.9[i], 9] ~ dpois(lam[players.season.9[i], 9])
    lam[players.season.9[i], 9] ~ dgamma(a[players.season.9[i], 9], ratio[players.season.9[i]])
    a[players.season.9[i], 9] <- covars.scaled[players.season.9[i], (4*9-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.9[i], (4*9-3+1):(4*9)], betas[2:4])) *
      (ro-1) / k
  }
  for (i in 1:index.maximum) {
    ratio[i] <- 1 / v[i]
    v[i] <- (1-p[i]) / p[i]
    p[i] ~ dbeta(ro, k)
  }
}
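The factor (ro-1)/k in the shape a[.,.] can be read as a mean-matching correction. WinBUGS parameterises dgamma by shape and rate, so lam has conditional mean a*v with v = (1-p)/p, and for p ~ Beta(ro, k) with ro > 1 the expectation of v is k/(ro-1). The marginal mean of lam is therefore the multiplier column times the exponential of the linear predictor. The following Python sketch, which is not part of the original appendix and uses hypothetical function names and numeric values, checks this identity numerically.

```python
import math


def log_beta(x: float, y: float) -> float:
    """Natural logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def expected_v(ro: float, k: float) -> float:
    """E[(1 - p) / p] for p ~ Beta(ro, k); finite only when ro > 1."""
    if ro <= 1:
        raise ValueError("the expectation is infinite unless ro > 1")
    # Integrating (1 - p) / p against the Beta(ro, k) density gives
    # B(ro - 1, k + 1) / B(ro, k), which simplifies to k / (ro - 1).
    return math.exp(log_beta(ro - 1, k + 1) - log_beta(ro, k))


def gamma_shape(mu: float, ro: float, k: float) -> float:
    """Shape a[.,.] as coded in the listing: mu * (ro - 1) / k."""
    return mu * (ro - 1) / k


# Hypothetical values, chosen only for illustration.
mu, ro, k = 2.5, 6.0, 3.0
a = gamma_shape(mu, ro, k)
marginal_mean = a * expected_v(ro, k)  # E[lam] = a * E[v]
print(round(marginal_mean, 6))
```

Under these assumptions the printed marginal mean coincides with mu, whatever admissible values of ro and k are chosen.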
The WinBUGS code for the Poisson model is:
model;
{
  # season 1
  for (i in 1:m1) {
    goals[players.season.1[i], 1] ~ dpois(lam[players.season.1[i], 1])
    lam[players.season.1[i], 1] <- v[players.season.1[i]] * mu[players.season.1[i], 1]
    log(mu[players.season.1[i], 1]) <- log(covars.scaled[players.season.1[i], (4*1-3)]) +
      betas[1] + inprod(covars.scaled[players.season.1[i], (4*1-3+1):(4*1)], betas[2:4])
  }
  # season 2
  for (i in 1:m2) {
    goals[players.season.2[i], 2] ~ dpois(lam[players.season.2[i], 2])
    lam[players.season.2[i], 2] <- v[players.season.2[i]] * mu[players.season.2[i], 2]
    log(mu[players.season.2[i], 2]) <- log(covars.scaled[players.season.2[i], (4*2-3)]) +
      betas[1] + inprod(covars.scaled[players.season.2[i], (4*2-3+1):(4*2)], betas[2:4])
  }
  # season 3
  for (i in 1:m3) {
    goals[players.season.3[i], 3] ~ dpois(lam[players.season.3[i], 3])
    lam[players.season.3[i], 3] <- v[players.season.3[i]] * mu[players.season.3[i], 3]
    log(mu[players.season.3[i], 3]) <- log(covars.scaled[players.season.3[i], (4*3-3)]) +
      betas[1] + inprod(covars.scaled[players.season.3[i], (4*3-3+1):(4*3)], betas[2:4])
  }
  # season 4
  for (i in 1:m4) {
    goals[players.season.4[i], 4] ~ dpois(lam[players.season.4[i], 4])
    lam[players.season.4[i], 4] <- v[players.season.4[i]] * mu[players.season.4[i], 4]
    log(mu[players.season.4[i], 4]) <- log(covars.scaled[players.season.4[i], (4*4-3)]) +
      betas[1] + inprod(covars.scaled[players.season.4[i], (4*4-3+1):(4*4)], betas[2:4])
  }
  # season 5
  for (i in 1:m5) {
    goals[players.season.5[i], 5] ~ dpois(lam[players.season.5[i], 5])
    lam[players.season.5[i], 5] <- v[players.season.5[i]] * mu[players.season.5[i], 5]
    log(mu[players.season.5[i], 5]) <- log(covars.scaled[players.season.5[i], (4*5-3)]) +
      betas[1] + inprod(covars.scaled[players.season.5[i], (4*5-3+1):(4*5)], betas[2:4])
  }
  # season 6
  for (i in 1:m6) {
    goals[players.season.6[i], 6] ~ dpois(lam[players.season.6[i], 6])
    lam[players.season.6[i], 6] <- v[players.season.6[i]] * mu[players.season.6[i], 6]
    log(mu[players.season.6[i], 6]) <- log(covars.scaled[players.season.6[i], (4*6-3)]) +
      betas[1] + inprod(covars.scaled[players.season.6[i], (4*6-3+1):(4*6)], betas[2:4])
  }
  # season 7
  for (i in 1:m7) {
    goals[players.season.7[i], 7] ~ dpois(lam[players.season.7[i], 7])
    lam[players.season.7[i], 7] <- v[players.season.7[i]] * mu[players.season.7[i], 7]
    log(mu[players.season.7[i], 7]) <- log(covars.scaled[players.season.7[i], (4*7-3)]) +
      betas[1] + inprod(covars.scaled[players.season.7[i], (4*7-3+1):(4*7)], betas[2:4])
  }
  # season 8
  for (i in 1:m8) {
    goals[players.season.8[i], 8] ~ dpois(lam[players.season.8[i], 8])
    lam[players.season.8[i], 8] <- v[players.season.8[i]] * mu[players.season.8[i], 8]
    log(mu[players.season.8[i], 8]) <- log(covars.scaled[players.season.8[i], (4*8-3)]) +
      betas[1] + inprod(covars.scaled[players.season.8[i], (4*8-3+1):(4*8)], betas[2:4])
  }
  # season 9
  for (i in 1:m9) {
    goals[players.season.9[i], 9] ~ dpois(lam[players.season.9[i], 9])
    lam[players.season.9[i], 9] <- v[players.season.9[i]] * mu[players.season.9[i], 9]
    log(mu[players.season.9[i], 9]) <- log(covars.scaled[players.season.9[i], (4*9-3)]) +
      betas[1] + inprod(covars.scaled[players.season.9[i], (4*9-3+1):(4*9)], betas[2:4])
  }
  for (i in 1:index.maximum) {
    v[i] ~ dgamma(delta, delta)
  }
  delta ~ dexp(b)
  b ~ dexp(0.01)
  for (j in 1:4) {
    betas[j] ~ dnorm(0.0, t)I(-100, 100)
  }
  t <- 0.01  # Variance = 1/t
}
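In the Poisson listing the player effect v[i] has a gamma(delta, delta) prior, so its prior mean is 1 and its prior variance is 1/delta. Because goals are Poisson with mean v[i]*mu, the full conditional of v[i] given delta and betas is again a gamma distribution, with shape delta plus the player's total goals and rate delta plus the sum of the corresponding mu values. The Python sketch below illustrates that conjugate update; it is not taken from the paper, and the function names and example numbers are hypothetical.

```python
def player_effect_posterior(goals, mus, delta):
    """Shape and rate of the gamma full conditional of v for one player.

    goals: observed goal counts, one per season played
    mus:   matching values of mu (multiplier * exp(linear predictor))
    delta: shape and rate of the gamma(delta, delta) prior on v
    """
    if len(goals) != len(mus):
        raise ValueError("goals and mus must have the same length")
    if delta <= 0:
        raise ValueError("delta must be positive")
    shape = delta + sum(goals)  # prior shape plus total goals
    rate = delta + sum(mus)     # prior rate plus total expected goals
    return shape, rate


def posterior_mean(goals, mus, delta):
    """Conditional posterior mean of v: shape divided by rate."""
    shape, rate = player_effect_posterior(goals, mus, delta)
    return shape / rate


# Hypothetical player: two seasons, scoring above the covariate-based mean.
print(posterior_mean([10, 12], [8.0, 9.0], delta=4.0))
```

A larger delta pulls the conditional mean towards 1, that is, towards a player whose scoring is fully explained by the covariates; a smaller delta lets the observed goal totals dominate.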
The WinBUGS code for the full Bayes model is:
model;
{
  # season 1
  for (i in 1:m1) {
    goals[players.season.1[i], 1] ~ dpois(lam[players.season.1[i], 1])
    lam[players.season.1[i], 1] ~ dgamma(a[players.season.1[i], 1], ratio[players.season.1[i]])
    a[players.season.1[i], 1] <- covars.scaled[players.season.1[i], 4*1-3] *
      exp(betas[1] + inprod(covars.scaled[players.season.1[i], (4*1-3+1):(4*1)], betas[2:4])) *
      (ro-1) / k
  }
  # season 2
  for (i in 1:m2) {
    goals[players.season.2[i], 2] ~ dpois(lam[players.season.2[i], 2])
    lam[players.season.2[i], 2] ~ dgamma(a[players.season.2[i], 2], ratio[players.season.2[i]])
    a[players.season.2[i], 2] <- covars.scaled[players.season.2[i], (4*2-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.2[i], (4*2-3+1):(4*2)], betas[2:4])) *
      (ro-1) / k
  }
  # season 3
  for (i in 1:m3) {
    goals[players.season.3[i], 3] ~ dpois(lam[players.season.3[i], 3])
    lam[players.season.3[i], 3] ~ dgamma(a[players.season.3[i], 3], ratio[players.season.3[i]])
    a[players.season.3[i], 3] <- covars.scaled[players.season.3[i], (4*3-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.3[i], (4*3-3+1):(4*3)], betas[2:4])) *
      (ro-1) / k
  }
  # season 4
  for (i in 1:m4) {
    goals[players.season.4[i], 4] ~ dpois(lam[players.season.4[i], 4])
    lam[players.season.4[i], 4] ~ dgamma(a[players.season.4[i], 4], ratio[players.season.4[i]])
    a[players.season.4[i], 4] <- covars.scaled[players.season.4[i], (4*4-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.4[i], (4*4-3+1):(4*4)], betas[2:4])) *
      (ro-1) / k
  }
  # season 5
  for (i in 1:m5) {
    goals[players.season.5[i], 5] ~ dpois(lam[players.season.5[i], 5])
    lam[players.season.5[i], 5] ~ dgamma(a[players.season.5[i], 5], ratio[players.season.5[i]])
    a[players.season.5[i], 5] <- covars.scaled[players.season.5[i], (4*5-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.5[i], (4*5-3+1):(4*5)], betas[2:4])) *
      (ro-1) / k
  }
  # season 6
  for (i in 1:m6) {
    goals[players.season.6[i], 6] ~ dpois(lam[players.season.6[i], 6])
    lam[players.season.6[i], 6] ~ dgamma(a[players.season.6[i], 6], ratio[players.season.6[i]])
    a[players.season.6[i], 6] <- covars.scaled[players.season.6[i], (4*6-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.6[i], (4*6-3+1):(4*6)], betas[2:4])) *
      (ro-1) / k
  }
  # season 7
  for (i in 1:m7) {
    goals[players.season.7[i], 7] ~ dpois(lam[players.season.7[i], 7])
    lam[players.season.7[i], 7] ~ dgamma(a[players.season.7[i], 7], ratio[players.season.7[i]])
    a[players.season.7[i], 7] <- covars.scaled[players.season.7[i], (4*7-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.7[i], (4*7-3+1):(4*7)], betas[2:4])) *
      (ro-1) / k
  }
  # season 8
  for (i in 1:m8) {
    goals[players.season.8[i], 8] ~ dpois(lam[players.season.8[i], 8])
    lam[players.season.8[i], 8] ~ dgamma(a[players.season.8[i], 8], ratio[players.season.8[i]])
    a[players.season.8[i], 8] <- covars.scaled[players.season.8[i], (4*8-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.8[i], (4*8-3+1):(4*8)], betas[2:4])) *
      (ro-1) / k
  }
  # season 9
  for (i in 1:m9) {
    goals[players.season.9[i], 9] ~ dpois(lam[players.season.9[i], 9])
    lam[players.season.9[i], 9] ~ dgamma(a[players.season.9[i], 9], ratio[players.season.9[i]])
    a[players.season.9[i], 9] <- covars.scaled[players.season.9[i], (4*9-3)] *
      exp(betas[1] + inprod(covars.scaled[players.season.9[i], (4*9-3+1):(4*9)], betas[2:4])) *
      (ro-1) / k
  }
  for (i in 1:index.maximum) {
    ratio[i] <- 1 / v[i]
    v[i] <- (1-p[i]) / p[i]
    p[i] ~ dbeta(ro, k)
  }
  ro ~ dexp(0.01)
  k ~ dexp(0.01)
  for (j in 1:4) {
    betas[j] ~ dnorm(0.0, t)I(-10, 10)
  }
  t <- 0.01
}
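All three listings address covars.scaled with the same arithmetic: for season s, column 4*s-3 enters as a multiplier (or as a log offset in the Poisson model), and columns 4*s-2 to 4*s are combined with betas[2:4] through inprod. The Python sketch below mirrors that column layout for readers who prepare the data matrix outside WinBUGS. It is an illustration only: the function names are hypothetical, and calling the first column of each block an "exposure" is an interpretation of its role in the code, not a term taken from the listings.

```python
def season_columns(season: int, block: int = 4):
    """Return (multiplier_col, covariate_cols) as 1-based column numbers.

    Mirrors the WinBUGS indices (4*s-3) and (4*s-3+1):(4*s).
    """
    if season < 1:
        raise ValueError("season must be a positive integer")
    first = block * season - (block - 1)  # 4*s - 3
    last = block * season                 # 4*s
    return first, list(range(first + 1, last + 1))


def split_row(row, season: int):
    """Pick one season's multiplier and covariates from a 0-based Python row."""
    multiplier_col, covariate_cols = season_columns(season)
    # Subtract 1 because WinBUGS columns are 1-based and Python lists are 0-based.
    exposure = row[multiplier_col - 1]
    covariates = [row[c - 1] for c in covariate_cols]
    return exposure, covariates


# Nine seasons need 4 * 9 = 36 columns per player.
for s in (1, 2, 9):
    print(s, season_columns(s))
```

With nine seasons each player row therefore holds 36 columns, and the three covariates of a season always follow its multiplier column directly.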