This article was downloaded by: [Florida Atlantic University] On: 24 September 2013, At: 12:24 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK European Journal of Sport Science Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tejs20 Expected number of goals depending on intrinsic and extrinsic factors of a football player. An application to professional Spanish football league Antonio Sáez Castillo a , José Rodríguez Avi a & José María Pérez Sánchez b a Department of Statistics and Operational Research, University of Jaén, Jaén, Spain b Department of Quantitative Methods, University of Granada, Granada, Spain Published online: 14 Dec 2011. To cite this article: Antonio Sáez Castillo , José Rodríguez Avi & José María Pérez Sánchez (2013) Expected number of goals depending on intrinsic and extrinsic factors of a football player. An application to professional Spanish football league, European Journal of Sport Science, 13:2, 127-138, DOI: 10.1080/17461391.2011.589473 To link to this article: http://dx.doi.org/10.1080/17461391.2011.589473 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. 
Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions

ORIGINAL ARTICLE

Expected number of goals depending on intrinsic and extrinsic factors of a football player. An application to professional Spanish football league

ANTONIO SÁEZ CASTILLO¹, JOSÉ RODRÍGUEZ AVI¹, & JOSÉ MARÍA PÉREZ SÁNCHEZ²

¹Department of Statistics and Operational Research, University of Jaén, Jaén, Spain, and ²Department of Quantitative Methods, University of Granada, Granada, Spain

Abstract

A Bayesian regression model for the number of goals scored by players in the Spanish football league during nine seasons is fitted. The model handles overdispersion in such a way that an individual footballer's scoring ability may be estimated regardless of the number of minutes played, the position on the field and the team in which he plays. Additionally, the posterior predictive distributions of the fitted model make it possible to evaluate the performance of any player in each season with reference to the number of goals scored, by locating that number as a quantile in the expected distribution of the goals scored by this player in this season. The results show that the model rewards goals scored by defenders and midfielders, because it does not expect them to score many, and evaluates a forward more strictly in relation to the expected number of goals.

Keywords: Goal scoring ability, goal scoring performance, Bayesian analysis, count data, Poisson regression, negative binomial regression, overdispersion

1. Introduction

The fact that a player scores a goal may depend on readily measurable factors, such as the number of minutes or matches played, the position on the pitch and the team quality. But it also depends on other factors which are more difficult to measure, such as the individual characteristics of the player that make him different from all the rest.

The main aim of this paper is to evaluate the goal scoring ability and the goal scoring performance of Spanish footballers using regression models that describe the number of goals scored during the current season, and subsequently update it with the result obtained in the current season.

We employ hierarchical Bayesian regression models in which, given the individual characteristics of a player, the number of goals scored in a season follows a Poisson or a negative binomial distribution. The mean of this distribution, which is unobservable, is modelled taking into account the player's environmental conditions (in a deterministic fashion) and his individual conditions (in a probabilistic way). Our purpose with this type of model is twofold:

• First, we try to quantify, in terms of its posterior distribution, the individual component that each player contributes to his scoring average due to his own characteristics, regardless of the external conditions of his play.

• Secondly, we also want to estimate the posterior predictive distribution for the number of goals scored by each player in the last season, based on his background in the Spanish League competition. In this way, considering the number of goals scored in this last season, the models can provide an evaluation of the performance of the player with respect to goal scoring, locating the quantile within the posterior distribution.

Correspondence: José María Pérez Sánchez, Department of Quantitative Methods, Faculty of Business and Economics Sciences, University of Granada, Campus universitario de Cartuja, s/n. 18011 Granada, Spain. E-mail: [email protected]

© 2013 European College of Sport Science

We focus on the count variable number of goals scored by a player during a season. However, the methodology could be applied to the analysis of other outcomes that may be quantified by count data variables, such as the number of yellow or red cards, goal assists, intercepted passes, tackles, etc.

Previously published works developing predictive models for the number of goals scored have primarily focused on the final result of matches and have not been concerned with player performance. Many of these models are based on Poisson models or can be derived from them (e.g. see Crowder, Dixon, Ledford, & Robinson, 2002). Other studies have taken player behaviour into account, but have analysed the effectiveness of team strategies rather than individual performance (Ensum, Pollard, & Taylor, 2004; Hughes & Franks, 2005; Pollard & Reep, 1997). In turn, hierarchical Poisson models have been used to analyse different kinds of data using a range of approaches to deal with the overdispersion that frequently affects such data (Dey, Ghosh, & Mallick, 2000, pp. 94–96) or, more generally, to model characteristics of data that cannot be described by the Poisson model (see, for example, Ludwig & Osuna-Echavarría, 2006). Furthermore, negative binomial regression has been discussed in detail in Hilbe (2007). Negative binomial distributions have also been used for fitting data on goal scoring (see Baxter & Stevenson, 1988; Lee, 2002); these studies deal with modelling scores in association football and rugby leagues, respectively. In the same way, a model based on an extension of the negative binomial distribution, known as the Generalized Waring Regression model (GWRM), has been employed by Rodríguez-Avi, Conde-Sánchez, Sáez-Castillo, Olmo-Jiménez, and Martínez-Rodríguez (2009) for fitting the number of goals scored in relation to several covariates in the Spanish football league. The main drawback of that GWRM in this context is its frequentist point of view, which makes it impossible to estimate players' individual ability. Finally, Goddard (2005) studied both goals and match results by using regression models for forecasting purposes.

2. Models

We have employed two different models, taking into account two different sampling distributions for the number of goals. The first employs Poisson sampling and takes into account the individual characteristics of the player through a Gamma distribution. The second starts from negative binomial sampling, to allow for the possibility of overdispersion in the sampling, and models individual heterogeneity by a BetaII distribution.

The notation which we consider in both models is:

• $y_{i,j}$ is the number of goals scored by the ith player in the jth season.

• $t_{i,j}$ is the number of minutes that the ith footballer plays in the jth season, considered as an offset.

• $x_{i,j}$ is the vector of explanatory variables of the ith player in the jth season. Specifically, we have considered two explanatory variables:

1. The position of the ith player in the jth season (defence, midfielder or forward), according to the database from which the data were obtained. It has been considered as a factor, with two dummy variables and defence as the reference level.

2. The rank of the ith player's team in the jth season.

Thus, $x'_{i,j} = (1, \text{midfielder}_{i,j}, \text{forward}_{i,j}, \text{classification}_{i,j})$, where $\text{midfielder}_{i,j}$ and $\text{forward}_{i,j}$ are dummy variables and $\text{classification}_{i,j}$ is the rank of the ith player's team in the jth season, from 1 to 20.
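The coding of the covariate vector described above can be sketched in a few lines of Python. This is only an illustration: the function name `design_vector` and the position labels are assumptions made for the example, not names taken from the authors' code or database.

```python
# Hypothetical helper (not from the paper): builds the covariate vector
# x_{i,j} = (1, midfielder, forward, classification) with defence as the
# reference level of the position factor.
POSITIONS = ("defence", "midfielder", "forward")


def design_vector(position, team_rank):
    """Return (intercept, midfielder dummy, forward dummy, team rank)."""
    if position not in POSITIONS:
        raise ValueError(f"unknown position: {position!r}")
    if not 1 <= team_rank <= 20:
        raise ValueError("team rank must be between 1 and 20")
    return (
        1.0,
        float(position == "midfielder"),
        float(position == "forward"),
        float(team_rank),
    )
```

A defender is coded with both dummies at zero, so the intercept alone carries the reference level.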

It is important to take into account that most of the players do not play in all the seasons, and some of them even play in non-consecutive seasons. In the analysis of the data, we have handled this by considering in each season only those players who participate in it. The dataset has been extracted from the database of the Spanish sports daily newspaper MARCA. This newspaper has a daily readership of over 2,749,000, the highest in Spain for a daily newspaper, and more than half of sports readership. Regarding the covariates considered, we think that the most controversial is the player's position on the pitch: it is not easy to determine a specific player position (defence, midfielder or forward), and this position may not be constant throughout a season. Nevertheless, we have decided to treat the data as they appear in the original database.

2.1. Poisson model

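We consider a Poisson model in which $y_{i,j} \mid \lambda_{i,j} \sim \text{Poisson}(\lambda_{i,j})$, where $\lambda_{i,j} = v_i \mu_{i,j}$, so

\[ \Pr(y_{i,j} = y \mid v_i, \mu_{i,j}) = e^{-v_i \mu_{i,j}} \frac{(v_i \mu_{i,j})^{y}}{y!}, \quad y = 0, 1, 2, \ldots, \]

where

\[ \mu_{i,j} = t_{i,j} \exp(x'_{i,j} \beta). \]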


In this way, $v_i$ represents the individual component of each player, that is, his own contribution to the scoring average regardless of the effect of the covariates, taking into account only those characteristics that make him unique. Because of this, we think that $v_i$ is related to the ability of the player in terms of goal scoring. Thus, high values of $v_i$ would indicate that the scoring average is greater than that determined by the covariates, and vice versa. On the other hand, the effect of the covariates appears in $\mu_{i,j}$, and it may change from one season to another.

To allow for heterogeneity not explained by the covariates, $v_i$ is considered as a Gamma($\delta$, $\delta$) random variable, with probability density function

\[ p(v_i \mid \delta) = \frac{\delta^{\delta}}{\Gamma(\delta)} v_i^{\delta - 1} e^{-\delta v_i}. \]

By specifying a Gamma distribution for $v_i$ with shape and rate parameters both equal to $\delta$, the negative binomial model is derived (see Cameron & Trivedi, 1998). The distribution of $v_i$ is held constant throughout the seasons because it represents the individual ability which is intrinsic to the player. Note that, due to the presence of $\beta_0$, the mean of $v_i$ can be fixed at one, avoiding a nuisance parameter and controlling only its variance with $\delta$.
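As a numerical check of this derivation, the following sketch compares the closed-form negative binomial probability with a direct numerical integration of the Poisson probability against the Gamma($\delta$, $\delta$) density. It is an illustration only: the function names and the values of $\mu$ and $\delta$ are assumptions chosen for the example, and the integration grid is a simple midpoint rule rather than anything used by the authors.

```python
import math


def nb_pmf(y, mu, delta):
    """Closed-form marginal Pr(Y = y) when Y | v ~ Poisson(v * mu)
    and v ~ Gamma(shape=delta, rate=delta), so that E[v] = 1."""
    log_p = (
        math.lgamma(delta + y) - math.lgamma(delta) - math.lgamma(y + 1)
        + delta * math.log(delta / (delta + mu))
        + y * math.log(mu / (delta + mu))
    )
    return math.exp(log_p)


def mixture_pmf(y, mu, delta, grid=20000, upper=20.0):
    """Same probability, obtained by integrating the Poisson pmf against
    the Gamma density with a midpoint rule on (0, upper)."""
    h = upper / grid
    total = 0.0
    for n in range(grid):
        v = (n + 0.5) * h
        log_gamma_pdf = (
            delta * math.log(delta) - math.lgamma(delta)
            + (delta - 1.0) * math.log(v) - delta * v
        )
        log_poisson = -v * mu + y * math.log(v * mu) - math.lgamma(y + 1)
        total += math.exp(log_gamma_pdf + log_poisson) * h
    return total


# Illustrative values only (not estimates from the paper).
print(nb_pmf(3, 5.0, 2.7), mixture_pmf(3, 5.0, 2.7))
```

The marginal mean stays at $\mu$ while the variance becomes $\mu + \mu^2/\delta$, which is how $\delta$ controls the overdispersion.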

The hyperparameters $\beta_h$ ($h = 0, 1, 2, 3$) follow non-informative normal prior distributions N(0, 100). We propose a flexible hierarchical prior structure for $\delta$, with $\delta \sim \text{Exp}(b)$ and $b \sim \text{Exp}(0.01)$, where the hyperparameter $b$ follows an exponential distribution with a large variance (Var($b$) = 10,000), indicating the absence of prior information. Similar assumptions for the parameters of the hyperprior distributions can be found in Sáez-Castillo, Olmo-Jiménez, Pérez Sánchez, Negrín Hernández, Arcos-Navarro, and Díaz-Oller (2010).

Posterior distributions are obtained by applying MCMC methods implemented in the WinBUGS software (Lunn, Thomas, Best, & Spiegelhalter, 2000). The code is provided in an Appendix section.

2.2. Negative binomial model. Empirical and full Bayes versions

Assuming Poisson sampling may be too restrictive, so we have proposed a second model in which the negative binomial distribution may explain overdispersion in the sampling. Specifically, we consider $y_{i,j} \mid \lambda_{i,j} \sim \text{Poisson}(\lambda_{i,j})$, where $\lambda_{i,j} \sim \text{Gamma}(a_{i,j}, 1/v_i)$ (shape $a_{i,j}$ and rate $1/v_i$) or, equivalently, $y_{i,j} \mid x_{i,j}, v_i \sim \text{NB}(a_{i,j}, v_i)$, with

\[ \Pr(y_{i,j} = y \mid x_{i,j}, v_i) = \frac{\Gamma(a_{i,j} + y)}{\Gamma(a_{i,j})\, y!} \left( \frac{1}{1 + v_i} \right)^{a_{i,j}} \left( \frac{v_i}{1 + v_i} \right)^{y}, \]

in such a way that $E[y_{i,j} \mid x_{i,j}, v_i] = a_{i,j} v_i$. Additionally,

• $v_i \sim \text{BetaII}(\rho, k)$, with probability density function

\[ p(v_i) = \frac{\Gamma(k + \rho)}{\Gamma(k)\Gamma(\rho)} v_i^{k-1} (1 + v_i)^{-(k+\rho)}. \]

• $a_{i,j} = t_{i,j}\, e^{x'_{i,j}\beta}\, \dfrac{\rho - 1}{k}$, where $t_{i,j}$ is the offset.

$v_i$ again represents the individual ability, the contribution that the unique conditions of each player make to the scoring average. We have preferred to maintain the same notation for this component as in the Poisson model, although the scale is not comparable between the two models. On the other hand, the effect of the covariates on the scoring average is now included in $a_{i,j}$.

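If $k$, $\rho$ and $\beta$ have constant values, the model leads to a prior predictive distribution for $y_{i,j} \mid x_{i,j}$ given by a univariate generalized Waring distribution UGWD($a_{i,j}$, $k$, $\rho$) (Irwin, 1968; Xekalaki, 1983) with probabilities

\[ \Pr(y_{i,j} \mid a_{i,j}, k, \rho) = \frac{\Gamma(a_{i,j} + \rho)\,\Gamma(k + \rho)}{\Gamma(\rho)\,\Gamma(a_{i,j} + k + \rho)} \, \frac{(a_{i,j})_{y_{i,j}} (k)_{y_{i,j}}}{(a_{i,j} + k + \rho)_{y_{i,j}}} \, \frac{1}{y_{i,j}!}, \quad y_{i,j} = 0, 1, \ldots, \]

where $(\alpha)_x = \dfrac{\Gamma(\alpha + x)}{\Gamma(\alpha)}$ for any $\alpha > 0$ and any integer $x$.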
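The UGWD probabilities above can be evaluated stably on the log scale. The sketch below is a minimal illustration using only Python's standard library; the function names and the parameter values in the final lines are assumptions for the example and are not taken from the authors' Appendix code.

```python
import math


def log_pochhammer(alpha, x):
    """log of (alpha)_x = Gamma(alpha + x) / Gamma(alpha)."""
    return math.lgamma(alpha + x) - math.lgamma(alpha)


def ugwd_pmf(y, a, k, rho):
    """Pr(Y = y) for a univariate generalized Waring distribution UGWD(a, k, rho)."""
    log_p = (
        math.lgamma(a + rho) + math.lgamma(k + rho)
        - math.lgamma(rho) - math.lgamma(a + k + rho)
        + log_pochhammer(a, y) + log_pochhammer(k, y)
        - log_pochhammer(a + k + rho, y)
        - math.lgamma(y + 1)
    )
    return math.exp(log_p)


# Illustrative parameter values only.
probs = [ugwd_pmf(y, 2.0, 3.7, 25.22) for y in range(2000)]
print(sum(probs))
```

Working with `math.lgamma` avoids overflow in the Gamma ratios when the goal count or the parameters are large.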

Taking into account that the BetaII distribution is conjugate under negative binomial sampling, the posterior predictive distribution for the number of goals scored by the ith player in the (j+1)th season, given the information about all the previous seasons and the covariates of the current season, is

\[ y_{i,j+1} \mid (y_{i,l}, x_{i,l})_{l=1}^{j},\, x_{i,j+1} \sim \text{UGWD}\left( a_{i,j+1},\; k + \sum_{l=1}^{j} y_{i,l},\; \sum_{l=1}^{j} a_{i,l} + \rho \right). \]

So, a first version of the model may be fitted by eliciting values for $k$, $\rho$ and $\beta$. We will call this version empirical Bayes; it is related to the GWRM from a frequentist point of view.
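The conjugate update behind the posterior predictive distribution amounts to simple bookkeeping on the parameters. The following sketch assumes fixed values of $k$ and $\rho$, as in the empirical Bayes version; the function names and the numbers are hypothetical. The posterior mean of $v_i$ follows from the BetaII density given earlier, whose mean is $k/(\rho - 1)$ for $\rho > 1$.

```python
def posterior_predictive_params(a_next, k, rho, past_goals, past_a):
    """Parameters (a, k, rho) of the UGWD for season j+1, given the goals
    y_{i,1..j} and the covariate terms a_{i,1..j} of the previous seasons."""
    if len(past_goals) != len(past_a):
        raise ValueError("past_goals and past_a must have the same length")
    return (a_next, k + sum(past_goals), rho + sum(past_a))


def posterior_mean_ability(k, rho, past_goals, past_a):
    """Posterior mean of v_i under the updated BetaII distribution."""
    k_post = k + sum(past_goals)
    rho_post = rho + sum(past_a)
    if rho_post <= 1.0:
        raise ValueError("the mean exists only when the updated rho exceeds 1")
    return k_post / (rho_post - 1.0)


# Hypothetical player with three previous seasons.
print(posterior_predictive_params(1.5, 3.0, 20.0, [2, 0, 5], [1.0, 0.5, 2.0]))
```

Goals scored raise the second parameter and accumulated exposure raises the third, so a player who scores more than his covariates suggest ends up with a larger posterior mean ability.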

Nevertheless, eliciting $k$, $\rho$ and $\beta$ from the same dataset (because there is no previous information) may be controversial. Because of this, we assign weakly informative hyperprior distributions to these parameters. In this sense, we have proposed a full Bayes model to cope with the variability of the parameters.

Again, we propose

• $\rho \sim \text{Exp}(0.01)$,

• $k \sim \text{Exp}(0.01)$ and

• $\beta_h \sim N(0, 100)$, for $h = 0, 1, 2, 3$.

In terms of goodness of fit, this model has been the most accurate. As in the Poisson model, posterior distributions are obtained by applying MCMC methods implemented in the WinBUGS software (Lunn et al., 2000). Again, the code is included in the Appendix section.

3. Estimation of goal scoring ability and performance

Once both models have been fitted, we have employed the posterior distribution of each $v_i$ to establish a ranking of the individual ability of all the players, sorting by the posterior means. As we have commented previously, the scales of $v_i$ are not comparable across the models, but the rankings may be compared; in fact, we will see that the classifications according to both models (Poisson and full Bayes negative binomial) are very close.

Additionally, we have also employed both models to evaluate the goal scoring performance of each player. Specifically, we consider the ith player, whose career is known until the jth season; in addition, at the end of the (j+1)th season we can estimate the expected distribution for his number of goals, $y_{i,j+1} \mid (y_{i,l}, x_{i,l})_{l=1}^{j},\, x_{i,j+1}$. Finally, we can locate the actual number of goals scored by this ith player in this last (j+1)th season in the context of its expected distribution, as a quantile, in such a way that this quantile is an evaluation of the goal scoring performance of this player in this last season, taking into account all his career and his individual ability. Thus, if the actual number of goals scored is over the median, we could say that his performance has been better than expected, and vice versa. Note that, in this way, the predictive distribution is not used for predictive purposes, but only to assess the position of the actual number of goals scored.

It must be taken into account that we only have an explicit, closed-form expression for the distribution of $y_{i,j+1} \mid (y_{i,l}, x_{i,l})_{l=1}^{j},\, x_{i,j+1}$ in the empirical Bayes version of the negative binomial model (that is, a UGWD). In the case of the Poisson model and the full Bayes version of the negative binomial model, where there is no explicit expression for the probabilities of $y_{i,j+1} \mid (y_{i,l}, x_{i,l})_{l=1}^{j},\, x_{i,j+1}$, we have taken into account that these probabilities are the average of the conditional Poisson probabilities, that is,

\[ \Pr(y_{i,j+1} \le y) = \int_{0}^{\infty} \Pr(y_{i,j+1} \le y \mid \lambda_{i,j+1})\, f(\lambda_{i,j+1})\, d\lambda_{i,j+1} = E\left[ F(y \mid \lambda_{i,j+1}) \right], \]

where $F$ refers to the Poisson distribution function (in both models). So we have simulated large samples of the goal scoring average of each player, $\lambda_{i,j+1}$, and then we have estimated the probability at or below the actual number of goals scored, $\Pr(y_{i,j+1} \le y)$, as the sample mean of the conditional Poisson probabilities, $\Pr(y_{i,j+1} \le y \mid \lambda_{i,j+1})$, given the simulated $\lambda_{i,j+1}$ values.
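The Monte Carlo step just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: in the paper the $\lambda_{i,j+1}$ draws come from the MCMC output, whereas here they are replaced by hypothetical Gamma draws from Python's `random` module simply so that the example runs on its own. The function names are also assumptions.

```python
import math
import random


def poisson_cdf(y, lam):
    """F(y | lam) = Pr(Y <= y) for Y ~ Poisson(lam), summed term by term."""
    term = math.exp(-lam)
    total = term
    for n in range(1, y + 1):
        term *= lam / n
        total += term
    return min(total, 1.0)


def performance_quantile(goals, lambda_draws):
    """Estimate Pr(y_{i,j+1} <= goals) as the sample mean of the
    conditional Poisson probabilities over the simulated scoring averages."""
    return sum(poisson_cdf(goals, lam) for lam in lambda_draws) / len(lambda_draws)


# Stand-in for posterior draws of the scoring average (hypothetical values;
# gammavariate takes a shape and a scale parameter).
rng = random.Random(0)
draws = [rng.gammavariate(6.0, 2.0) for _ in range(5000)]
print(performance_quantile(12, draws))
```

A value near 1 means the observed goal count sits high in the expected distribution, and a value near 0 means it sits low.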

4. Models fit

The statistical methods consisted of two steps: (1) estimation of the regression models and (2) analysis of goodness of fit using the deviance information criterion (DIC). The final sample comprised 1599 players from the Spanish football league from the 2000/2001 to the 2008/2009 seasons (nine seasons).

Bayesian estimation of the Poisson and full Bayes negative binomial models was carried out by applying MCMC techniques. Four chains of 100,000 samples were recorded after a burn-in sample of 100,000 for both models. Several diagnostics were obtained to check the convergence of the simulations, using tests provided within the WinBUGS Convergence Diagnostics and Output Analysis software (CODA). We have used the following procedure to establish initial values for the full Bayes negative binomial model:

• First, we have estimated $k$, $\rho$ and $\beta$ by maximum likelihood in a GWRM model by means of the GWRM package (Sáez-Castillo, Rodríguez-Avi, Conde-Sánchez, Olmo-Jiménez, & Martínez-Rodríguez, 2009) for R (R Development Core Team, 2009), joining all the seasons in a single dataset. We must clarify that the GWRM model proposes a general UGWD($a_i$, $k$, $\rho$) distribution for the number of goals scored, $y_i$, where $a_i = \exp(x'_i \beta)\, \dfrac{\rho - 1}{k}$, in such a way that the mean is $\mu_i = \exp(x'_i \beta)$, but it does not allow the season to be taken into account in the way that we are considering and, additionally, it only proposes a general BetaII($\rho$, $k$) distribution for the individual heterogeneity (the same for all the players), which is of no use for our purposes.

• Secondly, we have calculated the corresponding values of $a_{i,j}$, given $k$, $\rho$ and $\beta$.

So, in this way, we use the estimates of $k$, $\rho$, $\beta$ and $a_{i,j}$ from the empirical Bayes version of the negative binomial model as initial values for the full Bayes version.

The most accurate model, in terms of the DIC value, was the full Bayesian version of the negative binomial model (DIC = 11,260.1), so we are going to focus only on the results provided by this model. The DIC values for the Poisson model and the empirical Bayes version of the negative binomial model are 11,340.6 and 11,271.2, respectively.

Table I and Table II contain the $\beta$ estimates and the estimates of the remaining parameters of the models. We would like to highlight that the $\beta$ estimates are statistically relevant and quite similar in the three models.

5. Results

5.1. Estimation of the goal scoring ability. Ranking of the most able players

To summarise the results that the full Bayes version of the negative binomial model provides about the ability of the players by means of $v_i$, we have highlighted only the best five players in each position on the field, sorting by the mean of the variable $v_i$; this mean is usually quite similar to the median, due to the symmetry of the distribution of most of the $v_i$. As we have commented before, this ranking is very close to that provided by the Poisson model, as shown in Figure 1, where the estimated posterior means of $v_i$ under both models appear as Cartesian coordinates.

Table III shows the results. The best five defenders are, in this order, Ezequiel Garay, Roberto Carlos, Campano, Cristian Alvarez and Larrazabal. In fact, Ezequiel Garay appears as the third best player in terms of his goal scoring ability among all the players of the Spanish football league, and this may be due to several factors: first, he was responsible for penalty kicks in his two seasons at Racing de Santander (his signing for Real Madrid is not included in the dataset because it took place in the 2009/2010 season); secondly, he plays as a defender in a mid-table team, so his not very high scoring average is very well valued by the model. Roberto Carlos is a different case, because it is well known that in his career at Real Madrid he was an effective defender with a special talent for goal scoring, mainly due to his strong shot; in the general ranking he is the fifth best player.

The best midfielders are Rivaldo, Robert, Luis Cembranos, Mark Gonzalez and Mostovoi. They include players such as Rivaldo and Mostovoi, who are classified as midfielders but really played in very advanced positions, so that many people consider them forwards. Robert is a player who did not play many minutes, but did so with a high performance, taking into account also that he played for a team in the medium-low part of the classification (R.C.D. Betis) in most of the seasons.

Finally, the most able forwards, according to this ranking, are Messi, Ronaldo, Makaay, Villa and Etoo. It is striking that they do not appear in very high global positions. In fact, apart from Messi, who is in sixth position, the rest appear below the 15th position. That is because the model takes into account that they are forwards, so it expects them to score many goals, without this meaning that they have a special ability which cannot be explained by their position on the field; moreover, we have seen how the model rewards in the ranking the fact that defenders and midfielders score goals, because scoring is not their main objective on the field.

Table I. β estimates

Model               Node   Mean      S.D.     MC error     2.5%       Median     97.5%
Full Bayesian NB    β0     -3.178    0.0522   0.00467      -3.291     -3.172     -3.084
                    β1     -0.2243   0.0271   0.00121      -0.2796    -0.224     -0.173
                    β2     1.076     0.0634   0.00552      0.967      1.071      1.21
                    β3     1.879     0.0672   0.00591      1.752      1.879      2.017
Poisson             β0     -3.157    0.0468   0.0019489    -3.248     -3.1578    -3.0628
                    β1     -0.221    0.0263   3.1884E-4    -0.27303   -0.22087   -0.16949
                    β2     1.071     0.0564   0.0021388    0.95919    1.0719     1.1806
                    β3     1.801     0.0605   0.0023942    1.6803     1.8025     1.9183
Empirical Bayes NB  β0     -3.1729
                    β1     -0.3014
                    β2     1.0023
                    β3     2.1096

Table II. Other parameters estimates

Model               Node   Mean      S.D.     MC error     2.5%       Median     97.5%
Full Bayesian NB    k      3.699     0.4139   0.04058      3.133      3.629      4.853
                    ρ      25.22     2.295    0.2226       20.0       25.33      29.26
Poisson             b      0.7426    0.5275   0.002998     0.09055    0.6205     2.077
                    s      2.706     0.2138   0.003874     2.314      2.695      3.152
Empirical Bayes NB  k      19.640
                    ρ      31.987

5.2. Estimation of the goal scoring performance

We have selected five of the players who appear in Table III to evaluate their goal scoring performance in the different seasons. We have also included Raul in this analysis, because he is an iconic player for many people in the Spanish football league. As we have described before, this estimation is made by locating the actual number of goals scored as a quantile of the expected distribution of the number of goals in this season. Remember that we need to approximate the related probabilities of the expected distributions as averages of the corresponding Poisson probabilities given simulated sample values of $\lambda_{i,j}$, for any player i in each season j involved, so the computational effort is great. In this way, we are quantifying the performance of the player with reference to his goal scoring, taking into account his position on the field, the number of minutes played, the classification of his team and even his own ability. For example, imagine a player scores 12 goals in a season and, considering the expected distribution for this player and this season, the probability of scoring 12 goals or fewer is 0.95: this means that the actual number of goals scored (12) is very high (95th percentile) in relation to the expected distribution, which takes into account his external (minutes played, team and position) and internal (ability) conditions, indicating a good performance. We want to comment on some cases in particular.

Figure 1. Estimated posterior means of $v_i$ by the full Bayes version of the negative binomial model (x-axis) and the Poisson model (y-axis).

Table III. Ranking, according to the full Bayes negative binomial model, of the best players, by position on the field. The global position in the general ranking and the number of seasons played are also shown

Defenders
Name               Global position   Seasons
Ezequiel Garay     3                 4
Roberto Carlos     5                 7
Campano            7                 6
Cristian Alvarez   19                3
Larrazabal         26                4

Midfielders
Name               Global position   Seasons
Rivaldo            1                 2
Robert             2                 2
Luis Cembranos     4                 3
Mark Gonzalez      8                 4
Mostovoi           9                 4

Forwards
Name               Global position   Seasons
Messi              6                 5
Ronaldo            16                5
Makaay             18                3
Villa              22                6
Etoo               24                9

Table IV shows, for each season and each player, the probability that the number of goals is less than or equal to the number of goals actually scored. The higher this probability, the better the performance of the player in the season. With respect to Etoo's evolution, we would like to emphasise his high performance in his last seasons (his worst result was in the 2001/2002 season, when he scored only six goals and his team, Real Mallorca, finished in 16th position). If we focus on the season when he was signed by F.C. Barcelona, 2004/2005, we can see that previously he had played 2864 minutes for Real Mallorca (which finished in 11th position), scoring 18 goals, while in his first season at F.C. Barcelona (which finally won the league) he played a little more (3036 minutes), scoring 24 goals: nevertheless, the performance is very similar in both seasons. That is because the model must be stricter with a forward who plays for the winning team than with a forward in a mid-table team.

Messi's evolution is also interesting. He shows a high goal-scoring performance in his initial seasons, but a strong decrease can be detected in the 2007/2008 season. This decrease is due to the change in his assigned position: until then he was classified as a midfielder and, from this season on, as a forward. Naturally, the model evaluates a forward more strictly in relation to the expected number of goals, so that, even if the performance were similar, it is located in lower quantiles.
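To see why an identical goal tally can fall in a lower quantile once the expected distribution shifts upward, consider the following sketch. It is an illustration under assumed values, not a result from the paper: the two means standing in for a midfielder and a forward, the shared shape parameter and the helper name `quantile_of` are all hypothetical.

```python
import math


def quantile_of(goals, mean, shape):
    """P(X <= goals) for a gamma-mixed Poisson (negative binomial) count.

    `mean` is the expected number of goals and `shape` the gamma shape,
    so the gamma rate is shape / mean.
    """
    p = shape / (shape + mean)  # equals rate / (rate + 1)
    total = 0.0
    for k in range(goals + 1):
        log_pmf = (
            math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
            + shape * math.log(p) + k * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return total


# Hypothetical expectations for the same minutes and team, differing only in position.
as_midfielder = quantile_of(10, mean=5.0, shape=4.0)
as_forward = quantile_of(10, mean=12.0, shape=4.0)
print(f"10 goals judged as a midfielder: {as_midfielder:.2f}")
print(f"10 goals judged as a forward:    {as_forward:.2f}")
```

With the higher expected mean, the same ten goals sit further down the distribution, which is the mechanism the text invokes for the 2007/2008 drop.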

Finally, in the case of Ronaldo, it must be taken into account that he was injured for most of his last season (2006/2007), so he could play only 329 minutes, in which he scored only one goal: that is the reason for his low performance. In the case of Villa, a decrease in the 2006/2007 season is also observed, but this can only be due to a worse performance, because he went from scoring 25 goals in his first season at Valencia F.C. (2005/2006) to scoring only 15 in his second season (2006/2007), playing approximately the same number of minutes.


Table IV. Estimation of the performance of the selected players

Player           00/01  01/02  02/03  03/04  04/05  05/06  06/07  07/08  08/09
Etoo             0.27   0.07   0.45   0.61   0.61   0.71   0.59   0.86   0.87
Messi            -      -      -      -      0.91   0.85   0.92   0.20   0.62
Roberto Carlos   0.72   0.60   0.77   0.84   0.52   0.72   0.71   -      -
Ronaldo          -      -      0.77   0.73   0.47   0.58   0.39   -      -
Villa            -      -      -      0.46   0.46   0.71   0.23   0.71   0.89
Raul             0.95   0.47   0.76   0.36   0.29   0.23   0.12   0.75   0.81


Appendix

The WinBUGS code for the empirical Bayes model is:

model;
{
    # season 1
    for (i in 1:m1) {
        goals[players.season.1[i], 1] ~ dpois(lam[players.season.1[i], 1])
        lam[players.season.1[i], 1] ~ dgamma(a[players.season.1[i], 1], ratio[players.season.1[i]])
        a[players.season.1[i], 1] <- covars.scaled[players.season.1[i], (4*1-3)] * exp(betas[1] + inprod(covars.scaled[players.season.1[i], (4*1-3+1):(4*1)], betas[2:4])) * (ro-1) / k
    }
    # season 2
    for (i in 1:m2) {
        goals[players.season.2[i], 2] ~ dpois(lam[players.season.2[i], 2])
        lam[players.season.2[i], 2] ~ dgamma(a[players.season.2[i], 2], ratio[players.season.2[i]])
        a[players.season.2[i], 2] <- covars.scaled[players.season.2[i], (4*2-3)] * exp(betas[1] + inprod(covars.scaled[players.season.2[i], (4*2-3+1):(4*2)], betas[2:4])) * (ro-1) / k
    }
    # season 3
    for (i in 1:m3) {
        goals[players.season.3[i], 3] ~ dpois(lam[players.season.3[i], 3])
        lam[players.season.3[i], 3] ~ dgamma(a[players.season.3[i], 3], ratio[players.season.3[i]])
        a[players.season.3[i], 3] <- covars.scaled[players.season.3[i], (4*3-3)] * exp(betas[1] + inprod(covars.scaled[players.season.3[i], (4*3-3+1):(4*3)], betas[2:4])) * (ro-1) / k
    }
    # season 4
    for (i in 1:m4) {
        goals[players.season.4[i], 4] ~ dpois(lam[players.season.4[i], 4])
        lam[players.season.4[i], 4] ~ dgamma(a[players.season.4[i], 4], ratio[players.season.4[i]])
        a[players.season.4[i], 4] <- covars.scaled[players.season.4[i], (4*4-3)] * exp(betas[1] + inprod(covars.scaled[players.season.4[i], (4*4-3+1):(4*4)], betas[2:4])) * (ro-1) / k
    }
    # season 5
    for (i in 1:m5) {
        goals[players.season.5[i], 5] ~ dpois(lam[players.season.5[i], 5])
        lam[players.season.5[i], 5] ~ dgamma(a[players.season.5[i], 5], ratio[players.season.5[i]])
        a[players.season.5[i], 5] <- covars.scaled[players.season.5[i], (4*5-3)] * exp(betas[1] + inprod(covars.scaled[players.season.5[i], (4*5-3+1):(4*5)], betas[2:4])) * (ro-1) / k
    }
    # season 6
    for (i in 1:m6) {
        goals[players.season.6[i], 6] ~ dpois(lam[players.season.6[i], 6])
        lam[players.season.6[i], 6] ~ dgamma(a[players.season.6[i], 6], ratio[players.season.6[i]])
        a[players.season.6[i], 6] <- covars.scaled[players.season.6[i], (4*6-3)] * exp(betas[1] + inprod(covars.scaled[players.season.6[i], (4*6-3+1):(4*6)], betas[2:4])) * (ro-1) / k
    }
    # season 7
    for (i in 1:m7) {
        goals[players.season.7[i], 7] ~ dpois(lam[players.season.7[i], 7])
        lam[players.season.7[i], 7] ~ dgamma(a[players.season.7[i], 7], ratio[players.season.7[i]])
        a[players.season.7[i], 7] <- covars.scaled[players.season.7[i], (4*7-3)] * exp(betas[1] + inprod(covars.scaled[players.season.7[i], (4*7-3+1):(4*7)], betas[2:4])) * (ro-1) / k
    }
    # season 8
    for (i in 1:m8) {
        goals[players.season.8[i], 8] ~ dpois(lam[players.season.8[i], 8])
        lam[players.season.8[i], 8] ~ dgamma(a[players.season.8[i], 8], ratio[players.season.8[i]])
        a[players.season.8[i], 8] <- covars.scaled[players.season.8[i], (4*8-3)] * exp(betas[1] + inprod(covars.scaled[players.season.8[i], (4*8-3+1):(4*8)], betas[2:4])) * (ro-1) / k
    }
    # season 9
    for (i in 1:m9) {
        goals[players.season.9[i], 9] ~ dpois(lam[players.season.9[i], 9])
        lam[players.season.9[i], 9] ~ dgamma(a[players.season.9[i], 9], ratio[players.season.9[i]])
        a[players.season.9[i], 9] <- covars.scaled[players.season.9[i], (4*9-3)] * exp(betas[1] + inprod(covars.scaled[players.season.9[i], (4*9-3+1):(4*9)], betas[2:4])) * (ro-1) / k
    }
    for (i in 1:index.maximum) {
        ratio[i] <- 1 / v[i]
        v[i] <- (1-p[i]) / p[i]
        p[i] ~ dbeta(ro, k)
    }
}
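The hierarchy in this listing can be read as a generative recipe, and the factor (ro-1) / k has a simple interpretation: for p ~ Beta(ro, k) with ro > 1, E[(1-p)/p] = k / (ro-1), so the factor cancels this expectation and the marginal mean of lam equals the regression mean, that is, the first covariate column multiplied by the exponential of the linear predictor. The sketch below checks this by simulation. The numerical values, the seed and the names `simulate_rate` and `regression_mean` are assumptions made for illustration, not the authors' data or fitted values.

```python
import random


def simulate_rate(regression_mean, ro, k, rng):
    """Draw one scoring rate lam following the hierarchy of the listing."""
    p = rng.betavariate(ro, k)            # p[i] ~ dbeta(ro, k)
    v = (1.0 - p) / p                     # v[i] <- (1-p[i]) / p[i]
    a = regression_mean * (ro - 1.0) / k  # gamma shape a[i, s]
    # WinBUGS dgamma(a, ratio) takes a rate; gammavariate takes a scale,
    # and scale = 1 / ratio = v.
    return rng.gammavariate(a, v)


rng = random.Random(2011)  # arbitrary seed
regression_mean = 10.0     # hypothetical value of offset * exp(linear predictor)
ro, k = 6.0, 3.0           # hypothetical hyperparameters (ro > 1 is required)

n_draws = 100_000
average = sum(simulate_rate(regression_mean, ro, k, rng) for _ in range(n_draws)) / n_draws
print(f"simulated mean of lam: {average:.2f} (regression mean: {regression_mean})")
```

The simulated average should land close to the regression mean, up to Monte Carlo error.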

The WinBUGS code for the Poisson model is:

model;
{
    # season 1
    for (i in 1:m1) {
        goals[players.season.1[i], 1] ~ dpois(lam[players.season.1[i], 1])
        lam[players.season.1[i], 1] <- v[players.season.1[i]] * mu[players.season.1[i], 1]
        log(mu[players.season.1[i], 1]) <- log(covars.scaled[players.season.1[i], (4*1-3)]) + betas[1] + inprod(covars.scaled[players.season.1[i], (4*1-3+1):(4*1)], betas[2:4])
    }
    # season 2
    for (i in 1:m2) {
        goals[players.season.2[i], 2] ~ dpois(lam[players.season.2[i], 2])
        lam[players.season.2[i], 2] <- v[players.season.2[i]] * mu[players.season.2[i], 2]
        log(mu[players.season.2[i], 2]) <- log(covars.scaled[players.season.2[i], (4*2-3)]) + betas[1] + inprod(covars.scaled[players.season.2[i], (4*2-3+1):(4*2)], betas[2:4])
    }
    # season 3
    for (i in 1:m3) {
        goals[players.season.3[i], 3] ~ dpois(lam[players.season.3[i], 3])
        lam[players.season.3[i], 3] <- v[players.season.3[i]] * mu[players.season.3[i], 3]
        log(mu[players.season.3[i], 3]) <- log(covars.scaled[players.season.3[i], (4*3-3)]) + betas[1] + inprod(covars.scaled[players.season.3[i], (4*3-3+1):(4*3)], betas[2:4])
    }
    # season 4
    for (i in 1:m4) {
        goals[players.season.4[i], 4] ~ dpois(lam[players.season.4[i], 4])
        lam[players.season.4[i], 4] <- v[players.season.4[i]] * mu[players.season.4[i], 4]
        log(mu[players.season.4[i], 4]) <- log(covars.scaled[players.season.4[i], (4*4-3)]) + betas[1] + inprod(covars.scaled[players.season.4[i], (4*4-3+1):(4*4)], betas[2:4])
    }
    # season 5
    for (i in 1:m5) {
        goals[players.season.5[i], 5] ~ dpois(lam[players.season.5[i], 5])
        lam[players.season.5[i], 5] <- v[players.season.5[i]] * mu[players.season.5[i], 5]
        log(mu[players.season.5[i], 5]) <- log(covars.scaled[players.season.5[i], (4*5-3)]) + betas[1] + inprod(covars.scaled[players.season.5[i], (4*5-3+1):(4*5)], betas[2:4])
    }
    # season 6
    for (i in 1:m6) {
        goals[players.season.6[i], 6] ~ dpois(lam[players.season.6[i], 6])
        lam[players.season.6[i], 6] <- v[players.season.6[i]] * mu[players.season.6[i], 6]
        log(mu[players.season.6[i], 6]) <- log(covars.scaled[players.season.6[i], (4*6-3)]) + betas[1] + inprod(covars.scaled[players.season.6[i], (4*6-3+1):(4*6)], betas[2:4])
    }
    # season 7
    for (i in 1:m7) {
        goals[players.season.7[i], 7] ~ dpois(lam[players.season.7[i], 7])
        lam[players.season.7[i], 7] <- v[players.season.7[i]] * mu[players.season.7[i], 7]
        log(mu[players.season.7[i], 7]) <- log(covars.scaled[players.season.7[i], (4*7-3)]) + betas[1] + inprod(covars.scaled[players.season.7[i], (4*7-3+1):(4*7)], betas[2:4])
    }
    # season 8
    for (i in 1:m8) {
        goals[players.season.8[i], 8] ~ dpois(lam[players.season.8[i], 8])
        lam[players.season.8[i], 8] <- v[players.season.8[i]] * mu[players.season.8[i], 8]
        log(mu[players.season.8[i], 8]) <- log(covars.scaled[players.season.8[i], (4*8-3)]) + betas[1] + inprod(covars.scaled[players.season.8[i], (4*8-3+1):(4*8)], betas[2:4])
    }
    # season 9
    for (i in 1:m9) {
        goals[players.season.9[i], 9] ~ dpois(lam[players.season.9[i], 9])
        lam[players.season.9[i], 9] <- v[players.season.9[i]] * mu[players.season.9[i], 9]
        log(mu[players.season.9[i], 9]) <- log(covars.scaled[players.season.9[i], (4*9-3)]) + betas[1] + inprod(covars.scaled[players.season.9[i], (4*9-3+1):(4*9)], betas[2:4])
    }
    for (i in 1:index.maximum) {
        v[i] ~ dgamma(delta, delta)
    }
    delta ~ dexp(b)
    b ~ dexp(0.01)
    for (j in 1:4) {
        betas[j] ~ dnorm(0.0, t) I(-100, 100)
    }
    t <- 0.01  # Variance = 1/t
}
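The indexing expressions (4*s-3) and (4*s-3+1):(4*s) recur in every season block. Read from the listings, each season s occupies four consecutive columns of covars.scaled: the first enters multiplicatively (as a log offset in the Poisson model) and the remaining three are paired with betas[2:4]. The sketch below mirrors that arithmetic. The helper names and numerical values are hypothetical, and reading the first column as an exposure term such as scaled minutes played is an assumption, since the listing itself does not name the columns.

```python
import math


def season_columns(season):
    """1-based columns of covars.scaled used for a season, as in the listing."""
    offset_col = 4 * season - 3
    covariate_cols = list(range(4 * season - 3 + 1, 4 * season + 1))
    return offset_col, covariate_cols


def expected_goals(row, season, betas):
    """mu = offset * exp(betas[1] + inprod(covariates, betas[2:4])).

    `row` is one player's covariate row as a plain list (0-based in Python);
    `betas` holds the intercept followed by three slopes.
    """
    offset_col, covariate_cols = season_columns(season)
    offset = row[offset_col - 1]
    linear = betas[0] + sum(row[c - 1] * b for c, b in zip(covariate_cols, betas[1:4]))
    return offset * math.exp(linear)


# Hypothetical two-season covariate row and coefficients.
row = [0.8, 1.0, 0.0, 0.5, 0.9, 1.0, 0.0, 0.2]
betas = [1.5, 0.6, -0.3, 0.4]
print(season_columns(2))
print(expected_goals(row, 2, betas))
```

Because the first column enters through log(.) in the Poisson listing, it acts as an offset: doubling it doubles the expected count while leaving the coefficients untouched.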

The WinBUGS code for the full Bayes model is:

model;
{
    # season 1
    for (i in 1:m1) {
        goals[players.season.1[i], 1] ~ dpois(lam[players.season.1[i], 1])
        lam[players.season.1[i], 1] ~ dgamma(a[players.season.1[i], 1], ratio[players.season.1[i]])
        a[players.season.1[i], 1] <- covars.scaled[players.season.1[i], (4*1-3)] * exp(betas[1] + inprod(covars.scaled[players.season.1[i], (4*1-3+1):(4*1)], betas[2:4])) * (ro-1) / k
    }
    # season 2
    for (i in 1:m2) {
        goals[players.season.2[i], 2] ~ dpois(lam[players.season.2[i], 2])
        lam[players.season.2[i], 2] ~ dgamma(a[players.season.2[i], 2], ratio[players.season.2[i]])
        a[players.season.2[i], 2] <- covars.scaled[players.season.2[i], (4*2-3)] * exp(betas[1] + inprod(covars.scaled[players.season.2[i], (4*2-3+1):(4*2)], betas[2:4])) * (ro-1) / k
    }
    # season 3
    for (i in 1:m3) {
        goals[players.season.3[i], 3] ~ dpois(lam[players.season.3[i], 3])
        lam[players.season.3[i], 3] ~ dgamma(a[players.season.3[i], 3], ratio[players.season.3[i]])
        a[players.season.3[i], 3] <- covars.scaled[players.season.3[i], (4*3-3)] * exp(betas[1] + inprod(covars.scaled[players.season.3[i], (4*3-3+1):(4*3)], betas[2:4])) * (ro-1) / k
    }
    # season 4
    for (i in 1:m4) {
        goals[players.season.4[i], 4] ~ dpois(lam[players.season.4[i], 4])
        lam[players.season.4[i], 4] ~ dgamma(a[players.season.4[i], 4], ratio[players.season.4[i]])
        a[players.season.4[i], 4] <- covars.scaled[players.season.4[i], (4*4-3)] * exp(betas[1] + inprod(covars.scaled[players.season.4[i], (4*4-3+1):(4*4)], betas[2:4])) * (ro-1) / k
    }
    # season 5
    for (i in 1:m5) {
        goals[players.season.5[i], 5] ~ dpois(lam[players.season.5[i], 5])
        lam[players.season.5[i], 5] ~ dgamma(a[players.season.5[i], 5], ratio[players.season.5[i]])
        a[players.season.5[i], 5] <- covars.scaled[players.season.5[i], (4*5-3)] * exp(betas[1] + inprod(covars.scaled[players.season.5[i], (4*5-3+1):(4*5)], betas[2:4])) * (ro-1) / k
    }
    # season 6
    for (i in 1:m6) {
        goals[players.season.6[i], 6] ~ dpois(lam[players.season.6[i], 6])
        lam[players.season.6[i], 6] ~ dgamma(a[players.season.6[i], 6], ratio[players.season.6[i]])
        a[players.season.6[i], 6] <- covars.scaled[players.season.6[i], (4*6-3)] * exp(betas[1] + inprod(covars.scaled[players.season.6[i], (4*6-3+1):(4*6)], betas[2:4])) * (ro-1) / k
    }
    # season 7
    for (i in 1:m7) {
        goals[players.season.7[i], 7] ~ dpois(lam[players.season.7[i], 7])
        lam[players.season.7[i], 7] ~ dgamma(a[players.season.7[i], 7], ratio[players.season.7[i]])
        a[players.season.7[i], 7] <- covars.scaled[players.season.7[i], (4*7-3)] * exp(betas[1] + inprod(covars.scaled[players.season.7[i], (4*7-3+1):(4*7)], betas[2:4])) * (ro-1) / k
    }
    # season 8
    for (i in 1:m8) {
        goals[players.season.8[i], 8] ~ dpois(lam[players.season.8[i], 8])
        lam[players.season.8[i], 8] ~ dgamma(a[players.season.8[i], 8], ratio[players.season.8[i]])
        a[players.season.8[i], 8] <- covars.scaled[players.season.8[i], (4*8-3)] * exp(betas[1] + inprod(covars.scaled[players.season.8[i], (4*8-3+1):(4*8)], betas[2:4])) * (ro-1) / k
    }
    # season 9
    for (i in 1:m9) {
        goals[players.season.9[i], 9] ~ dpois(lam[players.season.9[i], 9])
        lam[players.season.9[i], 9] ~ dgamma(a[players.season.9[i], 9], ratio[players.season.9[i]])
        a[players.season.9[i], 9] <- covars.scaled[players.season.9[i], (4*9-3)] * exp(betas[1] + inprod(covars.scaled[players.season.9[i], (4*9-3+1):(4*9)], betas[2:4])) * (ro-1) / k
    }
    for (i in 1:index.maximum) {
        ratio[i] <- 1 / v[i]
        v[i] <- (1-p[i]) / p[i]
        p[i] ~ dbeta(ro, k)
    }
    ro ~ dexp(0.01)
    k ~ dexp(0.01)
    for (j in 1:4) {
        betas[j] ~ dnorm(0.0, t) I(-10, 10)
    }
    t <- 0.01
}
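The prior statements that close this listing rely on two WinBUGS conventions that are easy to misread: dnorm takes a precision rather than a variance (so t = 0.01 corresponds to a standard deviation of 10), and dexp takes a rate (so dexp(0.01) has prior mean 100). The sketch below draws from these priors in plain Python. It treats I(-10, 10) as a truncation, which is an assumption that is reasonable here because the prior parameters are fixed; the function name and the seed are hypothetical.

```python
import math
import random


def draw_truncated_normal(mean, precision, lower, upper, rng):
    """Rejection sampler for dnorm(mean, precision) I(lower, upper)."""
    sd = 1.0 / math.sqrt(precision)  # precision -> standard deviation
    while True:
        x = rng.gauss(mean, sd)
        if lower <= x <= upper:
            return x


rng = random.Random(0)  # arbitrary seed
t = 0.01                # precision, i.e. variance 1 / t = 100
betas = [draw_truncated_normal(0.0, t, -10.0, 10.0, rng) for _ in range(4)]
ro = rng.expovariate(0.01)  # dexp(0.01): rate 0.01, prior mean 100
k = rng.expovariate(0.01)
print(betas, ro, k)
```

Since the truncation bounds sit at one standard deviation on either side of zero, the resulting prior on each coefficient is fairly flat over the interval (-10, 10).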
