The Forecasting Power of Economic Growth Models - DiVA


UPPSALA UNIVERSITY Department of Economics Master's Thesis Spring Term 2007

The Forecasting Power of Economic Growth Models

Author: Andreas Bryhn Supervisor: Johan Lyhagen

Abstract

High forecasting power is essential for understanding scientific relationships. In economics,

forecasting power may be decisive for the success or failure of a particular policy. The

forecasting power of economic growth models is investigated in this study. Regressions from

one dataset including the gross domestic product (GDP), GDP growth, trade openness, the

quality of public institutions and secondary education generate insufficient forecasting power

with respect to growth. Furthermore, the International Monetary Fund's one-year growth

forecasts are compared to outcome. Forecasts for 1999-2006 were found to be significantly

different from outcome during 7 years out of 8. The forecast error slightly exceeded 1

percentage unit, which is similar to results from earlier studies on forecast error and equal to

the forecast/hindcast error from a simple multivariate model constructed from historical

growth data. Possible reasons behind poor forecast quality are discussed, including the

tradition of building models using assumptions from irrefutable theoretical constructs.

Summary (translated from the Swedish "Sammanfattning")

High forecasting power is necessary for understanding scientific relationships. In economics, forecasting power may determine whether a particular economic policy will succeed or fail. The forecasting power of economic growth models is examined in this study. Regressions from a dataset containing the gross domestic product, its growth, openness to trade, the quality of public institutions, and the share of the population with secondary education generate insufficient forecasting power with respect to growth. Furthermore, the International Monetary Fund's one-year forecasts are compared with outcome. Forecasts for 1999-2006 differed significantly from outcome in 7 years out of 8. The forecast error amounted to just over 1 percentage unit, which corresponds to results from earlier studies of forecast error, as well as to the forecast error from a simple multivariate model based on historical growth data. Possible reasons for the low forecast quality are discussed, among them the tradition of building models using assumptions from theoretical constructs that cannot be falsified.

TABLE OF CONTENTS

1. INTRODUCTION

2. METHODS AND DATA

2.1. Statistical methods and considerations

2.2. The Gallup dataset

2.3. The IMF dataset

3. RESULTS

3.1. Cross-country growth regressions

3.2. The IMF's growth forecasts

4. DISCUSSION

5. CONCLUSIONS

REFERENCES

1. INTRODUCTION

Forecasting the future has been a highly desirable goal for humans ever since the Stone Age.

With the emergence and growth of science, our forecasting methods have improved

considerably, as forecasting power has become a fundamental part and a distinguishing

feature of scientific knowledge. The ability to make accurate forecasts of important goal

variables is essential for the understanding of scientific relationships, particularly with respect

to causality (Blaug, 1980; DeLurgio, 1998; Fildes and Stekler, 2002). High forecasting power

is necessary but not sufficient for causal analysis in economics, because strongly correlated

variables may have an external, cointegrating causal factor (Clements and Hendry, 1999).

Nevertheless, without high forecasting power, economists will have poor quantitative

knowledge about the effect of their suggested policies on, e. g., economic growth. Insufficient

forecasting power may therefore result in disappointing and expensive policy failure.

The forecasting power of a model can be estimated by calculating the coefficient of determination

(R2) for the relationship between model forecasts and actual outcome. Two crucial issues in

this context are (1) the relative quality of models; i. e., how much better is, e. g., R2 = 0.9,

compared to R2 = 0.8 in a regression between forecast and outcome, and (2) the lower limit of

acceptability, as measured in R2. If there is a large enough amount of data, a correlation with

R2 very close to 0 can still be significant at high confidence levels, but it will be just as

useless for forecasting as a correlation with R2 = 0, because in both cases, correlations will

describe a beeswarm-like scatterplot from which forecasts will have very high uncertainty.
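The point about large samples can be made concrete with the standard t-statistic for a correlation, t = r · sqrt((n - 2) / (1 - R2)), where r is the square root of R2. The following Python sketch is only an illustration; the function name and the two sample sizes are assumptions and are not taken from the datasets used in this study.

```python
import math

def t_stat_from_r2(r2, n):
    """t-statistic for testing zero correlation, given R2 and sample size n."""
    r = math.sqrt(r2)
    return r * math.sqrt((n - 2) / (1.0 - r2))

# Hypothetical sample sizes: the same weak relationship (R2 = 0.01)
# is evaluated with 30 and with 10,000 observations.
for n in (30, 10_000):
    print(n, round(t_stat_from_r2(0.01, n), 2))
```

With 30 observations the statistic stays well below the usual two-sided 5 % threshold of roughly 2, whereas with 10,000 observations it exceeds it by a wide margin, although the scatter around the regression line is the same in both cases.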

One widely used lower limit of acceptability regarding R2 and forecasting power was first

motivated by Prairie (1996), who presented what has become known in the natural sciences as

Prairie's staircase. Prairie's staircase method is illustrated in Figure 1A. A thick, solid line

starts at the lower boundary line of the 95 % confidence level from the regression and is

drawn upwards until it reaches the upper 95 % confidence limit, and is then drawn to the right

until the lower boundary line is reached again. This procedure is repeated so that the thick,

solid line takes the shape of a staircase (Figure 1A). When Prairie (1996) reiterated this

exercise for a large number of correlations, he found a non-linear relationship between R2 and

the number of staircase risers, which is depicted in Figure 1B. According to this figure, the

number of risers is low and fairly constant for R2 values from 0 to about 0.65, after which the

number rises rather dramatically. Using the number of staircase risers as a representation of

forecasting power, Prairie (1996) argued that R2 = 0.65 should be regarded as the lower limit

of acceptability. Furthermore, Figure 1B can be used to compare forecasting power. For

instance, the difference in forecasting power between R2 = 0 and R2 = 0.9 can be regarded as

more than six times as great as the difference between R2 = 0 and R2 = 0.45.

Figure 1. Prairie's staircase, suggesting a non-linear relationship between the coefficient of determination (R2) and

forecasting power. From Prairie (1996).

The difficulty in making correct growth forecasts has concerned economists for many decades

(Hutchison, 1938; Kenny and Williams, 2001). Table 1 shows R2 values from some bi-variate

regressions found in the literature. None of the R2 values exceed 0.65, although appreciably

higher R2 values can be found in multi-variate regressions (Gylfason, 2001). This study aims

at examining the forecasting power in economic growth models. The structure of this work is

as follows: first, two datasets used in the study will be described and statistical methods and

considerations will be presented and motivated. Second, correlations and forecasting power

will be analysed. Finally, the results will be discussed based on the intention to increase future

forecasting power in growth models.

Table 1. Correlations between some x-variables and GDP growth or GDP per capita growth as a y-variable.

x-variable             Correlation sign   R2     Reference
Fertility              -                  0.61   Perotti, 1996
Liquid liabilities     +                  0.55   King and Levine, 1993
Private loans          +                  0.50   King and Levine, 1993
Investment             +                  0.35   Levine and Renelt, 1992
Natural resources      -                  0.28   Gylfason, 2001
Inequality             -                  0.22   Persson and Tabellini, 1994
Savings                +                  0.20   Levine and Zervos, 1998
Education              +                  0.17   Gylfason, 2001
Telecommunications     +                  0.17   Bougheas et al., 2000
Black market           -                  0.14   Levine and Renelt, 1992
Revolutions and coups  -                  0.13   Levine and Renelt, 1992
Bank credit            +                  0.12   Levine and Zervos, 1998
Paved roads            +                  0.11   Bougheas et al., 2000
War casualties         -                  0.10   Easterly et al., 1993
Inflation              -                  0.08   Carkovic and Levine, 2002
Government size        +                  0.06   Carkovic and Levine, 2002
Marginal tax           +                  0.03   Perotti, 1996
Trade                  +                  0.02   Dollar and Kraay, 2001
Welfare spending       +                  0.01   Perotti, 1996
GDP                    +                  ≈ 0    Barro, 1996

2. METHODS AND DATA

2.1. Statistical methods and considerations

Relationships were analysed with single, and forward stepwise multiple, linear regression.

Stepwise multiple regression makes it possible to distinguish the strongest co-varying

parameter, followed by all other parameters that may add additional explanatory power to the

regression. Several potential x-variables can all show strong individual correlations with the

y-variable that they are being used to forecast, although these individual correlations may not

be additive if the x-variables are also correlated with each other. Multi-variate models may

take such co-variation between x-variables into account if they are developed with stepwise

multiple regression techniques (DeLurgio, 1998). Criteria for x-variables to be used to

forecast the various y-variables were (1) they had to correlate significantly with y-variables

individually as well as within the multiple regression and (2) these correlations had to be of

the same sign; i.e., an x-variable that was positively correlated with a y-variable was excluded

if its contribution in the multiple regression was negative, and vice versa. These criteria are

not commonly used in econometrics (DeLurgio, 1998) and their relevance will therefore be

discussed later in this paper. Changes in linear slopes were detected with a trend shift analysis

method from Rodionov and Overland (2005). This method consists of a downloadable

application to Microsoft Excel and makes it possible to detect at which points a trend changes

at a specified significance level, given that the trend changes at all. Statistical significance

was always determined at the 95 % confidence level, since Figure 1 is defined at that level.
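The two inclusion criteria can be written as a simple screening rule. The sketch below is a minimal illustration in Python, assuming that slopes and p-values from the single and the multiple regression have already been computed elsewhere; the function name, its arguments and the example numbers are hypothetical and not taken from the study.

```python
def passes_criteria(single_slope, single_p, multi_coef, multi_p, alpha=0.05):
    """Return True if an x-variable meets both criteria of Section 2.1."""
    # Criterion 1: significant on its own and within the multiple regression.
    significant = single_p < alpha and multi_p < alpha
    # Criterion 2: the sign must be the same in both regressions.
    same_sign = single_slope * multi_coef > 0
    return significant and same_sign

# Hypothetical numbers, for illustration only:
print(passes_criteria(2.6, 0.001, 2.4, 0.001))  # same sign, significant in both
print(passes_criteria(0.1, 0.60, -0.5, 0.01))   # insignificant alone, sign flips
```

The second call mimics a variable that is insignificant on its own but enters a multiple regression with the opposite sign; under these criteria such a variable would be excluded.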

2.2. The Gallup dataset

The first dataset (CID, 2007) of two used in this study has been described by Gallup et al.

(1999) and was used in regression 1, Table 3 in the same study. It consists of six variables:

average purchasing power parity (PPP) adjusted annual gross domestic product (GDP) per

capita growth between 1965 and 1990 (hereafter called yG), PPP adjusted initial GDP per

capita in 1965 (hereafter referred to as YG), average years of secondary schooling among the

population in 1965 (EduG), the log value of life expectancy (LifeG), openness to international

trade (OpenG), and finally, the quality of public administration (PublG). This dataset was used

in Section 3.1, first to estimate bi-variate correlations with yG, and subsequently, to

quantitatively assess multi-variate correlations with yG using the criteria stated in 2.1. The

dataset was then divided into groups according to the gradient of one variable which was

insignificantly correlated with yG, to investigate whether the effect of such a variable on other

regressions could be estimated without violating the criteria in 2.1.

2.3. The IMF dataset

The second dataset consisted of actual and forecasted data on PPP adjusted GDP growth in 29

advanced economies, taken from 10 of the International Monetary Fund's (IMF) April or May issues of

World Economic Outlook (IMF, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006,

2007). In this work, yt denotes historical IMF data on growth at time t (in years), and it is

worth noting that IMF data are not given as per capita values, as opposed to yG. One-year

growth forecasts (yfort+1) were first compared to the actual outcome (yt+1) in a bi-variate

regression, and the resulting R2 value was compared to R2 values found between historical

data at the time of the forecast and yt+1.

A statistical model based on the presented historical data at the time of the forecast was then

used as a baseline from which to evaluate the forecast quality. This statistical model was

developed from a stepwise multiple regression with yt+1 as a y-variable and all historical data

available at the time of the forecast as potential x-variables. The baseline model was based on

the time period 1998 ≤ t ≤ 2004, i. e., on growth outcome from 7 years and 29 OECD

countries. In order to test the criteria in 2.1 by violating them, 100 normally distributed random

variables were added to the data set, allowing all significant x-variables to enter into the

forward stepwise multiple regression with yt+1 as a y-variable. If any of the random variables

would enter, then that would indicate that the criteria in 2.1 could decrease the risk of adding

nonsensical information to the regression.

The stability of the model constants was subsequently studied by omitting data from one year

at a time. Forecasts/hindcasts from the statistical model were tested against yt+1 by using

those model constants which were valid for all years except for the particular

forecast/hindcast year, in order to perform a test against "independent" data in the sense that

they were not used to develop the statistical model. For example, when hindcasting yt+1 for t =

2001, the model constants used were those that were valid for a multiple regression when t =

2001 had been omitted. Furthermore, the baseline model with constants valid for 1998 ≤ t ≤

2004 was tested to forecast yt+1 for t = 2005, a year from which data had previously not been

used at all in the study. These forecasts were compared to the IMF's forecasts (IMF, 2005)

and to the actual outcome (IMF, 2007). Finally, error terms from all forecasts and hindcasts in

this section were studied and compared.
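The leave-one-year-out procedure described above can be sketched as follows. This is a simplified Python illustration, not the model used in the study: it assumes a single predictor fitted by ordinary least squares instead of the stepwise multiple regression, and the function names and toy data are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def leave_one_year_out(data):
    """data maps year -> list of (x, y) pairs, one pair per country.

    For each year, fit on all other years and hindcast the omitted year.
    Returns year -> list of (hindcast, outcome) pairs.
    """
    results = {}
    for held_out in data:
        train = [p for yr, pts in data.items() if yr != held_out for p in pts]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        results[held_out] = [(a + b * x, y) for x, y in data[held_out]]
    return results

# Made-up toy data lying exactly on y = 1 + 0.5*x, so every hindcast is exact.
toy = {
    1999: [(2.0, 2.0), (4.0, 3.0)],
    2000: [(1.0, 1.5), (3.0, 2.5)],
    2001: [(0.0, 1.0), (6.0, 4.0)],
}
hindcasts = leave_one_year_out(toy)
```

Because the constants used for a given year are never estimated from that year's outcome, each hindcast is tested against data that are independent in the sense used above.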

3. RESULTS

3.1. Cross-country growth regressions

Table 2 displays cross-correlations (R2 values) between the six variables in Gallup et al.

(1999), described in Section 2. All correlations carried a positive sign. According to Table 2,

EduG, LifeG, OpenG and PublG were all mutually correlated and also positively correlated with

yG. YG was insignificantly - although, if anything, positively - correlated with yG.

Table 2. Cross-correlations (n=75) between six variables described in Section 2 and by Gallup et al., (1999). * =

significant at the 95 % confidence level.

         YG      yG      EduG    LifeG   OpenG
yG       0.00
EduG     0.39*   0.14*
LifeG    0.51*   0.22*   0.60*
OpenG    0.32*   0.38*   0.33*   0.40*
PublG    0.61*   0.19*   0.39*   0.48*   0.53*

Thus, all variables except YG could be used in a multi-variate model according to the criteria

set in Section 2. The baseline equation for this model is:

yG = a + b · EduG + c · LifeG + d · OpenG + e · PublG (1)

where a-e are constants. However, only OpenG could enter as a significant x-variable in a

forward stepwise multiple regression with yG as a y-variable. Therefore, constants b, c, and e

were not determined, while a (including its standard error) was 0.787 ± 0.244 (p = 0.002, t =

3.23) and d was 2.62 ± 0.38 (p < 0.001, t = 6.74).
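With only OpenG entering, Equation 1 reduces to yG = a + d · OpenG. A minimal Python sketch of the resulting point forecast is given below. The input values 0 and 1 are assumed purely for illustration, since the scale of OpenG is not stated here, and the standard errors of a and d are ignored.

```python
A = 0.787  # intercept a reported in Section 3.1
D = 2.62   # coefficient d on OpenG reported in Section 3.1

def forecast_growth(open_g):
    """Point forecast of yG from OpenG, in the units of yG."""
    return A + D * open_g

# Assumed illustrative inputs; the actual range of OpenG is not given in the text.
for open_g in (0.0, 1.0):
    print(open_g, round(forecast_growth(open_g), 3))
```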

Table 3 shows how the R2 value changed when the data were divided into groups along the

YG gradient. When the data were grouped into 2, the R2 for OpenG vs. yG increased from 0.39

to 0.45 for the poorest countries (cases 1-48), and from 0.39 to 0.40 for the richest countries

(cases 49-96). Growth in the group with the poorest countries could also be forecasted with a

multiple regression that rendered an R2 value of 0.53 and included openness and log values of

life expectancy in 1965 as x-variables.

Table 3. Regression analysis with yG as a y-variable. The data set was sorted ascendingly according to the level

of YG.

Cases   n    Highest R2,         Variable in         R2, multiple   Variable(s) in
             single regression   single regression   regression     multiple regression
1-96    94   0.39                OpenG               0.39           OpenG
1-48    48   0.45                OpenG               0.53           OpenG, LifeG
49-96   46   0.40                OpenG               0.40           OpenG
1-32    32   0.39                OpenG               0.39           OpenG
33-64   31   0.42                OpenG               0.58           OpenG, LifeG
65-96   31   0.50                OpenG               0.50           OpenG

When data were grouped into 3, R2 increased further for the richest third (cases 65-96) to 0.50

and the middle group (cases 33-64) allowed log life expectancy to enter as an additional x-variable

in the multiple regression, yielding an R2 value of 0.58. However, when the data were

divided into more than 3 groups, regressions became insignificant in many of the groups. A

trend shift analysis of the OpenG vs. yG relationship using the method from Rodionov and

Overland (2005) showed that there were no significant slope deviations in the openness-

growth relationship along the GDP per capita gradient. Likewise, a t-test revealed that the

regression slope was slightly but insignificantly steeper among the poorest half of the

countries compared to the richest half.

3.2. The IMF's growth forecasts

The first row in Table 4 gives the correlation between the IMF's GDP growth forecasts and the actual outcome one year after the forecast. The subsequent rows contain correlations between historical GDP growth data at the time of the forecast, and the GDP growth one year after. All significant correlations were positive. It is worth noting that two historical variables in Table 4 yielded equal or higher R2 values against outcome as compared to the IMF's forecasts.

Table 4. Correlations (n=190) between the IMF's growth forecasts or historical growth data, and growth one year after the forecast. Data from IMF (1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006). All correlations were significant at the 95 % confidence level. MV denotes the mean value.

Variable                      R2
yfort+1                       0.15
yt-1                          0.04
yt-2                          0.08
yt-3                          0.10
yt-4                          0.18
yt-5                          0.06
yt-6                          0.03
yt-7                          0.04
yt-8                          0.09
MV(yt-18, yt-17, …, yt-9)     0.15

A multiple regression with yt-1, yt-2, …, yt-8, and the mean value of yt-18, yt-17, …, yt-9 as x-variables

and the future growth outcome yt+1 as a y-variable allowed three historical data

variables to enter at the 5 % significance level, raising the R2 value to 0.26. From this

regression, the following statistical model was constructed:

$$y_{t+1} = f + g \cdot y_{t-2} + h \cdot y_{t-4} + i \cdot \mathrm{MV}(y_{t-18}, y_{t-17}, \ldots, y_{t-9}) \qquad (2)$$

where f-i are constants, estimated as f = 0.670, g = 0.162, h = 0.273 and i = 0.246.
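As a worked illustration, Equation 2 can be evaluated with the constants above: the forecast is f plus g times yt-2, plus h times yt-4, plus i times the mean of yt-18 through yt-9. The Python sketch below assumes a hypothetical country with constant past growth; the function name and the input layout (a mapping from lag to growth rate) are illustrative choices, not part of the study.

```python
F0, G2, H4, I_MV = 0.670, 0.162, 0.273, 0.246  # constants f, g, h, i of Equation 2

def forecast_eq2(lagged):
    """One-year-ahead growth forecast from Equation 2.

    `lagged` maps a lag k to the growth rate y(t-k), in percent.
    Lags 2, 4 and 9..18 must be present.
    """
    long_run_mean = sum(lagged[k] for k in range(9, 19)) / 10
    return F0 + G2 * lagged[2] + H4 * lagged[4] + I_MV * long_run_mean

# Hypothetical country that grew by exactly 3 % in every past year.
steady = {k: 3.0 for k in range(1, 19)}
print(round(forecast_eq2(steady), 3))
```

For this made-up input the forecast is 0.670 + 3 · (0.162 + 0.273 + 0.246) = 2.713 %, slightly below the assumed historical rate.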

When 100 normally distributed random variables were added to the data set, violating the

criteria stated in 2.1, the resulting regression generated an R2 value of 0.35, and included yt-2,

yt-4, yt-7, yt-8, and 5 of the random variables as significant x-variables. None of the random

variables was significantly correlated with yt+1 in a single regression. When the random

variables were removed from the regression, yt-7 remained as a significant x-variable, but with

a negative sign. In an attempt to use yt-7 as an explanatory variable in compliance with the

criteria in 2.1, 8 new variables were generated with yt-7 as a denominator and the 8 other

historical variables from the IMF dataset as numerators. However, none of these ratios were

correlated with yt+1. Likewise, no significant correlation with yt+1 could be generated from

variables that consisted of yt-7 subtracted from any of the remaining historical variables in the

IMF dataset.

The stability of the constants in Equation 2 was tested by omitting yt+1 from one year at a

time; the results are displayed in Table 5. The constant g could not be significantly distinguished from 0 for

one of the years (t+1 = 2001). The coefficient of variation (CV) for h was only 3.3%,

indicating a comparatively stable contribution to Equation 2 from yt-4 in relation to the other

historical variables.

Table 5. The stability of the constants in Equation 2 when yt+1 from one year at a time was omitted. The first

column describes which one of the forecasted years was excluded from the analysis. MV denotes mean value,

SD is the standard deviation and CV is the coefficient of variation and equals SD/MV. *Based on t+1 = 1999,

2000, 2002-2005. **Not significantly different from zero.

Excluded t+1   f       g       h       i       R2
None           0.670   0.162   0.273   0.246   0.26
1999           0.708   0.265   0.270   0.266   0.22
2000           0.806   0.107   0.257   0.234   0.23
2001           0.628   **      0.283   0.405   0.31
2002           0.632   0.157   0.271   0.258   0.24
2003           0.558   0.217   0.276   0.230   0.28
2004           0.616   0.269   0.281   0.143   0.28
2005           0.630   0.163   0.261   0.263   0.23
MV*            0.658   0.196   0.269   0.232   0.25
SD*            0.087   0.065   0.009   0.046   0.03
CV*            0.132   0.331   0.033   0.199   0.11
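The summary rows of Table 5 can be checked directly from the listed constants. The short Python sketch below recomputes MV, SD and CV for h from the six years named in the table footnote. It assumes the sample standard deviation (n - 1 in the denominator), which appears to reproduce the tabulated values; the variable names are illustrative.

```python
import statistics

# h from Table 5 for the excluded years 1999, 2000 and 2002-2005
h_values = [0.270, 0.257, 0.271, 0.276, 0.281, 0.261]

mv = statistics.mean(h_values)
sd = statistics.stdev(h_values)  # sample standard deviation (n - 1)
cv = sd / mv
print(round(mv, 3), round(sd, 3), round(cv, 3))
```

The same calculation applied to the other columns gives a quick consistency check on the reported stability of each constant.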

The IMF's forecasts were compared to growth forecasts/hindcasts generated by Equation 2 by

regressing forecasts and hindcasts against GDP growth outcome, and these regressions are

shown in Figure 2. As motivated in the previous section, constants f-i were taken from Table

5 in order to test hindcasts from Equation 2 against independent data; i.e., the growth outcome

(yt+1) of, e. g., 2003 was compared to hindcasts from Equation 2 with constant values for f-i

taken from the 2003 line in Table 5. It is worth noting that for 1999 ≤ t+1 ≤ 2004, Equation 2

yielded hindcasts since constants were also generated from yt+1 data from subsequent years,

while the same equation yielded forecasts for 2005 since the information on the 2005 line in

Table 5 strictly emanates from years that preceded 2005. The regression slopes in Figure 2 are

both less than 1, indicating that forecasts and hindcasts may be systematically higher than the

outcome. However, mean values of forecasts, hindcasts and outcome all slightly exceeded 3%

and were not significantly different from one another. Thus, forecasts/hindcasts < 3% showed

a tendency to be overestimated compared to the real outcome yt+1, while forecasts/hindcasts >

3% tended to be underestimated, which explains why the regressions in Figure 2 differed from

the unit line (y=x).

Figure 2. Comparison between one-year forecast/hindcasts and outcome in yearly GDP growth (%) 1999-2005.

A. Forecasts from the IMF's World Economic Outlook. B. Forecasts/hindcasts from Equation 2.

The R2 value in Figure 2B was low, at 0.19, although marginally higher than in Figure 2A

(0.15). According to Figure 2A, none of the IMF's growth forecasts were negative, and none

exceeded 7.2 percent, although negative growth was observed in 9 cases out of 195 and higher

growth than 7.2 percent was observed in 12 cases.

Forecasts from IMF (2005) and Equation 2 compared to outcome (IMF, 2007) regarding GDP

growth in 2006 are displayed in Figure 3. The IMF's forecasts (Figure 3A) yielded a rather

high R2 value (0.55) when regressed against growth outcome, much higher than forecasts

from Equation 2 (0.19, see Figure 3B). However, the two regression equations in Figure 3

show that R2 values rather inaccurately reflected the forecasting power in this case, because

forecasts from both sources were systematically and significantly lower than outcome.

Figure 3. Comparison between one-year forecasts and outcome in GDP growth (%) 2006. A. Forecasts from the

IMF's World Economic Outlook. B. Forecasts from Equation 2.

The differences between the IMF's forecasts, forecasts/hindcasts from Equation 2, and

outcome, are further illustrated in Figure 4. Apparently, errors from both methods followed a

very similar pattern with respect to mean error and standard deviation of the error. During an

average year, the absolute value of the mean error was 1.1 percentage units while the standard

deviation of the error slightly exceeded 1.5 percentage units. Figures 4B and 4C suggest that

forecasts improved over time, given the slightly decreasing mean error deviation from 0 and

the slightly decreasing standard deviation of the error. However, the difference between IMF's

forecasts and growth outcome (including the bars representing the mean value error) in Figure

4A shows that the IMF's forecasts were significantly different from outcome for all years except

for 2005. Forecasts/hindcasts from Equation 2 significantly deviated from outcome for all

years except for 1999 and 2005. Thus, 2005 appeared to be an outstandingly predictable year

(mean relative error < 10% according to Figure 4D) while the year 2001 was very

unpredictable (mean relative error > 70%) with respect to GDP growth.

Figure 4. Accuracy in GDP growth forecasting. A. Mean GDP growth forecasts for 29 OECD countries from the

IMF's World Economic Outlook, compared to mean values generated by Equation 2 (hindcasts for 1999-2004

and forecasts for 2005-2006), and to outcome, including the mean value error. Values in percent. B. The mean

forecast/hindcast error in percentage units of GDP growth. C. The standard deviation (SD) of the growth

forecast/hindcast error. D. The relative error (forecast or hindcast error divided by outcome) in percent.
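The error measures shown in Figure 4 can be sketched for a single year as follows. This Python example is one possible reading, not the study's exact procedure: it assumes the error is defined as forecast minus outcome and that the relative error is the mean error divided by the mean outcome, and all input numbers are made up.

```python
import statistics

def error_summary(forecasts, outcomes):
    """Mean error, SD of the error and relative error (%) for one year.

    Assumes error = forecast - outcome, and relative error =
    mean error / mean outcome (an interpretation, not confirmed by the text).
    """
    errors = [f - o for f, o in zip(forecasts, outcomes)]
    mean_error = statistics.mean(errors)
    sd_error = statistics.stdev(errors)
    relative_error = 100 * mean_error / statistics.mean(outcomes)
    return mean_error, sd_error, relative_error

# Made-up forecasts and outcomes (% growth) for three countries.
print(error_summary([3.0, 2.0, 4.0], [2.0, 2.5, 5.0]))
```

A mean error close to zero combined with a large standard deviation would indicate forecasts that are unbiased on average but unreliable for individual countries.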

4. DISCUSSION

Many economic growth models are motivated by economic theory and contain common

theoretical assumptions about, e. g., perfect competition (Ventura, 2005). There is reason to

question whether this method is optimal for constructing growth models. Karl Popper's strict,

generic, and unambiguous demarcation line between science and non-science (metaphysics or

pseudo-science) has since its first publication in 1934 widely gained respect in the scientific

community as a fundamental part of the hypothetico-deductive method. A primary criterion

for a scientific theory according to this demarcation method is that the theory must be

refutable, i.e., it must be possible to falsify the theory with evidence of the opposite. A

subsequent criterion is that the theory must pass some kind of empirical test. These criteria

have laid the ground for substantial progress in many academic disciplines. Irrefutable

constructs may very well have a heuristic value or play other important roles in inspiring the

future development of scientific theories. However, only refutable scientific theory has the

potential to forecast and to explain relationships by including probable outcomes and

excluding improbable outcomes. Separating scientific theory from metaphysics and

specifying their separate roles have the potential to improve the forecasting power and thus

the understanding of scientific relationships. Every time a scientific theory is refuted and

improved, the new theory brings us one step closer to the unattainable goal: the "truth"

(Popper, 1972).

The need to use refutability as a scientific criterion in economics has been acknowledged by

many economists (e. g., Hutchison, 1938; Blaug, 1980; Eichner, 1985; Stanley, 1998;

Bernhofen, 2005). Those who oppose this criterion (see, e. g., Hands, 2001 and references

therein) have thus far failed to present an alternative demarcation line which unambiguously

classifies astrology, alchemy and religion as non-science. Many economic theories which

have been found irrefutable assume conditions that cannot be observed, such as steady-state,

perfect competition and ceteris paribus, or contain empirically empty assumptions such as

rational behaviour; although every kind of behaviour may be considered rational by those who

display it (Hutchison, 1938; Blaug, 1980).

An example of an irrefutable theory is the well-known principle of comparative advantage.

This principle can be demonstrated by a logical exercise which shows that trade is always

beneficial for everyone involved. Evidently, no real observations of trade being harmful (e. g.,

arms trade between countries at war with each other, or narcotics trade with a subsequent

increase in drug addiction and social costs) has the potential to falsify, or even affect, this

logical exercise. In other words, the principle has no empirical content and cannot convey any

information about economic relationships in the real world. Bernhofen (2005) objected that

the principle of comparative advantage is indeed refutable, since the opposite of the principle

(trade being harmful) is a possible outcome. However, refutability requires that a theory does

not only have an opposing hypothesis, but that the theory may also be falsified with evidence

of the opposite (Popper, 1972). Opposing evidence is therefore more important than

confirmative evidence for a refutability test. Consequently, the principle of comparative

advantage can certainly be used to illustrate the correlation between openness to trade and

growth, as found in Table 2, but the principle should not be trusted as a pathfinder in the quest

for forecasting power. Likewise, neo-classical growth theory, which forecasts growth

convergence, hinges on conditions that have yet to be observed, such as perfect competition

and market equilibrium. As a result, observed growth divergence cannot possibly refute the

convergence hypothesis (Meeusen, 2003), because convergence can always be suspected to

occur sometime in the near or distant future, even if there are no present signs of it.

This line of reasoning brings us back to the reasons for not allowing YG to enter as an x-

variable with a negative sign in a multiple regression with yG as a y-variable. This choice has

been made in the present study, as opposed to, e. g., Gallup et al. (1999) and references

therein. The first reason why this choice was made here is that such a regression would

forecast that growth is lower in rich countries than in poor countries, which would contradict

Table 2, which instead forecasts that the relationship between YG and yG is insignificant, and,

if anything, positive (also supported by Barro, 1996). Furthermore, the analysis of the OpenG

vs. yG relationship showed that the regression slope was rather constant all along the YG gradient,

indicating no significant effect from YG in this respect. Second, the "normal" reason for

ignoring results from individual correlations when constructing multi-variate models consists

of references to economic theory (DeLurgio, 1998). The convergence theory, which would be

needed to justify the use of YG in a negative correlation with yG, is irrefutable, as has been

demonstrated above, and therefore ill-suited for forecasting. Third, since YG is non-stationary

and yG is its derivative, any linear combination between YG and yG will have a non-stationary

residual with an infinite variance (Jones, 1995; Clements and Hendry, 1999). Instead of using

YG as a determinant of yG, a non-linear model can be constructed which varies according to

YG. Table 3 indicates that such a model can increase the forecasting power compared to a

linear model, since the R2 value increased when the dataset was divided into groups according

to YG values.

Future use of the criteria suggested in 2.1 is also motivated by the observation that when these

criteria were violated, several random variables could enter as significant determinants of

GDP growth in a multiple regression including historical growth data. This implies that if a

variable which shows no individual correlation with growth is used in a multiple regression,

the contribution of such a variable may very well be spurious. This may explain why many

multivariate growth regressions which generate high R2 values sometimes contain explanatory

variables with contradictory policy implications (Kenny and Williams, 2001). Most of the

multivariate growth models examined by Levine and Renelt (1992) were fragile to small

changes, which should be another reason for questioning the commonly used criteria for

explanatory variables in multiple regression analysis. A final defence of the criteria set in 2.1

is that these criteria were successfully used to develop Equation 2, whose forecasts/hindcasts

were not less certain than the IMF's forecasts (Figure 4).

Methodological inconsistencies similar to those discussed above may have been fed into

many growth regressions, as well as into other types of forecasting models. Conspicuously

enough, the IMF's growth forecasts were found in this study to be of equally poor quality as

simple forecasts and hindcasts constructed from historical statistics (see Equation 2 and

Figures 2-4). These results are consistent with the findings regarding the reliability of the

World Bank's (Verbeek, 1999) and the OECD's (Pons, 1999) growth forecasts in the 1990s. In

an extensive review, Fildes and Stekler (2002) found that typical one-year growth forecasts

deviate slightly more than 1 percentage unit from outcome, which is similar to the findings in

this work (1.1 percentage units). Growth forecasts tend to be more uncertain the longer the

time-horizon (Pons, 1999) and many long-term forecasts have turned out to be grossly

inaccurate (Kenny and Williams, 2001).

Prairie's staircase (Prairie, 1996) provided a useful acceptance limit for forecasting power in

the comparison between growth forecasts/hindcasts and outcome. The IMF's forecasts and

Equation 2 generated different R2 values when regressed against outcome and all values were

below 0.65 (Figures 2 and 3) although both methods yielded rather similar errors (Figure 4).

However, the variation in regression equations from Figures 2 and 3 indicates that the

regression slope must, in addition to the R2 value, be taken into consideration in the

evaluation of forecasting power. None of the other growth regressions examined in this study

exceeded the acceptance limit (see Section 1) of R2 = 0.65. Given that many of the R2 values

in Table 1 may be at least partly additive, it may very well be possible in the future to develop

robust growth models with sufficient forecasting power, although there is reason to question

the prospects of such attempts. Forecasts of growth and other macroeconomic indicators have

not improved over time, despite extensive macroeconomic research (Fildes and Stekler,

2002). Ormerod and Mounfield (2000) argued that the great inherent variability in growth

statistics makes forecast failure inevitable.
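
The point that the regression slope must be considered alongside the R2 value can be sketched in Python as follows. The function evaluate_forecast and the numbers are hypothetical and are not taken from Figures 2-4. The sketch assumes that outcome is regressed on forecast, and it reports the mean absolute error as one possible error measure.

```python
import numpy as np

ACCEPTANCE_R2 = 0.65  # acceptance limit for forecasting power used in the text


def evaluate_forecast(forecast, outcome):
    """Regress outcome on forecast and summarise the fit and the error."""
    forecast = np.asarray(forecast, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    # np.polyfit returns coefficients from the highest degree down
    slope, intercept = np.polyfit(forecast, outcome, 1)
    # For a simple linear regression, R2 is the squared correlation
    r2 = np.corrcoef(forecast, outcome)[0, 1] ** 2
    mae = np.mean(np.abs(forecast - outcome))
    return {"r2": r2, "slope": slope, "intercept": intercept, "mae": mae}


# Made-up growth rates (%), and a forecast that doubles every one of them
outcome = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
doubled_forecast = 2.0 * outcome

stats = evaluate_forecast(doubled_forecast, outcome)
print(stats)
print("passes R2 limit:", stats["r2"] >= ACCEPTANCE_R2)
```

In this constructed case the forecast is perfectly correlated with outcome (R2 = 1) and would pass the acceptance limit, yet the slope of 0.5 and the mean absolute error of 3 percentage units show that every forecast is twice as large as the outcome.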

The forecasting power of growth models should not be seen as a marginal issue. As stressed

in Section 1, the demonstrated shortcomings in contemporary growth forecasting may have

extensive impacts on policy outcome. Global aggregate GDP per capita growth, as well as

GDP per capita growth in many countries, has been substantially lower during recent

decades, a period often referred to as the "age of globalisation" or the "neoliberal order",

compared to preceding decades - despite the fact that policy has often been designed

according to the mainstream view among economists about the causes of growth (Rodrik,

1999; Maddison, 2001; Milanovic, 2003; Weisbrot et al., 2006). If growth forecasts were to

improve in the future, this would imply that the present poor predictive understanding of

economic growth has been part of the reason why the long-term growth during recent decades

has generally not surpassed, or even reached, the high levels achieved during the "Golden

Age" of the 1950s and 1960s.

5. CONCLUSIONS

This study has evaluated the forecasting power of economic growth models and found that all

investigated models yield very uncertain forecasts, and that these findings have strong support

in the literature. The IMF's growth forecasts were significantly different from outcome in 7

years out of 8, and the forecast error was similar to that of a simple statistical model based on

historical data. One reason for the observed poor forecasting power may be the frequent use

of metaphysical constructs in growth models. Another reason may be that many growth

models include explanatory variables which show no individual correlation with growth, or

which are assumed to have an opposite, "concealed" effect relative to what is implied by

the variables' individual correlations with growth. Such growth models may be fragile to

small changes and may even produce contradictory policy implications, in addition to yielding

unreliable forecasts which may in turn cause extensive policy failure.

REFERENCES

Barro, R. J., 1996. Determinants of Economic Growth: A Cross-Country Empirical Study.

NBER Working Paper 5698. NBER, Cambridge, Massachusetts, 118 p.

Bernhofen, D. M., 2005. The Empirics of Comparative Advantage: Overcoming the Tyranny

of Nonrefutability. Review of International Economics, 13: 1017-1023.

Blaug, M., 1980. The Methodology of Economics. Cambridge University

Press, Cambridge, 314 p.

Bolaky, B. and C. Freund, 2004. Trade, Regulations, and Growth. Research working paper

WPS 3255, World Bank, Washington, 40 p.

Bougheas, S., Demetriades, P. O., and Mamuneas, T. P., 2000. Infrastructure, Specialization,

and Economic Growth. The Canadian Journal of Economics, 33: 506-522.

Carkovic, M. and Levine, R., 2002. Does Foreign Direct Investment Accelerate Economic

Growth? Working paper, University of Minnesota.

CID, 2007. http://www.cid.harvard.edu/ciddata/ciddata.html

Clements, M. P., and Hendry, D. F., 1999. Forecasting Non-stationary Economic Time Series.

The MIT Press, Cambridge, Massachusetts, 362 p.

DeLurgio, S. A., 1998. Forecasting principles and applications. Irwin/McGraw-Hill, Boston,

802 p.

Dollar, D. and A. Kraay, 2004. Trade, Growth, and Poverty. The Economic Journal, 114: F22-

Easterly, W., Kremer, M., Pritchett, L., and Summers, L. H., 1993. Good Policy or Good

Luck? Country Growth Performance and Temporary Shocks. Journal of Monetary

Economics, 32: 459-483.

Eichner, A. S., 1985. The lack of progress in Economics. Nature, 313: 427-428.

Fildes, R., and Stekler, H., 2002. The state of macroeconomic forecasting. Journal of

Macroeconomics, 24: 435-468.

Gallup, J.L., Sachs, J.D. and Mellinger, A.D., 1999. Geography and economic development.

International Regional Science Review, 22: 179-232.

Gylfason, T., 2001. Natural resources, education, and economic development. European

Economic Review, 45: 847-859.

Hands, D. W., 2001. Economic methodology is dead - long live economic methodology:

thirteen theses on the new economic methodology. Journal of Economic Methodology, 8: 49-

Hutchison, T. W., 1938. The Significance and Basic Postulates of Economic Theory.

Macmillan and Co., London, 192 p.

IMF, 1998. IMF World Economic Outlook (May issue). IMF, Washington, D. C., 221 p.

IMF, 2002. IMF World Economic Outlook (April issue). IMF, Washington, D. C., 225 p.

Jones, C. I., 1995. Time Series Tests of Endogenous Growth Models. Quarterly Journal of

Economics, 110: 495-525.

Kenny, C. and D. Williams, 2001. What Do We Know About Economic Growth? Or, Why

Don't We Know Very Much? World Development, 29: 1-22.

King, R. G., and Levine, R., 1993. Finance and Growth: Schumpeter Might be Right. The

Quarterly Journal of Economics, 108: 717-737.

Levine, R., and Renelt, D., 1992. A Sensitivity Analysis of Cross-Country Growth

Regressions. The American Economic Review, 82: 942-963.

Levine, R., and Zervos, S., 1998. Stock Markets, Banks, and Economic Growth. The

American Economic Review, 88: 537-558.

Maddison, A., 2001. The World Economy: A Millennial Perspective. OECD, Paris, 388 p.

Meeusen, W., 2003. Economic Convergence, the ‘Stylised Facts of Growth’ and

Technological Progress. An introduction from the perspective of the theory of growth. CESIT

Discussion paper No 2003/03. University of Antwerp, Antwerp, 21 p.

Milanovic, B., 2003. The Two Faces of Globalization: Against Globalization as We Know It.

World Development, 31: 667-683.

Ormerod, P., and Mounfield, C., 2000. Random matrix theory and the failure of macro-

economic forecasts. Physica A: Statistical Mechanics and its Applications, 280: 497-504.

Perotti, R., 1996. Growth, Income Distribution, and Democracy: What the Data Say.

Journal of Economic Growth, 1: 149-187.

Persson, T. and Tabellini, G., 1994. Is Inequality Harmful for Growth? Theory and Evidence.

American Economic Review, 84: 600-621.

Pons, J., 1999. Evaluating the OECD's Forecasts for Economic Growth. Applied Economics,

31: 893-902.

Popper, K. R., 1972. Conjectures and Refutations; The Growth of Scientific Knowledge, 4th

ed. Routledge and Kegan Paul, London and Henley, 431 p.

Prairie, Y. T., 1996. Evaluating the Predictive Power of Regression Models. Canadian Journal

of Fisheries and Aquatic Sciences, 53: 490-492.

Rodionov, S. N., and J. E. Overland, 2005. Application of a sequential regime shift detection

method to the Bering Sea ecosystem. ICES Journal of Marine Science, 62: 328-332.

Rodrik, D., 1999. Where Did All the Growth Go? External Shocks, Social Conflict, and

Growth Collapses. Journal of Economic Growth, 4: 385-412.

Stanley, T.D., 1998. Empirical Economics? An Econometric Dilemma with Only a

Methodological Solution. Journal of Economic Issues, 32: 191-218.

Verbeek, J., 1999. The World Bank's Unified Survey Projections: How Accurate Are They?

An Ex-Post Evaluation of US91-US97. Policy Research Working Paper 2071. World Bank,

Washington, D.C., 60 p.

Ventura, J., 2005. A Global View of Economic Growth. In: Aghion, P., and Durlauf, S.

Handbook of Economic Growth, Vol 1B. North Holland, Amsterdam, pp. 1419-1497.

Weisbrot, M., Baker, D., Rosnick, D., 2006. The Scorecard on Development: 25 Years of

Diminished Progress. International Journal of Health Services, 36: 211-234.
