Quantity Economics Project
– Searching for the best fitted consumption model
Bachelor’s Degree of Finance
10/05/2013
Introduction:
To set up a linear regression analysis of the relationship between consumers’ expenditure, personal disposable income and interest rates, the data were collected from NAVIDATA: aggregate consumers’ expenditure and aggregate disposable income at 1990 prices, and the interest rate on Treasury bills, for the UK between 1972 and 1995.
Q1
To gain an overview of the data, line charts of the variables were drawn in Stata. The first chart shows a close relationship between income and consumption.
Figure 1: Consumption and Income curves, consumers' expenditure and personal disposable income (million in 1990 prices) by year, 1970 to 1995. Source: Office for National Statistics licensed under the Open Government Licence v.1.0.
Chart: Interest Rates curves, Treasury Bill yield (%) by year, 1970 to 1995. Source: Office for National Statistics licensed under the Open Government Licence v.1.0.
Further analysis of the relationship between these three variables can be conducted with an understanding of the basic Keynesian consumption function and Surrey’s theory.
The correlation table also shows that income and consumption are closely related, with a correlation coefficient of 0.9934. Interest rates are negatively correlated with the other two variables.
Table 1
The following questions therefore use several methods to test the theory and the fit of the model: t tests on the individual coefficients, a partial F test, an F test for the overall significance of the model, and R-squared.
Q2
a) Having used the correlation coefficient to measure the strength of the linear relationship between income and consumption in the first question, a regression analysis can be carried out to determine which equation fits the data best and to measure the fit of the model.
Table 2
Using the estimated coefficients from the Stata regression output, the regression equation can be presented as follows:
Table 3
Y(predicted) = 0.9988*X2 – 2.8500
St. error:  (0.0264)  (0.7802)
t value:    (37.85)   (-3.60)
P value:    (0.000)   (0.002)
R-squared is an important measure of whether a model is a good fit. It gives the percentage of the variation in the outcome that is explained by the model. In this case R-squared equals 98.83%, which means that 98.83% of the variation in consumption is explained by the variation in income. The coefficient of X2 is 0.9988, so each one-unit increase in income is predicted to raise consumption by 0.9988 units. Because there is only one explanatory variable, a t test of the hypothesis that its coefficient is equal to 0 also indicates the fit of the model. The t statistic can be calculated by the formula t = (estimated coefficient – 0) / standard error, and it is given in the Stata output. For the first model, t is 37.85, which is clearly larger than the critical t value of 2.1098 with 17 degrees of freedom. The corresponding p-value is approximately 0, which is less than 0.05, the level of significance. So we can conclude that the first model is a good fit.
The fitted scatter diagram clearly shows that the regression line fits the actual data, and the 95% confidence band covers most of the observations.
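The t test described above can be reproduced by hand from the reported figures. The following is a minimal Python sketch, not part of the original Stata workflow; the variable names are illustrative, and because it starts from the rounded coefficient and standard error, its t value is close to but not exactly the 37.85 that Stata reports.

```python
# Reported estimates for the first model (rounded to four decimals in the text).
coef_x2 = 0.9988      # estimated coefficient of income (X2)
se_x2 = 0.0264        # standard error of that coefficient
t_critical = 2.1098   # two-sided 5% critical value with 17 d.f., as quoted in the text

# t statistic for H0: coefficient = 0
t_stat = (coef_x2 - 0) / se_x2

# Two-sided decision rule: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_critical

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```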
Figure 2
R-squared and the t test are now used to examine the fit of the second model.
Table 4
The estimated regression equation for the second model is as follows:
Y(predicted) = 0.3471*X3 + 22.3333
St. error:  (0.3809)  (4.5552)
t value:    (0.91)    (4.90)
P value:    (0.375)   (0.000)
The R-squared value for this model is only 4.66%, which means that less than 5% of the variation in the dependent variable, consumption, is explained by the model based on interest rates. The t statistic, 0.91, lies in the acceptance region (-2.1098, 2.1098), so we cannot reject the null hypothesis that the coefficient of X3 is equal to zero. We can therefore also conclude that the model is not a good fit, as its only variable is not significant.
Table 5
b) Because both models have only one independent variable, the degrees of freedom are 17: the number of observations minus 2 (one for the variable and one for the constant). The population standard deviation is estimated by the sample standard deviation and the number of observations is smaller than 30, so the test statistics for the estimated coefficients a1, b1, a2 and b2 follow a t distribution with 17 degrees of freedom. (The distribution is presented in the following graph.)
Figure 3
Figure 4
The null hypothesis for a1 is that a1 is 0, and the alternative hypothesis is that a1 is not 0. The hypotheses for all four coefficients can be written in this form:
For a1: H0: a1=0, Ha: a1≠0;
For b1: H0: b1=0, Ha: b1≠0;
For a2: H0: a2=0, Ha: a2≠0;
For b2: H0: b2=0, Ha: b2≠0;
For all four coefficients, the acceptance region is (-2.1098, 2.1098). The critical region, therefore, is (-∞, -2.1098) together with (2.1098, +∞).
Comparing the t value of each coefficient with the critical value, only the t value of the coefficient of X3, 0.91, lies in the acceptance region. Therefore we reject the null hypothesis at the 5% significance level for a1, b1 and a2, but we cannot reject the null hypothesis for b2. Using p-values gives the same result, as only the p-value of b2 is greater than the significance level of 0.05.
Table 6
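The two-sided decision rule used for the four coefficients can be sketched in a few lines of Python. This is only an illustration: the mapping of a1 and a2 to the constants and b1 and b2 to the slopes of the first and second model is an assumption inferred from the text, and the t values are the rounded ones reported earlier.

```python
# Reported t values; the a/b naming (a = constant, b = slope) is an assumption.
t_values = {"a1": -3.60, "b1": 37.85, "a2": 4.90, "b2": 0.91}
T_CRITICAL = 2.1098  # two-sided 5% critical value with 17 d.f., as quoted in the text


def reject_null(t_stat, t_critical=T_CRITICAL):
    """Return True when t lies in the critical region |t| > t_critical."""
    return abs(t_stat) > t_critical


decisions = {name: reject_null(t) for name, t in t_values.items()}

for name, rejected in decisions.items():
    outcome = "reject H0" if rejected else "cannot reject H0"
    print(f"{name}: t = {t_values[name]:6.2f} -> {outcome}")
```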
c) The marginal propensity to consume (MPC) is the increase in consumption for each one-unit increase in income. People always spend a certain amount of money on maintaining essential daily life, and as their income increases they spend only part of the additional income. The MPC should therefore be greater than 0 and smaller than 1.
However, our model shows an MPC of 0.9988, which is almost one. In addition, the constant is -2.85, which is smaller than 0. To test whether the estimate agrees with the theory, a series of t tests is conducted.
To begin with, two one-sided t tests on the MPC are conducted with α = 0.05.
The first tests whether the MPC is smaller than 1:
H0: b2 <= 1 vs. Ha: b2 > 1
With a 5% significance level and 17 d.f. the critical value is 1.7396, so the critical region is (1.7396, +∞) and we reject the null when the t statistic is greater than 1.7396. Using the formula t = (b2(estimated) – 1) / se, the t statistic is -0.04718055. It lies in the acceptance region, so we cannot reject the null at α = 0.05.
Table 7
The second tests whether the MPC is greater than 0:
H0: b2 >= 0 vs. Ha: b2 < 0
Because this is a left-tailed test, the critical value is -1.7396 and the critical region is (-∞, -1.7396). We reject the null when the t statistic is less than -1.7396. The t statistic, 37.850918, is larger than -1.7396 and lies in the acceptance region, so we again cannot reject the null at α = 0.05.
Neither hypothesis can be rejected, so according to the sample data the MPC is between 0 and 1, which agrees with the theory.
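The two one-sided tests on the MPC can be retraced with a short Python sketch. It is an illustration under stated assumptions: it uses the rounded coefficient and standard error from the first model, so the first t value comes out near -0.045 rather than the -0.04718055 that Stata computes from unrounded estimates, and the variable names are hypothetical.

```python
# Rounded estimates from the first model (Stata works with unrounded values).
mpc_hat = 0.9988             # estimated MPC (coefficient of X2)
se_mpc = 0.0264              # its standard error
t_crit_one_sided = 1.7396    # one-sided 5% critical value with 17 d.f., as quoted

# Test 1: H0: MPC <= 1 against Ha: MPC > 1 (right-tailed).
t_upper = (mpc_hat - 1) / se_mpc
reject_upper = t_upper > t_crit_one_sided

# Test 2: H0: MPC >= 0 against Ha: MPC < 0 (left-tailed).
t_lower = (mpc_hat - 0) / se_mpc
reject_lower = t_lower < -t_crit_one_sided

print(f"H0: MPC <= 1  t = {t_upper:.4f}  reject: {reject_upper}")
print(f"H0: MPC >= 0  t = {t_lower:.4f}  reject: {reject_lower}")
```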
Q3
Y(predicted) = 1.012296*X2 - 0.0818858*X3 - 2.29054
St. error:  (0.0251396)  (0.0402461)  (0.7677111)
t value:    (40.27)      (-2.03)      (-2.98)
P value:    (0.000)      (0.059)      (0.009)
a) H0: b2 = 0.95 vs. Ha: b2 < 0.95
The significance level is given as 5% and the model has 16 degrees of freedom, so the critical region, computed in Stata, is (-∞, -1.7459). Using the formula t = (b2(estimated) – 0.95) / se, the t statistic for the hypothesis is t = 2.4779952. Because the t statistic is greater than the critical value, it lies in the acceptance region, and we cannot reject the null hypothesis at the 5% significance level that b2 is equal to or greater than 0.95.
Table 8
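As a check on the arithmetic of this left-tailed test, here is a small Python sketch based on the reported coefficient and standard error of X2. The names are illustrative, and the result agrees with the Stata figure of 2.4779952 only up to rounding of the inputs.

```python
# Reported estimates for X2 in the two-variable model.
b2_hat = 1.012296       # estimated coefficient of income
se_b2 = 0.0251396       # its standard error
b2_null = 0.95          # value of b2 under the null hypothesis
t_crit_left = -1.7459   # left-tail 5% critical value with 16 d.f., as quoted

# t statistic for H0: b2 = 0.95 against Ha: b2 < 0.95
t_stat = (b2_hat - b2_null) / se_b2

# Left-tailed rule: reject H0 only when t falls below the critical value.
reject_h0 = t_stat < t_crit_left

print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```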
b) The 95% confidence interval is an interval constructed so that, in repeated sampling, 95% of such intervals would contain the true parameter. With 16 degrees of freedom the critical t value for a 95% confidence interval is 2.1199, so P(-2.1199 < (b3(estimated) – b3) / se < 2.1199) = 95%. The coefficient b3 therefore lies between b3(estimated) – 2.1199*se and b3(estimated) + 2.1199*se, yielding the confidence interval (-0.16720351, 0.00343191).
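The interval can be recomputed from the reported coefficient and standard error of X3. The sketch below is a plain Python check with illustrative variable names; it assumes the interval refers to the X3 coefficient, which is the only coefficient whose estimate and standard error reproduce the reported bounds.

```python
# Reported estimates for X3 in the two-variable model.
b3_hat = -0.0818858     # estimated coefficient of the interest rate
se_b3 = 0.0402461       # its standard error
t_crit = 2.1199         # two-sided 5% critical value with 16 d.f., as quoted

# Confidence interval: estimate plus or minus critical value times standard error.
margin = t_crit * se_b3
ci_lower = b3_hat - margin
ci_upper = b3_hat + margin

print(f"95% CI for b3: ({ci_lower:.8f}, {ci_upper:.8f})")
```

Because the interval contains 0, it points to the same conclusion as the p-value of 0.059 for this coefficient.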
c)
i) The F test for overall model fit also gives a clear view of whether the model is a good one. The null hypothesis is that all of the slope coefficients are equal to 0. The F statistic is computed as the mean square of the model divided by the mean square of the residual, F = MS(model) / MS(residual). It is 850.72, which is greater than the critical F value at the 5% significance level, 3.6337235, with 2 degrees of freedom in the numerator and 16 degrees of freedom in the denominator (calculated with the invFtail function in Stata). In addition, its corresponding p-value, approximately 0, is less than 0.05. So we can reject the null hypothesis at the 5% significance level and conclude that the model fits the actual data.
ii) The residual sum of squares (RSS) is the sum of the squared differences between the actual outcomes and the outcomes estimated by the model. It can also be computed as the total sum of squares minus the model sum of squares, RSS = TSS – MSS. The smaller the RSS is compared with the TSS, the larger the R-squared and the better the model fit. It is also called the sum of squared errors of prediction. The RSS in this regression model is 3.1024553, which indicates that only a small amount of variation is not explained by the model.
The residual plot shows the differences between the actual data and the predicted data as follows:
Figure 5
d)
The variance-covariance matrix can be deduced from the following formula: Var(b) = s^2 * (X’X)^(-1), where s^2 is the estimated error variance (the residual mean square). The entry in the ith row and ith column of the (X’X)^(-1) matrix multiplied by s^2 is the variance of the ith coefficient. E.g. the variances of the coefficients of x2 and x3 and of the constant are 0.000632, 0.00161975 and 0.58938034 respectively. The off-diagonal entries are covariances: the entry in position (i, j) is the covariance between the ith and jth coefficient estimates.
Table 9
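The diagonal of the variance-covariance matrix can be cross-checked against the regression output, because each variance is the square of the corresponding standard error. A minimal Python sketch using the standard errors reported for the two-variable model (the dictionary keys are illustrative labels):

```python
# Standard errors reported for the two-variable model.
standard_errors = {"x2": 0.0251396, "x3": 0.0402461, "constant": 0.7677111}

# Variances quoted in the text from the variance-covariance matrix.
reported_variances = {"x2": 0.000632, "x3": 0.00161975, "constant": 0.58938034}

# Each diagonal entry of the matrix equals the squared standard error.
variances = {name: se ** 2 for name, se in standard_errors.items()}

for name, var in variances.items():
    print(f"{name}: se^2 = {var:.8f}, reported = {reported_variances[name]:.8f}")
```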
e)
Parameter:  b2           b3             b1
P value:    (0.000)      (0.059)        (0.009)
Decision:   Significant  Insignificant  Significant
(Rule: if the P value is smaller than the significance level, the sample result is significant enough to reject the null hypothesis.)
The conclusion is that b1 and b2 are significantly different from 0 at α = 0.05, while b3 is not significantly different from 0 according to the sample at the 5% significance level.
f) After generating a time variable t, a new regression can be run as follows:
Table 10
The model:
g) To compare two models with different numbers of variables, the adjusted R-squared is used. R-squared is computed as 1 minus the residual sum of squares divided by the total sum of squares. Because the models may have different numbers of variables, the adjusted R-squared uses the residual mean square and the total mean square instead of RSS and TSS. The models with and without the time trend are both good fits according to their F statistics (both are greater than the critical F value of 3.2873821) and R-squared (both explain more than 99% of the variation).
However, the model with the time trend performs better on adjusted R-squared: its adjusted R-squared is 99.34%, compared with 98.95% for the former model. So the second model, which includes the time trend, is the best model so far.
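The link between the F statistic, R-squared and adjusted R-squared can be illustrated for the model without the time trend. The Python sketch below relies on the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)); the sample size n = 19 is not stated in the text and is inferred here from the 16 residual degrees of freedom with two regressors and a constant.

```python
# Figures for the model with X2 and X3 (no time trend).
f_stat = 850.72   # overall F statistic reported in Q3 c)
k = 2             # number of regressors (X2 and X3)
n = 19            # ASSUMPTION: inferred from 16 residual d.f. = n - k - 1
df_resid = n - k - 1

# Invert F = (R2 / k) / ((1 - R2) / df_resid) to recover R-squared.
r_squared = (k * f_stat) / (k * f_stat + df_resid)

# Adjusted R-squared replaces RSS and TSS with their mean squares.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_resid

print(f"R-squared = {r_squared:.4f}, adjusted R-squared = {adj_r_squared:.4f}")
```

The adjusted value agrees with the 98.95% quoted for the model without the time trend, and the gap between the two measures widens as more regressors are added for a fixed sample size.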
h) H0: b3 = b4 = 0 (both b3 and b4 are equal to 0)
Ha: b3≠0 or b4≠0 (at least one of the parameters is not equal to 0)
Using the partial F test, the critical value can be found with the “invFtail()” function in Stata. It is 3.6823203 at the 5% significance level with 2 degrees of freedom in the numerator and 15 degrees of freedom in the denominator. Therefore the critical region is (3.68, +∞) and the acceptance region is (0, 3.68), and we reject the null at the 5% significance level if the F value lies in the critical region. The F value can be calculated according to the formula F = ((RSS(restricted) – RSS(unrestricted)) / 2) / (RSS(unrestricted) / 15), yielding F = 8.6045. Because F is greater than 3.68 and its associated p-value is less than 0.05, we reject the null at the 5% significance level. We can therefore conclude that at least one of the parameters b3 and b4 is significantly different from zero.
Q4. The model can be written in the following form:
Comparing the forecast values with the actual data, it is clear that the predicted values are relatively higher than the actual data. An unpredicted drop in expenditure after 1991 leads to the over-prediction. Figure 7 shows the difference between the predicted and the actual data.
Figure 6
Figure 7
Q5.
The log-linear model is based on a regression in which both the dependent variable and the independent variables are in logarithms. The coefficient of each independent variable represents the elasticity of the dependent variable with respect to that variable, that is, the percentage change in the dependent variable for each one percent change in the independent variable.
The log-linear models with and without the time trend are listed as follows:
(Data from Table 11)
The overall fit of both models can be examined by their F statistics and R-squared. Both models have very small residual sums of squares, 0.008691539 and 0.008442626 respectively, and consequently both have a high R-squared: 98.88% for the model without the time trend and 98.91% for the model with it. The F statistics are also significantly higher than the critical values, 3.4668 and 3.0984 respectively. So both models are good fits. However, the model without the time trend outperforms the other on adjusted R-squared, which is slightly higher at 98.78% against 98.75%.
Table 11
Data for both models are listed as follows:
Table 12
Q6.
Surrey’s theory considers several variables, including inflation and the consumption and income of the previous year (lag structure). I think those variables can be significant in the consumption model. Inflation is an index that measures the change in the price level, and it affects consumption negatively. Duesenberry’s theory of the “ratchet effect” explains that the previous peak income and consumption (consumption habits) also affect present consumption. In addition, I would also take GDP into consideration, as it is an important index that reflects overall economic conditions and can play a significant role in changes in people’s consumption.
The regression output is shown in Table 13.
The model that I have built includes 6 variables: income, interest rates, inflation (X4), GDP, and the income and consumption of the previous year. It is built in log-linear form to reduce the fluctuation of the model and to present the elasticity of consumption with respect to each variable. The modelling output also shows that it is the best-fitting model compared with the models above. It has 24 observations, like the other models. The R-squared is as high as 0.9982, so 99.82% of the variation in consumption is explained. The adjusted R-squared, which is usually used for comparison, outperforms all the other models at 0.9976.
The comparison between the new model and the best model we used in section (5) will show that this new model is the best fit.
I conduct a partial F test on the added variables: inflation, GDP, past income and past consumption. The resulting F statistic, 22.01, is greater than the critical value of 2.96 with 4 d.f. in the numerator and 17 d.f. in the denominator. So we reject the hypothesis that the coefficients of the new variables are all equal to 0, and these variables cannot be dropped from the model. Of the two models, the new model is therefore chosen.
Conclusion:
In this regression analysis of the relationship between consumption and other variables, including income, interest rates, inflation and GDP, we discussed the theories behind the consumption function, such as the Keynesian consumption theory, the MPC theory and Surrey’s theory. The t test, R-squared and the F test are used to test the significance of the coefficients and the overall model fit. After evaluating different variables, we build a new model and can conclude that income, interest rates, inflation, GDP, and past income and consumption are significant variables in explaining changes in consumers’ expenditure.
/* Quantity Economics Project*/
clear
cd E:\stata /* Here is the directory
of my project*/
capture log close
log using StataProject.log, replace
insheet using StataProject.csv
/***************************** Section One
**************************/
replace y = y / 10000
replace x2 = x2 / 10000
lab var year "year"
lab var y "Consumers' expenditure"
lab var x2 "Personal disposable income"
lab var x3 "Interest rates"
/*The first Graph for income and consumption*/
graph twoway line y x2 year, title("Consumption and Income curves") ///
    subtitle(" ") ytitle("Million in 1990 prices") name(gr1, replace) ///
    note("Source: Office for National Statistics licensed under the Open Government Licence v.1.0.")
/*The second Graph for interest rate*/
line x3 year, title("Interest Rates curves") subtitle(" ") ///
    ytitle("Treasury Bill yield %") name(gr2, replace) ///
    note("Source: Office for National Statistics licensed under the Open Government Licence v.1.0.")
cor y x2 x3
/***************************** Section Two
**************************/
/*Question a. */
regress y x2 if year <= 1990
estimates store model1
estimates table model1, se
/*The third graph for model1 fit*/
gr tw (sc y x2) (lfitci y x2) if year <= 1990, ///
    title("Regression model for Income and consumption") subtitle(" ") ///
    ytitle("Consumers' expenditure") name(gr3, replace) ///
    note("Source: Office for National Statistics licensed under the Open Government Licence v.1.0.")
regress y x3 if year <= 1990
est store model2
est table model2, se
/*The fourth graph of model2 fit*/
gr tw (sc y x3) (lfitci y x3) if year <= 1990, ///
    title("Regression model for Interest rates and consumption") subtitle(" ") ///
    ytitle("Consumers' expenditure") name(gr4, replace) ///
    note("Source: Office for National Statistics licensed under the Open Government Licence v.1.0.")
/*Question b. */
/*The fifth graph for T distribution*/
tw function y=tden(17,x), range(-4 -2.1098) color(gs5) recast(area) ///
    || function y=tden(17,x), range(2.1098 4) color(gs5) recast(area) ///
    || function y=tden(17,x), range(-4 4) legend(off) ytitle("percentage") ///
    title("Student's T distribution curve") subtitle(" ") ///
    note("t distribution with 17 degree of freedom and 5% significance level") ///
    name(gr5, replace)
/*Table for model parameters (model1 and model2)*/
est table model1 model2, se t p
/*Question c. */
/*Generate a table for t test*/
quietly reg y x2 if year <= 1990
ereturn display
/*The MPC test uses model1 (y on x2); a[1,1] is the estimate, a[1,2] its variance*/
quietly est table model1
mat a = r(coef)
/*T test on MPC*/
display "T test for the null: b2 <= 1 against b>1"
display "t = ( b2(estimated)-1 )/ (se) "
di "t = " (a[1,1]-1)/a[1,2]^0.5
display "T test for the null: b2 >= 0 against b<0"
display "t = ( b2(estimated)-0 )/ (se) "
di "t = " (a[1,1]-0)/a[1,2]^0.5
/***************************** Section Three
**************************/
/*Question a.*/
reg y x2 x3 if year <= 1990
matrix coef = e(b)
display "Model: " "Y =" coef[1,1] " X2 " coef[1,2] " X3 " coef[1,3]
/*Question b.*/
est store model3
quietly est table model3
mat a = r(coef)
display "T test for the null: b2 = 0.95 against b<0.95"
display "t = ( b2(estimated)-0.95 )/ (se) "
di "t = " (a[1,1]-0.95)/a[1,2]^0.5
/*Question c.*/
/*The sixth graph for the residuals*/
rvfplot, yline(0) title("Residual of predicted consumption") subtitle(" ") ///
    xtitle("Predicted consumption") name(gr6, replace)
/*Question d.*/
estat vce
/*Question e.*/
est table model3, p
/*Question f.*/
gen t = _n
lab var t "time"
reg y x2 x3 t if year <= 1990
est store model4
mat b = e(b)
display "Model: " "Y =" b[1,1] "*X2 " b[1,2] "*X3 " b[1,3] "*t " b[1,4]
/*Question h.*/
test x3 t
/***************************** Section Four
**************************/
predict yhat
lab var yhat "Predicted Consumption"
/*The seventh graph for prediction value*/
line y year || line yhat year if year >= 1991, ///
    title("Predicted curve from 1991 to 1995") subtitle(" ") ///
    ytitle("Consumption") legend(off) name(gr7, replace)
list year y yhat if year >= 1991
gen dfy = y - yhat
lab var dfy "Difference between y and yhat"
sc dfy yhat if year > 1990, mlabel(year) ylabel(-3(0.5)1) ///
    xlabel(35(1)40) yline(0) title("Residuals in the predicted period") ///
    subtitle(" ") ytitle("difference between y and yhat") ///
    xtitle("Predicted consumption") name(gr8, replace)
/************************** Section Five
****************************/
gen logy = log(y)
gen logx2 = log(x2)
gen logx3 = log(x3)
lab var logy "Log form of consumption"
lab var logx2 "Log form of income"
lab var logx3 "Log form of interest rates"
reg logy logx2 logx3
est store loglinear1
reg logy logx2 logx3 t
est store loglinear2
est table loglinear1 loglinear2
/***************************** Section Six
***************************/
clear
insheet using q6.csv
replace y = y / 10000
replace x2 = x2 / 10000
replace g = g / 10000
gen logy = log(y)
gen logx2 = log(x2)
gen logx3 = log(x3)
gen logg = log(g)
gen lagrpi = rpi[_n-1]
gen x4 = rpi/lagrpi-1
gen logx4 = log(x4)
gen lagy = y[_n-1]
gen lagx2 = x2[_n-1]
gen loglagy = log(lagy)
gen loglagx2 = log(lagx2)
drop if _n <= 2
gen t = _n
reg logy logx2 logx3 logx4 loglagy loglagx2 logg
/*************************** Finished
*************************/
Bibliography
1) Annotated output for Stata. UCLA: Statistical Consulting Group. From http://www.ats.ucla.edu/stat/AnnotatedOutput/ (accessed May 01, 2013).
2) Douglas A. Lind, William G. Marchal, Samuel Adam Wathen (2010) Basic Statistics for Business and Economics, Student Edition with Formula Card, 5th edn., McGraw-Hill Higher Education.
3) Thomas R L (1993) Introductory Econometrics: Theory and Applications, 2nd edn., Longman, chapter 10 “Consumption Functions”, pp 247-269.
4) Michael Barrow (2009) Statistics for economics, accounting and business studies [electronic resource], 5th edn., Prentice Hall/Financial Times.