Course Paper On Regression Analysis Of Gold Prices

Date post: 14-Apr-2015
Upload: vineet-singh
Description:
Interesting Analysis of What Affects Gold Prices Using Multiple Linear Regression
Page 1: Course Paper On Regression Analysis Of Gold Prices

Multiple Linear Regression Model

For Predicting GOLD Prices

Project report submitted to

Prof. Dhiman Bhadra

Associate: Kshitij G Trivedi

For the requirements of the course

Probability and Statistics – 3

On October 25, 2012

By

Group E12

Vineet Singh

Sonam Yadav

Saurabh Phelix Kachhap

Arun Kumar K

Rohit Raj

Nikhil Pandey

Contents

1. Introduction
2. Data Description
3. Exploratory Analysis
4. Regression Modeling
   a. Model Building
   b. Coefficient of Multiple Determination
   c. T-tests of Regression Coefficients
   d. Residual Analysis
   e. Multicollinearity Correction
   f. Durbin Watson Test
   g. Positive Autocorrelation Correction: Lagged Variable
   h. Model Validation
5. Conclusion
6. Further Improvement

1. Introduction

Gold has attracted interest from all sorts of investors, including several central banks, the IMF, hedge fund managers, and retail investors, especially in India. In the last 5 years, gold has generated cumulative returns of about 130%, versus about -9% for the S&P 500 Index. Given its growing importance as an asset class, we intend to predict gold prices using the regression techniques learnt during the course.

2. Data Description

Gold prices have been empirically observed to be related to several macroeconomic factors. For our analysis we have considered the following key variables:

1. Value of Dollar (Euro v/s US Dollar): Since gold prices are denominated in US dollars and the US dollar is the reserve currency, demand for gold increases when the US dollar loses value.

2. Equity Indices (S&P 500 Index and BNY Mellon BRICs ADR Index): A lower correlation between gold and equity indices increases demand for gold as a diversifier in a portfolio.

3. Commodity Prices (Thomson Reuters/Jefferies CRB Index): Rising commodity prices signal that inflationary pressures in the economy are building and that the purchasing power of the dollar is declining, so the gold price should rise as well.

4. Monetary Policy (US M1 Money Supply): Expansive monetary policy increases the risk of high inflation, thus increasing demand for gold as an inflation hedge.

5. Inflation Expectations (University of Michigan Expected Inflation): Rising inflation increases demand for ‘real’ assets like gold as investors seek to hedge themselves against erosion in the real value of money.

6. Interest Rates (US Treasury Rate 1 year): Higher real interest rates increase the opportunity cost of holding gold and decrease its demand.

Our data sources are listed in Table 1.

S No  Factors affecting demand  Independent variable in multiple regression  Data source
1     Dollar Value              USD v/s EURO Exchange Rate                   REUTERS
2     Equity Indices            S&P 500 Index                                REUTERS
                                BNY Mellon BRICs ADR Index
3     Commodity Prices          TR/J CRB Index                               REUTERS
4     Monetary Policy           US Money Supply M1                           REUTERS
5     Inflation Expectations    University of Michigan Expected Inflation    REUTERS
6     Interest Rates            US Treasury Rate 1 year                      REUTERS

Table 1: Variables and Data Sources

3. Exploratory Analysis

We have used scatter plots to inspect whether a linear relationship exists between gold prices and each of the explanatory variables identified. Plotting the gold price against Value of Dollar (USD/EUR Rate), Equity Indices (S&P 500 Index and BNY Mellon BRICs ADR Index), Commodity Prices (CRB Index), Monetary Policy (US M1 Money Supply) and Inflation Expectations, we obtained the following plots.

[Figure: Scatterplot of Gold Price U$/Oz vs USD/EUR Rate]

[Figure: Scatterplot of Gold Price U$/Oz vs S&P 500 Index]

There seems to be a positive relationship between the gold price and each of the CRB Index, the BRICs ADR Index and US Money Supply M1, which suggests that the gold price increases when any of these variables increases.

The scatter plot of gold price against value of dollar shows a negative relationship, which implies that the gold price increases when the value of the dollar decreases.

For Treasury Rates, the scatter plot obtained below does not allow us to draw any conclusion in terms of linearity.

We therefore plotted the gold price against Ln (Treasury Rate), that is, the Treasury rate taken in natural logarithmic terms. The resulting scatter plot shows an approximately linear relationship between gold prices and Ln (Treasury Rate). Since the slope is negative, we can infer that the gold price and Ln (Treasury Rate) are negatively associated: as Ln (Treasury Rate) increases, the gold price decreases.

[Figure: Scatterplot of Gold Price U$/Oz vs BRICs ADR Index]

[Figure: Scatterplot of Gold Price U$/Oz vs CRB Index]

[Figure: Scatterplot of Gold Price U$/Oz vs M1]

[Figure: Scatterplot of Gold Price U$/Oz vs Expected Inflation]

Following is the scatter plot matrix for the gold price data set.

Owing to the variation in the ranges of the plots, no pair of variables shows a perfect positive or negative relationship, though most can be said to show a somewhat positive relation, as depicted below.

A few plots show a positive association and a few others a negative one, which can be misleading. To get the correct picture, we need to control for the other explanatory variables, and this is done with the help of a multiple regression model.

[Figure: Scatterplot of Gold Price U$/Oz vs Treasury Rate]

[Figure: Scatterplot of Gold Price U$/Oz vs LN_Treasury Rate]

[Figure: Matrix Plot of Explanatory Variables (USD/EUR Rate, S&P 500 Index, CRB Index, M1, BRICs ADR Index, Expected Inflation, LN_Treasury Rate)]

In addition to the scatter-plot matrix, a correlation matrix was constructed depicting the

correlation coefficients between the response and the predictors and also between the predictors.

Each cell shows the correlation coefficient, with its p-value in parentheses on the line below.

                    Gold Price  USD/EUR    S&P 500    CRB        M1         BRICs ADR  Expected
                    U$/Oz       Rate       Index      Index                 Index      Inflation
USD/EUR Rate        -0.636
                    (0.000)
S&P 500 Index       0.209       -0.445
                    (0.022)     (0.000)
CRB Index           0.602       -0.833     0.707
                    (0.000)     (0.000)    (0.000)
M1                  0.954       -0.573     0.090      0.476
                    (0.000)     (0.000)    (0.331)    (0.000)
BRICs ADR Index     0.814       -0.783     0.578      0.834      0.659
                    (0.000)     (0.000)    (0.000)    (0.000)    (0.000)
Expected Inflation  0.296       -0.501     0.608      0.794      0.179      0.540
                    (0.001)     (0.000)    (0.000)    (0.000)    (0.052)    (0.000)
LN_Treasury Rate    -0.756      0.245      0.380      -0.030     -0.808     -0.350     0.138
                    (0.000)     (0.007)    (0.000)    (0.743)    (0.000)    (0.000)    (0.136)

Table 2: Correlation Matrix
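
A correlation matrix with p-values of this kind can be sketched in Python as an alternative to MINITAB. The snippet below is only an illustration: the three series are randomly generated stand-ins (not the project data), and the helper name `correlation_table` is hypothetical. It relies on `scipy.stats.pearsonr`, which returns the correlation coefficient together with its two-sided p-value.

```python
import numpy as np
from scipy import stats


def correlation_table(data):
    """Return {(row, col): (r, p)} for every pair below the diagonal."""
    names = list(data)
    table = {}
    for i, row in enumerate(names):
        for col in names[:i]:
            r, p = stats.pearsonr(data[row], data[col])
            table[(row, col)] = (float(r), float(p))
    return table


# Hypothetical stand-in series with 119 observations (NOT the project data).
rng = np.random.default_rng(0)
m1 = np.linspace(1200.0, 2200.0, 119)
gold = 1.6 * m1 - 1650.0 + rng.normal(0.0, 100.0, size=119)
sp500 = rng.normal(1200.0, 150.0, size=119)

data = {"Gold Price": gold, "M1": m1, "S&P 500 Index": sp500}
table = correlation_table(data)
for (row, col), (r, p) in table.items():
    print(f"{row:>14} vs {col:<11} r = {r:6.3f}  p = {p:.3f}")
```

Only the lower triangle is computed, mirroring the layout of Table 2.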

4. Regression Modeling

a. Model Building

In the scatter plots we related the gold price to a single explanatory variable at a time. In real-life applications, however, the gold price depends on more than one explanatory variable. Hence, all the explanatory variables were taken into account at once to estimate the value of the response. Performing multiple regression in MINITAB, we obtained the following regression equation:

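MODEL A:

\[
\begin{aligned}
\widehat{\text{Gold Price}} ={} & -1498 + 572\,(\text{USD/EUR Rate}) - 0.0912\,(\text{S\&P 500 Index}) \\
& + 1.26\,(\text{CRB Index}) + 0.955\,(\text{M1}) + 0.0830\,(\text{BRICs ADR Index}) \\
& - 30.6\,(\text{Expected Inflation}) - 64.7\,\ln(\text{Treasury Rate})
\end{aligned}
\]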

Effect of value of dollar (USD/EUR): Since the slope for the value of dollar is positive, controlling for the rest of the explanatory variables, gold price is positively related to the value of dollar. Specifically, the predicted gold price increases by 572 U$/Oz for every one-unit increase in the USD/EUR rate.

Effect of S&P Index: Since the slope for the S&P Index is negative, controlling for the rest of the explanatory variables, gold price is negatively related to the S&P Index. Specifically, the predicted gold price decreases by 0.0912 U$/Oz for every one-unit increase in the S&P Index.

Effect of CRB Index: Since the slope for the CRB Index is positive, controlling for the rest of the explanatory variables, gold price is positively related to the CRB Index. Specifically, the predicted gold price increases by 1.26 U$/Oz for every one-unit increase in the CRB Index.

Effect of money supply (M1): Since the slope for money supply is positive, controlling for the rest of the explanatory variables, gold price is positively related to money supply. Specifically, the predicted gold price increases by 0.955 U$/Oz for every one-unit increase in money supply.

Effect of BRICs ADR Index: Since the slope for the BRICs ADR Index is positive, controlling for the rest of the explanatory variables, gold price is positively related to the BRICs ADR Index. Specifically, the predicted gold price increases by 0.0830 U$/Oz for every one-unit increase in the BRICs ADR Index.

Effect of expected inflation: Since the slope for expected inflation is negative, controlling for the rest of the explanatory variables, gold price is negatively related to expected inflation. Specifically, the predicted gold price decreases by 30.6 U$/Oz for every one-unit increase in expected inflation.

Effect of US Treasury Rate: Since the slope for the natural log of the US Treasury rate is negative, controlling for the rest of the explanatory variables, gold price is negatively related to the US Treasury rate. Specifically, the predicted gold price decreases by 64.7 U$/Oz for every one-unit increase in the natural log of the US Treasury rate.
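
The "one-unit increase" reading of each slope can be checked numerically. The sketch below hard-codes the Model A coefficients quoted in the interpretation above (constant -1498 from the regression output) and evaluates the fitted equation at an arbitrary, hypothetical set of inputs. The function name `predict_gold` and the input values are illustrative assumptions, not part of the original analysis.

```python
import math

# Model A coefficients as quoted in the report (rounded MINITAB output).
MODEL_A = {
    "const": -1498.0,
    "usd_eur": 572.0,
    "sp500": -0.0912,
    "crb": 1.26,
    "m1": 0.955,
    "brics_adr": 0.0830,
    "exp_inflation": -30.6,
    "ln_treasury": -64.7,
}


def predict_gold(usd_eur, sp500, crb, m1, brics_adr, exp_inflation, treasury_rate):
    """Evaluate Model A; treasury_rate must be positive because it enters as a log."""
    c = MODEL_A
    return (c["const"]
            + c["usd_eur"] * usd_eur
            + c["sp500"] * sp500
            + c["crb"] * crb
            + c["m1"] * m1
            + c["brics_adr"] * brics_adr
            + c["exp_inflation"] * exp_inflation
            + c["ln_treasury"] * math.log(treasury_rate))


# Hypothetical inputs chosen only to exercise the equation.
base = predict_gold(0.8, 1400.0, 300.0, 2200.0, 3000.0, 3.0, 0.2)
crb_up = predict_gold(0.8, 1400.0, 301.0, 2200.0, 3000.0, 3.0, 0.2)
rate_doubled = predict_gold(0.8, 1400.0, 300.0, 2200.0, 3000.0, 3.0, 0.4)

print(f"baseline prediction:        {base:.2f}")
print(f"effect of CRB Index + 1:    {crb_up - base:+.2f}")
print(f"effect of doubling T-rate:  {rate_doubled - base:+.2f}")
```

Because the Treasury rate enters in logs, its coefficient describes the effect of a one-unit change in Ln (Treasury Rate); doubling the rate itself changes the prediction by -64.7 × ln 2.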

b. Coefficient of Multiple Determination

This coefficient measures the proportion of variation in gold prices that is simultaneously explained by the set of predictors (USD/EUR, S&P 500, Expected Inflation, BRICs ADR Index, CRB Index, US Money Supply M1 & US Treasury Rate). It plays the same role as r² in the simple regression setup. Evidently, 0 ≤ R² ≤ 1, with higher values of R² indicating a better fitting model and vice versa. R² is given by

\[
R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}
\]

where SSR, SSE and SST are the regression, residual error and total sums of squares.

However, R² can only increase when additional predictor variables are added to the model. Increasing the predictors will also increase the number of parameters and the computational cost. In order to achieve a tradeoff between these two factors, an adjusted coefficient of multiple determination is used:

\[
R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
\]

where n is the number of observations and k is the number of predictors.

The following results were obtained:

Source           DF        SS        MS        F      P
Regression        7  17463193   2494742   981.56  0.000
Residual Error  111    282119      2542
Total           118  17745312

Table 3: ANOVA

So we have

\[
R^2 = \frac{SSR}{SSR + SSE} = \frac{17463193}{17463193 + 282119} \approx 0.9841
\]

Thus USD/EUR, S&P 500, Expected Inflation, BRICs ADR Index, CRB Index, US Money Supply M1 & US Treasury Rate, taken together, explain about 98.4% of the total variation in gold price.
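
As a cross-check, R², adjusted R² and the F statistic can be recomputed directly from the sums of squares and degrees of freedom in Table 3. This is a minimal sketch; the total sum of squares is taken as the sum of the regression and residual sums of squares, and the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the ANOVA table (Table 3).
ss_regression = 17463193.0
ss_residual = 282119.0
n = 119  # observations (total DF = n - 1 = 118)
k = 7    # predictors   (regression DF)

ss_total = ss_regression + ss_residual

# Coefficient of multiple determination and its adjusted version.
r2 = ss_regression / ss_total
adj_r2 = 1.0 - (ss_residual / (n - k - 1)) / (ss_total / (n - 1))

# Overall F statistic: mean square regression over mean square error.
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

print(f"R-squared          = {r2:.4f}")
print(f"Adjusted R-squared = {adj_r2:.4f}")
print(f"F statistic        = {f_stat:.2f}")
```

The adjusted value is slightly lower than R² because it penalises the seven degrees of freedom used by the predictors.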

c. T-tests of Regression Coefficients

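For testing the significance of each predictor, the null and alternative hypotheses are built as

\[
H_0: \beta_j = 0 \quad \text{versus} \quad H_a: \beta_j \neq 0
\]

The test statistic is given by

\[
t = \frac{b_j}{SE(b_j)}
\]

where b_j is the estimated coefficient of the j-th predictor and SE(b_j) is its standard error. Under the null hypothesis it follows a t distribution with n - k - 1 = 111 degrees of freedom.

The following results were obtained: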

Predictor            Coefficient  SE Coefficient  T statistic  P Value     VIF
Constant               -1498.000         112.900      -13.270    0.000
USD/EUR Rate             571.630          91.170        6.270    0.000   5.019
S&P 500 Index             -0.091           0.054       -1.700    0.092   4.659
BRICs ADR Index            0.083           0.007       11.530    0.000   7.371
CRB Index                  1.261           0.292        4.320    0.000  19.931
M1                         0.955           0.063       15.230    0.000   9.475
Expected Inflation       -30.550          14.240       -2.140    0.034   4.432
Ln (Treasury Rate)       -64.730          14.390       -4.500    0.000  11.337

Table 4: Regression Diagnostics

At the 10% significance level, we can observe from the table that all the p-values are below 0.10. Hence for each predictor we reject the null hypothesis in favour of the alternative. This means that every predictor has a significant effect on gold prices at the 10% significance level.

However, at the 5% significance level, the S&P 500 Index has a p-value of 0.092 and hence is not significant.
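
The t statistics and p-values in Table 4 can be approximately reproduced from the reported coefficients and standard errors. The sketch below uses `scipy.stats.t.sf` for the two-sided p-value with 111 error degrees of freedom. Because the table entries are rounded to three decimals, the recomputed values differ slightly from MINITAB's (for example, the S&P 500 t statistic comes out near -1.69 rather than -1.70). The helper name `t_test` is an illustrative assumption.

```python
from scipy import stats

DF_ERROR = 111  # n - k - 1 = 119 - 7 - 1

# (coefficient, standard error) pairs as printed in Table 4.
table4 = {
    "USD/EUR Rate": (571.63, 91.17),
    "S&P 500 Index": (-0.091, 0.054),
    "BRICs ADR Index": (0.083, 0.007),
    "CRB Index": (1.261, 0.292),
    "M1": (0.955, 0.063),
    "Expected Inflation": (-30.55, 14.24),
    "Ln (Treasury Rate)": (-64.73, 14.39),
}


def t_test(coef, se, df=DF_ERROR):
    """Return the t statistic and two-sided p-value for H0: beta = 0."""
    t_stat = coef / se
    p_value = 2.0 * stats.t.sf(abs(t_stat), df)
    return t_stat, p_value


results = {name: t_test(coef, se) for name, (coef, se) in table4.items()}
for name, (t_stat, p_value) in results.items():
    verdict = "significant" if p_value < 0.05 else "not significant"
    print(f"{name:<20} t = {t_stat:7.2f}  p = {p_value:.3f}  ({verdict} at 5%)")
```

With these inputs only the S&P 500 Index fails the 5% test, in line with the conclusion above.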

d. Residual Analysis

To test the appropriateness of the multiple regression model we use the following procedure:

1) The residuals were plotted against fitted values to test for linearity of the regression model and constancy of the error variance. The residuals fluctuate more or less randomly about 0 with no noticeable trend or change in spread. Hence we conclude that the relationship between gold price and the predictors can be assumed to be linear.

2) To check the validity of the normal distributional assumption, the histogram and the normal probability plot of the residuals were drawn. The following plots were obtained:

[Figure: Versus Fits (response is Gold Price U$/Oz): Standardized Residual vs Fitted Value]

[Figure: Normal Probability Plot (response is Gold Price U$/Oz): Percent vs Standardized Residual]

The above plots indicate that the errors can be assumed to have a symmetric, bell-shaped distribution. In the normal probability plot the points lie close to a straight line, so the error distribution can be assumed to be normal.

e. Multicollinearity Correction

To determine whether any of the variables in the model should be removed because of multicollinearity, stepwise regression was performed as given below.

Stepwise Regression: Forward Selection and Backward Elimination

Alpha-to-Enter: 0.05, Alpha-to-Remove: 0.05

Response is Gold Price U$/Oz on 7 predictors, with Number of Observations = 119

The stepwise regression terminated at the end of six steps (Table 5), leaving the S&P 500 Index out of the model. The regression equation identified using MINITAB was

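MODEL B:

\[
\begin{aligned}
\widehat{\text{Gold Price}} ={} & -1486.3 + 0.914\,(\text{M1}) + 0.0767\,(\text{BRICs ADR Index}) \\
& - 80.1\,\ln(\text{Treasury Rate}) + 542\,(\text{USD/EUR Rate}) + 1.26\,(\text{CRB Index}) \\
& - 33\,(\text{Expected Inflation})
\end{aligned}
\]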

[Figure: Histogram (response is Gold Price U$/Oz): Frequency vs Standardized Residual]

Step                      1        2        3        4        5        6
Constant            -1650.6  -1338.4   -965.7  -1314.4  -1460.6  -1486.3

M1                    1.624    1.257    0.992    1.049    0.963    0.914
  T-Value             34.56    35.47    18.25    19.58    17.35    15.67
  P-Value             0.000    0.000    0.000    0.000    0.000    0.000

BRICs ADR Index                0.0725   0.0834   0.0948   0.0808   0.0767
  T-Value                       15.72    18.78    18.39    13.19    12.29
  P-Value                       0.000    0.000    0.000    0.000    0.000

LN_Treasury Rate                         -54.6    -45.7    -71.1    -80.1
  T-Value                                -5.97    -5.09    -6.58    -7.11
  P-Value                                0.000    0.000    0.000    0.000

USD/EUR Rate                                        283      449      542
  T-Value                                          3.81     5.43     6.01
  P-Value                                         0.000    0.000    0.000

CRB Index                                                   0.72     1.26
  T-Value                                                   3.80     4.27
  P-Value                                                  0.000    0.000

Expected Inflation                                                    -33
  T-Value                                                           -2.35
  P-Value                                                           0.021

S                       116       66     57.9     54.8     51.8     50.8
R-Sq                  91.08    97.15    97.82    98.07    98.29    98.37
R-Sq(adj)             91.00    97.10    97.77    98.00    98.21    98.28
Mallows Cp            507.8     86.0     40.9     25.8     12.5      8.9

Table 5: Stepwise Regression Output
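
Table 4 lists a variance inflation factor (VIF) for every predictor of Model A, with the CRB Index (19.931) and Ln (Treasury Rate) (11.337) above the common rule-of-thumb threshold of 10. For predictor j, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing that predictor on all the others. The sketch below shows one way to compute it with NumPy. The data are randomly generated stand-ins (not the project data), and the function name `vif` is an illustrative assumption.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of the predictor matrix X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # add an intercept
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2_j = 1.0 - ss_res / ss_tot
        factors.append(1.0 / (1.0 - r2_j))
    return factors


# Hypothetical predictors: c is almost a copy of a, b is unrelated to both.
rng = np.random.default_rng(1)
a = rng.normal(size=119)
b = rng.normal(size=119)
c = a + 0.1 * rng.normal(size=119)

values = vif(np.column_stack([a, b, c]))
for name, v in zip(["a", "b", "c"], values):
    print(f"VIF({name}) = {v:.1f}")
```

In this toy setup the two nearly collinear columns receive large VIFs while the independent one stays close to 1, which is the pattern that motivates dropping or combining predictors.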

f. Durbin Watson Test

To detect the presence of autocorrelation in the residuals, the Durbin-Watson test was performed. The Durbin-Watson test statistic obtained from MINITAB was 0.729. This denotes a high positive autocorrelation because 0.729 is below the lower critical value dL.

For k=6              n     dL      dU
From D-W Tables    100   1.421   1.67
From D-W Tables    150   1.543   1.708
By Interpolation   119   1.499   1.694

Table 6: Durbin Watson Test

Our Durbin-Watson statistic of 0.729223 denotes high positive autocorrelation because 0.729 < dL < dU, that is, 0.729 < 1.499 < 1.694.
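
The Durbin-Watson statistic is defined as d = Σ(e_t - e_{t-1})² / Σ e_t², where e_t are the residuals in time order. Values near 2 indicate no first-order autocorrelation and values near 0 indicate strong positive autocorrelation. The following is a minimal sketch of the statistic and of the decision rule used above. The function names are illustrative, and the short residual lists are toy inputs rather than the model's residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """Test for positive autocorrelation against tabulated bounds dL and dU."""
    if d < d_lower:
        return "positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"


# Toy residuals: a constant run gives d = 0, an alternating run gives d > 2.
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))

# Decision for the statistic reported above, using the bounds in Table 6.
print(dw_decision(0.729, 1.499, 1.694))
```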

g. Positive Autocorrelation Correction: Lagged Variable

To correct for autocorrelation we introduce a lagged variable ‘Gold Price (-1)’, the gold price for the previous month. The whole stepwise regression process was repeated with the lagged variable ‘Gold Price (-1)’ included, and this time the stepwise regression terminated at the end of 7 steps (Table 7). The following regression was obtained with the new variables:

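MODEL C:

\[
\begin{aligned}
\widehat{\text{Gold Price}} ={} & -731.6 + 0.533\,\text{Gold Price}(-1) + 0.402\,(\text{M1}) \\
& + 0.0371\,(\text{BRICs ADR Index}) - 45.3\,\ln(\text{Treasury Rate}) + 325\,(\text{USD/EUR Rate}) \\
& + 0.83\,(\text{CRB Index}) - 29\,(\text{Expected Inflation})
\end{aligned}
\]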

Step                      1        2        3        4        5        6        7
Constant              1.512 -185.654 -433.719 -366.181 -564.617 -686.812   -731.6

Gold Price (-1)       1.015    0.914    0.696    0.617    0.581    0.543    0.533
  T-Value             84.36    24.21    12.26    10.51     9.68     8.89     8.93
  P-Value             0.000    0.000    0.000    0.000    0.000    0.000    0.000

M1                             0.177    0.401    0.377    0.436    0.434    0.402
  T-Value                       2.82     5.43     5.32     5.83     5.92     5.54
  P-Value                      0.006    0.000    0.000    0.000    0.000    0.000

BRICs ADR Index                         0.0242   0.0346   0.0430   0.0394   0.0371
  T-Value                                 4.83     6.12     6.37     5.82     5.57
  P-Value                                0.000    0.000    0.000    0.000    0.000

LN_Treasury Rate                                  -24.8    -22.8    -36.9    -45.3
  T-Value                                         -3.46    -3.22    -4.02    -4.76
  P-Value                                         0.001    0.002    0.000    0.000

USD/EUR Rate                                                 139      231      325
  T-Value                                                   2.19     3.15     4.06
  P-Value                                                  0.030    0.002    0.000

CRB Index                                                            0.36     0.83
  T-Value                                                            2.35     3.55
  P-Value                                                           0.021    0.001

Expected Inflation                                                             -29
  T-Value                                                                    -2.61
  P-Value                                                                    0.010

S                      49.0     47.5     43.5     41.5     40.8     40.0     39.0
R-Sq                  98.41    98.51    98.77    98.89    98.93    98.98    99.04
R-Sq(adj)             98.40    98.49    98.73    98.85    98.88    98.93    98.98
Mallows Cp             66.7     57.0     30.3     18.8     15.6     11.8      7.0

Table 7: Stepwise Regression with Lagged Variable
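
Constructing the lagged variable amounts to pairing each month's observation with the gold price of the month before, which costs one observation at the start of the sample. The sketch below illustrates this in plain Python. The helper name `add_lag` and the three monthly rows are hypothetical and do not come from the project data.

```python
def add_lag(rows, key, lag=1):
    """Return copies of rows with an extra '<key> (-lag)' field.

    The first `lag` rows are dropped because they have no earlier value.
    """
    lagged_name = f"{key} (-{lag})"
    out = []
    for t in range(lag, len(rows)):
        row = dict(rows[t])
        row[lagged_name] = rows[t - lag][key]
        out.append(row)
    return out


# Hypothetical monthly observations in time order.
rows = [
    {"month": "2012-01", "Gold Price": 1650.0, "M1": 2200.0},
    {"month": "2012-02", "Gold Price": 1740.0, "M1": 2210.0},
    {"month": "2012-03", "Gold Price": 1670.0, "M1": 2225.0},
]

lagged = add_lag(rows, "Gold Price")
for row in lagged:
    print(row["month"], row["Gold Price"], row["Gold Price (-1)"])
```

The lagged column then enters the regression like any other predictor.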

The Durbin-Watson test was performed on the new regression equation in Model C, and we obtained a Durbin-Watson statistic of 1.96059, which implies that there is no evidence of autocorrelation because

dL < dU < 1.96059 < 2

For k=6              n     dL      dU
From D-W Tables    100   1.421   1.67
From D-W Tables    150   1.543   1.708
By Interpolation   119   1.499   1.694

Table 8: Durbin-Watson Test for Model C

Hence, the final multiple regression model to predict the gold price is as follows:

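\[
\begin{aligned}
\widehat{\text{Gold Price}} ={} & -731.6 + 0.533\,\text{Gold Price}(-1) + 0.402\,(\text{M1}) \\
& + 0.0371\,(\text{BRICs ADR Index}) - 45.3\,\ln(\text{Treasury Rate}) + 325\,(\text{USD/EUR Rate}) \\
& + 0.83\,(\text{CRB Index}) - 29\,(\text{Expected Inflation})
\end{aligned}
\]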

h. Model Validation

The gold prices were predicted using each of the three models discussed above and were compared with the actual prices to assess forecast accuracy.

[Figure: Model Validation: ACTUAL Gold Price compared with the predictions of Model A, Model B and Model C, monthly from 01/12/11 to 01/09/12, on a vertical axis running from 1500 to 1950]

Model C, obtained after correcting for positive autocorrelation, is finally used to predict gold prices because, as observed in the chart above, it has better forecasting accuracy than Model A and Model B.
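
The comparison above is visual. One way to make it quantitative would be to compute an error measure such as the root mean squared error (RMSE) or the mean absolute percentage error (MAPE) for each model over the validation months. The sketch below shows both measures. The price and forecast numbers are invented placeholders used only to exercise the functions, not the values behind the chart.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Placeholder numbers (NOT the project's validation data).
actual = [1650.0, 1740.0, 1670.0, 1600.0]
forecasts = {
    "Model A": [1780.0, 1820.0, 1800.0, 1750.0],
    "Model B": [1770.0, 1810.0, 1790.0, 1745.0],
    "Model C": [1660.0, 1720.0, 1690.0, 1620.0],
}

scores = {name: (rmse(actual, pred), mape(actual, pred))
          for name, pred in forecasts.items()}
for name, (r, m) in scores.items():
    print(f"{name}: RMSE = {r:.1f}, MAPE = {m:.2f}%")
```

The model with the smallest error on the held-out months would be preferred.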

5. Conclusion

Forecasting gold prices can be useful for investors and policy makers. We have used multiple linear regression to develop Model A, which predicts gold prices based on Exchange Rate (USD/EUR), S&P 500 Index, BRICs ADR Index, Commodity Prices (CRB Index), Money Supply (M1), Expected Inflation and Ln (Treasury Rate). We then performed stepwise regression to correct for multicollinearity and obtained Model B. However, the Durbin-Watson test gave us evidence of positive autocorrelation, which we corrected by using a lagged variable, the gold price for the previous period (Gold Price (-1)). We finally obtained Model C, which has better predictive power than Model A and Model B.

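MODEL C:

\[
\begin{aligned}
\widehat{\text{Gold Price}} ={} & -731.6 + 0.533\,\text{Gold Price}(-1) + 0.402\,(\text{M1}) \\
& + 0.0371\,(\text{BRICs ADR Index}) - 45.3\,\ln(\text{Treasury Rate}) + 325\,(\text{USD/EUR Rate}) \\
& + 0.83\,(\text{CRB Index}) - 29\,(\text{Expected Inflation})
\end{aligned}
\]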

6. Further Improvement

In our study we have not applied a correction for heteroskedasticity, as we assumed the error variance was almost constant across observations. Also, a more sophisticated regression model could be obtained by choosing, for each explanatory variable, an appropriate lag such that its correlation coefficient with the gold price is maximized.
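
The lag selection suggested here could be sketched as a simple search: for each candidate lag, correlate the explanatory variable shifted by that many periods with the response, and keep the lag with the largest absolute correlation. The code below is one possible illustration in plain Python. The function names and the toy series are assumptions, and the response is constructed so that it follows the predictor with a two-period delay.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_lag(x, y, max_lag):
    """Correlate x lagged by 0..max_lag periods with y; return the best lag and all scores."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair x[t - lag] with y[t] for every t where both exist.
        scores[lag] = pearson(x[:len(x) - lag], y[lag:])
    best = max(scores, key=lambda k: abs(scores[k]))
    return best, scores


# Toy series: y repeats x two periods later (first two values are padding).
x = [float((i * 7) % 11) for i in range(30)]
y = [0.0, 0.0] + x[:-2]

lag, scores = best_lag(x, y, max_lag=5)
print("selected lag:", lag)
```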

