Research Article Financial Time Series Forecasting Using Directed … · 2019. 7. 31. · Financial...

Research ArticleFinancial Time Series Forecasting UsingDirected-Weighted Chunking SVMs

Yongming Cai1 Lei Song1 Tingwei Wang1 and Qing Chang2

1 School of Management University of Jinan Jinan 250002 China2 School of Management Inner Mongolia University of Technology Huhhot 010050 China

Correspondence should be addressed to Yongming Cai cym2001099163com

Received 21 February 2014 Accepted 2 April 2014 Published 24 April 2014

Academic Editor Wei Chen

Copyright copy 2014 Yongming Cai et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Support vector machines (SVMs) are a promising alternative to traditional regression estimation approaches But when dealingwith massive-scale data set there exist many problems such as the long training time and excessive demand of memory spaceSo the SVMs algorithm is not suitable to deal with financial time series data In order to solve these problems directed-weightedchunking SVMs algorithm is proposed In this algorithm the whole training data set is split into several chunks and then thesupport vectors are obtained on each subset Furthermore the weighted support vector regressions are calculated to obtain theforecast model on the new working data set Our directed-weighted chunking algorithm provides a newmethod of support vectorsdecomposing and combining according to the importance of chunks which can improve the operation speed without reducingprediction accuracy Finally IBM stock daily close prices data are used to verify the validity of the proposed algorithm

1 Introduction

Financial time series forecasting is an important aspect offinancial decisions Financial practitioners and academicresearchers have proposed a lot of methods and techniquesto improve the accuracy of predictions Because it is inher-ently noisy nonstationary and deterministically chaotic [1]financial time series forecasting is regarded as one of themostchallenging applications of modern time series forecasting

Time series analysis of the early years study is focusedon the regression model such as autoregression modellike AR and ARMA and volatility model like ARCH andGARCH In recent years studies focused on application ofartificial intelligence algorithms (AI) such as artificial neuralnetworks (ANN) [2ndash4] reasoning neural networks (RNN)[5] genetic algorithms (GA) [6] particle swarm optimization(PSO) [7 8] and support vector machines (SVMs) [9 10]

Among these artificial intelligence algorithms SVMs arean elegant tool for solving pattern recognition and regressionproblems According to the research of Vapnik [11] SVMsimplement the structural risk minimization principle which

seeks to minimize an upper bound of the generalizationerror rather than minimize the training error The regressionmodel of SVMs called support vector regression (SVR)has also been receiving increasing attention to solve linearor nonlinear estimation problems For instance Tay andCao [9] studied the five real future contracts in ChicagoMercantile Market Cao and Tay [12] studied the SampP 500daily price index and Kim [13] studied the daily Koreacomposite stock price index (KOSPI) Based on the criteriaof normalized mean square error (NMSE) mean absoluteerror (MAE) directional symmetry (DS) and weighteddirectional symmetry (WDS) the above researches indicatethat the performance of SVMs is better thanARMAGARCH(ARCH) and ANN

According to the statistical learning theory (SLT) supportvector machine regression is a convex quadratic program-ming (QP) optimization with linear constraint Howeverthere is an obvious disadvantage of SVMs that the trainingtime scales somewhere between quadratic and cubic dependon the number of training samples In order to deal withmas-sive data set and improve the training speed many improved

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2014 Article ID 170424 7 pageshttpdxdoiorg1011552014170424

2 Mathematical Problems in Engineering

support vector machine methods are proposed One wayis to combine SVMs with some other methods such asactive learning [14 15] multitask learning [16 17] multiviewlearning [18] and semisupervised learning [19 20] Anotherway is to develop some optimization techniques of trainingalgorithms in SVMs such as sequential updating methodslike kernel-Adatron algorithm [21] successive over relaxationalgorithm [22] working set methods like chunking SVMs[23 24] reduced support vector machine (RSVM) [25] andsequential minimal optimization algorithm (SMO) [26]

Chunking SVMs reduced support vector machine(RSVM) and SVMs with sequential minimal optimization(SMO) are outstanding methods in dealing with massivedata set For example Lee and Mangasarian [25] designed areduced support vector machine algorithm (RSVM) whichcan greatly reduce the size of the quadratic program to besolved by reducing the data set volume so the memory usageis much smaller than that of a conventional SVM using theentire data set Osuna et al [27] designed a decompositionalgorithm that is guaranteed to solve the QP problem andthat does not make assumptions on the expected number ofsupport vectors Platt [26] put forward a sequential minimaloptimization algorithm (SMO) which breaks the large QPproblem into a series of smallest possible QP problems whichare analytically solvable in order to speed up the trainingtime Tay and Cao [28] proposed to combine support vectormachines (SVMs) with self-organizing feature map (SOM)for financial time series forecasting where SOM is used asa clustering algorithm to partition the whole input spaceinto several disjoint regions Tay and Cao [29] also putforward C-ascending support vector machines to amend the120576-insensitive errors in the nonstationary financial time series

Most of the improved support vector machine methodsdo well in memory requirement and CPU time but theforecast accuracy has declinedmore or less For financial timeseries prediction the typical massive-scale data we shouldpay more attention to the prediction accuracy while reducingthe computational complexity In this paper we proposeda directed-weighted chunking SVMs algorithm which canimprove the operation speed without reducing predictionaccuracy

This paper consists of five sections Section 1 introducesthe basic algorithm of SVM Section 2 proposes a directed-weighted chunking SVMs algorithm Section 3 designs aseries of experiments and empirical results are summarizedand discussed Section 4 presents the conclusions and limita-tions of this study

2 SVMs Regression Theory

In this section we will briefly introduce the support vectorsregression (SVR) theory Suppose there are a given set of datapoints 119866 = (119909

119894 119889119894)119897

119894(119909119894is the input vector 119889

119894is the desired

value) SVMs approximate the function in the following form

119910 =

119897

sum

119894=1

119908119894120601119894 (119909) + 119887 (1)

where 120601(119909)119897119894=1

are the features of inputs and 119908119894119897

119894=1 119887 are

coefficientsAccording to the structural risk minimization principle

the support vectors regression problem can be expressed as

Minimize 119877 (119908 120585(lowast)) =

1

2

119908119879119908 + 119862

lowast

119897

sum

119894=1

(120585119894+ 120585lowast

119894)

Subject to 119908120601 (119909119894) + 119887 minus 119889

119894le 120576 + 120585

119894

119889119894minus 119908120601 (119909

119894) minus 119887 le 120576 + 120585

lowast

119894

120585119894 120585lowast

119894ge 0

119894 = 1 2 119897

(2)

where 119909119894is mapped to a higher dimensional space by the

function 120601 120585119894and 120585

119894

lowast are slack variables (120585119894is the upper

training error 120585lowast119894is the lower) which are subject to the 120576-

insensitive tube |119889119894minus(119908120601(119909

119894)+119887)| le 120576 The parameters which

control the regression quality are the cost of error119862 thewidthof the tube 120576 and the mapping function 120601

Thus (1) becomes the following explicit form

119891 (119909 119886119894 119886lowast

119894) =

119897

sum

119894=1

(119886119894minus 119886lowast

119894)119870 (119909 119909

119894) + 119887 (3)

Then we can obtain the following form by maximizingthe dual form of function (3)

120601 (119886119894 119886lowast

119894) =

119897

sum

119894=1

119889119894(119886119894minus 119886lowast

119894) minus 120576

119897

sum

119894=1

(119886119894+ 119886lowast

119894)

minus

1

2

119897

sum

119894=1

119897

sum

119895=1

(119886119894minus 119886lowast

119894) (119886119894minus 119886lowast

119894)119870 (119909

119894 119909119895)

(4)

with the following constraints

119897

sum

119894=1

(119886119894minus 119886lowast

119894) = 0

0 le 119886119894le 119862 119894 = 1 2 119897

0 le 119886lowast

119894le 119862 119894 = 1 2 119897

(5)

where the 119886119894 119886lowast119894are the Lagrange multipliers introduced

and 119870(119909 119909119894) is named the kernel function The value is

equal to the inner product of two vectors 119909119894and 119909

119895in the

feature spaces 120601(119909119894) and 120601(119909

119895) There are many choices of the

kernel function common examples are the polynomial kernel119870(119909119894 119909119895) = (119909

119894119909119895+ 1)119889 and Gaussian kernel 119870(119909

119894 119909119895) =

exp(minus1205752(119909119894minus 119909119895)2)

Training SVMs is equivalent to optimizing the Lagrangemultipliers 119886

119894 119886lowast119894with constraints based on (4) Good fitting

function can be obtained by choosing appropriate functionalspace

There are many different researches using the straight-forward approaches to construct and implement SVMs

Mathematical Problems in Engineering 3

for financial time series analysis (see eg [9 13 30])other methods like chunking SVMs reduced support vec-tor machine (RSVM) and SVMs with sequential minimaloptimization (SMO) are used to deal with massive-scaledata sets Among these methods SVM chunking providesan alternative method to running a typical SVM on a dataset by breaking up the training data and running the SVMon smaller chunks of data In previous literature manydecomposition and combination methods are proposed

We would like to mention that for financial time seriesprediction problems the data set from different periodswill have different effects on current forecasts and thechange of direction of past stock price will also affect thecurrent forecasts Therefore we proposed directed-weightedchunking SVMs which can improve the operation speedwithout reducing prediction accuracy

3 The Directed-Weighted Chunking SVMs

31 Chunking Model in Support Vector Regression TrainingSVMs is equivalent to solving a linearly constrained QPproblem Training SVMs depends on QP optimization tech-niques Standard QP techniques cannot be directly applied toSVMsproblemswithmassive-scale data set In training stagesof directed-weighted chunking SVMs the whole trainingset is decomposed into several chunks and support vectorsare calculated respectively in their working subset In theprediction stage all these support vectors are combinedinto new working data set to get the model in accordancewith their importance as illustrated in Figure 1 and theprogress of directed-weighted chunking SVMs algorithm canbe described as follows

Step 1 Decompose the whole training set 119866 into 119898 subsets1198661 1198662 119866

119898 and 119866 = 119866

1cup 1198662 cup sdot sdot sdot cup119866

119898 119866119894cap 119866119895= Φ

Step 2 Calculate the support vector regression SVG119894for each

subset 119866119894

Step 3 Calculate the weight and direction of each subset

Step 4 Combine all support vectors 119866SVM119894 for each subsetinto new working data set 119866SVM

Step 5 Calculate the weighted support vector regression onthe new working data set 119866SVM and get the model

32 Directed-Weighted Chunking SVMs In the time seriesforecasting such as stock market the effect of past stockprices on the future stock prices will be different Usuallythe more recent the period of time the greater the weightcoefficient According to a certain time interval the total timeseries of stock priceswill be divided into different chunks andsupport vectors in each chunk are calculated respectivelyFurthermore the weighted support vector regressions arecalculated to obtain the forecast model

The stock market is one of the complex systems [31 32]We can treat chunks as the nodes and relationship betweenchunks as edges Thus these nodes and edges will be acomplex networkThe entire time series of stock price can beregarded as a directed-weighted network with a large numberof nodes and edges The mutual influence between chunkshas its direction For example in chunks G1 and G2 if thereexists a nonzero correlation coefficient from chunk G1 to G2we will draw a directed edge between G1 and G2 Becausethe strength of mutual influence between chunks is differentthe edge weight which is called correlation intensity isalso different So simply chunking SVMs cannot reflect theinfluence of respective chunk on final model Here we willintroduce directed-weighted chunking algorithm into SVMs

In traditional SVM regression optimization problem theparameter 119862lowast is a constant In order to show the differentinfluence on the prediction results we introduce a function119876() and modify the SVMs regression function as follows

119877 (119908 120585(lowast)) =

1

2

119908119879119908 + 119862

lowast

119897

sum

119894=1

119876 (119866119894 Δ119905) (120585

119894+ 120585lowast

119894) (6)

In (6) 119876(119866119894 Δ119905) can be understood as the weight of

each chunk and the values of 119876(119866119894 Δ119905) can be positive or

negative The positive or negative values of 119876(119866119894 Δ119905) will

change parameter 119862lowast in SVMs which is very similar tothat of the positive or negative information impact on thefuture stock price According to the definition of networkcorrelation coefficient introduced by Bonanno et al [33] wecan define the correlation intensity 119876(119866

119894 Δ119905) (influence of

chunk 119866119894by the time interval Δ119905) as follows

119876 (119866119894 Δ119905) =

⟨119877 (119866119894 119905) 119877 (119866

119894 119905 minus Δ119905)⟩ minus ⟨119877 (119866

119894 119905)⟩ ⟨119877 (119866

119894 119905 minus Δ119905)⟩

radic⟨[119877 (119866119894 119905) minus ⟨119877 (119866

119894 119905)⟩]2⟩ ⟨[119877 (119866

119894 119905 minus Δ119905) minus ⟨119877 (119866

119894 119905 minus Δ119905)⟩

2]⟩

(7)

where ⟨⟩ is a temporal average always performed over theinvestigated time period 119905 represents a time Δ119905 represents atime interval119877(119866

119894 119905) is the stock returns in a time intervalΔ119905

and119877(119866119894 119905) = ln[119901(119866

119894 119905)]minusln[119901(119866

119894 119905minusΔ119905)] is the logarithmic

difference of current stock price and the stock price before thetime interval Δ119905

According to the data point and time interval Δ119905 originaldata can be decomposed into several chunks The boundaryof 119876(119866

119894 Δ119905) is minus1 le 119876(119866

119894 Δ119905) le 1 If the 119876(119866

119894 Δ119905) is

positive the influence of chunk 119866119894on the time interval Δ119905 is

positive and vice versa We can calculate all the correlationintensity119876(119866

119894 Δ119905) according to (7) and get a119898times119898matrix of


Input data

SVR Mapping

Output model

CombiningDecomposingG1

G2

Gm

middot middot middot middot middot middot middot middot middot

SVG1

SVG2

SVGm

GSVM1

GSVM2

GSVM119898

GSVM

Figure 1 Model of directed-weighted chunking SVMs

IBM close value [1999-12-312013-12-31]Last 18757

Volume (millions)3619700

Moving average convergence divergence (12 26 9)

MACD 0957

Signal 0250

50

100

150

200

0

10

20

30

40

minus5

0

5

Dec 31 1999 Jun 01 2001 Dec 02 2002 Jun 01 2004 Dec 01 2005 Jun 01 2007 Dec 01 2008 Jun 01 2010 Dec 01 2011 Jun 03 2013

Figure 2 The IBM stock daily close prices (from December 31 1999 up to December 31 2013) Source Yahoo Finance httpfinanceyahoocom

correlation intensity Using these correlation intensity valueswe can get a relationship matrix with direction and weightNow dual form of the original optimization problem can bededuced as

max

minus

1

2

119897

sum

119894=1

119897

sum

119895=1

(119886119894minus 119886lowast

119894) (119886119894minus 119886lowast

119894)119870 (119909

119894 119909119895)

+

119897

sum

119894=1

119886119894(119889119894minus 120576) minus 119886

lowast

119894(119889119894+ 120576)

Subject to 119897

sum

119894=1

(119886119894minus 119886lowast

119894) = 0

0 le 119886119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

0 le 119886lowast

119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

(8)

Now the solution of this equation is the overall optimalsolution of the original optimization problem

4 Experiments

In order to make a fair and thorough comparative betweendirected-weighted chunking SVMs and ordinary SVMs theIBM stock daily close prices are selected as shown in Figure 2The data points cover the time period from December 311999 up to December 31 2013 which has 21132 data points

Data points starting from December 31 1999 up toDecember 31 2007 (12075 data points) are used for trainingand data points starting from January 1 2008 up to theDecember 31 2013 (9057 data points) are used for testingNow we decomposed the training data into 1208 chunks bythe time intervals of 10 days and calculated the correlationintensity 119876(119866

119894 Δ119905) according to the function (7) Finally we

obtain a 1208 times 1208matrix of correlation intensity

41 Forecast Accuracy Assessment of Directed-Weighted ChunkSVMs The prediction performance is evaluated by using thefollowing statistical metrics namely the normalized meansquared error (NMSE) the mean absolute error (MAE) andthe directional symmetry (DS) These criteria are calculatedas (9) NMSE and MAE are the measures of the deviationbetween the actual and predicted valuesThe values of NMSE

Mathematical Problems in Engineering 5IB

M cl

ose v

alue

Real valuePredicted values in ordinary SVMPredicted values in D-W chunking SVM

100

150

200

Nov 2008 Nov 2009 Nov 2010 Nov 2011 Nov 2012 Oct 2013

Figure 3The predicted value of IBM stock daily close prices in testdata set (from January 01 2008 up to December 31 2013)

and MAE denote the accuracy of prediction A detaileddescription of performance metrics in financial forecastingcan be referred to in Abecasis [34]

NMSE = 1

1205902119873

119873

sum

119894=1

(119910119894minus 119910119894)2

where 1205902 = 1

119873 minus 1

119873

sum

119894=1

(119910119894minus 119910119894)2 119910119894=

1

119873

119873

sum

119894=1

119910119894

MAE = 1

119873

119873

sum

119894=1

1003816100381610038161003816119910119894minus 119910119894

1003816100381610038161003816

DS = 100

119873

119873

sum

119894=1

119889119894 where 119889

119894=

1 (119910119894minus 119910119894minus1) (119910119894minus 119910119894minus1)

0 otherwise(9)

The program of directed-weighted chunking SVMs algo-rithm is developed using R language In this paper theGaussian function is used as the kernel function of the SVMsThe experiments show that a width value of the Gaussianfunction of 002 is found to produce the best possible results119862 and 120576 are arbitrarily chosen to be 85 and 10minus3 respectively

We calculate the SVMs on the training data sets (fromDecember 31 1999 up to December 31 2007) and then obtainthe trained model Finally we obtain the predicted result byapplying the trained model on the test data set (from January1 2008 up to December 31 2013) In order to comparethe differences of various algorithms real value predictedvalue in ordinary SVMs and the predicted value in directed-weighted chunking SVMs are plotted in Figure 3

In Figure 3 we can clearly see that these two forecastingmethods are very precise but it is hard to tell which one ismore excellent So we calculated the performance criteriarespectively as shown in Table 1 By comparing these datawe find that the NMSE MAE and DS of directed-weightedchunking SVMs are 03760 01325 and 3829 on the trainingset and 10121 02846 and 4378 on the test set It is evidentthat these values are much smaller than those of ordinarySVMs which indicates that there is a smaller deviation

Table 1 Results of accuracy performance criteria

D-W chunking SVMs Ordinary SVMsNMSE MAE DS NMSE MAE DS

Training set 03760 01325 3829 05625 01778 4167Test set 10121 02846 4378 13574 02564 4897

Table 2 Performance comparison between the traditional SVMsand directed-weighted chunk SVMs

SVMmethod Sensitivity Specificity CPU time(ms)

Traditional SVMs 079 081 458231Directed-weightedchunk SVMs 083 085 65367

Dataset IBM stock daily close prices in training data set (12075 data pointsfrom December 31 1999 up to December 31 2007)SVMs parameters kernel is Gaussian function with 120590 = 002 119862 = 85 120576 =0001 and tolerance = 0001Chunking methods now we decomposed the training data into 1208 chunksby the time intervals of 10 daysThe CPU time covered the execution of entire algorithm excluding the fileIO time

between the actual values and predicted values with directed-weighted chunking SVMs

42 Calculation Performance of Directed-Weighted ChunkSVMs As is well known the performance of SVM dependson the parameters But it is difficult to choose suitableparameters for different problems Chunk algorithm reusedthe Hessian matrix elements from one to the next which canimprove the performance sharply

The calculation performance of all algorithms is mea-sured on an unloaded AMD E-350 16GHz processor run-ning Windows 7 and R 301 The same experiment will bedone on the data set of IBM stock daily close prices Theresults of the experiments are shown in Table 2

The primary purpose of these experiments is to examinedifferences of training times between two methods An over-all comparison of the SVMmethods can be found in Table 2Compared to the traditional SVMs directed-weighted chunkSVMs can improve the accuracy and decrease run timessharply Additionally the directed-weighted chunk SVMsmethod allows users to add machines to make the algorithmtraining even faster

43 Analysis of Optimal Number of Chunks in Directed-Weighted Chunk SVMs In the experiment described abovewe decomposed the training data into 1208 chunks by thetime intervals of 10 days arbitrarily and got a satisfactorypredictionOriginal training data set can be decomposed into500 chunks or 5000 chunks also Doing the same experimentson the same training sets by different chunks we will geta series of performance data Plotting the curve based onchunks number and NMSE values (in Figure 4) we canintuitively discover relationships between chunks numberand errors and get the optimal number of chunks


1000 400030002000

02

10

08

06

04

The number of chunks on training set

NM

SE

Figure 4 NMSE value in different chunks (IBM stock daily closeprices data)

According to NMSE criteria we get the minimumNMSEvalue 03173 on the point of chunks 2350That means that thebest number of chunks is 2350 Under this chunking numberwe can get the best performance of prediction

From Figure 4 when the chunks number is increasedNMSE value is declined rapidly But when the decreasereaches a certain value NMSE value will increase converselyHowever this upward trend is not very large which indicatesthat the directed-weighted chunking SVM is not a funda-mental transform of SVM but a limited improvement Butfrom the perspective of processing massive-scale data thisimprovement is very important

5 Conclusions

In this paper we proposed a new chunks algorithm in SVMsregression which combined the support vectors according totheir importance The proposed algorithm can improve thecomputational speed without reducing prediction accuracy

In our directed-weighted chunking SVMsΔ119905 the criteriafor the chunking is a constant but in practice Δ119905 canbe variable or some form of function In addition furtherstudies on the different kernel functions and more suitableparameters 120576 and 119862 can be done in order to improve theperformance of directed-weighted chunking SVMs

Conflict of Interests

The authors declare that they have no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was partly supported by the National NaturalScience Foundation of China (NSFC) under Project Grant(no 71162015) and the Inner Mongolia Autonomous Region

Higher Education Development Plan of Innovation TeamsrsquoProject Grant (no NMGIRT1404)

References

[1] Y S Abu-Mostafa and A F Atiya ldquoIntroduction to financialforecastingrdquoApplied Intelligence vol 6 no 3 pp 205ndash213 1996

[2] W Cheng W Wagner and C H Lin ldquoForecasting the 30-yearUS treasury bond with a system of neural networksrdquo Journalof Computational Intelligence in Finance vol 4 no 1 pp 10ndash161996

[3] K-J Kim and I Han ldquoGenetic algorithms approach to featurediscretization in artificial neural networks for the prediction ofstock price indexrdquo Expert Systems with Applications vol 19 no2 pp 125ndash132 2000

[4] HAhmadi ldquoTestability of the arbitrage pricing theory by neuralnetworkrdquo in Proceedings of the International Joint Conference onNeural Networks (IJCNN rsquo90) pp 385ndash393 June 1990

[5] R Tsaih YHsu andCC Lai ldquoForecasting SampP 500 stock indexfutures with a hybrid AI systemrdquo Decision Support Systems vol23 no 2 pp 161ndash174 1998

[6] G G Szpiro ldquoForecasting chaotic time series with geneticalgorithmsrdquo Physical Review E vol 55 no 3 pp 2557ndash25681997

[7] A Brabazon andM OrsquoNeill Biologically Inspired Algorithms forFinancial Modelling Springer Berlin Germany 2006

[8] X Cai N Zhang G K Venayagamoorthy andD CWunsch IIldquoTime series prediction with recurrent neural networks trainedby a hybrid PSO-EA algorithmrdquo Neurocomputing vol 70 no13ndash15 pp 2342ndash2353 2007

[9] F E H Tay and L Cao ldquoApplication of support vectormachinesin financial time series forecastingrdquo Omega vol 29 no 4 pp309ndash317 2001

[10] G Rubio H Pomares I Rojas and L J Herrera ldquoA heuristicmethod for parameter selection in LS-SVM application to timeseries predictionrdquo International Journal of Forecasting vol 27no 3 pp 725ndash739 2011

[11] V VapnikTheNature of Statical LearningTheory Springer NewYork NY USA 1995

[12] L Cao and F E H Tay ldquoFinancial forecasting using SupportVector Machinesrdquo Neural Computing and Applications vol 10no 2 pp 184ndash192 2001

[13] K-J Kim ldquoFinancial time series forecasting using supportvector machinesrdquo Neurocomputing vol 55 no 1-2 pp 307ndash3192003

[14] S Sun and D R Hardoon ldquoActive learning with extremelysparse labeled examplesrdquoNeurocomputing vol 73 no 16ndash18 pp2980ndash2988 2010

[15] V Ceperic G Gielen and A Baric ldquoRecurrent sparse supportvector regression machines trained by active learning in thetime-domainrdquo Expert Systems with Applications vol 39 no 12pp 10933ndash10942 2012

[16] S Parameswaran and K Q Weinberger ldquoLarge margin multi-task metric learningrdquo in Proceedings of the 24th Annual Con-ference on Neural Information Processing Systems (NIPS rsquo10)Vancouver Canada December 2010

[17] T Jebara ldquoMulti-task feature and kernel selection for SVMsrdquoin Proceedings of the 21st International Conference on MachineLearning pp 433ndash440 ACM July 2004

[18] Y Li S Gong and H Liddell ldquoSupport vector regression andclassification based multi-view face detection and recognitionrdquo


in Proceedings of the 4th IEEE International Conference onAutomatic Face and Gesture Recognition pp 300ndash305 2000

[19] K Bennett and A Demiriz ldquoSemi-supervised support vectormachinesrdquo Advances in Neural Information Processing Systemsvol 11 pp 368ndash374 1999

[20] C Brouard F DrsquoAlche-Buc and M Szafranski ldquoSemi-supervised penalized output kernel regression for linkpredictionrdquo in Proceedings of the 28th International Conferenceon Machine Learning pp 593ndash600 July 2011

[21] K Veropoulos C Campbell and N Cristianini ldquoControllingthe sensitivity of support vector machinesrdquo in Proceedings of theInternational Joint Conference on Artificial Intelligence pp 55ndash60 Citeseer 1999

[22] O L Mangasarian and D R Musicant ldquoSuccessive overrelax-ation for support vectormachinesrdquo IEEETransactions onNeuralNetworks vol 10 no 5 pp 1032ndash1037 1999

[23] V Vapnik Estimations of Dependences Based on Statistical DataSpringer Berlin Germany 1982

[24] L Kaufman ldquoSolving the quadratic programming problemarising in support vector classificationrdquo in Advances in KernelMethods pp 147ndash167 MIT Press 1999

[25] Y J Lee and O L Mangasarian ldquoRSVM reduced supportvector machinesrdquo in Proceedings of the First SIAM InternationalConference on Data Mining pp 5ndash7 Philadelphia Pa USA2001

[26] J C Platt ldquoUsing analytic QP and sparseness to speed trainingof support vector machinesrdquo in Proceedings of the Conference onAdvances in Neural Information Processing Systems pp 557ndash563Cambridge UK 1999

[27] E Osuna R Freund and F Girosi ldquoImproved training algo-rithm for support vector machinesrdquo in Proceedings of the 7thIEEEWorkshop on Neural Networks for Signal Processing (NNSPrsquo97) pp 276ndash285 September 1997

[28] F E H Tay and L J Cao ldquoImproved financial time seriesforecasting by combining support vector machines with self-organizing feature maprdquo Intelligent Data Analysis vol 5 no 4pp 339ndash354 2001

[29] F E H Tay and L J Cao ldquoModified support vector machines infinancial time series forecastingrdquo Neurocomputing vol 48 pp847ndash861 2002

[30] L Cao ldquoSupport vector machines experts for time seriesforecastingrdquo Neurocomputing vol 51 pp 321ndash339 2003

[31] K E Lee J W Lee and B H Hong ldquoComplex networks in astock marketrdquo Computer Physics Communications vol 177 no1-2 p 186 2007

[32] P Caraiani ldquoCharacterizing emerging European stock marketsthrough complex networks from local properties to self-similarcharacteristicsrdquo Physica A vol 391 no 13 pp 3629ndash3637 2012

[33] G Bonanno G Caldarelli F Lillo S Micciche N VandewalleandRNMantegna ldquoNetworks of equities in financialmarketsrdquoEuropean Physical Journal B vol 38 no 2 pp 363ndash371 2004

[34] S Abecasis E Lapenta and C Pedreira ldquoPerformance metricsfor financial time series forecastingrdquo Journal of ComputationalIntelligence in Finance vol 7 no 4 pp 5ndash22 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of


Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of


Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of


Mathematical PhysicsAdvances in

Complex AnalysisJournal of


OptimizationJournal of


CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of


Operations ResearchAdvances in

Journal of


Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences


The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014


Algebra

Discrete Dynamics in Nature and Society



Decision SciencesAdvances in

Discrete MathematicsJournal of


Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of


support vector machine methods are proposed One wayis to combine SVMs with some other methods such asactive learning [14 15] multitask learning [16 17] multiviewlearning [18] and semisupervised learning [19 20] Anotherway is to develop some optimization techniques of trainingalgorithms in SVMs such as sequential updating methodslike kernel-Adatron algorithm [21] successive over relaxationalgorithm [22] working set methods like chunking SVMs[23 24] reduced support vector machine (RSVM) [25] andsequential minimal optimization algorithm (SMO) [26]

Chunking SVMs reduced support vector machine(RSVM) and SVMs with sequential minimal optimization(SMO) are outstanding methods in dealing with massivedata set For example Lee and Mangasarian [25] designed areduced support vector machine algorithm (RSVM) whichcan greatly reduce the size of the quadratic program to besolved by reducing the data set volume so the memory usageis much smaller than that of a conventional SVM using theentire data set Osuna et al [27] designed a decompositionalgorithm that is guaranteed to solve the QP problem andthat does not make assumptions on the expected number ofsupport vectors Platt [26] put forward a sequential minimaloptimization algorithm (SMO) which breaks the large QPproblem into a series of smallest possible QP problems whichare analytically solvable in order to speed up the trainingtime Tay and Cao [28] proposed to combine support vectormachines (SVMs) with self-organizing feature map (SOM)for financial time series forecasting where SOM is used asa clustering algorithm to partition the whole input spaceinto several disjoint regions Tay and Cao [29] also putforward C-ascending support vector machines to amend the120576-insensitive errors in the nonstationary financial time series

Most of the improved support vector machine methodsdo well in memory requirement and CPU time but theforecast accuracy has declinedmore or less For financial timeseries prediction the typical massive-scale data we shouldpay more attention to the prediction accuracy while reducingthe computational complexity In this paper we proposeda directed-weighted chunking SVMs algorithm which canimprove the operation speed without reducing predictionaccuracy

This paper consists of five sections Section 1 introducesthe basic algorithm of SVM Section 2 proposes a directed-weighted chunking SVMs algorithm Section 3 designs aseries of experiments and empirical results are summarizedand discussed Section 4 presents the conclusions and limita-tions of this study

2 SVMs Regression Theory

In this section we will briefly introduce the support vectorsregression (SVR) theory Suppose there are a given set of datapoints 119866 = (119909

119894 119889119894)119897

119894(119909119894is the input vector 119889

119894is the desired

value) SVMs approximate the function in the following form

119910 =

119897

sum

119894=1

119908119894120601119894 (119909) + 119887 (1)

where 120601(119909)119897119894=1

are the features of inputs and 119908119894119897

119894=1 119887 are

coefficientsAccording to the structural risk minimization principle

the support vectors regression problem can be expressed as

Minimize 119877 (119908 120585(lowast)) =

1

2

119908119879119908 + 119862

lowast

119897

sum

119894=1

(120585119894+ 120585lowast

119894)

Subject to 119908120601 (119909119894) + 119887 minus 119889

119894le 120576 + 120585

119894

119889119894minus 119908120601 (119909

119894) minus 119887 le 120576 + 120585

lowast

119894

120585119894 120585lowast

119894ge 0

119894 = 1 2 119897

(2)

where 119909119894is mapped to a higher dimensional space by the

function 120601 120585119894and 120585

119894

lowast are slack variables (120585119894is the upper

training error 120585lowast119894is the lower) which are subject to the 120576-

insensitive tube |119889119894minus(119908120601(119909

119894)+119887)| le 120576 The parameters which

control the regression quality are the cost of error119862 thewidthof the tube 120576 and the mapping function 120601

Thus (1) becomes the following explicit form

119891 (119909 119886119894 119886lowast

119894) =

119897

sum

119894=1

(119886119894minus 119886lowast

119894)119870 (119909 119909

119894) + 119887 (3)

Then we can obtain the following form by maximizingthe dual form of function (3)

120601 (119886119894 119886lowast

119894) =

119897

sum

119894=1

119889119894(119886119894minus 119886lowast

119894) minus 120576

119897

sum

119894=1

(119886119894+ 119886lowast

119894)

minus

1

2

119897

sum

119894=1

119897

sum

119895=1

(119886119894minus 119886lowast

119894) (119886119894minus 119886lowast

119894)119870 (119909

119894 119909119895)

(4)

with the following constraints

119897

sum

119894=1

(119886119894minus 119886lowast

119894) = 0

0 le 119886119894le 119862 119894 = 1 2 119897

0 le 119886lowast

119894le 119862 119894 = 1 2 119897

(5)

where the 119886119894 119886lowast119894are the Lagrange multipliers introduced

and 119870(119909 119909119894) is named the kernel function The value is

equal to the inner product of two vectors 119909119894and 119909

119895in the

feature spaces 120601(119909119894) and 120601(119909

119895) There are many choices of the

kernel function common examples are the polynomial kernel119870(119909119894 119909119895) = (119909

119894119909119895+ 1)119889 and Gaussian kernel 119870(119909

119894 119909119895) =

exp(minus1205752(119909119894minus 119909119895)2)

Training SVMs is equivalent to optimizing the Lagrangemultipliers 119886

119894 119886lowast119894with constraints based on (4) Good fitting

function can be obtained by choosing appropriate functionalspace

There are many different researches using the straight-forward approaches to construct and implement SVMs







119898 and 119866 = 119866


119898 119866119894cap 119866119895= Φ


subset 119866119894







119877 (119908 120585(lowast)) =

1

2

119908119879119908 + 119862

lowast

119897

sum

119894=1

119876 (119866119894 Δ119905) (120585

119894+ 120585lowast

119894) (6)







119876 (119866119894 Δ119905) =

⟨119877 (119866119894 119905) 119877 (119866

119894 119905 minus Δ119905)⟩ minus ⟨119877 (119866

119894 119905)⟩ ⟨119877 (119866

119894 119905 minus Δ119905)⟩

radic⟨[119877 (119866119894 119905) minus ⟨119877 (119866

119894 119905)⟩]2⟩ ⟨[119877 (119866

119894 119905 minus Δ119905) minus ⟨119877 (119866

119894 119905 minus Δ119905)⟩

2]⟩

(7)



and119877(119866119894 119905) = ln[119901(119866

119894 119905)]minusln[119901(119866




119894 Δ119905) is minus1 le 119876(119866

119894 Δ119905) le 1 If the 119876(119866

119894 Δ119905) is





Input data

SVR Mapping

Output model


G2

Gm


SVG1

SVG2

SVGm

GSVM1

GSVM2

GSVM119898

GSVM





MACD 0957

Signal 0250

50

100

150

200

0

10

20

30

40

minus5

0

5




max

minus

1

2

119897

sum

119894=1

119897

sum

119895=1

(119886119894minus 119886lowast

119894) (119886119894minus 119886lowast

119894)119870 (119909

119894 119909119895)

+

119897

sum

119894=1

119886119894(119889119894minus 120576) minus 119886

lowast

119894(119889119894+ 120576)

Subject to 119897

sum

119894=1

(119886119894minus 119886lowast

119894) = 0

0 le 119886119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

0 le 119886lowast

119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

(8)


4 Experiments







M cl

ose v

alue


100

150

200




NMSE = 1

1205902119873

119873

sum

119894=1

(119910119894minus 119910119894)2

where 1205902 = 1

119873 minus 1

119873

sum

119894=1

(119910119894minus 119910119894)2 119910119894=

1

119873

119873

sum

119894=1

119910119894

MAE = 1

119873

119873

sum

119894=1

1003816100381610038161003816119910119894minus 119910119894

1003816100381610038161003816

DS = 100

119873

119873

sum

119894=1

119889119894 where 119889

119894=


0 otherwise(9)

















1000 400030002000

02

10

08

06

04


NM

SE




5 Conclusions





Acknowledgments



References












































Volume 2014




Journal of











Journal of


Function Spaces






Algebra















119898 and 119866 = 119866


119898 119866119894cap 119866119895= Φ


subset 119866119894







119877 (119908 120585(lowast)) =

1

2

119908119879119908 + 119862

lowast

119897

sum

119894=1

119876 (119866119894 Δ119905) (120585

119894+ 120585lowast

119894) (6)







119876 (119866119894 Δ119905) =

⟨119877 (119866119894 119905) 119877 (119866

119894 119905 minus Δ119905)⟩ minus ⟨119877 (119866

119894 119905)⟩ ⟨119877 (119866

119894 119905 minus Δ119905)⟩

radic⟨[119877 (119866119894 119905) minus ⟨119877 (119866

119894 119905)⟩]2⟩ ⟨[119877 (119866

119894 119905 minus Δ119905) minus ⟨119877 (119866

119894 119905 minus Δ119905)⟩

2]⟩

(7)



and119877(119866119894 119905) = ln[119901(119866

119894 119905)]minusln[119901(119866




119894 Δ119905) is minus1 le 119876(119866

119894 Δ119905) le 1 If the 119876(119866

119894 Δ119905) is





Input data

SVR Mapping

Output model


G2

Gm


SVG1

SVG2

SVGm

GSVM1

GSVM2

GSVM119898

GSVM





MACD 0957

Signal 0250

50

100

150

200

0

10

20

30

40

minus5

0

5




max

minus

1

2

119897

sum

119894=1

119897

sum

119895=1

(119886119894minus 119886lowast

119894) (119886119894minus 119886lowast

119894)119870 (119909

119894 119909119895)

+

119897

sum

119894=1

119886119894(119889119894minus 120576) minus 119886

lowast

119894(119889119894+ 120576)

Subject to 119897

sum

119894=1

(119886119894minus 119886lowast

119894) = 0

0 le 119886119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

0 le 119886lowast

119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

(8)


4 Experiments







M cl

ose v

alue


100

150

200




NMSE = 1

1205902119873

119873

sum

119894=1

(119910119894minus 119910119894)2

where 1205902 = 1

119873 minus 1

119873

sum

119894=1

(119910119894minus 119910119894)2 119910119894=

1

119873

119873

sum

119894=1

119910119894

MAE = 1

119873

119873

sum

119894=1

1003816100381610038161003816119910119894minus 119910119894

1003816100381610038161003816

DS = 100

119873

119873

sum

119894=1

119889119894 where 119889

119894=


0 otherwise(9)

















1000 400030002000

02

10

08

06

04


NM

SE




5 Conclusions





Acknowledgments



References












































Volume 2014




Journal of











Journal of


Function Spaces






Algebra










Input data

SVR Mapping

Output model


G2

Gm


SVG1

SVG2

SVGm

GSVM1

GSVM2

GSVM119898

GSVM





MACD 0957

Signal 0250

50

100

150

200

0

10

20

30

40

minus5

0

5




max

minus

1

2

119897

sum

119894=1

119897

sum

119895=1

(119886119894minus 119886lowast

119894) (119886119894minus 119886lowast

119894)119870 (119909

119894 119909119895)

+

119897

sum

119894=1

119886119894(119889119894minus 120576) minus 119886

lowast

119894(119889119894+ 120576)

Subject to 119897

sum

119894=1

(119886119894minus 119886lowast

119894) = 0

0 le 119886119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

0 le 119886lowast

119894le 119862119876 (119866

119894 Δ119905) 119894 = 1 2 119897

(8)


4 Experiments







M cl

ose v

alue


100

150

200




NMSE = 1

1205902119873

119873

sum

119894=1

(119910119894minus 119910119894)2

where 1205902 = 1

119873 minus 1

119873

sum

119894=1

(119910119894minus 119910119894)2 119910119894=

1

119873

119873

sum

119894=1

119910119894

MAE = 1

119873

119873

sum

119894=1

1003816100381610038161003816119910119894minus 119910119894

1003816100381610038161003816

DS = 100

119873

119873

sum

119894=1

119889119894 where 119889

119894=


0 otherwise(9)

















1000 400030002000

02

10

08

06

04


NM

SE




5 Conclusions





Acknowledgments



References












































Volume 2014




Journal of











Journal of


Function Spaces






Algebra










M cl

ose v

alue


100

150

200




NMSE = 1

1205902119873

119873

sum

119894=1

(119910119894minus 119910119894)2

where 1205902 = 1

119873 minus 1

119873

sum

119894=1

(119910119894minus 119910119894)2 119910119894=

1

119873

119873

sum

119894=1

119910119894

MAE = 1

119873

119873

sum

119894=1

1003816100381610038161003816119910119894minus 119910119894

1003816100381610038161003816

DS = 100

119873

119873

sum

119894=1

119889119894 where 119889

119894=


0 otherwise(9)

















1000 400030002000

02

10

08

06

04


NM

SE




5 Conclusions





Acknowledgments



References












































Volume 2014




Journal of











Journal of


Function Spaces






Algebra










1000 400030002000

02

10

08

06

04


NM

SE




5 Conclusions





Acknowledgments



References












































Volume 2014




Journal of











Journal of


Function Spaces






Algebra


































Volume 2014




Journal of











Journal of


Function Spaces






Algebra
















Volume 2014




Journal of











Journal of


Function Spaces






Algebra









Date post:	31-Dec-2020
Category:	Documents
Upload:	others
View:	6 times
Download:	0 times

Research Article Financial Time Series Forecasting Using Directed … · 2019. 7. 31. · Financial...

Documents