Analyzing Stock Quotes using Data Mining Techniques
Name of Student: To Yi Fun
University Number: 2010149103
First Presentation, Final Year Project, 2013
Flow of Presentation
•Aim of this classification for stock trading
•Theory of Classification
•Decision Tree making
•Introduction of the application
•Structure and technologies used in this application
•Preparation
•Interface
•Demonstration
•Data Analysis
•What to do next
•Q&A
Aim
•Find a model that expresses the class attribute as a function of the other attributes, so that previously unseen records can be assigned to a class
•e.g. find a classifier for historic stock prices; group companies into different classes for inspection
•Classifiers: decision tree, rule-based classifier
Theory for Decision Tree
•A series of test conditions that sorts the instances into classes
•Greedy: at each step, split the records on the attribute that best satisfies the splitting criterion
•Discrete attributes: 2-way split or multi-way split
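The difference between a 2-way and a multi-way split on a discrete attribute can be sketched as follows. This is a minimal illustration, not the project's code: the record layout and the names `industry`, `buy`, `multiway_split`, and `two_way_split` are assumptions made for the example.

```python
from collections import defaultdict


def multiway_split(records, attribute):
    """Partition records into one group per distinct attribute value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record)
    return dict(groups)


def two_way_split(records, attribute, left_values):
    """Partition records into two groups: values in left_values vs. all others."""
    left = [r for r in records if r[attribute] in left_values]
    right = [r for r in records if r[attribute] not in left_values]
    return left, right


# Hypothetical records: 'industry' is a discrete attribute, 'buy' is the class.
records = [
    {"industry": "Finance", "buy": True},
    {"industry": "Energy", "buy": False},
    {"industry": "Pharma", "buy": True},
    {"industry": "Finance", "buy": False},
]

by_industry = multiway_split(records, "industry")              # three groups
left, right = two_way_split(records, "industry", {"Finance"})  # two groups
```

A multi-way split creates one branch per value, while a 2-way split has to choose which values go to each side, so the tree builder has more candidate splits to evaluate.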
Theory for Decision Tree
•Best split
-Gini index: a generalization of the variance impurity
-Entropy: the amount of impurity in a set
•Aim: use a training set to build a classifier that classifies the testing set
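The two impurity measures can be written down in a few lines. The sketch below uses the standard textbook definitions (Gini = 1 - sum of squared class proportions, entropy = -sum of p log2 p); the function names and the `buy`/`hold` labels are illustrative assumptions, not taken from the application.

```python
import math
from collections import Counter


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: minus the sum of p * log2(p) over the classes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def weighted_impurity(groups, measure):
    """Impurity of a split: size-weighted average over the child groups."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * measure(group) for group in groups)


mixed = ["buy", "buy", "hold", "hold"]      # evenly mixed node
split = [["buy", "buy"], ["hold", "hold"]]  # a split that separates the classes

mixed_gini = gini(mixed)
mixed_entropy = entropy(mixed)
split_gini = weighted_impurity(split, gini)
```

Both measures are highest for an evenly mixed node and zero for a pure node, so the best split is the one with the lowest weighted impurity across its children.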
Application Structure
[Diagram: Download, Raw data, CSV2MYSQLGENERATOR, Processed Data, Filter Query (Splitting); stages: Data processing, Information presentation and arithmetic operation]
Preparation
• Download the historic stock data for the 30 Dow shares, e.g. Pfizer, Bank of America, American Express, Exxon
• Convert the data to .csv files to be processed by the CSV2MYSQLGENERATOR program; the result is a lengthy list of SQL commands
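The conversion from CSV rows to SQL commands can be illustrated with a short sketch. This is not the CSV2MYSQLGENERATOR program itself. The column names, the table name `quotes`, the symbol `PFE`, and the sample row are all assumptions made for the example.

```python
import csv
import io


def csv_to_insert_statements(csv_text, table, symbol):
    """Turn each CSV row into one INSERT statement (illustrative only)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    statements = []
    for row in reader:
        # Derive column names from the CSV header, e.g. "Adj Close" -> "adj_close".
        columns = ["symbol"] + [name.strip().lower().replace(" ", "_")
                                for name in row]
        values = [symbol] + [value.strip() for value in row.values()]
        # Quote every value and escape embedded single quotes for simplicity.
        quoted = ", ".join("'" + value.replace("'", "''") + "'"
                           for value in values)
        statements.append(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({quoted});"
        )
    return statements


# Hypothetical one-row sample; real downloads may use different columns.
sample = "Date,Open,High,Low,Close,Volume\n2013-01-02,25.5,26.0,25.1,25.9,1000000\n"
statements = csv_to_insert_statements(sample, table="quotes", symbol="PFE")
```

Building SQL by string concatenation is acceptable for a one-off import of trusted files; for anything user-supplied, parameterized queries are the safer choice.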
Data Processing
• Categorize the stocks into different types by industry
• Dow 30 as the training set and 8 more stocks as the testing set, mainly large-scale companies
Data Processing
• Attributes Setting
-HL_30DaysAverage: Tendency
-HL_ChangeDaily: Change
-HL_ChangePerc: Difference
-HL_VolChange: Popularity
• Class
-B_RiseMore3Perc5Day: Buy Signal
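How such attributes might be derived from a price series is sketched below. The slides do not give the exact formulas, so this is an assumption throughout: HL_30DaysAverage is read as a simple moving average, and B_RiseMore3Perc5Day as "the price 5 trading days later is more than 3% above today's price". The function names and sample prices are hypothetical.

```python
def moving_average(prices, window):
    """Average of the latest `window` prices at each index (None until enough data)."""
    result = []
    for i in range(len(prices)):
        if i + 1 < window:
            result.append(None)
        else:
            result.append(sum(prices[i + 1 - window:i + 1]) / window)
    return result


def rise_label(prices, horizon=5, threshold=0.03):
    """True where the price `horizon` days later is more than `threshold` higher."""
    labels = []
    for i, price in enumerate(prices):
        if i + horizon >= len(prices):
            labels.append(None)  # not enough future data to decide
        else:
            labels.append(prices[i + horizon] / price - 1.0 > threshold)
    return labels


# Hypothetical closing prices; a short window keeps the example small.
closes = [100.0, 101.0, 102.0, 101.0, 103.0, 104.0, 103.0]
averages = moving_average(closes, window=3)
labels = rise_label(closes)  # assumed reading of B_RiseMore3Perc5Day
```

Records near the end of the series have no label because the outcome 5 days ahead is not yet known; such rows would have to be left out of the training set.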
User Interface
• Use the MySQL connector to load the processed data into the C# application
• Three major components:
-Input
-Result Log
-Test
Demonstration
Result
Result Analysis
• Attributes Setting
-HL_30DaysAverage: Tendency
-HL_ChangeDaily: Change
-HL_ChangePerc: Difference
-HL_VolChange: Popularity
What to do Next
• Implement a more user-friendly UI that presents the stock prices, visualizes the tree, and provides a query service
• Implement a splitting algorithm using the Gini index and compare the results generated by the different algorithms
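One way to set up the planned comparison is to run the same split search with each criterion and see which attribute it selects. The sketch below assumes the existing criterion is entropy-based, which the slides do not state explicitly; the discretized attributes `trend` and `volume`, the class `buy`, and the toy records are hypothetical.

```python
import math
from collections import Counter, defaultdict


def gini(labels):
    """Gini index of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_impurity(records, attribute, target, measure):
    """Size-weighted impurity of a multi-way split on `attribute`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record[target])
    n = len(records)
    return sum(len(g) / n * measure(g) for g in groups.values())


def best_attribute(records, attributes, target, measure):
    """Greedy choice: the attribute whose split has the lowest impurity."""
    return min(attributes,
               key=lambda a: split_impurity(records, a, target, measure))


# Hypothetical, already discretized training records.
records = [
    {"trend": "up", "volume": "high", "buy": True},
    {"trend": "up", "volume": "low", "buy": True},
    {"trend": "down", "volume": "high", "buy": False},
    {"trend": "down", "volume": "low", "buy": False},
]

choice_gini = best_attribute(records, ["trend", "volume"], "buy", gini)
choice_entropy = best_attribute(records, ["trend", "volume"], "buy", entropy)
```

On this toy set both criteria choose the same attribute because `trend` separates the classes perfectly. On real quotes the two criteria can occasionally rank attributes differently, which is the difference the comparison would look for.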
Q & A