Date post: | 08-Aug-2015 |
Category: | Data & Analytics |
Upload: | remziduzagac |
Data Mining & Methods
Remzi Duzagac
Introduction
What is data mining?
Usage Areas
How does it work?
Methods & Algorithms
Classification & Prediction
Clustering
Genetic Algorithm
Questions
Data Mining & Methods
Remzi Duzagac
February 11, 2015
What is data mining?
Data mining is the task of discovering interesting patterns from large amounts of data.
Why do we need data mining?
Computers have promised us a fountain of wisdom but delivered a flood of data
Data explosion problem
Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories
We are drowning in data, but starving for knowledge
The greatest problem of today is how to teach people to ignore the irrelevant, how to refuse to know things, before they are suffocated. For too many facts are as bad as none at all. (W.H. Auden)
Medical / Pharma
Computer Assisted Diagnosis (expert systems learning)
Characterization/prediction of patient’s response to product dosage
Identification of successful medical therapies (successful prescription patterns).
Study of relations between dosage and potentially related adverse events
Insurance and Health Care
Discovery of medical procedures that are claimed together through claims analysis
Identification of customers that are potential buyers for new policies.
Detection of behavior patterns capable of identifying risky customers.
Detection of fraudulent behavior.
Retail / Marketing
Discovery of buying behavior patterns
Detection of associations among customer characteristics.
Prediction of the probability that clients respond to a mailing.
Banking / Finance
Detection of fraudulent credit card usage patterns.
Risk management related to attribution of loans using scorecards.
Discovery of hidden correlations between different financial indicators.
Identification of stock trading rules from historical market data.
Computer Science
Image processing
Natural language processing
Information retrieval (search engines)
Bioinformatics
Real Estate
...
...
...
...
...
Steps
Data cleaning: handle missing values, noisy data, and inconsistent data
Data integration: merge data from multiple data stores
Data selection: select the data relevant to the analysis
Data transformation: aggregation (daily sales to weekly or monthly sales) or generalisation (street to city; age to young, middle age and senior)
Data mining: apply intelligent methods to extract patterns
Pattern evaluation: interesting patterns should contradict the user’s belief or confirm a hypothesis the user wished to validate
Knowledge presentation: visualisation and representation techniques to present the mined knowledge to the user
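The preparation steps above (cleaning, integration, selection, transformation) can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the records, field names, age cut-offs and helper functions are all hypothetical and do not come from the slides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records from two data stores (illustrative values only).
store_a = [
    {"day": date(2015, 2, 2), "age": 23, "sales": 10.0, "note": "x"},
    {"day": date(2015, 2, 3), "age": None, "sales": 12.0, "note": "y"},  # missing value
    {"day": date(2015, 2, 10), "age": 41, "sales": -5.0, "note": "z"},   # noisy value
]
store_b = [
    {"day": date(2015, 2, 4), "age": 67, "sales": 7.5, "note": "u"},
    {"day": date(2015, 2, 11), "age": 35, "sales": 9.0, "note": "v"},
]

def clean(records):
    """Data cleaning: drop rows with a missing age or a negative sales amount."""
    return [r for r in records if r["age"] is not None and r["sales"] >= 0]

def integrate(*stores):
    """Data integration: merge records from several stores into one list."""
    return [r for store in stores for r in store]

def select(records, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]

def age_group(age):
    """Generalisation: map an exact age to a coarse group (cut-offs are assumed)."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle age"
    return "senior"

def weekly_sales(records):
    """Aggregation: sum daily sales per (ISO year, ISO week)."""
    totals = defaultdict(float)
    for r in records:
        year, week, _ = r["day"].isocalendar()
        totals[(year, week)] += r["sales"]
    return dict(totals)

merged = integrate(clean(store_a), clean(store_b))
selected = select(merged, ["day", "age", "sales"])
groups = [age_group(r["age"]) for r in selected]
weekly = weekly_sales(selected)

print(groups)  # ['young', 'senior', 'middle age']
print(weekly)  # {(2015, 6): 17.5, (2015, 7): 9.0}
```

The mining, evaluation and presentation steps would then operate on `selected`, `groups` and `weekly` rather than on the raw records.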
Classification
Decision Tree Learning
Bayesian Learning (Naive Bayes, Bayesian Tree)
KNN
Neural Networks
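As an illustration of one classifier from this list, KNN can be sketched in a few lines. This is a minimal sketch, assuming Euclidean distance and a simple majority vote; the function name `knn_predict`, the training points and the labels are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional training data with two classes.
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"),
    ((7.0, 6.0), "high"),
    ((6.5, 7.0), "high"),
]

print(knn_predict(train, (1.8, 1.6)))  # low
print(knn_predict(train, (6.2, 6.4)))  # high
```

KNN builds no model in advance: all work happens at query time, so the cost of a prediction grows with the size of the training set.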
Prediction
Regression (Linear, Multiple, Non-Linear)
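Simple linear regression, the most basic of these, fits a line y = slope * x + intercept by ordinary least squares. The sketch below is a minimal illustration; the function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)       # 2.0 1.0
print(slope * 5 + intercept)  # prediction for x = 5: 11.0
```

Multiple regression extends the same idea to several input variables, and non-linear regression replaces the straight line with a curve.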
Clustering
Hierarchical clustering
K-Means
Markov Cluster Algorithm
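K-Means can be sketched as a loop that alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. This is a minimal sketch under stated assumptions: the first k points serve as initial centroids (a simplification chosen to keep the run deterministic), and the function name `kmeans` and the data are hypothetical.

```python
import math

def kmeans(points, k, iterations=10):
    """Basic k-means on a list of equal-length tuples; returns (centroids, clusters)."""
    # Assumption: use the first k points as initial centroids.
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: assignments no longer change the centroids
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data with two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(clusters[0])  # [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
print(clusters[1])  # [(9.0, 9.0), (8.0, 9.0), (9.0, 8.0)]
```

The result depends on the initial centroids, so practical implementations usually try several random starts and keep the best one.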
Genetic Algorithm
Genetic Algorithm
Genetic Programming
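A genetic algorithm evolves a population of candidate solutions through selection, crossover and mutation. The sketch below is a minimal illustration on a toy problem (maximising the number of 1 bits in a bit string). The problem, the parameter values, tournament selection, one-point crossover and elitism are all assumptions made for the example, not choices taken from the slides.

```python
import random

def fitness(bits):
    """Toy fitness: the number of 1 bits in the candidate."""
    return sum(bits)

def evolve(length=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Run a simple genetic algorithm; returns (best individual, best-fitness history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Elitism: carry the current best individual over unchanged.
        next_gen = [population[0][:]]
        while len(next_gen) < pop_size:
            # Tournament selection: the fitter of three random individuals becomes a parent.
            parent1 = max(rng.sample(population, 3), key=fitness)
            parent2 = max(rng.sample(population, 3), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, length)
            child = parent1[:cut] + parent2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = evolve()
print(fitness(best), "ones out of", len(best))
print("best fitness per generation:", history)
```

Because of elitism the best fitness never decreases from one generation to the next. Genetic programming applies the same evolutionary loop to program trees instead of fixed-length bit strings.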