28.11.2001 Data mining - Applications, future, and summary
Course on Data Mining (581550-4)
[Course schedule diagram: Intro/Ass. Rules, Episodes, Text Mining, Clustering, KDD Process, Appl./Summary, Home Exam; dates 24./26.10., 30.10., 7.11., 14.11., 21.11., 28.11.]
Today 28.11.2001
• Today's subject:
o Data mining applications, future, and summary
• The program at the end of this week:
o Exercise: KDD Process
o Seminar: KDD Process
Applications, future and summary
• Data mining applications
• How to choose a data mining system?
• Data mining system products and research prototypes
• Additional themes on data mining
• Social impact of data mining
• Trends in data mining
• Summary
Data mining applications
• Data mining is a young discipline with wide and diverse applications
o general principles of data mining versus domain-specific, effective data mining tools for particular applications
• Application domains, e.g.,
o biomedical and DNA data analysis
o financial data analysis
o retail industry
o telecommunication industry
Biomedical data mining and DNA analysis
• DNA sequences consist of 4 basic building blocks (nucleotides): adenine (A), cytosine (C), guanine (G), and thymine (T).
• Gene: a sequence of hundreds of individual nucleotides arranged in a particular order
• Semantic integration of heterogeneous, distributed genome databases
o data cleaning and data integration methods developed in data mining will help
DNA analysis – Examples (1)
• Similarity search and comparison among DNA sequences
o compare the frequently occurring patterns of each class
o identify gene sequence patterns that play roles in various diseases
• Association analysis: identification of co-occurring gene sequences
o most diseases are triggered by a combination of genes acting together
o may help determine the kinds of genes that are likely to co-occur together in target samples
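The association analysis described above can be illustrated with a minimal sketch. The sample data and gene names are hypothetical, and the code only counts pairwise co-occurrence (support) and a simple rule confidence; it is not a full association rule miner.

```python
from collections import Counter
from itertools import combinations

# Hypothetical samples: each set lists the genes found active in one sample.
samples = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneB"},
    {"geneA", "geneC"},
    {"geneB", "geneC"},
    {"geneA", "geneB", "geneD"},
]

def pair_support(samples):
    """Return the fraction of samples in which each gene pair co-occurs."""
    counts = Counter()
    for s in samples:
        for pair in combinations(sorted(s), 2):
            counts[pair] += 1
    return {pair: c / len(samples) for pair, c in counts.items()}

def confidence(samples, antecedent, consequent):
    """Estimate P(consequent present | antecedent present)."""
    with_a = [s for s in samples if antecedent in s]
    if not with_a:
        return 0.0
    return sum(consequent in s for s in with_a) / len(with_a)

support = pair_support(samples)
# Keep only pairs that co-occur in at least 60% of the samples.
frequent = {p: v for p, v in support.items() if v >= 0.6}
```

With this toy data only the pair (geneA, geneB) passes the 60% support threshold; the threshold itself is an arbitrary choice made for the illustration.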
DNA analysis – Examples (2)
• Path analysis: linking genes to different disease development stages
o different genes may become active at different stages of the disease
o develop pharmaceutical interventions that target the different stages separately
• Visualization tools and genetic data analysis
Data mining for financial data analysis (1)
• Collected data is often relatively complete, reliable, and of high quality
• Design and construction of data warehouses for multidimensional data analysis and data mining
o view the debt and revenue changes, e.g., by month
o access statistical information, e.g., trend
• Loan payment prediction/consumer credit policy analysis
o loan payment performance
o consumer credit rating
Data mining for financial data analysis (2)
• Classification and clustering of customers for targeted marketing
o multidimensional segmentation to identify customer groups or associate a new customer to an appropriate customer group
• Detection of money laundering and other financial crimes
o integration of multiple DBs
o tools: data visualization, linkage analysis, classification, clustering tools, outlier analysis, and sequential pattern analysis tools
Data mining for retail industry (1)
• Retail industry: huge amounts of data on sales, customer shopping history, etc.
• Applications of retail data mining:
o identify customer buying behaviors
o discover customer shopping patterns and trends
o improve the quality of customer service
o achieve better customer retention and satisfaction
o enhance goods consumption ratios
o design more effective goods transportation and distribution policies
Data mining in retail industry (2)
• Design and construction of data warehouses based on the benefits of data mining (multidimensional analysis of sales, customers, products, time, and region)
• Analysis of the effectiveness of sales campaigns
• Analysis of customer loyalty
o use customer loyalty card information to register sequences of purchases of particular customers
o use sequential pattern mining to investigate changes in customer consumption or loyalty
o suggest adjustments on the pricing and variety of goods
• Purchase recommendation and cross-reference of items
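As a rough illustration of the loyalty-card analysis above, the following sketch counts how many customers bought one item and later another. The card ids, items, and the support threshold are all hypothetical, and real sequential pattern mining handles longer patterns and time windows.

```python
from collections import Counter

# Hypothetical purchase sequences keyed by loyalty-card id, oldest purchase first.
purchases = {
    "card1": ["printer", "paper", "ink"],
    "card2": ["printer", "ink"],
    "card3": ["paper", "printer", "ink"],
    "card4": ["ink", "paper"],
}

def ordered_pair_support(sequences):
    """Count customers whose history contains item a followed later by item b."""
    counts = Counter()
    for seq in sequences.values():
        seen_pairs = set()
        for i, a in enumerate(seq):
            for b in seq[i + 1:]:
                if a != b:
                    seen_pairs.add((a, b))
        counts.update(seen_pairs)  # each customer counts once per pattern
    return counts

support = ordered_pair_support(purchases)
# Keep patterns supported by at least 3 customers.
frequent = {p: c for p, c in support.items() if c >= 3}
```

Here the only frequent pattern is "printer, then ink", which is the kind of result that could feed a purchase recommendation.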
Data mining for telecommunication industry (1)
• A rapidly expanding and highly competitive industry and a great demand for data mining
o understand the business involved
o identify telecommunication patterns
o catch fraudulent activities
o make better use of resources
o improve the quality of service
• Multidimensional analysis of telecommunication data
o e.g., calling-time, duration of call, location of caller, type of call, etc.
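A minimal sketch of multidimensional analysis over call data follows. The records, the dimension names, and the `roll_up` helper are assumptions made for illustration; a data warehouse would do this with OLAP operations rather than hand-written loops.

```python
from collections import defaultdict

# Hypothetical call records: (hour of day, call type, caller region, duration in minutes).
calls = [
    (9, "local", "north", 3.0),
    (9, "international", "north", 12.0),
    (20, "local", "south", 5.0),
    (20, "local", "north", 7.0),
    (21, "international", "south", 20.0),
]
# Position of each dimension inside a record.
DIMENSIONS = {"hour": 0, "type": 1, "region": 2}

def roll_up(calls, dims):
    """Total call duration grouped by the chosen dimensions."""
    totals = defaultdict(float)
    for call in calls:
        key = tuple(call[DIMENSIONS[d]] for d in dims)
        totals[key] += call[3]
    return dict(totals)

by_type = roll_up(calls, ["type"])
by_region_type = roll_up(calls, ["region", "type"])
```

Choosing a different list of dimensions gives a different view of the same data, which is the point of the multidimensional approach.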
Data mining for telecommunication industry (2)
• Fraudulent pattern analysis and the identification of unusual patterns
o identify potentially fraudulent users and their atypical usage patterns
o detect attempts to gain fraudulent entry to customer accounts
o discover unusual patterns which may need special attention
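One simple way to look for atypical usage is to flag days that deviate strongly from a user's own average. The sketch below uses a z-score rule; the data, the threshold of 2.5, and the choice of z-scores (rather than any method named in the lecture) are assumptions.

```python
from statistics import mean, pstdev

# Hypothetical daily call minutes per user over ten days.
usage = {
    "user1": [30, 32, 29, 31, 30, 28, 33, 30, 31, 29],
    "user2": [28, 30, 31, 29, 30, 27, 32, 30, 29, 140],  # last day is far above the norm
}

def unusual_days(history, threshold=2.5):
    """Return indices of days whose z-score exceeds the threshold."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(history) if abs(x - mu) / sigma > threshold]

# Keep only users with at least one flagged day.
flagged = {}
for user, history in usage.items():
    days = unusual_days(history)
    if days:
        flagged[user] = days
```

A flagged day is only a candidate for special attention; whether it is fraud still has to be decided by further analysis.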
Data mining for telecommunication industry (3)
• Multidimensional association and sequential pattern analysis
o find usage patterns for a set of communication services by customer group, by month, etc.
o promote the sales of specific services
o improve the availability of particular services in a region
• Use of visualization tools in telecommunication data analysis
How to choose a data mining system? (1)
• Commercial data mining systems have little in common
o different data mining functionality or methodology
o may even work with completely different kinds of data sets
• To select a system, we need a multidimensional view of existing systems
How to choose a data mining system? (2)
• Data types: relational, transactional, text, time sequence, spatial?
• System issues
o running on only one or on several operating systems?
o a client/server architecture?
o provide Web-based interfaces and allow XML data as input and/or output?
• Data sources
o ASCII text files, multiple relational data sources
o support ODBC connections (OLE DB, JDBC)?
How to choose a data mining system? (3)
• Data mining functions and methodologies
o one vs. multiple data mining functions
o one vs. variety of methods per function
• Coupling with DB and/or data warehouse systems
o four forms of coupling: no coupling, loose coupling, semitight coupling, and tight coupling
• Visualization tools: data visualization, mining result visualization, mining process visualization, and visual data mining
How to choose a data mining system? (4)
• Scalability
o row (or database size) scalability
o column (or dimension) scalability
o curse of dimensionality: it is much more challenging to make a system column scalable than row scalable
• Data mining query language and graphical user interface
o easy-to-use and high-quality graphical user interface
o essential for user-guided, highly interactive data mining
Data mining systems (1)
• IBM Intelligent Miner
o a wide range of data mining algorithms
o scalable mining algorithms
o toolkits: neural network algorithms, statistical methods, data preparation, and data visualization tools
o tight integration with IBM's DB2 relational database system
• SAS Enterprise Miner
o a variety of statistical analysis tools
o data warehouse tools and multiple data mining algorithms
Data mining systems (2)
• SGI MineSet
o multiple data mining algorithms and advanced statistics
o advanced visualization tools
• Clementine (SPSS)
o an integrated data mining development environment for end-users and developers
o multiple data mining algorithms and visualization tools
Data mining systems (3)
• DBMiner (DBMiner Technology Inc.)
o multiple data mining modules: discovery-driven OLAP analysis, association, classification, and clustering
o efficient association and sequential-pattern mining functions, and a visual classification tool
o mining both relational databases and data warehouses
• Microsoft SQL Server 2000
o integrates DB and OLAP with mining
o supports the OLE DB for Data Mining (OLE DB for DM) standard
Additional themes on data mining
• Web mining
• Visual data mining
• Audio data mining
• Theoretical foundations of data mining
• Data mining and intelligent query answering
Web mining (1)
• The WWW is a huge, widely distributed, global information service center for
o information services: news, advertisements, consumer information, education, government, e-commerce, etc.
o hyper-link information
o access and usage information
Web mining (2)
• Web search engines:
o index-based: search the Web, index Web pages, and build and store huge keyword-based indices
o help locate sets of Web pages containing certain keywords
• Deficiencies of the web search engines:
o a topic of any breadth may easily contain hundreds of thousands of documents
o many documents that are highly relevant to a topic may not contain keywords defining them
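The keyword-index idea and its second deficiency can be seen in a small sketch. The pages and their text are hypothetical, and the index is a plain in-memory inverted index, far simpler than a real search engine.

```python
from collections import defaultdict

# Hypothetical pages; the third is about cars but never uses the word "automobile".
pages = {
    "p1": "automobile dealers and automobile prices",
    "p2": "history of the automobile industry",
    "p3": "compare car models and engine sizes",
}

def build_index(pages):
    """Map each keyword to the set of pages that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, *keywords):
    """Return the pages that contain all of the given keywords."""
    results = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*results) if results else set()

index = build_index(pages)
hits = search(index, "automobile")
```

The query finds p1 and p2 but misses p3, although p3 is relevant to the topic: a purely keyword-based index cannot return a page that does not contain the query term.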
Web mining (3)
• WWW provides rich sources for data mining
• Challenges:
o too huge for effective data warehousing and data mining
o too complex and heterogeneous: no standards and structure
Web mining (4)
• Web mining is a more challenging task than constructing and using web search engines
• Web mining searches for
o web access patterns
o web structures
o regularity and dynamics of web contents
Web mining (5)
• Web mining taxonomy:
o Web structure mining
o Web content mining: Web page content mining, search result mining
o Web usage mining: general access pattern tracking, customized usage tracking
Visual data mining (1)
• Visualization: use of computer graphics to create visual images which aid in the understanding of complex, often massive representations of data
• Visual data mining: the process of discovering implicit, but useful knowledge from large data sets using visualization techniques
Visual data mining (2)
• Purpose of visualization
o gain insight into an information space by mapping data onto graphical primitives
o provide qualitative overview of large data sets
o search for patterns, trends, structure, irregularities, relationships among data
o help find interesting regions and suitable parameters for further quantitative analysis
o provide a visual proof of computer representations derived
Visual data mining (3)
• Integration of visualization and data mining
o data visualization
o data mining result visualization
o data mining process visualization
o interactive visual data mining
Data visualization
• Data in a database or data warehouse can be viewed
o at different levels of granularity or abstraction
o as different combinations of attributes or dimensions
• Data can be presented in various visual forms
[Figure: Box-plots in Statsoft]
Data mining result visualization
• Presentation of the results or knowledge obtained from data mining in visual forms
• Examples
o scatter plots and box-plots
o association rules
o clusters
o outliers
o generalized rules
[Figure: Scatter plots in SAS Enterprise Miner]
[Figure: Association rules in MineSet 3.0]
[Figure: A decision tree in MineSet 3.0]
[Figure: Cluster groupings in IBM Intelligent Miner]
Data mining process visualization
• Presentation of the various processes of data mining in visual forms so that users can see
o how the data are extracted
o from which database or data warehouse they are extracted
o how the selected data are cleaned, integrated, preprocessed, and mined
o which method is selected at data mining
o where the results are stored
o how they may be viewed
[Figure: Data mining processes in Clementine]
Interactive visual data mining
• Using visualization tools in the data mining process to help users make smart data mining decisions
• Example
o display the data distribution in a set of attributes using colored sectors or columns
o use the display to decide which sector should first be selected for classification and where a good split point for this sector may be
[Figure: Interactive visual mining by perception-based classification]
Audio data mining
• Audio signals (sounds, music) are used to indicate the patterns of data, or the features of data mining results
• An interesting alternative to visual mining
• An inverse task of mining audio (such as music) databases, which is to find patterns from audio data
• Visual data mining may disclose interesting patterns using graphical displays, but requires users to concentrate on watching patterns
• In audio data mining, the user listens to pitches, rhythms, tune, and melody in order to identify anything interesting or unusual
Theoretical foundations of data mining (1)
• Data reduction
o the basis of data mining is to reduce the data representation (use, e.g., histograms or clustering)
o trades accuracy for speed
• Data compression
o the basis of data mining is to compress the given data by encoding in terms of bits, association rules, decision trees, clusters, etc.
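The data reduction view can be made concrete with a histogram. The sketch below replaces a list of values by bucket counts and answers a range-count query from the counts alone; the values, bucket boundaries, and the uniform-spread assumption inside each bucket are illustrative choices.

```python
# Hypothetical attribute values (for example, customer ages).
values = [21, 23, 25, 31, 34, 38, 42, 47, 55, 58, 61, 64]

def equi_width_histogram(values, low, high, n_buckets):
    """Count how many values fall into each equal-width bucket of [low, high)."""
    width = (high - low) / n_buckets
    counts = [0] * n_buckets
    for v in values:
        idx = min(int((v - low) / width), n_buckets - 1)
        counts[idx] += 1
    return counts, width

def estimate_count(counts, width, low, lo, hi):
    """Estimate how many values lie in [lo, hi), assuming a uniform spread per bucket."""
    total = 0.0
    for i, c in enumerate(counts):
        b_lo, b_hi = low + i * width, low + (i + 1) * width
        overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
        total += c * overlap / width
    return total

counts, width = equi_width_histogram(values, 20, 70, 5)
exact = sum(1 for v in values if 25 <= v < 45)       # scans every value
approx = estimate_count(counts, width, 20, 25, 45)   # uses only the five bucket counts
```

On this data the exact answer is 5 and the histogram estimate is 5.5: the reduced representation is much smaller and quicker to query, at the cost of some accuracy.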
Theoretical foundations of data mining (2)
• Pattern discovery
o the basis of data mining is to discover patterns occurring in the database, e.g., associations, classification models and sequential patterns
• Probability theory
o the basis of data mining is to discover joint probability distributions of random variables
Theoretical foundations of data mining (3)
• Microeconomic view
o a view of utility
o the task of data mining is finding patterns that are interesting only to the extent that they can be used in the decision-making process of some enterprise
Theoretical foundations of data mining (4)
• Inductive databases
o data mining is the problem of performing inductive logic on databases
o the task is to query the data and the theory (i.e., patterns) of the database
o popular among many researchers in database systems
Data mining and intelligent query answering (1)
• Query answering
o direct query answering: returns exactly what is being asked
o intelligent (or cooperative) query answering: analyzes the intent of the query and provides generalized, neighborhood or associated information relevant to the query
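The difference between the two kinds of query answering can be sketched as follows. The product catalogue, the 10% price slack, and both function names are hypothetical; the sketch only shows neighborhood and summary information being added to a direct answer.

```python
# Hypothetical product catalogue.
products = [
    {"name": "laptop A", "category": "laptop", "price": 900},
    {"name": "laptop B", "category": "laptop", "price": 1050},
    {"name": "laptop C", "category": "laptop", "price": 1500},
    {"name": "tablet D", "category": "tablet", "price": 400},
]

def direct_answer(products, category, max_price):
    """Return exactly the items that satisfy the query."""
    return [p["name"] for p in products
            if p["category"] == category and p["price"] <= max_price]

def cooperative_answer(products, category, max_price, slack=0.1):
    """Also return near misses (up to `slack` above the limit) and a short summary."""
    exact = direct_answer(products, category, max_price)
    near = [p["name"] for p in products
            if p["category"] == category
            and max_price < p["price"] <= max_price * (1 + slack)]
    in_category = [p["price"] for p in products if p["category"] == category]
    summary = {"count": len(in_category), "min_price": min(in_category, default=None)}
    return {"exact": exact, "near_misses": near, "summary": summary}

answer = cooperative_answer(products, "laptop", 1000)
```

The direct answer lists only laptop A, while the cooperative answer also mentions laptop B, which is slightly over the limit, and reports how many laptops exist and the lowest price.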
Data mining and intelligent query answering (2)
• Some users may not have a clear idea of exactly what to mine or what is contained in the database
• Intelligent query answering analyzes the user's intent and answers queries in an intelligent way
Data mining and intelligent query answering (3)
• A general framework for the integration of data mining and intelligent query answering
o data query: finds concrete data stored in a database
o knowledge query: finds rules, patterns, and other kinds of knowledge in a database
Data mining and intelligent query answering (4)
• For example, three ways to improve on-line shopping service
o informative query answering by providing summary information
o suggestion of additional items based on association analysis
o product promotion by sequential pattern mining
Social impact of data mining
• Is data mining a hype?
• Data mining: merely managers’ business or everyone’s?
• Privacy and data security
Is data mining a hype, or will it be persistent?
• Data mining is a technology
• Technological life cycle:
o innovators
o early adopters
o chasm
o early majority
o late majority
o laggards
Life Cycle of Technology Adoption
• Data mining is at the chasm!?
o existing data mining systems are too generic
o need business-specific data mining solutions and smooth integration of business logic with data mining functions
Whose business is it?
• Data mining will surely be an important tool for managers’ decision making
• The amount of the available data is increasing, and data mining systems will be more affordable
• Multiple personal uses
o mine your family's medical history to identify genetically-related medical conditions
o mine the records of the companies you deal with
o mine data on stocks and company performance, etc.
• Invisible data mining: build data mining functions into many intelligent tools
Threat to privacy and data security?
• “Big Brother” is carefully watching you
• Profiling information is collected constantly
o you use your credit card, supermarket loyalty card, or frequent flyer card, or apply for any of the above
o you surf the Web, reply to an Internet newsgroup, subscribe to a magazine, rent a video, or fill out a contest entry form
• Collection of personal data may be beneficial for companies and consumers, but there is also potential for misuse
Protect privacy and data security
• Fair information practices
o international guidelines for data privacy protection
o cover aspects relating to data collection, purpose, use, quality, openness, individual participation, and accountability
o purpose specification and use limitation
o openness: individuals have the right to know what information is collected about them, who has access to the data, and how the data are being used
• Develop and use data security-enhancing techniques, e.g., blind signatures, biometric encryption, and anonymous databases
Trends in data mining (1)
• Application exploration
o development of application-specific data mining systems
o invisible data mining (mining as built-in function)
• Scalable data mining methods
o constraint-based mining: use of constraints to guide data mining systems in their search for interesting patterns
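A minimal sketch of constraint-based mining: a level-wise frequent itemset search in which a user-supplied price limit prunes candidates before they are counted. The transactions, prices, and thresholds are hypothetical, and the candidate generation is deliberately simpler than a full Apriori implementation.

```python
from itertools import combinations

# Hypothetical transactions and item prices.
transactions = [
    {"tv", "cable", "stand"},
    {"tv", "cable"},
    {"cable", "stand"},
    {"tv", "cable", "stand"},
]
price = {"tv": 500, "cable": 10, "stand": 80}

def frequent_itemsets(transactions, min_count, max_total_price):
    """Level-wise search that drops candidates violating the price constraint early."""
    items = sorted({i for t in transactions for i in t})
    result = {}
    size = 1
    candidates = [frozenset([i]) for i in items]
    while candidates:
        survivors = []
        for c in candidates:
            # Anti-monotone constraint: if a set is too expensive, so is every superset.
            if sum(price[i] for i in c) > max_total_price:
                continue
            count = sum(1 for t in transactions if c <= t)
            if count >= min_count:
                result[c] = count
                survivors.append(c)
        size += 1
        # Join surviving sets to form candidates that are one item larger.
        candidates = list({a | b for a, b in combinations(survivors, 2)
                           if len(a | b) == size})
    return result

found = frequent_itemsets(transactions, min_count=2, max_total_price=520)
```

Because the price limit only gets harder to satisfy as a set grows, any candidate that breaks it can be discarded without counting, and none of its supersets is ever generated.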
Trends in data mining (2)
• Integration of data mining with database systems, data warehouse systems, and web database systems
• Standardization of data mining language
o a standard will facilitate systematic development, improve interoperability, and promote the education and use of data mining systems in industry and society
• Visual data mining
Trends in data mining (3)
• New methods for mining complex types of data
o more research is required towards the integration of data mining methods with existing data analysis techniques for the complex types of data
• Web mining
• Privacy protection and information security in data mining
Summary (1)
• Data mining: semi-automatic discovery of interesting patterns from large data sets
• Knowledge discovery is a process:
o preprocessing
o data mining
o postprocessing
• Application areas: retail, telecommunication, Web mining, log analysis, …
Summary (2)
• Knowledge can be mined from different kinds of databases (relational, object-oriented, spatial, WWW, …)
• We can mine different kinds of knowledge (characterization, clustering, association, …)
• Data mining also uses techniques from other areas of computer science (machine learning, statistics, visualization, …)
Summary (3)
• Some useful data mining techniques:
o association rules
o episodes
o text mining
o classification
o clustering
• There are also many other data mining methods/techniques developed, but not covered in this course
Summary (4)
• It is important to
o study theoretical foundations of data mining
o watch privacy and security issues in data mining
• The future of data mining seems promising, even without hype
Data mining conferences
• 1989 IJCAI Workshop
• 1991-1994 KDD Workshops
• 1995-1998 KDD Conferences
• 1998 ACM SIGKDD
• 1999-> SIGKDD Conferences
• And many smaller/new DM conferences, e.g.,
o PAKDD, PKDD
o SIAM-Data Mining, (IEEE) ICDM
Useful References on Data Mining
• DM:
o Conferences: KDD, PKDD, PAKDD, ...
o Journals: Data Mining and Knowledge Discovery, CACM
• DM/DB:
o Conferences: ACM-SIGMOD/PODS, VLDB, ...
o Journals: ACM-TODS, J. ACM, IEEE-TKDE, JIIS, ...
• AI/ML:
o Conferences: Machine Learning, AAAI, IJCAI, ...
o Journals: Machine Learning, Artificial Intelligence, ...
Reminder: Course Organization
Course Evaluation
• Passing the course: min 30 points
o home exam: min 13 points (max 30 points)
o exercises/experiments: min 8 points (max 20 points); at least 3 returned and reported experiments
o group presentation: min 4 points (max 10 points)
• Remember also the other requirements:
o attending the lectures (5/7)
o attending the seminars (4/5)
o attending the exercises (4/5)
Thanks to Jiawei Han from Simon Fraser University for his slides, which greatly helped in preparing this lecture!
Data mining applications, future, and summary