RM World 2014: Design and implementation of data mining case studies

Post on 18-Nov-2014


Dr. Matthew North

Professor of Business & Information Systems

The College of Idaho

RapidMinerWorld 2014

Boston, USA

Design and Implementation of Data Mining Case Studies in RapidMiner

W&J/College of Idaho

A Focus on Teaching & Learning

Lots of folks do data mining

Lots of those use RapidMiner

Data mining education

Younger than the discipline

Strange collection of options

Science? Business? Math?

A Focus on Teaching & Learning

2005 – Present:

Books!

Tools!

Weka, Alphaminer, Clementine, more…

Education

Master’s, Certificates, Boot Camps

Data Mining for the Masses (2012)

Data Mining Cases in RapidMiner (2013)

A Focus on Teaching & Learning

The Case Method

Cases give context

My first Clementine class

Cases build on prior knowledge

Central Tendency > k-Means Clustering

Cases use Learning Theory

Concept Attainment

The Anatomy of a Data Mining Case

Activation

Stimulate prior knowledge/learning

Relevant to the data mining task

Addition

Introduce the new concept

k-Means Clustering

Comparison

Good/poor examples

Conclusions

RapidMiner World/Boston Example

Activation: Welcome to Boston!

There’s a lot to do here

Lots of cool/smart people

After hours connections can be valuable

Can data mining help make an effective fun/work connection?

Maybe so, if we rate options and then build option clusters

RapidMiner World/Boston Example

Addition: Options + Data = Choice

List our options, then rate from 0-3 across various types of fun
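As a rough illustration of the "rate options, then build option clusters" idea, here is a minimal pure-Python k-means sketch. The option names, the four rating columns, the ratings themselves, and the choice of k=3 are all hypothetical and are not taken from the talk, which builds its model in RapidMiner rather than in code. The initial centroids are simply the first k options, an assumption made to keep the sketch deterministic.

```python
# Hypothetical options and 0-3 ratings; none of these values come from the talk.
# Assumed rating columns: food, sports, history, nightlife.
ratings = {
    "Fenway Park tour":    [1, 3, 2, 0],
    "Freedom Trail walk":  [0, 1, 3, 0],
    "North End dinner":    [3, 0, 1, 1],
    "Seaport restaurant":  [3, 0, 0, 2],
    "Celtics game":        [1, 3, 0, 1],
    "Museum of Fine Arts": [0, 0, 3, 0],
}

def centroid(points):
    """Column-wise average of a list of rating vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def sq_dist(a, b):
    """Squared Euclidean distance between two rating vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))

def k_means(points, k, max_iter=50):
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [list(p) for p in points[:k]]
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(centroid(members) if members else centroids[c])
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids

names = list(ratings)
labels, centroids = k_means([ratings[n] for n in names], k=3)

clusters = {}
for name, lab in zip(names, labels):
    clusters.setdefault(lab, []).append(name)

for lab, members in sorted(clusters.items()):
    print(lab, members)
```

Each printed group is one "option cluster": options whose ratings sit close to the same average profile. Changing k or the ratings is the code-level equivalent of fine-tuning the model.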

RapidMiner World/Boston Example

Addition: Modeling the data

RapidMiner World/Boston Example

Comparison: What do you see?

RapidMiner World/Boston Example

Conclusions: So what?

Does this help you make a decision?

How can you fine tune your model?

To what other problems/datasets could you apply what you’ve learned?

Response to Reviewers

Use of a toy example

Transfer of knowledge to other scenarios is ideal

Sometimes a little help is good…

Loan Analyst Example

Activation:

You review loans looking for red flags

You know how to spot anomalies

Your work is time-consuming

Addition:

Problem loans don’t look like average ones

k-Means Clustering uses averages

Averages help create different groups
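The point that averages create groups, and that problem loans sit apart from the average ones, can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the loan IDs, the two features (a debt-to-income score and a late-payment score, both pre-scaled to the 0-1 range), k=2, and the rule that the cluster with the higher average late-payment score is the one to review. The talk itself does this in RapidMiner with the analyst's own data.

```python
# Hypothetical loans; columns are assumed scores already rescaled to 0-1:
# [debt_to_income_score, late_payment_score]
loans = {
    "L001": [0.20, 0.00],
    "L002": [0.25, 0.10],
    "L003": [0.30, 0.00],
    "L004": [0.22, 0.05],
    "L005": [0.85, 0.90],
    "L006": [0.90, 0.80],
}

def average(rows):
    """Column-wise average: the 'typical' loan of a group."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two loans."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(row, centroids):
    """Index of the group average closest to this loan."""
    return min(range(len(centroids)), key=lambda c: sq_dist(row, centroids[c]))

def k_means(rows, k, max_iter=50):
    # Assumption: seed with the first k loans to keep the sketch deterministic.
    centroids = [list(r) for r in rows[:k]]
    for _ in range(max_iter):
        labels = [nearest(r, centroids) for r in rows]
        updated = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            updated.append(average(members) if members else centroids[c])
        if updated == centroids:  # group averages stopped moving
            break
        centroids = updated
    labels = [nearest(r, centroids) for r in rows]
    return labels, centroids

ids = list(loans)
labels, centroids = k_means([loans[i] for i in ids], k=2)

# Assumed review rule: the group with the highest average late-payment score.
risky = max(range(len(centroids)), key=lambda c: centroids[c][1])
flagged = [i for i, lab in zip(ids, labels) if lab == risky]

print("Group averages:", centroids)
print("Loans to review first:", flagged)
```

The flagged list is only a starting point for the analyst; comparing it with the loans found by the standard manual review is what tells you whether the model is useful.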

Loan Analyst Example

Comparison:

Build a k-Means model with your loan data

You’re the expert, what do you see?

Compare your standard method results to the data mining results

Conclusions:

Is the model useful?

Can it speed up your identification of problem loans?

Conclusions

Cases are fun/interesting

Cases are accessible to area experts

Learning data mining is often the hurdle

RapidMiner makes data mining accessible to non-experts

Now…

Who’s Ready to Hit the Town?!?