Date post: 15-Jan-2016
Active subgroup mining for descriptive induction tasks. Dragan Gamberger, Rudjer Bošković Institute, Zagreb; Zdenko Sonicki, University of Zagreb
Transcript
Page 1:

Active subgroup mining for descriptive induction tasks

Dragan Gamberger, Rudjer Bošković Institute, Zagreb

Zdenko Sonicki, University of Zagreb

Page 2:

Talk overview:

- descriptive induction
- active subgroup mining
- subgroup discovery
- data mining server
- a real medical example

Page 3:

Descriptive induction is aimed at generating (inducing) knowledge that is understandable (interpretable) by humans.

It differs from classification-oriented induction, where the main goal is high classification quality (but the induced classification schemes are typically too complex for human interpretation).

Page 4:

Main properties of descriptive induction:

- simple rules

- reasonable prediction quality (both on available and future cases)

Main problem: overfitting

Example: a functional genomics domain has 150 examples with 16000 measured attribute values
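The overfitting risk in a domain of this shape can be illustrated with a small simulation. The sketch below is not from the talk: it assumes binary attributes and binary classes that are pure random noise, and the function name and the holdout size are made up for illustration. It picks the noise attribute that agrees best with the class on 150 examples, then checks how such an attribute behaves on fresh cases.

```python
import random

def noise_attribute_accuracy(n_examples=150, n_attributes=16000,
                             n_holdout=1000, seed=0):
    """Return (training, holdout) accuracy of the best pure-noise attribute."""
    rng = random.Random(seed)
    # Each bit of an integer stands for one example; labels are random.
    labels = rng.getrandbits(n_examples)
    best_train = 0.0
    for _ in range(n_attributes):
        column = rng.getrandbits(n_examples)  # one random binary attribute
        disagreements = bin(column ^ labels).count("1")
        best_train = max(best_train, 1 - disagreements / n_examples)
    # The selected attribute is still noise, so on fresh cases it agrees with
    # the class only by chance; simulate that with new random bits.
    fresh = rng.getrandbits(n_holdout) ^ rng.getrandbits(n_holdout)
    holdout = 1 - bin(fresh).count("1") / n_holdout
    return best_train, holdout

train_acc, holdout_acc = noise_attribute_accuracy()
print(f"best training accuracy: {train_acc:.2f}, "
      f"holdout accuracy: {holdout_acc:.2f}")
```

With so many attributes and so few examples, some attribute looks predictive on the available cases by chance alone, which is why prediction quality on future cases has to be checked separately.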

Page 5:

- descriptive induction
- active subgroup mining
- subgroup discovery
- data mining server
- a real medical example

Page 6:

Active subgroup mining is a data analysis approach developed specifically for medical applications (but also applicable to other domains).

It is based on the observation that expert knowledge (in medical domains, the knowledge and experience of medical doctors) is very important for the quality of the obtained results.

Page 7:

In active subgroup mining, the expert is placed at the center of the process, and machine learning (subgroup discovery) is only a tool that helps the expert in the data analysis process.

Page 8:

[Figure: process diagram with the expert at the center. Labels: definition of task(s), induction of models, presentation, visualization, integration, statistical evaluation, selection of models, subgroup discovery]

Page 9:

- descriptive induction
- active subgroup mining
- subgroup discovery
- data mining server
- a real medical example

Page 10:

[Figure: scatter of "+" marks illustrating classical versus subgroup discovery induction]

Page 11:

[Figure: two panels of "+" marks, labeled "very specific subgroup" and "very sensitive subgroup"]

generality – the main parameter of the subgroup induction process
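How a generality parameter can move the search between a very specific and a very sensitive subgroup can be sketched as follows. The slides do not give a formula, so this assumes a rule quality of the form TP / (FP + g), one measure used in the subgroup discovery literature, where TP and FP are the numbers of covered target and non-target examples and g is the generality parameter. The coverage counts are hypothetical.

```python
def rule_quality(tp, fp, g):
    """Assumed quality TP / (FP + g): larger g tolerates more false positives."""
    return tp / (fp + g)

# Hypothetical coverage counts for two candidate subgroups.
candidates = {
    "very specific": {"tp": 20, "fp": 0},    # few cases covered, no errors
    "very sensitive": {"tp": 60, "fp": 15},  # many cases covered, some errors
}

def preferred(g):
    """Name of the candidate with the highest quality for a given g."""
    return max(candidates,
               key=lambda name: rule_quality(g=g, **candidates[name]))

for g in (1, 100):
    print(f"g = {g}: preferred subgroup is '{preferred(g)}'")
```

Under this assumed measure, a small g penalizes every false positive heavily and favors the specific subgroup, while a large g makes covering many target cases more important.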

Page 12:

Subgroup discovery is a beam search algorithm which generates short rules in the form of conjunctions of conditions.

Conditions are based on the values of available attributes.

example:

CHD <- age > 53 AND T.CH > 6.1 AND BMI < 30
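A beam search over conjunctions of conditions can be sketched as below. This is a minimal illustration and not the authors' implementation: the quality measure TP / (FP + g), the beam width, the function names, and the toy patient records are all assumptions. Only the three conditions come from the example rule above.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}

def covers(rule, example):
    """A rule is a tuple of (attribute, op, threshold) conditions joined by AND."""
    return all(OPS[op](example[attr], thr) for attr, op, thr in rule)

def quality(rule, examples, target, g):
    """Assumed quality measure TP / (FP + g); not taken from the slides."""
    tp = sum(1 for e in examples if e[target] and covers(rule, e))
    fp = sum(1 for e in examples if not e[target] and covers(rule, e))
    return tp / (fp + g)

def beam_search(examples, target, conditions, g=1.0, beam_width=3, max_len=3):
    """Grow rules one condition at a time, keeping the best few at each step."""
    beam = [()]  # the empty rule covers every example
    best = ()
    for _ in range(max_len):
        candidates = set()
        for rule in beam:
            for cond in conditions:
                if cond not in rule:
                    # Sort conditions so the same conjunction is stored once.
                    candidates.add(tuple(sorted(rule + (cond,))))
        if not candidates:
            break
        # Sort twice so that ties are broken in a deterministic order.
        ranked = sorted(sorted(candidates),
                        key=lambda r: quality(r, examples, target, g),
                        reverse=True)
        beam = ranked[:beam_width]
        if quality(beam[0], examples, target, g) > quality(best, examples, target, g):
            best = beam[0]
    return best

# Hypothetical toy records; "tch" stands for total cholesterol (T.CH).
patients = [
    {"age": 60, "tch": 6.5, "bmi": 27, "chd": True},
    {"age": 58, "tch": 7.0, "bmi": 25, "chd": True},
    {"age": 65, "tch": 6.8, "bmi": 29, "chd": True},
    {"age": 70, "tch": 5.0, "bmi": 26, "chd": False},
    {"age": 45, "tch": 6.9, "bmi": 24, "chd": False},
    {"age": 62, "tch": 6.4, "bmi": 33, "chd": False},
    {"age": 40, "tch": 5.2, "bmi": 22, "chd": False},
    {"age": 50, "tch": 5.8, "bmi": 31, "chd": True},
]
conditions = [("age", ">", 53), ("tch", ">", 6.1), ("bmi", "<", 30)]

rule = beam_search(patients, "chd", conditions)
print("CHD <- " + " AND ".join(f"{a} {op} {t}" for a, op, t in rule))
```

In a real setting the candidate conditions would be generated from the values of the available attributes rather than listed by hand, and the expert would inspect several rules from the final beam instead of only the top one.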

Page 13:
Page 14:

- descriptive induction
- active subgroup mining
- subgroup discovery
- data mining server
- a real medical example

Page 15:

dms.irb.hr

Page 16:
Page 17:

meningoencephalitis domain

subgroup describing the bacterial type of the disease in contrast to the viral type

Page 18:

- descriptive induction
- active subgroup mining
- subgroup discovery
- data mining server
- a real medical example

Page 19:
Page 20:

Conclusions:

- descriptive induction and active subgroup mining are novel concepts that are potentially very interesting for data analysis and knowledge induction in medical applications

- active and central role of medical experts is essential

Page 21:

- we have extensive and positive experience with this methodology in different medical domains, but no experience in constructing medical guidelines. For such applications, the following might be useful:

- detection of decision points for numerical attributes

- detection of apparent but significant contradictions

- explicit noise detection

