Post on 02-Apr-2015
1
Evaluation for Web Mining Applications
Bettina Berendt, Humboldt University Berlin
Ernestina Menasalvas, Universidad Politécnica de Madrid
Myra Spiliopoulou, Otto von Guericke University Magdeburg
www.wiwi.hu-berlin.de/~berendt/Evaluation
2
Agenda
Mining for evaluation: perspectives and measures
A case study
Outlook: Evaluation of mining
Web mining as a project: towards a methodology
Evaluation and experimentation
Evaluation and Web mining
3
Evaluation of Web mining applications, or: Web mining as a project
Is it worthwhile to do the mining project?
Is the result valuable for the application?
Are (all) the tasks performed well?
Are the data appropriate for the mining project?
Are the techniques appropriate for the expected results?
4
Project definition
Refers to
Set of interdependent activities
Oriented to a specific goal
With a predetermined length
Set of tasks
Web Site goal: stakeholder
Cost and time estimation
5
Data Mining as a project
Define the goal:
Corresponds to the business understanding step of CRISP-DM
Business and data mining experts have to define the goal collaboratively
Each goal must be defined with a great degree of detail
Obtain the model
Apply data mining process model
Evaluate results and redirect
Evaluation in the extended definition: the act of ascertaining the value of an object according to specified criteria, operationalised in terms of measures.
Object = patterns or model
Measures and criteria depend on the goals
Deploy
With business goals directing each step, data mining produces results with a business impact
Check that the business impact is due to the results of the project
Experiment design
6
Web Mining as a project: the 3 components of a system, by Gartner Group
[Diagram: Operational CRM, Analytical CRM, Collaborative CRM]
Box labels: ERP/ERM; Order Manag.; Supply Chain Mgmt.; Order Prom.; Legacy Systems; Sales Automation; Service Automation; Marketing Automation; Field Service; Mobile Sales; Vertical Apps.; Category Mgmt.; Marketing Automation; Campaign Mgmt.; Customer Activity; Customers; Products; Data Warehouse; Voice (IVR, ACD); Conferencing; Web Conferencing; Response Management; Fax; Letter; Direct Interaction
Side labels: Office (three times); Interaction; Closed-Loop Processing (EAI Toolkits, Embedded/Mobile Agents)
7
Web Mining as a project: the 3 components translated
[Diagram: Operational, Analytical, Decisional System]
Box labels: ERP/ERM; Order Manag.; Supply Chain Mgmt.; Order Prom.; Legacy Systems; Sales Automation; Service Automation; Marketing Automation; Field Service; Mobile Sales; Data Mining; Customer Activity; Customers; Products; Data Warehouse; Recommender; Personalization; E-mail; Response Management
Side labels: Office (three times); Interaction; Closed-Loop Processing (EAI Toolkits, Embedded/Mobile Agents); Web Site Front??; Web Site Back??
8
The 3 components of a Web Site
Operational component: The end result of a Software Development Process
Decisional component: Results of the analytical component are integrated in the operational system: Software development project
Analytical component: The end result of a Data Mining process
Sw Development Methodologies
Sw Development Methodologies
Data Mining Methodologies ???
Business Intelligence Project: BI Methodologies
9
Methodology
Process Model
Lifecycle
+
Set of tasks to be performed:
Development tasks
Project Management tasks
Sequencing of tasks
Waterfall
Iterative
Phases of the project
10
BI-Methodologies
BI-Roadmap
CRM-Catalyst
11
CRM Catalyst major phases:
The five major phases are:
Discovery.
Establishing the business goals for CRM
Orientation.
Defining necessary system and organisational (specific technical solutions) changes to meet the goals. This leads to a definition of top-level system requirements.
Navigation.
The CRM system requirements are defined more precisely, the system is scoped, system and vendor assessment criteria are defined and a system is selected and contracted.
Implementation.
Planning and managing the CRM project. It is during this phase that the system is built and put into use.
Post implementation.
Monitoring performance and continuous improvement. A CRM project never ends, because CRM must constantly evolve to keep pace with the changing business and its environment.
12
Software Methodologies
Process Model
ISO 12207
Lifecycle
Iterative+ = RUP
13
Web Mining Methodology?
To Be Defined
Can be reused ?
The ones in CRISP-DM
14
Web mining methodology: Process Model: CRISP-DM
Is it worthwhile to do the mining project?
Is the result valuable for the application?
Are (all) the tasks performed well?
Are the data appropriate for the mining project?
Are the techniques appropriate for the expected results?
Has the goal been achieved as a cause-and-effect result of the project development?
15
Web Mining Project goals
Top-level goal 1: The Web exists in order to be used
Goals of usage depend on stakeholder and viewpoint.
Is the site a good site? Is it successful? But: What does success mean?
Starting point: Web life-cycle metrics, micro-conversion rates
Extension for application-oriented success measurement: Multi-Channel Metrics
Has the goal been achieved as a cause-and-effect result of the project development?
Link the results in these slides either to the web mining project or to other factors
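The micro-conversion rates named above as a starting point can be sketched as ratios between successive stages of a visit. This is only an illustration: the stage names and the counts are hypothetical and do not come from the slides.

```python
# Hypothetical life-cycle stages and visit counts (illustrative numbers only).
STAGES = ["visit", "product_view", "basket", "purchase"]
counts = {"visit": 1000, "product_view": 400, "basket": 100, "purchase": 25}


def micro_conversion_rates(stages, counts):
    """Return the share of sessions that move from each stage to the next."""
    rates = {}
    for prev, nxt in zip(stages, stages[1:]):
        # Guard against an empty stage to avoid division by zero.
        rates[f"{prev}->{nxt}"] = counts[nxt] / counts[prev] if counts[prev] else 0.0
    return rates


rates = micro_conversion_rates(STAGES, counts)
# Overall conversion: share of visits that end in a purchase.
overall = counts["purchase"] / counts["visit"]
```

Which stages count as "success" depends on the stakeholder, so the stage list is the part to adapt first.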
16
Agenda
Mining for evaluation: perspectives and measures
A case study
Outlook: Evaluation of mining
Web mining as a project: towards a methodology
Evaluation and experimentation
Evaluation and Web mining
17
Experimentation in Web Mining Applications
Ernestina Menasalvas
Javier Segovia
Pilar Herrero
Universidad Politécnica de Madrid
18
Experimentation
Refers to
Matching with facts
Suppositions, assumptions
speculation and beliefs
That abound in web mining solutions deployment
Users and stakeholders are satisfied
Personalization helps the user to remain loyal
Recommendations increase sales
Evaluation: the act of ascertaining
the value and
the functioning
of an object according to specified criteria, operationalised by measures.
to assess concrete achievements
to give feedback towards improvement
19
Experimentation in web mining: Is the success due to the web mining results or to external factors?
Is this a good Website?
Web Mining -> good website
NOT web Mining -> good website
20
What is Experimental Design?
•Humans can generate valid knowledge by means of trial and error
•The trial-and-error process is longer and chancier than the scientific method
•Experimental design is used in other fields of science
Experimental design applied to the empirical validation of Web Mining
Adaptation of experimental design terminology to WM (Juristo & Moreno 02)
Empirical validation can be carried out:
Laboratory validation of theories
Validation at the level of real projects
Historical data validation
Zelkowitz (98):
Controlled
Observational
Historical
Kitchenham (96):
Formal Experiments
Case Studies
Surveys
21
Experimental Design (www.socialresearchmethods.net/Kb/desexper.html)
Most rigorous of all research designs
The strongest with respect to internal validity
Internal validity: Assess the proposition:
If X, then Y
And
If not X, Then not Y
If the program is given, then the outcome occurs
And
If the program is not given then the outcome does not occur
Isolate the program from all of the other potential causes of the outcome
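The "if X, then Y; if not X, then not Y" comparison above can be sketched as a test on two groups, one that received the program and one that did not. The two-proportion z-test is one possible analysis chosen here for illustration, and the counts are hypothetical.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic and two-sided p-value for the difference of two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled proportion under the hypothesis that the program has no effect.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: outcomes among users with (a) and without (b) the program.
z, p = two_proportion_z(success_a=60, n_a=500, success_b=40, n_b=500)
```

The test only supports a causal reading if users were assigned to the two groups at random, which is what isolates the program from the other potential causes.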
22
Experimental Design (www.socialresearchmethods.net/Kb/desexper.html)
Experimental design is intrusive
Difficult to carry out in most real world contexts
To some extent, you set up an artificial situation:
Assess the causal relationship with high internal validity.
Limiting the degree to which results can be generalized
Reduce external validity in order to achieve greater internal validity
23
Phases of experimental design process
1. Defining the objectives of the experiment
The mathematical techniques used require the experiment to be stated as quantifiable hypotheses
Hypothesis expressed in terms of:
– a metric of the web mining results obtained using the web mining techniques
– or of the web mining process where the techniques have been applied
2. Designing the experiment:
Experimental unit
Parameters
Response variable
Factors, levels and interactions
Replication: based on analogy ??
Design
3. Executing the experiment:
Measure response variables at the end of each experiment
4. Analyzing results: Experimental Analysis
Quantify the impact of each factor and each interaction between factors on the variation of the response variable: statistical significance
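As a minimal sketch of the analysis phase, the main effects and the interaction of two two-level factors can be computed from the four runs of a 2x2 design. The factors (A, B) and the response values are hypothetical. The sketch only quantifies the effects; judging statistical significance would additionally need replicated runs.

```python
# Hypothetical mean response (e.g. conversion rate in %) for each run.
# Keys are (A, B) with -1 = low level (off) and +1 = high level (on).
response = {(-1, -1): 2.0, (+1, -1): 3.0, (-1, +1): 2.5, (+1, +1): 4.5}


def effects_2x2(y):
    """Main effects of A and B and their interaction for a 2^2 design."""
    # Each effect is the mean response at the high level minus the mean at the low level.
    effect_a = sum(a * v for (a, b), v in y.items()) / 2
    effect_b = sum(b * v for (a, b), v in y.items()) / 2
    # The interaction uses the product of the two factor levels as its contrast.
    interaction = sum(a * b * v for (a, b), v in y.items()) / 2
    return effect_a, effect_b, interaction


effect_a, effect_b, interaction_ab = effects_2x2(response)
```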
24
Experimental design classification
What we see can be divided into:
Signal to noise metaphor (www.socialresearchmethods.net/kb):
Signal: related to the variable of interest, the construct to measure
Noise: random factors in the situation
Signal enhancers: Factorial designs
Noise reducers: Blocking designs
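The noise-reducing role of blocking can be sketched with a small example: when each block is measured both without and with the treatment, comparing within blocks removes the block-to-block variation. The block names and values are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical response per block: (without treatment, with treatment).
blocks = {"new": (10.0, 12.5), "returning": (20.0, 21.5), "loyal": (30.0, 32.0)}

control = [c for c, _ in blocks.values()]
treated = [t for _, t in blocks.values()]

# Ignoring blocks: the effect has to be seen against between-block variation.
effect = mean(treated) - mean(control)
noise_unblocked = stdev(control)

# With blocking: differences are taken inside each block, so the
# block-to-block variation cancels out of the comparison.
diffs = [t - c for c, t in blocks.values()]
noise_blocked = stdev(diffs)
```

With these numbers the estimated effect is the same either way, but the spread it has to be judged against is much smaller once the blocks are taken into account.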
25
Experimental design techniques
Categorical factors, quantitative experimental response:
1 factor (2 or n levels)
All other parameters fixed: One factor experiment
Some parameters cannot be fixed: Blocking experiment
K factors (2 or n levels)
Some parameters cannot be fixed: Blocking factorial design
All factors are relevant: Factorial design (n^k experiments)
Some parameters are irrelevant: Fractional factorial design (less than n^k experiments)
Quantitative factors and response variable: Regression models
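The difference in the number of experiments between a full and a fractional factorial design (n levels, k factors, n^k runs for the full design) can be sketched by enumerating the runs. The half fraction below, where the last factor is set to the product of the others, is one common construction chosen here as an assumption. The slides do not say which fraction is meant.

```python
from itertools import product


def full_factorial(levels, k):
    """All level combinations for k factors: len(levels) ** k runs."""
    return list(product(levels, repeat=k))


def half_fraction(k):
    """2^(k-1) runs of a two-level design; the last factor is the product of the others."""
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


full = full_factorial((-1, 1), 3)  # 2^3 = 8 runs
half = half_fraction(3)            # 2^2 = 4 runs
```

The price of the smaller design is that the effect of the last factor can no longer be separated from the interaction of the other factors.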
26
Questions thus far?