Date post: | 02-Jun-2018 |
Category: |
Documents |
Upload: | pabloapacheco |
8/10/2019 SPAD7 Data Miner Guide.pdf
22 quai gallieni - 92150 Suresnes - France
Tel: +33 1 57 32 60 60 - Fax: +33 1 5732 62 [email protected]
www.coheris.com
Siret: 39946792700105 - APE: 5829C
Register number training: 11-92-1522492
DATA MINER GUIDE
Descriptive Statistics - Factorial Analyses - Clustering
Linear Models - Discriminant Analyses
Scoring - Decision Trees
Table of contents
DESCRIPTIVE STATISTICS WITH SPAD
  STATS - MARGINAL DISTRIBUTIONS, HISTOGRAMS
  DEMOD - AUTOMATIC CHARACTERIZATION OF A QUALITATIVE VARIABLE
  DESCO - AUTOMATIC CHARACTERIZATION OF A CONTINUOUS VARIABLE
  TABLE - CROSS TABLES
  BIVAR - BIVARIATE ANALYSIS
FACTORIAL ANALYSES WITH SPAD
  PCA - PRINCIPAL COMPONENT ANALYSIS
  SCA - SIMPLE CORRESPONDENCE ANALYSIS
  MCA - MULTIPLE CORRESPONDENCE ANALYSIS
CLUSTERING WITH SPAD
  RECIP/SEMIS - CLUSTERING ON FACTORS SCORES
  PARTI-DECLA - CUT OF THE TREE AND CLUSTERS DESCRIPTION
  CLASS-MINER - CLUSTERS DESCRIPTION
  ESCAL - STORING THE FACTORIAL AXES AND THE PARTITIONS
THE LINEAR MODEL AND ITS APPLICATIONS
  REGRESSION AND ANALYSIS OF VARIANCE, GENERAL LINEAR MODEL
  OPTIMAL REGRESSIONS RESEARCH
  LOGISTIC REGRESSION
THE DISCRIMINANT AND ITS METHODS
  FUWILD - OPTIMAL DISCRIMINANT ANALYSIS
  DIS2GD - LINEAR DISCRIMINANT ANALYSIS BASED ON CONTINUOUS VARIABLES
  DIS2GFP - LINEAR DISCRIMINANT ANALYSIS BASED ON PRINCIPAL FACTORS
  DISCO - DISCRIMINANT ANALYSIS BASED ON QUALITATIVE VARIABLES
  SCORE - SCORING FUNCTION
  IDT1 - INTERACTIVE DECISION TREE 1
  IDT2 - INTERACTIVE DECISION TREE 2
DESCRIPTIVE STATISTICS WITH SPAD
STATS: marginal distributions, histograms, matrix plot, box plot
DEMOD: automatic characterization of a qualitative variable
DESCO: automatic characterization of a continuous variable
TABLE: Crossed tables
BIVAR: Bivariate analysis
STATS - MARGINAL DISTRIBUTIONS, HISTOGRAMS
This procedure supplies a rapid and automatic description of your nominal and continuous variables.
The Survey.sba base, an opinion survey file, is used for this example. The file is
supplied with the application and installed automatically on your PC.
SET THE PARAMETERS FOR A METHOD
Before it can be executed, a method must have its parameters set.
To access the parameter settings of a method, right-click on the method and choose the Set the method command, or double-click on the method icon.
The rules for calculation and parameter settings of each of the methods are available on line.
The Cases, Weighting and Parameters tabs are available for almost all SPAD methods.
Cases: the Cases tab lets you select the cases used for the method
Weighting: the Weighting tab allows you to adjust the distribution of the cases in the sample
Parameters: options and settings of the method
The Cases tab
The Cases tab lets you select the cases with one of the following methods:
All the available cases
One or more logical filters (selection criteria combined with AND/OR)
A name list of cases
A selection made in one or more intervals
Random draw
Apply a logical filter
Click on Logical filter, select the chosen variable, click on the operator, click on the operand, then click on Validate. The global definition of the filter is displayed in the dialog.
In case of error, you can delete an expression from the filter by selecting the expression to discard
and clicking on Delete.
The cases satisfying the filter are considered as active, while the others are supplementary.
Select the individuals from a list
Select List as the selection method. Choose your cases in the Available list and
use the transfer buttons to select them. Then select the status of
the cases.
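The logical filter described above can be pictured with a small sketch. It is not SPAD code: the records, field names and filter expression are hypothetical, and it only illustrates how criteria combined with AND/OR split the cases into active and supplementary ones.

```python
# Hypothetical survey records; the field names are illustrative, not SPAD identifiers.
cases = [
    {"id": 1, "sex": "male", "age": 34},
    {"id": 2, "sex": "female", "age": 61},
    {"id": 3, "sex": "female", "age": 27},
    {"id": 4, "sex": "male", "age": 70},
]


def split_by_filter(records, predicate):
    """Return (active, supplementary) lists according to a logical filter."""
    active = [r for r in records if predicate(r)]
    supplementary = [r for r in records if not predicate(r)]
    return active, supplementary


def example_filter(record):
    """(sex = female AND age >= 30) OR age >= 65."""
    return (record["sex"] == "female" and record["age"] >= 30) or record["age"] >= 65


active, supplementary = split_by_filter(cases, example_filter)
active_ids = [r["id"] for r in active]
supplementary_ids = [r["id"] for r in supplementary]
```

Cases that fail the filter are kept as supplementary rather than dropped, which mirrors the active/supplementary distinction made in the text.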
Select cases by interval
Select by interval as the method of choice, then select the status of the cases. Define the interval as a function of its rank in the SPAD base, and click on the arrow button to move your choice to the cases status window.
You can save the definition of the selection by clicking on the Save button. This allows you
to re-use it later.
Do a Random Draw
This selection lets you apply the method to a sample before applying it to the entire SPAD base.
It also lets you test the stability of the results: execute the same method several times, changing the number of preliminary requests before each execution.
Click on the Yes radio button to run a random draw, then click on Define to set the parameters for the random draw.
Indicate the number of preliminary requests for the random draw. On another execution of the selection, you do not need to change the value of this number unless you want to generate different draws.
Enter the percentage of the random draw, or the sample size after the draw.
Click on OK.
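The behaviour described for the random draw resembles that of a seeded random generator: the same number gives the same draw, a different number gives a different one. The sketch below assumes this reading. The function `random_draw` and its parameters are hypothetical names, and Python's `random` module stands in for whatever generator SPAD uses internally.

```python
import random


def random_draw(case_ids, seed, percentage=None, sample_size=None):
    """Draw a sample without replacement; return (active, supplementary) ids.

    Exactly one of `percentage` (0-100) or `sample_size` must be given.
    `seed` plays the role assumed here for the "number of preliminary requests".
    """
    if (percentage is None) == (sample_size is None):
        raise ValueError("give exactly one of percentage or sample_size")
    if sample_size is None:
        sample_size = round(len(case_ids) * percentage / 100)
    rng = random.Random(seed)
    drawn = set(rng.sample(case_ids, sample_size))
    active = [c for c in case_ids if c in drawn]
    supplementary = [c for c in case_ids if c not in drawn]
    return active, supplementary


case_ids = list(range(1, 316))  # 315 cases, as in the survey example
first, _ = random_draw(case_ids, seed=12345, percentage=20)
again, _ = random_draw(case_ids, seed=12345, percentage=20)
other, _ = random_draw(case_ids, seed=54321, percentage=20)
```

Running a method on `first` and on `other` and comparing the results is one way to test the stability mentioned above.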
The Weighting tab
The Weighting tab allows you to adjust the distribution of the cases in the sample:
According to a weighting variable already in the file.
As a function of one or more theoretical percentages (calculation by adjustment).
Select the weighting type. In the case of calculation by adjustment, choose the correcting variable in the available variables window and click on the Define button.
For each category, enter the theoretical percentage and hit Enter, then click on OK.
You can repeat this operation for another variable. In this way you get an adjustment as a function
of several variables with a single weighting variable. This requires a calculation by successive
approximations.
Click on the options in the first window to access the options window for the weighting system. You can use the default options, or change the options for fitting.
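Adjustment on several variables by successive approximations is commonly done by raking (iterative proportional fitting). The guide does not say which algorithm SPAD uses, so the sketch below is only an illustration of that general technique. The function names, the data and the target percentages are all hypothetical.

```python
def rake_weights(cases, targets, max_iter=100, tol=1e-8):
    """Iteratively rescale case weights so weighted margins match the targets.

    cases:   list of dicts, one per case
    targets: {variable: {category: theoretical percentage}}, each summing to 100
    """
    weights = [1.0] * len(cases)
    for _ in range(max_iter):
        max_change = 0.0
        for var, target in targets.items():
            total = sum(weights)
            current = {}
            for w, case in zip(weights, cases):
                current[case[var]] = current.get(case[var], 0.0) + w
            # Factor that brings each category to its theoretical share.
            factors = {cat: target[cat] / 100.0 * total / current[cat] for cat in current}
            for i, case in enumerate(cases):
                factor = factors[case[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all margins already fit: stop iterating
            break
    return weights


def weighted_share(cases, weights, var, category):
    """Weighted proportion of cases falling in `category` of `var`."""
    matching = sum(w for w, c in zip(weights, cases) if c[var] == category)
    return matching / sum(weights)


# Hypothetical sample: 4 males / 6 females, 6 young / 4 old.
sample = (
    [{"sex": "male", "age": "young"}] * 2
    + [{"sex": "male", "age": "old"}] * 2
    + [{"sex": "female", "age": "young"}] * 4
    + [{"sex": "female", "age": "old"}] * 2
)
targets = {"sex": {"male": 50, "female": 50}, "age": {"young": 50, "old": 50}}
weights = rake_weights(sample, targets)
```

Each pass corrects one variable at a time, which slightly disturbs the others, hence the need to repeat until the factors stay close to 1.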
Attention: The weighting calculation in the Weighting tab of a method is temporary (the
weighting variable is not saved). This approach lets you make quick tests and measure the
influence of the weighting on the results of the method. When a satisfactory weighting variable has
been obtained, it is preferable to create a permanent weighting variable with the Tools - Weighting
command of the main menu (Data Management Manual, paragraph 4.3).
You can then select this variable as the weight variable in the Weighting tab of a method.
The Marginal distributions tab
Select the categorical variables in the list.
The Parameters button lets you choose whether to display the categories without any respondent, and whether to display the missing data as a new category.
The Statistics button displays summary statistics on the selected variables. For example, select Region where the respondent lives (V1), then click on the Statistics button. A window opens with statistics on the variable.
For categorical variables, this statistics window shows the count and percentage associated with each category. For continuous variables, it shows the count, the mean, the standard deviation, and the minimum and maximum.
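The quantities shown in the statistics window can be reproduced with a few lines of code. This is only a sketch: the function names are hypothetical, the region counts are taken from the STATS listing in this guide, the ages are made-up values, and the population standard deviation is an assumption since the guide does not state which divisor SPAD uses.

```python
from collections import Counter
from statistics import mean, pstdev


def categorical_summary(values):
    """Count and percentage of each category."""
    counts = Counter(values)
    n = len(values)
    return {cat: (count, round(100.0 * count / n, 2)) for cat, count in counts.items()}


def continuous_summary(values):
    """Count, mean, standard deviation, minimum and maximum."""
    return {
        "count": len(values),
        "mean": mean(values),
        "std_dev": pstdev(values),  # assumption: population formula (divisor n)
        "min": min(values),
        "max": max(values),
    }


# Region counts as printed in the STATS listing (315 respondents in total).
region_counts = {
    "Rg1 - Paris region": 56, "Rg2 - Paris Basin": 51, "Rg3 - north": 24,
    "Rg4 - east": 29, "Rg5 - west": 45, "Rg6 - south-west": 38,
    "Rg7 - center east": 36, "Rg8 - mediterranean": 36,
}
regions = [label for label, count in region_counts.items() for _ in range(count)]
region_summary = categorical_summary(regions)

ages = [18, 25, 40, 86]  # made-up values
age_summary = continuous_summary(ages)
```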
The Histograms - Categorization tab
This tab allows you to select continuous variables both for histograms/summary statistics and for categorization (marginal distributions of the variables' values).
The Parameters button allows you to set global or specific parameters for the histogram characteristics, such as the number of classes, the min and max bounds and the histogram bar width.
You can also select continuous variables for categorization. As a result, each distinct value is displayed with its frequency. It is a preliminary step before splitting the continuous variable into classes.
You cannot request both a histogram and a categorization for the same variable.
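The two treatments of a continuous variable can be sketched as follows. The function names are hypothetical and the code is not SPAD's implementation. `histogram` groups values into equal-width classes between a minimum bound (included) and a maximum bound (excluded), and `categorize` lists each distinct value with its frequency and cumulative percentage.

```python
from collections import Counter


def histogram(values, lower, upper, n_classes):
    """Count values in n_classes equal-width classes on [lower, upper)."""
    width = (upper - lower) / n_classes
    counts = [0] * n_classes
    below = above = 0
    for v in values:
        if v < lower:
            below += 1
        elif v >= upper:
            above += 1
        else:
            index = min(int((v - lower) // width), n_classes - 1)
            counts[index] += 1
    bins = [(lower + i * width, counts[i]) for i in range(n_classes)]
    return bins, below, above


def categorize(values):
    """Rows of (value, count, percentage, cumulative percentage)."""
    n = len(values)
    rows, cumulative = [], 0
    for value, count in sorted(Counter(values).items()):
        cumulative += count
        rows.append((value, count, round(100 * count / n, 2), round(100 * cumulative / n, 2)))
    return rows


# Made-up ages, binned with lower limits 16, 24, ..., 80.
bins, below, above = histogram([18, 23, 24, 31, 40, 87, 90], lower=16, upper=88, n_classes=9)

# "Number of persons in a housing" counts as printed in the STATS listing.
housing_counts = {1: 38, 2: 90, 3: 69, 4: 71, 5: 34, 6: 7, 7: 4, 8: 2}
housing = [value for value, count in housing_counts.items() for _ in range(count)]
rows = categorize(housing)
```

Inspecting the output of `categorize` first makes it easier to choose sensible class bounds, which is the preliminary step the text refers to.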
The Marginal distributions by categories tab
This tab is useful for variables that are based on the same categories. The categories of these variables must have the same labels and must be ranked in the same order (you can check this with the Marginal distributions tab).
The Parameters tab
This tab lets you choose whether to export the results to Excel.
Once you have specified your request, validate the method by clicking on the OK button.
RESULTS
Results are accessible in the Execution view or by right-clicking on the method and choosing the Results command. Then, depending on the method, different choices are available between the results editor, the Graphics gallery and Excel results.
The results editor
The Results Editor opens in a new window.
The information list has a tree structure.
Clicking on the expand icon opens a branch of the tree, and clicking on the collapse icon closes it. You can use the mouse to navigate through the tree.
Double-clicking on a title displays the relevant results in the new window.
The Layout option of the File menu allows you to customize how results are displayed on the screen. The results can be printed or copied into your word processor, but they cannot be changed in this editor.
THE RESULTS OF THE STATS METHOD
SUMMARY STATISTICS OF THE VARIABLES
MARGINAL DISTRIBUTIONS OF CATEGORICAL VARIABLES
                               ------ COUNTS ------
                               ACTUAL  %/TOTAL  %/EXPR.  HISTOGRAM OF WEIGHTS
--------------------------------------------------------------------------------
1 . Region where the respondent lives
Rg1 - Paris region                 56    17.78    17.78  *********
Rg2 - Paris Basin                  51    16.19    16.19  ********
Rg3 - north                        24     7.62     7.62  ****
Rg4 - east                         29     9.21     9.21  *****
Rg5 - west                         45    14.29    14.29  *******
Rg6 - south-west                   38    12.06    12.06  ******
Rg7 - center east                  36    11.43    11.43  ******
Rg8 - mediterranean                36    11.43    11.43  ******
OVERALL                           315   100.00   100.00
--------------------------------------------------------------------------------
2 . Urban area size (number of inhabitants)
Agg1 - less than 2000              84    26.67    26.67  *************
Agg2 - 2001 to 5000                18     5.71     5.71  ***
Agg3 - 5001 to 10000               18     5.71     5.71  ***
Agg4 - 10001 to 20000              12     3.81     3.81  **
Agg5 - 20001 to 50000              23     7.30     7.30  ****
Agg6 - 50001 to 100000             18     5.71     5.71  ***
Agg7 - 100001 to 200000            28     8.89     8.89  *****
Agg8 - more than 200000            68    21.59    21.59  **********
Agg9 - paris, paris.agglo          46    14.60    14.60  *******
OVERALL                           315   100.00   100.00
--------------------------------------------------------------------------------
3 . Sex of respondent
Sex1 - male                       138    43.81    43.81  *********************
Sex2 - female                     177    56.19    56.19  **************************
OVERALL                           315   100.00   100.00
--------------------------------------------------------------------------------
MARGINAL DISTRIBUTIONS OF CATEGORIZED VARIABLES
            ---------- COUNTS ----------
            ACTUAL  %/TOTAL  %/EXPR.   % CUM.  HISTOGRAM OF WEIGHTS
--------------------------------------------------------------------------------
26 . Number of persons in a housing
1.000           38    12.06    12.06    12.06  ******
2.000           90    28.57    28.57    40.63  *************
3.000           69    21.90    21.90    62.54  **********
4.000           71    22.54    22.54    85.08  **********
5.000           34    10.79    10.79    95.87  *****
6.000            7     2.22     2.22    98.10  *
7.000            4     1.27     1.27    99.37  *
8.000            2     0.63     0.63   100.00  *
OVERALL        315   100.00   100.00
--------------------------------------------------------------------------------
28 . Number of children
0.000           70    22.22    22.22    22.22  **********
1.000           67    21.27    21.27    43.49  **********
2.000           94    29.84    29.84    73.33  *************
3.000           54    17.14    17.14    90.48  ********
4.000            9     2.86     2.86    93.33  **
5.000           11     3.49     3.49    96.83  **
6.000            2     0.63     0.63    97.46  *
7.000            2     0.63     0.63    98.10  *
8.000            2     0.63     0.63    98.73  *
9.000            4     1.27     1.27   100.00  *
OVERALL        315   100.00   100.00
--------------------------------------------------------------------------------
SUMMARY STATISTICS OF CONTINUOUS VARIABLES
TOTAL COUNT : 315
TOTAL WEIGHT : 315.00
+---------------------------------------------+----------------------+-----------------------+-----------------------+
| NUM . LABEL                   COUNT  WEIGHT |     MEAN   STD.DEV.  |  MINIMUM     MAXIMUM  |    MIN.2       MAX.2  |
+---------------------------------------------+----------------------+-----------------------+-----------------------+
|  4 . Age of respondent          315  315.00 |   43.756     16.581  |   18.000      86.000  |   19.000      83.000  |
| 41 . Family, children : i       315  315.00 |    6.651      1.062  |    1.000       7.000  |    2.000       6.000  |
| 42 . Work, profession : i       315  315.00 |    5.956      1.544  |    1.000       7.000  |    2.000       6.000  |
| 43 . Free time, relax: im       315  315.00 |    5.295      1.454  |    0.000       7.000  |    1.000       6.000  |
| 44 . Friends, acquaintanc       315  315.00 |    5.190      1.424  |    1.000       7.000  |    2.000       6.000  |
| 45 . Relatives, brothers,       315  315.00 |    5.629      1.436  |    1.000       7.000  |    2.000       6.000  |
| 46 . Religion : importanc       315  315.00 |    3.241      2.022  |    0.000       7.000  |    1.000       6.000  |
| 47 . Politic, political l       315  315.00 |    3.111      1.770  |    0.000       7.000  |    1.000       6.000  |
| 50 . State benefits : ave       283  283.00 |  533.795    926.899  |    0.000    5100.000  |   15.000    4980.000  |
| 51 . Salary of the respon       267  267.00 | 4408.547   4575.339  |    0.000   40000.000  |  300.000   24000.000  |
+---------------------------------------------+----------------------+-----------------------+-----------------------+
HISTOGRAMS OF CONTINUOUS VARIABLES
VARIABLE 4 : Age of respondent
LOW.LIMIT |   MEAN   | WEIGHT | HISTOGRAM (BETWEEN 16.00 INCLUDED AND 88.00 EXCLUDED, BAR INTERVAL WIDTH = 2.00)
----------+----------+--------+-------------------------------------
    16.00 |    20.93 |     28 | XXXXXXXXXXXXXX
    24.00 |    27.85 |     68 | XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    32.00 |    35.31 |     58 | XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    40.00 |    43.35 |     37 | XXXXXXXXXXXXXXXXXX
    48.00 |    52.08 |     39 | XXXXXXXXXXXXXXXXXXX
    56.00 |    59.06 |     33 | XXXXXXXXXXXXXXXX
    64.00 |    67.09 |     33 | XXXXXXXXXXXXXXXX
    72.00 |    74.71 |     14 | XXXXXXX
    80.00 |    82.20 |      5 | XX
+-----------+-------------------------+-------------------------+
|           | OVERALL                 | HISTOGRAM               |
|           | (FROM 18.00 TO 86.00)   | (FROM 16.00 TO 88.00)   |
+-----------+-------------------------+-------------------------+
| WEIGHT    |                  315.00 |                  315.00 |
| MEAN      |                  43.756 |                  43.756 |
| STD.DEV.  |                  16.581 |                  16.440 |
+-----------+-------------------------+-------------------------+
WEIGHTS OF REMAINING CASES : STRICTLY LESS THAN 16.00 : 0.00
                             GREATER THAN OR EQUAL TO 88.00 : 0.00
DEMOD - AUTOMATIC CHARACTERIZATION OF A QUALITATIVE VARIABLE
This extremely powerful procedure provides the automatic characterization of any categorical variable. This is the IDEAL procedure to find out everything about a variable in one question. The well-structured outputs form comprehensive study reports.
One can characterize either each category of a variable, or the variable itself as a whole. All the available elements (active and illustrative) may participate in the characterization: the categories of the categorical variables, the categorical variables themselves, and the continuous variables.
The following table summarizes all the capabilities of the DEMOD procedure:
Elements to characterize | Characterizing elements
Groups of cases (defined by the categories of the variable to characterize). We describe each category with all its significant characterizing elements. | categories, categorical variables, continuous variables
The categorical variable to characterize. We cross the variable with all the characterizing elements and display only the elements that depend on the variable to characterize. | categories, categorical variables, continuous variables
A group of cases is defined by a category of the variable to characterize. There are as many groups of cases as there are categories of the variable to characterize.
Double-click on the DEMOD icon to access the settings of the method.
THE VARIABLES TAB
The scrolling menu allows you to select the variables to characterize and the characterizing elements.
In this example, the variable to characterize is V8, "The family is the only place where you feel well". All the other variables, whether categorical or continuous, are selected as characterizing elements.
THE PARAMETERS TAB
This tab allows you to modify the default parameters for the DEMOD method.
Once you have set the parameters, validate the method by clicking on the OK button and run the chain.
THE DEMOD RESULTS
THE DEMOD-5 EXCEL SHEET
% of category in group: Frequency of the category in the group divided by the frequency of the group.
% of category in set: Frequency of the category in the population.
% of group in category: Frequency of the group in the category divided by the frequency of the category.
Test-value: When the test-value is greater than zero, the category is over-represented in the group. The category is under-represented if the test-value is negative. By default, SPAD displays only characterizing elements with a test-value greater than or equal to 1.96 in absolute value (i.e. a probability of at most 0.025 for a one-sided test).
Probability: The probability measures how unusual the difference is between the percentage of the category in the group and the percentage of the category in the population. The lower the probability, the more significant the difference and the larger the test-value associated with it (the test-value is the quantile of the standard normal distribution that corresponds to the same probability).
Weight: Weight of the cases in the category.
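The definitions above can be checked on the "Marital status / married" row of the sheet (group Yes: 230 cases out of 315; category weight 223; 180 cases in both, which follows from 78.26% of 230). The percentages and the probability-to-test-value conversion come straight from the definitions. The hypergeometric tail used for the probability is an assumption: it is a common choice for this kind of test, but the guide does not state SPAD's exact formula. All function names are hypothetical.

```python
from math import comb
from statistics import NormalDist


def category_percentages(n_total, n_group, n_category, n_both):
    """The three percentages reported for a category within a group."""
    return {
        "pct_category_in_group": 100 * n_both / n_group,
        "pct_category_in_set": 100 * n_category / n_total,
        "pct_group_in_category": 100 * n_both / n_category,
    }


def upper_tail_probability(n_total, n_group, n_category, n_both):
    """P(X >= n_both) if the group were drawn at random without replacement.

    Assumption: hypergeometric model, not confirmed by the guide.
    """
    upper = min(n_group, n_category)
    favourable = sum(
        comb(n_category, k) * comb(n_total - n_category, n_group - k)
        for k in range(n_both, upper + 1)
    )
    return favourable / comb(n_total, n_group)


def probability_to_test_value(probability):
    """Standard normal quantile with the same one-sided tail probability."""
    return -NormalDist().inv_cdf(probability)


married = category_percentages(n_total=315, n_group=230, n_category=223, n_both=180)
threshold = probability_to_test_value(0.025)  # default display threshold
married_tv = probability_to_test_value(upper_tail_probability(315, 230, 223, 180))
```

For an under-represented category the lower tail would be used instead, and the test-value would take a negative sign.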
Characterisation by categories of groups of: The family is the only place where you feel well
Group: Yes (Count: 230 - Percentage: 73.02)
Variable label | Characteristic categories | % of category in group | % of category in set | % of group in category | Test-value | Probability | Weight
Marital status | married | 78,26 | 70,79 | 80,72 | 4,55 | 0,000 | 223
Do you watch TV | every day | 62,61 | 55,87 | 81,82 | 3,83 | 0,000 | 176
Opinion about marriage | indissoluble | 31,30 | 25,71 | 88,89 | 3,79 | 0,000 | 81
Are you worried about the risk of a nuclear plant accident | a lot | 32,61 | 28,25 | 84,27 | 2,76 | 0,003 | 89
Do you have children | yes | 81,30 | 77,14 | 76,95 | 2,68 | 0,004 | 243
Are you worried about the risk of a road accident | a lot | 40,87 | 36,51 | 81,74 | 2,55 | 0,005 | 115
Educational level of the respondent | primary school | 20,43 | 17,14 | 87,04 | 2,50 | 0,006 | 54
Current situation of the respondent | retired people | 20,43 | 17,14 | 87,04 | 2,50 | 0,006 | 54
Are you worried about the risk of a mugging | a lot | 33,04 | 29,21 | 82,61 | 2,38 | 0,009 | 92
Do you think the society needs to change | I do not know | 11,30 | 9,21 | 89,66 | 2,01 | 0,022 | 29
Current situation of the respondent | unemployed person | 5,22 | 7,30 | 52,17 | -2,02 | 0,022 | 23
Are you worried about the risk of a mugging | not at all | 23,04 | 26,35 | 63,86 | -2,02 | 0,022 | 83
Current situation of the respondent | student | 2,17 | 3,81 | 41,67 | -2,06 | 0,020 | 12
Educational level of the respondent | technical and GCSE | 3,48 | 5,40 | 47,06 | -2,10 | 0,018 | 17
Marital status | cohabitation | 3,04 | 5,08 | 43,75 | -2,30 | 0,011 | 16
Do you have work-personal life problems | yes | 20,43 | 24,13 | 61,84 | -2,33 | 0,010 | 76
Urban area size (number of inhabitants) | more than 200000 | 17,83 | 21,59 | 60,29 | -2,46 | 0,007 | 68
Your opinion on the life conditions in the future | improving a lot | 3,91 | 6,67 | 42,86 | -2,81 | 0,002 | 21
Do you watch TV | quite often | 19,57 | 24,13 | 59,21 | -2,90 | 0,002 | 76
Marital status | single | 9,57 | 13,33 | 52,38 | -2,93 | 0,002 | 42
Do you have children | no | 17,39 | 21,90 | 57,97 | -2,96 | 0,002 | 69
Opinion about marriage | dissolved if agreem | 30,87 | 36,19 | 62,28 | -3,07 | 0,001 | 114
Are you worried about the risk of a road accident | a little | 15,65 | 20,32 | 56,25 | -3,13 | 0,001 | 64
Educational level of the respondent | more high school | 9,13 | 13,65 | 48,84 | -3,49 | 0,000 | 43
THE DEMOD-13 EXCEL SHEET
Category mean: Weighted mean of the variable in the category.
Overall mean: Weighted mean of the variable in the overall population.
Interpretation: Age of respondent is the most characterizing continuous variable of the group who answered yes to the question "The family is the only place where you feel well". This group is significantly older than the average respondent, with an average age of 46.1 years compared with 43.76 years for the overall population.
Characterisation by continuous variables of categories of: The family is the only place where you feel well
Yes (Weight = 230.00 Count = 230)
Characteristic variables | Category mean | Overall mean | Category Std. deviation | Overall Std. deviation | Test-value | Probability
Age of respondent | 46,100 | 43,756 | 16,752 | 16,581 | 4,12 | 0,000
Religion : importance given | 3,383 | 3,241 | 2,081 | 2,022 | 2,04 | 0,021
Relatives, brothers, sisters ... : importance given | 5,726 | 5,629 | 1,380 | 1,436 | 1,98 | 0,024
Salary of the respondent | 4044,990 | 4408,550 | 3690,140 | 4575,340 | -2,09 | 0,018
No (Weight = 83.00 Count = 83)
Characteristic variables | Category mean | Overall mean | Category Std. deviation | Overall Std. deviation | Test-value | Probability
Salary of the respondent | 5377,780 | 4408,550 | 6311,000 | 4575,340 | 2,10 | 0,018
Number of children | 1,542 | 1,860 | 1,772 | 1,671 | -2,02 | 0,022
Age of respondent | 36,855 | 43,756 | 13,971 | 16,581 | -4,41 | 0,000
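The test-values for continuous variables compare a group mean with the overall mean. The guide does not print the formula, so the sketch below assumes the usual one for a group treated as a random draw without replacement from the population. Under that assumption it reproduces the Age of respondent values shown in the sheet (4.12 for Yes, -4.41 for No). The function name is hypothetical.

```python
from math import sqrt


def mean_test_value(n_total, n_group, group_mean, overall_mean, overall_std):
    """Standardized gap between a group mean and the overall mean.

    Assumed formula: the variance of the mean of n_group cases drawn without
    replacement from n_total cases is
        (n_total - n_group) / (n_total - 1) * overall_std**2 / n_group.
    """
    variance_of_group_mean = (
        (n_total - n_group) / (n_total - 1) * overall_std ** 2 / n_group
    )
    return (group_mean - overall_mean) / sqrt(variance_of_group_mean)


# Age of respondent, values copied from the sheet (315 cases overall).
yes_age = mean_test_value(315, 230, 46.100, 43.756, 16.581)
no_age = mean_test_value(315, 83, 36.855, 43.756, 16.581)
```

The sign reads the same way as for categories: a positive value means the group mean is above the overall mean, a negative value means it is below.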
DESCO - AUTOMATIC CHARACTERIZATION OF A CONTINUOUS VARIABLE
This procedure provides the statistical characterization of one or more continuous variables by:
The other continuous variables, with the support of correlations.
The categories of the categorical variables, by comparison of means.
The categorical variables themselves, with the help of Fisher's statistic.
THE VARIABLES TAB
A continuous variable can be characterized by the other variables, whether categorical or continuous, called characterizing variables.
The scrolling menu allows you to select the variables to characterize and the characterizing elements.
THE PARAMETERS TAB
The parameter Minimum relative weight of characterizing elements is useful if you do not want to display characterizing categories whose frequency in the population is lower than 2% (default threshold).
By default, only the categories whose probability is lower than or equal to 0.025 are displayed. This corresponds to a test-value of 1.96.
THE DESCO RESULTS
CHARACTERISATION OF CONTINUOUS VARIABLES
DESCRIPTION OF : Salary of the respondent
DESCRIPTION BY CATEGORIES OF CONTINUOUS VARIABLE : Salary of the respondent
ON 267.0 ACTIVE CASES   MEAN = 4408.547   STD.DEV. = 4575.339
TEST VALUE | PROB. | MEAN | STD.DEV. | CATEGORIES | VARIABLE LABEL | WEIGHT
8.16 | 0.000 | 7060.53 | 4921.82 | yes, full time | At the moment, do you have a professional activity | 114.00
7.58 | 0.000 | 6496.32 | 4736.16 | employed | Current situation of the respondent | 136.00
7.28 | 0.000 | 6617.07 | 4883.30 | no | Have you been unemployed during the last twelve months | 123.00
6.69 | 0.000 | 6533.19 | 5486.12 | male | Sex of respondent | 117.00
4.60 | 0.000 | 6452.63 | 5414.05 | no | Do you have work-personal life problems | 76.00
4.25 | 0.000 | 6698.25 | 6784.83 | quite often | Do you watch TV | 57.00
3.73 | 0.000 | 6331.15 | 3880.83 | yes | Do you have work-personal life problems | 61.00
3.47 | 0.000 | 6797.37 | 6049.03 | more high school | Educational level of the respondent | 38.00
3.35 | 0.000 | 4860.06 | 4834.30 | no | Have you recently been depressed | 217.00
3.18 | 0.001 | 5291.85 | 5418.67 | no | Have you recently been nervous | 135.00
3.10 | 0.001 | 6950.00 | 5579.71 | yes | Do you have a piano | 28.00
2.89 | 0.002 | 6529.41 | 5935.61 | yes | Do you have a second house | 34.00
2.88 | 0.002 | 6330.00 | 7536.22 | yes | Do you have a video-tape | 40.00
2.65 | 0.004 | 5937.26 | 6786.27 | Paris region | Region where the respondent lives | 51.00
2.43 | 0.008 | 5179.34 | 5246.40 | a lot | Has the respondent been interested by the survey | 117.00
2.17 | 0.015 | 6906.67 | 4638.46 | a lot better | Your opinion on the evolution of the daily personal life | 15.00
2.10 | 0.018 | 5377.78 | 6311.00 | No | The family is the only place where you feel well | 72.00
-2.01 | 0.022 | 3301.51 | 2735.77 | quite agree | Persons like me often feel alone | 55.00
-2.09 | 0.018 | 4044.99 | 3690.14 | Yes | The family is the only place where you feel well | 193.00
-2.14 | 0.016 | 3769.06 | 3573.01 | a lot | Are you worried about the risk of having a serious illness | 125.00
-2.23 | 0.013 | 3196.12 | 3440.69 | a lot worse | Your opinion on the evolution of French people life level | 56.00
-2.47 | 0.007 | 3319.48 | 2735.76 | a lot | Are you worried about the risk of a nuclear plant accident | 77.00
-2.54 | 0.006 | 1971.43 | 1864.75 | unemployed person | Current situation of the respondent | 21.00
-2.57 | 0.005 | 760.00 | 1356.61 | student | Current situation of the respondent | 10.00
-2.66 | 0.004 | 2606.41 | 3255.77 | a lot worse | Your opinion on the evolution of the daily personal life | 39.00
-2.86 | 0.002 | 3726.34 | 3277.03 | every day | Do you watch TV | 155.00
-2.88 | 0.002 | 4069.97 | 3721.48 | no | Do you have a video-tape | 227.00
-2.89 | 0.002 | 4099.07 | 4253.85 | no | Do you have a second house | 233.00
-3.10 | 0.001 | 4110.81 | 4346.66 | no | Do you have a piano | 239.00
-3.18 | 0.001 | 3505.18 | 3271.07 | yes | Have you recently been nervous | 132.00
-3.35 | 0.000 | 2449.00 | 2373.53 | yes | Have you recently been depressed | 50.00
-3.49 | 0.000 | 2263.04 | 2043.80 | no qualifications | Educational level of the respondent | 46.00
-4.36 | 0.000 | 832.14 | 1563.89 | I have never worked | At the moment, do you have a professional activity | 28.00
-4.85 | 0.000 | 2691.10 | 3397.40 | no | At the moment, do you have a professional activity | 103.00
-6.54 | 0.000 | 488.54 | 1396.02 | housewife w/o prof. | Current situation of the respondent | 48.00
-6.69 | 0.000 | 2751.33 | 2742.02 | female | Sex of respondent | 150.00
-7.28 | 0.000 | 2311.41 | 3196.29 | missing category | Do you have work-personal life problems | 130.00
-7.28 | 0.000 | 2311.41 | 3196.29 | missing category | Have you been unemployed during the last twelve months | 130.00
 | | 4408.55 | 4575.34 | OVERALL | | 267.00
DESCRIPTION BY CATEGORICAL VARIABLES OF VARIABLE : Salary of the respondent
TEST-VALUE | PROBA. | NUM . VARIABLE LABEL | DEN. DEG. FREE. | FISHER
8.56 | 0.000 | 5 . Current situation of the respondent | 261 | 21.44
8.48 | 0.000 | 18 . At the moment, do you have a professional activity | 263 | 31.95
7.50 | 0.000 | 20 . Have you been unemployed during the last twelve months | 264 | 35.01
7.28 | 0.000 | 19 . Do you have work-personal life problems | 264 | 32.89
6.98 | 0.000 | 3 . Sex of respondent | 265 | 53.58
3.48 | 0.000 | 7 . Educational level of the respondent | 258 | 3.87
3.47 | 0.000 | 33 . Do you watch TV | 263 | 6.57
3.38 | 0.001 | 24 . Have you recently been depressed | 265 | 11.69
3.21 | 0.001 | 23 . Have you recently been nervous | 265 | 10.50
3.12 | 0.002 | 16 . Do you have a piano | 265 | 9.94
2.90 | 0.004 | 17 . Do you have a second house | 265 | 8.58
2.89 | 0.004 | 15 . Do you have a video-tape | 265 | 8.50
2.04 | 0.021 | 52 . Has the respondent been interested by the survey | 264 | 3.92
1.92 | 0.054 | 21 . Have you recently had headaches | 265 | 3.74
1.77 | 0.039 | 30 . Your opinion on the evolution of the daily personal life | 261 | 2.38
1.56 | 0.059 | 25 . Are you satisfied of your health | 263 | 2.51
1.33 | 0.092 | 40 . Are you worried about the risk of a nuclear plant accident | 263 | 2.16
1.31 | 0.189 | 29 . Do you regularly impose restrictions | 265 | 1.73
1.24 | 0.107 | 8 . The family is the only place where you feel well | 264 | 2.24
1.12 | 0.132 | 1 . Region where the respondent lives | 259 | 1.61
1.07 | 0.143 | 39 . Are you worried about the risk of unemployment | 263 | 1.82
1.03 | 0.151 | 35 . The computer science diffusion is... | 263 | 1.78
1.02 | 0.154 | 34 . Do you think the society needs to change | 264 | 1.86
0.92 | 0.179 | 49 . Persons like me often feel alone | 263 | 1.64
0.89 | 0.186 | 31 . Your opinion on the evolution of French people life level | 260 | 1.48
0.86 | 0.194 | 36 . Are you worried about the risk of having a serious illness | 263 | 1.58
0.79 | 0.428 | 22 . Have you recently had backaches | 265 | 0.63
0.78 | 0.217 | 11 . Are you satisfied of your housing | 263 | 1.49
0.65 | 0.257 | 37 . Are you worried about the risk of a mugging | 263 | 1.35
0.45 | 0.327 | 13 . Occupation status of housing | 262 | 1.16
0.22 | 0.412 | 27 . Do you have children | 264 | 0.88
0.13 | 0.446 | 38 . Are you worried about the risk of a road accident | 263 | 0.89
0.10 | 0.459 | 6 . Marital status | 262 | 0.91
0.08 | 0.469 | 9 . Opinion about marriage | 263 | 0.85
-0.15 | 0.561 | 32 . Your opinion on the life conditions in the future | 261 | 0.79
-0.21 | 0.585 | 12 . Are you satisfied of your daily life | 263 | 0.65
-0.23 | 0.591 | 14 . The housing expenses are for you | 260 | 0.77
-0.53 | 0.702 | 10 . Housekeeping works, take care of children... | 263 | 0.47
-0.59 | 0.724 | 2 . Urban area size (number of inhabitants) | 258 | 0.66
-0.64 | 0.740 | 48 . Your opinion on the justice running in 1986 | 261 | 0.55
SUMMARY STATISTICS OF CONTINUOUS VARIABLES
TOTAL COUNT : 315    TOTAL WEIGHT : 315.00
+----------------------------------------------------+----------------------+----------------------+
| NUM . IDEN - LABEL                  COUNT   WEIGHT |      MEAN  STD. DEV. |   MINIMUM    MAXIMUM |
+----------------------------------------------------+----------------------+----------------------+
|   4 . Age  - Age of respondent        267   267.00 |     43.61      16.88 |     18.00      83.00 |
|  26 . Nbpr - Number of persons in     267   267.00 |      3.04       1.43 |      1.00       8.00 |
|  28 . Nbef - Number of children       267   267.00 |      1.85       1.69 |      0.00       9.00 |
|  41 . Fami - Family, children : i     267   267.00 |      6.65       1.07 |      1.00       7.00 |
|  42 . Trav - Work, profession : i     267   267.00 |      5.90       1.57 |      1.00       7.00 |
|  43 . Lois - Free time, relax: im     267   267.00 |      5.30       1.43 |      0.00       7.00 |
|  44 . Amis - Friends, acquaintanc     267   267.00 |      5.18       1.41 |      1.00       7.00 |
|  45 . Part - Relatives, brothers,     267   267.00 |      5.63       1.44 |      1.00       7.00 |
|  46 . Reli - Religion : importanc     267   267.00 |      3.15       1.96 |      1.00       7.00 |
|  47 . Poli - Politic, political l     267   267.00 |      3.15       1.79 |      1.00       7.00 |
|  50 . PrFm - State benefits : ave     244   244.00 |    583.10     966.04 |      0.00    5100.00 |
|  51 . Salr - Salary of the respon     267   267.00 |   4408.55    4575.34 |      0.00   40000.00 |
+----------------------------------------------------+----------------------+----------------------+
CORRELATIONS WITH CONTINUOUS VARIABLES
OF VARIABLE : Salary of the respondent
+------------+--------+-------------+------------------------------------------------+----------+
| TEST-VALUE | PROB.  | CORRELATION | NUM . VARIABLE LABEL                           |  WEIGHT  |
+------------+--------+-------------+------------------------------------------------+----------+
|      99.90 |  0.000 |       1.000 |  51 . Salary of the respondent                 |  267.000 |
|      -2.53 |  0.006 |      -0.162 |  50 . State benefits : average monthly amount  |  244.000 |
+------------+--------+-------------+------------------------------------------------+----------+
TABLE - CROSS TABLES
With this procedure, you can obtain in a single run an unlimited number of tables of counts, means or frequencies.
THE TABLES TAB
This tab allows you to define the cross tables to create.
The table cells can display weights, row percentages, column percentages, means and standard deviations, depending on the parameters and settings.
The scrolling menu allows you to define the cross tables you want to display, with or without supplementary information such as a mean or a frequency related to another variable.
If a variable appears in the Means column, each cell of the cross table will display the weighted average of that variable over the cases of the cell.
If a variable appears in the Frequencies column, each cell of the cross table will display the weighted sum of the values of that variable over the cases of the cell.
By clicking on local filter, you can define a specific filter for each command.
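The difference between the Means and Frequencies columns can be illustrated with a minimal Python sketch. The four cases, their weights and their values below are hypothetical, and the function name is an illustration only; this is not SPAD code.

```python
from collections import defaultdict

# Hypothetical cases: (row category, column category, case weight, value of the extra variable)
cases = [
    ("indissoluble", "male", 1.0, 40.0),
    ("indissoluble", "male", 1.0, 50.0),
    ("indissoluble", "female", 2.0, 30.0),
    ("I do not know", "female", 1.0, 60.0),
]

def cell_statistics(cases):
    """Return {(row, col): (weight, weighted mean, weighted sum)} for each cell."""
    weight = defaultdict(float)
    total = defaultdict(float)
    for row, col, w, x in cases:
        weight[(row, col)] += w
        total[(row, col)] += w * x
    return {cell: (weight[cell], total[cell] / weight[cell], total[cell])
            for cell in weight}

stats = cell_statistics(cases)
# Means column -> second element; Frequencies column -> third element
print(stats[("indissoluble", "male")])
```

For the cell (indissoluble, male), the weight is 2, the value shown for a Means variable would be 45 and the value shown for a Frequencies variable would be 90.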
THE PARAMETERS TAB
THE TABLE RESULTS
CROSS-TABS
LIST OF COMMANDS

COMMAND 1
TABLE 1   BY ROW    : 9 . Opinion about marriage
          BY COLUMN : 3 . Sex of respondent
COMMAND 2
TABLE 2   BY ROW    : 9 . Opinion about marriage
          BY COLUMN : 3 . Sex of respondent
          MEANS OF  : 4 . Age of respondent

LIST OF CROSS-TABS

TABLE 1   BY ROW    : Opinion about marriage          TOTAL WEIGHT: 315.
          BY COLUMN : Sex of respondent

WEIGHT               |     male     |    female    |   OVERALL
COLUMN PERC.         |              |              |
ROW PERC.            |              |              |
---------------------+--------------+--------------+--------------
                     |        41    |        40    |        81
indissoluble         |     29.71    |     22.60    |     25.71
                     |     50.62    |     49.38    |    100.00
---------------------+--------------+--------------+--------------
                     |        39    |        69    |       108
dissolved serious pb |     28.26    |     38.98    |     34.29
                     |     36.11    |     63.89    |    100.00
---------------------+--------------+--------------+--------------
                     |        50    |        64    |       114
dissolved if agreem  |     36.23    |     36.16    |     36.19
                     |     43.86    |     56.14    |    100.00
---------------------+--------------+--------------+--------------
                     |         8    |         4    |        12
I do not know        |      5.80    |      2.26    |      3.81
                     |     66.67    |     33.33    |    100.00
---------------------+--------------+--------------+--------------
                     |       138    |       177    |       315
OVERALL              |    100.00    |    100.00    |    100.00
                     |     43.81    |     56.19    |    100.00
------------------------------------------------------------------
KHI2 = 6.67 / 3 DEGREES OF FREEDOM / 0 EXPECTED FREQUENCIES LESS THAN 5
PROB. ( KHI2 > 6.67 ) = 0.083 / TEST-VALUE = 1.38
------------------------------------------------------------------

TABLE 2   BY ROW    : Opinion about marriage          TOTAL WEIGHT: 315.
          BY COLUMN : Sex of respondent
          MEANS OF  : Age of respondent

WEIGHT               |     male     |    female    |   OVERALL
MEAN                 |              |              |
STD. DEV.            |              |              |
---------------------+--------------+--------------+--------------
                     |        41    |        40    |        81
indissoluble         |    45.829    |    48.325    |    47.062
                     |    17.234    |    17.084    |    17.206
---------------------+--------------+--------------+--------------
                     |        39    |        69    |       108
dissolved serious pb |    43.000    |    46.362    |    45.148
                     |    14.739    |    18.260    |    17.148
---------------------+--------------+--------------+--------------
                     |        50    |        64    |       114
dissolved if agreem  |    41.300    |    38.484    |    39.719
                     |    15.442    |    14.330    |    14.893
---------------------+--------------+--------------+--------------
                     |         8    |         4    |        12
I do not know        |    50.250    |    41.250    |    47.250
                     |    15.618    |     8.842    |    14.377
---------------------+--------------+--------------+--------------
                     |       138    |       177    |       315
OVERALL              |    43.645    |    43.842    |    43.756
                     |    16.007    |    17.015    |    16.581
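The figures printed under Table 1 can be recomputed from the cell weights. The sketch below, in plain Python, rebuilds the percentages and the chi-square (KHI2) statistic. The closed-form tail probability is valid only for 3 degrees of freedom. Reading the test-value as the standard normal quantile of that probability is an assumption: it matches the printed 1.38, but this guide does not state the formula.

```python
import math
from statistics import NormalDist

# Cell weights of Table 1 (rows: opinion about marriage, columns: male, female)
observed = [[41, 40], [39, 69], [50, 64], [8, 4]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

col_pct = [[100 * v / col_totals[j] for j, v in enumerate(row)] for row in observed]
row_pct = [[100 * v / row_totals[i] for v in row] for i, row in enumerate(observed)]

# Pearson chi-square: sum of (observed - expected)^2 / expected
chi2 = 0.0
for i, row in enumerate(observed):
    for j, v in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (v - expected) ** 2 / expected

# Upper-tail probability of a chi-square with exactly 3 degrees of freedom
p_value = math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)

# Assumption: test-value = normal quantile with the same upper-tail probability
test_value = NormalDist().inv_cdf(1 - p_value)

print(round(chi2, 2), round(p_value, 3), round(test_value, 2))
```

With a probability of about 0.083, the link between opinion about marriage and sex is not significant at the usual 5% level, which agrees with a test-value below 2.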
BIVAR - BIVARIATE ANALYSIS
The BIVAR procedure lets you characterize a sample from the viewpoint of two particular continuous variables (AXES variables or base variables). The sample can be described by categorical variables and by other continuous variables.
THE VARIABLES TAB
With this tab, the SPAD user selects the two continuous variables for the bivariate analysis.
It is possible to include supplementary variables (continuous or categorical) in the analysis.
The graph editor of the BIVAR method is the same as the one used for factorial analyses. Its capabilities are described in the section Factorial analyses.
FACTORIAL ANALYSES WITH SPAD
PCA: Principal Component Analysis (PCA)
SCA: Simple Correspondence Analysis (SCA)
MCA: Multiple Correspondence Analysis (MCA)
DEFAC: Factors description
SPAD provides the main techniques of multidimensional exploratory analysis, combined with procedures for clustering. One area of application is the processing of large-scale surveys in market research and socio-economic research.
The main applications of factorial analyses are: (1) to reduce the number of dimensions and (2) to detect structure in the relationships between variables. Factor analysis is therefore applied as a data reduction or structure detection method.
VOCABULARY
Active variables: variables used to perform the factorial analysis.
Supplementary variables: variables that are not used to perform the original analysis but are used to illustrate its main results.
Contribution: criterion that measures the contribution of an element (category, variable, frequency or case) to the inertia (total inertia or inertia of a dimension).
Cosines: criterion that measures the quality of representation of an element (category, variable, case or frequency) on each dimension.
Axes, factors, dimensions: these terms correspond to the factors computed or extracted by the analysis. Consecutive factors are uncorrelated, that is, orthogonal to each other. Factors are extracted consecutively, each one maximizing the remaining variability in the active data.
PCA - PRINCIPAL COMPONENT ANALYSIS
This method performs the principal component analysis of a sample of cases described by continuous variables. The analysis can be performed on the original variables or on normed variables (centered and standardized), depending on whether the active variables are on the same scale or not.
It is possible to introduce supplementary elements such as cases, other continuous variables or categorical variables.
Import the Sba dataset Cars.sba.
Drag and drop the PCA method on the Cars dataset as follows.
The two goals of the analysis are:
- Capture the main interrelationships between correlated variables in a small number of summary characteristics: dimension reduction.
- Identify automobile models with similar attributes: a useful step for developing a clustering or classification model.
The dataset contains measurements on 6 variables for 24 models: cubic capacity, power, speed, weight, length and width.
Because the measurement scales differ strongly, we will perform a PCA on normed variables.
KIDEN                Cubic capacity   Power   Speed   Weight   Length   Width
Honda civic                    1396      90     174      850      369     166
Peugeot 205 Rallye             1294     103     189      805      370     157
Seat Ibiza SX I                1461     100     181      925      363     161
Citroën AX Sport               1294      95     184      730      350     160
Renault 19                     1721      92     180      965      415     169
Fiat Tipo                      1580      83     170      970      395     170
Peugeot 405                    1769      90     180     1080      440     169
Renault 21                     2068      88     180     1135      446     170
Citroën BX                     1769      90     182     1060      424     168
Opel Omega                     1998     122     190     1255      473     177
Peugeot 405 Break              1905     125     194     1120      439     171
Ford Sierra                    1993     115     185     1190      451     172
Renault Espace                 1995     120     177     1265      436     177
Nissan Vanette                 1952      87     144     1430      436     169
VW Caravelle                   2109     112     149     1320      457     184
Audi 90 Quattro                1994     160     214     1220      439     169
BMW 530i                       2986     188     226     1510      472     175
Rover 827i                     2675     177     222     1365      469     175
Renault 25                     2548     182     226     1350      471     180
BMW 325iX                      2494     171     208     1600      432     164
Ford Scorpio                   2933     150     200     1345      466     176
Fiat Uno                       1116      58     145      780      364     155
Peugeot 205                    1580      80     159      880      370     156
Ford Fiesta                    1117      50     135      810      371     162
The matrix plot, produced with the STATS method, gives a good overview of the pairwise relationships between variables.
The SETTING OPTIONS
THE VARIABLES TAB
This tab allows the SPAD user to define the following elements:
- Active continuous variables
- Supplementary continuous variables
- Supplementary categorical variables
In our example, we select all the available continuous variables as active. No other variable is available as supplementary information.
THE CASES TAB
The Cases tab allows you to define the role of the cases in the analysis.
The cases retained are the ACTIVE cases; those not retained are called ILLUSTRATIVE or SUPPLEMENTARY. By using the selections by list or by interval, we can also define ABANDONED cases (which are neither active nor illustrative).
All the calculations that lead to the factorial planes, to the hierarchical classification tree and to the final partitions are carried out on the active cases only. The illustrative cases may be projected onto the factorial planes thus constructed. When the partition into classes is built, they are either re-assigned to the closest class or placed in a missing data class.
The abandoned cases are completely ignored in the calculations and are automatically assigned to a missing data class in the partitions.
If you conduct many analyses on a particular sub-population, it may be preferable to create a BASE corresponding to it. To do this, use the Recoding chain in the Tools menu.
In the Cars example, we select all the cases as active.
THE PARAMETERS TAB
NORMED PCA AND NOT NORMED PCA
Cases coordinates are not displayed by default.
Normed PCA means that all the active variables are first centered and standardized by SPAD. As a consequence, every variable makes the same contribution to the overall inertia.
When the PCA is not normed (only centered), the squared distance between a variable and the origin is equal to the variance of the variable (the distance itself equals its standard deviation), so variables with a large variance weigh more in the analysis.
Most of the time, it is advisable to perform a normed analysis in order to give the same importance to each active variable. This is particularly recommended when the measurement scales are different.
In our example, the measurement scales differ strongly. Thus, we will perform a normed PCA.
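The effect of norming can be sketched in a few lines of Python with two columns of the Cars table (Power and Width). The sketch assumes that standardization divides by the population standard deviation (divisor n), which is the convention that reproduces the standard deviations printed in the SPAD listings; the helper name `standardize` is illustrative.

```python
from statistics import fmean, pstdev, pvariance

# Power and Width columns of the Cars dataset, in table order
power = [90, 103, 100, 95, 92, 83, 90, 88, 90, 122, 125, 115,
         120, 87, 112, 160, 188, 177, 182, 171, 150, 58, 80, 50]
width = [166, 157, 161, 160, 169, 170, 169, 170, 168, 177, 171, 172,
         177, 169, 184, 169, 175, 175, 180, 164, 176, 155, 156, 162]

def standardize(values):
    """Center on the mean and divide by the population standard deviation."""
    mean, std = fmean(values), pstdev(values)
    return [(v - mean) / std for v in values]

# Raw variances are on very different scales ...
print(round(pvariance(power), 1), round(pvariance(width), 1))
# ... while each standardized variable has variance 1
print(round(pvariance(standardize(power)), 6), round(pvariance(standardize(width)), 6))
```

After standardization each active variable brings a variance of 1, so the total inertia of a normed PCA equals the number of active variables.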
RETAINED COORDINATES
The number of retained coordinates matters for the methods that follow the PCA in the chain. These methods can be DEFAC (factors description) and RECIP/SEMIS (clustering).
THE PCA RESULTS
PRINCIPAL COMPONENTS ANALYSIS

SUMMARY STATISTICS OF CONTINUOUS VARIABLES
TOTAL COUNT : 24    TOTAL WEIGHT : 24.00
+-------------------------------------------------+----------------------+----------------------+
| NUM . IDEN - LABEL               COUNT   WEIGHT |      MEAN  STD. DEV. |   MINIMUM    MAXIMUM |
+-------------------------------------------------+----------------------+----------------------+
|   1 . CYLI - Cubic capacity         24    24.00 |   1906.13     516.79 |   1116.00    2986.00 |
|   2 . PUIS - Power                  24    24.00 |    113.67      37.97 |     50.00     188.00 |
|   3 . VITE - Speed                  24    24.00 |    183.08      24.68 |    135.00     226.00 |
|   4 . POID - Weight                 24    24.00 |   1123.33     243.20 |    730.00    1600.00 |
|   5 . LONG - Length                 24    24.00 |    421.58      40.47 |    350.00     473.00 |
|   6 . LARG - Width                  24    24.00 |    168.83       7.49 |    155.00     184.00 |
+-------------------------------------------------+----------------------+----------------------+

CORRELATION MATRIX
     |   CYLI   PUIS   VITE   POID   LONG   LARG
-----+------------------------------------------
CYLI |   1.00
PUIS |   0.86   1.00
VITE |   0.69   0.89   1.00
POID |   0.90   0.77   0.51   1.00
LONG |   0.86   0.69   0.53   0.86   1.00
LARG |   0.71   0.55   0.36   0.70   0.86   1.00
-----+------------------------------------------
     |   CYLI   PUIS   VITE   POID   LONG   LARG
The linear correlation coefficient measures the intensity of the relationship between two continuous variables. It ranges from -1 to +1. The closer the correlation coefficient is to +1 or -1, the more closely the two variables are related.
TEST-VALUES MATRIX
     |   CYLI   PUIS   VITE   POID   LONG   LARG
-----+------------------------------------------
CYLI |  99.99
PUIS |   6.35  99.99
VITE |   4.19   7.06  99.99
POID |   7.14   4.99   2.74  99.99
LONG |   6.42   4.14   2.90   6.40  99.99
LARG |   4.34   3.05   1.86   4.25   6.41  99.99
-----+------------------------------------------
     |   CYLI   PUIS   VITE   POID   LONG   LARG
This matrix is related to the previous one. SPAD translates the test of correlation into a test-value. In this example, the higher the test-value, the more closely the two variables are related. A test-value lower than 2 in absolute value can be read as no significant linear relationship between the two variables.
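As an illustration, the correlation between cubic capacity (CYLI) and power (PUIS) can be recomputed from the Cars table in plain Python. The second computation is an assumption, not documented SPAD behavior: this guide does not give the test-value formula, and the Fisher transformation sqrt(n) * atanh(r) is simply one expression that lands close to the printed 6.35 for this pair.

```python
import math
from statistics import fmean, pstdev

# Cubic capacity and power columns of the Cars dataset, in table order
cubic_capacity = [1396, 1294, 1461, 1294, 1721, 1580, 1769, 2068, 1769, 1998, 1905, 1993,
                  1995, 1952, 2109, 1994, 2986, 2675, 2548, 2494, 2933, 1116, 1580, 1117]
power = [90, 103, 100, 95, 92, 83, 90, 88, 90, 122, 125, 115,
         120, 87, 112, 160, 188, 177, 182, 171, 150, 58, 80, 50]

def pearson(x, y):
    """Pearson linear correlation coefficient of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    covariance = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return covariance / (pstdev(x) * pstdev(y))

r = pearson(cubic_capacity, power)

# Assumed approximation of the test-value (Fisher z-transformation scaled by sqrt(n))
approx_test_value = math.sqrt(len(power)) * math.atanh(r)

print(round(r, 2), round(approx_test_value, 2))
```

The coefficient rounds to the 0.86 shown in the correlation matrix; the approximate test-value is far above 2, so the relationship is clearly significant.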
EIGENVALUES
COMPUTATIONS PRECISION SUMMARY : TRACE BEFORE DIAGONALISATION ..  6.0000
                                 SUM OF EIGENVALUES ............  6.0000

HISTOGRAM OF THE FIRST 6 EIGENVALUES
+--------+------------+------------+------------+----------------------------------------------------------------------------------+
| NUMBER | EIGENVALUE | PERCENTAGE | CUMULATED  |                                                                                  |
|        |            |            | PERCENTAGE |                                                                                  |
+--------+------------+------------+------------+----------------------------------------------------------------------------------+
|    1   |   4.6173   |    76.96   |    76.96   | ******************************************************************************** |
|    2   |   0.8788   |    14.65   |    91.60   | ****************                                                                 |
|    3   |   0.3035   |     5.06   |    96.66   | ******                                                                           |
|    4   |   0.1055   |     1.76   |    98.42   | **                                                                               |
|    5   |   0.0732   |     1.22   |    99.64   | **                                                                               |
|    6   |   0.0216   |     0.36   |   100.00   | *                                                                                |
+--------+------------+------------+------------+----------------------------------------------------------------------------------+
In the second column (Eigenvalue) above, we find the variance of the new factors that were successively extracted. In the third column, these values are expressed as a percentage of the total variance. As we can see, factor 1 accounts for 77 percent of the variance, factor 2 for 15 percent, and so on. As expected, the sum of the eigenvalues is equal to the number of active variables (6), since the PCA is normed.
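The percentage columns of the eigenvalue table follow directly from the eigenvalues. A minimal Python sketch, using the six printed eigenvalues (rounded to four decimals, so results can differ from the listing in the last digit):

```python
from itertools import accumulate

# Eigenvalues as printed in the SPAD listing
eigenvalues = [4.6173, 0.8788, 0.3035, 0.1055, 0.0732, 0.0216]

total_inertia = sum(eigenvalues)  # close to 6, the number of active variables
percentages = [100 * ev / total_inertia for ev in eigenvalues]
cumulated = list(accumulate(percentages))

for number, (ev, pct, cum) in enumerate(zip(eigenvalues, percentages, cumulated), start=1):
    print(f"{number}  {ev:.4f}  {pct:6.2f}  {cum:6.2f}")
```

The first two factors together carry about 91.6 percent of the total variance, which is why the first factorial plane already summarizes the Cars data well.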
ANDERSON'S LAPLACE INTERVALS
WITH 0.95 THRESHOLD
+--------+----------------------------------------------+
| NUMBER |  LOWER LIMIT     EIGENVALUE     UPPER LIMIT  |
+--------+----------------------------------------------+
|    1   |     1.9486         4.6173         7.2860     |
|    2   |     0.3709         0.8788         1.3868     |
|    3   |     0.1281         0.3035         0.4789     |
|    4   |     0.0445         0.1055         0.1665     |
|    5   |     0.0309         0.0732         0.1154     |
+--------+----------------------------------------------+
LENGTH AND RELATIVE POSITION OF INTERVALS
[Character plot of the five intervals, each drawn around its eigenvalue]
Third and second differences, as well as Anderson's Laplace intervals, are other guidelines that help the SPAD user choose the number of dimensions to retain for further analyses.
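The printed limits are consistent with the usual asymptotic approximation for an eigenvalue, lambda * (1 ± 1.96 * sqrt(2 / (n - 1))), with n = 24 cases. The Python sketch below applies this formula; that SPAD uses exactly this expression is an assumption inferred from the listing, not something the guide states.

```python
import math

n_cases = 24
z_95 = 1.96  # two-sided 95% normal quantile
relative_half_width = z_95 * math.sqrt(2 / (n_cases - 1))

def anderson_interval(eigenvalue):
    """Approximate 95% interval around an eigenvalue (assumed formula)."""
    return (eigenvalue * (1 - relative_half_width),
            eigenvalue * (1 + relative_half_width))

for ev in [4.6173, 0.8788, 0.3035, 0.1055, 0.0732]:
    lower, upper = anderson_interval(ev)
    print(f"{lower:.4f}  {ev:.4f}  {upper:.4f}")
```

The interval of the first eigenvalue (about 1.95 to 7.29) does not overlap that of the second (about 0.37 to 1.39), which supports treating the first factor as clearly distinct from the following ones.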
LOADINGS OF VARIABLES ON AXES 1 TO 5
ACTIVE VARIABLES
---------------------+------------------------------------+------------------------------------+------------------------------------
VARIABLES            | LOADINGS                           | VARIABLE-FACTOR CORRELATIONS       | NORMED EIGENVECTORS
---------------------+------------------------------------+------------------------------------+------------------------------------
IDEN - SHORT LABEL   |     1      2      3      4      5  |     1      2      3      4      5  |     1      2      3      4      5
---------------------+------------------------------------+------------------------------------+------------------------------------
CYLI - Cubic capacity|  0.96   0.01  -0.15   0.04  -0.23  |  0.96   0.01  -0.15   0.04  -0.23  |  0.45   0.01  -0.27   0.11  -0.84
PUIS - Power         |  0.90   0.38  -0.02  -0.16   0.04  |  0.90   0.38  -0.02  -0.16   0.04  |  0.42   0.41  -0.03  -0.49   0.15
VITE - Speed         |  0.75   0.62   0.20   0.08   0.04  |  0.75   0.62   0.20   0.08   0.04  |  0.35   0.66   0.37   0.26   0.13
POID - Weight        |  0.91  -0.18  -0.35  -0.06   0.11  |  0.91  -0.18  -0.35  -0.06   0.11  |  0.42  -0.19  -0.63  -0.18   0.42
LONG - Length        |  0.92  -0.30   0.05   0.22   0.07  |  0.92  -0.30   0.05   0.22   0.07  |  0.43  -0.32   0.10   0.69   0.26
LARG - Width         |  0.80  -0.48   0.34  -0.14  -0.02  |  0.80  -0.48   0.34  -0.14  -0.02  |  0.37  -0.51   0.62  -0.42  -0.06
---------------------+------------------------------------+------------------------------------+------------------------------------
For a normed PCA, variable-factor correlations and loadings are equivalent.
As the table shows, the first factor is generally more highly correlated with the variables than the second factor. This is to be expected because, as previously described, the factors are extracted successively and account for less and less of the overall variance.
Normed eigenvectors are the coefficients that describe the linear relationship between the normed active variables and the factors. In this example, we have:
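\[
\text{Factor}_1 = 0.45\,\frac{\text{CYLI} - \operatorname{Mean}(\text{CYLI})}{\operatorname{STDEV}(\text{CYLI})}
 + 0.42\,\frac{\text{PUIS} - \operatorname{Mean}(\text{PUIS})}{\operatorname{STDEV}(\text{PUIS})}
 + 0.35\,\dots
\]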
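Note: SPAD prints neither the contributions nor the squared cosines of the active variables. However, they can be calculated as follows, for a variable j and an axis \alpha:
\[ \cos^2(j,\alpha) = \text{Loading}^2(j,\alpha) \quad \text{for a normed PCA} \]
\[ \cos^2(j,\alpha) = \text{Correlation}^2(j,\alpha) \quad \text{for both normed and not normed PCA} \]
and
\[ \text{Contribution}(j,\alpha) = \text{NormedEigenvector}^2(j,\alpha) \]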
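These relations can be checked on axis 1 with the values printed in the loadings table. The Python sketch below squares the loadings and the normed eigenvector, and also checks the standard identity loading = eigenvector * sqrt(eigenvalue) for a normed PCA. Because the listing rounds to two decimals, results match only approximately; the dictionary names are illustrative.

```python
import math

eigenvalue_1 = 4.6173
# Axis 1 columns of the loadings table (values rounded to two decimals in the listing)
loadings_1 = {"CYLI": 0.96, "PUIS": 0.90, "VITE": 0.75, "POID": 0.91, "LONG": 0.92, "LARG": 0.80}
eigenvector_1 = {"CYLI": 0.45, "PUIS": 0.42, "VITE": 0.35, "POID": 0.42, "LONG": 0.43, "LARG": 0.37}

# Squared cosine of a variable on the axis = squared loading (normed PCA)
cos2 = {name: value ** 2 for name, value in loadings_1.items()}

# Contribution of a variable to the axis, in percent = 100 * squared eigenvector coefficient
contribution = {name: 100 * value ** 2 for name, value in eigenvector_1.items()}

# Loadings rebuilt from the eigenvector and the eigenvalue
rebuilt = {name: value * math.sqrt(eigenvalue_1) for name, value in eigenvector_1.items()}

print({name: round(c, 1) for name, c in contribution.items()})
print(round(sum(contribution.values()), 1))  # close to 100
```

On the first axis, every variable contributes between roughly 12 and 20 percent, which is typical of a "size" factor to which all variables contribute in the same direction.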
FACTOR SCORES, CONTRIBUTIONS AND SQUARED COSINES OF CASES
AXES 1 TO 5
+-----------------------------------+------------------------------------+--------------------------------+--------------------------------+
| CASES                             | FACTOR SCORES                      | CONTRIBUTIONS                  | SQUARED COSINES                |
|-----------------------------------+------------------------------------+--------------------------------+--------------------------------|
| IDENTIFIER         REL.WT.  DISTO |     1      2      3      4      5  |    1     2     3     4     5   |    1     2     3     4     5   |
+-----------------------------------+------------------------------------+--------------------------------+--------------------------------+
| Honda civic           4.17   4.59 | -2.01   0.32   0.50  -0.44  -0.10  |  3.6   0.5   3.4   7.6   0.6   | 0.88  0.02  0.05  0.04  0.00   |
| Peugeot 205 Rallye    4.17   7.37 | -2.25   1.49   0.14   0.09   0.19  |  4.6  10.6   0.3   0.3   2.1   | 0.69  0.30  0.00  0.00  0.00   |
| Seat Ibiza SX I       4.17   4.73 | -1.92   0.94  -0.06  -0.36   0.00  |  3.3   4.2   0.1   5.0   0.0   | 0.78  0.19  0.00  0.03  0.00   |
| Citroën AX Sport      4.17   8.78 | -2.60   1.29   0.47  -0.32  -0.15  |  6.1   7.9   3.0   4.0   1.2   | 0.77  0.19  0.02  0.01  0.00   |
| Renault 19            4.17   0.92 | -0.78  -0.16   0.48   0.20  -0.12  |  0.6   0.1   3.1   1.6   0.8   | 0.66  0.03  0.25  0.04  0.01   |
| Fiat Tipo             4.17   2.18 | -1.30  -0.43   0.43  -0.22  -0.10  |  1.5   0.9   2.5   2.0   0.6   | 0.77  0.09  0.08  0.02  0.00   |
| Peugeot 405           4.17   0.71 | -0.30  -0.46   0.21   0.58   0.16  |  0.1   1.0   0.6  13.1   1.4   | 0.12  0.30  0.06  0.47  0.04   |
| Renault 21            4.17   0.96 |  0.15  -0.64   0.01   0.67  -0.21  |  0.0   1.9   0.0  17.8   2.5   | 0.02  0.42  0.00  0.47  0.05   |
| Citroën BX            4.17   0.54 | -0.52  -0.20   0.17   0.40   0.04  |  0.2   0.2   0.4   6.2   0.1   | 0.50  0.07  0.06  0.29  0.00   |
| Opel Omega            4.17   3.25 |  1.45  -0.79   0.51   0.31   0.42  |  1.9   3.0   3.5   3.7  10.0   | 0.64  0.19  0.08  0.03  0.05   |
| Peugeot 405 Break     4.17   0.55 |  0.57   0.13   0.39   0.15   0.19  |  0.3   0.1   2.0   0.9   2.1   | 0.58  0.03  0.27  0.04  0.07   |
| Ford Sierra           4.17   0.82 |  0.70  -0.43   0.14   0.30   0.16  |  0.4   0.9   0.3   3.5   1.4   | 0.60  0.23  0.02  0.11  0.03   |
| Renault Espace        4.17   1.77 |  0.86  -0.87   0.20  -0.44   0.13  |  0.7   3.6   0.5   7.7   0.9   | 0.42  0.43  0.02  0.11  0.01   |
| Nissan Vanette        4.17   4.73 | -0.11  -1.69  -1.33  -0.05   0.24  |  0.0  13.6  24.4   0.1   3.3   | 0.00  0.61  0.38  0.00  0.01   |
| VW Caravelle          4.17   7.58 |  1.14  -2.39   0.21  -0.69  -0.06  |  1.2  27.1   0.6  18.7   0.2   | 0.17  0.75  0.01  0.06  0.00   |
| Audi 90 Quattro       4.17   3.43 |  1.39   1.10   0.19  -0.03   0.48  |  1.7   5.7   0.5   0.0  13.0   | 0.56  0.35  0.01  0.00  0.07   |
| BMW 530i              4.17  15.98 |  3.88   0.85  -0.35  -0.04  -0.30  | 13.6   3.4   1.7   0.1   5.1   | 0.94  0.04  0.01  0.00  0.01   |
| Rover 827i            4.17  10.52 |  3.15   0.75   0.13   0.05  -0.13  |  8.9   2.7   0.2   0.1   0.9   | 0.94  0.05  0.00  0.00  0.00   |
| Renault 25            4.17  12.39 |  3.39   0.57   0.71  -0.23   0.07  | 10.4   1.5   6.9   2.1   0.3   | 0.93  0.03  0.04  0.00  0.00   |
| BMW 325iX             4.17   8.92 |  2.20   1.17  -1.59  -0.24   0.32  |  4.4   6.5  34.6   2.3   6.0   | 0.54  0.15  0.28  0.01  0.01   |
| Ford Scorpio          4.17   8.28 |  2.74  -0.15  -0.19   0.13  -0.83  |  6.8   0.1   0.5   0.6  39.1   | 0.91  0.00  0.00  0.00  0.08   |
| Fiat Uno              4.17  14.29 | -3.73   0.03  -0.50   0.19   0.01  | 12.6   0.0   3.5   1.4   0.0   | 0.97  0.00  0.02  0.00  0.00   |
| Peugeot 205           4.17   7.70 | -2.60   0.46  -0.72   0.12  -0.39  |  6.1   1.0   7.1   0.6   8.4   | 0.88  0.03  0.07  0.00  0.02   |
| Ford Fiesta           4.17  12.99 | -3.49  -0.87  -0.13  -0.11  -0.03  | 11.0   3.6   0.2   0.5   0.1   | 0.94  0.06  0.00  0.00  0.00   |
+-----------------------------------+------------------------------------+--------------------------------+--------------------------------+
DISTO: the squared distance between the case and the center of gravity of the overall sample. It helps to identify the "average" cars (close to the center of gravity) and the more specific ones, which are far from the center of gravity.
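The three blocks of the cases table are linked by simple formulas, which can be checked on the first row (Honda civic). The Python sketch below assumes the standard PCA definitions with equal case weights of 1/24: contribution = 100 * weight * score² / eigenvalue and squared cosine = score² / DISTO. The values are taken from the listings, so agreement is limited by rounding.

```python
n_cases = 24
eigenvalues = [4.6173, 0.8788, 0.3035, 0.1055, 0.0732]

# Honda civic row of the cases table
scores = [-2.01, 0.32, 0.50, -0.44, -0.10]
disto = 4.59

weight = 1 / n_cases  # relative weight, printed as 4.17 (percent)

# Share of each axis inertia due to this case, in percent
contributions = [100 * weight * s ** 2 / ev for s, ev in zip(scores, eigenvalues)]

# Quality of representation of the case on each axis
squared_cosines = [s ** 2 / disto for s in scores]

print([round(c, 1) for c in contributions])
print([round(c, 2) for c in squared_cosines])
print(round(sum(s ** 2 for s in scores), 2))  # close to DISTO
```

The sum of the squared scores on the five axes is about 4.60, very close to DISTO (4.59), which confirms that DISTO is a squared distance and that the sixth axis carries almost nothing for this car.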
THE FACTORIAL GRAPH EDITOR
To access the factorial graph editor, click on its icon.
To create a new factorial graph, select Graph - New. The preselection window appears.
The preselection step allows you to select the different elements to display in the graph:
- Active or supplementary cases
- Active or supplementary variables
If you forget to select an element, you have to create a new graph and redo the preselection.
THE TOOL BAR OF THE GRAPH EDITOR
The tool bar contains the following buttons:
- Points selection
- Factors selection
- Total unselection
- Framing selection
- Delete the labels
- Write the labels
- Cancel the ghosts
- Set as ghost
- Information on points
- Refresh
- Vertical symmetric view
- Horizontal symmetric view
- Correlation circle
SAVE A GRAPH
The internal save is dependent on the chain. If the chain is re-executed, or if the user deletes the results of the chain, these internal saves are deleted.
This type of save uses the commands Save and Save as - Internal save of the Graphics menu.
When you save in internal format, you give a TITLE to the saved graphic. Later you can reload this save with the command Open - Internal save of the Graphics menu.
The advantage of the internal format is that all the annotation and property functions of the factorial planes remain available.
The save in archive format is independent of the chain.
This type of save is made with the command Save as - Save archive of the Graphics menu. When saving in archive format, you give a NAME to the saved graphic, with the obligatory extension .GFA.
Later, you can recover this save with the command Open - Save archive of the Graphics menu.
Some formatting is no longer possible with this type of save, in particular the formatting of cases.
The editor for the factorial planes also lets you save the graphics in .BMP or .PCX format. These images can then be inserted into a word processor document. The EMF Metafile format gives the best image quality. This type of save is made with the command Save as - Screen Image BMP/PCX.
GENERAL PRINCIPLES
Building a graphic after an analysis follows these general principles:
Go to the Graphics menu and choose New, which opens the preselection dialogue box. For a single analysis, you can open several graphics at once through the Graphics menu and make different preselections. All the graphics you create can be saved in the internal or the archive format.
To modify your graph, apply the following rule:
- Select the points with the tool bar or the selection menu.
- Format them with the format menu.
- Deselect to see the effect of the changes.
IMPORTANT
To manipulate (move, change, etc.) the labels and the texts on a graphic, or to enlarge the frame, you have to be in standard mode, that is: no selection mode button is highlighted and the status bar is empty.
SCA - SIMPLE CORRESPONDENCE ANALYSIS
This procedure performs a simple correspondence analysis (SCA) on a contingency table or on any table of non-negative numbers.
Simple correspondence analysis is a powerful statistical tool for the graphical analysis of contingency tables.
The result of a simple correspondence analysis is a two-dimensional graphical representation of the association between the rows and the columns of the table. The plot contains a point for each row and each column of the table. Rows with similar patterns of counts produce points that are close together, and columns with similar patterns of counts produce points that are close together.
Simple correspondence analysis analyzes a contingency table made up of one or more column variables and one or more row variables.
To illustrate this method, consider the following dataset, a typical two-dimensionalcontingency table. The data deal with the perception of different kinds of alcohol.
Select the SPAD dataset ALCOOL.SBA and import it.
                                      PASTIS  WHISKY MARTINI    SUZE   VODKA     GIN  MALIBU    BEER
Like the taste                            49      50      42      18      25      23      25      59
With friends                              83      83      76      60      69      68      69      74
To relax oneself                          61      61      51      32      38      39      39      72
Become expensive                          60      88      42      41      75      70      61      19
Refreshing                                78      22      18      19      17      19      14      80
Not elegant                               26      11      13      17      13      11      13      29
Friendly product                          64      64      56      34      45      42      46      68
Good before meals                         88      79      85      64      45      46      37      41
Good during the day                       24      21      12      10      13      12      13      85
Good during evening                        7      61      12      11      53      50      48      54
For all year long                         83      87      85      79      83      82      80      90
Liked by youngs                           45      77      36      16      65      69      76      89
Good for guests                           88      92      87      60      70      67      67      81
Oldy, not trendy                          12       4      13      38       5       6       8       7
As well for men as for women              50      62      69      43      49      51      61      60
Close to me                               38      41      27      11      16      18      17      49
By habits                                 36      30      24      16      19      19      17      40
Make snobish                               3      35       9       8      28      25      21       4
We can mix it                             43      87      29      32      82      80      43      40
For night life / bars / nightclubs        12      91      27      16      84      81      72      67
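As an illustration of what the procedure computes, here is a minimal sketch of a simple correspondence analysis in Python with NumPy. It is not SPAD's implementation: the function name simple_ca and the variable names are assumptions made for this example, and only four rows of the table above are used to keep it short.

```python
import numpy as np

def simple_ca(table):
    """Correspondence analysis of a non-negative two-way table via SVD.

    Hypothetical helper for illustration; not part of SPAD.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals from the independence model r c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    eigenvalues = sv ** 2                # inertia carried by each axis
    row_coords = (U * sv) / np.sqrt(r)[:, None]      # principal coordinates
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return eigenvalues, row_coords, col_coords

# Four rows of the ALCOOL table, columns PASTIS ... BEER
rows = ["Refreshing", "Good during evening", "Make snobish", "Good during the day"]
data = [
    [78, 22, 18, 19, 17, 19, 14, 80],
    [7, 61, 12, 11, 53, 50, 48, 54],
    [3, 35, 9, 8, 28, 25, 21, 4],
    [24, 21, 12, 10, 13, 12, 13, 85],
]
eigenvalues, row_coords, col_coords = simple_ca(data)
print("share of inertia per axis:", np.round(eigenvalues / eigenvalues.sum(), 3))
```

The first two columns of row_coords and col_coords are the coordinates that would be plotted on the first factorial plane. The sign of each axis is arbitrary, so a plot obtained this way may be mirrored compared with the one produced by SPAD.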
The SETTING OPTIONS
THE COLUMNS TAB
Active frequencies: all
THE ROWS TAB
This tab is identical to the Cases tabs available for the descriptive statistics methods.
THE PARAMETERS TAB
To display the row results in Excel sheets, click the Options button and select Yes.
The following graph was produced with the SPAD Amado procedure. Using the SCA results, rows and columns are ranked by decreasing coordinate on the first factor, which gives a visual structure to the table. The width of a column is proportional to its frequency.
[Figure: Amado graph of the ALCOOL table, with rows and columns reordered.]
Column order in the graph: VODKA, GIN, MALIBU, WHISKY, MARTINI, SUZE, BEER, PASTIS.
Row order in the graph: Make snobish; For night life / bars / nightclubs; Good during evening; We can mix it; Become expensive; Liked by youngs; As well for men as for women; For all year long; With friends; Good for guests; Friendly product; To relax oneself; Good before meals; Like the taste; By habits; Close to me; Not elegant; Good during the day; Oldy, not trendy; Refreshing.
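The reordering used by such a graph can be sketched as follows. This is only an illustration under stated assumptions: the function first_axis_order is hypothetical, the first-axis coordinates are recomputed with a plain SVD rather than read from SPAD's output, and only four rows of the ALCOOL table are used.

```python
import numpy as np

def first_axis_order(table):
    """Order rows and columns by decreasing first-axis CA coordinate.

    Hypothetical helper for illustration; not the Amado procedure itself.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_f1 = U[:, 0] * sv[0] / np.sqrt(r)     # row coordinates on axis 1
    col_f1 = Vt[0, :] * sv[0] / np.sqrt(c)    # column coordinates on axis 1
    return np.argsort(-row_f1), np.argsort(-col_f1), row_f1, col_f1

columns = ["PASTIS", "WHISKY", "MARTINI", "SUZE", "VODKA", "GIN", "MALIBU", "BEER"]
rows = ["Refreshing", "Good during evening", "Make snobish", "Good during the day"]
data = np.array([
    [78, 22, 18, 19, 17, 19, 14, 80],
    [7, 61, 12, 11, 53, 50, 48, 54],
    [3, 35, 9, 8, 28, 25, 21, 4],
    [24, 21, 12, 10, 13, 12, 13, 85],
])

row_order, col_order, row_f1, col_f1 = first_axis_order(data)
reordered = data[np.ix_(row_order, col_order)]        # table as it would be drawn
col_widths = reordered.sum(axis=0) / reordered.sum()  # width proportional to frequency

print([columns[j] for j in col_order])
print([rows[i] for i in row_order])
```

Because the sign of a factorial axis is arbitrary, the order obtained this way may be the exact reverse of the one shown in the graph.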
MCA - MULTIPLE CORRESPONDENCE ANALYSIS
Multiple correspondence analysis extends the properties of simple correspondence analysis to n-way tables. The procedure requires more than 2 active categorical variables, observed on a set of cases. As with the other factorial analyses, it is possible to add supplementary elements such as illustrative cases and illustrative continuous or categorical variables.
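A multiple correspondence analysis is commonly described as a correspondence analysis of the indicator (complete disjunctive) table, in which each case has exactly one 1 per question. The sketch below shows that coding on a hypothetical mini-sample; the function disjunctive_table and the category labels are assumptions made for this example, not SPAD objects.

```python
import numpy as np

def disjunctive_table(records):
    """One-hot code a list of equal-length tuples of category labels."""
    n_questions = len(records[0])
    columns = []                       # (question index, category) pairs
    for q in range(n_questions):
        for cat in sorted({rec[q] for rec in records}):
            columns.append((q, cat))
    Z = np.zeros((len(records), len(columns)), dtype=int)
    for i, rec in enumerate(records):
        for j, (q, cat) in enumerate(columns):
            if rec[q] == cat:
                Z[i, j] = 1
    return Z, columns

# Hypothetical cases: (gender, owns securities, housing status)
records = [
    ("male", "no", "tenant"),
    ("female", "no", "owner"),
    ("female", "yes", "tenant"),
    ("male", "no", "free housing"),
]
Z, columns = disjunctive_table(records)
print(columns)
print(Z)
print("row sums:", Z.sum(axis=1))      # each row sums to the number of questions
```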
We will perform the MCA on the ASPI1000.SBA dataset.
VARIABLES DESCRIPTION OF THE ASPI1000.SBA DATASET
ACTIVE CATEGORICAL VARIABLES - 7 VARIABLES - 28 CATEGORIES
11 . Gender ( 2 categories )
29 . Do you own securities ? ( 2 categories )
39 . Urban area size (number of inhabitants) ( 5 categories )
49 . Job category ( 5 categories )
51 . Diploma in 5 categories ( 5 categories )
52 . Occupation status of housing in 4 categories ( 4 categories )
53 . Age in 5 categories ( 5 categories )
SUPPLEMENTARY CATEGORICAL VARIABLES - 35 VARIABLES - 152 CATEGORIES
All available categorical variables
SUPPLEMENTARY CONTINUOUS VARIABLES - 8 VARIABLES
All available continuous variables
The SETTING OPTIONS
THE VARIABLES TAB
THE PARAMETERS TAB
Random assignment of active categories inferior to (in %)
To ensure the robustness of the analysis, it may be useful to take into account, when defining the axes, only the categories with a sufficient weight. For each question, the cases belonging to a category with a weak total weight are assigned at random to one of the other categories of that question that have a sufficient weight. This cleaning operation allows the data table to keep its complete disjunctive property.
The parameter PCMIN sets the percentage of the total weight of the active cases below which a category is considered too light. If all the cases have weight 1, PCMIN is the percentage of the number of active cases below which the cases of a category are reassigned.
If all the categories of a question (or all except one) are too light, the question itself is made illustrative for the calculation of the axes. The default value (2%) is suitable for most analyses. If the parameter is set to 0.0, only the categories with a null weight are eliminated.
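The cleaning rule described above can be sketched as follows. This is an illustration only: the function clean_rare_categories is hypothetical, all cases are assumed to have weight 1, and the reassignment draws uniformly among the remaining categories, which is an assumption since the guide does not say how SPAD draws.

```python
import random
from collections import Counter

def clean_rare_categories(values, pcmin=2.0, seed=0):
    """Reassign at random the cases of categories lighter than pcmin percent.

    Returns the cleaned list, or None when fewer than two categories are
    heavy enough (the question would then be made illustrative).
    """
    rng = random.Random(seed)
    counts = Counter(values)
    threshold = pcmin / 100.0 * len(values)   # minimum weight, unit case weights
    kept = sorted(cat for cat, n in counts.items() if n >= threshold)
    if len(kept) < 2:
        return None
    return [v if v in kept else rng.choice(kept) for v in values]

# Hypothetical question with 100 cases; category "c" holds 1% of them
answers = ["a"] * 60 + ["b"] * 39 + ["c"] * 1
cleaned = clean_rare_categories(answers, pcmin=2.0)
print(Counter(answers))
print(Counter(cleaned))
```

With 1000 active cases of weight 1 and the default PCMIN of 2%, the threshold computed this way is a weight of 20.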
Retained coordinates
The number of retained coordinates matters for the methods that follow the MCA in the chain. These methods can be DEFAC (factors description) and RECIP/SEMIS (clustering).
By default, case coordinates are not displayed.
THE MCA RESULTS
MULTIPLE CORRESPONDENCE ANALYSIS
ELIMINATION OF ACTIVE CATEGORIES WITH SMALL WEIGHTS
THRESHOLD (PCMIN) : 2.00 %   WEIGHT: 20.00
BEFORE CLEANING : 7 ACTIVE QUESTIONS   28 ASSOCIATE CATEGORIES
AFTER CLEANING  : 7 ACTIVE QUESTIONS   28 ASSOCIATE CATEGORIES
TOTAL WEIGHT OF ACTIVE CASES : 1000.00

MARGINAL DISTRIBUTIONS OF ACTIVE QUESTIONS
----------------------------+-----------------+-----------------
CATEGORIES                  | BEFORE CLEANING | AFTER CLEANING
IDENT - LABEL               | COUNT    WEIGHT | COUNT    WEIGHT
----------------------------+-----------------+-----------------
11 . Gender
masc - male                 |   469    469.00 |   469    469.00
fémi - gender               |   531    531.00 |   531    531.00
----------------------------+-----------------+-----------------
29 . Do you own some securities ?
vmo1 - Yes                  |   121    121.00 |   121    121.00
vmo2 - No                   |   879    879.00 |   879    879.00
----------------------------+-----------------+-----------------
39 . Urban area size (number of inhabitants)
agg1 - Lower than 2.000     |    83     83.00 |    83     83.00
agg2 - 2.000 - 20.000       |    87     87.00 |    87     87.00
agg3 - 20.000 - 100.000     |   175    175.00 |   175    175.00
agg4 - greater than 100.000 |   329    329.00 |   329    329.00
agg5 - Paris                |   326    326.00 |   326    326.00
----------------------------+-----------------+-----------------
49 . Job category
emp1 - Worker               |   263    263.00 |   263    263.00
emp2 - Employee             |   335    335.00 |   335    335.00
emp3 - Manager              |   229    229.00 |   229    229.00
emp4 - Other                |    48     48.00 |    48     48.00  ==RAND. ASSIGN.==
49_  - missing category     |   125    125.00 |   125    125.00
----------------------------+-----------------+-----------------
51 . Diploma in 5 categories
die1 - No one               |   189    189.00 |   189    189.00
die2 - CEP                  |   321    321.00 |   321    321.00
die3 - BEPC-BE-BEPS         |   158    158.00 |   158    158.00
die4 - Bac - Brevet sup.    |   182    182.00 |   182    182.00
die5 - University           |   150    150.00 |   150    150.00
----------------------------+-----------------+-----------------
52 . Occupation status of housing in 4 categories
slo1 - homeowner            |   120    120.00 |   120    120.00
slo2 - owner                |   290    290.00 |   290    290.00
slo3 - tenant               |   523    523.00 |   523    523.00
slo4 - free housing, other  |    67     67.00 |    67     67.00
----------------------------+-----------------+-----------------
53 . Age in 5 categories
agc1 - Lower than 25 yo     |   150    150.00 |   150    150.00
agc2 - 25 to 34 yo          |   284    284.00 |   284    284.00
agc3 - 35 to 49 yo          |   209    209.00 |   209    209.00
agc4 - 50 to 64 yo          |   188    188.00 |   188    188.00
agc5 - 65 yo and more       |   169    169.00 |   169    169.00
----------------------------+-----------------+-----------------
EIGENVALUES
COMPUTATIONS PRECISION SUMMARY : TRACE BEFORE DIAGONALISATION .. 2.8571
                                 SUM OF EIGENVALUES ............ 2.8571

HISTOGRAM OF THE FIRST 20 EIGENVALUES
+--------+------------+------------+------------+
| NUMBER | EIGENVALUE | PERCENTAGE | CUMULATED  |
|        |            |            | PERCENTAGE |
+--------+------------+------------+------------+
|    1   |   0.2703   |    9.46    |    9.46    | ********************************************************************************
|    2   |   0.2369   |    8.29    |   17.75    | ***********************************************************************
|    3   |   0.2084   |    7.29    |   25.05    | **************************************************************
|    4   |   0.1922   |    6.73    |   31.77    | *********************************************************
|    5   |   0.1846   |    6.46    |   38.23    | *******************************************************
|    6   |   0.1578   |    5.52    |   43.76    | ***********************************************
|    7   |   0.1534   |    5.37    |   49.13    | **********************************************
|    8   |   0.1493   |    5.23    |   54.35    | *********************************************
|    9   |   0.1441   |    5.04    |   59.40    | *******************************************
|   10   |   0.1398   |    4.89    |   64.29    | ******************************************
|   11   |   0.1326   |    4.64    |   68.93    | ****************************************
|   12   |   0.1300   |    4.55    |   73.48    | ***************************************
|   13   |   0.1284   |    4.49    |   77.97    | **************************************
|   14   |   0.1222   |    4.28    |   82.25    | *************************************
|   15   |   0.1070   |    3.74    |   86.00    | ********************************
|   16   |   0.1015   |    3.55    |   89.55    | *******************************
|   17   |   0.0954   |    3.34    |   92.89    | *****************************
|   18   |   0.0821   |    2.87    |   95.76    | *************************
|   19   |   0.0748   |    2.62    |   98.38    | ***********************
|   20   |   0.0462   |    1.62    |  100.00    | **************
+--------+------------+------------+------------+

RESEARCH OF IRREGULARITIES (THIRD DIFFERENCES)
+--------------+--------------+
| IRREGULARITY | IRREGULARITY |
|   BETWEEN    |    VALUE     |
+--------------+--------------+
|    5 --  6   |    -27.77    | ****************************************************
|   14 -- 15   |    -10.42    | ********************
|   17 -- 18   |     -6.67    | *************
|   13 -- 14   |     -5.44    | ***********
|   10 -- 11   |     -3.77    | ********
|    2 --  3   |     -3.66    | *******
|    8 --  9   |     -1.53    | ***
+--------------+--------------+

RESEARCH OF IRREGULARITIES (SECOND DIFFERENCES)
+--------------+--------------+
| IRREGULARITY | IRREGULARITY |
|   BETWEEN    |    VALUE     |
+--------------+--------------+
|    5 --  6   |     22.31    | ****************************************************
|    2 --  3   |     12.28    | *****************************
|   14 -- 15   |      9.83    | ***********************
|    3 --  4   |      8.62    | *********************
|    1 --  2   |      4.94    | ************
|   10 -- 11   |      4.67    | ***********
|   11 -- 12   |      0.90    | ***
|    8 --  9   |      0.81    | **
|    6 --  7   |      0.40    | *
+--------------+--------------+
Irregularity (2nd difference) between 5 and 6 = $[(\lambda_7 - \lambda_6) - (\lambda_6 - \lambda_5)] \times 1000$
The two irregularity tables above are the equivalent of the scree test (or Cattell test). This procedure detects the main irregularities in the histogram of eigenvalues and ranks them by decreasing importance.
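To make the arithmetic concrete, the sketch below recomputes the percentages of inertia and the second differences from the 20 eigenvalues printed in the listing. The function second_differences is a hypothetical helper written for this example. Because the eigenvalues are entered with four decimals, the values differ slightly from those printed by SPAD, and the rule SPAD uses to choose which irregularities appear in its tables is not described here, so the ranked list below may contain entries that the listing omits.

```python
# Eigenvalues copied from the listing (rounded to four decimals)
eigenvalues = [
    0.2703, 0.2369, 0.2084, 0.1922, 0.1846, 0.1578, 0.1534, 0.1493,
    0.1441, 0.1398, 0.1326, 0.1300, 0.1284, 0.1222, 0.1070, 0.1015,
    0.0954, 0.0821, 0.0748, 0.0462,
]

# Percentage of inertia carried by each axis
trace = sum(eigenvalues)
percentages = [100 * ev / trace for ev in eigenvalues]

def second_differences(ev):
    """Map k to the second difference labelled 'k -- k+1' (1-based k)."""
    out = {}
    for k in range(1, len(ev) - 1):
        lam_k, lam_k1, lam_k2 = ev[k - 1], ev[k], ev[k + 1]
        out[k] = ((lam_k2 - lam_k1) - (lam_k1 - lam_k)) * 1000
    return out

diffs = second_differences(eigenvalues)
# Keep the positive values and rank them by decreasing size
ranked = sorted(((k, v) for k, v in diffs.items() if v > 0), key=lambda kv: -kv[1])

print(f"trace = {trace:.4f}, first axis = {percentages[0]:.2f} %")
for k, v in ranked[:4]:
    print(f"{k:2d} -- {k + 1:2d} : {v:6.2f}")
```

A large positive second difference between k and k+1 marks an elbow in the histogram, which suggests retaining the first k axes.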
LOADINGS, CONTRIBUTIONS AND SQUARED COSINES OF ACTIVE CATEGORIES
AXES 1 TO 5
+------------------------------------------+-------------------------------+--------------------------+--------------------------+
| CATEGORIES                               | LOADINGS                      | CONTRIBUTIONS            | SQUARED COSINES          |
| IDEN - LABEL              REL. WT. DISTO |     1     2     3     4     5 |    1    2    3    4    5 |    1    2    3    4    5 |
+------------------------------------------+-------------------------------+--------------------------+--------------------------+
| 11 . Gender                              |
| masc - male                  6.70   1.13 | -0.29  0.08  0.43 -0.47 -0.25 |  2.1  0.2  6.0  7.6  2.3 | 0.07 0.01 0.16 0.19 0.06 |
| fémi - gender                7.59   0.88 |  0.26 -0.07 -0.38  0.41  0.22 |  1.8  0.2  5.3  6.7  2.0 | 0.07 0.01 0.16 0.19 0.06 |
+------------------------------------------+------- CUMULATED CONTRIBUTION =  3.9  0.3 11.2 14.4  4.3 +--------------------------+
| 29 . Do you own some securities ?        |
| vmo1 - Yes                   1.73   7.26 |  0.69  1.46 -0.25 -0.23  0.06 |  3.1 15.5  0.5  0.5  0.0 | 0.07 0.29 0.01 0.01 0.00 |
| vmo2 - No                   12.56   0.14 | -0.10 -0.20  0.03  0.03 -0.01 |  0.4  2.1  0.1  0.1  0.0 | 0.07 0.29 0.01 0.01 0.00 |
+------------------------------------------+------- CUMULATED CONTRIBUTION =  3.5 17.6  0.6  0.6  0.0 +--------------------------+
| 39 . Urban area size (number of inhabitants) |
| agg1 - Lower than 2.000      1.19  11.05 | -1.06  0.83 -1.06  0.75 -0.06 |  5.0  3.4  6.4  3.5  0.0 | 0.10 0.06 0.10 0.05 0.00 |
| agg2 - 2.000 - 20.000        1.24  10.49 | -0.55  0.26  0.28  0.80 -0.61 |  1.4  0.3  0.5  4.2  2.5 | 0.03 0.01 0.01 0.06 0.04 |
| ag