
Cours add-r1-part1

Date post: 22-Nov-2014

Arthur CHARPENTIER - Data Analysis

Data Analysis (1)

Principal Component Analysis

Arthur Charpentier

http://perso.univ-rennes1.fr/arthur.charpentier/

blog.univ-rennes1.fr/arthur.charpentier/

Master 2, Université Rennes 1


Introduction to data analysis

In this course we will mainly look at two types of methods:

• factorial methods, which aim to reduce the number of variables by summarising them in a small number of synthetic components
  ◦ in particular Principal Component Analysis (ACP, Analyse en Composantes Principales) when the variables are quantitative
  ◦ in particular Correspondence Analysis (AC, Analyse des Correspondances) when the variables are qualitative and we look for links between their categories: simple Correspondence Analysis (ACF, Analyse des Correspondances Factorielles) when there are 2 variables, and Multiple Correspondence Analysis (ACM, Analyse des Correspondances Multiples) when there are more than 2 variables


• classification methods, which aim to reduce the size of the set of individuals by grouping them into a small number of homogeneous groups
  ◦ in particular Hierarchical Agglomerative Clustering (CAH, Classification Ascendante Hiérarchique) ...
  ◦ in particular Discriminant Analysis ...

Remark: This course is more a linear algebra course than a probability or statistics course. An interpretation in terms of mean or variance (or even covariance) will nevertheless sometimes be possible.


Example: cities and (in)security

“Le palmarès des départements : où vit-on en sécurité ?”, in L’Express (no. 2589, 15 February 2001)

• infra: total number of offences per 1,000 inhabitants (2000)
• vvi: number of thefts with violence per 1,000 inhabitants (2000)
• auto: number of car thefts per 1,000 inhabitants (2000)

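> # Read the table from the course website; the first row holds the variable names
> add=read.table("http://perso.univ-rennes1.fr/arthur.charpentier/securite.txt",header=TRUE)
> # Drop the first column and label the rows with the département code add$dep
> base=add[,2:ncol(add)]
> rownames(base)=add$dep
> # Keep only the three variables of interest (infra, vvi, auto)
> base=base[,c(1,6,9)]
> head(base)
    infra  vvi auto
D1  44.11 0.27 4.47
D2  45.97 0.55 4.39
D3  38.83 0.41 2.39
D4  49.68 0.21 4.17
D5  47.67 0.33 2.35
D6 109.21 4.10 8.83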
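The download above depends on a URL that may no longer resolve. As a minimal offline sketch, the six rows printed by head(base) can be typed in by hand and explored in the same way. The name base6 is hypothetical, and six départements are only a fragment of the full table, so any statistic computed on them will differ from the values obtained on the complete data.

```r
# Hypothetical offline stand-in: the six rows shown by head(base) above.
base6 <- data.frame(
  infra = c(44.11, 45.97, 38.83, 49.68, 47.67, 109.21),
  vvi   = c(0.27, 0.55, 0.41, 0.21, 0.33, 4.10),
  auto  = c(4.47, 4.39, 2.39, 4.17, 2.35, 8.83),
  row.names = paste0("D", 1:6)
)

# Column means and standard deviations: infra lives on a much larger
# scale than vvi and auto.
colMeans(base6)
sapply(base6, sd)

# Pairwise scatter plots of the three variables (one panel per pair).
pairs(base6)
```

With the full table, the same pairs() call on base would draw the three pairwise clouds (infra against vvi, infra against auto, vvi against auto) in a single display.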

[Figure: scatter plot of thefts with violence (vols avec violence, 0 to 10) against total offences (infractions, 20 to 140), shown next to a 3D scatter plot of the three variables (infractions, vols avec violence, vols automobile).]

[Figure: scatter plot of car thefts (vols automobile, 0 to 14) against total offences (infractions, 20 to 140), shown next to a 3D scatter plot of the three variables (infractions, vols avec violence, vols automobile).]

[Figure: scatter plot of car thefts (vols automobile, 0 to 14) against thefts with violence (vols avec violence, 0 to 10), shown next to a 3D scatter plot of the three variables (infractions, vols avec violence, vols automobile).]

The variables appear to be positively correlated:

> cor(base)
          infra       vvi      auto
infra 1.0000000 0.8583172 0.7808855
vvi   0.8583172 1.0000000 0.5032206
auto  0.7808855 0.5032206 1.0000000
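Since the course is presented as mostly linear algebra, it may help to see the correlation matrix written as a matrix product. The sketch below assumes only the six rows printed by head(base), stored under the hypothetical name base6, so its numbers will not match the full-table matrix above. It standardises each column and checks that the cross-product of the standardised matrix, divided by n - 1, agrees with cor().

```r
# Hypothetical stand-in for the data: the six rows printed by head(base).
base6 <- data.frame(
  infra = c(44.11, 45.97, 38.83, 49.68, 47.67, 109.21),
  vvi   = c(0.27, 0.55, 0.41, 0.21, 0.33, 4.10),
  auto  = c(4.47, 4.39, 2.39, 4.17, 2.35, 8.83),
  row.names = paste0("D", 1:6)
)

X <- as.matrix(base6)
n <- nrow(X)

# Centre each column and divide it by its standard deviation
# (scale() uses the n - 1 denominator, like sd()).
Z <- scale(X)

# Correlation matrix as a matrix product: t(Z) %*% Z / (n - 1).
R <- crossprod(Z) / (n - 1)

# Compare with the built-in computation.
all.equal(R, cor(X), check.attributes = FALSE)
```

Each entry of R is the scalar product of two standardised columns, which is the sense in which correlation can be read as a cosine between centred variables.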

Suppose we want to group together “nearby” cities.

⇒ Since it is hard to see anything in ℝ³, we will try to project the point cloud:

• projection onto an axis (a line)
• projection onto a plane
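The two projections can be sketched numerically. The slide does not say at this point which axis or plane to use; the sketch below assumes the choice that PCA leads to, namely the directions given by the leading eigenvectors of the covariance matrix. It again uses only the six rows printed by head(base), under the hypothetical name base6, and works on the unstandardised variables, which is itself an assumption.

```r
# Hypothetical stand-in for the data: the six rows printed by head(base).
base6 <- data.frame(
  infra = c(44.11, 45.97, 38.83, 49.68, 47.67, 109.21),
  vvi   = c(0.27, 0.55, 0.41, 0.21, 0.33, 4.10),
  auto  = c(4.47, 4.39, 2.39, 4.17, 2.35, 8.83),
  row.names = paste0("D", 1:6)
)

X  <- as.matrix(base6)
Xc <- scale(X, center = TRUE, scale = FALSE)  # centred point cloud

# Eigen-decomposition of the covariance matrix: orthonormal directions,
# eigenvalues sorted in decreasing order.
e  <- eigen(cov(X))
u1 <- e$vectors[, 1]      # direction of the chosen axis
U2 <- e$vectors[, 1:2]    # orthonormal basis of the chosen plane

# Projection onto the axis: one coordinate per individual.
coord_axis <- Xc %*% u1

# Projection onto the plane: two coordinates per individual.
coord_plane <- Xc %*% U2

# Share of the total variance kept by each projection.
e$values[1] / sum(e$values)
sum(e$values[1:2]) / sum(e$values)
```

The variance of coord_axis equals the first eigenvalue, so the last two lines measure how much of the spread of the cloud survives each projection.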

[Figures (two slides): 3D scatter plots of the point cloud, with axes total offences (infractions, 20 to 140), thefts with violence (vols avec violence, 0 to 10) and car thefts (vols automobile, 0 to 14).]

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●

●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●

10

Page 11: Cours add-r1-part1

Example: cities and (in)security

[Figure: three-axis scatter plot of the observations; axes: infractions (offences), vols avec violence (violent thefts), vols automobile (car thefts)]

⟹ Search for the "most representative" projection (cf. the least-squares idea): the one that minimizes the projection error incurred.

Page 12: Cours add-r1-part1

Example: cities and (in)security

Why not project onto a plane?

[Figure: two panels of the same three-axis scatter plot; axes: infractions (offences), vols avec violence (violent thefts), vols automobile (car thefts)]

Page 13: Cours add-r1-part1

Example: cities and (in)security

Perhaps the axes should be normalized to make them comparable?

[Figure: two panels of the scatter plot on rescaled axes (ticks now ranging from about −2 to 7); axes: infractions (offences), vols avec violence (violent thefts), vols automobile (car thefts)]

Page 14: Cours add-r1-part1

Analysis of the "best" projection

[Figure: the individuals projected onto a plane, labelled D1 to D19 and D21 to D95; grid label d = 2]

Page 15: Cours add-r1-part1

Analysis of the "best" projection

[Figure: two panels of the projected individuals (labels D1 to D19 and D21 to D95; grid label d = 2), with legend ticks 40, 60, 80, 100, 120; panel title: Infractions (total)]

Page 16: Cours add-r1-part1

Analysis of the "best" projection

[Figure: two panels of the projected individuals (labels D1 to D19 and D21 to D95; grid label d = 2), with legend ticks 2, 4, 6, 8; panel title: Vols avec violence (violent thefts)]

Page 17: Cours add-r1-part1

Analysis of the "best" projection

[Figure: two panels of the projected individuals (labels D1 to D19 and D21 to D95; grid label d = 2), with legend ticks 2, 4, 6, 8, 10, 12; panel title: Vols d'automobiles (car thefts)]

Page 18: Cours add-r1-part1

A bit of Euclidean geometry

We observe $n$ individuals and $q$ variables (quantitative, with values in $\mathbb{R}$).

The point clouds can be viewed in two ways:
– the space of individuals, i.e. $\mathbb{R}^q$
– the space of variables, i.e. $\mathbb{R}^n$

Let $x_{ij}$ denote the observation of the $j$-th variable on the $i$-th individual.

$$
\begin{array}{c|ccccc}
\text{individuals} \backslash \text{variables} & 1 & \cdots & j & \cdots & q \\
\hline
1 & x_{11} & \cdots & x_{1j} & \cdots & x_{1q} \\
\vdots & \vdots & & \vdots & & \vdots \\
i & x_{i1} & \cdots & x_{ij} & \cdots & x_{iq} \\
\vdots & \vdots & & \vdots & & \vdots \\
n & x_{n1} & \cdots & x_{nj} & \cdots & x_{nq}
\end{array}
$$

Page 19: Cours add-r1-part1

A bit of Euclidean geometry

Each individual is characterized by $L_i = (x_{i1}, \cdots, x_{iq})^t$, an element of $\mathbb{R}^q$ expressed in the canonical basis $\{e_1, \cdots, e_q\}$.

Definition 1. The vector space $\mathbb{R}^q$ equipped with the basis $\{e_1, \cdots, e_q\}$, in which the individuals are represented as points, is called the space of individuals.

⟹ How do we measure the distance between two individuals?

Page 20: Cours add-r1-part1

Distance between individuals

Definition 2. Let $D$ be a $q \times q$ diagonal matrix whose diagonal entries are strictly positive ($d_{ii} > 0$ for $i = 1, \cdots, q$). Then the function $\varphi : \mathbb{R}^q \times \mathbb{R}^q \to \mathbb{R}$ defined by

$$(u, v) \mapsto u^t D v = \sum_{j=1}^{q} d_{jj} u_j v_j$$

is an inner product, denoted $\langle \cdot, \cdot \rangle_D$.

Definition 3. Let $D$ be such a $q \times q$ diagonal matrix, and $\langle \cdot, \cdot \rangle_D$ the associated inner product. We denote by $\| \cdot \|_D$ the associated norm,

$$\|u\|_D = \sqrt{\langle u, u \rangle_D} = \sqrt{\sum_{j=1}^{q} d_{jj} u_j^2}$$

and by $d_D(\cdot, \cdot)$ the associated distance,

$$d_D(u, v) = \|u - v\|_D.$$
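As a minimal sketch of the inner product, norm and distance associated with a diagonal matrix $D$, assuming $D$ is stored as the plain list of its diagonal entries (the function names `inner_d`, `norm_d` and `dist_d` are illustrative choices, not taken from the course):

```python
import math

def inner_d(u, v, d):
    """Inner product <u, v>_D = sum_j d_jj * u_j * v_j for a diagonal D.

    `d` is the list of diagonal entries of D (all must be > 0).
    """
    if not (len(u) == len(v) == len(d)):
        raise ValueError("u, v and d must have the same length")
    if any(djj <= 0 for djj in d):
        raise ValueError("diagonal entries of D must be strictly positive")
    return sum(djj * uj * vj for djj, uj, vj in zip(d, u, v))

def norm_d(u, d):
    """Norm ||u||_D = sqrt(<u, u>_D)."""
    return math.sqrt(inner_d(u, u, d))

def dist_d(u, v, d):
    """Distance d_D(u, v) = ||u - v||_D."""
    diff = [uj - vj for uj, vj in zip(u, v)]
    return norm_d(diff, d)
```

For instance, with `d = [4.0, 9.0]`, the vector `[1.0, 0.0]` has $D$-norm 2 although its Euclidean norm is 1: the first coordinate is stretched by a factor $\sqrt{4}$.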

Page 21: Cours add-r1-part1

Examples of inner products

• $D = \mathrm{Id}$ corresponds to the canonical inner product, $\langle u, v \rangle_{\mathrm{Id}} = \sum_{j=1}^{q} u_j v_j$

• Consider the inner product associated with

$$D = \begin{pmatrix} 3/4 & 0 \\ 0 & 1/4 \end{pmatrix}$$

The points at the same distance from the origin $0$ are the points $M = (x, y) \in \mathbb{R}^2$ such that

$$\|0M\|_D = \alpha > 0, \quad \text{i.e.} \quad \frac{3}{4} x^2 + \frac{1}{4} y^2 = \alpha^2,$$

that is, an ellipse in $\mathbb{R}^2$.
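To make the second example concrete, the sketch below fixes the distance to 1 and checks numerically that every point $(\tfrac{2}{\sqrt{3}} \cos t,\ 2 \sin t)$ has $D$-norm 1, while its usual Euclidean norm varies with $t$. The parametrization and the names `norm_d` and `ellipse_point` are illustrative assumptions, not part of the course.

```python
import math

D_DIAG = (3 / 4, 1 / 4)  # diagonal of the matrix D from the example

def norm_d(x, y):
    """D-norm of the point M = (x, y): sqrt(3/4 x^2 + 1/4 y^2)."""
    return math.sqrt(D_DIAG[0] * x * x + D_DIAG[1] * y * y)

def ellipse_point(t):
    """Point at D-distance 1 from the origin, parametrized by the angle t."""
    return (2 / math.sqrt(3) * math.cos(t), 2 * math.sin(t))

# Twelve regularly spaced angles around the ellipse.
angles = [k * math.pi / 6 for k in range(12)]
d_norms = [norm_d(*ellipse_point(t)) for t in angles]
euclid_norms = [math.hypot(*ellipse_point(t)) for t in angles]
```

All entries of `d_norms` equal 1 up to rounding, whereas `euclid_norms` ranges between $2/\sqrt{3}$ (on the $x$ axis) and 2 (on the $y$ axis): the unit "circle" of $D$ is an ellipse for the canonical metric.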

Page 22: Cours add-r1-part1

Deformation of the space

[Figure: two panels, both axes ranging from −2 to 2; panel titles: "Canonical inner product, Id" and "Inner product associated with the matrix D"]

Page 23: Cours add-r1-part1

Arthur CHARPENTIER - Analyse des donnees

Les metriques usuelles

Il y a fondamentalement trois types de metriques a retenir,

• la metrique usuelle i.e. M = I, la matrice identie

Dans ce cas, la distance depend de l’unite de mesure, et de la dispersion desvariables.

• la metrique reduite i.e. M = diag(s−21 , · · · , s−2

q ), la matrice diagonale desinverses des variances empiriques

Rappelons que pour une serie d’observations {x1, · · · , xq}, la moyenne(empirique) est

mx = x =1n

n∑i=1

xi

et que la variance (empirique) est

s2x =1n

n∑i=1

(xi − x)2 =1n

n∑i=1

x2i − x2.

Finally, recall that the covariance between $x$ and $y$ is

$$s_{xy} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n} \sum_{i=1}^{n} x_i y_i - \bar{x}\,\bar{y}.$$

The (Pearson) correlation is the quantity

$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \cdot \sum_{i=1}^{n} (y_i - \bar{y})^2}}.$$

• the transformed metric, i.e. $M = T'T$.

This is equivalent to working with the classical metric $I$ on the transformed table $XT'$.

Note that for every symmetric positive matrix $M$ there exists such a matrix $T$, called a square root of $M$.
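The equivalence between a metric $M = T'T$ and the identity metric on $XT'$ can be checked on simulated data. This is a sketch under stated assumptions: the data are simulated, the reduced metric is used as the example of $M$, and the Cholesky factor returned by `chol()` is taken as one possible square root $T$ (the course does not prescribe a particular one).

```r
set.seed(1)
n <- 50
# Two variables with very different scales (simulated, illustrative data)
X <- cbind(rnorm(n, mean = 170, sd = 10), rnorm(n, mean = 1.7, sd = 0.1))

# Empirical moments with the 1/n convention used in the course
# (R's var() and sd() divide by n - 1 instead).
m  <- colMeans(X)
s2 <- colMeans(sweep(X, 2, m)^2)

# Reduced metric M = diag(1 / s_j^2) and a matrix Tm with M = Tm' Tm
M  <- diag(1 / s2)
Tm <- chol(M)                 # upper triangular, t(Tm) %*% Tm equals M

# Squared M-distance between individuals 1 and 2 ...
d  <- X[1, ] - X[2, ]
dM <- as.numeric(t(d) %*% M %*% d)

# ... equals the squared Euclidean distance on the transformed table X Tm'
XT <- X %*% t(Tm)
dI <- sum((XT[1, ] - XT[2, ])^2)

all.equal(dM, dI)
```

For a diagonal $M$ the factor `Tm` is simply $\mathrm{diag}(1/s_j)$, so `XT` is the table of variables divided by their standard deviations.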

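Deformation of the space

Proposition 4. Equipping the space with the metric derived from a diagonal $q \times q$ matrix $D$ is equivalent to assigning the weights $\{\sqrt{d_{11}}, \dots, \sqrt{d_{qq}}\}$ to the $q$ variables and using the canonical metric.

Proof. For all $u, v \in \mathbb{R}^q$,

$$\langle u, v \rangle_D = u^t D v = \sum_{j=1}^{q} d_{jj} u_j v_j = \sum_{j=1}^{q} \left(\sqrt{d_{jj}}\, u_j\right)\left(\sqrt{d_{jj}}\, v_j\right),$$

so that $\langle u, v \rangle_D = \langle \tilde{u}, \tilde{v} \rangle_{\mathrm{Id}}$, where $\tilde{u} = (\tilde{u}_1, \dots, \tilde{u}_q)$ with $\tilde{u}_j = \sqrt{d_{jj}}\, u_j$ (and similarly for $\tilde{v}$).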

The variables, the case of dimension 2

Here we want to measure a distance, or a proximity, between variables. Intuitively, this notion should be close to that of correlation.

Let $X_1$ and $X_2$ be two continuous variables.

Remark. Regression studies the link between two variables with a view to using one of them to predict the other.

[Figure: two scatter plots of the same point cloud, both with axes from −2 to 2.]

Here we are more interested in (orthogonal) projections. We then speak of the principal direction of the cloud.

[Figure: two scatter plots of the same point cloud, both with axes from −2 to 2.]

It can be shown that this axis passes through the centre of gravity of the cloud (like the other two regressions).

To simplify, change coordinates: $Y_1 = X_1 - \bar{X}_1$ and $Y_2 = X_2 - \bar{X}_2$. Denote by $O$ this barycentre, by $X_i$ the original points and by $P_i$ their orthogonal projections. We want to minimise

$$I = \sum_{i=1}^{n} \|X_i P_i\|^2 = \sum_{i=1}^{n} \left( \|O X_i\|^2 - \|O P_i\|^2 \right) \quad \text{(called the inertia)},$$

by orthogonality. Since the points $O$ and $X_i$ are fixed, if $u = (a, b)$ denotes the direction vector of the axis, assumed to be of unit length, minimising $I$ amounts to maximising

$$I_2 = \sum_{i=1}^{n} \|O P_i\|^2 = (Yu)'(Yu) = u'(Y'Y)u = u'(n\Sigma)u,$$

where $\Sigma$ is the variance-covariance matrix of $Y$ (and hence of $X$).

$\Sigma$ is symmetric, so it always has two eigenvalues and two eigenvectors, and

$$\Sigma = U \Lambda U' = \begin{pmatrix} u_{1,1} & u_{1,2} \\ u_{2,1} & u_{2,2} \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} u_{1,1} & u_{1,2} \\ u_{2,1} & u_{2,2} \end{pmatrix}',$$

where $U$ is an orthogonal matrix. Hence, up to the factor $n$,

$$I_2 = \lambda_1 \alpha^2 + \lambda_2 \beta^2 \le \max\{\lambda_1, \lambda_2\} \underbrace{[\alpha^2 + \beta^2]}_{=1},$$

where $(\alpha, \beta) = U'u$ are the new coordinates of $u$.

The inertia therefore cannot exceed the largest eigenvalue (assumed to be $\lambda_1$), and it reaches this value when $u$ is the first eigenvector.

⟹ the principal axis of a bivariate point cloud is the eigenvector associated with the largest eigenvalue of the variance-covariance matrix of the two variables.

This result generalises to higher dimensions.
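The bivariate result can be illustrated numerically. The following sketch uses simulated data (an assumption of this example) and compares the largest eigenvalue of $\Sigma$ with the best value of $u'\Sigma u$ found on a grid of unit directions.

```r
set.seed(42)
n  <- 200
X1 <- rnorm(n)
X2 <- 0.8 * X1 + rnorm(n, sd = 0.5)
Y  <- scale(cbind(X1, X2), center = TRUE, scale = FALSE)  # centered cloud

Sigma <- crossprod(Y) / n       # variance-covariance matrix (1/n convention)
eig   <- eigen(Sigma, symmetric = TRUE)
u1    <- eig$vectors[, 1]       # unit eigenvector for the largest eigenvalue

# Inertia (divided by n) carried by the unit direction u = (cos a, sin a)
explained <- function(a) {
  u <- c(cos(a), sin(a))
  as.numeric(t(u) %*% Sigma %*% u)
}
angles <- seq(0, pi, length.out = 1000)
best   <- max(sapply(angles, explained))

# The grid maximum never exceeds lambda_1 and comes very close to it
c(lambda1 = eig$values[1], grid_max = best)
```

The grid search is only there to make the inequality visible; in practice `eigen()` gives the principal axis directly.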

The space of variables

In the same way, each variable is characterized by $C_j = (x_{1j}, \dots, x_{nj})^t$, a vector of $\mathbb{R}^n$ expressed in the canonical basis $\{f_1, \dots, f_n\}$.

Generally, in the space of variables, the same weight is given to each individual.

Projecting a point cloud

In R, the following code can be used:

> library(mnormt)   # rmnorm(): multivariate Gaussian simulation
> library(rgl)      # plot3d(), ellipse3d(): interactive 3D graphics
> mu <- c(0,0,0)
> Sigma <- matrix(c(1,0.5,0.4,0.5,1,-0.5,0.4,-0.5,1), 3, 3)
> Z <- rmnorm(80, mu, Sigma)   # 80 points drawn from N(mu, Sigma)
> plot3d(Z,type="s",col="blue")
> # ellipsoid built from the empirical correlation matrix of Z
> plot3d(ellipse3d(cor(Z)),col="light green",alpha=0.5,add=TRUE)

⟹ finding the principal axes is related to finding the axes of the ellipse.

Projecting a point cloud

Caution: points that are close in $\mathbb{R}^k$ have close projections, but two points whose projections are close are not necessarily close.

Projecting points, the notion of inertia

Consider the data table $X = (x_{ij})_{1 \le i \le n, 1 \le j \le q} = \{L_1, \dots, L_n\}$.

The space of individuals ($\mathbb{R}^q$) is equipped with the metric derived from $D$.

Definition 5. The inertia of the point cloud $\{L_1, \dots, L_n\}$ is the quantity

$$I(X, D) = \sum_{i=1}^{n} d_i \|L_i\|_D^2 = \sum_{i=1}^{n} \sum_{j=1}^{q} d_i D_{jj} x_{ij}^2,$$

where $d_i$ is the weight of individual $i$.

⟹ we look for projection axes or planes such that the inertia is maximal.
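Definition 5 translates into a one-line function. In this sketch the function name `inertia`, the uniform default weights $d_i = 1/n$ and the simulated data are assumptions made for illustration.

```r
# Inertia of a cloud: rows of X are the individuals L_i, w holds the
# individual weights d_i, and D is the diagonal metric on R^q.
inertia <- function(X, D, w = rep(1 / nrow(X), nrow(X))) {
  sum(w * rowSums((X %*% D) * X))
}

set.seed(1)
X <- matrix(rnorm(30 * 3), nrow = 30, ncol = 3)
Y <- scale(X, center = TRUE, scale = FALSE)   # centered cloud

# With D = Id and uniform weights 1/n, the inertia of the centered cloud
# is the sum of the empirical variances (1/n convention).
inertia(Y, diag(3))
sum(colMeans(Y^2))
```

This gives the interpretation announced at the start of the course: for a centered cloud, inertia is a total variance.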

Projection onto a plane

[Figure, repeated with small variations over seven consecutive slides. Left: plane of projection, in dimension 3 (axes from 0 to 6). Right: projection of the point cloud onto the plane.]

The inertia explained by an axis

Consider the data table $X = (x_{ij})_{1 \le i \le n, 1 \le j \le q} = \{L_1, \dots, L_n\}$.

The space of individuals ($\mathbb{R}^q$) is equipped with the metric derived from $D$.

Definition 6. Let $u \in \mathbb{R}^q$. The inertia of the point cloud $\{L_1, \dots, L_n\}$ explained by the axis $u$ is the quantity $I(X, u, D)$, the inertia of the cloud projected orthogonally onto $u$ (for $\langle \cdot, \cdot \rangle_D$).

By Pythagoras' theorem,

total inertia ≥ inertia explained by the axis $u$.

Consider the case of the projection of $\mathbb{R}^2$ onto an axis $u$.
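For the case of $\mathbb{R}^2$, the inequality can be checked directly. The sketch below assumes simulated data, an arbitrary diagonal metric, an arbitrary direction, and individual weights all equal to 1.

```r
set.seed(7)
X <- matrix(rnorm(40 * 2), ncol = 2)    # cloud in R^2 (simulated)
D <- diag(c(2, 0.5))                    # diagonal metric (arbitrary values)

# Normalise a direction so that its D-norm equals 1
u <- c(1, 1)
u <- u / sqrt(as.numeric(t(u) %*% D %*% u))

coords    <- X %*% D %*% u              # <L_i, u>_D for every individual
explained <- sum(coords^2)              # inertia of the projected cloud
total     <- sum((X %*% D) * X)         # sum of the squared D-norms

explained <= total                      # Pythagoras: always TRUE
```

The gap `total - explained` is the inertia lost by the projection, i.e. the sum of the squared $D$-distances between the points and the axis.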

The (first) principal axis

Consider the data table $X = (x_{ij})_{1 \le i \le n, 1 \le j \le q} = \{L_1, \dots, L_n\}$.

The space of individuals ($\mathbb{R}^q$) is equipped with the metric derived from $D$.

Definition 7. The principal axis, or first principal axis, of a cloud of individuals $\{L_1, \dots, L_n\}$ is a unit vector $u^\star \in \mathbb{R}^q$ that maximises the inertia $I(X, u, D)$ (for $\langle \cdot, \cdot \rangle_D$).

We therefore look for

$$u^\star = \mathrm{argmax}\{u' D X' X D u\}, \quad \text{with } \|u\|_D = 1.$$

Setting $v = D^{1/2} u$, this problem is equivalent to looking for

$$v^\star = \mathrm{argmax}\{v' D^{1/2} X' X D^{1/2} v\}, \quad \text{with } \|v\| = 1, \qquad (1)$$

the last norm being the Euclidean norm.

Proposition 8. The unit vector $v^\star \in \mathbb{R}^q$ solving (1) is the eigenvector associated with the largest eigenvalue of the matrix $(X D^{1/2})'(X D^{1/2})$.

Proof. $v^\star$ is necessarily an eigenvector since (using the Lagrangian to characterise the optimum)

$$(X D^{1/2})'(X D^{1/2}) v^\star - \lambda v^\star = 0.$$

Recall that $(X D^{1/2})'(X D^{1/2})$ can be diagonalised in an orthonormal basis (since it is real symmetric). Let $\lambda_1 > \dots > \lambda_k$ be all its eigenvalues, i.e. $(X D^{1/2})'(X D^{1/2}) v_k = \lambda_k v_k$. For a unit eigenvector $v_k$, the quantity to maximise, $v' D^{1/2} X' X D^{1/2} v$, equals $\lambda_k$; hence $v^\star = v_1$.

Corollary 9. The $D$-unit vector $u^\star = u_1 \in \mathbb{R}^q$ maximising $I(X, u, D)$ is uniquely defined (up to sign) by $u^\star = D^{-1/2} v_1$, where $v_1$ is the eigenvector associated with the largest eigenvalue of the matrix $(X D^{1/2})'(X D^{1/2})$. The inertia explained by this axis is then $\lambda_1$.
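Corollary 9 can be followed step by step in base R. In this sketch the data are simulated and the diagonal metric is arbitrary; both are assumptions of the example.

```r
set.seed(3)
X <- matrix(rnorm(60 * 3), ncol = 3)
X[, 2] <- X[, 1] + 0.5 * X[, 2]
X <- scale(X, center = TRUE, scale = FALSE)

D      <- diag(c(1, 4, 0.25))           # diagonal metric (arbitrary values)
Dhalf  <- diag(sqrt(diag(D)))           # D^{1/2}
Dmhalf <- diag(1 / sqrt(diag(D)))       # D^{-1/2}

A   <- crossprod(X %*% Dhalf)           # (X D^{1/2})' (X D^{1/2})
eig <- eigen(A, symmetric = TRUE)
v1  <- eig$vectors[, 1]                 # Euclidean-unit eigenvector
u1  <- as.numeric(Dmhalf %*% v1)        # first principal axis

as.numeric(t(u1) %*% D %*% u1)                          # squared D-norm: 1
as.numeric(t(u1) %*% D %*% crossprod(X) %*% D %*% u1)   # inertia: lambda_1
eig$values[1]
```

`eigen()` returns the eigenvalues in decreasing order, so the first column of `eig$vectors` is always the one needed here.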

A result from linear algebra

Proposition 10. The following two statements are equivalent:

• If $E_k$ is the $k$-dimensional subspace carrying the principal inertia, then

$$E_{k+1} = E_k \oplus u_{k+1},$$

where $u_{k+1}$ is the axis (one-dimensional space) $D$-orthogonal to $E_k$ carrying the maximal inertia.

• $E_k$ is spanned by the $k$ eigenvectors of $(X D^{1/2})'(X D^{1/2})$ associated with the $k$ largest eigenvalues.

Hence the PCA with $k + 1$ components is obtained by adding a component of maximal inertia to the PCA with $k$ components. The mechanism is iterative; there is no need to rerun the algorithms.

The other principal axes

The 2nd principal axis is
• an axis orthogonal to $u_1$ for $\langle \cdot, \cdot \rangle_D$,
• maximising the inertia.

In fact, $u_2 = D^{-1/2} v_2$, where $v_2$ is the eigenvector associated with the second largest eigenvalue of the matrix $(X D^{1/2})'(X D^{1/2})$. The inertia explained by this axis is then $\lambda_2$.

Recall that $\langle u_1, u_2 \rangle_D = \langle v_1, v_2 \rangle_{\mathrm{Id}} = 0$.

More generally, the $k$th principal axis is
• an axis orthogonal to $u_1, \dots, u_{k-1}$ for $\langle \cdot, \cdot \rangle_D$,
• maximising the inertia.

In fact, $u_k = D^{-1/2} v_k$, where $v_k$ is the eigenvector associated with the $k$th largest eigenvalue of the matrix $(X D^{1/2})'(X D^{1/2})$. The inertia explained by this axis is then $\lambda_k$.

Recall that $\langle u_j, u_k \rangle_D = \langle v_j, v_k \rangle_{\mathrm{Id}} = 0$ for $j = 1, 2, \dots, k - 1$.
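The $D$-orthogonality of the axes and the inertia carried by each of them can be verified for all axes at once. As before, the simulated data and the diagonal metric are arbitrary choices made for this sketch.

```r
set.seed(3)
X <- matrix(rnorm(60 * 3), ncol = 3)
X <- scale(X, center = TRUE, scale = FALSE)
D <- diag(c(1, 4, 0.25))                # diagonal metric (arbitrary values)

# For a diagonal matrix, the elementwise sqrt(D) is D^{1/2}
eig <- eigen(crossprod(X %*% sqrt(D)), symmetric = TRUE)
V   <- eig$vectors                      # columns v_1, ..., v_q
U   <- diag(1 / sqrt(diag(D))) %*% V    # columns u_k = D^{-1/2} v_k

round(t(U) %*% D %*% U, 10)             # identity: the u_k are D-orthonormal

# Inertia explained by each axis, to compare with the eigenvalues
explained <- diag(t(U) %*% D %*% crossprod(X) %*% D %*% U)
cbind(explained, lambda = eig$values)
```

A single call to `eigen()` therefore yields every principal axis, in line with the iterative construction of Proposition 10.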

The (first) principal plane

Consider the data table $X = (x_{ij})_{1 \le i \le n, 1 \le j \le q} = \{L_1, \dots, L_n\}$.

The space of individuals ($\mathbb{R}^q$) is equipped with the metric derived from $D$.

Definition 11. The principal plane, or first principal plane, of a cloud of individuals $\{L_1, \dots, L_n\}$ is the plane spanned by $u_1, u_2$.

Summary of the methodology

Consider the data table $X = (x_{ij})_{1 \le i \le n, 1 \le j \le q} = \{L_1, \dots, L_n\}$. The space of individuals ($\mathbb{R}^q$) is equipped with the metric derived from $D$.

• diagonalise $(X D^{1/2})'(X D^{1/2})$;
• let $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_q$ be the eigenvalues, and $v_j$ the eigenvectors;
• the principal axes are the $u_j = D^{-1/2} v_j$.

Consider the data table $X = (x_{ij})_{1 \le i \le n, 1 \le j \le q} = \{L_1, \dots, L_n\}$. The space of variables ($\mathbb{R}^n$) is equipped with the metric derived from $\Delta$.

• diagonalise $(\Delta^{1/2} X)'(\Delta^{1/2} X)$;
• let $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_q$ be the eigenvalues, and $\nu_i$ the eigenvectors;
• the principal axes are the $\mu_i = \Delta^{-1/2} \nu_i$.

How many principal axes should be kept?

Recall that the aim is to summarise the information carried by the variables by a "small" number of factors, taking into account the correlations between the variables.

⟹ we want to keep few principal axes, with
• a concern for interpretation: only axes that can be interpreted are kept,
• axes that explain enough inertia. For this, there are two methods:
◦ the elbow method, which corresponds to a break in the sequence of eigenvalues,
◦ Kaiser's rule, for centered and reduced variables: only eigenvalues greater than 1 are kept (this threshold of 1 corresponds to the mean of the eigenvalues).
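Both criteria can be read off the eigenvalues of the correlation matrix. The sketch below uses simulated data with one common factor (an assumption of this example) and base R only.

```r
set.seed(10)
n <- 100
f <- rnorm(n)                            # one common factor (simulated data)
X <- cbind(f + rnorm(n, sd = 0.5), f + rnorm(n, sd = 0.5),
           f + rnorm(n, sd = 0.5), rnorm(n), rnorm(n))

# PCA on centered and reduced variables = eigen-decomposition of the
# correlation matrix
lambda <- eigen(cor(X), symmetric = TRUE)$values

mean(lambda)                 # always 1: the trace equals the number of variables
cumsum(lambda) / sum(lambda) # cumulative share of inertia
which(lambda > 1)            # axes kept by Kaiser's rule

# Scree plot, to look for an "elbow"
plot(lambda, type = "b", xlab = "axis", ylab = "eigenvalue")
abline(h = 1, lty = 2)
```

The dashed line at 1 shows Kaiser's threshold on the same plot as the elbow, which makes it easy to see whether the two criteria agree.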

The principal components

Notation: $x$ denotes the individuals, $y$ the centered individuals, and $z$ the centered and reduced individuals.

The coordinates of a centered individual $y_i$ on a principal axis $\Delta_k$ are obtained by $D$-projection,

$$c_{i,k} = \langle y_i, u_k \rangle_D = y_i' D u_k.$$

Definition 12. The principal components are the variables $c_k$, in $\mathbb{R}^n$, defined by

$$c_k = Y D u_k.$$

These are the coordinates of the $D$-orthogonal projections onto the principal axes.

The principal components

Definition 13. The graphical representation of the cloud of individuals in the principal plane is then the cloud of points $(c_1, c_2)$.

Note that, by construction,

$$\bar{c}_k = 0$$

since the columns of $y$ are centered. Moreover, $\mathrm{Var}(c_k) = \lambda_k$ and $\mathrm{Cov}(c_{k_1}, c_{k_2}) = 0$ for $k_1 \ne k_2$, i.e. the principal components are orthogonal.
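These three properties can be checked on simulated data. This sketch assumes the identity metric $D = \mathrm{Id}$ and uniform weights $1/n$, so that the $\lambda_k$ are the eigenvalues of $Y'Y/n$.

```r
set.seed(5)
X <- matrix(rnorm(80 * 3), ncol = 3)
X[, 2] <- X[, 1] + 0.3 * X[, 2]
n <- nrow(X)
Y <- scale(X, center = TRUE, scale = FALSE)      # centered individuals y

eig <- eigen(crossprod(Y) / n, symmetric = TRUE) # metric D = Id, weights 1/n
U   <- eig$vectors                               # principal axes u_k
C   <- Y %*% U                                   # principal components c_k

colMeans(C)                    # zero, since the columns of Y are centered
crossprod(C) / n               # diagonal matrix with lambda_k on the diagonal
plot(C[, 1], C[, 2], asp = 1, xlab = "c1", ylab = "c2")  # principal plane
```

The last line draws the cloud of individuals in the first principal plane, as in Definition 13.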

Centered and reduced data

It can sometimes be relevant to work with the metric $D_{1/s^2}$, since distances are very sensitive to the units of the variables (and hence to their dispersion).

Recall that working with the metric $D_{1/s^2}$ on the cloud $y$ is equivalent to working with the usual metric $I$ on the cloud of centered and reduced points.

Definition 14. The centered and reduced cloud is the table $Z$ containing the

$$z_{i,j} = \frac{x_{i,j} - \bar{x}_j}{s_j},$$

i.e. $z = (x - \bar{x}) D_{1/s} = y D_{1/s}$.

The "correlation circle"

Assume that the space of variables is equipped with a metric $D$. We take the metric of the weights. Then, for centered variables,

$$s_x^2 = \mathrm{Var}(x) = \|x\|_D^2 \quad \text{and} \quad s_{xy} = \mathrm{cov}(x, y) = \langle x, y \rangle_D.$$

Moreover,

$$r(x, y) = \frac{\langle x, y \rangle_D}{\|x\|_D \|y\|_D}.$$

If the variables are centered and reduced, the correlation between a principal component $c_k$ and a variable $z_j$, where $z = (x - \bar{x}) D_{1/s}$, is

$$r(z_j, c_k) = \frac{\mathrm{cov}(z_j, c_k)}{\sqrt{\mathrm{Var}(c_k)}} = \frac{z_j' D c_k}{\sqrt{\lambda_k}},$$

so the vector of correlations of the factor $c_k$ with all the variables $z$ is

$$r(z, c_k) = \frac{z' D c_k}{\sqrt{\lambda_k}}.$$

Now, since $z' D c_k = z' D z\, u_k = \lambda_k u_k$, we simply deduce that

$$r(z, c_k) = \sqrt{\lambda_k}\, u_k.$$

From this expression, note that

$$\sum_{k=1}^{q} r(z_j, c_k)^2 = \|z_j\|_D^2 = 1,$$

and therefore, in particular, $r(z_j, c_1)^2 + r(z_j, c_2)^2 \le 1$.

Definition 15. The correlation circle (e.g. in the principal plane) is the cloud of points $(r(z_j, c_1), r(z_j, c_2))$ for $j = 1, \dots, ????$ (one point per variable), where the variables are projected.

The notion of "circle" comes from the first property. But the proximity between points can only be interpreted near the edge of the circle.
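The identities behind the correlation circle can be verified numerically. In this sketch the data are simulated, the variable names A to D are arbitrary, and the weights are uniform ($1/n$).

```r
set.seed(8)
n <- 100
f <- rnorm(n)
X <- cbind(A = f + rnorm(n, sd = 0.4), B = f + rnorm(n, sd = 0.8),
           C = -f + rnorm(n), D = rnorm(n))

# Centered and reduced variables, with the 1/n convention for s_j
Y <- sweep(X, 2, colMeans(X))
Z <- sweep(Y, 2, sqrt(colMeans(Y^2)), "/")

eig    <- eigen(crossprod(Z) / n, symmetric = TRUE)   # correlation matrix
lambda <- eig$values
C      <- Z %*% eig$vectors                           # principal components

# Correlations r(z_j, c_k), one row per variable, one column per component
R <- cor(Z, C)
all.equal(unname(R), eig$vectors %*% diag(sqrt(lambda)))  # r = sqrt(lambda_k) u_k
rowSums(R^2)                                              # all equal to 1

# Correlation circle in the first principal plane
plot(R[, 1:2], xlim = c(-1, 1), ylim = c(-1, 1), asp = 1,
     xlab = "Comp1", ylab = "Comp2")
text(R[, 1:2], labels = colnames(X), pos = 3)
symbols(0, 0, circles = 1, inches = FALSE, add = TRUE)
```

A variable plotted well inside the circle is poorly represented in the plane, which is why proximities are only read near the edge.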

The contributions of the individuals

We noted that

$$\lambda_k = \frac{1}{n} \sum_{i=1}^{n} c_{i,k}^2.$$

Definition 16. The contribution of an individual $i$ to an axis $k$ is the quantity

$$\frac{c_{i,k}^2}{n \lambda_k}.$$

The contribution is considered large if it exceeds the weight $1/n$ of the individual, i.e. if $|c_{i,k}| > \sqrt{\lambda_k}$.
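Contributions are straightforward to compute from the coordinates. The sketch below assumes simulated data with one deliberately atypical individual, the identity metric and uniform weights $1/n$.

```r
set.seed(11)
n <- 50
X <- matrix(rnorm(n * 3), ncol = 3)
X[1, ] <- c(6, 6, 6)                      # one deliberately atypical individual
Y <- scale(X, center = TRUE, scale = FALSE)

eig    <- eigen(crossprod(Y) / n, symmetric = TRUE)
lambda <- eig$values
C      <- Y %*% eig$vectors               # coordinates c_{i,k}

# Contribution of individual i to axis k: c_{i,k}^2 / (n * lambda_k)
contrib <- sweep(C^2, 2, n * lambda, "/")
colSums(contrib)                          # each column sums to 1

# Individuals whose contribution to axis 1 exceeds their weight 1/n,
# i.e. |c_{i,1}| > sqrt(lambda_1)
which(contrib[, 1] > 1 / n)
```

Sorting `contrib[, 1]` in decreasing order is a quick way to spot individuals that pull the first axis towards themselves.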

Removing/adding variables/individuals

It is possible to run an analysis after removing some variables and/or individuals, and to add them back afterwards:
• some individuals will be over-represented and may pull the cloud of individuals in one direction. They can be excluded from the analysis and added back afterwards;
• some variables may, through a rather different behaviour, distort the cloud of variables.

Consider here the dataset ACPsup.csv (downloadable from my web page), whose raw PCA gives

[Figure: raw PCA of ACPsup.csv. Left: "The individuals", labelled 1 to 100, in the plane (cl1, cl2). Right: "The variables" A, B, C, D and E in the plane (Comp1, Comp2).]

[Figure: Left: "The individuals", labelled 1 to 100, in the plane (cl1, cl2). Middle: "The individuals", labelled 1 to 99, in the plane (cl1[1:99], cl2[1:99]). Right: "The variables" A, B, C, D and E in the plane (Comp1, Comp2).]

Supplementary variables

For supplementary variables, one can compute the correlation between the variable and the principal components, and then place the corresponding point in the correlation circle. If $z$ is the centered and reduced supplementary variable, we compute

$$r(z, c_k) = \frac{z' D c_k}{\sqrt{\lambda_k}} = \frac{1}{n \sqrt{\lambda_k}} \sum_{i=1}^{n} z_i c_{i,k}.$$

Note that the significance of the correlation can be tested.

library(ade4)   # dudi.pca(), suprow()
# don: active data table; donsup: supplementary rows with the same columns
z <- dudi.pca(don, center = TRUE, scale = TRUE, scannf = FALSE)
ligsup <- suprow(z, donsup)

Supplementary individuals

Similarly, if $z$ is the centered and reduced supplementary individual, we compute, for each principal axis $k$,

$$c_k = \langle z, u_k \rangle = \sum_{j=1}^{q} z_j u_{k,j}.$$
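The two formulas for supplementary elements can be applied by hand in base R, without any package. This is a sketch under stated assumptions: simulated active data, a hypothetical new individual `x_new`, a hypothetical extra variable `w`, the identity metric on standardized data and uniform weights $1/n$.

```r
set.seed(21)
n <- 60
X <- matrix(rnorm(n * 3), ncol = 3)
X[, 2] <- X[, 1] + 0.5 * X[, 2]

m <- colMeans(X)
s <- sqrt(colMeans(sweep(X, 2, m)^2))           # 1/n convention
Z <- sweep(sweep(X, 2, m), 2, s, "/")           # active table, standardized

eig    <- eigen(crossprod(Z) / n, symmetric = TRUE)
lambda <- eig$values
U      <- eig$vectors
C      <- Z %*% U                               # principal components

# Supplementary individual: standardize it with the ACTIVE means and
# standard deviations, then project: c_k = sum_j z_j u_{k,j}
x_new <- c(1, 2, -1)                            # hypothetical new individual
z_new <- (x_new - m) / s
c_new <- as.numeric(z_new %*% U)

# Supplementary variable: correlation with each component,
# r(z, c_k) = (1 / (n sqrt(lambda_k))) sum_i z_i c_{i,k}
w     <- X[, 1] + rnorm(n)                      # hypothetical extra variable
z_sup <- (w - mean(w)) / sqrt(mean((w - mean(w))^2))
r_sup <- as.numeric(crossprod(z_sup, C)) / (n * sqrt(lambda))
```

Neither `x_new` nor `w` enters the eigen-decomposition, so the axes are unchanged; the supplementary elements are only positioned on the existing plots.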

Example on simulated data

[Figure: two panels titled "The variables", in the plane (Comp1, Comp2). Left: variables A, B, C and D. Right: variables A, B, C, D and E.]

[Figure: two panels titled "The individuals", in the plane (cl1, cl2). Left: individuals 1 to 99. Right: individuals 1 to 100.]

A textbook case

Consider the results of the first round of the 1995 presidential election (dataset election95.csv). Note that the person one votes for can be seen as a qualitative variable (cf. lecture 3 on MCA).

The main variables are the following:
• VOY95: percentage of votes for Ms Voynet
• HUE95: percentage of votes for Mr Hue
• JOS95: percentage of votes for Mr Jospin
• LAG95: percentage of votes for Ms Laguiller
• VIL95: percentage of votes for Mr de Villiers
• CHEM95: percentage of votes for Mr Cheminade
• CHI95: percentage of votes for Mr Chirac
• BAL95: percentage of votes for Mr Balladur
• LEP95: percentage of votes for Mr Le Pen
• inscrits 95: number of voters registered on the electoral rolls in May 1995
• exprimes 95: number of votes cast in the first round of the 1995 presidential election

The following graphs are obtained:

[Figure: two factor maps. Left: "CA factor map", Dim 1 (71.57%) against Dim 2 (12.4%). Right: Axis 1 against Axis 2. Both show the socio-professional categories (Agriculteurs, Artisans, Commercants, ..., OuvriersAgricol) and the nationalities (Espagnol, Italien, Portugais, AutresUE, Algerien, Marocain, Tunisien, Turc, Autres).]

• there are several supplementary variables, related to the breakdown by socio-professional category (CSP) within a département, the level of education, and nationality;
• note that some départements have a "singular" behaviour; it might be wise to treat them as supplementary individuals.

The diploma is treated as a "normal" variable on the left, but as a supplementary variable on the right. The categories are the following: DIPL0, person aged under 15; DIPL1, no diploma; DIPL2, certificat d'études primaires (primary school certificate); DIPL3, BEPC, brevet élémentaire, brevet des collèges; DIPL4, CAP; DIPL5, BEP; DIPL6, general baccalauréat; DIPL7, technological or vocational baccalauréat; DIPL8, first-cycle university degree; DIPL9, second- or third-cycle university degree.

[Figure: two panels titled "The variables", in the plane (Comp1, Comp2), showing VOY95, HUE95, JOS95, LAG95, VIL95, CHEM95, CHI95, BAL95, LEP95 and DIPLOME0 to DIPLOME9.]

[Figure: two panels titled "The variables", in the plane (Comp1, Comp2), showing the nine vote variables together with CHOMEURS, ETUDIANTS and MILITAIRES.]

For the socio-professional categories (CSP), we denote CS1· farmers; CS2· craftsmen, shopkeepers and business owners; CS3· executives and higher intellectual professions; CS4· intermediate professions (including CS44 for the clergy); CS5· employees; CS6· manual workers; CS7· retired persons (including CS72, former craftsmen, shopkeepers and business owners); CS8· other inactive persons (including CS81, unemployed persons who have never worked).

[Figure: two panels titled "The variables", in the plane (Comp1, Comp2), showing the nine vote variables together with the categories CS11 to CS86.]

For the départements, one can start by setting aside Corrèze.

[Figure: Left: "The variables" (the nine vote variables) in the plane (Comp1, Comp2). Right: "The individuals" (the départements, with CORREZE listed last) in the plane (cl1, cl2).]

One can also study the impact of Vendée.

[Figure: Left: "The variables" (the nine vote variables) in the plane (Comp1, Comp2). Right: "The individuals" (the départements, with VENDEE listed last) in the plane (cl1, cl2).]

And finally the impact of Alsace (Bas-Rhin and Haut-Rhin).

[Figure: Left: "The variables" (the nine vote variables) in the plane (Comp1, Comp2). Right: "The individuals" (the départements, with BAS-RHIN and HAUT-RHIN listed last) in the plane (cl1, cl2).]

Practical implementation

The data, in PCA

"Le palmarès des départements. Où vit-on en sécurité ?", in L'Express (no. 2589, 15 February 2001).

• infra: total number of offences per 1000 inhabitants (2000)
• vols: total number of thefts per 1000 inhabitants (2000)
• eco: number of economic and financial offences per 1000 inhabitants (2000)
• crim: number of crimes and offences against persons per 1000 inhabitants (2000)
• vma: number of armed robberies per 1000 inhabitants (2000)
• vvi: number of thefts with violence per 1000 inhabitants (2000)
• camb: number of burglaries per 1000 inhabitants (2000)
• roul: number of thefts from vehicles per 1000 inhabitants (2000)
• auto: number of car thefts per 1000 inhabitants (2000)

The data, in robust PCA

In robust PCA, we are no longer interested in the levels but in the ranks.
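One way to read "ranks instead of levels" is to replace each column by its ranks before running the PCA. The sketch below follows that reading on simulated data; it is an assumption of this example, not necessarily the exact procedure used in the course.

```r
set.seed(2)
n <- 25
X <- cbind(a = rexp(n), b = rexp(n), c = rnorm(n))
X[1, "a"] <- 1000                       # an extreme value in variable a

# Replace each value by its rank within its column
Rk <- apply(X, 2, rank)

# PCA on the ranks; with centered and reduced ranks this is the
# eigen-decomposition of the Spearman correlation matrix
lambda_rank <- eigen(cor(Rk), symmetric = TRUE)$values
lambda_spea <- eigen(cor(X, method = "spearman"), symmetric = TRUE)$values
all.equal(lambda_rank, lambda_spea)

# The extreme value only counts as "largest", whatever its size
max(Rk[, "a"])
```

Because only the ordering matters, the value 1000 has no more influence than any other largest observation would.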

Database for the 25 cities compared

Angers 14 19 12 12 11 19 19 7 6 14 21

Bordeaux 20 7 18 18 9 3 7 19 19 23 13

Caen 8 17 16 6 24 13 15 12 5 13 18

Clermont-Ferrand 14 25 8 16 7 20 5 5 9 1 24

Dijon 17 20 18 14 13 24 16 11 11 10 16

Douai-Lens 1 23 3 5 23 17 21 3 2 5 19

Grenoble 22 11 16 21 7 4 8 23 14 7 6

Lille 10 6 8 5 20 6 24 21 16 3 8

Lyon 10 8 23 17 5 2 13 22 23 24 4

Marseille-Aix-en-Provence 24 3 24 25 3 5 6 5 24 19 3

Metz 4 12 3 2 13 14 19 9 10 12 22

Montpellier 25 2 14 22 4 12 4 18 20 9 10

Nancy 24 16 12 10 16 18 19 24 5 21 17

Nantes 6 10 12 12 17 9 12 17 18 22 11

Nice 20 1 23 23 2 8 1 6 22 16 2

Orl ?ns 4 13 8 13 15 15 24 17 3 11 14

Paris 12 4 25 8 19 1 25 25 25 25 1

Rennes 12 14 8 7 21 11 11 15 12 15 7

Rouen 6 22 20 1 22 25 22 13 7 8 20

Saint-Etienne 15 24 12 19 8 21 9 1 13 2 25

Strasbourg 20 9 21 9 14 16 10 15 17 20 15

Toulon 8 18 4 24 1 10 3 2 15 4 5

Toulouse 22 5 20 20 10 7 2 20 21 17 12

Tours 17 15 14 15 18 22 20 8 8 6 9

Valenciennes 2 21 3 5 25 23 14 10 1 18 23

The same table is shown again on the next four slides, under the headings

• Number of doctors (per 1000 inhabitants)
• Number of crimes and offences (per 1000 inhabitants)
• Average sunshine, between 1991 and 2000
• Cumulative traffic jams

> add=read.table("http://perso.univ-rennes1.fr/arthur.charpentier/ADD-ex-villes.txt",header=TRUE)

> base=add[,2:ncol(add)]

> rownames(base)=add$Agglo

Take as matrix $D$ the matrix $\frac{1}{n} I$ for the space of individuals, and the identity matrix for the space of variables, $\Delta = I$.

One then diagonalises $\frac{1}{n} X'X$, and denotes by $v_1, \dots, v_q$ the eigenvectors associated with the eigenvalues $\lambda_1 > \dots > \lambda_q$. This yields the vectors $u_k$ spanning the principal axes, each of which explains $100 \times \lambda_k / \sum_{j=1}^{q} \lambda_j$ % of the total inertia.

> X <- as.matrix(base)

> n <- nrow(base)

> eigen(1/n * t(X) %*% X)

> eigen(1/n * t(X) %*% X)$vectors

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]

[1,] -0.31 -0.29 -0.2684 -0.254 -0.493 -0.331 -0.096 -0.097 -0.242 0.3880 0.318

[2,] -0.29 0.38 -0.2735 0.016 0.190 0.272 -0.083 -0.122 0.527 0.2165 0.492

[3,] -0.32 -0.27 0.0340 0.099 0.236 -0.590 0.156 0.108 0.539 0.0492 -0.277

[4,] -0.29 -0.29 -0.4430 -0.157 0.208 0.393 -0.380 -0.080 -0.062 -0.0011 -0.506


[5,] -0.30 0.30 0.3507 -0.013 0.039 -0.054 0.104 -0.742 -0.161 0.1818 -0.270

[6,] -0.30 0.35 -0.2859 0.022 -0.051 -0.337 -0.153 -0.042 -0.174 -0.7280 0.073

[7,] -0.30 0.26 0.3767 -0.412 0.344 -0.078 -0.236 0.493 -0.262 0.1879 0.034

[8,] -0.31 -0.16 0.4032 -0.329 -0.485 0.346 0.075 0.063 0.349 -0.3510 -0.022

[9,] -0.29 -0.39 -0.0098 0.013 0.409 0.190 0.560 -0.059 -0.304 -0.1865 0.341

[10,] -0.30 -0.19 0.3090 0.734 -0.073 0.097 -0.423 0.081 -0.113 0.0173 0.152

[11,] -0.30 0.35 -0.2256 0.288 -0.301 0.161 0.476 0.384 -0.131 0.2088 -0.325

> eigen(1/n * t(X) %*% X)$values

[1] 1940.974149 275.990875 123.372428 31.594830 28.553309 23.945216

[7] 14.283394 12.950240 10.841242 6.205078 4.849239

To see more clearly what share is explained by the first principal axes, one uses

> valp <- eigen(1/n * t(X) %*% X)$values

> 100 * valp/sum(valp)

[1] 78.4688525 11.1576382 4.9876465 1.2773020 1.1543407 0.9680467

[7] 0.5774428 0.5235466 0.4382850 0.2508562 0.1960429

i.e. the first axis explains 78.5% of the inertia, and the second about 11% (that is, nearly 90% for the principal plane).
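The "nearly 90%" figure can be checked with a cumulative sum. The sketch below simply copies the eigenvalues printed above into a vector, so it runs without the data file; the names `part` and `cumul` are chosen here for illustration.

```r
# Eigenvalues as printed above (copied from the output of eigen(...)$values)
valp <- c(1940.974149, 275.990875, 123.372428, 31.594830, 28.553309,
          23.945216, 14.283394, 12.950240, 10.841242, 6.205078, 4.849239)

part  <- 100 * valp / sum(valp)   # share of inertia per axis, in %
cumul <- cumsum(part)             # cumulative share for the first k axes
round(cumul[1:3], 1)              # first axis, first plane, first three axes
```

Reading `cumul` from left to right is a common way to decide how many axes to keep.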


Another option is to use dudi.pca from library(ade4).

> acp <- dudi.pca(base, scale = F, center = F,scannf = F, nf = ncol(base))

> acp$c1

CS1 CS2 CS3 CS4 CS5 CS6 CS7 CS8 CS9 CS10 CS11

Medecins -0.31 -0.29 -0.2684 -0.254 -0.493 -0.331 -0.096 -0.097 -0.242 0.3880 0.318

Crimin -0.29 0.38 -0.2735 0.016 0.190 0.272 -0.083 -0.122 0.527 0.2165 0.492

Musees -0.32 -0.27 0.0340 0.099 0.236 -0.590 0.156 0.108 0.539 0.0492 -0.277

Soleil -0.29 -0.29 -0.4430 -0.157 0.208 0.393 -0.380 -0.080 -0.062 -0.0011 -0.506

Polution -0.30 0.30 0.3507 -0.013 0.039 -0.054 0.104 -0.742 -0.161 0.1818 -0.270

Embout -0.30 0.35 -0.2859 0.022 -0.051 -0.337 -0.153 -0.042 -0.174 -0.7280 0.073

LienParis -0.30 0.26 0.3767 -0.412 0.344 -0.078 -0.236 0.493 -0.262 0.1879 0.034

Cadres -0.31 -0.16 0.4032 -0.329 -0.485 0.346 0.075 0.063 0.349 -0.3510 -0.022

CreatEntrp -0.29 -0.39 -0.0098 0.013 0.409 0.190 0.560 -0.059 -0.304 -0.1865 0.341

Revenu -0.30 -0.19 0.3090 0.734 -0.073 0.097 -0.423 0.081 -0.113 0.0173 0.152

PrixImmob -0.30 0.35 -0.2256 0.288 -0.301 0.161 0.476 0.384 -0.131 0.2088 -0.325

The eigenvalues are

> acp$eig

[1] 1940.974149 275.990875 123.372428 31.594830 28.553309 23.945216 14.283394

[8] 12.950240 10.841242 6.205078 4.849239


The projections onto the first two axes are given by acp$c1[, 1:2]. All the variables contribute to axis 1 (in the negative direction).

s.label(acp$li) and s.label(acp$co) are used to plot the rows and the columns, respectively.

[Figure: s.label(acp$li), the 25 cities (grid d = 10), and s.label(acp$co), the 11 variables (grid d = 5).]

Centred PCA or not

Among the usual transformations of the variables, one can centre them. The new origin $G$ has coordinates $(\bar{C}_1, \dots, \bar{C}_q)$, the centre of gravity of the cloud of points.

Denote by $\tilde{C}_j$ the (centred) columns of $X$, i.e. $\tilde{C}_j = C_j - \bar{C}_j$. Then the norm of $\tilde{C}_j$ is the standard deviation of $C_j$, since

$$\|\tilde{C}_j\|^2 = \frac{1}{n} \sum_{i=1}^{n} (x_{i,j} - \bar{C}_j)^2 = \mathrm{Var}(C_j).$$
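A small numerical check of this identity, on a made-up matrix (hypothetical values). It assumes the norm is taken for the metric $D = \frac{1}{n} I$ used earlier, and recalls that R's `var()` divides by $n - 1$, hence the correction factor.

```r
# Hypothetical data matrix: n = 6 individuals, q = 2 variables
X <- cbind(c(1, 4, 2, 8, 5, 4), c(10, 30, 20, 20, 40, 60))
n <- nrow(X)

G  <- colMeans(X)        # centre of gravity (column means)
Xc <- sweep(X, 2, G)     # centred columns: subtract each column mean

# Squared norm of each centred column for the metric D = (1/n) I
norm2 <- colSums(Xc^2) / n

# Same quantity from var(), which divides by n - 1 instead of n
var_n <- apply(X, 2, var) * (n - 1) / n
```

Both `norm2` and `var_n` hold the variances with the $1/n$ convention, and `sqrt(norm2)` gives the corresponding standard deviations.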

[Figure: projections of the cities (grid d = 10) and of the variables (grid d = 2).]

Normed PCA or not

Among the usual transformations of the variables, one can normalise them. This rebalances variables that may be expressed in different units.

[Figure: projections of the cities (grid d = 2) and of the variables (grid d = 0.5).]

Warning: only the variables have been normalised.
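The effect of units can be illustrated on simulated data. Everything in the sketch below is hypothetical (a height in metres and a weight in grams, invented for the example); it compares the eigen-decomposition of the covariance matrix (non-normed PCA) with that of the correlation matrix (normed PCA).

```r
# Hypothetical data: two correlated variables in very different units
set.seed(1)
height_m <- rnorm(50, mean = 1.70, sd = 0.10)                          # metres
weight_g <- 1000 * (60 + 40 * (height_m - 1.70) + rnorm(50, sd = 5))   # grams
X <- cbind(height_m, weight_g)

# Non-normed PCA: eigen-decomposition of the covariance matrix
unnormed <- eigen(cov(X), symmetric = TRUE)
# Normed PCA: eigen-decomposition of the correlation matrix
normed   <- eigen(cor(X), symmetric = TRUE)

abs(unnormed$vectors[, 1])   # dominated by the variable with the largest scale
abs(normed$vectors[, 1])     # with q = 2, both variables weigh 1/sqrt(2)

# Changing units (grams to kilograms) leaves the normed PCA unchanged
X_kg <- cbind(height_m, weight_kg = weight_g / 1000)
eigen(cor(X_kg), symmetric = TRUE)$values
```

Without normalisation, the first axis is essentially the variable expressed in the smallest unit, which is rarely what one wants to interpret.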

[Figure: projections of the cities and of the variables.]

These inertias can be studied with plot(princomp(base)) in R.

biplot(princomp(base)) projects the individuals onto the first principal plane.

[Figure: plot(princomp(base)), variances of the components Comp.1 to Comp.9 (scale 0 to 250); biplot(princomp(base)), cities and variables on the plane (Comp.1, Comp.2).]

Warning: the sign may change from one software package to another. For example, the complete computation starting from the diagonalisation gives

x <- as.matrix(base)
n <- nrow(x); p <- ncol(x)
# centre each column on its mean
centre <- apply(x, 2, mean)
x <- x - matrix(centre, nrow=n, ncol=p, byrow=TRUE)
# eigen-decompositions of t(x) x (p x p) and of x t(x) (n x n)
e1 <- eigen( t(x) %*% x, symmetric=TRUE )
e2 <- eigen( x %*% t(x), symmetric=TRUE )
variables <- t(e2$vectors) %*% x
individus <- t(e1$vectors) %*% t(x)
variables <- t(variables)
individus <- t(individus)
valeurs.propres <- e1$values
# individuals on the first two axes, with limits symmetric around 0
plot( individus[,1:2],
      xlim=c( min(c(individus[,1],-individus[,1])),
              max(c(individus[,1],-individus[,1])) ),
      ylim=c( min(c(individus[,2],-individus[,2])),
              max(c(individus[,2],-individus[,2])) ),
      xlab='', ylab='', frame.plot=FALSE )
# overlay the variables in red, on their own scale
par(new=TRUE)
plot( variables[,1:2], col='red',
      xlim=c( min(c(variables[,1],-variables[,1])),
              max(c(variables[,1],-variables[,1])) ),
      ylim=c( min(c(variables[,2],-variables[,2])),
              max(c(variables[,2],-variables[,2])) ),
      axes=FALSE, xlab='', ylab='', pch='.')
axis(3, col='red')
axis(4, col='red')
arrows(0,0,variables[,1],variables[,2],col='red')
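The sign ambiguity can be made explicit: an eigenvector is only defined up to its sign, so two routines may return opposite axes. The sketch below uses simulated data and a hypothetical helper, `align_sign` (not part of any package), that fixes a sign convention so that two outputs can be compared.

```r
# Hypothetical centred data (any numeric matrix would do)
set.seed(42)
x <- scale(matrix(rnorm(60), nrow = 20, ncol = 3), center = TRUE, scale = FALSE)

v_eigen  <- eigen(crossprod(x), symmetric = TRUE)$vectors  # from t(x) %*% x
v_prcomp <- prcomp(x, center = FALSE)$rotation             # from an SVD of x

# Hypothetical helper: flip each column so that its largest entry
# (in absolute value) is positive
align_sign <- function(v) {
  s <- apply(v, 2, function(col) sign(col[which.max(abs(col))]))
  sweep(v, 2, s, `*`)
}

# After alignment the two sets of axes coincide (difference close to 0)
max(abs(align_sign(v_eigen) - align_sign(unname(v_prcomp))))
```

Any convention would do; what matters is applying the same one to both results before comparing plots or loadings.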

[Figure: result of the code above, individuals (points) and variables (red arrows) on the first two axes, next to biplot(princomp(base)) with cities and variables on the plane (Comp.1, Comp.2).]

Interpretation of the axes

To interpret the first axis, recall that

> names(base)

[1] "Medecins" "Crimin" "Musees" "Soleil" "Polution"

[6] "Embout" "LienParis" "Cadres" "CreatEntrp" "Revenu"

[11] "PrixImmob"

> acp$co[, 1]

[1] 0.6847162 -0.8499417 0.7013389 0.7029398 -0.6656761

[6] -0.7872159 -0.5619327 0.3887672 0.9139772 0.4715448

[11] -0.7837233


variable                              axis 1          axis 2
Doctors (Medecins)                 0.6847162 ↗   -0.27456842 ↘
Crime (Crimin)                    -0.8499417 ↘   -0.31990547 ↘
Museums (Musees)                   0.7013389 ↗    0.20111087 ↗
Sunshine (Soleil)                  0.7029398 ↗   -0.63249825 ↘
Pollution (Polution)              -0.6656761 ↘    0.63938379 ↗
Traffic jams (Embout)             -0.7872159 ↘   -0.30661893 ↘
Link to Paris (LienParis)         -0.5619327 ↘    0.65398028 ↗
Executives (Cadres)                0.3887672 ↗    0.72323380 ↗
Business creation (CreatEntrp)     0.9139772      0.04395415
Income (Revenu)                    0.4715448 ↗    0.58525352 ↗
Property prices (PrixImmob)       -0.7837233 ↘   -0.22212682 ↘


Finding influential points

Studying the projection of the space of individuals makes it possible to group or separate individuals whose behaviour is similar or radically different: two points that are far apart on the first axis are far apart in the initial cloud.

The inertia of the cloud is

$$I(X, D, u) = \sum_{i=1}^{n} \langle L_i, u \rangle_D^2.$$

One can thus look for the points that contribute most to the position of the axis, i.e. the $i$ for which $\langle L_i, u \rangle_D^2$ is large.

Definition 17. The (absolute) contribution of the point $L_i$ to the position of the axis $u_k$ is the quantity

$$CT_k(L_i) = \frac{\langle L_i, u_k \rangle_D^2}{\lambda_k}.$$


Note that

$$\sum_{i=1}^{n} CT_k(L_i) = \frac{I(X, D, u_k)}{\lambda_k} = 1.$$


Quality of a projection

Definition 18. The quality of the representation of the point $L_i$ on the axis $u_k$ is the quantity

$$QR_k(L_i) = \frac{\langle L_i, u_k \rangle_D^2}{\|L_i\|_D^2}.$$

Note that $\sum_{k=1}^{q} QR_k(L_i) = 1$. This quantity is also called the relative contribution.
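Both quantities can be computed by hand on a toy example. The matrix below is hypothetical (five centred rows, invented for the illustration), and the sketch assumes that the weight $1/n$ of the metric $D$ is carried by the squared inner product, so that the inertia along $u_k$ equals $\lambda_k$.

```r
# Hypothetical centred data: n = 5 individuals (rows L_i), q = 3 variables
X <- matrix(c( 2, -1,  0,
              -1,  3,  1,
               0, -2, -2,
               1,  1,  2,
              -2, -1, -1), ncol = 3, byrow = TRUE)
n <- nrow(X)

# Metric D = (1/n) I on individuals: diagonalise (1/n) X'X
e      <- eigen(crossprod(X) / n, symmetric = TRUE)
coords <- X %*% e$vectors            # <L_i, u_k> for each row i and axis k

# Contribution of L_i to axis k: (1/n) <L_i, u_k>^2 / lambda_k
CT <- sweep(coords^2 / n, 2, e$values, `/`)
# Quality of representation of L_i on axis k: <L_i, u_k>^2 / ||L_i||^2
# (the weight 1/n cancels between numerator and denominator)
QR <- coords^2 / rowSums(X^2)

colSums(CT)   # for each axis, the contributions sum to 1
rowSums(QR)   # for each individual, the qualities sum to 1
```

A row with a large entry in a column of `CT` pulls that axis towards itself, while a row with a small entry in `QR` is poorly represented on the axis and should be interpreted with care.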

> intacp <- inertia.dudi(acp, col.inertia = T,row.inertia = T)

> intacp$row.rel[, 1]

Angers Bordeaux Caen ClermontFerrand

-5500 7565 -5479 -2464

Dijon Douai Grenoble Lille

-2131 -8886 5132 -144

Lyon Marseille Metz Montpellier

6621 8075 -5311 6134

Nancy Nantes Nice Orleans

-455 572 7660 -4276

Paris Rennes Rouen SaintEtienne


2726 -559 -6401 -1760

Strasbourg Toulon Toulouse Tours

1865 150 8115 -2283

Valenciennes

-7911

> intacp$row.abs[, 1]

Angers Bordeaux Caen ClermontFerrand

210 483 251 243

Dijon Douai Grenoble Lille

72 1212 347 12

Lyon Marseille Metz Montpellier

678 1168 374 523

Nancy Nantes Nice Orleans

30 19 1024 226

Paris Rennes Rouen SaintEtienne

444 16 681 172

Strasbourg Toulon Toulouse Tours

60 17 605 94

Valenciennes

1040

The sign is that of the coordinate on axis 1.


acp$co gives the interpretation of the axes $\mu_k$ in the space of variables,

acp$c1 gives the interpretation of the axes $u_k$ in the space of individuals.

The values are identical up to a normalisation constant.

⟹ The axes $u_k$ and $\mu_k$ have the same interpretation with respect to the initial variables.

One can then consider a simultaneous representation of the spaces of individuals and variables.

One can look at the projection of the cities as a function of various explanatory variables,

s.value(acp$li, scale(base$Medecins))
s.value(acp$li, scale(base$Crimin))

[Figures: s.value maps of the cities on the first factorial plane, one panel per variable (grid d = 2, legend from −1.5 to 1.5).]

Back to the methodology of PCA

To summarise, one starts from a cloud of $n$ individuals on which $p$ quantitative variables, denoted $x$, are known.

In general, the variables are centred and reduced to obtain a matrix $z$.

The $p \times p$ correlation matrix of $z$ has eigenvalues, which are ordered $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_p \geq 0$.

The principal factors $u_k$ are the orthonormal eigenvectors of the correlation matrix, associated with the eigenvalues $\lambda_k$. $u_{k,j}$ is the weight of variable $j$ in component $k$.

The principal components are the vectors $c_k = z u_k$, of size $n$. $c_{k,i}$ is the value of component $k$ for individual $i$. Note that the variance of $c_k$ equals $\lambda_k$.

The correlation circle displays the correlations between the variables and the principal axes. Only the variables close to the edge of the circle can be interpreted (because they are well represented by the two axes).
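The steps of this summary can be sketched in a few lines of base R. The built-in `USArrests` table is used only as a stand-in data set (it is not part of the course), and the sketch uses the $n - 1$ convention of `scale()` and `var()` consistently, under which the variance of $c_k$ still equals $\lambda_k$.

```r
# Built-in data set used only as a stand-in (n = 50 individuals, p = 4 variables)
x <- as.matrix(USArrests)

z  <- scale(x)                         # centre and reduce each variable
e  <- eigen(cor(x), symmetric = TRUE)  # correlation matrix, eigenvalues in decreasing order
u  <- e$vectors                        # principal factors u_k (orthonormal columns)
cp <- z %*% u                          # principal components c_k = z u_k, one row per individual

apply(cp, 2, var)                      # variance of each c_k ...
e$values                               # ... equals lambda_k

# Coordinates of the variables on the correlation circle: cor(x_j, c_k)
circle <- cor(x, cp)
rowSums(circle[, 1:2]^2)               # close to 1: well represented on the first plane
```

The last line gives, for each variable, its squared distance to the origin in the correlation circle, which is the quantity behind the "close to the edge" rule.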


A short digression: modelling rates

Consider the following data set,

load(url("http://pbil.univ-lyon1.fr/R/donnees/pps066.rda"))

Three tables cross 20 countries and 39 years for the individual consumption of beer, wine and spirits. The aim is to study how consumption is split between these three types of alcohol (and not the levels of alcohol consumed).

Since we are in dimension 3 (3 possible types of alcohol), a so-called triangular representation can be used.
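The idea behind the triangular representation can be sketched in base R. The consumption figures below are invented (they are not the pps066 data), and the placement of the three vertices is an arbitrary choice made for this illustration.

```r
# Hypothetical consumption levels (invented values, not the pps066 data)
conso <- rbind(A = c(beer = 80, wine = 15, spirits = 5),
               B = c(beer = 20, wine = 70, spirits = 10),
               C = c(beer = 40, wine = 30, spirits = 30))

# Keep only the composition: each row is divided by its total
parts <- conso / rowSums(conso)

# Triangle with vertices beer = (0, 0), wine = (1, 0), spirits = (1/2, sqrt(3)/2):
# a point is the barycentre of the vertices weighted by its three shares
xy <- cbind(x = parts[, "wine"] + parts[, "spirits"] / 2,
            y = parts[, "spirits"] * sqrt(3) / 2)

plot(xy, xlim = c(0, 1), ylim = c(0, 1), asp = 1, xlab = "", ylab = "")
polygon(c(0, 1, 0.5), c(0, 0, sqrt(3) / 2))
text(xy, labels = rownames(conso), pos = 3)
```

Because the three shares sum to 1, two coordinates are enough, and a country that drinks mostly one type of alcohol lies close to the corresponding vertex. The ade4 package also provides triangle.plot for this kind of display.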

[Figure: triangular plots of the 20 countries with axes Vin (wine), Bière (beer) and Spiritueux (spirits), in two panels with different axis ranges.]

Back to the methodology of PCA

In R, several functions perform PCA

• in library(stats), the function princomp,
• in library(ade4), the function dudi.pca, which makes it easy to centre and reduce the variables,
• in library(FactoMineR), the function PCA
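As a quick sanity check on the first of these functions, the sketch below compares princomp (and prcomp, also in the stats package) with a direct diagonalisation of the correlation matrix. `USArrests` is used only as a stand-in data set; ade4 and FactoMineR are left out so that the example runs with a base R installation.

```r
x <- USArrests   # built-in data set, used here only as a stand-in

# princomp() with cor = TRUE works on the correlation matrix (normed PCA)
pc <- princomp(x, cor = TRUE)
# prcomp() with scale. = TRUE reduces the variables before the SVD
pr <- prcomp(x, scale. = TRUE)

pc$sdev^2                                # variances of the components
pr$sdev^2                                # same values
eigen(cor(x), symmetric = TRUE)$values   # same values from the diagonalisation
```

The three outputs agree on the eigenvalues; the loadings may differ by a sign from one function to the other.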


PCA with dudi.pca

This part is based on Dufour & Lobry (2008), tdr601.pdf.

Consider the survey data from library(MASS). Four variables are kept,

• survey$Wr.Hnd, the span of the writing hand
• survey$NW.Hnd, the span of the non-writing hand
• survey$Height, the person's height
• survey$Sex, the person's sex.


survey.cc <- survey[complete.cases(survey), ]

mesures <- survey.cc[, c("Wr.Hnd", "NW.Hnd", "Height")]

The first command keeps only the individuals with no missing values. The PCA is then obtained simply with acp <- dudi.pca(mesures, scannf = FALSE, nf = 3).

To retrieve all the information, one can use the following call

> eval(acp$call)

Duality diagramm

class: pca dudi

$call: dudi.pca(df = mesures, scannf = FALSE, nf = 3)

$nf: 3 axis-components saved

$rank: 3

eigen values: 2.509 0.4568 0.03445

vector length mode content

1 $cw 3 numeric column weights

2 $lw 168 numeric row weights


3 $eig 3 numeric eigen values

data.frame nrow ncol content

1 $tab 168 3 modified array

2 $li 168 3 row coordinates

3 $l1 168 3 row normed scores

4 $co 3 3 column coordinates

5 $c1 3 3 column normed scores

acp$tab is the matrix z obtained by centring and then reducing the initial table x.

acp$cw contains the weights assigned to each variable (column), i.e. here 1 everywhere.

acp$lw contains the weights assigned to each individual (row), i.e. here 1/n everywhere.

acp$eig contains the vector of eigenvalues.

acp$c1 gives the coordinates of the variables on the first 3 principal axes. These vectors have norm 1.

acp$co gives the coordinates of the variables on the first 3 principal axes. These vectors have norm √λ.

> acp$c1


CS1 CS2 CS3

Empan1 0.6084890 -0.3420962 0.71603859

Empan2 0.6040404 -0.3855223 -0.69750107

Taille 0.5146613 0.8569380 -0.02794614

> acp$co

Comp1 Comp2 Comp3

Empan1 0.9637816 -0.2312213 0.132897100

Empan2 0.9567355 -0.2605728 -0.129456527

Taille 0.8151685 0.5792006 -0.005186817

> t(t(acp$c1)*sqrt(acp$eig))

CS1 CS2 CS3

Empan1 0.9637816 -0.2312213 0.132897100

Empan2 0.9567355 -0.2605728 -0.129456527

Taille 0.8151685 0.5792006 -0.005186817

acp$l1 gives the coordinates of the individuals on the first 3 principal axes, these vectors having unit norm (for the row weights acp$lw).

acp$li gives the coordinates of the individuals on the first 3 principal axes, these vectors having norm √λ.

> head(acp$l1)


RS1 RS2 RS3

1 -0.18511289 0.35854289 0.7771789

2 0.65642982 -0.01654562 -2.0459458

5 0.24122137 -1.63843055 0.1095513

6 -0.35270516 0.54186131 0.3453116

7 -0.08047126 1.91845520 -0.4136080

8 -1.14527378 -1.08502402 -0.6698669

> head(acp$li)

Axis1 Axis2 Axis3

1 -0.2931990 0.24233756 0.14424478

2 1.0397147 -0.01118311 -0.37972849

5 0.3820689 -1.10740798 0.02033277

6 -0.5586473 0.36624167 0.06409000

7 -0.1274579 1.29667540 -0.07676584

8 -1.8139913 -0.73336294 -0.12432761

> head(t(t(acp$l1) * sqrt(acp$eig)))

RS1 RS2 RS3

1 -0.2931990 0.24233756 0.14424478

2 1.0397147 -0.01118311 -0.37972849

5 0.3820689 -1.10740798 0.02033277

6 -0.5586473 0.36624167 0.06409000


7 -0.1274579 1.29667540 -0.07676584

8 -1.8139913 -0.73336294 -0.12432761

Finally, to draw a few plots, s.label or s.class can be used to display the individuals

> s.label(acp$li, xax = 1, yax = 2)

> s.class(acp$li, fac=sexe,col=c("red","blue"),xax = 1, yax = 2)

[Figure: s.label(acp$li), individuals labelled by row number (grid d = 2), and s.class(acp$li, ...), individuals grouped into Female and Male (grid d = 2).]

To display the variables, s.corcircle is used; to represent everything together, the function scatter

> s.corcircle(acp$co, xax = 1, yax = 2)

> scatter(acp)

[Figure: s.corcircle(acp$co), correlation circle for Empan1, Empan2 and Taille; scatter(acp), individuals and variables together (grid d = 2) with the eigenvalues bar chart.]

Tutorial session

The tutorial will use the database departement.xls (a codebook is given in the file code-departement.xls), both downloadable from my web page.
