
EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE PAIRVIZ PACKAGE

EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE PAIRVIZ PACKAGE Catherine Hurley NUI Maynooth R.W. Oldford U. Waterloo July 8 2009 UseR! Monday 13 July 2009
Transcript
Page 1: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION

AND THE PAIRVIZ PACKAGE

Catherine Hurley, NUI Maynooth

R.W. Oldford, U. Waterloo

July 8 2009, UseR!

Monday 13 July 2009

Page 2: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Graphics: Effect Ordering

• Packages: seriation, gclus, corrgram

• Example: PCP Flea data

[Figure: parallel coordinate plots (PCP) of the flea data. Standard order: Tars1, Tars2, Aede1, Aede2, Head, Aede3. Correlation order: Tars2, Aede1, Aede2, Aede3, Tars1, Head.]

Page 3: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

PairViz: relationship ordering

• Statistical graphics are about comparisons between variables, cases, groups, models

[Figure: Flea data, correlation order. Axis sequence: Aede3 Aede2 Aede1 Tars2 Aede2 Aede1 Aede3 Tars2 Tars1 Head Tars2 Tars1 Aede1 Head Aede2 Tars1 Aede3 Head.]

Page 4: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

A graph model

• Build a graph where nodes are statistical objects

• Edges are relationships

• Example:

Node     Vis       Edge         Vis
Group    Boxplot   Two groups   CI for mean diff
Var      Hist      Two vars     Scatterplot
2 vars   Scat      4-d space    Dynamic scat
Model    Resid     2 Models     PCP

[Figure: example graph on nodes A, B, C, D, E, F.]

Page 5: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Example: planned comparisons

Mice in 5 diet groups, response is lifetime

Nodes are treatments, edges are planned comparisons

Weights are p-values

[Figure: graph of the diet groups N/N85, N/R40, N/R50, NP, R/R50 and lopro, with edge weights 0, 0, 0.0083, 0.0147 and 0.3111.]

[Figure: Planned comparisons of diets. Group sequence: N/R50 N/N85 NP lopro N/R50 N/R40 R/R50 N/R50. Panels: Lifetime; Differences.]

Reducing calories and protein increases lifetime


Page 6: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Graph Traversal

• Traverse all nodes: hamiltonian path

• Traverse all edges: eulerian path

• Use gclus, seriation: hamiltonian paths on complete graphs

• PairViz: eulerian paths
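The difference between the two traversals can be made concrete with a small check. The sketch below is illustrative only: the helper names `is_hamiltonian_path` and `is_eulerian_path` are hypothetical (they are not functions from gclus, seriation or PairViz), and the example uses K5 rather than the graphs shown on the slide.

```python
from itertools import combinations

def is_hamiltonian_path(nodes, edges, path):
    """True if `path` visits every node exactly once along existing edges."""
    edge_set = {frozenset(e) for e in edges}
    steps = [frozenset(step) for step in zip(path, path[1:])]
    return sorted(path) == sorted(nodes) and all(s in edge_set for s in steps)

def is_eulerian_path(edges, path):
    """True if `path` traverses every edge exactly once."""
    edge_set = {frozenset(e) for e in edges}
    steps = [frozenset(step) for step in zip(path, path[1:])]
    return len(steps) == len(edge_set) and set(steps) == edge_set

nodes = list("ABCDE")
edges = list(combinations(nodes, 2))   # complete graph K5, 10 edges

hamiltonian = list("ABCDE")            # every node once, 4 edges used
eulerian = list("ABCDEACEBDA")         # every edge once, closed at A
```

On K5 the hamiltonian path shows only 4 of the 10 pairwise comparisons, while the closed eulerian path shows all 10, at the cost of revisiting nodes.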

[Figure: open hamiltonian path and closed hamiltonian path on nodes A to H.]

[Figure: closed eulerian path on K7, nodes A to G.]

Page 7: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Graph Structures

• Complete graph: all comparisons are interesting

• Edge-weighted graphs: low weight edges are more interesting

• Bipartite graph: e.g. only treatment-control comparisons are of interest

[Figure: flea data. Axis sequence: Aede3 Aede2 Aede1 Tars2 Aede2 Aede1 Aede3 Tars2 Tars1 Head Tars2 Tars1 Aede1 Head Aede2 Tars1 Aede3 Head.]

Weight edges by 1-corr, eulerian follows low weight edges
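A minimal sketch of the 1-corr weighting, assuming Pearson correlation and a complete graph on the variables. The function names and the toy columns `u`, `v`, `w` are hypothetical and are not the flea data; highly correlated pairs get weights near 0, so a traversal that prefers low weights shows them first.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_weights(columns):
    """Edge weight 1 - corr for every pair of variables."""
    return {
        (a, b): 1 - pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }

# Toy columns, made up for illustration only.
toy = {"u": [1, 2, 3, 4], "v": [2, 4, 6, 8], "w": [4, 3, 2, 1]}
weights = correlation_weights(toy)
```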

[Figure: bipartite graph on nodes X1, X2, X3 and Y1, Y2.]

Page 8: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Graph Structures - cont’d

• Hypercube graph: cube for a factorial experiment, or model selection, where each node in G is a predictor subset and each edge adds or drops a predictor

• Line graph: transform G to L(G), e.g. each node in G is a var, each node in L(G) is a var pair, and each edge is a 3-d transition

[Figure: cube for factorial experiment, with nodes 000, 001, 010, 011, 100, 101, 110, 111.]

[Figure: graph G on nodes A, B, C, D and its line graph L(G) on nodes AB, AC, AD, BC, BD, CD.]
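The line graph transformation can be sketched in a few lines. This is an illustrative construction, not code from the PairViz package; `line_graph` and `label` are hypothetical helper names. Taking G as the complete graph on the variables A, B, C, D, each node of L(G) is a variable pair, and two pairs are joined when they share a variable, so every edge of L(G) involves three variables.

```python
from itertools import combinations

def line_graph(edges):
    """Nodes of L(G) are the edges of G; two are joined if they share an endpoint."""
    lg_nodes = [frozenset(e) for e in edges]
    lg_edges = [(a, b) for a, b in combinations(lg_nodes, 2) if a & b]
    return lg_nodes, lg_edges

def label(node_set):
    """Readable label such as 'AB' for a set of variable names."""
    return "".join(sorted(node_set))

variables = list("ABCD")
g_edges = list(combinations(variables, 2))      # G = K4
lg_nodes, lg_edges = line_graph(g_edges)

pair_labels = sorted(label(n) for n in lg_nodes)
transitions = [(label(a), label(b), label(a | b)) for a, b in lg_edges]
```

Each entry of `transitions` names the two variable pairs and the three variables they span, for example the move from AB to AC takes place in the space of A, B and C.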

Page 9: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Algorithms - Complete graph

• Closed eulerian path exists when each node has an even number of edges: i.e. for K_{2n+1}

• Hamiltonian decomposition of graph

• into hamiltonian cycles: eulerian for K_{2n+1}

• into hamiltonian paths: approx eulerian for K_{2n}

• classical algorithm: hpaths

• WHam: weighted_hpaths: pick best for H1, best orientation and order for others.
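One textbook way to obtain such a decomposition is the zigzag construction: place the nodes 0..2n-1 on a circle, take the zigzag path 0, 1, 2n-1, 2, 2n-2, ... and rotate it n times. The sketch below assumes this construction; whether hpaths uses exactly this layout is not stated on the slides, and the function names are hypothetical. Joining the n paths through one extra hub node gives a closed eulerian path on K_{2n+1}.

```python
def zigzag_paths(n):
    """n edge-disjoint hamiltonian paths that together cover K_{2n} (nodes 0..2n-1)."""
    m = 2 * n
    # Zigzag offsets 0, +1, -1, +2, -2, ... taken modulo 2n.
    base = [((j + 1) // 2 if j % 2 else -(j // 2)) % m for j in range(m)]
    return [[(v + k) % m for v in base] for k in range(n)]

def closed_eulerian(n):
    """Closed eulerian path on K_{2n+1}: the hub node 2n links successive paths."""
    hub = 2 * n
    tour = [hub]
    for path in zigzag_paths(n):
        tour.extend(path)
        tour.append(hub)
    return tour

paths = zigzag_paths(3)       # three hamiltonian paths on K6
tour = closed_eulerian(3)     # closed eulerian path on K7, 21 edges
```

For K_{2n} alone, concatenating the paths directly repeats no edge within a path but needs extra joining edges between paths, which is why the result is only approximately eulerian.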

[Figure: three panels, each on nodes 1 to 7.]

Page 10: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Algorithms - Complete graph cont’d

• Recursive algorithm: eseq

• Start with eulerian on K_n, append edges to get eulerian on K_{n+2}
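The recursive step can be illustrated for odd n. The sketch below is one possible way to append the new edges, under the assumption that the existing tour on nodes 1..n starts and ends at node 1; it is not claimed to reproduce the exact sequence that eseq returns, and the function names are hypothetical. The two new nodes a = n+1 and b = n+2 are threaded alternately between the old nodes, then the edges a-n, a-b and b-1 close the tour.

```python
def extend_tour(tour, n):
    """Extend a closed eulerian tour on K_n (n odd, ending at node 1) to K_{n+2}."""
    a, b = n + 1, n + 2
    extension = []
    for i in range(1, n):
        # Walk from old node i to old node i+1 via a (i odd) or b (i even).
        extension.append(a if i % 2 else b)
        extension.append(i + 1)
    # Now at node n: use the remaining new edges n-a, a-b, b-1.
    extension += [a, b, 1]
    return tour + extension

def recursive_eulerian(n):
    """Closed eulerian tour on K_n for odd n, built up from K_1."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    tour = [1]
    for m in range(1, n, 2):
        tour = extend_tour(tour, m)
    return tour

k5 = recursive_eulerian(5)
k7 = recursive_eulerian(7)   # begins with the K5 tour, then adds nodes 6 and 7
```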

[Figure: nodes 1 to 5, with added nodes 6 and 7.]

Page 11: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Algorithms - general

• Eulerian graph: connected, all nodes have even number of edges

• Otherwise, add edges, pairing up odd nodes; Chinese postman does this in optimal way

• Classical algorithm (Hierholzer, Fleury)

• Our version GrEul (etour) follows weight increasing edges

[Figure: graph of the diet groups N/N85, N/R40, N/R50, NP, R/R50 and lopro, with edge weights 0, 0, 0.0083, 0.0147 and 0.3111.]
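As a rough illustration of a weight-aware eulerian algorithm, the sketch below combines the classical Hierholzer stack procedure with a greedy rule that always leaves a node by its lightest unused edge. This is an assumption about what "follows weight increasing edges" could look like in code, not the GrEul or etour implementation; the function name and the example weights are hypothetical. The input graph is assumed to be connected with every node of even degree.

```python
def greedy_eulerian(weights, start):
    """Closed eulerian tour preferring low-weight edges (Hierholzer-style)."""
    adjacency = {}
    for (u, v), w in weights.items():
        adjacency.setdefault(u, []).append((w, v, (u, v)))
        adjacency.setdefault(v, []).append((w, u, (u, v)))
    for neighbours in adjacency.values():
        # Heaviest first, so that pop() returns the lightest remaining edge.
        neighbours.sort(key=lambda item: item[0], reverse=True)

    used, stack, tour = set(), [start], []
    while stack:
        node = stack[-1]
        # Discard edges already traversed from the other endpoint.
        while adjacency[node] and adjacency[node][-1][2] in used:
            adjacency[node].pop()
        if adjacency[node]:
            _, nxt, edge = adjacency[node].pop()
            used.add(edge)
            stack.append(nxt)
        else:
            tour.append(stack.pop())
    return tour[::-1]

# Hypothetical weights on K5: edge (u, v) has weight u + v.
k5_weights = {(u, v): u + v for u in range(1, 6) for v in range(u + 1, 6)}
tour = greedy_eulerian(k5_weights, start=1)
```

A graph with odd-degree nodes would first need extra edges pairing up those nodes before this procedure applies.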

Page 12: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Algorithms comparison

Complete graph, no weights

[Figure: three panels titled "Etour 9", "Eseq 9" and "hpaths 9". Panel annotations, in order: prefers low vertices; prefers low edges; 4 hamiltonians.]

Page 13: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Algorithms: complete, weighted

[Figure: three panels titled "Algorithm eseq: Eurodist edge weights", "Weighted etour on Eurodist" and "Weighted hamiltonians on Eurodist". Panel annotations, in order: ignores weights; starts in Geneva; hamiltonian decomp, with increasing path lengths.]

Eurodist: 21 European cities

Page 14: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

Example: model selection

Mammal sleep data: Y = log brain wt.

Predictors: A = non dreaming sleep, B = dreaming sleep, C = log body wt, D = life span

[Figure: hypercube of predictor subsets, by level: 0; A, B, C, D; AB, AC, AD, BC, BD, CD; ABC, ABD, ACD, BCD; ABCD.]

• Hypercube graph represents possible moves in a stepwise regression algorithm

• Graph Qn is hamiltonian, and eulerian for even n

• Edge weights: change in SSE

• Eulerian starting with full model

• All models with C are good

• Bar chart: change in SSE

[Figure: Sleep data: Model residuals. Model sequence: ABCD BCD CD ACD ABCD ABC BC C AC ABC AB A AD ABD BD D AD ACD AC A 0 D CD C 0 B BD BCD BC B AB ABD ABCD.]
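The model sequence in the residual display can be checked against the hypercube Q4. The sketch below builds Q4 on the subsets of A, B, C, D (an edge joins two models that differ by one predictor) and tests whether the sequence is closed and uses every edge exactly once. The variable names are illustrative, and "0" is read as the model with no predictors.

```python
from itertools import combinations

def parse(label):
    """Turn a label such as 'ACD' into a set of predictors; '0' is the empty model."""
    return frozenset() if label == "0" else frozenset(label)

predictors = "ABCD"
models = [
    frozenset(subset)
    for size in range(len(predictors) + 1)
    for subset in combinations(predictors, size)
]
# Q4 edges: pairs of models that differ by exactly one predictor.
q4_edges = {
    frozenset((a, b)) for a, b in combinations(models, 2) if len(a ^ b) == 1
}

sequence = (
    "ABCD BCD CD ACD ABCD ABC BC C AC ABC AB A AD ABD BD D AD ACD AC A "
    "0 D CD C 0 B BD BCD BC B AB ABD ABCD"
).split()
tour = [parse(label) for label in sequence]
steps = [frozenset(step) for step in zip(tour, tour[1:])]

is_closed = tour[0] == tour[-1]
covers_each_edge_once = len(steps) == len(q4_edges) and set(steps) == q4_edges
```

Q4 has 16 nodes of degree 4, hence 32 edges, which matches the 33 models listed in the sequence.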


Page 15: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

More variables

Sleep data: 10 vars (nodes), 45 edges. Eulerian has length 50.
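The length of 50 is consistent with pairing up odd nodes: in the complete graph on 10 nodes every node has 9 edges, so all 10 nodes are odd and 5 extra edges are needed on top of the 45. A small sketch of that arithmetic, assuming one added edge per pair of odd nodes (the function name is hypothetical):

```python
def eulerian_length(n):
    """Edges traversed by a closed eulerian path on K_n after pairing odd nodes."""
    edges = n * (n - 1) // 2
    # Every node of K_n has degree n - 1, so all nodes are odd when n is even.
    odd_nodes = n if (n - 1) % 2 == 1 else 0
    added_edges = odd_nodes // 2
    return edges + added_edges

length_10 = eulerian_length(10)   # 45 edges plus 5 added edges
length_7 = eulerian_length(7)     # K7 is already eulerian
```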

[Figure: Eulerian on scagnostics: Outlying. Axis sequence: GP Bd L Br Bd SW PS TS SE PS TS D L P L PS Br P TS Bd TS PS P D D Br P D. Vertical scale from 0.0 to 0.6.]

Using outlying index from scagnostics package for eulerian traversal; zoom on first half of display.

Page 16: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

More variables-cont’d

Reduce the graph

NN graph: eliminate edges with outlier index < .2

Reduces graph from 10 to 5 nodes, and 45 to 5 edges

Other nodes have no edges
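The reduction step amounts to thresholding the edge weights and dropping nodes that are left without edges. A minimal sketch, with a hypothetical function name and made-up index values on four made-up nodes (these are not the scagnostics values for the sleep data):

```python
def prune_graph(weights, threshold):
    """Keep edges with weight >= threshold; nodes left with no edges drop out."""
    kept = {edge: w for edge, w in weights.items() if w >= threshold}
    nodes = sorted({node for edge in kept for node in edge})
    return nodes, kept

# Hypothetical index values, for illustration only.
toy_weights = {
    ("p", "q"): 0.35, ("p", "r"): 0.05, ("p", "s"): 0.10,
    ("q", "r"): 0.25, ("q", "s"): 0.15, ("r", "s"): 0.02,
}
nodes, kept = prune_graph(toy_weights, threshold=0.2)
```

The reduced graph may have odd nodes, so extra edges may still be needed before an eulerian path can be formed.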

[Figure: NN Eulerian on scagnostics: Outlying. Axis sequence: GP L Bd SW L Br GP. Vertical scale from 0.0 to 0.6. Reduced graph on nodes SW, Bd, Br, L, GP.]

Page 17: EULERIAN TOUR ALGORITHMS FOR DATA VISUALIZATION AND THE

IN CONCLUSION..

• PairViz package: relationship ordering for data visualisation

• Current version: algorithms presented here

• Thanks to graph, igraph

• Work in progress: ordering dynamic visualisations via ggobi, with Adrian Waddell, UW
