A*-tree: A Structure for Storage and Modeling of Uncertain Multidimensional Arrays Presented by: ZHANG Xiaofei March 2, 2011
A*-tree: A Structure for Storage and Modeling of Uncertain Multidimensional Arrays

Presented by: ZHANG Xiaofei
March 2, 2011

Outline

• Motivation
• Modeling correlated uncertainty
• Construction of A*-tree
• Analysis of A*-tree
• Query processing
• Experiments

Motivation

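• Multidimensional arrays
  – Well suited to scientific and engineering applications
  – Logically equivalent to relational tables

[Figure: a two-dimensional array with dimensions D1 and D2, each cell holding <A1, A2, …, An>, beside the equivalent relational table with columns D1, D2, A1, A2, …, An]

A cell of a multidimensional array: (A1, A2, …, Ak, D1, D2, …, Dd)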
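
The array/table equivalence can be illustrated with a minimal sketch. The function name `array_to_rows` and the sample values are hypothetical and not from the slides; the column order follows the table header D1, D2, A1, ..., An.

```python
def array_to_rows(array):
    """Flatten a 2-D array of attribute tuples into rows (D1, D2, A1, ..., An)."""
    rows = []
    for d1, line in enumerate(array):
        for d2, attrs in enumerate(line):
            # The dimension indexes become ordinary columns of the row.
            rows.append((d1, d2, *attrs))
    return rows


# Hypothetical 2x2 array; each cell holds two attributes (A1, A2).
array = [
    [(20.0, 5.0), (19.0, 5.5)],
    [(21.0, 4.9), (18.5, 5.2)],
]
rows = array_to_rows(array)
```

Each row carries its dimension coordinates explicitly, so the same data can be stored in a relational table keyed on (D1, D2).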

Motivation (Cont’d)

• Uncertain data
  – Inevitable
  – Two categories

Motivation (Cont’d)

• Correlated uncertain data
  – Example: geographically distributed sensors

ID   X    Y    Z    T.    H.    P.
S1   11   9    15   20*   20*   5*
S2   11   7    13   19*   19*   5.5*

More application examples can be found in router network traffic analysis, quantization of images or sound, etc.

Modeling Correlated Uncertainty

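• PGM: Probabilistic Graphical Model
  – Bayesian network

$P(NC \mid B, E, A) = P(NC \mid A)$

$P(A \mid E, B)$, $P(A \mid E)$, $P(A \mid B)$ (the relation between these terms is not recoverable from the transcript)

Limitations:
1) Prior knowledge and initial probabilities
2) Significant computational cost (NP-hard)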

Modeling Correlated Uncertainty (Cont’d)

• PGM: Probabilistic Graphical Model
  – Markov Random Fields

A graphical model in which a set of random variables has a Markov property described by an undirected graph

Pros: cyclic dependencies
Cons: no induced dependencies

NP-hard to compute

Modeling Correlated Uncertainty (Cont’d)

• Considering the locality of correlation
  – E.g. a 2-dimensional array

Construction of A*-tree

• Basic A*-structure
  1) k-ary tree: k = 2^d, where d is the number of correlated dimensions
  2) Each leaf contains the joint distribution of the four neighboring cells it maps to
  3) The joint distribution at each internal node is recursively defined
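
The shape of the basic structure can be sketched as follows. This is only an illustration of the k = 2^d partitioning, assuming equal side lengths that are powers of two; the `Node`, `build` and `height` names are hypothetical, and the joint distributions stored at the nodes are left out.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    lo: tuple                                     # inclusive lower corner of the region
    hi: tuple                                     # exclusive upper corner of the region
    children: list = field(default_factory=list)  # empty for a leaf


def build(lo, hi):
    """Recursively split a region until every side spans at most 2 cells."""
    node = Node(lo, hi)
    if all(h - l <= 2 for l, h in zip(lo, hi)):
        return node                               # leaf: covers 2^d neighboring cells
    mid = tuple((l + h) // 2 for l, h in zip(lo, hi))
    # One child per choice of lower/upper half along each dimension: 2^d children.
    for choice in product((0, 1), repeat=len(lo)):
        c_lo = tuple(m if c else l for c, l, m in zip(choice, lo, mid))
        c_hi = tuple(h if c else m for c, m, h in zip(choice, mid, hi))
        node.children.append(build(c_lo, c_hi))
    return node


def height(node):
    """Number of levels from this node down to the leaves, inclusive."""
    return 1 + max((height(c) for c in node.children), default=0)


root = build((0, 0), (8, 8))   # d = 2, so every internal node has 4 children
```

For an 8×8 array this gives a root, four 4×4 children, and sixteen 2×2 leaves.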

Construction of A*-tree (Cont’d)

• Joint distribution at a node

X1 X2
X3 X4

Y = (X1 + X2 + X3 + X4)/4
Xi = Y(1 + Fi)

Fi has range k, the distribution table has r entries, and each probability is represented with l bits:

$\frac{r\,(3 \log k + l)}{8}$
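
A small worked sketch of these definitions, under stated assumptions: the logarithm is taken base 2, the division by 8 is read as converting bits to bytes, and the function names are hypothetical. Because the four Xi average to Y, the four factors Fi sum to zero, which is consistent with only three factors per table entry appearing in the size expression.

```python
import math


def summarize_block(x1, x2, x3, x4):
    """Return the mean Y and the factors Fi such that Xi = Y * (1 + Fi)."""
    y = (x1 + x2 + x3 + x4) / 4
    if y == 0:
        raise ValueError("factors are undefined when the mean is zero")
    factors = [x / y - 1 for x in (x1, x2, x3, x4)]
    return y, factors


def table_size(k, r, l):
    """r * (3 * log2(k) + l) / 8: r entries, three factors of log2(k) bits, l probability bits."""
    return r * (3 * math.log2(k) + l) / 8


y, factors = summarize_block(20, 19, 21, 20)
size = table_size(k=16, r=10, l=8)
```

With these sample values the mean is 20 and the factors are approximately 0, -0.05, 0.05 and 0.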

Construction of A*-tree (Cont’d)

• Extension of A*-tree
  – Uneven dimension sizes
    • A dimension of size 2k+1 is partitioned into k and k+1
    • The shorter dimension stops partitioning first, while partitioning of the longer dimension goes on
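
The two partitioning rules can be sketched as below. The stopping width `leaf_extent=2` is an assumption taken from the four-cell leaves of the basic structure, and both function names are hypothetical.

```python
def split_extent(n):
    """Split n cells into two parts; an odd n = 2k+1 gives k and k+1."""
    return n // 2, n - n // 2


def partition_levels(shape, leaf_extent=2):
    """Largest block extent per dimension at each level of partitioning."""
    current = tuple(shape)
    levels = [current]
    while any(n > leaf_extent for n in current):
        # A dimension that already reached the leaf extent is no longer split.
        current = tuple(
            split_extent(n)[1] if n > leaf_extent else n for n in current
        )
        levels.append(current)
    return levels


levels = partition_levels((4, 16))
```

For a 4×16 array the first dimension stops after one split, while the second keeps halving: (4, 16), (2, 8), (2, 4), (2, 2).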

Construction of A*-tree (Cont’d)

• Extension of A*-tree
  – Basic uncertainty blocks of arbitrary shapes
    • Intuitively, each cell is a basic uncertainty block; however, this granularity may be too fine
    • Initial identification of uncertainty blocks is user- and application-specific

Analysis of A*-tree

• Natural mapping from A*-tree to Bayesian Network

Analysis of A*-tree (Cont’d)

• How the A*-tree model expresses neighboring correlation
  – From the perspective of any random query, the average level where cell correlation is encoded is low (efficient inference and accurate modeling)

Analysis of A*-tree (Cont’d)

• Neighboring cells and clustering distance
  – Definition

Analysis of A*-tree (Cont’d)

• Neighboring cells and clustering distance

Analysis of A*-tree (Cont’d)

• CD (Clustering Distance)
  – For any query that may return q pairs of neighboring cells

Expected average CD:

$E(\mathit{avgCD}) = \sum_{i=1}^{h-1} i \cdot \frac{1}{2^{i+1}} + h \cdot \frac{1}{2^{h-1}}$

e.g. for a 1024×1024 array, h = 10, so E(avgCD) ≈ 1.01
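
As a quick numeric check, the sketch below evaluates the expected average CD under the assumption that it has the form sum_{i=1}^{h-1} i / 2^(i+1) + h / 2^(h-1). This form is a reconstruction from the garbled slide text, chosen because it reproduces the quoted value for h = 10; the function name is hypothetical.

```python
def expected_avg_cd(h):
    """Evaluate sum_{i=1}^{h-1} i / 2**(i+1) + h / 2**(h-1) for tree height h."""
    total = sum(i / 2 ** (i + 1) for i in range(1, h))
    return total + h / 2 ** (h - 1)


value = expected_avg_cd(10)   # 1024 x 1024 array, h = 10
```

Under this form the sum simplifies to 1 + (h - 1) / 2^h, so the value stays close to 1 as the array grows.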

Analysis of A*-tree (Cont’d)

• Accuracy vs. Efficiency
  – Double “flip”
  – Polynomial time scan O(d*n)
  – Consider basic uncertainty block

Query Processing

• Monte Carlo based query processing
  – Sampling

Q: SELECT AVG(brightness)
   FROM space_image
   WHERE Dis(x, y, z, 322, 108, 251) < 50
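
The slides do not show the sampling procedure itself, so the following is only a sketch of how a Monte Carlo estimate of an AVG query might be computed on such a tree. It assumes that each node holds a table of (factors, probability) rows, that the root mean is known, and that cell values follow Xi = Y(1 + Fi) top-down; the `Node`, `sample_cells` and `monte_carlo_avg` names are hypothetical.

```python
import random


class Node:
    """Hypothetical node: a distribution table plus four children (None for a leaf)."""

    def __init__(self, table, children=None):
        self.table = table          # list of ((f1, f2, f3, f4), probability)
        self.children = children    # list of 4 Nodes, or None


def sample_cells(node, y, rng):
    """Draw one joint sample of all cells under `node`, given the node mean y."""
    rows, weights = zip(*node.table)
    factors = rng.choices(rows, weights=weights, k=1)[0]
    values = [y * (1 + f) for f in factors]       # Xi = Y(1 + Fi)
    if node.children is None:
        return values                             # leaf: the four cell values
    cells = []
    for child, child_mean in zip(node.children, values):
        cells.extend(sample_cells(child, child_mean, rng))
    return cells                                  # cells in tree order, not row-major


def monte_carlo_avg(root, root_mean, selected, n_samples=1000, seed=0):
    """Average the per-sample mean of the selected cells over n_samples draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        cells = sample_cells(root, root_mean, rng)
        total += sum(cells[i] for i in selected) / len(selected)
    return total / n_samples


leaf = Node([((0.1, -0.1, 0.0, 0.0), 1.0)])
estimate = monte_carlo_avg(leaf, 10.0, selected=[0, 1], n_samples=5)
```

In this sketch each node is sampled from its parent's sampled value only, so one pass from the root to the leaves yields a complete joint sample.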

Query Processing (Cont’d)

• Compared with MRF
  – MRF requires sequential rounds of sampling
  – Each sampled node is computed from all the nodes

Query Processing (Cont’d)

• Other queries
  – COUNT, AVG and SUM

• Minimum Set Cover
• Built-in cell-count function
• Effective query answering

$\sum_{i=1}^{t} c_i$

$\sum_{i=1}^{t} c_i a_i$

$\left(\sum_{i=1}^{t} c_i a_i\right) / \sum_{i=1}^{t} c_i$
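
The aggregate expressions can be illustrated with a short sketch. It assumes that the query region is covered by t tree nodes, that c_i is the cell count of the i-th covering node, and that a_i is its average, so the three sums correspond to COUNT, SUM and AVG; this reading, the function name and the sample numbers are assumptions rather than details given on the slide.

```python
def aggregate(cover):
    """Combine (cell_count, average) pairs of the covering nodes into COUNT, SUM, AVG."""
    count = sum(c for c, _ in cover)              # sum of c_i
    total = sum(c * a for c, a in cover)          # sum of c_i * a_i
    return count, total, total / count            # AVG is the count-weighted mean


# Hypothetical cover: three nodes with 4, 16 and 1 cells.
cover = [(4, 20.0), (16, 18.0), (1, 25.0)]
count, total, avg = aggregate(cover)
```

Only the covering nodes are touched, not the individual cells, which is why a small set cover keeps the aggregate cheap to answer.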

Experiments

• Data set description

ID X Y T Temperature

• Evaluations
  – Accuracy of modeling the underlying joint distribution
  – Execution time
  – Aggregate query
  – Space cost

Experiments (Cont’d)

• Accuracy

Experiments (Cont’d)

• Accuracy

Experiments (Cont’d)

• Execution time

Experiments (Cont’d)

• Aggregate query and space cost

Thank you!

Q&A

