A Framework for Discovering Co-location Patterns in Data...

Page 1: A Framework for Discovering Co-location Patterns in Data ...datamining.rutgers.edu/talk/excom.pdf · Generalize the concept of co-location patterns to extended spatial objects, e.g.

Slide 1

A Framework for Discovering Co-location Patterns in Data Sets with Extended Spatial Objects

Hui Xiong

Department of Computer Science & Engineering
University of Minnesota - Twin Cities

(c) University of Minnesota - Twin Cities April 8, 2004

Slide 2

Overview

• Introduction
• General Problems
• Related Works
• Research Motivations
• A Buffer-based Model
• A Filter-and-Refine Co-location Pattern Mining Framework
• Experimental Evaluation
• Conclusions and Future Work

Slide 3

Introduction & Background

[Figure: Co-location Patterns - Sample Data. Scatter plot with X and Y axes, each ranging from 0 to 80.]

Slide 4

Examples of Co-location Patterns

[Slide text illegible in the source: the font encoding was lost in extraction.]

Slide 5

Related Works

• Spatial Statistics
  ◦ Use measures of spatial correlation to characterize the relationship between spatial features
    * the cross-K function with Monte Carlo simulation
    * mean nearest neighbor distance
    * spatial regression model
  ◦ Computationally expensive
• Data Mining Approaches
  ◦ A clustering-based approach by Estivill-Castro et al.
    * Features can be completely spatially random or declustered.
    * Sensitive to the choice of clustering algorithm.
  ◦ Association-rule based approaches.

Slide 6

Related Works - Cont.

• Association-rule based approaches.
  ◦ Transaction-based approaches.
    * A reference-feature centric model by Koperski et al.
      - Generalizing this paradigm to the case where no reference feature is specified is non-trivial.
      - May yield duplicate counts for many candidate associations.
  ◦ Distance-based approaches.
    * k-neighboring class sets by Morimoto.
      - The number of instances for each pattern is used as the prevalence measure.
    * An event centric model by Shekhar et al.
• All these approaches are for point spatial features.

Slide 7

Motivation

• Identifying co-location patterns in data sets with extended spatial objects (e.g. polygons and line strings).
  ◦ Highways often have frontage roads nearby in large metropolitan areas.
  ◦ Normandale Road → Highway 100

Slide 8

Problem Formulation

• Given:
  ◦ A set $T$ of $K$ spatial feature types, $T = \{f_1, f_2, \ldots, f_K\}$. Spatial data types can be points as well as extended spatial objects, such as line strings and polygons.
  ◦ A set of $N$ instances $P = \{p_1, \ldots, p_N\}$, where each $p_i \in P$ is a vector <instance-id, spatial feature type, location>, with spatial feature type $\in T$ and location $\in$ spatial framework $S$.
  ◦ A buffer size, a minimum prevalence threshold, and a minimum conditional probability threshold.
• Find: Co-location Patterns and Co-location Rules.
• Objective: Computational Efficiency.
• Constraints: Correctness and Completeness.
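
The inputs of the problem formulation can be pictured as a small data model. The sketch below is only an illustration: the class names `SpatialInstance` and `MiningParams`, their field names, and the helper `feature_types` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class SpatialInstance:
    """One instance: <instance-id, spatial feature type, location>."""
    instance_id: int
    feature_type: str           # one of the K feature types in T
    kind: str                   # "point", "linestring" or "polygon" (assumed labels)
    coords: Tuple[Point, ...]   # vertices describing the location


@dataclass(frozen=True)
class MiningParams:
    """The three user-supplied thresholds listed under 'Given'."""
    buffer_size: float
    min_prevalence: float
    min_cond_prob: float


def feature_types(instances: Iterable[SpatialInstance]) -> List[str]:
    """Recover the set T of feature types from the instance set P."""
    return sorted({inst.feature_type for inst in instances})
```

A point is stored as a single vertex, so the same record shape covers points, line strings, and polygons.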

Slide 9

A Buffer-based Model

Definition 1 A buffer is a zone of specified distance around spatial objects. The boundary of the buffer is the isoline of equal distance to the edge of the objects.

• Motivation
  ◦ Objects in space frequently have some sort of impact on the objects and areas around them:
    * freeways create “noise pollution” that can be heard blocks away.
    * factories emit fumes that can affect people for miles around.

Slide 10

A Buffer-based Model

[Figure: a point object O1, a line-string object O2, a polygon object O3, and an instance r, each shown with its neighborhood.]

Definition 2 $N(p)$, the size-$d$ Euclidean neighborhood of a point location $p$, is a circle of radius $d$ with $p$ as its center.

Definition 3 $N(O)$, the size-$d$ neighborhood of an extended spatial object $O$ (e.g. polygon, line-string), is defined by the buffer operation.
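
The two neighborhood definitions can be read as membership tests: a location belongs to a size-d neighborhood when its distance to the object is at most d. The following is a minimal sketch under that reading, for points and line strings only; the function names are illustrative and not taken from the slides. A polygon would additionally need a point-in-polygon test, which is left out here.

```python
import math


def point_segment_distance(q, a, b):
    """Euclidean distance from point q to the segment from a to b."""
    (qx, qy), (ax, ay), (bx, by) = q, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(qx - ax, qy - ay)
    # Parameter of the orthogonal projection, clamped onto the segment.
    t = ((qx - ax) * dx + (qy - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(qx - (ax + t * dx), qy - (ay + t * dy))


def in_point_neighborhood(q, p, d):
    """True if q lies in the circle of radius d centred on p."""
    return math.hypot(q[0] - p[0], q[1] - p[1]) <= d


def in_linestring_neighborhood(q, vertices, d):
    """True if q lies within distance d of any segment of the line string."""
    return any(
        point_segment_distance(q, a, b) <= d
        for a, b in zip(vertices, vertices[1:])
    )
```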

Slide 11

A Buffer-based Model - Cont.

Definition 4 The coverage ratio $Pr(f_1 f_2 \ldots f_k)$ for a co-location $C = \{f_1, \ldots, f_k\}$ is $\frac{N(f_1 f_2 \ldots f_k)}{\text{total area of the plane}}$, where $N(f_1 f_2 \ldots f_k)$ is the Euclidean neighborhood of the co-location $C$.

Definition 5 The conditional probability $Pr(C_2 \mid C_1)$ of a co-location rule $C_1 \rightarrow C_2$ is the probability of finding the neighborhood of $C_2$ in the neighborhood of $C_1$. It can be computed as $\frac{N(C_1 \cup C_2)}{N(C_1)}$ using the neighborhoods of co-locations $C_1$ and $C_1 \cup C_2$.

Lemma 1 The coverage ratio for co-location patterns is monotonically non-increasing as the size of the co-location pattern increases.
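
To make the coverage ratio and the conditional probability concrete, the sketch below estimates both on a regular grid for point features. It rests on two assumptions that the slides do not state: the neighborhood of a feature is taken as the union of its instances' neighborhoods, and the neighborhood of a co-location as the intersection over its features. The grid estimate itself is a stand-in for an exact area computation, and all names are illustrative.

```python
import math


def covered(q, instances, d):
    """True if grid point q lies in the size-d neighborhood of any instance."""
    return any(math.hypot(q[0] - x, q[1] - y) <= d for x, y in instances)


def coverage_ratio(pattern, instances_by_feature, d, width, height, step=1.0):
    """Estimate Pr(pattern): share of grid cells whose centre is covered
    by every feature of the pattern (assumed intersection semantics)."""
    nx, ny = int(width / step), int(height / step)
    hits = 0
    for i in range(nx):
        for j in range(ny):
            q = ((i + 0.5) * step, (j + 0.5) * step)
            if all(covered(q, instances_by_feature[f], d) for f in pattern):
                hits += 1
    return hits / (nx * ny)


def conditional_probability(c1, c2, instances_by_feature, d, width, height, step=1.0):
    """Estimate Pr(C2 | C1) = N(C1 u C2) / N(C1) from the two coverage ratios."""
    base = coverage_ratio(c1, instances_by_feature, d, width, height, step)
    if base == 0:
        return 0.0
    union = tuple(sorted(set(c1) | set(c2)))
    joint = coverage_ratio(union, instances_by_feature, d, width, height, step)
    return joint / base
```

Because both areas are divided by the same total area, the ratio of the two coverage ratios equals the ratio of the two neighborhood areas. Adding a feature to a pattern can only shrink the covered region, which is the behaviour Lemma 1 describes.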

Slide 12

A Coarse-Level Co-location Pattern Mining Framework

[Figure: a point object, a line-string object, a polygon object, and a row instance r for two point objects, each shown with its neighborhood.]

Definition 6 $BN(O)$, the bounding neighborhood of a spatial object $O$, is defined as MBBR(Buffer(MOBR(Spatial Object O), d)), where MOBR is the minimum object bounding box, Buffer is the buffer operation with buffer size d, and MBBR is the minimum buffer bounding box.

Definition 7 The Euclidean bounding neighborhood $BN(f)$ of a spatial feature $f$ is the union of $BN(i)$ for every instance $i$ of the spatial feature $f$.
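
For an axis-aligned box, the composition MBBR(Buffer(MOBR(O), d)) reduces to growing the object's bounding box by d on every side, because the buffer of a rectangle extends exactly d beyond each edge. A minimal sketch, with hypothetical function names and boxes written as `(x_lb, y_lb, x_rt, y_rt)`:

```python
def mobr(coords):
    """Minimum object bounding box of a list of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def bounding_neighborhood(coords, d):
    """Bounding box of the size-d buffer around the object's bounding box."""
    x_lb, y_lb, x_rt, y_rt = mobr(coords)
    return (x_lb - d, y_lb - d, x_rt + d, y_rt + d)
```

The result is always a rectangle, whatever the shape of the object, which is what makes the later geometric computations cheap.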

Slide 13

A Coarse-Level Co-location Pattern Mining Framework - Cont.

Definition 8 The Euclidean bounding neighborhood $BN(f_1 f_2 \ldots f_k)$ for a coarse-level co-location pattern $CC = \{f_1, \ldots, f_k\}$ is the intersection of $BN(f_i)$ for every spatial feature $f_i$ in $CC$.

Definition 9 The coarse-level coverage ratio $CPr(f_1 f_2 \ldots f_k)$ for a coarse-level co-location pattern $CC = \{f_1, \ldots, f_k\}$ is $\frac{BN(f_1 f_2 \ldots f_k)}{\text{total area of the plane}}$, where $BN(f_1 f_2 \ldots f_k)$ is the Euclidean bounding neighborhood of the coarse-level co-location pattern $CC$.
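
One way to evaluate the coarse-level coverage ratio is to cut the plane along every box edge and add up the cells that every feature of the pattern covers. The sketch below takes that route; it is not the method of the slides, which use inclusion-exclusion over rectangle intersections. Boxes are `(x_lb, y_lb, x_rt, y_rt)` tuples and all names are illustrative.

```python
def bounding_area(boxes_by_feature, pattern):
    """Area of the intersection, over the pattern's features, of the union
    of each feature's instance boxes (Definitions 7 and 8)."""
    boxes = [b for f in pattern for b in boxes_by_feature[f]]
    xs = sorted({x for b in boxes for x in (b[0], b[2])})
    ys = sorted({y for b in boxes for y in (b[1], b[3])})
    area = 0.0
    for x_lo, x_hi in zip(xs, xs[1:]):
        for y_lo, y_hi in zip(ys, ys[1:]):
            # Each cell is either fully inside or fully outside every box,
            # so testing its centre is enough.
            cx, cy = (x_lo + x_hi) / 2, (y_lo + y_hi) / 2
            if all(
                any(b[0] <= cx <= b[2] and b[1] <= cy <= b[3]
                    for b in boxes_by_feature[f])
                for f in pattern
            ):
                area += (x_hi - x_lo) * (y_hi - y_lo)
    return area


def coarse_coverage_ratio(boxes_by_feature, pattern, total_area):
    """CPr(pattern) = BN(pattern) / total area of the plane (Definition 9)."""
    return bounding_area(boxes_by_feature, pattern) / total_area
```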

Slide 14

A Coarse-Level Co-location Pattern Mining Framework - Cont.

Lemma 2 The coarse-level coverage ratio for coarse-level co-location patterns is monotonically non-increasing as the size of the coarse-level co-location pattern increases.

Lemma 3 For any spatial feature set $F = \{f_1, f_2, \ldots, f_k\}$, the coarse-level coverage ratio $CPr(F)$ is larger than or equal to the coverage ratio $Pr(F)$.
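
The inequality between the two ratios can be traced to a containment argument. The chain below is a sketch, assuming that the fine-level neighborhood of a feature set is built from object buffers in the same way the bounding neighborhood is built from bounding boxes (union over instances, then intersection over features); the slides state that construction only for the bounding case.

```latex
\begin{align*}
N(O) &\subseteq BN(O)
  && \text{a buffer lies inside the bounding box of the buffered box} \\
\Rightarrow\quad N(f_1 f_2 \ldots f_k) &\subseteq BN(f_1 f_2 \ldots f_k)
  && \text{unions and intersections preserve containment} \\
\Rightarrow\quad Pr(F) &\le CPr(F)
  && \text{both areas are divided by the same total area}
\end{align*}
```

Read this way, a feature set whose coarse-level ratio falls below the prevalence threshold cannot reach the threshold at the fine level, so discarding it in the filter step loses no prevalent pattern.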

Slide 15

A Coarse-Level Co-location Pattern Mining Framework - Cont.

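[Figure: example layout with instances A1 to A4, B1 to B3, and C1.]

[Worked example: the coarse-level coverage ratio $CPr$ is computed as bounding-neighborhood area over total area, once for feature A alone and once for a pattern containing A; the numeric values are illegible in the source.]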

Slide 16

Geometric Challenges and Solutions

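[Figure: bounding neighborhoods of three events A1, A2, A3 in a spatial framework S with axes X and Y. Each box is labelled by its left-bottom corner (x_lb, y_lb) and right-top corner (x_rt, y_rt), and the points (X1, Y1) and (X2, Y2) are marked.]

Lemma 3 For any n spatial events $A_1, \ldots, A_n$,

$$\bigcup_{i=1}^{n} BN(A_i) = \sum_{i=1}^{n} BN(A_i) - \sum_{i<j} BN(A_i A_j) + \sum_{i<j<k} BN(A_i A_j A_k) - \sum_{i<j<k<l} BN(A_i A_j A_k A_l) + \cdots + (-1)^{n-1} BN(A_1 A_2 \ldots A_n) \qquad (1)$$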
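
Equation 1 is the inclusion-exclusion principle applied to rectangle areas. A direct sketch follows, with boxes written as `(x_lb, y_lb, x_rt, y_rt)` and hypothetical function names. It enumerates every subset of boxes, so its cost grows exponentially with the number of boxes and it is meant only to illustrate the formula.

```python
from itertools import combinations


def intersection_area(boxes):
    """Area common to all given axis-aligned boxes (0.0 if they do not overlap)."""
    x1 = max(b[0] for b in boxes)
    y1 = max(b[1] for b in boxes)
    x2 = min(b[2] for b in boxes)
    y2 = min(b[3] for b in boxes)
    if x2 <= x1 or y2 <= y1:
        return 0.0
    return float((x2 - x1) * (y2 - y1))


def union_area(boxes):
    """Area of the union of boxes via inclusion-exclusion (Equation 1)."""
    total = 0.0
    for size in range(1, len(boxes) + 1):
        sign = 1.0 if size % 2 == 1 else -1.0   # +, -, +, ... by subset size
        for subset in combinations(boxes, size):
            total += sign * intersection_area(subset)
    return total
```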

Slide 17

Geometric Challenges and Solutions - Cont.

Theorem 2 Given any n spatial events $A_1, A_2, \ldots, A_n$ and the corresponding bounding neighborhoods $((x_{1,lb}, y_{1,lb}), (x_{1,rt}, y_{1,rt}))$, $((x_{2,lb}, y_{2,lb}), (x_{2,rt}, y_{2,rt}))$, $\ldots$, $((x_{n,lb}, y_{n,lb}), (x_{n,rt}, y_{n,rt}))$, where the bounding neighborhood of the event $A_i$, $1 \le i \le n$, is represented by the left bottom point $(x_{i,lb}, y_{i,lb})$ and the right top point $(x_{i,rt}, y_{i,rt})$, if the bounding neighborhoods of these n spatial events have a common intersection area, then this intersection area can be computed by Equation 2.

$$BN(A_1 A_2 \ldots A_n) = (X_2 - X_1) \times (Y_2 - Y_1) \qquad (2)$$

where

$X_2 = \min\{x_{1,rt}, x_{2,rt}, \ldots, x_{n,rt}\}$,

$X_1 = \max\{x_{1,lb}, x_{2,lb}, \ldots, x_{n,lb}\}$,

$Y_2 = \min\{y_{1,rt}, y_{2,rt}, \ldots, y_{n,rt}\}$,

$Y_1 = \max\{y_{1,lb}, y_{2,lb}, \ldots, y_{n,lb}\}$.
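
Equation 2 translates almost line for line into code. In the sketch below each bounding neighborhood is a pair `((x_lb, y_lb), (x_rt, y_rt))`, matching the theorem's notation; the function name and the choice to return 0.0 when there is no common intersection are assumptions made for illustration.

```python
def common_intersection_area(neighborhoods):
    """Area shared by all bounding neighborhoods, following Equation 2."""
    X1 = max(lb[0] for lb, _ in neighborhoods)   # largest left edge
    Y1 = max(lb[1] for lb, _ in neighborhoods)   # largest bottom edge
    X2 = min(rt[0] for _, rt in neighborhoods)   # smallest right edge
    Y2 = min(rt[1] for _, rt in neighborhoods)   # smallest top edge
    if X2 <= X1 or Y2 <= Y1:
        # The theorem's precondition fails: no common intersection area.
        return 0.0
    return float((X2 - X1) * (Y2 - Y1))
```

The computation needs only four min/max scans, so it is linear in the number of events.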

Slide 18

Algorithm Design

• DCS: A Direct Combinatorial Search Algorithm.
• EXCOM: An Extended Co-location Mining Algorithm.

[Figure: the two pipelines. DCS: spatial objects → direct combinatorial search with buffer test (prevalence-based pruning) → co-location patterns. EXCOM: spatial objects → geometric filter (quad-tree) → coarse-level co-location patterns → refinement combinatorial search with buffer test (prevalence-based pruning) → co-location patterns.]
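
The difference between the two pipelines can be sketched with a generic level-wise search. The slides do not spell out the search procedure, so the Apriori-style candidate join below is an assumption, and `mine`, `dcs_sketch`, and `excom_sketch` are hypothetical names. The ratio functions are passed in as callables: a cheap coarse-level one and an expensive exact one.

```python
from itertools import combinations


def mine(features, ratio, threshold, allowed=None):
    """Level-wise search for patterns with ratio >= threshold.
    If `allowed` is given, patterns outside it are skipped without
    evaluating `ratio` (this is where the filter output is used)."""
    prevalent = []
    current = [(f,) for f in sorted(features)]
    while current:
        kept = [
            p for p in current
            if (allowed is None or p in allowed) and ratio(p) >= threshold
        ]
        prevalent.extend(kept)
        kept_set = set(kept)
        candidates = set()
        for a in kept:
            for b in kept:
                # Join two patterns that share all but their last feature.
                if a[:-1] == b[:-1] and a[-1] < b[-1]:
                    cand = a + (b[-1],)
                    # Anti-monotone pruning: every sub-pattern must be prevalent.
                    if all(s in kept_set for s in combinations(cand, len(cand) - 1)):
                        candidates.add(cand)
        current = sorted(candidates)
    return prevalent


def dcs_sketch(features, exact_ratio, threshold):
    """Direct search: every candidate gets the expensive buffer test."""
    return mine(features, exact_ratio, threshold)


def excom_sketch(features, coarse_ratio, exact_ratio, threshold):
    """Filter with the coarse ratio first, then refine the survivors."""
    coarse = set(mine(features, coarse_ratio, threshold))
    return mine(features, exact_ratio, threshold, allowed=coarse)
```

As long as the coarse ratio never underestimates the exact one, both functions return the same patterns, and the filtered version evaluates the exact ratio on fewer candidates.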

Slide 19

Experimental Setup

• Experimental Data Set: MN/DOT base map.
• Experimental Design

[Figure: experimental design. A road map data set is fed to the candidate co-location mining algorithms (DCS, EXCOM) with parameters coverage ratio and buffer size; the outputs are summary measurements (e.g. execution time) and a co-location ratio analysis for line-string co-location patterns.]

Slide 20

The Filtering Effect of the Geometric Component

[Figure: running time (sec, logarithmic axis from 100 to 10000) against buffer size (feet, 20 to 160) for DCS and EXCOM.]

• The geometric filter can speed up the prevalence-based pruning approach by a factor of 30 to 40.

Slide 21

Line-String Co-location Patterns for Test Route Selection

• Evaluate the positional accuracy of digital roadmap databases.
• Co-located roads are the most challenging test sites for evaluating the ability of global positioning systems (GPS) to identify correct roads from a digital roadmap.

Slide 22

Contributions

• Generalize the concept of co-location patterns to extended spatial objects, e.g. polygons and line-strings.
• Propose a novel buffer-based model for mining co-location patterns. This model has three advantages over the event centric model and is transaction-free.
• Propose a geometric filter-and-refine co-location mining framework.
• Experimental evaluation with a real data set shows that the geometric filter-and-refine approach can speed up the prevalence-based pruning approach by a factor of 30 to 40.
• An application of line-string co-location patterns to test route selection is provided to show the usefulness of co-location patterns.

Slide 23

Conclusions and Future Work

• Extending the notion of co-location pattern
  ◦ de-colocation pattern
  ◦ co-incidence pattern
• Applications of Co-location Pattern
