
Modeling Spatial and Spatio-temporal Co-occurrence Patterns Mete Celik Spatial Database / Data...

Date post: 18-Jan-2018


Modeling Spatial and Spatio-temporal Co-occurrence Patterns
Mete Celik
Spatial Database / Data Mining Group, Department of Computer Science, University of Minnesota
Advisor: Shashi Shekhar

Slide 2: Major Projects
  US DoD/ERDC/TEC: Modeling and Mining Spatio-Temporal Co-occurrence Patterns. Defined interest measures, and designed models and algorithms to analyze moving object datasets.
  JGI/DTC: A Digital Library to Archive Research Material from Jane Goodall's Gombe Chimpanzee Project. Designed a database schema, and developed digital archive databases and a web search engine to query visual materials of the Gombe project.
  NGA: Spatio-Temporal Pattern Mining for Multi-Jurisdiction Multi-Temporal Activity Datasets. Designed models and algorithms to discover graph-based hotspots (streets with high crime activity).
  AHPCRC: High Performance Spatial Data Mining. Formulated scalable solutions and designed new heuristics for the spatial auto-regression model.
Biography: Education
  Ph.D. student, Dept. of Computer Science, U. of Minnesota, MN, 2002-Present.
  M.S., Ph.D. student, Dept. of Electronic Eng., Erciyes University, Turkey
  B.S., Dept. of Control & Computer Eng., Erciyes University, Turkey

Slide 3: Thesis Related Publications
  Chapter 2
    Zonal Co-location Pattern Discovery with Dynamic Parameters, w/ J. M. Kang, S. Shekhar, in Proc. of 7th IEEE Int'l Conf. on Data Mining (ICDM), NE.
  Chapter 3
    Mixed-drove Spatio-temporal Co-occurrence Pattern Mining, w/ S. Shekhar, J. P. Rogers, J. A. Shine, accepted to the IEEE TKDE.
    Mixed-drove Spatio-temporal Co-occurrence Pattern Mining: A Summary of Results, w/ S. Shekhar, J. P. Rogers, J. A. Shine, and J. S. Yoo, in Proc. of 6th IEEE Int'l Conf. on Data Mining (ICDM), Hong Kong.
  Chapter 4
    Sustained Emerging Spatio-temporal Co-occurrence Pattern Mining: A Summary of Results, w/ S. Shekhar, J. P. Rogers, and J. A. Shine, in Proc. of IEEE Int'l Conf. on Tools with Artificial Intelligence (ICTAI), Washington, D.C., 2006.

Slide 4: Other Publications
  Spatio-temporal Co-occurrence Pattern Mining
    Mining At Most Top-K% Mixed-drove Spatio-temporal Co-occurrence Patterns: A Summary of Results, w/ S. Shekhar, J. P. Rogers, J. A. Shine, and J. M. Kang, in Proc. of the Workshop on Spatio-Temporal Data Mining (IEEE ICDE 2007), Turkey.
    Discovery of Co-evolving Spatial Event Sets, w/ J. S. Yoo, S. Shekhar, S. Kim, in Proc. of the SIAM Int'l Conf. on Data Mining (SDM), Bethesda, Maryland.
  Spatial Co-location Pattern Mining
    Zonal Co-location Pattern Discovery with Dynamic Parameters, w/ J. M. Kang, S. Shekhar, in Proc. of 7th IEEE Int'l Conf. on Data Mining (ICDM), NE.
    A Join-less Approach for Co-location Pattern Mining: A Summary of Results, w/ J. S. Yoo, S. Shekhar, in Proc. of the 5th IEEE Int'l Conf. on Data Mining (ICDM), Houston, Texas.
  Misc. Spatial Data Mining
    Spatial Dependency Modeling Using Spatial Auto-regression, w/ B. M. Kazar, S. Shekhar, D. Boley, D. J. Lilja, in Proc. of the ISPRS/ICA Workshop on Geospatial Analysis and Modeling as part of Int'l Conf. GICON, Austria.
    Parameter Estimation for the Spatial Auto-Regression Model: A Rigorous Approach, w/ B. M. Kazar, S. Shekhar, and D. Boley, The Second NASA Data Mining Workshop: Issues and Applications in Earth Science, California, 2006.

Slide 5: Outline
  Motivation and Related Work
  Example Mixed-drove Co-occurrence Pattern
  Limitation of Related Work
  Contributions
  Related Work
  MDCOP Mining Problem
  Proposed MDCOP Mining Algorithms
  Evaluation
  Conclusion and Future Work

Slide 6: MDCOP Motivating Example: Input
  Manpack stinger (2 objects), M1A1_tank (3 objects), M2_IFV (3 objects), Field_Marker (6 objects), T80_tank (2 objects), BRDM_AT5 (enemy) (1 object), BMP1 (1 object)

Slide 7: MDCOP Motivating Example: Output
  Manpack stinger (2 objects), M1A1_tank (3 objects), M2_IFV (3 objects), Field_Marker (6 objects), T80_tank (2 objects), BRDM_AT5 (enemy) (1 object), BMP1 (1 object)

Slide 8: Why are mixed-drove patterns important?
  Improving capabilities of information processing: Earth science, environmental management, government services, and transportation.
  Supporting explanatory or descriptive use for intelligence, resource allocation, and confirmatory analysis.
  Public health (emerging infectious diseases)
  Ecology (tracking species and pollutant movements)
  Homeland defense (looking for growing events, biodefense)
  Military: identifying patterns or critical elements; predicting near-future locations of enemy units

Slide 9: Challenges
  Current interest measures (e.g., participation index) are not sufficient to quantify such patterns. A new composite interest measure must be created and formalized.
  The set of candidate patterns grows exponentially with the number of object-types.
  Spatio-temporal datasets are huge. Computationally efficient algorithms must be developed.

Slide 10: Related Work 1 - Mining of uniform groups of moving objects
  Approaches: flock pattern [Gudmundsson05], moving clusters [Kalnis05] (figure: time slots t1, t2, t3)
  Does not recognize groups of mixed object-types
  Does not recognize if time intervals are discrete
  Treats different types of objects as the same
  Requires patterns to be in consecutive time slots

Slide 11: Related Work 2 - Mining of mixed groups of moving objects
  Generalize co-location patterns to the spatio-temporal domain
  Collocation episodes [Cao et al., ICDM06]: reference-centric model
  Topological patterns [Wang et al., CIKM05]: semantics are not well defined for moving objects

Slide 12: Example: American Football
  Broken blitz play: The objective of the offensive wide receivers (W) is to outrun any linebackers (L) and defensive backs (C) and get behind them, catching an undefended pass while running untouched for a touchdown.
  Sketch of the game (figure). Output: MDCOP (wide receiver, cornerback)
  Time slot t=0: Ws and Cs, and Ws and Ls, are co-located: {W.1, C.1}, {W.4, C.2}.
  Time slot t=1: Ws begin their run, while Cs remain in their original position, possibly due to a fake handoff from the Q to the running back.
  Time slot t=2: Ws cross over each other and try to drift further away from their respective Cs.
  Time slot t=3: Q shows signs of throwing the football; Cs run to their respective Ws. S guards Ws.
  Flock pattern: there are no flock patterns, because the moving objects are not of the same type.
  Moving clusters: there are no moving clusters, because there is no pattern in consecutive time slots; moving clusters also do not take into account patterns of different object types.
  Co-location episodes: there are no co-location episodes, because there is no reference object type in consecutive time slots.
  Topological patterns: would discover {W,C}, {W,L}, {W,S}, and {W,C,S}. However, {W,L}, {W,S}, and {W,C,S} (1/4) are not time persistent.

Slide 13: Related Work Summary
  Mining of uniform groups of moving objects
    Flock pattern [1,2]: level = object; time interval = consecutive
    Moving clusters [3]: level = object; time interval = consecutive
  Mining of mixed groups of moving objects
    Collocation episodes: level = object-type (reference centric); time interval = consecutive
    Topological patterns: level = object-type; time interval: semantics are not well defined
    Mixed-drove pattern (MDCOP): level = object-type; time interval = consecutive and non-consecutive
  * The proposed approach will catch MDCOP (W, C).

Slide 14: Proposed Approach: Key Ideas
  Defined a new monotonic composite interest measure
    Generalizes the participation index, a monotonic interest measure for spatial co-location patterns
    Adds temporal persistence over an interval
  Developed novel and computationally efficient MDCOP mining algorithms
    Exploit the monotonic interest measure to prune candidates
    Alternative designs for temporal persistence:
      i) post-processing after reuse of co-location-miner
      ii) temporal pruning after pattern size 1, 2, 3, ...
      iii) temporal pruning as soon as possible

Slide 15: Outline
  Introduction
  Related Work
  MDCOP Mining Problem and Algorithms
    Key Concepts
    Interest Measure
    Formal Problem Definition
    Algorithms: Naive Approach, MDCOP-Miner, FastMDCOP-Miner
  Evaluation
  Conclusion and Future Work

Slide 16: Key Concepts-1
  A spatial co-location Col is a set of object-types that are frequently co-located. Example: Col = {W, C}, co-located in t=0 and t=3 (sketch of the game).
  Participation ratio: PR(o_i, Col) = (# instances of o_i in co-location Col) / (# instances of o_i)
  Spatial prevalence measure (participation index): PI(Col) = min_i { PR(o_i, Col) }
  For Col = {W, C}:
    PI(Col) = min(PR(W,Col), PR(C,Col)) = 2/4 for t=0
    PI(Col) = min(PR(W,Col), PR(C,Col)) = 3/4 for t=3
  A co-location is called spatially prevalent if its PI is not less than a given threshold θ_p.
  Time slot   PR(W,Col)   PR(C,Col)
  t=0         2/4         2/2
  t=1         0           0
  t=2         0           0
  t=3         3/4         2/2

Slide 17: Key Concepts-2
  Definition 1: The time prevalence (persistence) measure of a pattern P over the time interval T = {T_0, T_1, ..., T_n} is
    TP(P, T) = (# of time slots where the pattern occurs) / (total # of time slots)
  Example: {W, C} pairs are co-located in time slots t=0 and t=3, so the time prevalence of pattern {W, C} is 2/4.
  Spatial prevalence measure per time slot, and time prevalence index:
  Pattern   t=0   t=1   t=2   t=3   Time prevalence index
  W,C       2/4   0     0     3/4   2/4
  W,L       2/4   0     0     0     1/4
  W,S       0     0     0     3/4   1/4
  A pattern is time prevalent if its time prevalence measure is not less than a given threshold θ_time. For example, if θ_time = 0.5 then {W, C} is time prevalent, because TP({W, C}) = 2/4 >= θ_time.

Slide 18: Key Concepts-3
  Definition 2: The mixed-drove prevalence measure of a pattern P is a composition of the spatial prevalence and time prevalence measures.
  A mixed-drove prevalence measure is monotonically non-increasing with respect to MDCOP size.
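As a minimal sketch of the measures defined in Key Concepts 1 and 2, the Python below computes the participation ratio, participation index, and time prevalence for {W, C}, using the instance pairs the slides list for t=0 and t=3. The function names and the dictionary layout are illustrative assumptions rather than anything from the original work, and "the pattern occurs in a time slot" is read here as "its participation index meets the spatial threshold".

```python
from fractions import Fraction

def participation_ratio(obj_type, pattern_instances, total_counts):
    """PR(o_i, Col): fraction of o_i's instances that appear in some instance of Col."""
    participating = {inst[obj_type] for inst in pattern_instances}
    return Fraction(len(participating), total_counts[obj_type])

def participation_index(pattern, pattern_instances, total_counts):
    """PI(Col): minimum participation ratio over the object types in Col."""
    return min(participation_ratio(t, pattern_instances, total_counts) for t in pattern)

def time_prevalence(pi_per_slot, theta_p):
    """Fraction of time slots whose participation index reaches theta_p (assumed reading)."""
    hits = sum(1 for pi in pi_per_slot if pi >= theta_p)
    return Fraction(hits, len(pi_per_slot))

# Instance counts and {W, C} instance pairs as given in the football example.
total_counts = {"W": 4, "C": 2}
instances_by_slot = [
    [{"W": "W.1", "C": "C.1"}, {"W": "W.4", "C": "C.2"}],                            # t=0
    [],                                                                              # t=1
    [],                                                                              # t=2
    [{"W": "W.2", "C": "C.1"}, {"W": "W.4", "C": "C.1"}, {"W": "W.1", "C": "C.2"}],  # t=3
]

pi_per_slot = [participation_index(("W", "C"), insts, total_counts)
               for insts in instances_by_slot]
tp = time_prevalence(pi_per_slot, Fraction(1, 2))
```

With both thresholds set to 0.5, `pi_per_slot` holds 2/4, 0, 0, 3/4 and `tp` is 2/4, so {W, C} would count as both spatially prevalent in two slots and time prevalent overall, in line with the slide example.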
  Example: if θ_p = 0.5, co-location {W,C} is spatially prevalent in time slots t=0 and t=3, so Prob({W,C}) = 2/4.
  Spatial prevalence measure per time slot, and time prevalence index:
  Pattern   t=0   t=1   t=2   t=3   Time prevalence index
  W,C       2/4   0     0     3/4   2/4
  W,L       2/4   0     0     0     1/4
  W,S       0     0     0     3/4   1/4
  A pattern is mixed-drove prevalent if its mixed-drove prevalence satisfies the thresholds θ_p and θ_time. For example, if θ_p = 0.5 and θ_time = 0.5, then {W,C} is prevalent, since Prob({W,C}) = 2/4 >= 0.5.

Slide 19: Problem Definition
  Given
    A set P of Boolean ST object-types over a common ST framework
    A neighbor relation R over locations
    A spatial prevalence index threshold, θ_p
    A time prevalence index threshold, θ_time
  Find
    Mixed-drove spatio-temporal co-occurrence patterns whose spatial prevalence >= θ_p and time prevalence >= θ_time
  Objective
    Minimize computation cost
  Constraints
    Correctness
    Completeness
    Monotonic composite interest measure

Slide 20: Example: An American Football Play
  Input
    Each play is a spatio-temporal dataset
    Boolean object-types are the roles of players (e.g., wide receiver, cornerback, linebacker)
    Objects are instances of object types (instances of W are W.1, W.2, W.3, W.4)
    The duration of the play is 4 time units
    The neighborhood relation may be a distance of less than 1 meter or an average arm's length
    Spatial prevalence threshold θ_p = 0.5
    Time prevalence threshold θ_time = 0.5
  Output
    Pattern {W, C}
  (Figure: sketch of the game at time slots t=0, t=1, t=2, t=3)

Slide 21: Proposed Approach: Key Ideas
  Algorithms: i) Naive Approach, ii) MDCOP-Miner, iii) FastMDCOP-Miner
  Key decision: timing of temporal pruning
    i) post-processing after reuse of co-location-miner
    ii) after pattern size 1, 2, 3, ...
    iii) as soon as possible

Slide 22: Execution Trace Summary of MDCOPs
  Columns t=0 to t=3 give the spatial prevalence index; the last three columns give the status in each algorithm after size k.
  Pattern   t=0   t=1   t=2   t=3   Time prev. index   FastMDCOP        MDCOP-Miner     Naive
  (generate size 2)
  W C       2/4   0     0     3/4   2/4                survived         survived        survived
  W L       2/4   0     0     0     1/4                already pruned   pruned          survived
  W S       0     0     0     2/4   1/4                already pruned   pruned          survived
  C S       0     0     0     1     1/4                already pruned   pruned          survived
  (generate size 3)
  W S C     0     0     0     2/4   1/4                not generated    not generated   survived
  Callouts on the slide: for FastMDCOP, "not calculated", "pruned", "no generation"; for Naive, pruning happens at post-processing.

Slide 23: Naive Approach and MDCOP-Miner (figure)
  Input: spatio-temporal dataset, θ_p = 0.5, θ_time = 0.5
  Size 2 candidates (pairs) in each time slot t=0 to t=3: WC WL WS WQ CS CL CQ SL SQ LQ
  Naive Approach: triple WSC is generated; no generation in the other time slots; temporal pruning at the end (prune WL, WS, CS, WSC). Output MDCOPs: WC
  MDCOP-Miner: temporal pruning of WL, WS, CS after the pairs; no generation of triples. Output MDCOPs: WC
  Legend: bold black and blue: common; blue underlined: spatial pruning; bold red: difference

Slide 24: Proposed Approaches
  Naive Approach
    Step 1) Find spatial co-locations for all time slots
    Step 2) Post-processing: prune time non-prevalent MDCOPs
    Limitation: redundant generation of non-prevalent MDCOPs before post-processing
    Pseudo-code:
      1. Initialization
      2. while k in (1, 2, 3, ..., K) {
      3.   for each time slot {
      4.     generate co-locations
      5.     generate co-location instances
      6.     prune_spatial_non-prevalent_co-loc }
      7. }
      Post-processing step
      1. for size k spatial prevalent co-locations {
      2.   calculate_time_prevalence_index
      3.   prune_temporal_non-prevalent_co-occur }
  MDCOP-Miner
    Idea: push the post-processing step inside the loop.
    Step 1) Find MDCOP patterns by applying the MDCOP prevalence interest measure.
    Advantage: MDCOP-Miner eliminates redundant candidate generation of non-prevalent MDCOPs
    Pseudo-code:
      1. Initialization
      2. while k in (1, 2, 3, ..., K) {
      3.   for each time slot {
      4.     generate co-occur
      5.     generate co-occur instances
      6.     prune_spatial_non-prevalent_co-occur }
      7.   calculate_time_prevalence_index
      8.   prune_temporal_non-prevalent_co-occur
      9. }

Slide 25: FastMDCOP-Miner and MDCOP-Miner (figure)
  Input: spatio-temporal dataset, θ_p = 0.5, θ_time = 0.5
  Size 2 candidates (pairs): WC WL WS WQ CS CL CQ SL SQ LQ
  FastMDCOP-Miner: temporal pruning during the pass over the time slots (one slot lists only WC and WL as remaining candidates); no generation of triples. Output MDCOPs: WC
  MDCOP-Miner: temporal pruning of WL, WS, CS after the pairs; no generation of triples. Output MDCOPs: WC
  Legend: bold black and blue: common; blue underlined: spatial pruning; bold red: difference

Slide 26: Proposed Approaches
  FastMDCOP-Miner
    Idea: prune time non-prevalent patterns as early as possible.
    Step 1) Find MDCOP patterns by applying the MDCOP prevalence interest measure.
    Advantage: eliminates redundant candidate generation of time non-prevalent MDCOPs
    Pseudo-code:
      1. Initialization
      2. while k in (1, 2, 3, ..., K) {
      3.   for each time slot {
      4.     generate co-occur
      5.     generate co-occur instances
      6.     prune_spatial_non-prevalent_co-occur
      7.     calculate_time_prevalence_index
      8.     prune_temporal_non-prevalent_co-occur }
      9. }
  MDCOP-Miner (shown alongside for comparison)
    Idea: push the post-processing step inside the loop. Steps 7 and 8 run once per pattern size, after the loop over time slots, rather than inside it.

Slide 27: Execution Trace (FastMDCOP-Miner), t=0
  Step 1 (θ_p = 2/4)
    Patterns: WC WL WS WQ CL CS CQ LS LQ SQ
    WC: instances {W.1, C.1}, {W.4, C.2}; participation ratios 2/4 and 2/2; participation index 2/4
    WL: instances {W.2, L.1}, {W.3, L.2}
    All other patterns: PRUNED
  Step 2
    Pattern                          t=0   Time prevalence
    WC                               1     1/4
    WL                               1     1/4
    WS WQ CL CS CQ LS LQ SQ (each)   0     0

Slide 28: Execution Trace (FastMDCOP-Miner), t=1
  Step 1 (θ_p = 2/4)
    Patterns: WC WL WS WQ CL CS CQ LS LQ SQ; no instances; PRUNED
  Step 2
    Pattern                          t=0   t=1   Time prevalence
    WC                               1     0     1/4
    WL                               1     0     1/4
    WS WQ CL CS CQ LS LQ SQ (each)   0     0     0

Slide 29: Execution Trace (FastMDCOP-Miner), t=2
  Step 1 (θ_p = 2/4)
    Patterns: WC WL WS WQ CL CS CQ LS LQ SQ; no instances; PRUNED
  Step 2
    Pattern                          t=0   t=1   t=2   Time prevalence
    WC                               1     0     0     1/4
    WL                               1     0     0     1/4
    WS WQ CL CS CQ LS LQ SQ (each)   0     0     0     0

Slide 30: Execution Trace (FastMDCOP-Miner), t=3
  Step 1 (θ_p = 2/4)
    Patterns: WC WL WS WQ CL CS CQ LS LQ SQ
    WC: instances {W.2, C.1}, {W.4, C.1}, {W.1, C.2}; participation ratios 3/4 and 2/2; participation index 3/4
  Step 2
    Pattern                          t=0   t=1   t=2   t=3   Time prevalence
    WC                               1     0     0     1     2/4
    WL                               1     0     0     0     1/4
    WS WQ CL CS CQ LS LQ SQ (each)   0     0     0     0

Slide 31: Execution Trace (FastMDCOP-Miner), time prevalence index
  Calculate time prevalence indices of spatial prevalent co-locations (step 8)
  Find mixed-drove prevalent MDCOPs (step 9): {W,C}
    Pattern                          t=0   t=1   t=2   t=3   Time prevalence
    WC                               1     0     0     1     2/4
    WL                               1     0     0     0     Pruned
    WS WQ CL CS CQ LS LQ SQ (each)   0     0     0           Already pruned

Slide 32: Analytical Evaluation
  Lemma 1: A spatial prevalence measure, e.g., participation index, is monotonically non-increasing in the size of the MDCOPs at each time slot.
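Because the prevalence measure does not grow with pattern size, larger candidates only need to be built from patterns that survived the previous size. The Python below is a rough sketch of that level-wise loop with an early temporal stop, not the authors' implementation. Several things are assumptions made for illustration: the names `time_prevalent`, `apriori_gen`, and `mine_mdcops`; the `pi_lookup` table, which stands in for the "generate co-occur instances" step and simply holds the per-slot spatial prevalence values shown on the slides; and the early-stop rule (drop a pattern once it could not reach the time threshold even if it were prevalent in every remaining slot), which is one reading of the execution trace rather than a rule stated in the pseudo-code.

```python
import math
from fractions import Fraction
from itertools import combinations

def time_prevalent(pi_per_slot, theta_p, theta_t):
    """Return (is_time_prevalent, slots_examined), stopping as soon as the
    threshold is out of reach (assumed early-pruning rule)."""
    n = len(pi_per_slot)
    needed = math.ceil(theta_t * n)          # prevalent slots required
    hits = 0
    for slot, pi in enumerate(pi_per_slot):
        if pi >= theta_p:
            hits += 1
        remaining = n - slot - 1
        if hits + remaining < needed:        # cannot catch up any more
            return False, slot + 1
    return hits >= needed, n

def apriori_gen(survivors, k):
    """Size-(k+1) candidates whose size-k subsets all survived."""
    survivors = set(survivors)
    types = sorted(set().union(*survivors)) if survivors else []
    return [frozenset(c) for c in combinations(types, k + 1)
            if all(frozenset(s) in survivors for s in combinations(c, k))]

def mine_mdcops(object_types, pi_lookup, n_slots, theta_p, theta_t):
    """Level-wise mining: temporal pruning happens before the next size is generated."""
    results, k = [], 2
    candidates = [frozenset(p) for p in combinations(sorted(object_types), 2)]
    while candidates:
        survivors = []
        for pattern in candidates:
            per_slot = pi_lookup.get(pattern, [Fraction(0)] * n_slots)
            if time_prevalent(per_slot, theta_p, theta_t)[0]:
                survivors.append(pattern)
        results.extend(survivors)
        candidates = apriori_gen(survivors, k)
        k += 1
    return results

half = Fraction(1, 2)
# Per-slot spatial prevalence values taken from the slides (hypothetical table layout).
pi_lookup = {
    frozenset("WC"): [half, 0, 0, Fraction(3, 4)],
    frozenset("WL"): [half, 0, 0, 0],
    frozenset("WS"): [0, 0, 0, Fraction(3, 4)],
    frozenset("CS"): [0, 0, 0, Fraction(1)],
}
mdcops = mine_mdcops("WCLSQ", pi_lookup, 4, half, half)
```

Under these assumptions only {W, C} survives size 2, so no triple is generated, whereas `apriori_gen` applied to the three pairs WC, WS, and CS would propose {W, C, S}, which is the extra candidate the naive approach carries until post-processing.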
  Lemma 2: A mixed-drove prevalence index measure is monotonically non-increasing with the size of the MDCOP over space and time.
  * (Lemma 1) The spatial prevalence of pattern {W,C,S} is less than or equal to the spatial prevalence of its sub-patterns {W,C}, {W,S}, {C,S}.
  ** (Lemma 2) The mixed-drove prevalence index measure of pattern {W,C,S} is less than or equal to that of its sub-patterns {W,C}, {W,S}, {C,S}.
  Spatial prevalence index per time slot (Lemma 1*) and time prevalence index (Lemma 2**):
  Pattern   t=0   t=1   t=2   t=3   Time prevalence index
  WC        2/4   -     -     3/4   2/4
  WS        -     -     -     3/4   1/4
  CS        -     -     -     1     1/4
  WCS       -     -     -     2/4   1/4

Slide 33: Analytical Evaluation
  Theorem 1: MDCOP-Miner and FastMDCOP-Miner are complete.
  Proof: The algorithms find all MDCOPs whose spatial prevalence >= θ_p and time prevalence >= θ_time.
    Any subset of a prevalent MDCOP is itself a prevalent MDCOP (Lemma 2).
    None of the functions of the algorithms (prune_spatial_non-prevalent_co-occur and prune_temporal_non-prevalent_co-occur) misses any prevalent MDCOP.
  Theorem 2: MDCOP-Miner and FastMDCOP-Miner are correct: if an MDCOP pattern P is returned by the algorithms, then P is a prevalent MDCOP.
  Proof: The pruning steps prune_spatial_non-prevalent_co-occur and prune_temporal_non-prevalent_co-occur prune out candidates that do not meet the given thresholds.

Slide 34: Outline
  Introduction
  Related Work
  MDCOP Mining Problem and Algorithm
  Evaluation
    Analytical Evaluation
    Performance Evaluation
  Conclusion and Future Work

Slide 35: Performance Evaluation: Experiment Design
  Experiment goals: compare FastMDCOP-Miner and MDCOP-Miner with the Naive approach
    What is the effect of the number of time slots?
    What is the effect of the number of object-types?
    What is the effect of the spatial prevalence threshold θ_p?
    What is the effect of the time prevalence threshold θ_time?
  Metric of comparison: computational complexity
  Workload: vehicle movement dataset and synthetic dataset
  Hardware: Intel Centrino PIV 1.60 GHz, 512 MB of RAM

Slide 36: Real Dataset Description
  Vehicle movement dataset
  15 time slots; x and y coordinates are in meters
  22 distinct vehicle types and their instances
  Minimum instance number 2, maximum instance number 78, average instance number 19
  Example input from the spatio-temporal dataset (figure)
  Output: spatio-temporal co-occurrence pattern (Manpack_stinger, fire cover (e.g., Bradley tank))

Slide 37: Real Dataset: What is the effect of number of time slots?
  Fixed parameters: spatial threshold θ_p = 0.2, time threshold θ_time = 0.8, distance = 150 m, # of object types = 22
  Execution times of the algorithms increase as the number of time slots increases.
  MDCOP-Miner and the Naive approach generate the same number of size 2 candidates.
  FastMDCOP-Miner generates fewer candidates than the other algorithms due to early pruning.
  FastMDCOP-Miner outperforms the other algorithms.

Slide 38: Real Dataset: What is the effect of number of object-types?
  Fixed parameters: spatial threshold θ_p = 0.2, time threshold θ_time = 0.8, # of time slots = 15, distance = 150 m
  Execution times of the algorithms increase as the number of object-types increases.
  MDCOP-Miner and the Naive approach generate the same number of size 2 candidates.
  FastMDCOP-Miner outperforms the other algorithms by generating fewer candidates.

Slide 39: Real Dataset: What is the effect of time prevalence index threshold?
  Fixed parameters: spatial threshold θ_p = 0.5, # of time slots = 15, distance = 150 m, # of object types = 22
  Execution times of the algorithms decrease as the time prevalence threshold increases.
  FastMDCOP-Miner outperforms the other algorithms. Its advantage increases as the threshold increases.

Slide 40: Real Dataset: What is the effect of spatial prevalence index threshold?
  Fixed parameters: time threshold θ_time = 0.5, # of time slots = 15, distance = 150 m, # of object types = 22
  Execution times of the algorithms decrease as the spatial prevalence threshold increases.
  FastMDCOP-Miner outperforms the other algorithms.

Slide 41: Synthetic Dataset Generation
  (Figure; parameter labels: TSlot, OType, Noise, Tprev, Sprev, Avrg)

Slide 42: Synthetic Dataset: What is the effect of number of time slots?
  Fixed parameters: spatial threshold θ_p = 0.3, time threshold θ_time = 0.9, distance = 10, # of object types = 200
  Execution times of the algorithms increase as the number of time slots increases.
  MDCOP-Miner and the Naive approach generate the same number of size 2 candidates.
  FastMDCOP-Miner generates fewer candidates than the other algorithms.

Slide 43: Synthetic Dataset: What is the effect of number of object-types?
  Fixed parameters: spatial threshold θ_p = 0.3, time threshold θ_time = 0.8, # of time slots = 20, distance = 10 m
  MDCOP-Miner and the Naive approach generate the same number of size 2 candidates.
  The execution time of the Naive approach grows faster than that of the other algorithms as the number of object-types increases.
  The Naive approach generates redundant non-persistent co-locations.
  FastMDCOP-Miner outperforms the other algorithms by generating fewer candidates.

Slide 44: Synthetic Dataset: What is the effect of time prevalence index threshold?
  Fixed parameters: spatial threshold θ_p = 0.4, # of time slots = 50, distance = 10 m, # of object types = 200
  Execution times of the algorithms decrease as the time prevalence threshold increases.
  FastMDCOP-Miner outperforms the other algorithms. Its advantage increases as the threshold increases.
  The execution time of the Naive approach grows faster than that of the other algorithms as the number of object-types increases.

Slide 45: Synthetic Dataset: What is the effect of spatial prevalence index threshold?
  Fixed parameters: time threshold θ_time = 0.8, # of time slots = 50, distance = 10 m, # of object types = 200
  Execution times of the algorithms decrease as the spatial prevalence threshold increases.
  FastMDCOP-Miner outperforms the other algorithms.

Slide 46: Synthetic Dataset: What is the effect of noise instances and average number of co-occurrence instances?
  Fixed parameters (both plots): spatial threshold θ_p = 0.3, time threshold θ_time = 0.8, # of time slots = 20, distance = 10 m
  Execution times of the algorithms increase as the spatial prevalence threshold is increased.
  FastMDCOP-Miner is more robust than the other algorithms.
  FastMDCOP-Miner outperforms the other algorithms.

Slide 47: Summary of Experimental Results
  What is the effect of number of time slots?
    The cost of the Naive approach and of non-persistent candidate generation increases as the number of time slots increases.
    FastMDCOP-Miner outperforms the other algorithms.
  What is the effect of number of object-types?
    The execution time of the Naive approach grows faster than that of the other algorithms as the number of object-types increases.
    FastMDCOP-Miner is more robust than the other algorithms. It detects non-persistent patterns as early as possible.
  What is the effect of spatial prevalence threshold?
    The cost of the Naive approach is higher than that of the other algorithms for low values of the spatial prevalence threshold.
  What is the effect of time prevalence threshold?
    The Naive approach is not sensitive to the time prevalence threshold, since the threshold is used only in the post-processing step.
    FastMDCOP-Miner is the algorithm most sensitive to the time threshold. It performs especially well for high time thresholds.
  In all experiments, the Naive approach and MDCOP-Miner generate the same number of size 2 candidates.

Slide 48: Outline
  Introduction
  Related Work
  MDCOP Mining Problem
  Proposed MDCOP Mining Algorithms
  Evaluation
  Conclusion and Future Work

Slide 49: Contributions described today
  Mixed-drove spatio-temporal co-occurrence patterns (MDCOPs) and the MDCOP mining problem are defined.
  A new monotonic composite interest measure is defined.
  Novel and computationally efficient MDCOP mining algorithms are developed.
  The proposed algorithms are correct and complete in finding mixed-drove prevalent patterns.
  Performance is evaluated using real and synthetic datasets.

Slide 50: Spatio-temporal Co-occurrence Pattern Taxonomy
  1. Spatial co-location: global and zonal co-location patterns, etc.
       ICDM07 - Zonal Co-location Pattern Mining
       ICDM05 - Joinless Approach for Co-location Pattern Mining
  2. Co-occurrence patterns of moving objects: flock pattern, mixed-drove pattern, follow pattern, moving clusters, etc.
       TKDE08 and ICDM06 - Mixed-Drove Spatio-Temporal Co-occurrence Pattern Mining
       ICDE-STDM07 - Mining At Most Top-K% Mixed-drove Spatio-temporal Co-occurrence Patterns
  3. Emerging or vanishing co-occurrence patterns
       Emerging pattern: interest measure getting stronger over time
       Vanishing pattern: interest measure getting weaker over time
       ICTAI06 - Sustained Emerging Spatio-temporal Co-occurrence Pattern Mining
  4. Co-evolving patterns
       ICDM05 - Discovering co-evolving spatio-temporal event sets
  5. Periodic co-occurrence patterns
  6. Spatio-temporal cascade patterns
  Application examples: Ecology: zonal co-location pattern. Game (tactics): mixed-drove pattern. Emerging infectious diseases: sustained emerging co-occurrence patterns.

Slide 51: Chapter 2 - Zonal Co-location Pattern Discovery
  (Figure: zones)
  Given: different object types of spatial events and zone boundaries
  Find: co-located subsets of event types specific to zones
  Method: a novel algorithm that uses an indexing structure

Slide 52: Chapter 4 - Sustained Emerging ST Co-occurrence Pattern Discovery
  Given: a set P of Boolean ST object-types over a common ST framework
  Find: sustained emerging spatio-temporal co-occurrence patterns whose prevalence measures increase over time
  Method: developing novel algorithms by defining monotonic interest measures

Slide 53: Future Work: Short Term
  1. Spatial co-location: interest measure is the participation index; global and zonal co-location patterns, etc.
  2. Co-occurrence patterns of moving objects: flock pattern, mixed-drove pattern, follow pattern, cross pattern, moving clusters, etc.
  3. Emerging or vanishing co-occurrence patterns
       Emerging pattern: interest measure getting stronger over time
       Vanishing pattern: interest measure getting weaker over time
  4. Co-evolving patterns
  5. Periodic co-occurrence patterns
  6. Spatio-temporal cascade patterns
  Efficient methods
  Comparison of interest measures with statistical interest measures

Slide 54: Future Work: Long Term
  Spatial and Spatio-temporal Pattern Mining Design
  Crime analysis, GIS, epidemiology
  Challenges:
    discovering patterns and anomalies from enormous, frequently updated spatial and spatio-temporal datasets,
    developing an ontological framework for spatial and spatio-temporal analysis,
    integrating spatial and spatio-temporal data from multiple agencies, distributed data, and multi-scale data

Slide 55: Acknowledgements
  Adviser: Prof. Shashi Shekhar
  Committee: Prof. Jaideep Srivastava, Prof. Arindam Banerjee, and Prof. Sudipto Banerjee
  Spatial Databases and Data Mining Group
  TEC collaborators: James P. Rogers, James A. Shine
  Dept. of Computer Science

Slide 56: References
  [1] J. Gudmundsson, M. v. Kreveld, and B. Speckmann, Efficient Detection of Motion Patterns in Spatio-Temporal Data Sets, ACM-GIS.
  [2] P. Laube and S. Imfeld, Analyzing relative motion within groups of trackable moving point objects, in GIScience, number 2478 in Lecture Notes in Computer Science. Berlin: Springer.
  [3] P. Kalnis, N. Mamoulis, and S. Bakiras, On Discovering Moving Clusters in Spatio-temporal Data, 9th Int'l Symp. on Spatial and Temporal Databases (SSTD), Angra dos Reis, Brazil.
  [4] Y. Huang, S. Shekhar, and H. Xiong, Discovering Co-location Patterns from Spatial Datasets: A General Approach, IEEE Trans. on Knowledge and Data Eng. (TKDE), vol. 16(12).
  [5] M. Hadjieleftheriou, G. Kollios, P. Bakalov, and V. J. Tsotras, Complex Spatio-Temporal Pattern Queries, VLDB.
  [6] C. du Mouza and P. Rigaux, Mobility Patterns, GeoInformatica, 9(4).
  [7] J. S. Yoo and S. Shekhar, A Join-less Approach for Mining Spatial Co-location Patterns, IEEE Trans. on Knowledge and Data Eng. (TKDE), vol. 18, no. 10, 2006.

