Compression of Spatio-Temporal Data
MDM 2016 Advanced Seminars
Goce Trajcevski
Dept. of EECS
Northwestern University
Advanced Seminar MDM, Porto, June 2016
Pre-Introduction
Plethora of applications relying on some form of Location Based Service (LBS):
• transportation/routing
• social networks, online/mobile marketing
• traffic management, disaster response
• environmental/structural health monitoring
• ecology (flora and fauna)
Trajectories = location-in-time data
Pre-Introduction
Novel trends
Pre-Introduction
McKinsey 2011: Location data from GPS-equipped mobile phones = O(peta-bytes) (20% in 2010 → 70% in 2020)
◦ 400-fold increase if cell-tower data included
◦ Coupled with other sensors data (US Express ~950 sensors)
Daily travel in the US averages 11 billion miles a day (approximately 40 miles per person)
◦ 87% of them take place in personal vehicles – recording location samples generated every 10 seconds ⇒ 275TB daily
Pre-Introduction: the trends of Internet of Things
[Figure: Low resolution Sensor, Test4, Increasing frequency; x-axis: Time (sec), y-axis: Acceleration (g)]
Outline
Introduction
Spatial data and Temporal data compression
Compression of Spatio-temporal data
◦ Trajectories/fundamentals
◦ Constraints
◦ Real-time/tracking
◦ Mobile shapes
Alternative views and recent trends
Concluding remarks
Introduction – Basics
Broadly, data compression can be perceived as a science or an art – or a mix of both – aiming at the development of efficient methodologies for a compact representation of information
◦ take a dataset D1 with a size of β bits as input, and produce a dataset D′1 as a representation of D1 with a size of β′ bits, where β′ < β (hopefully β′ << β).
Example:
◦ Transmitting a raw (uncompressed) HDTV signal would require a channel capable of 884 Mbits/second which, in turn, means a bandwidth of 220 MHz.
◦ The compressed version of the respective mix of video frames and audio requires 20 Mbits/second, which fits in only 6 MHz of bandwidth (the amount of bandwidth allocated in the US).
Introduction – History
Smoke signals
Polybius square
Heliographs
Chappe
Introduction – History: the electricity era
Telegraph:
◦ Some letters occur more frequently than others… a → . _   q → _ _ . _
◦ Exploit frequency to reduce average transmission time
Non-statistical approaches
◦ Vocoder (exploit a compact description of voice-box)
◦ Recover the original “voice” at the receiving-end
Claude Shannon (1940s)
◦ Fano; Huffman – information-theoretic bounds
LZ77
◦ Many variants (even more lawsuits…)
Images/Video – MPEG (multiple versions)
Introduction – Taxonomies
Lossless vs. Lossy
◦ Original data can (lossless) or cannot (lossy) be restored exactly (i.e., distortion)
Entropy based vs. Dictionary based
◦ Shannon/Huffman (frequency/probability of occurrence)
◦ LZ variants (raw data has repetitions; collect a dictionary and substitute repeated occurrences with its entry index)
Static vs. Dynamic/Adaptive
◦ Properties (e.g., dictionary) known in advance
◦ Properties vary in “real-time”…
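To make the entropy-based (frequency-driven) class concrete, here is a minimal sketch of static Huffman coding. The function names `huffman_code` and `encode` are illustrative assumptions, not part of the seminar material; frequent symbols end up with shorter bit-strings, in the same spirit as the telegraph codes.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a dict symbol -> bit-string derived from symbol frequencies."""
    freq = Counter(text)
    if len(freq) == 1:
        # Degenerate input: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap items: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique integer tie-breaker ensures dicts are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the code words of all symbols in text."""
    return "".join(code[s] for s in text)
```

For a skewed input such as "aaaaaaaabbbc", the symbol "a" receives a shorter code word than "c". This is the static flavor: the frequency table is known before encoding starts.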
Spatial and Temporal Data Compression
Cartography
◦ Map generalization
   ◦ Given a fixed-size window, represent a larger area
   ◦ Equivalently, represent a given area in a smaller window
⇒ sacrifice the level of detail…
Spatial and Temporal Data Compression
Errors “from the get-go”
Earth is not flat ⇒ projections
◦ E.g., Mercator
Distorting the data for the benefit of “semantics”
◦ a 1:100,000 reduction scale will make any object (e.g., a building) which has an edge smaller than 35 m (the vast majority of single-family homes) drop below 0.35 mm on the map;
◦ based on the physiology of human vision, 0.35 mm is the limit of perceptibility.
◦ to keep a particular polygonal object with sides < 0.35 mm on the map, an “artificial enlargement” is needed
Spatial and Temporal Data Compression
Basic issues:
◦ Philosophical objectives of why to generalize.
◦ Cartometric evaluation of the conditions which indicate when to generalize.
◦ The selection of appropriate spatial and attribute transformations, which provide the techniques on how to generalize.
Fundamental operators (“how”):
◦ simplification, smoothing, aggregation, amalgamation, merging, collapse, refinement, exaggeration, enhancement and displacement
Spatial and Temporal Data Compression
Canonical problem – polyline simplification
Given a polyline PL1 with vertices {v1, v2, …, vn}, and a tolerance ε
Construct another polyline PL′1 with vertices {v′1, v′2, …, v′m} such that:
◦ m ≤ n, and
◦ for every point P ∈ PL1, its distance from PL′1 does not exceed the given tolerance: dist(P, PL′1) ≤ ε.
ASIDE: if {v′1, v′2, …, v′m} ⊆ {v1, v2, …, vn}, the simplification is strong; otherwise, it is a weak simplification
Spatial and Temporal Data Compression
By far the most popular heuristic approach: Douglas-Peucker (DP), also known as Ramer-Douglas-Peucker (RDP)
Spatial and Temporal Data Compression
RDP algorithm
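A minimal sketch of the recursive RDP heuristic, assuming 2D points given as (x, y) tuples. Measuring the deviation to the chord segment (rather than to its supporting line) and the names `point_segment_distance` and `rdp` are choices made for this illustration. The output is a strong simplification: only original vertices are retained.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection parameter so the foot stays on the segment.
    u = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    u = max(0.0, min(1.0, u))
    return math.hypot(px - (ax + u * dx), py - (ay + u * dy))


def rdp(points, eps):
    """Return a sub-sequence of points; first and last are always kept."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the chord first-last.
    idx, dmax = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= eps:
        # Every interior vertex is within tolerance: keep the chord only.
        return [points[0], points[-1]]
    # Otherwise split at the farthest vertex and recurse on both halves.
    left = rdp(points[:idx + 1], eps)
    right = rdp(points[idx:], eps)
    return left[:-1] + right  # drop the duplicated split vertex
```

Each recursion level scans all interior vertices, and an unbalanced split can repeat this n times, which is where the quadratic worst case comes from.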
Spatial and Temporal Data Compression
Observations:
Complexity: O(n²)
◦ Hershberger and Snoeyink provided an O(n log n) algorithm…
Is NOT optimal (in the sense of guaranteeing the minimal number of points for the output)
Still, widely popular due to “visual appeal”
◦ Plus, first one to be implemented in FORTRAN
◦ Part of many GIS implementations…
Addresses only the min-# variant of compression…
◦ min-#: given ε, generate a sub-polyline with the smallest number of vertices;
◦ min-ε: given a “budget” m (< n), generate a sub-polyline with the smallest distance from the original one
Spatial and Temporal Data Compression
Optimal algorithm:
◦ Draw circles with radius ε centered at each vertex of the polyline.
◦ Starting from the first vertex, draw the pair of tangents to each circle in the sequence.
◦ Let Ui and Li denote the upper and the lower ray emanating from v1 after drawing a pair of tangents to all the vertices up to vi
   ◦ Essentially, the boundaries of a non-empty wedge
◦ Repeat when the wedge becomes empty…
◦ Repeat, starting at a different vertex (looking both backwards and forward)
Spatial and Temporal Data Compression
Basic Properties:
◦ Note: a randomized algorithm yields O(n^(4/3 + δ)) for a small δ
Extensions to 3D:
Spatial and Temporal Data Compression
Life is not that simple…
Errors may occur which have a “topological nature”
◦ Inside vs. outside (or, to-east vs. to-west)
◦ Intersections (e.g., a river outside (vs. inside) the city boundary)
Spatial and Temporal Data Compression
Other issues in spatial data compression:
Application+device awareness:
◦ Combining navigation with notifications on a limited display (and during motion)
Distance measures:
◦ Minimize area between original and reduced polyline
◦ Minimize the notion of directionality-disturbance
Broader contexts:
◦ Clustering
◦ Point-sets compression
   ◦ Convex hull
   ◦ Non-convex hulls (α-shapes; χ-shapes)
Spatial and Temporal Data Compression
Other fields:
Image processing
Video processing
◦ Subject to JPEG; MPEG families of compression
◦ Either lossless, or rely on physiology when restoring
Spatial Data Warehousing
◦ Multi-thematic layers
◦ Context-based zoning
   ◦ E.g., soil types; plant/crop types
Spatial histograms for inter-molecular distances based on affinity…
Spatial and Temporal Data Compression
Time in computer science has led to the development of numerous protocols and methodologies from systems as well as semantic-based perspectives:
◦ Synchronization among processes
◦ Concurrency management
◦ Distributed systems
◦ Events management
◦ Collaboration among sensing and computing devices
◦ …
Compression:
◦ Temporal Databases
◦ Time series (+ Data Streams)
Spatial and Temporal Data Compression
Temporal databases: the majority of applications do have certain time-related semantics
◦ adding “fixes” to a non-temporal database to cater to this has proven to be either too much overhead, or simply infeasible.
Example: having a DATE attribute enables knowing when a particular tuple became valid in the database.
However, the natural expectation is to also know when a particular tuple (or an attribute therein) ceased to be valid
◦ ergo, adding another DATE column
◦ The problem then becomes how to express some simple queries in SQL…
Spatial and Temporal Data Compression
Consider: “Who was Aaron’s manager when he worked in Capital account?”
Adding such features brought the concept of time-varying tables which, in turn, spurred the field of Temporal Databases (TDb)
◦ Temporal-SQL (TSQL), and the standardization of incorporating the temporal dimension in SQL3
Temporal Data Types:
◦ Instant
◦ Interval
◦ Period
Kinds of time (user-defined; valid; transaction)
Spatial and Temporal Data Compression
Consider
and the query: “Retrieve the total number of employees per department, between January 1 and July 15 of 2015.”
Both Jack and James:
◦ Increased salaries
◦ Their intervals have been merged in the answer (coalescing)
   ◦ Lost the info about the salary increase…
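A minimal sketch of coalescing, assuming closed [start, end] validity periods at day granularity. The function name `coalesce`, the row layout and the sample department, dates and salary history are hypothetical, made up only to illustrate the effect; they are not the table used in the seminar.

```python
from datetime import date, timedelta


def coalesce(rows):
    """Merge value-equivalent rows whose closed [start, end] periods
    overlap or are adjacent. rows: iterable of (key, start, end)."""
    out = []
    for key, start, end in sorted(rows):
        prev = out[-1] if out else None
        if prev and prev[0] == key and start <= prev[2] + timedelta(days=1):
            # Same key and the periods touch: extend the previous row.
            out[-1] = (key, prev[1], max(prev[2], end))
        else:
            out.append((key, start, end))
    return out


# Hypothetical history: Jack's salary changes on April 1, 2015.
history = [
    ("Jack", "Sales", 50000, date(2015, 1, 1), date(2015, 3, 31)),
    ("Jack", "Sales", 55000, date(2015, 4, 1), date(2015, 7, 15)),
]
# Projecting the salary away makes the two rows value-equivalent.
projected = [((name, dept), s, e) for name, dept, _, s, e in history]
merged = coalesce(projected)
```

After the projection, `merged` holds a single row spanning January 1 to July 15: the count per department is answered correctly, but the instant of the salary increase is no longer recoverable from the result.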
Spatial and Temporal Data Compression
Granularity issue:
◦ Time expressed in month/year
◦ Flights expressed in minutes
   ◦ How to join such tables?
If interval-based temporal semantics is used, there are 13 possible relationships:
◦ Before
◦ Meet
◦ Overlap
◦ During
◦ Starts
◦ Finishes
◦ Plus the 6 inverses, plus Equal
Important in DW/BI – hierarchies along the time dimension…
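The 13 interval relationships can be enumerated mechanically. Below is a minimal sketch, assuming proper intervals (start < end) given as (start, end) pairs; the function name `allen_relation` and the relation labels (e.g., "met-by", "started-by") are naming assumptions for this illustration.

```python
def allen_relation(a, b):
    """Name the relationship between intervals a=(s1,e1) and b=(s2,e2)."""
    (s1, e1), (s2, e2) = a, b
    if (s1, e1) == (s2, e2):
        return "equal"
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    if s1 < s2 < e1 < e2:
        return "overlaps"
    # Whatever remains is the inverse of before / meets / overlaps.
    inverse = {"before": "after", "meets": "met-by", "overlaps": "overlapped-by"}
    return inverse[allen_relation(b, a)]


# All proper intervals with integer end-points in 0..3.
intervals = [(s, e) for s in range(4) for e in range(s + 1, 4)]
relations = {allen_relation(a, b) for a in intervals for b in intervals}
```

Four distinct end-point values are already enough for all 13 relationships to appear in `relations`.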
Spatial and Temporal Data Compression
Time series and Data Streams
◦ Two complementary classes of problems
Time series
◦ Large datasets of sequences of (time, value) pairs.
◦ Each sequence perceived as a point in N-dimensional space.
◦ To mitigate the “dimensionality curse”, one often resorts to a compressed representation, typically used for indexing
Data streams
◦ Fast-arriving values, exceeding the memory capacity
◦ Must be processed on-the-fly
◦ Continuous queries (approximate answers)
Spatial and Temporal Data Compression
Time series
◦ Index should be “build-able” in a reasonable time, and cater to different distance functions
   ◦ Adaptive and non-adaptive representation methods
   ◦ Ensure lower-bounding (to be able to prune without false negatives during searches)
Other desirable properties need to be maintained:
◦ Quick answer to queries of interest (e.g., similarity in terms of NN)
◦ Maintain “perceptual similarity”
Spatial and Temporal Data Compression
Streaming Data
◦ The data model and query semantics must allow order-based and time-based operations (e.g., queries over a five-minute moving window).
◦ The inability to store a complete stream suggests the use of approximate summary structures, referred to in the literature as synopses or digests.
◦ Queries over the summaries may not return exact answers.
◦ Streaming query plans may not use blocking operators that must consume the entire input before any results are produced.
The best one can work towards is ensuring probabilistic guarantees on the bound of the error of a query-answer.
Spatial and Temporal Data Compression
Random sampling, histograms, wavelets…
Example – consider the set:
C = {5.0, 2.2, 3.1, 4.7, 5.2, 4.0, 5.3, 7.1, 7.9, 3.7, 4.2, 6.8, 7.3, 6.1}
12 values in total
Histogram:
3 buckets (equi-width)◦ 5 numbers total ⇒ 58.6% savings
However, the query “how many members fall between 3 and 5.5?” yields 4.5 (assuming uniformity)…
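A minimal sketch of answering a range-count query from an equi-width histogram, using the values of C exactly as listed (the list as printed has 14 entries). The bucket boundaries [2,4), [4,6), [6,8) and both function names are assumptions made for illustration, so the estimate computed here need not coincide with the figure quoted above.

```python
def equi_width_histogram(values, lo, hi, buckets):
    """Count values per bucket; buckets split [lo, hi) into equal widths."""
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        i = min(int((v - lo) / width), buckets - 1)
        counts[i] += 1
    return counts


def estimate_range_count(counts, lo, hi, a, b):
    """Estimate how many values fall in [a, b], assuming that values are
    spread uniformly inside each bucket."""
    width = (hi - lo) / len(counts)
    total = 0.0
    for i, c in enumerate(counts):
        left, right = lo + i * width, lo + (i + 1) * width
        overlap = max(0.0, min(b, right) - max(a, left))
        total += c * overlap / width
    return total


C = [5.0, 2.2, 3.1, 4.7, 5.2, 4.0, 5.3, 7.1, 7.9, 3.7, 4.2, 6.8, 7.3, 6.1]
counts = equi_width_histogram(C, 2.0, 8.0, 3)
estimate = estimate_range_count(counts, 2.0, 8.0, 3.0, 5.5)
exact = sum(1 for v in C if 3.0 <= v <= 5.5)
```

Only the lower bound, the bucket width and the three counts need to be stored. The price is that `estimate` and `exact` differ, because the values are not uniformly spread within the buckets.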
Spatio-Temporal Data Compression
Moving Object
Any real-world element that may be perceived as “unique” (car, person, animal)
Changes its spatial whereabouts over time (during its existence)
Trajectory: continuous mapping from Time to (some Geographical) 2D Space: I (⊆ ℝ) → ℝ²
or, even think of it as a parameterization over time: α(t) = (αx(t), αy(t))
So, now Trajectory = {(αx(t), αy(t), t) | t ∈ I}
• Description
• Representation
• Manipulations
Spatio-Temporal Data Compression
Raw Data → Trajectory
Sample-points (location, time) need to be part of the trajectory; however, what about in-between?
Linear Interpolation: (x, y, t) = (xi, yi, ti) + [(t − ti)/(ti+1 − ti)] (xi+1 − xi, yi+1 − yi, ti+1 − ti)
(assumption: constant velocity in-between samples…)
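The linear interpolation formula translates directly into a location lookup. A minimal sketch, assuming at least two samples given as (x, y, t) tuples sorted by strictly increasing t; the function name `where_at` is chosen for illustration.

```python
from bisect import bisect_right


def where_at(samples, t):
    """Return the interpolated (x, y) at time t for samples sorted by time."""
    times = [s[2] for s in samples]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the lifetime of the trajectory")
    # Index of the segment [t_i, t_{i+1}] containing t.
    i = min(bisect_right(times, t) - 1, len(samples) - 2)
    (x0, y0, t0), (x1, y1, t1) = samples[i], samples[i + 1]
    f = (t - t0) / (t1 - t0)  # fraction of the segment travelled so far
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
```

The binary search keeps the lookup logarithmic in the number of samples; the constant-velocity assumption is what makes the fraction `f` apply equally to x and y.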
Interpolation via Bezier Curves
(time-slices)
- Spatiotemporal Data Model
- Constraint Database Model
- Moving Objects Database Model
Off the shelf – STER
Differential Geometry; Topology
Spatio-Temporal Data Compression
Road-Network Constrained
Distance → graph-distance ⇒ Dijkstra-like algorithmic approaches
The flow/speed in a given link may vary in time (A* based SP)…
Eco-routes
Spatio-Temporal Data Compression
Trajectory construction
◦ (location, time) updates
◦ Periodic (location, time, velocity) updates
◦ Full-future trajectory
So, why simplification/compression?
As always:
1. Save storage
2. Save on communication/bandwidth
Spatio-Temporal Data Compression
T′ is called an ε-simplification of T with respect to a distance measure M (equivalently, T′ is a simplification of T with an M-tolerance ε), denoted by
T′ = S(T, ε, M), if DM(T, T′) ≤ ε
[Figure: a trajectory T and its simplification T′]
Spatio-Temporal Data Compression
Distance functions:
◦ Hausdorff distance (ignoring time):
◦ From T to T′: D̃H(T, T′) = max over p ∈ T of min over q ∈ T′ of dist(p, q)
◦ Symmetric: DH(T, T′) = max(D̃H(T, T′), D̃H(T′, T))
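A minimal sketch of the directed and symmetric Hausdorff distance, where the directed version takes, for each point of the first set, the distance to its nearest point in the second set and returns the largest such value. As a simplifying assumption, it is evaluated over finite point sets (e.g., trajectory vertices) rather than over the continuous polylines; the function names are illustrative.

```python
import math


def directed_hausdorff(A, B):
    """Largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


T = [(0, 0), (1, 0), (2, 0)]
T_simplified = [(0, 0), (2, 0)]
```

With these vertex sets, the direction from `T` to `T_simplified` gives 1 (the dropped middle vertex), while the opposite direction gives 0. That asymmetry is the reason for taking the maximum of both directions.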
Spatio-Temporal Data Compression
Hausdorff distance does not incorporate time
Not appropriate for “mobile world”
Consider the “man walks the dog” example:
Spatio-Temporal Data Compression
Frechet distance:
The most general way of incorporating time (i.e., all the possible ways)
Consider two curves: f : [a1,b1] → ℝ² and g : [a2,b2] → ℝ²
Their Frechet distance is defined as: δF(f, g) = inf over α, β of max over t ∈ [0,1] of ‖f(α(t)) − g(β(t))‖
where α and β range over all the possible continuous and monotonically increasing mappings [0,1] → [a1,b1] and [0,1] → [a2,b2]
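A minimal sketch of the discrete Frechet distance, a commonly used approximation in which the "man" and the "dog" may only stand on vertices and move forward one vertex at a time. This is a swapped-in discrete variant computed by dynamic programming, not the continuous definition; the function name `discrete_frechet` is illustrative.

```python
import math


def discrete_frechet(P, Q):
    """Discrete Frechet distance between vertex sequences P and Q."""
    n, m = len(P), len(Q)
    # ca[i][j]: shortest leash needed to traverse P[0..i] and Q[0..j].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


forward = [(0, 0), (4, 0)]
backward = [(4, 0), (0, 0)]
```

The two sequences `forward` and `backward` cover the same points, so a Hausdorff-style measure on their vertices is 0, while the order-aware value returned here is 4: the direction of travel matters.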
Spatio-Temporal Data Compression
Restricted variants of Frechet distance:
Eu – the three-dimensional time-uniform distance is defined, when tm is between ti and tj, as follows:
◦ Eu(pm, pipj) = √((xm − xc)² + (ym − yc)²), where pc = (xc, yc, tc) is the unique point on pipj which has the same time value as pm (i.e., tc = tm).
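A minimal sketch of Eu, assuming points given as (x, y, t) tuples and ti < tj; the function name `time_uniform_distance` is illustrative. The point pc is obtained by linear interpolation along the segment at time tm.

```python
import math


def time_uniform_distance(pm, pi, pj):
    """Eu: 2D distance between pm and the point of segment pi-pj
    that has the same time stamp as pm."""
    (xm, ym, tm), (xi, yi, ti), (xj, yj, tj) = pm, pi, pj
    if not ti < tj or not ti <= tm <= tj:
        raise ValueError("requires ti < tj and ti <= tm <= tj")
    f = (tm - ti) / (tj - ti)
    xc, yc = xi + f * (xj - xi), yi + f * (yj - yi)  # pc, with tc == tm
    return math.hypot(xm - xc, ym - yc)
```

For pi = (0, 0, 0), pj = (10, 0, 10) and pm = (2, 0, 8), the point pm lies exactly on the route, yet Eu is 6: at time 8 the simplified object is expected at x = 8, not at x = 2.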
Spatio-Temporal Data Compression
Et – the time distance is defined as:
Et(pm, pipj) = |tm − tc|, where tc is the time of the point on p′ip′j (the X-Y projection of pipj) that is closest, in terms of the 2D Euclidean distance, to p′m (the X-Y projection of pm);
if the closest point on p′ip′j corresponds to more than one time point, choose the one that maximizes |tm − tc|.
Intuitively:
◦ project both on the X-Y plane, then find the point p′c on the projected segment which is closest to p′m,
◦ find the difference between their time-values.
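A minimal sketch of Et, assuming points given as (x, y, t) tuples; the function name `time_distance` is illustrative. The only case handled here in which the closest projected point carries more than one time value is a stationary segment, where the whole interval [ti, tj] maps onto a single 2D point.

```python
def time_distance(pm, pi, pj):
    """Et: |tm - tc|, with tc the time of the point on the X-Y projection
    of segment pi-pj that is closest to the projection of pm."""
    (xm, ym, tm), (xi, yi, ti), (xj, yj, tj) = pm, pi, pj
    dx, dy = xj - xi, yj - yi
    if dx == 0 and dy == 0:
        # Stationary segment: pick the time that maximizes |tm - tc|.
        return max(abs(tm - ti), abs(tm - tj))
    # Parameter of the closest point on the projected segment, clamped.
    u = ((xm - xi) * dx + (ym - yi) * dy) / (dx * dx + dy * dy)
    u = max(0.0, min(1.0, u))
    tc = ti + u * (tj - ti)
    return abs(tm - tc)
```

For pi = (0, 0, 0), pj = (10, 0, 10) and pm = (2, 0, 8), the closest projected point is reached at time 2, so Et is 6 time units.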
Spatio-Temporal Data Compression
Compression is lossy, so using the compressed trajectory to answer queries will generate errors (with respect to the answer obtained on the original trajectory).
Errors depend on the distance function (and, of course, the type of query)
For a given query q, the error with respect to a trajectory T is bounded by δ if the difference between the answer of q on T and the answer of q on an ε-simplification of T is bounded by δ.
◦ Clearly, the semantics of δ depends on the query type.
Spatio-Temporal Data Compression
Relationships among different distance-functions
ASIDE: when treating time as “almost-z” one needs to “relativize”
Spatio-Temporal Data Compression
Queries:
◦ Where_at
◦ When_at
◦ Range (i.e., Intersect(T,P))
◦ NN
◦ Θ-join
ASIDE: a heuristic (i.e., RDP) is used; the optimal algorithm yields an increase in complexity
Spatio-Temporal Data Compression
So far, full trajectories were considered
◦ ASIDE: one may consider periodic re-simplification of the compressed/simplified trajectories
   ◦ To enable further space-savings
   ◦ Applied to “old-enough” trajectories
   ◦ aka “Aging”
But what happens with the data that arrives at the MOD server in real-time?
◦ One can store it all, and then apply simplification on the completed motion
◦ OR – one can attempt to compress the data “on the fly”…
Spatio-Temporal Data Compression
Real-time compression:
◦ Motion model: sending (location, time, velocity) updates
◦ When: event-triggered, whenever the actual location deviates by > δ from the expected location (based on the previous update) – aka dead-reckoning
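A minimal sketch of the object-side logic of distance-based dead reckoning. It assumes that every sample already carries its velocity, i.e., tuples (x, y, t, vx, vy) in time order; the function name `dead_reckoning_updates` and the sample values are made up for illustration.

```python
import math


def dead_reckoning_updates(samples, delta):
    """Return the subset of samples that would be sent to the server."""
    updates = [samples[0]]  # the first sample is always reported
    for x, y, t, vx, vy in samples[1:]:
        ux, uy, ut, uvx, uvy = updates[-1]
        # Where the server expects the object, extrapolating the last update.
        ex, ey = ux + uvx * (t - ut), uy + uvy * (t - ut)
        if math.hypot(x - ex, y - ey) > delta:
            updates.append((x, y, t, vx, vy))
    return updates


# Hypothetical motion: heading east, then turning north after t = 3.
samples = [
    (0, 0, 0, 1, 0), (1, 0, 1, 1, 0), (2, 0, 2, 1, 0), (3, 0, 3, 1, 0),
    (3, 1, 4, 0, 1), (3, 2, 5, 0, 1), (3, 3, 6, 0, 1),
]
sent = dead_reckoning_updates(samples, delta=1.5)
```

Only two of the seven samples are transmitted: the initial one, and the one at t = 5, when the deviation from the extrapolated position first exceeds δ.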
Spatio-Temporal Data Compression
With distance-based dead reckoning, the MOD server performs two tasks:
◦ (1) corrects its own “knowledge” about the recent past and approximates the actual trajectory between told and tnow with a straight line-segment, which defines the actual simplification of the near-past trajectory;
◦ (2) generates another infinite ray corresponding to the future-expected trajectory, starting at the last update point, and using the newly received velocity vector for extrapolation.
As it turns out:
◦ Using ε as a dead-reckoning threshold would generate a simplified trajectory which would be a strong simplification with bound ≤ 2ε with respect to the entire trajectory…
Spatio-Temporal Data Compression
Generic real-time compression (tracking protocols)
Spatio-Temporal Data Compression
Tracking in sensor networks
◦ Instantaneous location detected by trilateration (i.e., distances from 3 sensors)
◦ Brute-force: transmit every location to the sink and let the sink construct the trajectory
Spatio-Temporal Data Compression
Assign a buffer (transmitted to next principal, along with current location)
When the buffer exceeds a certain capacity-threshold, apply compression
When the buffer fills up, transmit the entire buffer-simplified trajectory to the sink
ASIDE: The issue of “freshness” in the sink…
Spatio-Temporal Data Compression
An important context: what if the motion is constrained to an existing road network?
Often the case in practice – and one can capitalize on identifying “popular routes” and using them as “dictionary entries”.
Example:
◦ If an object is known to move along an existing road segment, then we need not store any GPS updates in-between (assuming uniformity of the motion)
   ◦ If the motion is non-uniform, then store the time-instants where the speed changes, along with the distance from one of the end-points of that road-segment
Spatio-Temporal Data Compression
In other words, change the “traditional” network edge-oriented model
Into a route-based model
Generalize route:
Spatio-Temporal Data Compression
Key observation:
Compression is not based on each individual trajectory, but reduces the size of MOD as a whole (in terms of number of Bytes needed to represent the dataset)
While the issues of uncertainty (in terms of query-errors) are not alleviated, the overall compression ratio for the entire MOD is higher than the “sum of the individual savings”
Need more solid (ML-“ish”) prediction to be applicable for real-time compression
Spatio-Temporal Data Compression
What if the entities which move also have (deformable) extents?
◦ Example:
   ◦ Spreading of toxic gases or spills
   ◦ Region affected by a hurricane (eye + tail)
To begin with, one can only get discrete spatial samples at discrete time-instants
Thus, in a sense:
◦ Compact representation of point-sets
◦ Those point-sets are close enough to capture a particular continuous phenomenon
Spatio-Temporal Data Compression
Iso-contours:
◦ Boundaries of a region where a phenomenon has values within certain bounds
Observation:
◦ This is more similar to the min-ε variant (i.e., one is given a budget of “size” for the polygons used) with respect to the area
◦ One can dynamically vary the (sub)sizes allocated to different values from the domain of the phenomenon…
Spatio-Temporal Data Compression
What to use when knowing that measurements were taken in discrete locations?
Using the convex hull (while simple) need not be the best idea (i.e., one may end up with a lot of dead-space…).
◦ α-shapes need not generate a simple polygon
Spatio-Temporal Data Compression
Key observation when tracking:
◦ Do not calculate the new shape from scratch (if/whenever possible)
◦ Similarly to trajectory-tracking, if the “freshness” of the sink is not “a must”, one can put a threshold based only on the pending queries and their answer-changes
Alternative Views/Trends
Behavioral classification of movement:
◦ Classification relies on computing and analyzing movement features jointly in both the spatial and temporal domains.
   ◦ Focusing on the spatial domain, the underlying movement space is partitioned into several zonings that correspond to different spatial scales, and features related to movement are computed for each partitioning level.
   ◦ Concentrating on the temporal domain, several movement parameters are computed from trajectories across a series of temporal windows of increasing sizes, yielding another set of input features for the classification.
For both the spatial and the temporal domains, ML techniques are used to determine the “reliable scale”
Alternative Views/Trends
Zebra-fish motion in reaction to different pharmacological products
Alternative Views/Trends
Spatio-temporal Data warehousing
A wide range of materializations at different levels of hierarchies
Alternative Views/Trends
Visualization of trajectories
Alternative Views/Trends
Visualization of trajectories
Density map framework for expert users, who explore distributions of attributes defined along trajectories.
In the exploration, the user mainly interacts with the distribution maps, though fine tuning for optimizing details is possible.
Alternative Views/Trends
Symbolic/Semantic Trajectories
Recall – MOD:
◦ Collection of trajectories {Tr1, Tr2, …, Trk}
◦ Each Tri a sequence:
◦ [(xi1, yi1, ti1), (xi2, yi2, ti2), …, (xim, yim, tim)]
tij < ti(j+1); in-between location samples, interpolation assumed
Semantic Trajectories
◦ Sequence of Semantic Episodes
ST = [se1, se2, se3, …, sen]
◦ Each episode a tuple of the form sei = (dai, spi, xi_in, yi_in, ti_in, xi_out, yi_out, ti_out, tagList)
Alternative Views/Trends
Symbolic/Semantic Trajectories
Each semantic episode: sei = (dai, spi, xi_in, yi_in, ti_in, xi_out, yi_out, ti_out, tagList)
Consists of:
da = defining annotation
◦ typically expressing an activity (verb) such as “stop”, “move”, etc.
sp = semantic location/position of the activity
◦ e.g., POI (museum, restaurant, zoo), home, work, etc.
tin and tout
◦ entry/exit times of a semantic position.
tagList
◦ any additional semantic information (e.g., transportation mode)
Alternative Views/Trends
Symbolic/Semantic Trajectories
ST1 = [
(drive, Adams St, 50, 10, 10:45, 10, 10, 11:00, drive, car, VW),
(stop, “Roditis”, 10, 10, 11:00, 10, 10, 11:45, restaurant, eat, salad),
(walk, parking lot, 10, 10, 11:45, 11, 10, 11:50, car, VW),
(drive, Randolph St, 11, 10, 11:55, 25, 10, 12:00, car),
(stop, traffic light, 25, 10, 12:00, 25, 10, 12:03, car),
(. . .)
(stop, “Starbucks”, 25, 40, 12:25, 25, 40, 1:30, coffee, eat, dessert)]
ST2 = [
(move, Dearborn St, 60, 60, 11:30, 60, 40, 11:45, walk),
(stop, “Arby’s”, 60, 40, 11:45, 60, 40, 12:30, fast-food, eat, beef),
(move, Dearborn St, 60, 40, 12:30, 60, 35, 13:00, walk),
(move, Chicago Ave, 50, 35, 13:00, 25, 35, 13:25, ride, bus 14),
(stop, “Starbucks”, 25, 35, 13:25, 25, 35, 13:50, coffee, dessert),
. . .
(move, Jackson St, 10, 20, 14:15, 50, 20, 14:40, ride, bus 151)]
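A minimal sketch of holding semantic episodes as typed records, using the first three episodes of ST2. The class name `SemanticEpisode`, its field names, the plain-string time stamps and the helper `stops` are assumptions made for illustration.

```python
from typing import NamedTuple, Tuple


class SemanticEpisode(NamedTuple):
    da: str                 # defining annotation (activity)
    sp: str                 # semantic position
    x_in: float
    y_in: float
    t_in: str               # entry time, kept as "HH:MM" text here
    x_out: float
    y_out: float
    t_out: str              # exit time
    tags: Tuple[str, ...]   # any additional semantic information


st2 = [
    SemanticEpisode("move", "Dearborn St", 60, 60, "11:30", 60, 40, "11:45",
                    ("walk",)),
    SemanticEpisode("stop", "Arby's", 60, 40, "11:45", 60, 40, "12:30",
                    ("fast-food", "eat", "beef")),
    SemanticEpisode("move", "Dearborn St", 60, 40, "12:30", 60, 35, "13:00",
                    ("walk",)),
]


def stops(trajectory):
    """Keep only the places where the object stopped, with entry/exit times."""
    return [(e.sp, e.t_in, e.t_out) for e in trajectory if e.da == "stop"]
```

Reducing a trajectory to its sequence of stops is one (lossy) way of reading the symbolic view as compression: the raw location samples between the entry and exit points are not needed to answer "where did the object eat, and when?".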
Alternative Views/Trends
From a certain perspective,
◦ Clusters
◦ Flocks
◦ Convoys
can be considered as a compressed (more compact) representation of the constituent trajectories…
Instead of conclusions
The “saga” will continue
Internet of Things offers a plethora of cross-context compression approaches
◦ Example: accounting for possible offline/indoor motion
Privacy assurances in the IoT context (while providing particular sampling-quality)
Analysis of collective motion in social settings is likely to generate novel compact representations
◦ Interested only in groups who have exhibited a pattern of sequentially visiting PoIs within the same time-delays and with above-threshold memberships
References
P.K. Agarwal and K. R. Varadarajan. Efficient algorithms for approximating polygonal chains. Discrete & Computational Geometry, 23:273–291, 2000.
I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless sensor networks: a survey. Computer Networks, 38(4), 2002.
Besim Avci, Goce Trajcevski, and Peter Scheuermann. Managing evolving shapes in sensor networks. In Conference on Scientific and Statistical Database Management, SSDBM ’14, Aalborg, Denmark, June 30 - July 02, 2014, pages 22:1–22:12, 2014.
Artur Baniukevic, Christian S. Jensen, and Hua Lu. Hybrid indoor positioning with wi-fi and bluetooth: Architecture and performance. In 2013 IEEE 14th International Conference on Mobile Data Management, Milan, Italy, June 3-6, 2013 - Volume 1, pages 207–216, 2013.
Gill Barequet, Danny Z. Chen, Ovidiu Daescu, Michael T. Goodrich, and Jack Snoeyink. Efficiently approximating polygonal paths in three and higher dimensions. Algorithmica, 33(2):150–167, 2002.
David Walter Baronowski. Polybius and Roman Imperialism. Bloomsbury Academic, 2013.
Borko Furht. A survey of multimedia compression techniques and standards. part I: JPEG standard. Real-Time Imaging, 1(1):49–67, 1995.
Robert L. Hilliard and Michael C. Keith. The Broadcast Century and Beyond: A Biography of American Broadcasting. Focal Press, 2010.
David A. Huffman. A method for the construction of minimum redundancy codes. In IRE, pages 1098–1101, 1951.
Michael Iliadis, Jeremy Watt, Leonidas Spinoulas, and Aggelos K. Katsaggelos. Video compressive sensing using multiple measurement vectors. In IEEE International Conference on Image Processing, ICIP 2013, Melbourne, Australia, September 15-18, 2013, pages 136–140, 2013.
James Biagioni, A. B. M. Musa, and Jakob Eriksson. Thrifty tracking: online GPS tracking with low data uplink usage. In 21st SIGSPATIAL International Conference on Advances in Geographic Information Systems, SIGSPATIAL 2013, Orlando, FL, USA, November 5-8, 2013, pages 486–489, 2013.
Michael H. Böhlen, Johann Gamper, and Christian S. Jensen. How would you like to aggregate your temporal data? In 13th International Symposium on Temporal Representation and Reasoning (TIME 2006), 15-17 June 2006, Budapest, Hungary, pages 121–136, 2006.
Alan Both, Matt Duckham, Patrick Laube, Tim Wark, and Jeremy Yeoman. Decentralized monitoring of moving objects in a transportation network augmented with checkpoints. Comput. J., 56(12):1432–1449, 2013.
Hu Cao, Ouri Wolfson, and Goce Trajcevski. Spatio-temporal data reduction with deterministic error bounds. VLDB Journal, 15(3), 2006.
W. Bernard Carlson. Tesla: The Inventor of the Electrical Age. Princeton University Press, 2013.
W.S. Chan and F. Chin. Approximation of polygonal curves with minimum number of line segments or minimum error. International Journal on Computational Geometry and Applications, 6(1), 1996.
Zhaofu Chen, Rafael Molina, and Aggelos K. Katsaggelos. Robust recovery of temporally smooth signals from under-determined multiple measurements. IEEE Transactions on Signal Processing, 63(7):1779–1791, 2015.
Urska Demsar, Kevin Buchin, E. Emiel van Loon, and Judy Shamoun-Baranes. Stacked space-time densities: a geovisualisation approach to explore dynamics of space use over time. GeoInformatica, 19(1):85–115, 2015.
David Douglas and T. Peucker. Algorithms for the reduction of the number of points required to represent a digitised line or its caricature. The Canadian Cartographer, 10(2), 1973.
Matt Duckham, Lars Kulik, Michael F. Worboys, and Antony Galton. Efficient generation of simple polygons for characterizing the shape of a set of points in the plane. Pattern Recognition, 41(10):3224–3236, 2008.
Sorabh Gandhi, John Hershberger, and Subhash Suri. Approximate isocontours and spatial summaries for sensor networks. In Proceedings of the 6th International Conference on Information Processing in Sensor Networks, IPSN 2007, Cambridge, Massachusetts, USA, April 25-27, 2007, pages 400–409, 2007.
Oliviu Ghica, Goce Trajcevski, Ouri Wolfson, Ugo Buy, Peter Scheuermann, Fan Zhou, and Dennis Vaccaro. Trajectory data reduction in wireless sensor networks. IJNGC, 1(1), 2010.
Vladimir Grupcev, Yongke Yuan, Yi-Cheng Tu, Jin Huang, Shaoping Chen, Sagar Pandit, and Michael Weng. Approximate algorithms for computing spatial distance histograms with accuracy guarantees. IEEE Trans. Knowl. Data Eng., 25(9):1982–1996, 2013.
Joachim Gudmundsson, Jyrki Katajainen, Damian Merrick, Cahya Ong, and Thomas Wolle. Compressing spatio-temporal trajectories. Comput. Geom., 42(9):825–841, 2009.
Ralf H. Güting, Michael H. Böhlen, Martin Erwig, Christian S. Jensen, Nikos Lorentzos, Markus Schneider, and Michalis Vazirgiannis. A foundation for representing and querying moving objects. ACM TODS, 2000.
Ralf H. Güting and Markus Schneider. Moving Objects Databases. Morgan Kaufmann, 2005.
Ralf Hartmut Güting, Fabio Valdés, and Maria Luisa Damiani. Symbolic trajectories. ACM Trans. Spatial Algorithms and Systems, 1(2):7, 2015.
John Hershberger and Jack Snoeyink. Speeding up the Douglas-Peucker line-simplification algorithm. In Proceedings of the 5th International Symposium on Spatial Data Handling, 1992.
D. Hirschberg and D. A. Lelewer. Data compression. Computing Surveys, 19(3), 1987.
Honeywell International Inc. Vehicle detection using AMR sensors. Technical report, Defense and Space Electronics Systems, 12001 Highway 55, Plymouth, MN 55441, 2005.
Katja Hose and Akrivi Vlachou. A survey of skyline processing in highly distributed environments. VLDB J., 21(3), 2012.
Zhiyong Huang, Hua Lu, Beng Chin Ooi, and Anthony K. H. Tung. Continuous skyline queries for moving objects. IEEE Trans. Knowl. Data Eng., 18(12), 2006.
H. Imai and M. Iri. Polygonal approximations of a curve-formulations and algorithms. In Computational Morphology, pages 71–86. Elsevier Science Publishers, New York, N.Y., 1988.
JeongAh Jang, Hyunsuk Kim, and HanByeog Cho. Smart roadside server for driver assistance and safety warning: Framework and applications. In CUTE 2010, pages 1–5, Dec 2010.
Christian S. Jensen and Richard T. Snodgrass. Temporal data management. IEEE Trans. Knowl. Data Eng., 11(1):36–44, 1999.
Panagiota Katsikouli, Rik Sarkar, and Jie Gao. Persistence based online signal and trajectory simplification for mobile devices. In Proceedings of the 22nd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Dallas/Fort Worth, TX, USA, November 4-7, 2014, pages 371–380, 2014.
Georgios Kellaris, Nikos Pelekis, and Yannis Theodoridis. Map-matched trajectory compression. Journal of Systems and Software, 86(6):1566–1579, 2013.
M. Koubarakis, T. Sellis, A.U. Frank, S. Grumbach, R.H. Güting, C.S. Jensen, N. Lorentzos, Y. Manolopoulos, E. Nardelli, B. Pernici, H.-J. Schek, M. Scholl, B. Theodoulidis, and N. Tryfona, editors. Spatio-Temporal Databases – the CHOROCHRONOS Approach. Springer-Verlag, 2003.
Ralph Lange, Frank Dürr, and Kurt Rothermel. Efficient real-time trajectory tracking. VLDB J., 20(5):671–694, 2011.
Ralph Lange, Tobias Farrell, Frank Dürr, and Kurt Rothermel. Remote real-time trajectory simplification. In PerCom, 2009.
Abraham Lempel and Jacob Ziv. A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3):337–343, 1977.
Hechen Liu and Markus Schneider. Tracking continuous topological changes of complex moving regions. In SAC, pages 833–838, 2011.
Jonathan Muckell, Jeong-Hyon Hwang, Vikram Patil, Catherine T. Lawson, Fan Ping, and S. S. Ravi. SQUISH: an online approach for GPS trajectory compression. In Proceedings of the 2nd International Conference and Exhibition on Computing for Geospatial Research & Application, COM.Geo 2011, Washington, DC, USA, May 23-25, 2011, pages 13:1–13:8, 2011.
Cheng Long, Raymond Chi-Wing Wong, and H. V. Jagadish. Direction preserving trajectory simplification. PVLDB, 6(10):949–960, 2013.
Pierre-François Marteau and Gildas Ménier. Speeding up simplification of polygonal curves using nested approximations. Pattern Anal. Appl., 12(4):367–375, 2009.
McKinsey Global Institute. Big data: The next frontier for innovation, competition, and productivity, 2011.
Robert McMaster. Automated line generalization. Cartographica, 24(2):74–111, 1987.
Widong Kou. Digital Image Processing: Algorithms and Standards. Springer, 1995.
William J. Phalen. How the Telegraph Changed the World. McFarland, 2014.
Khalid Sayood. Introduction to Data Compression. Morgan Kaufmann, 1996.
Penelope Wilson. Hieroglyphs: A Short Introduction. Oxford University Press, 2005.
Nirvana Meratnia and Rolf A. de By. Spatiotemporal compression techniques for moving point objects. In Advances in Database Technology - EDBT 2004, 9th International Conference on Extending Database Technology, Heraklion, Crete, Greece, March 14-18, 2004, Proceedings, pages 765–782, 2004.
Mohamed F. Mokbel and Walid G. Aref. SOLE: scalable on-line execution of continuous queries on spatio-temporal data streams. VLDB Journal, 17(5):971–995, 2008.
Mark Monmonier. Rhumb Lines and Map Wars: A Social History of the Mercator Projection. University of Chicago Press, 2004.
Axel Mosig, Stefan Jäger, Chaofeng Wang, Sumit Kumar Nath, Ilke Ersoy, Kannappan Palaniappan, and Su-Shing Chen. Tracking cells in life cell imaging videos using topological alignments. Algorithms for Molecular Biology, 4, 2009.
Iulian Sandu Popa, Karine Zeitouni, Vincent Oria, and Ahmed Kharrat. Spatio-temporal compression of trajectories in road networks. GeoInformatica, 19(1):117–145, 2015.
Jonathan Muckell, Paul W. Olsen, Jeong-Hyon Hwang, Catherine T. Lawson, and S. S. Ravi. Compression of trajectory data: a comprehensive evaluation and new approach. GeoInformatica, 18(3):435–460, 2014.
Christine Parent, Stefano Spaccapietra, Chiara Renso, Gennady L. Andrienko, Natalia V. Andrienko, Vania Bogorny, Maria Luisa Damiani, Aris Gkoulalas-Divanis, José Antônio Fernandes de Macêdo, Nikos Pelekis, Yannis Theodoridis, and Zhixian Yan. Semantic trajectories modeling and analysis. ACM Comput. Surv., 45(4):42, 2013.
A. Di Pasquale, L. Forlizzi, C. S. Jensen, Y. Manolopoulos, E. Nardelli, D. Pfoser, G. Proietti, S. Saltenis, Y. Theodoridis, and T. Tzouramanis. Access methods and query processing techniques. In Spatio-Temporal Databases: the Chorochronos Approach. 2003.
Mindaugas Pelanis, Simonas Saltenis, and Christian S. Jensen. Indexing the past, present, and anticipated future positions of moving objects. ACM Trans. Database Syst., 31(1), 2006.
Nikos Pelekis, Gennady L. Andrienko, Natalia V. Andrienko, Ioannis Kopanakis, Gerasimos Marketos, and Yannis Theodoridis. Visually exploring movement data via similarity-based analysis. J. Intell. Inf. Syst., 38(2):343–391, 2012.
Nikos Pelekis and Yannis Theodoridis. Mobility Data Management and Exploration. Springer, 2014.
Davood Rafiei and Alberto O. Mendelzon. Similarity-based queries for time series data. In SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA, pages 13–25, 1997.
Guenter Rebmann, Albert Verhoeff, and Megan Gilge. IBM Real-time Compression on the IBM XIV Storage System. In IBM RedBooks Collection. (ibm.com/redbooks), 2015.
Chiara Renso, Stefano Spaccapietra, and Esteban Zimányi (editors). Mobility Data: Modeling, Management and Understanding. Cambridge University Press, 2013.
K. Reumann and A.P. Witkam. Optimizing curve segmentation in computer graphics. In International Computing Symposium, pages 467–472, 1974.
Mehdi Riahi, Thanasis G. Papaioannou, Immanuel Trummer, and Karl Aberer. Utility-driven data acquisition in participatory sensing. In Joint 2013 EDBT/ICDT Conferences, EDBT ’13 Proceedings, Genoa, Italy, March 18-22, 2013, pages 251–262, 2013.
Kai-Florian Richter, Falko Schmid, and Patrick Laube. Semantic trajectory compression: Representing urban movement in a nutshell. J. Spatial Information Science, 4(1):3–30, 2012.
Philippe Rigaux, Michel Scholl, and Agnes Voisard. Introduction to Spatial Databases: Applications to GIS. Morgan Kaufmann, 2000.
Alan Saalfeld. Topologically consistent line simplification with the Douglas-Peucker algorithm. Cartography and Geographic Information Science, 26(1):7–18, 1999.
Juarez A. P. Sacenti, Fabio Salvini, Renato Fileto, Alessandra Raffaetà, and Alessandro Roncato. Automatically tailoring semantics-enabled dimensions for movement data warehouses. In Big Data Analytics and Knowledge Discovery - 17th International Conference, DaWaK 2015, Valencia, Spain, September 1-4, 2015, Proceedings, pages 205–216, 2015.
Mahmoud Attia Sakr, Gennady L. Andrienko, Thomas Behr, Natalia V. Andrienko, Ralf Hartmut Güting, and Christophe Hurter. Exploring spatiotemporal patterns by integrating visual analytics with a moving objects database system. In 19th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, ACM-GIS 2011, November 1-4, 2011, Chicago, IL, USA, Proceedings, pages 505–508, 2011.
J.H. Schiller and A. Voisard (editors). Location-Based Services. Morgan Kaufmann, 2004.
Yufei Tao and Dimitris Papadias. Spatial queries in dynamic environments. ACM Trans. Database Syst., 28(2), 2003.
Goce Trajcevski. Probabilistic range queries in moving objects databases with uncertainty. In Proceedings of the Third ACM International Workshop on Data Engineering for Wireless and Mobile Access, MobiDE 2003, pages 39–45, 2003.
Goce Trajcevski, Hu Cao, Peter Scheuermann, Ouri Wolfson, and Dennis Vaccaro. On-line data reduction and the quality of history in moving objects databases. In MobiDE, pages 19–26, 2006.
UNISYS. 2009 hurricane/tropical data for Atlantic, 2009.
USGS. Maps, imagery, and publications, 2009.
Alejandro A. Vaisman and Esteban Zimányi. Data Warehouse Systems - Design and Implementation. Data-Centric Systems and Applications. Springer, 2014.
Robert Weibel. Generalization of spatial data: Principles and selected algorithms. In Algorithmic Foundations of Geographic Information Systems. LNCS Springer Verlag, 1997.
Florian Wenzel and Werner Kießling. Aggregation and analysis of enriched spatial user models from location-based social networks. In Proceedings of GeoRich@SIGMOD, page 8, Snowbird, USA, 2014.
Xiaohui Yu, Ken Q. Pu, and Nick Koudas. Monitoring k-nearest neighbor queries over moving objects. In ICDE, pages 631–642, 2005.
Xianjin Zhu, Rik Sarkar, Jie Gao, and Joseph S. B. Mitchell. Lightweight contour tracking in wireless sensor networks. In INFOCOM, pages 1175–1183, 2008.
Thank You
Questions?