Data Structures for Range Minimum Queries
in Multidimensional Arrays
Hao Yuan Mikhail J. Atallah
Purdue University
January 17, 2010
Yuan & Atallah (PURDUE) Multidimensional RMQ January 17, 2010 1 / 28
Outline
Introduction
Results
Overview
Details
  Step 1: Comparison-Efficient Data Structures
  Step 2: Random Access Machine Implementation
Future Work
Definitions
Given a d-dimensional array A with N entries, a Range Minimum Query (RMQ) asks for the minimum element in the query range $q = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d]$, i.e.,
$$\mathrm{RMQ}(A, q) = \min A[q] = \min_{(k_1, \ldots, k_d) \in q} A[k_1, \ldots, k_d].$$
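The definition can be read directly as a brute-force procedure. The sketch below is only a reference implementation of the definition, not one of the data structures discussed in the talk; the name `rmq_bruteforce`, the nested-list representation of A and the 0-based inclusive bounds are assumptions made for illustration.

```python
from itertools import product


def rmq_bruteforce(A, q):
    """Minimum of A over the box q, straight from the definition.

    A is a d-dimensional array given as nested lists (an assumption).
    q is a list of d pairs (a_i, b_i), 0-based and inclusive.
    """
    def entry(index):
        # Walk down the nested lists one coordinate at a time.
        value = A
        for k in index:
            value = value[k]
        return value

    ranges = (range(a, b + 1) for a, b in q)
    return min(entry(index) for index in product(*ranges))
```

This takes time proportional to the number of entries in q, which is the cost the later slides aim to avoid.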
Applications
String Pattern Matching: 1D RMQ and its related Least Common Ancestor (LCA) problems are fundamental building blocks in suffix trees/arrays
Computational Biology: Finding min/max number in an alignment tableau (genome sequence analysis)
Image Processing: Finding the lightest/darkest point in a range (Dilate/Erode Filter)
Databases: Range Min/Max Query in OLAP Data Cube
Example
Select the highest paid employee whose age is between 30 and 40 and joined the company during the period between 1995 and 2005
Previous Work - 1D RMQ
1D Range Minimum Query
Linear Reduction to Least Common Ancestor (LCA) Problem [Gabow, Bentley and Tarjan 1984]
LCA: O(N) Preprocessing, O(1) Querying [Harel and Tarjan 1984]
RMQ & LCA: Much Studied (Parallelization, Simplification, Distributed Algorithms, etc.) [Schieber and Vishkin 1988] [Bender and Farach-Colton 2000] [Alstrup et al. 2002]
Previous Work - Semigroup Model
Related to the semi-group sum problem (MIN is a semi-group operator)
Data Structures: O(M) preprocessing time and space (M ≥ N), O(α(M, N)) querying time
One Dimensional: [Yao 1982], [Alon and Schieber 1987]
Multidimensional (fixed d): [Chazelle and Rosenberg 1989]
Multidimensional RMQ
Unit-Cost RAM Model: O(1) cost for Read/Write Memory, +, −, ∗, /, <<, >>
Comparison-Based: Array entries can only be compared
Table: Results for d-dimensional RMQ (d is fixed). The O(·) is omitted.
Method                        Preprocessing Time     Space                Querying Time
Gabow et al. 1984             $N \log^{d-1} N$       $N \log^{d-1} N$     $\log^{d-1} N$
Chazelle and Rosenberg 1989   $M$                    $M$                  $\alpha^d(M, N)$
Poon 2003                     $N (\log^* N)^d$       $N$                  $1$
Amir et al. 2007 (d = 2)      $N \log^{[k+1]} N$     $kN$                 $1$
Our result                    $N$                    $N$                  $1$
Overview
General Approach
Design Comparison-Efficient Algorithm: Only comparisons between input array entries are counted
Implement the Algorithm in RAM: All the computations are counted
Example: Minimum Spanning Tree Verification [Komlos 1984] [Dixon, Rauch and Tarjan 1992]
Overview
Following the general approach:
Comparison-Efficient Data Structures
  New 1D RMQ
    Preliminary: O(N log N)-comparison preprocessing and 1-comparison querying
    Speed up the preprocessing to O(N) comparisons
  New data structure generalizes to two or higher dimensional cases
    Preprocessing: O(N) comparisons
    Querying: O(1) comparisons
RAM Implementations
  Micro blocks of size ε log N
  Solve big size query by well-known algorithms
  Solve small size query by table lookup
Multidimensional RMQ
If only comparisons are counted:
2D RMQ Lower Bound [Demaine, Landau and Weimann, 2009]: if NO COMPARISON is allowed at the query stage, then Ω(N log N) comparisons are required for preprocessing
Our Result: $O(2.89^d (d+1)!\,N)$ comparisons for preprocessing, $2^d - 1$ comparisons for querying
Canonical Ranges
[Figure: positions 1 to 16 with a node x above them; its canonical range is CR(x) = [5, 8].]
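The slide shows canonical ranges only through the figure. A plausible reading, consistent with CR(x) = [5, 8] and with the later use of LCA, is that they are the position ranges covered by the nodes of a complete binary tree over the array. The sketch below enumerates them under that assumption; the function name `canonical_ranges` and the restriction to N being a power of two are illustrative choices, not taken from the slides.

```python
def canonical_ranges(n):
    """List the (lo, hi) ranges, 1-based and inclusive, covered by the nodes
    of a complete binary tree over positions 1..n (n a power of two)."""
    assert n > 0 and (n & (n - 1)) == 0, "sketch assumes n is a power of two"
    ranges = []
    size = 1
    while size <= n:
        # Every aligned block of the current size is one tree node.
        for lo in range(1, n + 1, size):
            ranges.append((lo, lo + size - 1))
        size *= 2
    return ranges
```

For n = 16 this gives 31 ranges, one per tree node, and (5, 8) is among them.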
Pre-Computations
For each p ∈ CR(x), define
$$\mathrm{LeftMin}(x, p) = \min_{k \in CR(x),\ k \le p} A[k]$$
$$\mathrm{RightMin}(x, p) = \min_{k \in CR(x),\ k \ge p} A[k]$$
Example with CR(x) = [5, 8] and p = 7:
LeftMin(x, 7) = min{A[5], A[6], A[7]}
RightMin(x, 7) = min{A[7], A[8]}
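As a small illustration of these two definitions, the following sketch computes all LeftMin and RightMin entries of one canonical range with running minima. It uses |CR(x)| − 1 comparisons per array, i.e. the naive cost that the later slides improve on. The function names and the 1-based inclusive bounds `lo`, `hi` are assumptions for illustration.

```python
from itertools import accumulate


def left_min(A, lo, hi):
    """Map each p in [lo, hi] to min(A[lo..p]); positions are 1-based."""
    values = A[lo - 1:hi]
    return dict(zip(range(lo, hi + 1), accumulate(values, min)))


def right_min(A, lo, hi):
    """Map each p in [lo, hi] to min(A[p..hi]); positions are 1-based."""
    values = A[lo - 1:hi][::-1]
    return dict(zip(range(hi, lo - 1, -1), accumulate(values, min)))
```

With positions 5 to 8 holding 7, 3, 8, 5, `left_min(A, 5, 8)[7]` is the minimum of 7, 3, 8 and `right_min(A, 5, 8)[7]` is the minimum of 8, 5.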
Query
[Figure: positions 1 to 16 with the query range [6, 14]; x and y are the two children of LCA(6, 14).]
min A[6..14] = min{RightMin(x, 6), LeftMin(y, 14)}
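A minimal sketch of the one-comparison query, assuming canonical ranges are aligned power-of-two blocks and using the naive O(N log N)-comparison preprocessing. Positions are 0-based here, unlike the slides. The helper names `build` and `query` are hypothetical, and locating the children of the LCA via the highest differing bit of the two endpoints is an implementation choice of this sketch, not something stated in the slides.

```python
def build(A):
    """left[l][p]  = min of A from the start of p's block of size 2**l up to p.
    right[l][p] = min of A from p to the end of p's block of size 2**l."""
    n = len(A)
    levels = max((n - 1).bit_length(), 1)
    left, right = [], []
    for l in range(levels):
        size = 1 << l
        L, R = list(A), list(A)
        for p in range(n):
            if p % size:                 # p is not the first position of its block
                L[p] = min(L[p - 1], A[p])
        for p in range(n - 2, -1, -1):
            if (p + 1) % size:           # p is not the last position of its block
                R[p] = min(R[p + 1], A[p])
        left.append(L)
        right.append(R)
    return left, right


def query(left, right, a, b):
    """Minimum of A[a..b] (inclusive, a <= b) with a single comparison."""
    # The highest bit where a and b differ gives the size of the two
    # sibling blocks x (containing a) and y (containing b).
    l = max((a ^ b).bit_length() - 1, 0)
    return min(right[l][a], left[l][b])
```

For the slide's query [6, 14], i.e. a = 5 and b = 13 in 0-based terms, the level is 3, so the two halves have size 8, matching the children of the root.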
Pre-Computations
Naive Algorithm: O(N log N) Comparisons
Faster Approach
Sort the canonical ranges by their lengths
Compute the LeftMin and RightMin entries for canonical ranges in the sorted order
For a length-one canonical range CR(w), LeftMin(w, p) = RightMin(w, p) = A[p] (p ∈ CR(w))
For a canonical range CR(w) covering more than one position, compute the LeftMin and RightMin arrays using O(log |CR(w)|) comparisons (instead of O(|CR(w)|))
Pre-Computations
Case 1: p ∈ CR(x), where x and y are the left and right halves of w
LeftMin(w, p) = LeftMin(x, p)
Pre-Computations
Case 2: p ∈ CR(y)
LeftMin(w, p) = min{min CR(x), LeftMin(y, p)}
Monotonicity (Non-Increasing): LeftMin(y, p) ≥ LeftMin(y, p + 1)
Binary Search!
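The two cases can be combined into a bottom-up construction of all LeftMin arrays. The sketch below assumes N is a power of two, stores one array per block size, and counts only comparisons between array values, as in the comparison model. Copying entries is free in that model but not in RAM time. The name `build_leftmin` and the returned counter are illustrative.

```python
def build_leftmin(A):
    """Return (levels, comparisons), where levels[l][p] is the minimum of A
    from the start of p's aligned block of size 2**l up to p."""
    n = len(A)
    assert n > 0 and (n & (n - 1)) == 0, "sketch assumes n is a power of two"
    levels = [list(A)]                   # length-one ranges: LeftMin = A[p]
    comparisons = 0
    size = 1
    while size < n:
        prev = levels[-1]
        cur = list(prev)                 # Case 1: left half x is copied unchanged
        for start in range(0, n, 2 * size):
            mid, end = start + size, start + 2 * size
            m = prev[mid - 1]            # min over CR(x), already known
            # Case 2: prev[mid:end] is non-increasing, so binary search for
            # the first position whose LeftMin(y, p) is smaller than m.
            lo, hi = mid, end
            while lo < hi:
                c = (lo + hi) // 2
                comparisons += 1
                if prev[c] < m:
                    hi = c
                else:
                    lo = c + 1
            for p in range(mid, lo):     # these entries become m
                cur[p] = m
        levels.append(cur)
        size *= 2
    return levels, comparisons
```

In this sketch a range of size 2s costs at most log2(s) + 1 comparisons, so the total stays below 2N. RightMin is symmetric.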
Binary Search
Example
min CR(x) = 40
LeftMin(y, ...) = 90, 70, 50, 20, 10
LeftMin(w, ...) = 40, 40, 40, 20, 10
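The example can be reproduced with a few lines. This is a sketch of the single merge step only; `merge_leftmin` is a hypothetical name, and the input list is assumed to be non-increasing, as the monotonicity property guarantees.

```python
def merge_leftmin(m, leftmin_y):
    """Given m = min CR(x) and the non-increasing list LeftMin(y, ...),
    return LeftMin(w, ...) restricted to the positions of y."""
    lo, hi = 0, len(leftmin_y)
    while lo < hi:                       # first index with value < m
        mid = (lo + hi) // 2
        if leftmin_y[mid] < m:
            hi = mid
        else:
            lo = mid + 1
    # Entries before the crossover are capped at m; the rest are unchanged.
    return [m] * lo + leftmin_y[lo:]


print(merge_leftmin(40, [90, 70, 50, 20, 10]))  # [40, 40, 40, 20, 10]
```

Only the binary search touches array values, which is why the step needs a logarithmic number of comparisons.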
Comparison Complexity
T(n): the number of comparisons to compute the LeftMin and RightMin entries for canonical ranges whose size is at most n
$$T(1) = 0$$
$$T(n) = 2\,T\!\left(\frac{n}{2}\right) + O(\log n) \quad \text{for } n \ge 2$$
We have the preprocessing comparison complexity
$$T(n) = O(n),$$
and need to do 1 comparison at the query stage.
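One way to see why the recurrence solves to O(n) is to unroll it. The derivation below is a sketch that assumes n is a power of two, logarithms are base 2, and the O(log n) term is at most c log n for some constant c; the constant c is introduced here only for illustration.

```latex
\begin{aligned}
T(n) &\le \sum_{i=0}^{\log n - 1} 2^{i} \cdot c \log\frac{n}{2^{i}}
      && \text{($2^{i}$ subproblems of size $n/2^{i}$ at depth $i$)} \\
     &= c \sum_{j=1}^{\log n} \frac{n}{2^{j}} \cdot j
      && \text{(substituting $j = \log n - i$)} \\
     &= c\,n \sum_{j=1}^{\log n} \frac{j}{2^{j}}
      \;<\; c\,n \sum_{j=1}^{\infty} \frac{j}{2^{j}} \;=\; 2\,c\,n .
\end{aligned}
```

The sum is dominated by the small ranges: there are many of them, but each needs only a constant number of comparisons.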
2D Case
2D Canonical Range: Cartesian Product of Two 1D Canonical Ranges
[Figure: a 16 × 16 grid; a 1D canonical range x on one axis and a 1D canonical range y on the other axis together define a 2D canonical range.]
2D Pre-Computations
For each 2D canonical range r and a point p ∈ r, compute the 4 "Dominance Min" array entries, one for each quadrant of r around p:
BotLeftMin(r, p), TopLeftMin(r, p), TopRightMin(r, p), BotRightMin(r, p)
2D Query
For any query range q, we can always divide it into 4 parts, which are all pre-computed
[Figure: q split into four parts, answered by BotRightMin(r1, p1), BotLeftMin(r2, p2), TopRightMin(r3, p3) and TopLeftMin(r4, p4).]
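The four-part decomposition can be sketched as follows, using naive preprocessing rather than the linear-comparison one. Indices are 0-based. The dominance-min arrays are identified by direction signs `(dr, dc)` instead of the Bot/Top/Left/Right names, because the slides' orientation convention is not recoverable from the text. All function names are hypothetical.

```python
def split_level(a, b):
    """Level l such that a and b lie in sibling blocks of size 2**l."""
    return max((a ^ b).bit_length() - 1, 0)


def dominance_min(A, h, w, dr, dc):
    """out[i][j] = min of A over the part of the h-by-w aligned block
    containing (i, j) that extends from (i, j) in direction (dr, dc)."""
    R, C = len(A), len(A[0])
    out = [[None] * C for _ in range(R)]
    rows = range(R) if dr < 0 else range(R - 1, -1, -1)
    for i in rows:
        cols = range(C) if dc < 0 else range(C - 1, -1, -1)
        for j in cols:
            best = A[i][j]
            pi, pj = i + dr, j + dc      # neighbours already computed
            if 0 <= pi < R and pi // h == i // h:
                best = min(best, out[pi][j])
            if 0 <= pj < C and pj // w == j // w:
                best = min(best, out[i][pj])
            out[i][j] = best
    return out


def build_2d(A):
    R, C = len(A), len(A[0])
    tables = {}
    for lr in range(max((R - 1).bit_length(), 1)):
        for lc in range(max((C - 1).bit_length(), 1)):
            for dr in (-1, 1):
                for dc in (-1, 1):
                    tables[lr, lc, dr, dc] = dominance_min(
                        A, 1 << lr, 1 << lc, dr, dc)
    return tables


def query_2d(tables, r1, r2, c1, c2):
    """Minimum over rows r1..r2 and columns c1..c2 with 3 comparisons."""
    lr, lc = split_level(r1, r2), split_level(c1, c2)
    return min(tables[lr, lc, 1, 1][r1][c1],
               tables[lr, lc, 1, -1][r1][c2],
               tables[lr, lc, -1, 1][r2][c1],
               tables[lr, lc, -1, -1][r2][c2])
```

Each corner of the query looks up one pre-computed entry, so a query needs four lookups and three comparisons, which matches the 2^d − 1 bound for d = 2.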
Efficient Pre-Computations
For any canonical range r, cut the middle of its longer side to obtain two smaller canonical ranges r1 and r2
Do binary search row by row (or column by column)
O(N) comparisons for 2D preprocessing
Generalize to any fixed dimension d:
Preprocess: $O(2.89^d (d+1)!\,N)$ comparisons
Query: $2^d - 1$ comparisons
[Figure: r split into r1 and r2; a binary search along the row through p gives BotLeftMin(r, p).]
Overview of RAM Implementations
Divide the array into micro blocks of size ε log N
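Each block is a d-dimensional cube with side length $(\varepsilon \log N)^{1/d}$
For example, in 2D, make each block $\sqrt{\varepsilon \log N}$ by $\sqrt{\varepsilon \log N}$
[Figure: three example queries q1, q2 and q3 on the grid of micro blocks.]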
For a query that crosses the border of any micro block: there exist O(N)-time preprocessing and constant-time querying data structures to solve it, using dimension reductions and the help of the data structures in [Yao 1982] [Chazelle and Rosenberg 1989]
For a query that lies completely within a micro block, use the table lookup technique (Four Russians' Trick) to get the locations of at most $2^d$ candidates to compare at the querying stage
Based on our linear-comparison preprocessing data structure
Micro Block
Key Idea: If two micro blocks have the same type, then they should share the same data structures
Type of a micro block: Comparison results (true/false sequence) of the linear-comparison preprocessing algorithm
Assume cε log N comparisons to preprocess a block: at most $2^{c\varepsilon \log N} = N^{c\varepsilon}$ possible types
Choose ε < 1/c, then there are only a sublinear number of types:
$N^{c\varepsilon} \cdot \mathrm{polylog}(\varepsilon \log N) = o(N)$
Recognizing the types for all micro blocks in linear time: build a linear-depth decision tree according to the linear-comparison preprocessing algorithm
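The sharing idea can be illustrated in 1D with a deliberately simple comparison-based preprocessing. The sketch below records the true/false outcome sequence of a quadratic-comparison algorithm as the block's type and stores one table of argmin positions per distinct type. This is a stand-in chosen for brevity: the talk's scheme uses the linear-comparison algorithm, which is what keeps the number of types sublinear. All names are hypothetical.

```python
def block_type_and_table(block):
    """Run a fixed comparison-based algorithm on one block.

    Returns (type, table): type is the tuple of comparison outcomes and
    table[i][j] is the offset of a minimum of block[i..j] (for i <= j).
    The table depends only on the outcomes, never on the actual values.
    """
    b = len(block)
    outcomes = []
    table = [[0] * b for _ in range(b)]
    for i in range(b):
        table[i][i] = i
        for j in range(i + 1, b):
            prev = table[i][j - 1]
            keep = block[prev] <= block[j]   # the only use of array values
            outcomes.append(keep)
            table[i][j] = prev if keep else j
    return tuple(outcomes), table


def build_shared_tables(A, b):
    """Split A into blocks of size b; blocks of equal type share a table."""
    tables, block_types = {}, []
    for start in range(0, len(A), b):
        t, table = block_type_and_table(A[start:start + b])
        tables.setdefault(t, table)
        block_types.append(t)
    return tables, block_types


def in_block_query(A, b, tables, block_types, lo, hi):
    """Minimum of A[lo..hi] when both ends fall in the same block."""
    k = lo // b
    assert hi // b == k, "query must stay inside one micro block"
    offset = tables[block_types[k]][lo - k * b][hi - k * b]
    return A[k * b + offset]
```

For example, the blocks [3, 1, 2] and [30, 10, 20] produce the same outcome sequence and therefore share one table, while [5, 6, 4] gets its own.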
Summary of RAM
Our unit-cost RAM data structure
Preprocess in $O(2.89^d (d+1)!\,N)$ time and $O(2^d d!\,N)$ space
Query in $O(3^d)$ time
Future Work
Future Work:
Extend the lower bound of [Demaine, Landau and Weimann, 2009]
If at most t comparisons are allowed at the querying stage, find the lower bound for the number of comparisons required to preprocess the input
Dynamic Updates [Poon, 2003]
Extend our results to the external memory model
END
Thank You!