112/04/19 Chen and Chao 1
On the Range Maximum-Sum Segment Query Problem
Kuan-Yu Chen and Kun-Mao Chao
Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
The Maximum-Sum Segment
Also called the maximum-sum interval or the maximum-scoring region.
Given a sequence of numbers, the maximum-sum segment is simply the contiguous subsequence having the greatest total sum.
<5, -5.1, 1, 3, -4, 2, 3, -4, 7>
Zero prefix-/suffix-sums are possible.
With greatest total sum = 8
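The definition can be checked on the example with a single left-to-right scan. The sketch below uses Kadane's scan, which is a standard technique swapped in for illustration and not taken from the slides; the function name `max_sum_segment` and the 0-based indices are assumptions.

```python
def max_sum_segment(a):
    """Return (best_sum, start, end) of a maximum-sum segment, 0-based inclusive."""
    best_sum, best_start, best_end = a[0], 0, 0
    cur_sum, cur_start = a[0], 0
    for idx in range(1, len(a)):
        if cur_sum < 0:
            # A negative running sum can only hurt, so restart here.
            cur_sum, cur_start = a[idx], idx
        else:
            cur_sum += a[idx]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, idx
    return best_sum, best_start, best_end


print(max_sum_segment([5, -5.1, 1, 3, -4, 2, 3, -4, 7]))
```

Because the scan restarts only on a strictly negative running sum, the reported segment may carry a zero-sum prefix, which is the situation the slide's remark about zero prefix-sums refers to.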
A Relevant Problem - RMQ
Range Minima (Maxima) Query Problem (also called Discrete Range Searching)
Given a sequence of numbers, by preprocessing the sequence we wish to retrieve the minimum (maximum) value within a given querying interval efficiently.
<5, -5.1, 1, 3, -4, 2, 3, -4, 7>
[Figure: the minimum and the maximum within a querying interval are marked.]
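One common way to answer such queries is a sparse table. The sketch below is an illustration only: it needs O(n log n) preprocessing and O(1) per query, so it is not the <O(n), O(1)> structure the talk relies on later. The class name `SparseTableRMQ` and the 0-based inclusive interval are assumptions.

```python
class SparseTableRMQ:
    """Range minimum (or maximum) over a fixed sequence via a sparse table."""

    def __init__(self, values, better=min):
        self.better = better
        # table[k][i] holds the best value of values[i : i + 2**k].
        self.table = [list(values)]
        n, span = len(values), 1
        while 2 * span <= n:
            prev = self.table[-1]
            self.table.append(
                [better(prev[i], prev[i + span]) for i in range(n - 2 * span + 1)]
            )
            span *= 2

    def query(self, lo, hi):
        """Best value in values[lo..hi], 0-based inclusive."""
        level = (hi - lo + 1).bit_length() - 1
        row = self.table[level]
        # Two overlapping blocks of length 2**level cover [lo, hi].
        return self.better(row[lo], row[hi - (1 << level) + 1])


seq = [5, -5.1, 1, 3, -4, 2, 3, -4, 7]
print(SparseTableRMQ(seq, min).query(2, 6), SparseTableRMQ(seq, max).query(2, 6))
```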
Range Maximum-Sum Segment Query Problem
Definition: The input is a sequence <a1, a2, …, an> of real numbers which is to be preprocessed. A query is comprised of two intervals S and E. Our goal is to return the maximum-sum segment whose starting index lies in S and whose end index lies in E.
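To make the query semantics concrete, here is a brute-force reference that tries every admissible pair of endpoints. It is only a specification-level sketch, far slower than the preprocessing-based solution the talk develops; the name `rmsq_bruteforce` and the 0-based inclusive interval tuples are assumptions.

```python
def rmsq_bruteforce(a, S, E):
    """Max-sum segment a[i..j] with i in S, j in E, i <= j (0-based, inclusive).

    Returns (sum, i, j), or None if no admissible segment exists.
    """
    best = None
    for i in range(S[0], S[1] + 1):
        running = 0
        for j in range(i, E[1] + 1):
            running += a[j]
            if j >= E[0] and (best is None or running > best[0]):
                best = (running, i, j)
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(rmsq_bruteforce(a, (0, 2), (3, 8)))
```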
A Nonoverlapping Example
Input Sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
Total sum = 6
[Figure: the starting region and the end region are marked on the sequence.]
An Overlapping Example
Input Sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
Total sum = 8
[Figure: the starting region and the end region are marked on the sequence.]
Our Results
We propose an algorithm that runs in O(n) preprocessing time and O(1) query time under the unit-cost RAM model.
We show that the RMSQ techniques yield alternative O(n) time algorithms for the following problems:
The maximum-sum segment with length constraints
All maximal-sum segments
Strategy
Reduce the RMSQ to the RMQ problem
Theorem. If there is a <f(n), g(n)>-time solution for the RMQ problem, then there is a <f(n)+O(n), g(n)+O(1)>-time solution for the RMSQ problem.
[Diagram: RMSQ is reduced to RMQ; labels O(n) and O(1).]
Cumulative Sum / Prefix Sum
prefix-sum(i) = a1+a2+…+ai
Computing sum(i, j) in O(1) time
prefix-sum(i) = a1+a2+…+ai
All n prefix sums are computable in O(n) time.
sum(i, j) = prefix-sum(j) - prefix-sum(i-1)
[Figure: sum(i, j) shown as the difference between prefix-sum(j) and prefix-sum(i-1).]
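The two formulas translate directly into code. In this minimal sketch the helper names `prefix_sums` and `segment_sum` are assumptions, indices are 1-based as on the slides, and an extra entry P[0] = 0 is added so that prefix-sum(i-1) is defined for i = 1.

```python
from itertools import accumulate


def prefix_sums(a):
    """P[0] = 0 and P[i] = a1 + ... + ai, computed in O(n)."""
    return [0] + list(accumulate(a))


def segment_sum(P, i, j):
    """sum(i, j) = prefix-sum(j) - prefix-sum(i-1), in O(1); 1-based, inclusive."""
    return P[j] - P[i - 1]


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P = prefix_sums(a)
print(P)
print(segment_sum(P, 3, 9))
```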
Case 1: Nonoverlapping
sum(i, j) = prefix-sum(j) - prefix-sum(i-1)
Prefix-sum sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
[Figure: to maximize sum(i, j), maximize prefix-sum(j) by finding the highest point in the end region and minimize prefix-sum(i-1) by finding the lowest point in the starting region, each with a range minima (maxima) query.]
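When the starting region lies entirely before the end region, the two choices are independent, so one range-minimum and one range-maximum lookup on the prefix sums suffice. In the sketch below the linear scans with `min` and `max` stand in for the O(1) RMQ structure; the function name `nonoverlapping_query` and the 1-based bounds `s1, s2, e1, e2` are assumptions.

```python
from itertools import accumulate


def nonoverlapping_query(a, s1, s2, e1, e2):
    """Max-sum segment with start in [s1, s2] and end in [e1, e2], where s2 < e1.

    Indices are 1-based and inclusive; returns (sum, start, end).
    """
    P = [0] + list(accumulate(a))
    # Lowest prefix-sum(i-1) for a start i in [s1, s2].
    k = min(range(s1 - 1, s2), key=P.__getitem__)
    # Highest prefix-sum(j) for an end j in [e1, e2].
    j = max(range(e1, e2 + 1), key=P.__getitem__)
    return P[j] - P[k], k + 1, j


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(nonoverlapping_query(a, 1, 3, 4, 9))
```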
Case 2: Overlapping
Some problems may occur
Prefix-sum sequence: 9, -10, 4, -2, 5, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
[Figure: finding the lowest point in the starting region and the highest point in the end region independently may give a negative sum when the two regions overlap.]
Case 2: Overlapping
Divide into 3 possible cases:
Prefix-sum sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
[Figure: in two of the cases a "find the lowest point" region is paired with a separate "find the highest point" region; each is answered by a range minima query with preprocessing time f(n) and query time g(n). The remaining case is marked "What should we do?"]
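One way to read the three cases, assuming the query intervals satisfy s1 <= e1 <= s2 <= e2 (an assumption made here, not stated on the slide): the start lies before the overlap, the end lies after the overlap, or both endpoints lie inside the overlap. The sketch below handles the first two with independent min/max lookups and uses a brute-force placeholder for the third, which is the single-range case the talk goes on to solve. All function names are hypothetical.

```python
from itertools import accumulate


def disjoint_query(P, s1, s2, e1, e2):
    """Start in [s1, s2], end in [e1, e2], with s2 < e1 (1-based, inclusive)."""
    k = min(range(s1 - 1, s2), key=P.__getitem__)
    j = max(range(e1, e2 + 1), key=P.__getitem__)
    return P[j] - P[k], k + 1, j


def single_range_bruteforce(P, lo, hi):
    """Placeholder for the special case: both endpoints inside [lo, hi]."""
    return max((P[j] - P[i - 1], i, j)
               for i in range(lo, hi + 1) for j in range(i, hi + 1))


def overlapping_query(a, s1, s2, e1, e2):
    """Assumes s1 <= e1 <= s2 <= e2; returns (sum, start, end)."""
    P = [0] + list(accumulate(a))
    candidates = [single_range_bruteforce(P, e1, s2)]
    if s1 < e1:   # start strictly before the overlap
        candidates.append(disjoint_query(P, s1, e1 - 1, e1, e2))
    if s2 < e2:   # end strictly after the overlap
        candidates.append(disjoint_query(P, e1, s2, s2 + 1, e2))
    return max(candidates)


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(overlapping_query(a, 2, 8, 5, 12))
```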
Dealing with the Special Case: Single Range Query
Input Sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
Total sum = 6
Challenge: Can this special case be reduced to the RMQ problem?
Reduction Procedure
Step 1. Find a partner for each index.
Step 2. Record the sum of each pair in an array.
Step 3. Retrieve the maximum-sum pair by applying the RMQ techniques.
Our First Attempt (1)
Step 1: For each index i, we define the lowest point preceding i as its partner.
Prefix-sum sequence:
[Figure: for index i, the partner is the lowest point found within the region preceding i.]
Our First Attempt (2)
Step 2: Record sum(partner(i), i) in an array.
[Figure: sum(partner(i), i) is measured from the lowest point preceding i up to i.]
Our First Attempt (3)
Step 3: Apply the RMQ techniques to the array.
[Figure: applying RMQ to the sequence of sum(partner(i), i) values and querying the interval retrieves the maximum-sum pair.]
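The first attempt can be sketched with a running minimum over the prefix sums. The indexing is an assumption: positions are 1-based, P[0] = 0 is added, and partner[i] stores the prefix index k of the lowest point before i, so the recorded segment is a[k+1..i]. The name `first_attempt_pairs` is hypothetical.

```python
from itertools import accumulate


def first_attempt_pairs(a):
    """For each end index i, pair it with the lowest prefix sum preceding it."""
    P = [0] + list(accumulate(a))
    partner = [0] * len(P)        # entry 0 is unused
    pair_sum = [None] * len(P)    # entry 0 is unused
    lowest = 0                    # index of the lowest prefix sum seen so far
    for i in range(1, len(P)):
        partner[i] = lowest
        pair_sum[i] = P[i] - P[lowest]
        if P[i] < P[lowest]:
            lowest = i
    return partner, pair_sum


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
partner, pair_sum = first_attempt_pairs(a)
print(pair_sum[1:])
```

A range-maximum query over `pair_sum` then picks the best recorded pair for a queried interval of end indices.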
Bump into Difficulties
What if a partner goes beyond the querying interval?
[Figure: partner(i) lies outside the querying interval, so sum(partner(i), i) needs to be updated.]
We might have to update every pair!
A Better Partner
Prefix-sum sequence
[Figure: left_bound(i) is the nearest point to the left that is at least as large as i; the new partner(i) is the lowest point between left_bound(i) and i.]
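The better partner can be computed for all indices in one pass with a stack. This is a sketch under stated assumptions, not necessarily the authors' procedure: positions are 1-based with P[0] = 0, partner[i] is the prefix index k of the lowest point in [left_bound(i), i-1] (so the paired segment is a[k+1..i]), and when no earlier point is at least as large the search runs back to index 0. The names `better_partners` and `lowest` are hypothetical.

```python
from itertools import accumulate


def better_partners(a):
    """Return (P, partner) where partner[i] is the lowest point between
    left_bound(i) and i on the prefix-sum sequence; entry 0 is unused."""
    P = [0] + list(accumulate(a))
    partner = [0] * len(P)
    # lowest[t]: index of the lowest prefix sum in [left_bound(t), t].
    lowest = [0] * len(P)
    stack = [0]                   # indices whose prefix sums are non-increasing
    for i in range(1, len(P)):
        best = stack[-1]          # i-1 is always on top of the stack
        while stack and P[stack[-1]] < P[i]:
            t = stack.pop()       # t is lower than i, so it lies right of left_bound(i)
            if P[lowest[t]] < P[best]:
                best = lowest[t]
        partner[i] = best
        lowest[i] = best if P[best] < P[i] else i
        stack.append(i)
    return P, partner


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P, partner = better_partners(a)
print(partner[1:])
print([P[i] - P[partner[i]] for i in range(1, len(P))])
```

Each index is pushed and popped at most once, so the pass takes O(n) time.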
Why Is It Better? (1)
It remains the best choice.
It saves lots of update steps.
It turns out that zero or one point needs to be updated.
Why Is It Better? (2) -- Remains the Best
[Figure: left_bound(i) is the nearest higher point; partner(i) is the lowest point between left_bound(i) and i; the region before left_bound(i) is marked "Impossible region".]
Why Is It Better? (3) -- Minimal-Maximal Property
Height(partner(i)) < Height(j) < Height(i), for all partner(i) < j < i
[Figure: between partner(i) and i, no one is higher than i (the maximal point) and no one is lower than partner(i) (the minimal point); the next higher point is also marked.]
Why Is It Better? (4) -- Save Some Updates
Prefix-sum sequence
[Figure: i, partner(i), the next higher point, and the querying interval are marked. No one between partner(i) and i is higher than i, so those points cannot be the right end of the maximum-sum segment.]
Why Is It Better? (5) -- Nesting Property
For two indices i < j, it cannot be the case that partner(i) < partner(j) ≦ i < j.
[Figure: two pairs, (partner(i), i) and (partner(j), j), each running from a minimal point to a maximal point.]
Why Is It Better? (6) -- An example
No overlapping is allowed
9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
Nesting Property
9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
When a Query Comes -- Case 1: No Exceeding
The maximum pair (partner(i), i) lies in the querying interval.
[Figure: retrieve the maximum pair; both partner(i) and i fall inside the querying interval.]
We are done. Output (partner(i), i).
When a Query Comes -- Case 2: Exceeding
The maximum pair (partner(i), i) goes beyond the querying interval.
(partner(i), i) is the maximum pair, so by the minimal-maximal property the points of the querying interval to the left of i cannot be the right end of the maximum-sum segment.
Update partner(i) to a new partner inside the querying interval.
Retrieve the maximum pair (partner(j), j) among the indices to the right of i; by the nesting property it does not go beyond the querying interval.
Compare (new_partner(i), i) and (partner(j), j).
[Figure: the querying interval with i, partner(i) outside it, the updated partner(i), and the pair (partner(j), j).]
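The two query cases can be put together as follows. This is a minimal sketch under assumptions: positions are 1-based with P[0] = 0, partner[i] is a prefix index k so that the paired segment is a[k+1..i], the partners are found by a plain backward scan rather than a linear-time pass, and the `min`/`max` scans stand in for O(1) RMQ lookups. The names `preprocess` and `single_range_query` are hypothetical.

```python
from itertools import accumulate


def preprocess(a):
    """Prefix sums, partners, and pair sums (index 0 of the last two is unused)."""
    P = [0] + list(accumulate(a))
    partner = [0] * len(P)
    for i in range(1, len(P)):
        best = k = i - 1
        while k >= 0:
            if P[k] < P[best]:
                best = k
            if P[k] >= P[i]:      # reached left_bound(i); stop scanning
                break
            k -= 1
        partner[i] = best
    pair_sum = [None] + [P[i] - P[partner[i]] for i in range(1, len(P))]
    return P, partner, pair_sum


def single_range_query(P, partner, pair_sum, lo, hi):
    """Max-sum segment with lo <= start <= end <= hi; returns (sum, start, end)."""
    i = max(range(lo, hi + 1), key=pair_sum.__getitem__)    # range-maximum lookup
    if partner[i] >= lo - 1:
        return pair_sum[i], partner[i] + 1, i                # Case 1: no exceeding
    # Case 2: recompute the partner of i inside the querying interval.
    k = min(range(lo - 1, i), key=P.__getitem__)             # range-minimum lookup
    best = (P[i] - P[k], k + 1, i)
    if i < hi:
        j = max(range(i + 1, hi + 1), key=pair_sum.__getitem__)
        if pair_sum[j] > best[0]:
            best = (pair_sum[j], partner[j] + 1, j)
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P, partner, pair_sum = preprocess(a)
print(single_range_query(P, partner, pair_sum, 4, 12))
```

In the example call, the best recorded pair ends at index 9 but its partner lies before the interval, so the updated pair (sum 7) is compared with the best pair to its right (sum 8).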
Time Complexity
RMSQ can be reduced to the RMQ problem in O(n) time.
Since under the unit-cost RAM model there is a <O(n), O(1)>-time solution for the RMQ problem, there is a <O(n), O(1)>-time solution for the RMSQ problem.
On the other hand, RMQ can be reduced to the RMSQ problem in O(n) time, too. (Range Maxima Query: between each two adjacent elements, we insert a negative number whose absolute value is larger than theirs.)
[Diagram: RMSQ and RMQ reduce to each other; labels O(n) and O(1).]
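The reverse reduction can be illustrated as follows: with a sufficiently negative separator between every two adjacent elements, any segment spanning more than one original element sums to less than its largest element, so the maximum-sum segment inside a range is a single element, namely the range maximum. The brute-force routine below merely stands in for an RMSQ solver, and all names are hypothetical.

```python
def max_sum_segment_in_range(a, lo, hi):
    """Brute-force stand-in for an RMSQ query with S = E = [lo, hi] (0-based)."""
    best = None
    for i in range(lo, hi + 1):
        running = 0
        for j in range(i, hi + 1):
            running += a[j]
            if best is None or running > best[0]:
                best = (running, i, j)
    return best


def range_max_via_rmsq(values, lo, hi):
    """Answer a range-maxima query on values[lo..hi] through an RMSQ query."""
    # Separator whose absolute value exceeds that of every element.
    separator = -(max(abs(v) for v in values) + 1)
    augmented = []
    for v in values:
        augmented += [v, separator]
    augmented.pop()                       # no separator after the last element
    _, i, j = max_sum_segment_in_range(augmented, 2 * lo, 2 * hi)
    assert i == j and i % 2 == 0          # the best segment is one original element
    return i // 2, values[i // 2]


values = [5, -5.1, 1, 3, -4, 2, 3, -4, 7]
print(range_max_via_rmsq(values, 2, 6))
```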
Use RMSQ Techniques to Solve Two Relevant Problems
1. Finding the Maximum-Sum Segment with length constraints in O(n) time.
- Y.-L. Lin, T. Jiang, K.-M. Chao, 2002
- T.-H. Fan et al., 2003
2. Finding all maximal scoring subsequences in O(n) time.
- W. L. Ruzzo & M. Tompa, 1999
Problem 1: The Maximum-Sum Segment with Length Constraints
Lin, Jiang, and Chao [JCSS 2002] and Fan et al. [CIAA 2003] gave O(n)-time algorithms for this problem.
Length at least L, and at most U
[Figure: a segment whose length lies between L and U.]
Problem 1: Finding the Maximum-Sum Segment with Length Constraints
Length at least L, at most U
For each index i, find the maximum-sum segment whose starting point lies in [i-U+1, i-L+1] and whose end point is i.
[Figure: an RMSQ query for index i over the window of starting points determined by L and U.]
Runs in O(n) time since each query costs O(1) time
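The per-index queries can be sketched as below. The window minimum is found by a direct scan here, standing in for the O(1) RMSQ query, so this version does not achieve the O(n) bound. The function name `max_sum_with_length_bounds`, the 1-based indices, and the clipping of the window at position 1 are assumptions.

```python
from itertools import accumulate


def max_sum_with_length_bounds(a, L, U):
    """Max-sum segment of length at least L and at most U (requires 1 <= L <= U).

    Returns (sum, start, end) with 1-based inclusive indices.
    """
    P = [0] + list(accumulate(a))
    best = None
    for i in range(L, len(a) + 1):
        lo = max(1, i - U + 1)            # earliest admissible start
        hi = i - L + 1                    # latest admissible start
        # Query: best start in [lo, hi] for the fixed end i.
        k = min(range(lo - 1, hi), key=P.__getitem__)
        candidate = (P[i] - P[k], k + 1, i)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(max_sum_with_length_bounds(a, 2, 3))
```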
Problem 2: All Maximal-Sum Segments
Ruzzo and Tompa [ISMB 1999] gave an O(n)-time algorithm for this problem.
Recursive definition.
[Figure: a segment S with the part to its left, L(S), and the part to its right, R(S).]
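One reading of the recursive definition is: take a maximum-sum segment S of the sequence, then recurse on the part to its left, L(S), and the part to its right, R(S), stopping when no positive-sum segment remains. The sketch below follows that reading naively; it is neither the linear-time algorithm of Ruzzo and Tompa nor the RMSQ-based one. The tie-breaking (earliest best segment, with no non-positive prefix) and the function names are assumptions.

```python
def best_segment(a, lo, hi):
    """Maximum-sum segment of a[lo..hi] (0-based, inclusive) by a linear scan."""
    best = (a[lo], lo, lo)
    cur_sum, cur_start = a[lo], lo
    for idx in range(lo + 1, hi + 1):
        if cur_sum <= 0:
            # Dropping a non-positive prefix never lowers the sum.
            cur_sum, cur_start = a[idx], idx
        else:
            cur_sum += a[idx]
        if cur_sum > best[0]:
            best = (cur_sum, cur_start, idx)
    return best


def maximal_segments(a, lo=0, hi=None):
    """Return a list of (start, end, sum) triples, ordered from left to right."""
    if hi is None:
        hi = len(a) - 1
    if lo > hi:
        return []
    total, start, end = best_segment(a, lo, hi)
    if total <= 0:
        return []
    return (maximal_segments(a, lo, start - 1)
            + [(start, end, total)]
            + maximal_segments(a, end + 1, hi))


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(maximal_segments(a))
```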