Post on 25-Mar-2020
Approximating Extent Measures of Points
Pankaj K. Agarwal, Sariel Har-Peled, Kasturi R. Varadarajan
CS468, Winter 2006
Extent Measures
• Given a point set P ⊆ R^d
• Extent measures – statistics about P or its enclosing shape
• Some examples
- k-th largest distance
- Min. volume bounding box
- Min. width bounding slab
- Min. enclosing sphere
- Min. enclosing cylinder
- Min. width enclosing spherical shell
- Min. width enclosing cylindrical shell
Main Result
• Technique for ε-approximating (a large class of) extent measures
• Compute a subset of the input Q ⊆ P (coreset) which
- Preserves the solution to ε-accuracy
- Has small size (does not depend on |P|, only on ε)
• General properties
- Strong LTAS, running time O(n + (1/ε)^{O(1)})
- Coreset size O((1/ε)^{O(1)})
- Exponents depend on d
• Simple to implement, and some improvements
- Min. enclosing spherical shell: exponent down from O(d^2) [Chan 02] to O(d)
Key Definitions
• Lead to two main approximation primitives
• Directional width of a point set P in the direction u ∈ R^{d−1}
w(u, P) = max_{p∈P} ⟨[u, 1], p⟩ − min_{p∈P} ⟨[u, 1], p⟩
• Extent of a set F of (d − 1)-variate functions at x ∈ R^{d−1}
e(x, F) = max_{f∈F} f(x) − min_{f∈F} f(x)
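The two definitions above can be sketched directly in code. This is a minimal illustration, not part of the original slides; the function names `directional_width` and `extent` and the sample data are assumptions chosen for the example.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def directional_width(u, points):
    """w(u, P) = max_p <[u, 1], p> - min_p <[u, 1], p>, with u in R^(d-1)."""
    direction = list(u) + [1.0]  # lift u to the direction [u, 1] in R^d
    values = [dot(direction, p) for p in points]
    return max(values) - min(values)


def extent(x, functions):
    """e(x, F) = max_f f(x) - min_f f(x), with x in R^(d-1)."""
    values = [f(x) for f in functions]
    return max(values) - min(values)


# Corners of the unit square in R^2 (d = 2), so directions u live in R^1.
P = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(directional_width([0.0], P))  # width along the direction (0, 1)
print(directional_width([1.0], P))  # width along the (unnormalized) direction (1, 1)

# Three univariate linear functions (hypothetical example data).
F = [lambda x: 2 * x[0] + 1, lambda x: -x[0], lambda x: 0.5 * x[0] + 3]
print(extent([1.0], F))
```

Note that the direction [u, 1] is not normalized, so w(u, P) scales with the length of [u, 1].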
Optimization Primitives
• Q ⊆ P is an ε-approximation for P on ∆ ⊆ R^{d−1} if for all u ∈ ∆
(1 − ε) w(u, P) ≤ w(u, Q) ≤ w(u, P)
• G ⊆ F is an ε-approximation for F on ∆ ⊆ R^{d−1} if for all x ∈ ∆
(1 − ε) e(x, F) ≤ e(x, G) ≤ e(x, F)
• Note: Always pick a subset of the input – coreset
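As a rough illustration of the definition, the sketch below tests the ε-approximation inequality on a finite sample of directions. It only checks the sampled directions, so it cannot certify the property over all of ∆; the helper name `is_eps_approximation` and the test data are assumptions made for this example.

```python
import math


def width(u, points):
    """Directional width w(u, P) along the lifted direction [u, 1]."""
    direction = list(u) + [1.0]
    vals = [sum(a * b for a, b in zip(direction, p)) for p in points]
    return max(vals) - min(vals)


def is_eps_approximation(Q, P, directions, eps, tol=1e-12):
    """Check (1 - eps) * w(u, P) <= w(u, Q) <= w(u, P) for each sampled u."""
    for u in directions:
        wp, wq = width(u, P), width(u, Q)
        if (1 - eps) * wp > wq + tol or wq > wp + tol:
            return False
    return True


# P: 200 points on the unit circle; Q: every 10th point of P (a subset).
P = [(math.cos(2 * math.pi * i / 200), math.sin(2 * math.pi * i / 200))
     for i in range(200)]
Q = P[::10]
directions = [[j / 10] for j in range(-20, 21)]  # u in [-2, 2]

print(is_eps_approximation(Q, P, directions, 0.02))
print(is_eps_approximation(Q, P, directions, 0.001))
```

The 20 retained points form a regular 20-gon, whose width is at least cos(π/20) ≈ 0.988 times that of the circle, so the check passes for ε = 0.02 but fails for ε = 0.001.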
Classes of Extent Measures
• Faithful
- Approximated through directional width
- “Convex” measures (bounding shapes)
• Other
- Approximated through extent
- “Concave” measures (“shells”)
Overview
(A) Strong LTASs for directional width
- Reduction to “fat” point sets
- Algorithm 1: Grid
- Algorithm 2: Polytope
- Algorithm 3: Decomposition
Strong LTASs for extent
- Linear functions (hyperplanes)
- Polynomial functions
- r-th roots of polynomials
(B) Dynamic updates
Applications to specific extent measures
Reduction to “Fat” Point Sets
• A point set P ⊆ R^d is α-fat if there exists a translation t such that
αC ⊆ CH(P) + t ⊆ C = [−1, 1]^d
• Sufficient to consider computing coresets for α-fat point sets
• Step 1: Every point set can be made α-fat by applying a linear transformation, where α = α(d)
• Step 2: Every linear transformation preserves the approximation ratio of an arbitrary coreset
- The size and construction time are clearly preserved
Step 1: There Exists a “Fattening Transform”
• [Barequet, Har-Peled 01]: Let P ⊆ R^d be of size n. Can compute in O(n) time a box B and a vector t ∈ R^d such that
αB ⊆ CH(P + t) ⊆ B
[Figure: CH(P) nested between the boxes αB and B]
• Recall: α = 1/10? for d = 3
• Choose T so that T(B) = C
Step 2: Invariance Under Linear Transforms
• Lemma: Let T(x) = Mx + b be a non-degenerate linear transform.
Q ⊆ P ε-approximates P over ∆ ⊆ R^{d−1}
if and only if
T(Q) ε-approximates T(P) over
{v | [v, 1] = M^T [u, 1], u ∈ ∆}
• Proof: Easy by definition
- Note: can assume T(x) = Mx
- Also, ⟨[u, 1], Mp⟩ = ⟨M^T [u, 1], p⟩
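The identity used in the proof can be checked numerically. The sketch below is an illustration under assumed data (the matrix `M`, the points, and the helper names are all hypothetical): the width of T(P) along a direction x equals the width of P along M^T x, so the ratio w(Q)/w(P) is unchanged by the transform.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(M, v):
    """Matrix-vector product M v for a list-of-rows matrix."""
    return [dot(row, v) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def width_along(x, points):
    """Width of the point set along an arbitrary direction x in R^d."""
    vals = [dot(x, p) for p in points]
    return max(vals) - min(vals)


M = [[2.0, 1.0], [0.0, 3.0]]          # non-degenerate (det = 6)
P = [(0.0, 0.0), (1.0, 0.5), (-1.0, 2.0), (0.5, -1.5)]
Q = [P[1], P[2]]                      # an arbitrary subset of P
x = [0.5, 1.0]                        # the direction [u, 1] with u = 0.5

TP = [matvec(M, p) for p in P]
TQ = [matvec(M, q) for q in Q]
pulled_back = matvec(transpose(M), x)  # M^T [u, 1]

# <x, M p> = <M^T x, p>, so the widths agree set by set.
print(width_along(x, TP), width_along(pulled_back, P))
print(width_along(x, TQ), width_along(pulled_back, Q))
```

Note that M^T [u, 1] generally does not have last coordinate 1, which is why the lemma has to re-express the set of directions.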
The Case of α-fat Point Sets
• From now on assume αC ⊆ CH(P) ⊆ C, where α depends only on d, not on n
• Lemma [Rough approximation]: w(x, P) ≥ 2α||x||
- For all x ∈ R^d: max_{p∈P} ⟨x, p⟩ = ||x|| · max_{p∈P} ⟨x/||x||, p⟩ ≥ ||x|| α (the inner maximum is a projection onto a unit vector)
- For x_d = 1: max_{p∈P} ⟨x, p⟩ ≥ α||x|| and min_{p∈P} ⟨x, p⟩ ≤ −α||x||
• Lemma [Hausdorff dist.]: If max_{p∈P} min_{q∈Q} ||p − q|| ≤ εα then Q is an ε-approximation for P
- Let p1, p2 ∈ P be extremal in direction x and let q1, q2 ∈ Q be their nearest neighbors in Q
w(x, P) − w(x, Q) ≤ ⟨x, p1 − p2⟩ − ⟨x, q1 − q2⟩
≤ |⟨x, p1 − p2⟩| − |⟨x, q1 − q2⟩|
≤ |⟨x, (p1 − q1) − (p2 − q2)⟩| ≤ ||x|| · 2αε
- By the rough approximation, ||x|| · 2αε ≤ ε w(x, P)
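The Hausdorff-distance argument can be tried out numerically. The sketch below is an illustration on assumed random data, not part of the slides: it computes the directed Hausdorff distance h from P to a subset Q by brute force and checks that the width loss in sampled unit directions stays below 2h, which is the bound the proof chain gives before fatness is applied.

```python
import math
import random


def directed_hausdorff(P, Q):
    """max over p in P of the distance from p to its nearest point of Q."""
    return max(min(math.dist(p, q) for q in Q) for p in P)


def width(x, points):
    """Width of a planar point set along the direction x."""
    vals = [x[0] * p[0] + x[1] * p[1] for p in points]
    return max(vals) - min(vals)


def max_width_loss(P, Q, num_directions=64):
    """Largest w(x, P) - w(x, Q) over evenly spaced unit directions x."""
    worst = 0.0
    for k in range(num_directions):
        theta = math.pi * k / num_directions
        x = (math.cos(theta), math.sin(theta))
        worst = max(worst, width(x, P) - width(x, Q))
    return worst


random.seed(0)
P = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(300)]
Q = P[::7]  # an arbitrary subset, chosen only for illustration

h = directed_hausdorff(P, Q)
loss = max_width_loss(P, Q)
print(h, loss, loss <= 2 * h)
```

If h ≤ εα and P is α-fat, the same bound turns into the relative guarantee w(x, P) − w(x, Q) ≤ ε w(x, P).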
Algorithm 1: Grid
• Grid of cell size εα/(2√d)
• Clear all “internal” cells
- Convex hull moves by at most one cell diameter εα/2
• Eliminate duplicates in the “boundary” cells
- Another shift of at most εα/2
[Figure: the point set overlaid with the grid]
• Directional width changes by at most εα
- Determined by convex hull
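A minimal sketch of a grid-based coreset follows. It is a simplified variant chosen for this example, not the slide's exact procedure: instead of explicitly clearing internal cells, it groups points into grid columns along the last axis and keeps the lowest and highest point of each column. The function names and the test data are assumptions, and the code does not itself verify the ε guarantee.

```python
import math
import random


def grid_coreset(points, eps, alpha):
    """Keep the lowest and highest point (in x_d) of every grid column.

    Points must be tuples. The cell side is eps * alpha / (2 * sqrt(d)).
    """
    d = len(points[0])
    side = eps * alpha / (2 * math.sqrt(d))
    columns = {}  # column index (first d-1 coords) -> (lowest, highest)
    for p in points:
        key = tuple(math.floor(c / side) for c in p[:-1])
        lo, hi = columns.get(key, (p, p))
        if p[-1] < lo[-1]:
            lo = p
        if p[-1] > hi[-1]:
            hi = p
        columns[key] = (lo, hi)
    kept = set()
    for lo, hi in columns.values():
        kept.add(lo)
        kept.add(hi)
    return sorted(kept)


def axis_width(points, axis):
    """Extent of the point set along one coordinate axis."""
    vals = [p[axis] for p in points]
    return max(vals) - min(vals)


random.seed(1)
P = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(2000)]
eps, alpha = 0.2, 0.5
side = eps * alpha / (2 * math.sqrt(2))
Q = grid_coreset(P, eps, alpha)
print(len(P), len(Q))
print(axis_width(P, 0) - axis_width(Q, 0), axis_width(P, 1) - axis_width(Q, 1))
```

The output size depends only on the number of columns, roughly (1/(αε))^{d−1}, and not on n, which matches the coreset size stated in the analysis.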
Algorithm 1: Analysis
• Coreset size O(1/(αε)^{d−1})
- One point per “boundary” grid cell
- Cell size O(αε), i.e. O(1/(αε)) cells along each axis
• Running time O(n + 1/(αε)^{d−1})
• O-notation hides √d
• α has a bad dependence on d
Algorithm 2: Polytope
• Run Algorithm 1, return (ε/2)-approximation P1
• Apply [Dudley 1974] to P1, lose another ε/2
- Sample the sphere of radius √d + 1
- Closest point routine [Gartner 1995], runs in linear time, returns all (at most d) closest points
• Return the set of closest points P2
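The sampling step can be sketched in the plane (d = 2). This is an illustration under several assumptions: samples are spaced at most √(αε/2) apart along the circle (the spacing suggested by the sample count in the analysis), a brute-force nearest-neighbor search stands in for the closest point routine of [Gartner 1995], and only one nearest point is kept per sample. The function name `dudley_subset` is hypothetical.

```python
import math
import random


def dudley_subset(P1, eps, alpha):
    """For each sample on the circle of radius sqrt(d) + 1, keep the nearest input point."""
    d = 2
    radius = math.sqrt(d) + 1
    delta = math.sqrt(eps * alpha / 2)            # assumed sample spacing
    m = math.ceil(2 * math.pi * radius / delta)   # arc length between samples <= delta
    chosen = set()
    for i in range(m):
        angle = 2 * math.pi * i / m
        s = (radius * math.cos(angle), radius * math.sin(angle))
        # Brute-force nearest neighbor; a stand-in for a faster routine.
        chosen.add(min(P1, key=lambda p: math.dist(p, s)))
    return sorted(chosen), m


random.seed(3)
P1 = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(400)]
P2, num_samples = dudley_subset(P1, eps=0.2, alpha=0.5)
print(num_samples, len(P2))
```

Since each sample contributes at most one point here, the result has at most as many points as there are samples, regardless of the size of P1.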
Algorithm 2: Analysis
• Correctness
- Fact 1: Dudley polytope is also valid for P2
- Fact 2: CH(P2) ⊆ CH(P1) ⊆ Dudley
• Want (αε/2)-approximation of the convex hull
- Yields ε/2 for directional width
• Required number of samples
O(1/(√(αε/2))^{d−1}) = O(1/(αε)^{(d−1)/2})
• This is also (roughly) the size of the result P2
- At most d per sample
- Compare to Algorithm 1: O(1/(αε)^{d−1})
Algorithm 2: Analysis
• Find closest points for O(1/(αε)^{(d−1)/2}) samples
• The closest point routine [Gartner 1995]
- Linear in the number of points
- In this case, |P1| = O(1/(αε)^{d−1})
- Total, O(1/(αε)^{3(d−1)/2})
• Including the call to Algorithm 1, we get O(n + 1/(αε)^{3(d−1)/2})
- Compare to Algorithm 1: O(n + 1/(αε)^{d−1})
Algorithm 3: Arrangement
• Run Algorithm 2, get an (ε/2)-approximation P2
• Decompose the set of directions R^{d−1} using an arrangement of (d − 2)-flats
u, v ∈ R^{d−1} in the same cell ⇒ || u/||u|| − v/||v|| || ≤ αε/(4√d)
- Identify diametrically opposite cells (induced partition of P^{d−1}, the set of unoriented directions)
• Return P3, consisting of two points per cell
- Extremal points in one arbitrary direction within the cell
- Lose another ε/2
⟨angle error, extreme point error⟩ ≤ || u/||u|| − v/||v|| || · ||p − q|| ≤ αε/(4√d) · 2√d = αε/2
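For d = 2 the decomposition of directions reduces to cutting the half-circle of unoriented directions into angular cells. The sketch below is a simplified illustration under that assumption (equal-angle cells instead of a grid on the facets of C; all names are hypothetical): each cell stores the two extremal points for its central direction, and a width query just looks up the cell.

```python
import math
import random


def build_direction_table(points, cell_angle):
    """Store (argmax, argmin) of <v, p> for the central direction v of each cell."""
    m = math.ceil(math.pi / cell_angle)
    step = math.pi / m
    table = []
    for i in range(m):
        theta = (i + 0.5) * step
        v = (math.cos(theta), math.sin(theta))
        score = lambda p: v[0] * p[0] + v[1] * p[1]
        table.append((max(points, key=score), min(points, key=score)))
    return table, step


def approx_width(table, step, theta):
    """Width estimate along angle theta using the two stored points of its cell."""
    i = min(int((theta % math.pi) / step), len(table) - 1)
    p, q = table[i]
    x = (math.cos(theta), math.sin(theta))
    return abs(x[0] * (p[0] - q[0]) + x[1] * (p[1] - q[1]))


def true_width(points, theta):
    x = (math.cos(theta), math.sin(theta))
    vals = [x[0] * p[0] + x[1] * p[1] for p in points]
    return max(vals) - min(vals)


random.seed(2)
P = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(500)]
eps, alpha = 0.2, 0.5
table, step = build_direction_table(P, alpha * eps / (4 * math.sqrt(2)))
print(len(table), true_width(P, 1.0) - approx_width(table, step, 1.0))
```

Because the stored points belong to P, the estimate never exceeds the true width, and the loss is bounded by the angular cell size times the diameter of the point set.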
Algorithm 3: Computing the Arrangement
• (d − 1)-dimensional grid within each of the d facets of C
- Cell size αε/(4√d)
- Results in a set of (d − 2)-flats
• Affine hull with the origin, intersect with P^{d−1}
• Complexity of the arrangement
- Number of hyperplanes d(d − 1) · 4√d/(αε)
- Complexity O(1/(αε)^{d−1})
Algorithm 3: Analysis
• Coreset size |P3| = O(1/(αε)^{d−1})
- Same as the arrangement complexity
• Extremal points are computed by linear search through P2
- |P2| = O(1/(αε)^{(d−1)/2}) time per cell
- O(1/(αε)^{d−1}) cells
- O(1/(αε)^{3(d−1)/2}) total
- Combined with Algorithm 2, O(n + 1/(αε)^{3(d−1)/2})
• Note |P3| > |P2|, a point may be extremal in multiple cells
Approximating (d − 1)-variate Functions
• Recall: ε-approximation in terms of extent
1. Linear functions (hyperplanes), using duality
2. General polynomials, using linearization
3. r-th roots of polynomials, r ∈ Z
Point-Hyperplane Duality
• The hyperplane
h : x_d = a_1 x_1 + a_2 x_2 + · · · + a_{d−1} x_{d−1} + a_d
corresponds to the point
h* = (a_1, a_2, . . . , a_d) ∈ R^d
• Lemma: A set of hyperplanes H = {h_1, h_2, . . . , h_n} is ε-approximated by K ⊂ H over ∆ ⊆ R^{d−1} iff H* is ε-approximated by K* over ∆
• Proof: Directional width w(u, H*) is the same as the extent e(u, H)
- Holds for any u ∈ R^{d−1} and any set of hyperplanes H in R^d
- Important to define w(·, ·) using the x_d = 1 plane
• Immediately implies ε-approximation algorithms for linear functions in (d − 1) variables
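The duality claim can be checked on a small example. The sketch below is illustrative only; each hyperplane is stored as its coefficient tuple (a_1, ..., a_d), which is also its dual point, and the sample planes are assumed data.

```python
def hyperplane_value(h, x):
    """Evaluate h: x_d = a_1 x_1 + ... + a_{d-1} x_{d-1} + a_d at x in R^(d-1)."""
    *coeffs, offset = h
    return sum(a * xi for a, xi in zip(coeffs, x)) + offset


def extent(x, hyperplanes):
    """e(x, H): vertical gap between the upper and lower envelope at x."""
    vals = [hyperplane_value(h, x) for h in hyperplanes]
    return max(vals) - min(vals)


def directional_width(u, points):
    """w(u, P) along the lifted direction [u, 1]."""
    direction = list(u) + [1.0]
    vals = [sum(a * b for a, b in zip(direction, p)) for p in points]
    return max(vals) - min(vals)


# Three planes in R^3; the same tuples double as the dual points h*.
H = [(1.0, 2.0, 0.5), (-1.0, 0.0, 3.0), (0.25, -2.0, -1.0)]
u = [2.0, -1.0]

print(extent(u, H), directional_width(u, H))
```

Both calls compute max − min of the same inner products ⟨[u, 1], h*⟩, which is why the width has to be defined using the x_d = 1 plane.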
Linearization
• A set of (d − 1)-variate polynomials F = {f_1, f_2, . . . , f_n}
• Linearization of dimension k
f_i(x) = a_0^{(i)} + a_1^{(i)} φ_1(x) + · · · + a_k^{(i)} φ_k(x), i = 1, 2, . . . , n
• Reduces to a set of k-variate linear functions
f_i(y) = a_0^{(i)} + a_1^{(i)} y_1 + · · · + a_k^{(i)} y_k, i = 1, 2, . . . , n
with y = φ(x), φ : R^{d−1} → R^k
• An ε-approximation for {f_i(y)} over ∆ ⊆ R^k implies an ε-approximation for {f_i(x)} over φ^{−1}(∆ ∩ φ(R^{d−1}))
Linearization Example
• f_i(x) measures the distance of the point (x_1, x_2) ∈ R^2 to a circle in R^2 with center (p^{(i)}, q^{(i)}) and radius r^{(i)}
f_i(x) = (r^{(i)})^2 − (x_1 − p^{(i)})^2 − (x_2 − q^{(i)})^2
• Can be written as
f_i(x) = [(r^{(i)})^2 − (p^{(i)})^2 − (q^{(i)})^2] + (2p^{(i)}) x_1 + (2q^{(i)}) x_2 − [x_1^2 + x_2^2]
• A linearization of dimension k = 3 with
a^{(i)} = [(r^{(i)})^2 − (p^{(i)})^2 − (q^{(i)})^2, 2p^{(i)}, 2q^{(i)}, −1]
φ(x) = [x_1, x_2, x_1^2 + x_2^2]
• [Agarwal, Matousek 1994] Computing linearization of minimum dimension
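The circle example can be verified by expanding the squares, which gives the constant term r^2 − p^2 − q^2. The sketch below is an illustration with assumed helper names and sample values; it evaluates the polynomial directly and through the linearization y = φ(x) and compares the two.

```python
def f(circle, x):
    """Original polynomial: r^2 - (x1 - p)^2 - (x2 - q)^2."""
    p, q, r = circle
    return r ** 2 - (x[0] - p) ** 2 - (x[1] - q) ** 2


def coefficients(circle):
    """Coefficient vector a = [a_0, a_1, a_2, a_3] of the linearization."""
    p, q, r = circle
    return [r ** 2 - p ** 2 - q ** 2, 2 * p, 2 * q, -1.0]


def phi(x):
    """Linearizing map from R^2 to R^3."""
    return [x[0], x[1], x[0] ** 2 + x[1] ** 2]


def linearized(circle, x):
    """Evaluate a_0 + a_1 y_1 + a_2 y_2 + a_3 y_3 with y = phi(x)."""
    a = coefficients(circle)
    y = phi(x)
    return a[0] + sum(ai * yi for ai, yi in zip(a[1:], y))


circle = (1.0, 2.0, 3.0)   # center (p, q) = (1, 2), radius r = 3
x = (0.5, -1.0)
print(f(circle, x), linearized(circle, x))
```

After the substitution, each f_i is a linear function of y ∈ R^3, so the algorithms for hyperplanes apply with k = 3.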
Polynomials: Algorithms
• If a set F of n (d − 1)-variate polynomials admits a linearization of dimension k, we can compute
- Algorithm 1: an ε-approximation of size O(1/ε^k), in time O(n + 1/ε^k)
- Algorithm 2: an ε-approximation of size O(1/ε^{k/2}), in time O(n + 1/ε^{3k/2})
- Algorithm 3: a set of O(1/ε) (d − 2)-dimensional surfaces in R^{d−1}, in time O(n + 1/ε^{3k/2}), such that within each cell of their arrangement F is ε-approximated by 2 of its elements
- Complexity of the arrangement O(1/ε^{d−1}) [Agarwal, Sharir 00]; cells are of complexity O(1)
- Algorithms 2 and 3: an ε-approximation of size O(1/ε^{min{k/2, d−1}}), in time O(n + 1/ε^{3k/2})
Roots of Polynomials
• Want ε-approximation for F = {(f_1)^{1/r}, (f_2)^{1/r}, . . . , (f_n)^{1/r}}, where r is an integer
• Cannot linearize directly
• Special cases can be handled [Chan 02]
• If a, b, A, B ≥ 0 and [A, B] ⊆ [a, b] then
B − A ≥ (1 − δ)(b − a) ⇒ B^{1/r} − A^{1/r} ≥ (1 − ε)(b^{1/r} − a^{1/r})
for δ = (ε/(2(r − 1)))^r
• It suffices to compute an O(ε^r)-approximation to {f_1, f_2, . . . , f_n}
Roots of Polynomials: Algorithms
• A set F, |F| = n, contains r-th roots of (d − 1)-variate non-negative polynomials that admit a linearization of dimension k
• Algorithm 1: ε-approximation in time O(1/ε^{kr}) and size O(1/ε^{kr})
• Algorithms 2 and 3: ε-approximation in time O(1/ε^{3kr/2}) and size O(1/ε^{r min{d−1, k/2}})
• Algorithm 3: A set of O(1/ε^r) (d − 2)-dimensional surfaces in R^{d−1}, in time O(1/ε^{3kr/2}), such that within each cell of their arrangement, F is ε-approximated by two of its elements
Classes of Extent Measures
• Faithful
- Approximated through directional width
- “Convex” measures (bounding shapes)
- Examples: diameter, width, radius of the smallest enclosing ball, volume of the minimum bounding box, volume enclosed by the convex hull, surface area of the convex hull
• Other
- Approximated through extent
- “Concave” measures (“shells”)
- Examples: minimum width spherical shell (annulus), minimum width cylindrical shell
Dynamic Updates
• Based on a balanced tree data structure
- Both insertions and deletions
- Recompute approximations along the unique path to the root
- Update time per operation O((log^{k+1} n/ε)^k + f(log n/ε) log n)
- Amortized O((log^k n/ε)^k + f(log n/ε))
• Based on the ranked subsets data structure
- Insertions only
- Partition the points into ranked subsets, based on the binary encoding of n
- Data structure size O(log^{2k+1} n/ε^k)
- Coreset size O(log^{2k+1} n/ε^k), amortized insertion time O((1/ε)^k + f(ε))
- Coreset size O(1/ε^k), amortized insertion time O(log^{2k+1} n/ε^k + f(ε))
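The insertion-only scheme can be sketched as a binary counter of summaries. The code below is a simplified illustration, not the data structure from the slides: the class name `RankedCoreset` is hypothetical, the points are one-dimensional, and the placeholder reduction `reduce_1d` keeps only the minimum and maximum (which preserves the width of points on a line exactly). A real implementation would plug in one of the coreset algorithms and adjust the error parameter per rank.

```python
def reduce_1d(values):
    """Placeholder coreset for points on a line: min and max preserve the width."""
    if len(values) <= 1:
        return list(values)
    return [min(values), max(values)]


class RankedCoreset:
    """Insertion-only summary; the bucket of rank i summarizes 2^i inserted points."""

    def __init__(self, reduce=reduce_1d):
        self.buckets = []      # buckets[i] is a summary list or None
        self.reduce = reduce

    def insert(self, p):
        carry = [p]
        i = 0
        # Merge equal-rank summaries upward, like adding 1 to a binary counter.
        while i < len(self.buckets) and self.buckets[i] is not None:
            carry = self.reduce(self.buckets[i] + carry)
            self.buckets[i] = None
            i += 1
        if i == len(self.buckets):
            self.buckets.append(None)
        self.buckets[i] = carry

    def occupied_ranks(self):
        return [i for i, b in enumerate(self.buckets) if b is not None]

    def coreset(self):
        merged = [p for b in self.buckets if b is not None for p in b]
        return self.reduce(merged)


values = [5, -3, 8, 0, 2, 11, -7, 4, 4, 9, 1, 6, -2]
summary = RankedCoreset()
for v in values:
    summary.insert(v)
print(summary.occupied_ranks(), summary.coreset())
```

After n insertions the occupied ranks are the positions of the 1 bits in the binary encoding of n, so at most about log n summaries are stored at any time.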