Approximating Extent Measures of Points

Pankaj K. Agarwal, Sariel Har-Peled, Kasturi R. Varadarajan

CS468, Winter 2006

Extent Measures

• Given a point set $P \subseteq \mathbb{R}^d$

• Extent measures – statistics about P or its enclosing shape

• Some examples

- k-th largest distance

- Min. volume bounding box

- Min. width bounding slab

- Min. enclosing sphere

- Min. enclosing cylinder

- Min. width enclosing spherical shell

- Min. width enclosing cylindrical shell

Main Result

• Technique for ε-approximating (a large class of) extent measures

• Compute a subset of the input Q ⊆ P (coreset) which

- Preserves the solution to ε-accuracy

- Has small size (does not depend on |P|, only on ε)

• General properties

- Strong LTAS, running time $O(n + (1/\varepsilon)^{O(1)})$

- Coreset size $O((1/\varepsilon)^{O(1)})$

- Exponents depend on d

• Simple to implement, and some improvements

- Min. enclosing spherical shell: exponent down from $O(d^2)$ [Chan 02] to $O(d)$

Key Definitions

• Lead to two main approximation primitives

• Directional width of a point set P in the direction $u \in \mathbb{R}^{d-1}$

$w(u, P) = \max_{p \in P} \langle [u, 1], p \rangle - \min_{p \in P} \langle [u, 1], p \rangle$

• Extent of a set F of (d − 1)-variate functions at $x \in \mathbb{R}^{d-1}$

$e(x, F) = \max_{f \in F} f(x) - \min_{f \in F} f(x)$
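A minimal Python sketch of the two definitions above. It assumes points are tuples in R^d and functions are plain callables; the names `directional_width` and `extent` are illustrative and do not come from the slides.

```python
from typing import Callable, Sequence

Point = Sequence[float]


def directional_width(u: Point, points: Sequence[Point]) -> float:
    """w(u, P): spread of <[u, 1], p> over p in P, with u in R^(d-1), p in R^d."""
    lifted = list(u) + [1.0]  # lift u onto the x_d = 1 plane
    dots = [sum(a * b for a, b in zip(lifted, p)) for p in points]
    return max(dots) - min(dots)


def extent(x: Point, functions: Sequence[Callable[[Point], float]]) -> float:
    """e(x, F): gap between the upper and lower envelope of F at x."""
    values = [f(x) for f in functions]
    return max(values) - min(values)


# Three points in the plane, direction [u, 1] = [1, 1].
P = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)]
print(directional_width((1.0,), P))

# Three univariate functions evaluated at x = 1.
F = [lambda x: 2 * x[0] + 1, lambda x: -x[0], lambda x: 0.5]
print(extent((1.0,), F))
```

Both quantities are a maximum minus a minimum over a finite set, which is why a small subset that keeps the extremes can stand in for the whole input.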

Optimization Primitives

• Q ⊆ P is an ε-approximation for P on $\Delta \subseteq \mathbb{R}^{d-1}$ if for all u ∈ ∆

$(1 - \varepsilon)\, w(u, P) \le w(u, Q) \le w(u, P)$

• G ⊆ F is an ε-approximation for F on $\Delta \subseteq \mathbb{R}^{d-1}$ if for all x ∈ ∆

$(1 - \varepsilon)\, e(x, F) \le e(x, G) \le e(x, F)$

• Note: Always pick a subset of the input – coreset
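The ε-approximation condition for directional width can be spot-checked numerically. The sketch below tests it only on a finite sample of directions, so it can refute a candidate coreset but cannot prove it correct over all of ∆. The helper names and the sampling range are assumptions made for illustration.

```python
import random
from typing import Sequence

Point = Sequence[float]


def width(u: Point, points: Sequence[Point]) -> float:
    """Directional width w(u, P), with u lifted to [u, 1]."""
    lifted = list(u) + [1.0]
    dots = [sum(a * b for a, b in zip(lifted, p)) for p in points]
    return max(dots) - min(dots)


def is_eps_approximation(Q, P, directions, eps: float, tol: float = 1e-12) -> bool:
    """Check (1 - eps) w(u, P) <= w(u, Q) <= w(u, P) on the sampled directions only."""
    for u in directions:
        w_p, w_q = width(u, P), width(u, Q)
        if not ((1.0 - eps) * w_p <= w_q + tol and w_q <= w_p + tol):
            return False
    return True


rng = random.Random(0)
corners = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
P = corners + [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
directions = [(rng.uniform(-5, 5),) for _ in range(100)]

# The four corners span the convex hull, so they reproduce every width.
print(is_eps_approximation(corners, P, directions, eps=0.0))
# A single point has width zero and fails for any eps < 1.
print(is_eps_approximation(corners[:1], P, directions, eps=0.5))
```

The upper bound w(u, Q) ≤ w(u, P) holds automatically whenever Q is a subset of P, so only the lower bound carries information.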

Classes of Extent Measures

• Faithful

- Approximated through directional width

- “Convex” measures (bounding shapes)

• Other

- Approximated through extent

- “Concave” measures (“shells”)

Overview

(A) Strong LTASs for directional width

- Reduction to “fat” point sets

- Algorithm 1: Grid

- Algorithm 2: Polytope

- Algorithm 3: Decomposition

Strong LTASs for extent

- Linear functions (hyperplanes)

- Polynomial functions

- r -th roots of polynomials

(B) Dynamic updates

Applications to specific extent measures

Reduction to “Fat” Point Sets

• A point set $P \subseteq \mathbb{R}^d$ is α-fat if there exists a translation t such that

$\alpha C \subseteq CH(P) + t \subseteq C = [-1, 1]^d$

• Sufficient to consider computing coresets for α-fat point sets

• Step 1: Every point set can be made α-fat by applying a linear transformation, where α = α(d)

• Step 2: Every linear transformation preserves the approximation ratio of an arbitrary coreset

- The size and construction time are clearly preserved

Step 1: There Exists a “Fattening Transform”

• [Barequet, Har-Peled 01]: Let $P \subseteq \mathbb{R}^d$ be of size n. Can compute in O(n) time a box B and a vector $t \in \mathbb{R}^d$ such that

$\alpha B \subseteq CH(P + t) \subseteq B$

• Recall: α = 1/10? for d = 3

• Choose T so that T(B) = C

Step 2: Invariance Under Linear Transforms

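• Lemma: Let T(x) = Mx + b be a non-degenerate linear transform. Then

Q ⊆ P ε-approximates P over $\Delta \subseteq \mathbb{R}^{d-1}$

if and only if

T(Q) ε-approximates T(P) over

$\{ v \mid [v, 1] = M^T [u, 1],\ u \in \Delta \}$

• Proof: Easy by definition

- Note: can assume T(x) = Mx (the translation b shifts the maximum and the minimum equally, so widths are unchanged)

- Also, $\langle [u, 1], Mp \rangle = \langle M^T [u, 1], p \rangle$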
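The identity used in the proof is the usual adjoint rule for the transpose. A small numeric check in Python, with an arbitrary matrix and points chosen purely for illustration:

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mat_vec(M: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    return [dot(row, v) for row in M]


def transpose(M: Sequence[Sequence[float]]) -> List[List[float]]:
    return [list(col) for col in zip(*M)]


M = [[2.0, 1.0], [0.0, 3.0]]           # a non-degenerate 2x2 matrix (example values)
P = [(1.0, -2.0), (0.0, 0.5), (-1.5, 1.0)]
lifted_u = [0.5, 1.0]                   # the lifted direction [u, 1] with u = 0.5

# <[u, 1], M p> for every p, and <M^T [u, 1], p> for every p.
lhs = [dot(lifted_u, mat_vec(M, p)) for p in P]
rhs = [dot(mat_vec(transpose(M), lifted_u), p) for p in P]

width_of_transformed = max(lhs) - min(lhs)   # spread of M(P) along [u, 1]
width_of_original = max(rhs) - min(rhs)      # spread of P along M^T [u, 1]
print(lhs, rhs, width_of_transformed, width_of_original)
```

Since the two lists agree point by point, any subset Q that preserves the spread of P along $M^T [u, 1]$ also preserves the spread of M(Q) along [u, 1].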

The Case of α-fat Point Sets

• From now on assume αC ⊆ CH(P) ⊆ C where α depends only on d, not on n

• Lemma [Rough approximation]: $w(x, P) \ge 2\alpha \|x\|$

$\forall x \in \mathbb{R}^d:\ \max_{p \in P} \langle x, p \rangle = \|x\| \cdot \max_{p \in P} \underbrace{\langle x / \|x\|, p \rangle}_{\text{projection}} \ge \|x\| \alpha$

For $x_d = 1$: $\max_{p \in P} \langle x, p \rangle \ge \alpha \|x\|$ and $\min_{p \in P} \langle x, p \rangle \le -\alpha \|x\|$

• Lemma [Hausdorff dist.]: If $\max_{p \in P} \min_{q \in Q} \|p - q\| \le \varepsilon\alpha$ then Q is an ε-approximation for P

- Let $p_1, p_2 \in P$ attain the maximum and minimum of $\langle x, p \rangle$, and let $q_1, q_2 \in Q$ be their nearest points in Q

$w(x, P) - w(x, Q) \le \langle x, p_1 - p_2 \rangle - \langle x, q_1 - q_2 \rangle = \langle x, (p_1 - q_1) - (p_2 - q_2) \rangle \le \|x\| \cdot (\|p_1 - q_1\| + \|p_2 - q_2\|) \le \|x\| \cdot 2\alpha\varepsilon$

- By the rough approximation lemma, $\|x\| \cdot 2\alpha\varepsilon \le \varepsilon\, w(x, P)$

Algorithm 1: Grid

• Grid of cell size $\varepsilon\alpha / (2\sqrt{d})$

• Clear all “internal” cells

- Convex hull moves by at most one cell diameter εα/2

• Eliminate duplicates in the “boundary” cells

- Another shift of at most εα/2

• Directional width changes at most εα

- Determined by convex hull
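The grid construction can be sketched in a few lines of Python. This is one possible reading of the slides, not the authors' code: "boundary" cells are taken to be the lowest and highest occupied cell in each grid column along the last axis, and one arbitrary input point is kept per such cell. The input is assumed to lie in C = [−1, 1]^d with αC inside its convex hull; the function name and the test data are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def grid_coreset(points: Sequence[Point], eps: float, alpha: float) -> List[Point]:
    """Keep one point from the lowest and highest occupied cell of each grid column."""
    d = len(points[0])
    side = eps * alpha / (2.0 * math.sqrt(d))  # cell diameter is eps * alpha / 2
    # column key (first d-1 cell indices) -> {cell index on last axis: representative}
    columns: Dict[Tuple[int, ...], Dict[int, Point]] = {}
    for p in points:
        cell = tuple(math.floor(c / side) for c in p)
        columns.setdefault(cell[:-1], {}).setdefault(cell[-1], p)
    coreset: List[Point] = []
    for cells in columns.values():
        lo, hi = min(cells), max(cells)
        coreset.append(cells[lo])
        if hi != lo:
            coreset.append(cells[hi])
    return coreset


def width(x: Sequence[float], points: Sequence[Point]) -> float:
    dots = [sum(a * b for a, b in zip(x, p)) for p in points]
    return max(dots) - min(dots)


# Example input: 16 points on the unit circle plus random points in the disk.
rng = random.Random(0)
P: List[Point] = [(math.cos(2 * math.pi * i / 16), math.sin(2 * math.pi * i / 16))
                  for i in range(16)]
for _ in range(5000):
    r, t = math.sqrt(rng.random()), rng.uniform(0.0, 2.0 * math.pi)
    P.append((r * math.cos(t), r * math.sin(t)))

eps, alpha = 0.2, 0.5
Q = grid_coreset(P, eps, alpha)
unit_dirs = [(math.cos(math.pi * i / 180), math.sin(math.pi * i / 180)) for i in range(180)]
worst = min(width(x, Q) / width(x, P) for x in unit_dirs)
print(len(P), len(Q), worst)
```

A single pass with a hash map keeps the running time linear in n plus the number of occupied columns, and the output size depends on the grid resolution rather than on |P|.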

Algorithm 1: Analysis

• Coreset size $O(1/(\alpha\varepsilon)^{d-1})$

- One point per “boundary” grid cell

- $O(1/(\alpha\varepsilon))$ cells along each axis

• Running time $O(n + 1/(\alpha\varepsilon)^{d-1})$

• O-notation hides√

• α has a bad dependence on d

Algorithm 2: Polytope

• Run Algorithm 1, return (ε/2)-approximation P1

• Apply [Dudley 1974] to P1, lose another ε/2

- Sample the sphere of radius√

- Closest point routine [Gartner 1995], runs in linear time, returns all (at most d) closest points

• Return the set of closest points P2

• Correctness

- Fact 1: Dudley polytope is also valid for P2

- Fact 2: CH(P2) ⊆ CH(P1) ⊆ Dudley

• Want (αε/2)-approximation of the convex hull

- Yields ε/2 for directional width

• Required number of samples

$O\big( (1 / \sqrt{\alpha\varepsilon/2})^{d-1} \big) = O\big( 1/(\alpha\varepsilon)^{(d-1)/2} \big)$

• This is also (roughly) the size of the result P2

- At most d per sample

- Compare to Algorithm 1: $O(1/(\alpha\varepsilon)^{d-1})$

• Find closest points for $O(1/(\alpha\varepsilon)^{(d-1)/2})$ samples

• The closest point routine [Gartner 1995]

- Linear in the number of points

- In this case, $|P_1| = O(1/(\alpha\varepsilon)^{d-1})$

- Total, $O(1/(\alpha\varepsilon)^{3(d-1)/2})$

• Including the call to Algorithm 1, we get $O(n + 1/(\alpha\varepsilon)^{3(d-1)/2})$

- Compare to Algorithm 1: $O(n + 1/(\alpha\varepsilon)^{d-1})$

Algorithm 3: Arrangement

• Run Algorithm 2, get an (ε/2)-approximation P2

• Decompose the set of directions $\mathbb{R}^{d-1}$ using an arrangement of (d − 2)-flats

- u, v ∈ $\mathbb{R}^{d-1}$ in the same cell ⇒ $\left\| \frac{u}{\|u\|} - \frac{v}{\|v\|} \right\| \le \alpha\varepsilon$

- Identify diametrically opposite cells (induced partition of $\mathbb{P}^{d-1}$, the set of unoriented directions)

• Return P3, consisting of two points per cell

- Extremal points in one arbitrary direction within the cell

- Lose another ε/2

〈angle error, extreme point error〉 ≤ $\left\| \frac{u}{\|u\|} - \frac{v}{\|v\|} \right\| \cdot \|p - q\| \le \alpha\varepsilon$

√d =

Algorithm 3: Computing the Arrangement

• (d − 1)-dimensional grid within each of the d facets of C

- Cell size $\alpha\varepsilon / (4\sqrt{d})$

- Results in a set of (d − 2)-flats

• Affine hull with the origin, intersect with $\mathbb{P}^{d-1}$

• Complexity of the arrangement

- Number of hyperplanes d(d − 1) 4√

- Complexity $O(1/(\alpha\varepsilon)^{d-1})$

• Coreset size $|P_3| = O(1/(\alpha\varepsilon)^{d-1})$

- Same as the arrangement complexity

• Extremal points are computed by linear search through P2

- $|P_2| = O(1/(\alpha\varepsilon)^{(d-1)/2})$ time per cell

- $O(1/(\alpha\varepsilon)^{d-1})$ cells

- $O(1/(\alpha\varepsilon)^{3(d-1)/2})$ total

- Combined with Algorithm 2, $O(n + 1/(\alpha\varepsilon)^{3(d-1)/2})$

• Note |P3| > |P2|, a point may be extremal in multiple cells
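The idea of keeping two extremal points per cell of directions is easiest to see in the plane. The sketch below is a simplified 2D stand-in, not the arrangement construction from the slides: the unoriented directions [0, π) are split into k equal angular cells (k is a hypothetical parameter standing in for the αε bound), and the two extreme points for the central direction of each cell are kept by linear search.

```python
import math
import random
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]


def direction_coreset_2d(points: Sequence[Point2], k: int) -> List[Point2]:
    """Two extremal points for one representative direction per angular cell."""
    chosen = set()
    for i in range(k):
        theta = math.pi * (i + 0.5) / k  # representative direction of cell i
        vx, vy = math.cos(theta), math.sin(theta)
        chosen.add(max(points, key=lambda p: p[0] * vx + p[1] * vy))
        chosen.add(min(points, key=lambda p: p[0] * vx + p[1] * vy))
    return sorted(chosen)


def width(x: Sequence[float], points: Sequence[Point2]) -> float:
    dots = [x[0] * p[0] + x[1] * p[1] for p in points]
    return max(dots) - min(dots)


# Example input: 16 points on the unit circle plus random points in the disk.
rng = random.Random(0)
P: List[Point2] = [(math.cos(2 * math.pi * i / 16), math.sin(2 * math.pi * i / 16))
                   for i in range(16)]
for _ in range(2000):
    r, t = math.sqrt(rng.random()), rng.uniform(0.0, 2.0 * math.pi)
    P.append((r * math.cos(t), r * math.sin(t)))

k = 100
Q = direction_coreset_2d(P, k)
unit_dirs = [(math.cos(math.pi * i / 180), math.sin(math.pi * i / 180)) for i in range(180)]
worst = min(width(x, Q) / width(x, P) for x in unit_dirs)
print(len(Q), worst)
```

The set is deduplicated because the same point can be extremal for several cells, which matches the remark that a point may be extremal in multiple cells.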

Approximating (d − 1)-variate Functions

• Recall: ε-approximation in terms of extent

1. Linear functions (hyperplanes), using duality

2. General polynomials, using linearization

3. r -th roots of polynomials, r ∈ Z

Point-Hyperplane Duality

• The hyperplane $h : x_d = a_1 x_1 + a_2 x_2 + \cdots + a_{d-1} x_{d-1} + a_d$ corresponds to the point $h^* = (a_1, a_2, \ldots, a_d) \in \mathbb{R}^d$

• Lemma: A set of hyperplanes $H = \{h_1, h_2, \ldots, h_n\}$ is ε-approximated by K ⊂ H over $\Delta \subseteq \mathbb{R}^{d-1}$ iff $H^*$ is ε-approximated by $K^*$ over ∆

• Proof: Directional width $w(u, H^*)$ is the same as the extent $e(u, H)$

- Holds for any $u \in \mathbb{R}^{d-1}$ and any set of hyperplanes H in $\mathbb{R}^d$

- Important to define w(·, ·) using the $x_d = 1$ plane

• Immediately implies ε-approximation algorithms for linear functions in (d − 1) variables
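The proof step is a one-line calculation: evaluating h at u gives $\langle [u, 1], h^* \rangle$, so the extent of H at u and the directional width of H* in direction u are the same number. A short Python check, with hyperplane coefficients chosen arbitrarily for illustration:

```python
from typing import Sequence, Tuple

Hyperplane = Tuple[float, ...]  # (a_1, ..., a_d), meaning x_d = a_1 x_1 + ... + a_{d-1} x_{d-1} + a_d


def evaluate(h: Hyperplane, u: Sequence[float]) -> float:
    """Height of the hyperplane h above the point u in R^(d-1)."""
    return sum(a * x for a, x in zip(h[:-1], u)) + h[-1]


def extent(u: Sequence[float], hyperplanes: Sequence[Hyperplane]) -> float:
    values = [evaluate(h, u) for h in hyperplanes]
    return max(values) - min(values)


def directional_width(u: Sequence[float], points: Sequence[Sequence[float]]) -> float:
    lifted = list(u) + [1.0]  # uses the x_d = 1 plane, as the slides require
    dots = [sum(a * b for a, b in zip(lifted, p)) for p in points]
    return max(dots) - min(dots)


H = [(1.0, 2.0, 0.0), (-1.0, 0.5, 3.0), (0.0, 0.0, -1.0)]  # three planes in R^3
H_dual = [tuple(h) for h in H]                              # h* has the same coordinates
u = (0.5, -2.0)

print(extent(u, H), directional_width(u, H_dual))
```

Because the dual point has the same coordinates as the coefficient vector, any of the directional-width algorithms can be run on H* and the chosen points mapped back to hyperplanes.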

Linearization

• A set of (d − 1)-variate polynomials $F = \{f_1, f_2, \ldots, f_n\}$

• Linearization of dimension k

$f_i(x) = a_0^{(i)} + a_1^{(i)} \varphi_1(x) + \cdots + a_k^{(i)} \varphi_k(x), \quad i = 1, 2, \ldots, n$

• Reduces to a set of k-variate linear functions

$f_i(y) = a_0^{(i)} + a_1^{(i)} y_1 + \cdots + a_k^{(i)} y_k, \quad i = 1, 2, \ldots, n$

with $y = \varphi(x)$, $\varphi : \mathbb{R}^{d-1} \to \mathbb{R}^k$

• An ε-approximation for $\{f_i(y)\}$ over $\Delta \subseteq \mathbb{R}^k$ implies an ε-approximation for $\{f_i(x)\}$ over $\varphi^{-1}(\Delta \cap \varphi(\mathbb{R}^{d-1}))$

Linearization Example

• $f_i(x)$ is the distance of the point $(x_1, x_2) \in \mathbb{R}^2$ to a circle in $\mathbb{R}^2$ with center $(p^{(i)}, q^{(i)})$ and radius $r^{(i)}$

$f_i(x) = (r^{(i)})^2 - (x_1 - p^{(i)})^2 - (x_2 - q^{(i)})^2$

• Can be written as

$f_i(x) = [(r^{(i)})^2 - (p^{(i)})^2 - (q^{(i)})^2] + (2p^{(i)}) x_1 + (2q^{(i)}) x_2 - [x_1^2 + x_2^2]$

• A linearization of dimension k = 3 with

$a^{(i)} = [(r^{(i)})^2 - (p^{(i)})^2 - (q^{(i)})^2,\ 2p^{(i)},\ 2q^{(i)},\ -1]$

$\varphi(x) = [x_1, x_2, x_1^2 + x_2^2]$

• [Agarwal, Matousek 1994] Computing linearization of minimum dimension
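Expanding the two squares in the circle function gives the constant term r² − p² − q², linear terms 2p·x1 and 2q·x2, and a single shared quadratic term −(x1² + x2²). The Python sketch below checks that expansion numerically on random circles; the function names and the random ranges are illustrative assumptions.

```python
import random
from typing import Tuple

Circle = Tuple[float, float, float]  # (p, q, r): center (p, q) and radius r


def f(circle: Circle, x: Tuple[float, float]) -> float:
    """The original function r^2 - (x1 - p)^2 - (x2 - q)^2."""
    p, q, r = circle
    return r ** 2 - (x[0] - p) ** 2 - (x[1] - q) ** 2


def coefficients(circle: Circle) -> Tuple[float, float, float, float]:
    """Coefficient vector a = [a0, a1, a2, a3] of the linearization."""
    p, q, r = circle
    return (r * r - p * p - q * q, 2.0 * p, 2.0 * q, -1.0)


def phi(x: Tuple[float, float]) -> Tuple[float, float, float]:
    """Lifting map from R^2 to R^3 (linearization dimension k = 3)."""
    return (x[0], x[1], x[0] ** 2 + x[1] ** 2)


def f_linearized(circle: Circle, x: Tuple[float, float]) -> float:
    a = coefficients(circle)
    y = phi(x)
    return a[0] + sum(ai * yi for ai, yi in zip(a[1:], y))


rng = random.Random(0)
max_gap = 0.0
for _ in range(100):
    circle = (rng.uniform(-3, 3), rng.uniform(-3, 3), rng.uniform(0.1, 2.0))
    x = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    max_gap = max(max_gap, abs(f(circle, x) - f_linearized(circle, x)))
print(max_gap)  # should be at the level of floating-point round-off
```

After lifting, each circle is a linear function of y = φ(x) in R^3, so the hyperplane algorithms apply with k = 3.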

Polynomials: Algorithms

• If a set F of n (d − 1)-variate polynomials admits a linearization of dimension k, can compute

- Algorithm 1: an ε-approximation of size $O(1/\varepsilon^k)$, in time $O(n + 1/\varepsilon^k)$

- Algorithm 2: an ε-approximation of size $O(1/\varepsilon^{k/2})$, in time $O(n + 1/\varepsilon^{3k/2})$

- Algorithm 3: a set of $O(1/\varepsilon)$ (d − 2)-dimensional surfaces in $\mathbb{R}^{d-1}$, in time $O(n + 1/\varepsilon^{3k/2})$, such that within each cell of their arrangement F is ε-approximated by 2 of its elements

- Complexity of the arrangement $O(1/\varepsilon^{d-1})$ [Agarwal, Sharir 00]; cells are of complexity O(1)

- Algorithms 2 and 3: an ε-approximation of size $O(1/\varepsilon^{\min\{k/2,\, d-1\}})$, in time $O(n + 1/\varepsilon^{3k/2})$

Roots of Polynomials

• Want ε-approximation for $F = \{(f_1)^{1/r}, (f_2)^{1/r}, \ldots, (f_n)^{1/r}\}$ where r is an integer

• Cannot linearize directly

• Special cases can be handled [Chan 02]

• If a, b, A, B ≥ 0 and [A, B] ⊆ [a, b] then

$B - A \ge (1 - \delta)(b - a) \ \Rightarrow\ B^{1/r} - A^{1/r} \ge (1 - \varepsilon)(b^{1/r} - a^{1/r})$

for $\delta = \left( \frac{\varepsilon}{2(r-1)} \right)^r$

• It suffices to compute an $O(\varepsilon^r)$-approximation to $\{f_1, f_2, \ldots, f_n\}$

Roots of Polynomials: Algorithms

• A set F, |F| = n, contains r-th roots of (d − 1)-variate non-negative polynomials that admit a linearization of dimension k

• Algorithm 1: ε-approximation in time $O(1/\varepsilon^{kr})$ and size $O(1/\varepsilon^{kr})$

• Algorithms 2 and 3: ε-approximation in time $O(1/\varepsilon^{3kr/2})$ and size $O(1/\varepsilon^{r \min\{d-1,\, k/2\}})$

• Algorithm 3: A set of $O(1/\varepsilon^r)$ (d − 2)-dimensional surfaces in $\mathbb{R}^{d-1}$, in time $O(1/\varepsilon^{3kr/2})$, such that within each cell of their arrangement, F is ε-approximated by two of its elements

Classes of Extent Measures

• Faithful

- Approximated through directional width

- “Convex” measures (bounding shapes)

- Examples: diameter; width; radius of the smallest enclosing ball; volume of the minimum bounding box; volume enclosed by the convex hull; surface area of the convex hull

• Other

- Approximated through extent

- “Concave” measures (“shells”)

- Examples: minimum width spherical shell (annulus); minimum width cylindrical shell

Dynamic Updates

• Based on a balanced tree data structure

- Both insertions and deletions

- Recompute approximations along the unique path to the root

- Update time per operation $O((\log^{k+1} n / \varepsilon)^k + f(\log n / \varepsilon) \log n)$

- Amortized $O((\log^k n / \varepsilon)^k + f(\log n / \varepsilon))$

• Based on the ranked subsets data structure

- Insertions only

- Partition the points into ranked subsets, based on the binary encoding of n

- Data structure size $O(\log^{2k+1} n / \varepsilon^k)$

- Coreset size $O(\log^{2k+1} n / \varepsilon^k)$, amortized insertion time $O((1/\varepsilon)^k + f(\varepsilon))$

- Coreset size $O(1/\varepsilon^k)$, amortized insertion time $O(\log^{2k+1} n / \varepsilon^k + f(\varepsilon))$
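The ranked-subsets idea behaves like a binary counter: a subset of rank i summarizes 2^i inserted points, and inserting a point merges equal-rank subsets the way a carry propagates. The Python sketch below shows only that bookkeeping. The class name, the `summarize` callback, and the toy 1-D summary (minimum and maximum, which is exact for width on the line) are assumptions for illustration; a real coreset routine would also need per-rank error parameters, which are omitted here.

```python
import random
from typing import Callable, List, Optional

Summary = List[float]


class RankedSubsets:
    """Insertion-only structure: buckets[i] summarizes 2^i inserted points."""

    def __init__(self, summarize: Callable[[Summary], Summary]) -> None:
        self.summarize = summarize
        self.buckets: List[Optional[Summary]] = []
        self.n = 0

    def insert(self, p: float) -> None:
        carry: Summary = [p]
        i = 0
        # Merge equal-rank subsets upward, like adding 1 to a binary counter.
        while i < len(self.buckets) and self.buckets[i] is not None:
            carry = self.summarize(self.buckets[i] + carry)
            self.buckets[i] = None
            i += 1
        if i == len(self.buckets):
            self.buckets.append(None)
        self.buckets[i] = carry
        self.n += 1

    def occupied_ranks(self) -> List[int]:
        return [i for i, b in enumerate(self.buckets) if b is not None]

    def coreset(self) -> Summary:
        merged = [p for b in self.buckets if b is not None for p in b]
        return self.summarize(merged)


def min_max(points: Summary) -> Summary:
    """Toy 1-D summary: the two extreme values."""
    return [min(points), max(points)]


values = [float(v) for v in range(100)]
random.Random(1).shuffle(values)
structure = RankedSubsets(min_max)
for v in values:
    structure.insert(v)
print(structure.n, structure.occupied_ranks(), structure.coreset())
```

The occupied ranks are exactly the one-bits of n, so at most about log n summaries are stored at any time, and each point takes part in at most about log n merges.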
