Coreset Construction and
Estimation over Stochastic Data
Dissertation Submitted to
Tsinghua University
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in
Computer Science and Technology
by
Lingxiao Huang
Dissertation Supervisor: Assistant Professor Jian Li
June 2017
Coreset Construction and Estimation over
Stochastic Data
by
Lingxiao Huang
Submitted to the Institute for Interdisciplinary Information Sciences in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
at
TSINGHUA UNIVERSITY
June 2017
© TSINGHUA UNIVERSITY 2017. All rights reserved.
Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Institute for Interdisciplinary Information Sciences
April 15, 2015
Certified by . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Jian Li
Assistant Professor
Thesis Supervisor
Coreset Construction and Estimation over
Stochastic Data
by
Lingxiao Huang
Submitted to Tsinghua University on April 15, 2015, in partial fulfillment of the
requirements for the degree of Doctor of Philosophy
Abstract
In recent years, designing algorithms for geometric or combinatorial optimization problems over stochastic data has attracted more and more research interest. In this dissertation, we consider two well-known stochastic geometry models. One is the existential model, where each point's location is fixed but the point only occurs with a certain probability. The other is the locational model, where each point has a probability distribution describing its location. Both stochastic geometry models have been widely studied in recent years. In this dissertation, we mainly focus on the following problems: coreset construction for shape fitting problems, and estimation for combinatorial optimization problems on these stochastic geometry models.
The first problem concerns a useful technique for handling large deterministic datasets, called a coreset. Roughly speaking, a coreset is a small summary of the original large dataset that guarantees that answers for certain queries are provably close to the exact answers for the corresponding queries on the original dataset. In this dissertation, we study how to construct coresets on stochastic models. We first extend the concept of ε-kernel coresets to stochastic data. We consider approximating the expected width (an ε-exp-kernel), as well as the probability distribution of the width (an (ε, τ)-quant-kernel), for any direction, and show how to construct such coresets efficiently. Then we consider two stochastic shape fitting problems, stochastic k-center and stochastic j-flat-center. We propose a new notion called generalized coresets, which is a generalization of coresets. We also provide a framework for constructing generalized coresets of constant size for both the stochastic k-center problem and the stochastic j-flat-center problem. Using these generalized coresets, we give the first PTASs (polynomial time approximation schemes) for both stochastic shape fitting problems.
Secondly, we study the problems of computing the expected lengths of several combinatorial or geometric optimization problems in stochastic geometry models, including closest pair, minimum spanning tree, k-clustering, minimum perfect matching, and minimum cycle cover. Most of the above problems are known to be #P-hard.
In this dissertation, we propose two new techniques, called finding a stoch-core and the Hierarchical Partition Family (HPF). Combining our new techniques with the Monte Carlo method, we obtain the first FPRAS (Fully Polynomial Randomized Approximation Scheme) for most of these problems in stochastic geometry models.
Dissertation Supervisor: Assistant Professor Jian Li
CONTENTS
Contents
1 Introduction 1
1.1 ε-Kernel Coresets over Stochastic Data . . . . . . . . . . . . . . . . . 4
1.2 Coreset Construction for Shape Fitting Problems over Stochastic Data 9
1.3 Estimation for Combinatorial Optimization Problems over Stochastic
Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2 Preliminaries 20
3 Related Work 21
4 ε-Kernel Coresets over Stochastic Data 25
4.1 ε-Kernel Coresets over Deterministic Data . . . . . . . . . . . . . . . 25
4.2 ε-Kernels for Expectations of Width . . . . . . . . . . . . . . . . . . . 27
4.2.1 A Nearly Linear Time Algorithm for Constructing ε-exp-kernels 31
4.2.2 ε-exp-kernel Under the Subset Constraint . . . . . . . . . . 33
4.3 ε-Kernels for Probability Distributions of Width . . . . . . . . . . . . 34
4.3.1 A Simple (ε, τ)-quant-kernel Construction . . . . . . . . . 34
4.3.2 Improved (ε, τ)-quant-kernel for Existential Models . . . . 37
4.3.3 (ε, τ)-quant-kernel Under the Subset Constraint . . . . . . 51
4.4 (ε, r)-fpow-kernel Under the β-Assumption . . . . . . . . . . . . . 52
4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.5.1 Approximating the Extent of Uncertain Functions . . . . . . . 54
4.5.2 Stochastic Moving Points . . . . . . . . . . . . . . . . . . . . . 56
4.5.3 Shape Fitting Problems . . . . . . . . . . . . . . . . . . . . . 56
4.5.4 Shape Fitting Problems (Under the β-assumption) . . . . . . 58
4.6 Missing Details in Section 4.2 . . . . . . . . . . . . . . . . . . . . . . 61
4.6.1 Details for Section 4.2.1 . . . . . . . . . . . . . . . . . . . . . 61
4.6.2 Details for Section 4.2.2 . . . . . . . . . . . . . . . . . . . . . 62
4.6.3 Locational uncertainty . . . . . . . . . . . . . . . . . . . . . . 64
4.7 Missing Details in Section 4.3 . . . . . . . . . . . . . . . . . . . . . . 66
4.8 Missing Details in Section 4.4 . . . . . . . . . . . . . . . . . . . . . . 71
4.9 Computing the Expected Direction Width . . . . . . . . . . . . . . . 72
4.9.1 Computing Expected Width for Existential Uncertainty . . . . 72
4.9.2 Computing Expected Width for Locational Uncertainty . . . . 77
5 Coreset Construction for Stochastic Shape Fitting Problems 82
5.1 Coreset Construction for Deterministic Shape Fitting Problems . . . 82
5.2 Generalized Shape Fitting Problems and Generalized Coresets . . . . 85
5.3 Stochastic Minimum k-Center . . . . . . . . . . . . . . . . . . . . . . 90
5.3.1 Existential uncertainty model . . . . . . . . . . . . . . . . . . 91
5.3.2 Locational uncertainty model . . . . . . . . . . . . . . . . . . 104
5.4 Stochastic Minimum j-Flat-Center . . . . . . . . . . . . . . . . . . . 107
5.4.1 Case 1: B < ε . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.4.2 Case 2: B ≥ ε . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.5 Constructing additive ε-coresets . . . . . . . . . . . . . . . . . . . . . 115
6 Estimating the Expected Value of Combinatorial Optimization Prob-
lems over Stochastic Data 121
6.1 The Closest Pair Problem . . . . . . . . . . . . . . . . . . . . . . . . 121
6.1.1 Estimating Pr[C ≤ 1] . . . . . . . . . . . . . . . . . . . . . . . 121
6.1.2 Estimating E[C] . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.2 k-Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.3 Minimum Spanning Trees . . . . . . . . . . . . . . . . . . . . . . . . 131
6.4 Minimum Perfect Matchings . . . . . . . . . . . . . . . . . . . . . . . 137
6.5 Minimum Cycle Covers . . . . . . . . . . . . . . . . . . . . . . . . . . 145
6.6 kth Longest m-Nearest Neighbor . . . . . . . . . . . . . . . . . . . . 150
6.7 Missing Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.7.1 Closest Pair . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.7.2 Minimum Spanning Tree . . . . . . . . . . . . . . . . . . . . . 152
6.7.3 Minimum Perfect Matching . . . . . . . . . . . . . . . . . . . 154
6.8 The Closest Pair Problem . . . . . . . . . . . . . . . . . . . . . . . . 155
6.8.1 Estimating kth Closest Pair in the Existential Uncertainty Model 155
6.8.2 Hardness for Closest Pair . . . . . . . . . . . . . . . . . . . . . 156
6.9 Another FPRAS for MST . . . . . . . . . . . . . . . . . . . . . . . . 158
7 Concluding Remarks 160
Acknowledgements 171
LIST OF FIGURES
List of Figures
4-1 The figure depicts a pentagon M in R2 to illustrate some intuitive facts in convex geometry. (1) The plane can be divided into 5 cones C1, . . . , C5 by 5 angles θ1, . . . , θ5. ~uθi is the unit direction corresponding to angle θi. Each cone Ci corresponds to a vertex si, and for any direction ~u ∈ Ci, f(M, ~u) = 〈~u, si〉 and the vector ∇f(M, ~u) is si. (2) Each direction θi is perpendicular to an edge of M. M = ∩_{i=1}^{5} Hi, where Hi is the supporting halfplane with normal vector ~uθi. . . . . . . . . . 31
4-2 The construction of the (ε, τ)-quant-kernel S. The dashed polygon is H. The inner solid polygon is ConvH(EH) and the outer one is K = (1 + ε)ConvH(EH). K̄ is the set of points outside K. . . . . . . 42
4-3 Illustration of the interval graph I. For illustration purposes, co-located points (e.g., points that are split in A) are shown as overlapping points. The arrows indicate the assignment of the segments to the points in B. Theorem 32 ensures that any vertical line cannot stab many intervals. . . 68
4-4 Illustrating the definition of the angle α of: (a) a ray ρ and (b) a line l. . . . . . 74
4-5 Illustrating the computation of the coordinate x(s, ~u) on l(~u): v(~u) is the perpendicular projection of s on l(~u). The length of ov is ds. . . . . . . 76
5-1 An example for Algorithm 1 when k = 2. In this figure, P = {s1, . . . , s11} consists of all points, and S = {s3, s5, s7} consists of the black points. Then by Lemma 70, we have that Pr_{P∼P}[E(P) = S] = p3 p5 p7 (1 − p1)(1 − p2)(1 − p4)(1 − p10)(1 − p11). Now we run Algorithm 1 on S. In Step 1, we first construct a Cartesian grid G(S) as in the figure, and construct a cell collection C(S) = {C1, C2, C3} since C4 ∩ S = ∅. Note that E(S) = S (by Lemma 70) and |S| = 3 > k. We directly go to Step 3 and want to compute the value Q(Ci) for each cell Ci. For cell C1, two rectangle points s1 and s2 are of smaller index than s3 ∈ S. So we compute that Q(C1) = p3 (1 − p1)(1 − p2). Similarly, we compute Q(C2) = p5 (1 − p4), Q(C3) = p7, and Q(C4) = (1 − p10)(1 − p11). Finally, in Step 4, we output Pr_{P∼P}[E(P) = S] = ∏_{C∈G(S)} Q(C) = p3 p5 p7 (1 − p1)(1 − p2)(1 − p4)(1 − p10)(1 − p11). . . . . . . 95
5-2 In the figure, Si is the black point set, F∗ is the white point set, and F∗i is the dashed point set. Here, s∗i ∈ Si is the farthest point from F∗i, satisfying d(s∗i, F∗i) = K(Si, F∗i), and f∗i ∈ F∗ is the closest point to s∗i, satisfying d(s∗i, f∗i) = d(s∗i, F∗). . . . . . . . . . . . . 97
LIST OF TABLES
List of Tables
1.1 Our results for some problems in different stochastic models. . . . . . 16
CHAPTER 1. INTRODUCTION
Chapter 1 Introduction
In recent years, stochastic data have become pervasive in applications. Managing, analyzing and optimizing over such stochastic data has become an increasingly important issue and has attracted significant attention from several research communities, including theoretical computer science, databases, machine learning and sensor networks [29, 35, 100]. Theoretically, any deterministic combinatorial or geometric optimization problem has many uncertain counterparts (corresponding to different uncertainty models). A variety of classic problems in the deterministic setting have been well studied, while systematic studies of them under uncertainty have only been initiated. For example, suppose we want to build k facilities to serve a set of uncertain demand points, and our goal is to minimize the expectation of the maximum distance from any realized demand point to its closest facility. This problem, called stochastic k-center, is first considered in this dissertation. Estimating and solving optimization problems over stochastic models and data have recently attracted significant attention in several research communities (see e.g., [96, 100, 102]).
In the following, we list some application examples to illustrate where stochastic data come from and what types of stochastic data we may encounter; see [79] for more examples.
1. Stochastic shortest path. Consider a traffic problem where we want to arrive at the airport before a specific time. There are several paths leading to the airport. In the deterministic setting, each road takes a certain amount of time, and our goal is to find a shortest path, i.e., one that minimizes the travel time. In reality, however, we often know the distribution of the travel time of each road rather than the exact time. In this stochastic setting, we want to pick a path that maximizes the probability that we arrive at the airport on time. This problem has been studied extensively since the 1980s [18, 78, 81, 86, 90, 91].
2. Fixed set stochastic knapsack. The knapsack problem is a classic scheduling problem. One natural motivation is as follows. Suppose we are given a series of jobs and a single machine. Each job has a processing time and a profit. Our goal is to pick a subset of jobs which can be finished by the single machine, one by one, before the deadline, and to maximize the total profit of these jobs. However, the processing time of each job is often random and follows an individual distribution. A stochastic variant of this problem is therefore called the fixed set stochastic knapsack problem. We still want to choose a set of jobs to maximize the total profit, but since the processing time is random, we have an additional constraint: the probability of finishing all jobs before the deadline must be at least some fixed constant γ > 0. In previous work, many groups of researchers have studied different distributions, including Bernoulli [50, 75], exponential [50] and Gaussian [52].
3. Adaptive stochastic process. Note that for the above two examples, the solution is chosen in advance. The adaptive variants of these problems have also been studied extensively. For the adaptive stochastic shortest path problem, we learn the exact travel time of a road immediately after we pass through it. Thus, adaptively choosing the following path may increase the probability that we arrive at the airport before the deadline.

Similarly, we can consider adaptive policies for stochastic knapsack. In fact, researchers have studied a more complicated version where the processing time is random and the precise values of the processing time and the profit are revealed only when the job is completed. The goal is to gain as much profit as possible. Dean et al. [33] initially studied this problem and proposed a greedy algorithm. Later on, [24, 81] considered the same problem and improved their results.
Many other adaptive stochastic problems have been well studied, such as adaptive stochastic matching [17] and adaptive stochastic probing [107].
4. Stochastic geometry. Theoretically, all deterministic computational geometry problems have natural stochastic counterparts. We take the following stochastic variant of the facility location problem as an example. Suppose we want to build k facilities to serve a set of uncertain demand points (i.e., their locations are random), and our goal is to minimize the expectation of the maximum distance from any realized demand point to its closest facility. A variety of classic geometry problems in the deterministic setting have been well studied, while systematic studies of them in different stochastic models have only been initiated in recent years. Munteanu et al. [88] studied the stochastic minimum enclosing ball problem in fixed-dimensional Euclidean space and gave a PTAS. In this dissertation, we also study some fundamental problems in this area.
In this dissertation, we focus on two well-known stochastic geometry models: the
locational uncertainty model and the existential uncertainty model. Both models
have been studied extensively for a variety of computational geometry optimization
problems or combinatorial optimization problems, such as closest pairs [70], nearest
neighbors [5, 70], minimum spanning trees [65, 71], convex hulls [101], maxima [2],
perfect matchings [65], clustering [30, 53], minimum enclosing balls [88] and range
queries [1, 4, 80]. We give the formal definitions of these two stochastic geometry
models as follows.
1. Locational uncertainty model: We are given a metric space P. The location of each node v ∈ V is a random point in the metric space P, and the probability distribution is given as the input. Formally, we use the term nodes to refer to the vertices of the graph, and points to describe the locations of the nodes in the metric space. We denote the set of nodes by V = {v1, . . . , vm} and the set of points by P = {s1, . . . , sn}, where m = |V| and n = |P|. A realization r can be represented by an m-dimensional vector (r1, . . . , rm) ∈ P^m, where point ri is the location of node vi for 1 ≤ i ≤ m. Let R denote the set of all possible realizations. We assume that the distributions of the locations of the nodes in the metric space P are independent; thus r occurs with probability Pr[r] = ∏_{i∈[m]} p_{v_i r_i}, where p_{vs} represents the probability that the location of node v is point s ∈ P. This model is also termed the locational uncertainty model in [71].
2. Existential uncertainty model: A closely related model is the existential uncertainty model, where the location of a node is a fixed point in the given metric space, but the existence of the node is probabilistic. In this model, we use p_{s_i} to denote the probability that node vi exists (if it exists, its location is si). For simplicity, we also use pi to represent p_{s_i}. A realization r can be represented by a subset P ⊆ P, and Pr[r] = ∏_{si∈P} pi · ∏_{si∉P} (1 − pi).
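To make the two models concrete, the following minimal Python sketch draws a single realization in each model. The function names and data layout are our own illustration, not from the dissertation:

```python
import random

def sample_existential(points, probs, rng=random):
    """Existential model: point s_i appears independently with probability p_i,
    so a realization is a random subset of the points."""
    return [s for s, p in zip(points, probs) if rng.random() < p]

def sample_locational(node_dists, rng=random):
    """Locational model: node v_i is placed at point s with probability p_{v_i, s};
    node_dists[i] is a dict mapping each candidate point to its probability."""
    realization = []
    for dist in node_dists:
        pts = list(dist.keys())
        weights = [dist[s] for s in pts]
        realization.append(rng.choices(pts, weights=weights, k=1)[0])
    return realization

# Three stochastic points on the real line, existential model:
P = sample_existential([0.0, 1.0, 2.0], [0.5, 0.9, 0.1])

# Two nodes, locational model (each node's distribution sums to 1):
r = sample_locational([{0.0: 0.3, 1.0: 0.7}, {5.0: 1.0}])
```

Repeating such draws and averaging an objective over them is the basic Monte Carlo estimator used throughout; the coreset constructions below aim to replace this brute-force sampling with much smaller summaries.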
Now, we introduce and motivate the two classes of problems studied in this dissertation: coreset construction, and estimating the expected values of combinatorial optimization problems over stochastic data. We also briefly state our contributions one by one.
1.1 ε-Kernel Coresets over Stochastic Data
Given a large dataset P and a class C of queries, a coreset S is a dataset of much
smaller size such that for every query r ∈ C, the answer r(S) for the small dataset
S is close to the answer r(P ) for the original large dataset P . Coresets have become
more relevant in the era of big data as they summarize large datasets by datasets
with potentially much smaller size, and at the same time guarantee the answer to
certain classes of queries to be close to the true answer. The notion of a coreset was
studied in the directional width problem (in which a coreset is called an ε-kernel) and
several other geometric shape fitting problems in the seminal paper [7].
We introduce some notation and briefly review the definition of an ε-kernel. For a set P of deterministic points, the support function f(P, ~u) is defined to be f(P, ~u) = max_{s∈P} 〈~u, s〉 for ~u ∈ Rd, where 〈·, ·〉 is the inner product. The directional width of P in direction ~u ∈ Rd, denoted by ω(P, ~u), is defined by ω(P, ~u) = f(P, ~u) + f(P, −~u). It is easy to see that the support function and the directional width only depend on the convex hull of P. A subset Q ⊆ P is called an ε-kernel of P if for each direction ~u ∈ Rd, (1 − ε)ω(P, ~u) ≤ ω(Q, ~u) ≤ ω(P, ~u). For any set of n points, there is an ε-kernel of size O(ε−(d−1)/2) [7, 8], which can be constructed in O(n + ε−(d−3/2)) time [26, 88].
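For intuition, the support function, the directional width, and the ε-kernel condition can be checked directly on small examples. The sketch below (our own illustration) tests the condition on a finite sample of directions rather than all of Rd:

```python
import math

def support(P, u):
    """Support function f(P, u) = max_{s in P} <u, s>."""
    return max(sum(ui * si for ui, si in zip(u, s)) for s in P)

def width(P, u):
    """Directional width w(P, u) = f(P, u) + f(P, -u)."""
    return support(P, u) + support(P, [-ui for ui in u])

def is_eps_kernel(Q, P, eps, directions):
    """Check (1 - eps) w(P, u) <= w(Q, u) <= w(P, u) on the given directions."""
    return all((1 - eps) * width(P, u) <= width(Q, u) <= width(P, u) + 1e-12
               for u in directions)

# A diamond plus its center: the four extreme points form a 0-kernel,
# since the interior point never attains the maximum in any direction.
P = [(1, 0), (0, 1), (-1, 0), (0, -1), (0, 0)]
Q = [(1, 0), (0, 1), (-1, 0), (0, -1)]
dirs = [(math.cos(t), math.sin(t)) for t in (k * math.pi / 8 for k in range(16))]
```

Because width depends only on the convex hull, dropping interior points never changes any directional width, which is exactly why small kernels can exist.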
Our contribution. (Chapter 4) Our main results can be summarized as follows:
1. Suppose P is a set of stochastic points (in either the existential or the locational uncertainty model). Define the expected directional width of P in direction ~u to be ω(P, ~u) = E_{P∼P}[ω(P, ~u)], where P ∼ P means that P is a (random) realization of P. We first consider how to construct an ε-exp-kernel, which is defined as follows:
Definition 1. For a constant ε > 0, a set S of (deterministic or stochastic)
points in Rd is called an ε-exp-kernel of P, if for all directions ~u ∈ Rd,
(1− ε)ω(P , ~u) ≤ ω(S, ~u) ≤ ω(P , ~u).
Our first main result is that an ε-exp-kernel of size O(ε−(d−1)/2) exists for both the existential and the locational uncertainty models and can be constructed in nearly linear time.

Theorem 2. P is a set of n uncertain points in Rd (in either the locational or the existential uncertainty model). There exists an ε-exp-kernel of size O(ε−(d−1)/2) for P. For the existential uncertainty model (resp. the locational uncertainty model), such an ε-exp-kernel can be constructed in O(ε−(d−1) n log n) time, where n is the number of points (possible locations).
The existential result follows from a simple Minkowski sum argument. We first show that there exists a convex polytope M such that for any direction, the directional width of M is exactly the same as the expected directional width of P (Lemma 16). This immediately implies the existence of an ε-exp-kernel consisting of O(ε−(d−1)/2) deterministic points (using the result in [7]), but without the subset constraint. The Minkowski sum argument seems to suggest that the complexity of M is exponential. However, we show that the complexity of M is in fact polynomial, O(n2d−2), and we can construct it explicitly in O(n2d−1 log n) time (Theorem 20).
Although the complexity of M is polynomial, we cannot afford to construct it
explicitly if we are to construct an ε-exp-kernel in nearly linear time. Thus we
construct the ε-exp-kernel without explicitly constructing M . In particular,
we show that it is possible to find the extreme vertex of M in a given direction in
nearly linear time, by computing the gradient of the support function of M . We
also provide quadratic-size data structures that can calculate the exact width
ω(P , ·) in logarithmic time under both models in R2 (Section 4.9).
We also show that under the subset constraint (i.e., the ε-exp-kernel is required to be a subset of the original point set, with the same probability distribution for each chosen point), there is no ε-exp-kernel of sublinear size (Lemma 25). However, if there is a constant lower bound β > 0 on the existential probabilities (called the β-assumption), we can construct an ε-exp-kernel of constant size (Theorem 26 and Section 4.6).
2. Sometimes it is useful to obtain more than just the expected value (say, of the width) for a query; rather, one may want to return (an approximation of) a representation of the full probability distribution that the query answer can take. So we also consider the construction of the following (ε, τ)-quant-kernel.
Definition 3. For constants ε, τ > 0, a set S of stochastic points in Rd is called an (ε, τ)-quant-kernel of P if for all directions ~u and all x ≥ 0,

Pr_{P∼P}[ω(P, ~u) ≤ (1 − ε)x] − τ ≤ Pr_{S∼S}[ω(S, ~u) ≤ x] ≤ Pr_{P∼P}[ω(P, ~u) ≤ (1 + ε)x] + τ.
Now, we describe our main results for (ε, τ)-quant-kernels. We first propose a quite simple but general algorithm for constructing (ε, τ)-quant-kernels, which achieves the following guarantee.

Theorem 4. An (ε, τ)-quant-kernel of size O(τ−2 ε−3(d−1)/2) can be constructed in O(n τ−2 ε−(d−1)) time, under both the existential and the locational uncertainty models.
The algorithm is surprisingly simple: take some number N of i.i.d. realizations, compute an ε-kernel for each realization, and then associate each kernel with probability 1/N (so the points of the coreset are not independent). The analysis requires the VC uniform convergence bound for unions of halfspaces. The details can be found in Section 4.3.1.
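A toy version of this construction for the existential model can be sketched as follows. For simplicity each realization stands in for its own ε-kernel (in the actual algorithm a small kernel replaces it); all names here are our own illustration:

```python
import random

def quant_kernel_sketch(points, probs, N, kernel=lambda P: P, rng=random):
    """Draw N i.i.d. realizations of the existential model and replace each
    by a kernel; the coreset S picks one stored kernel uniformly at random
    (probability 1/N each), so its points are not independent."""
    kernels = []
    for _ in range(N):
        P = [s for s, p in zip(points, probs) if rng.random() < p]
        kernels.append(kernel(P))
    return kernels

def cdf_width(kernels, u, x):
    """Empirical Pr_{S~S}[w(S, u) <= x] under the uniform mixture."""
    def w(P):
        if not P:
            return 0.0
        dots = [sum(ui * si for ui, si in zip(u, s)) for s in P]
        return max(dots) - min(dots)
    return sum(w(P) <= x for P in kernels) / len(kernels)
```

The uniform-convergence argument in the text is what bounds how large N must be so that these empirical width distributions are within τ of the true ones simultaneously for all directions.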
For the existential uncertainty model, we can improve the size bound as follows.

Theorem 5. P is a set of uncertain points in Rd with existential uncertainty. There exists an (ε, τ)-quant-kernel for P which consists of a set of independent uncertain points of cardinality O(ε−(d−1) τ−2). The algorithm for constructing such a coreset runs in O(n log^{O(d)} n) time.
We note that another advantage of the improved construction is that the (ε, τ)-
quant-kernel is a set of independent stochastic points (rather than correlated
points as in Theorem 4). We achieve the improvement by two algorithms.
The first algorithm transforms the Bernoulli distributed variables into Poisson
distributed random variables and creates a probability distribution using the
parameters of the Poissons, from which we take a number of i.i.d. samples as
the coreset. Our analysis leverages the additivity of Poisson distributions and
the VC uniform convergence bound (for halfspaces). However, the number of
samples required depends on λ(P), so the first algorithm only works when λ(P) is small. The second algorithm complements the first one by identifying a convex set K that lies in the convex hull of P with high probability (such a K exists when λ(P) is large) and uses a small-size deterministic ε-kernel to approximate K. The points in K̄ = P \ K can be approximated using the same sampling algorithm as in the first algorithm, and we can show that λ(K̄) is small, thus requiring only a small number of samples. Our algorithm can be easily extended to Rd for any constant d, and the size of the coreset is O(τ−2 ε−(d−1)). In Section 4.3.2, we show that such an (ε, τ)-quant-kernel can be computed in O(n polylog n) time using an iterative sampling algorithm. Our technique has some interesting connections to other important geometric problems (such as the Tukey depth problem [87]), and may be interesting in its own right.
3. The notion of an (ε, τ)-quant-kernel is also not powerful enough for certain shape fitting problems (e.g., the minimum enclosing cylinder problem and the minimum spherical shell problem) in the stochastic setting. The main reason is the appearance of the ℓ2-norm in the objective function, so we need to be able to handle the fractional powers in the objective function. For a set P of points in Rd, the polar set of P is defined to be P∗ = {~u ∈ Rd | 〈~u, s〉 ≥ 0, ∀s ∈ P}. Let r be a positive integer. Given a set P of points in Rd and ~u ∈ P∗, we define a function

Tr(P, ~u) = max_{s∈P} 〈~u, s〉^{1/r} − min_{s∈P} 〈~u, s〉^{1/r}.

We only care about the directions in the polar set of P (i.e., the directions ~u for which 〈~u, s〉 ≥ 0 for every possible location s), so that Tr(P, ~u) is well defined for all P ∼ P.
Definition 6. For a constant ε > 0 and a positive integer r, a set S of stochastic points in Rd is called an (ε, r)-fpow-kernel of P if for all directions ~u in the polar set of P,

(1 − ε) E_{P∼P}[Tr(P, ~u)] ≤ E_{P∼S}[Tr(P, ~u)] ≤ (1 + ε) E_{P∼P}[Tr(P, ~u)].
For (ε, r)-fpow-kernels, we provide a linear time algorithm for constructing an (ε, r)-fpow-kernel of size O(ε−(rd−r+2)) in the existential uncertainty model under the β-assumption, where each point is present with probability at least β. The algorithm is almost the same as the construction in Section 4.3.1, except that some parameters are different.

Theorem 7. (Section 4.4) An (ε, r)-fpow-kernel of size O(ε−(rd−r+2)) can be constructed in O(n ε−(rd−r+4)/2) time in the existential uncertainty model under the β-assumption.
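The quantity Tr itself is straightforward to evaluate for a given realization; the following small helper (our own, for illustration) computes it and enforces the polar-set condition:

```python
def T_r(P, u, r):
    """T_r(P, u) = max_{s in P} <u, s>^(1/r) - min_{s in P} <u, s>^(1/r).
    Only defined when <u, s> >= 0 for all s, i.e., u lies in the polar set."""
    vals = [sum(ui * si for ui, si in zip(u, s)) for s in P]
    if any(v < 0 for v in vals):
        raise ValueError("u must lie in the polar set of P")
    powered = [v ** (1.0 / r) for v in vals]
    return max(powered) - min(powered)

# With r = 2 this takes square roots of the projections: the projections
# of (1, 0) and (4, 0) onto u = (1, 0) are 1 and 4, so T_2 = 2 - 1 = 1.
val = T_r([(1.0, 0.0), (4.0, 0.0)], (1.0, 0.0), 2)
```

The fractional power 1/r is what lets the ℓ2-norm objectives of shape fitting problems (after linearization) be expressed in this form.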
4. Finally, we show that the above results, combined with the duality and linearization arguments [7], can be used to obtain constant size coresets for the function extent problem in the stochastic setting, and to maintain extent measures for stochastic moving points.
Using the above results, we also obtain efficient approximation schemes for various shape fitting problems in different stochastic settings, such as minimum enclosing ball, minimum spherical shell, minimum enclosing cylinder and minimum cylindrical shell. We summarize our application results in the following theorems. The details can be found in Section 4.5.
Theorem 8. Suppose P is a set of n independent stochastic points in Rd, under either the existential or the locational uncertainty model. There are linear time approximation schemes for the following problems: (1) finding a center point c to minimize E[max_{s∈P} ‖s − c‖²]; (2) finding a center point c to minimize E[obj(c)] = E[max_{s∈P} ‖s − c‖² − min_{s∈P} ‖s − c‖²]. Note that when d = 2, the above two problems correspond to minimizing the expected areas of the enclosing ball and the enclosing annulus, respectively.
Under the β-assumption, we can obtain efficient approximation schemes for the following shape fitting problems.
Theorem 9. Suppose P is a set of n independent stochastic points in Rd, each
appearing with probability at least β, for some fixed constant β > 0. There
are linear time approximation schemes for minimizing the expected radius (or width) for the minimum spherical shell, minimum enclosing cylinder, and minimum cylindrical shell problems over P.
1.2 Coreset Construction for Shape Fitting Problems over
Stochastic Data
We study two classic geometric optimization problems, the k-center problem and the
j-flat-center problem in Euclidean spaces. Both problems are important in geometric data analysis. We generalize both problems to the stochastic setting. For the
stochastic k-center problem, we would like to find k points in a fixed dimensional
Euclidean space, such that the expected value of the k-center objective is minimized.
For the stochastic j-flat-center problem, we seek a j-flat (i.e., a j-dimensional affine
subspace) such that the expected value of the maximum distance from any point
to the j-flat is minimized. One of the motivations for this stochastic version comes
from the stochastic variant of the ℓ∞ regression problem. We still want to construct
coresets for these two shape fitting problems.
In the following, we first define the two stochastic shape fitting problems. Then
we briefly introduce our contributions and techniques.
Stochastic k-Center. The deterministic Euclidean k-center problem is a central problem in geometric optimization [11, 8]. It asks for a k-point set F in Rd such that the maximum distance from any of the n given points to its closest point in F is minimized.
Definition 10. For a set of points P ⊆ Rd and a k-point set F = {f1, . . . , fk} (fi ∈ Rd, 1 ≤ i ≤ k), we define K(P, F) = max_{s∈P} min_{1≤i≤k} d(s, fi) as the k-center value of F w.r.t. P. We use F to denote the family of all k-point sets in Rd. Given a set P of n stochastic points (in either the existential or locational uncertainty model) in Rd and a k-point set F ∈ F, we define the expected k-center value of F w.r.t. P as

K(P, F) = E_{P∼P}[K(P, F)].

In the stochastic minimum k-center problem, our goal is to find a k-point set F ∈ F which minimizes K(P, F). In this dissertation, we assume that both the dimensionality d and k are fixed constants.
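For a fixed candidate F, the objective in Definition 10 can be evaluated exactly on a realization and estimated by Monte Carlo over realizations. The sketch below (our own illustration for the existential model; the convention K(∅, F) = 0 for an empty realization is our assumption) shows both steps:

```python
import math
import random

def kcenter_value(P, F):
    """K(P, F) = max_{s in P} min_{1<=i<=k} d(s, f_i).
    We take K(P, F) = 0 for an empty realization (a convention for this sketch)."""
    return max((min(math.dist(s, f) for f in F) for s in P), default=0.0)

def expected_kcenter(points, probs, F, N=2000, rng=random):
    """Monte Carlo estimate of K(P, F) = E_{P~P}[K(P, F)] in the existential model."""
    total = 0.0
    for _ in range(N):
        P = [s for s, p in zip(points, probs) if rng.random() < p]
        total += kcenter_value(P, F)
    return total / N
```

Note that plain sampling only evaluates a given F; the PTAS below needs the generalized-coreset machinery to search over all k-point sets F.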
Stochastic j-Flat-Center. The deterministic j-flat-center problem is defined as follows: given n points in Rd, we would like to find a j-flat F (i.e., a j-dimensional affine subspace) such that the maximum distance from any given point to F is minimized. It is a common generalization of the minimum enclosing ball (j = 0), minimum enclosing cylinder (j = 1), and minimum width (j = d − 1) problems, and has been well studied in computational geometry [8, 45, 106]. Its stochastic version is also naturally motivated by the stochastic variant of the ℓ∞ regression problem: Suppose
we would like to fit a set of points by an affine subspace. However, those points may
be produced by some machine learning algorithm, which associates some confidence
level to each point (i.e., each point has an existential probability). This naturally
gives rise to the stochastic j-flat-center problem. Formally, it is defined as follows.
Definition 11. Given a set P of n points in Rd, and a j-flat F ∈ F (0 ≤ j ≤ d−1),
where F is the family of all j-flats in Rd, we define the j-flat-center value of F w.r.t.
P to be J(P, F ) = maxs∈P d(s, F ), where d(s, F ) = minf∈F d(s, f) is the distance
between point s and j-flat F . Given a set P of n stochastic points (in either the
existential or locational model) in Rd, and a j-flat F ∈ F (0 ≤ j ≤ d− 1), we define
the expected j-flat-center value of F w.r.t. P to be
J(P , F ) = EP∼P [J(P, F )].
In the stochastic minimum j-flat-center problem, our goal is to find a j-flat F which
minimizes J(P , F ).
Previous Results and Our contributions. Recall that a polynomial time approx-
imation scheme (PTAS) for a minimization problem is an algorithm A that produces
a solution whose cost is at most 1 + ε times the optimal cost in polynomial time, for
any fixed constant ε > 0.
Stochastic k-Center. Cormode and McGregor [30] first studied the stochastic k-
center problem in a finite metric graph under the locational uncertainty model, and
obtained a bi-criterion constant approximation. Guha and Munagala [53] improved
their result to a single-criterion constant factor approximation. Recently, Wang and
Zhang [108] studied the stochastic k-center problem on a line, and proposed an effi-
cient exact algorithm. No result better than a constant approximation is known for
the Euclidean space Rd (d ≥ 2). We obtain the first PTAS for the stochastic k-center
problem in Rd.
Theorem 12. Assume that both k and d are fixed constants. There exists a PTAS
for the stochastic minimum k-center problem in Rd, under either the existential or
the locational uncertainty model.
Our result generalizes the PTAS for stochastic minimum enclosing ball by Munteanu
et al. [88]. We remark that the assumption that k is a constant is necessary for obtaining a PTAS, since the deterministic Euclidean k-center problem is APX-hard for arbitrary k, even in R2 [42].
Stochastic j-Flat-Center. Our main result for the stochastic j-flat-center is as
follows.
Theorem 13. Assume that the dimensionality d is a constant. There exists a PTAS
for the stochastic minimum j-flat-center problem, under either the existential or the
locational uncertainty model.
This result also generalizes the PTAS for stochastic minimum enclosing ball (i.e.,
0-flat-center) by Munteanu et al. [88]. It also generalizes a previous PTAS for the
stochastic minimum enclosing cylinder (i.e., 1-flat-center) problem in the existential
model where the existential probability of each point is assumed to be lower bounded
by a small fixed constant in Chapter 4.
Our techniques. Our techniques for both problems heavily rely on the powerful
notion of coresets. In a typical deterministic geometric optimization problem, an
instance P is a set of deterministic (weighted) points. A coreset S of P is a set
of (weighted) points, such that the solution for the optimization problem over S is a
good approximate solution for P . 1 In Chapter 4, we generalize the notion of ε-kernel
coreset (for directional width) to stochastic points. However, those techniques can only handle directional width, and extending them to problems such as stochastic minimum enclosing cylinder requires a certain technical assumption.
In this dissertation, we introduce a new framework for solving geometric opti-
mization problems over stochastic points. For a stochastic instance P , we consider
1 It is possible to define coresets for other classes of optimization problems.
P as a collection of realizations {P | P ∼ P}. Each realization P has a weight Pr[P ], which is its realization probability. Now, we can view the stochastic problem as a certain deterministic problem over all (exponentially many) realizations (each being a point set). Our framework constructs an object S satisfying the following properties.
1. S has a constant-size description (the constants may depend on d, ε, and k).
2. The objective value for a certain deterministic optimization problem over S can
approximate the objective for the original stochastic problem well. Moreover,
the solution to the deterministic optimization over S is a good approximation
for the original problem as well.
At a high level, S serves very similar roles as the coresets in the deterministic
setting. Note that the form of S may vary for different problems: in stochastic k-
center, it is a collection of weighted point sets (we call S an SKC-Coreset); in
stochastic j-flat-center, it is a combination of two collections of weighted point sets
for two intermediate problems (we call S an SJFC-Coreset).
For stochastic k-center under the existential model, we construct an SKC-Coreset
S in two steps. First, we map all realizations to their additive ε-coresets (for deter-
ministic k-centers) [11]. Since there are only a polynomial number of possible additive
ε-coresets, the above mapping can partition the space of all realizations into a poly-
nomial number of parts, such that the realizations in each part have very similar
objective functions. Moreover, for each additive ε-coreset, it is possible to compute the total probability of the realizations that are mapped to it. In fact, this requires a subtle modification of the construction in [11] so that we can compute the aforementioned probability efficiently. This step reduces the exponential number of realizations to a polynomial-size representation. Next, we define a generalized shape fitting problem, called the generalized k-median problem, over the collection of the above additive ε-coresets. Then, we properly generalize the previous definition of coreset and of the total sensitivity (a notion proposed in the deterministic coreset context by Langberg and Schulman [77]), and prove a constant upper bound for the
generalized total sensitivity by relating it to the total sensitivity of the ordinary k-
median problem. The SKC-Coreset S is a generalized coreset for the generalized
k-median problem, which consists of a constant number of weighted point sets.
For stochastic k-center under the locational model, computing the weight for each
set in the SKC-Coreset S is somewhat more complicated. We need to reduce
the computational problem to a family of bipartite holant problems, and apply the
celebrated result by Jerrum, Sinclair, and Vigoda [67].
For the stochastic minimum j-flat-center problem, we propose an efficient algorithm for constructing an SJFC-Coreset. We utilize several ideas from Chapter 4, as
well as prior results on shape fitting problems. We first partition the realizations P ∼ P into two parts through a construction similar to the (ε, τ)-quant-kernel construction in Chapter 4. Roughly speaking, after linearization, we need to find a convex set K in a higher-dimensional space such that the total probability of any point falling outside K is small, yet in each direction the expected directional width of P remains comparable to that of K. Then, for those points inside K, it
is possible to use a slight modification of the construction in Chapter 4 to construct a
collection of weighted point sets. For the points outside K, since the total probability
is small, we reduce the problem to a weighted j-flat-median problem, and use the
coreset in [106] (this step is similar to that in [88]). By combining the two collections,
we obtain the SJFC-Coreset S for the problem, which is of constant size. Then,
we can easily obtain a PTAS by solving a constant size polynomial system defined by
S.
We remark that our overall approach is very different from that in Munteanu
et al. [88] (except one aforementioned step and that they also crucially used some
machinery from the coreset literature). Munteanu et al. [88] defined a near-metric
distance measure m(A,B) = maxa∈A,b∈B d(a, b) for two non-empty point sets A,B.
This near-metric measure satisfies many metric properties, like non-negativity, sym-
metry and the triangle inequality. By lifting the problem to the space defined by such
metric and utilizing a previous coreset result for clustering, they obtained a PTAS for
the problem. However, in the more general stochastic minimum k-center problem and
stochastic minimum j-flat-center problem, it is unclear how to translate the distance function between point sets and k-centers, or between point sets and j-flats, into a near-metric distance that still satisfies symmetry and the triangle inequality.
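For intuition, the stated properties of the near-metric are easy to check numerically. The following is a minimal sketch with made-up random point sets (the helper name `m` is ours, not from [88]):

```python
# Sanity check of the near-metric m(A, B) = max_{a in A, b in B} d(a, b):
# symmetry and the triangle inequality hold; data are hypothetical.
import math
import random

def m(A, B):
    """Largest distance between any point of A and any point of B."""
    return max(math.dist(a, b) for a in A for b in B)

random.seed(1)
A, B, C = ([(random.random(), random.random()) for _ in range(4)] for _ in range(3))
assert m(A, B) == m(B, A)                    # symmetry
assert m(A, C) <= m(A, B) + m(B, C) + 1e-12  # triangle inequality
```

Note that m(A, A) > 0 whenever A has more than one point, which is one reason m is only a near-metric rather than a metric.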
1.3 Estimation for Combinatorial Optimization Problems over
Stochastic Data
We are interested in the following natural problem over both existential and locational
uncertainty models: estimating the expected values of certain statistics of combinato-
rial objects. In this dissertation, we study several combinatorial or geometry problems
in these two models: the closest pair problem, minimum spanning tree, minimum per-
fect matching (assuming an even number of nodes), k-clustering and minimum cycle
cover. We take the minimum spanning tree problem for example. Let MST be the
length of the minimum spanning tree (which is a random variable) and MST(r) be
the length of the minimum spanning tree spanning all points in the realization r. We
would like to estimate the following quantity:
E[MST] = ∑_{r∈R} Pr[r] · MST(r).
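To make the formula concrete, E[MST] can be computed exactly in the existential model by enumerating all 2^n realizations. The sketch below (hypothetical coordinates and probabilities; the function names are ours, not from the dissertation) does exactly that, and makes the exponential blow-up explicit:

```python
# Sketch: exact E[MST] in the existential model via brute-force enumeration
# of all realizations. Exponential time, so only viable for tiny n.
import itertools
import math

def mst_length(points):
    """Euclidean MST length via Prim's algorithm; 0 if fewer than 2 points."""
    if len(points) < 2:
        return 0.0
    in_tree, total = {0}, 0.0
    while len(in_tree) < len(points):
        d, j = min(
            (math.dist(points[i], points[j]), j)
            for i in in_tree for j in range(len(points)) if j not in in_tree
        )
        total += d
        in_tree.add(j)
    return total

def expected_mst(points, probs):
    """E[MST] = sum over all realizations r of Pr[r] * MST(r)."""
    total = 0.0
    for mask in itertools.product([0, 1], repeat=len(points)):
        pr = math.prod(p if b else 1 - p for b, p in zip(mask, probs))
        total += pr * mst_length([pt for b, pt in zip(mask, points) if b])
    return total
```

For two points at distance 1, each present with probability 1/2, only the realization containing both points contributes, giving E[MST] = 1/4.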
However, the above formula does not give us an efficient way to estimate the expectation, since it involves an exponential number of terms. In fact, computing the exact expected value is either NP-hard or #P-hard. Following much of the theoretical computer science literature on approximate counting and estimation, our goal is to obtain fully polynomial randomized approximation schemes for computing the expected values.
Our contribution. (Chapter 6) We recall that a fully polynomial randomized approximation scheme (FPRAS) for a problem f is a randomized algorithm A that takes an input instance x and a real number ε > 0, and returns A(x) such that Pr[(1 − ε)f(x) ≤ A(x) ≤ (1 + ε)f(x)] ≥ 3/4, with running time polynomial in both the size of the input n and 1/ε. Our main contributions are summarized in Table 1.1.
Problem                                    Quantity     Existential   Locational
Closest Pair (S6.1)                        E[C]         FPRAS         FPRAS
                                           Pr[C ≤ 1]    FPRAS         FPRAS
                                           Pr[C ≥ 1]    Inapprox      Inapprox
Diameter (S6.1)                            E[D]         FPRAS         FPRAS
                                           Pr[D ≤ 1]    Inapprox      Inapprox
                                           Pr[D ≥ 1]    FPRAS         FPRAS
Minimum Spanning Tree (S6.3)               E[MST]       FPRAS [71]    FPRAS
k-Clustering (S6.2)                        E[kCL]       FPRAS         Open
Perfect Matching (S6.4)                    E[PM]        N.A.          FPRAS
kth Closest Pair (S6.8.1)                  E[kC]        FPRAS         Open
Cycle Cover (S6.5)                         E[CC]        FPRAS         FPRAS
kth Longest m-Nearest Neighbor (S6.6)      E[kmNN]      FPRAS         Open

Table 1.1: Our results for some problems in different stochastic models.
1. Closest Pair: We use C to denote the minimum distance over all pairs of nodes. If a realization has fewer than two nodes, C is zero. Computing Pr[C ≤ 1] exactly in the existential model is known to be #P-hard even in the Euclidean plane [72], but no nontrivial algorithmic result was known before; the same holds for computing Pr[C ≥ 1]. In fact, it is not hard to show that computing Pr[C ≥ 1] is inapproximable within any factor in a metric space (Section 6.8.2).
We also consider the problem of computing the expected distance E[C] between the closest pair in the same model. We prove that the problem is #P-hard in Section 6.8.2 and give the first known FPRAS in Section 6.1. Note that an FPRAS for computing Pr[C ≤ 1] does not imply an FPRAS for computing E[C]. 2
2. Diameter: The problem of computing the expected length of the diameter can be reduced to the closest pair problem as follows. Assume that the longest distance between two points in P is W . We construct a new instance P ′ as follows: for any two points s, t ∈ P , let their distance be 2W − d(s, t) in P ′. The new instance is still a metric. The sum of the distance of the closest pair in P and the diameter in P ′ is exactly 2W (if there are at least two realized points). Hence, the answer for the diameter can be easily derived from the answer for the closest pair in P ′.

2 By contrast, an FPRAS for computing Pr[C ≥ 1] or Pr[C = 1] would imply an FPRAS for computing E[C], since E[C] = ∑_{(si,sj)} Pr[C = d(si, sj)] · d(si, sj) = ∫ Pr[C ≥ t] dt = ∑_{(si,sj)} Pr[C ≥ d(si, sj)] · (d(si, sj) − d(s′i, s′j)).
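The reduction is easy to check numerically. The sketch below (with hypothetical points) verifies that the closest-pair distance in P and the diameter of the transformed instance P′ sum to exactly 2W:

```python
# Numerical check of the diameter-to-closest-pair reduction: with
# d'(s, t) = 2W - d(s, t), closest_pair(P) + diameter(P') = 2W.
import itertools
import math

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]  # hypothetical data
dists = [math.dist(s, t) for s, t in itertools.combinations(points, 2)]
W = max(dists)                                # longest distance in P
cp_P = min(dists)                             # closest pair distance in P
diam_P_prime = max(2 * W - d for d in dists)  # diameter of P'
assert abs(cp_P + diam_P_prime - 2 * W) < 1e-9
```

The identity holds because the transformation reverses the order of all pairwise distances, so the diameter pair of P′ is exactly the closest pair of P.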
3. Minimum Spanning Tree: Computing E[MST] exactly in both uncertainty mod-
els is known to be #P-hard [71]. Kamousi, Chan, and Suri [71] developed an
FPRAS for estimating E[MST] in the existential uncertainty model and a con-
stant factor approximation algorithm in the locational uncertainty model.
Estimating E[MST] is amenable to several techniques. We obtain an FPRAS for estimating E[MST] in the locational uncertainty model using the stoch-core technique in Section 6.3. In fact, the idea in [71] can also be extended to give
an alternative FPRAS (Section 6.9). It is not clear how to extend their idea to
other problems.
4. Clustering (k-clustering): In the deterministic k-clustering problem, we want to
partition all points into k disjoint subsets such that the spacing of the partition
is maximized, where the spacing is defined to be the minimum of any d(u, v)
with u, v in different subsets [74]. In fact, the optimal cost of the problem is the
length of the (k− 1)th most expensive edge in the minimum spanning tree [74].
We show how to estimate E[kCL] using the HPF (hierarchical partition family)
technique in Section 6.2.
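The MST characterization of the spacing is easy to exercise. The sketch below (hypothetical points; the function names are ours) compares the (k−1)-th most expensive MST edge with an exhaustive search over 2-partitions:

```python
# Sketch: the optimal k-clustering spacing equals the (k-1)-th most expensive
# MST edge [74]; checked against brute force for k = 2 on hypothetical data.
import math

def mst_edge_lengths(points):
    """Edge lengths of a Euclidean MST, built with Prim's algorithm."""
    in_tree, lengths = {0}, []
    while len(in_tree) < len(points):
        d, j = min(
            (math.dist(points[i], points[j]), j)
            for i in in_tree for j in range(len(points)) if j not in in_tree
        )
        lengths.append(d)
        in_tree.add(j)
    return lengths

def spacing_via_mst(points, k):
    """Spacing of the optimal k-clustering: (k-1)-th largest MST edge."""
    return sorted(mst_edge_lengths(points), reverse=True)[k - 2]

def spacing_brute_force_k2(points):
    """Maximize, over all 2-partitions, the minimum inter-part distance."""
    n, best = len(points), 0.0
    for mask in range(1, 2 ** n - 1):  # every nonempty proper subset
        spacing = min(
            math.dist(points[i], points[j])
            for i in range(n) for j in range(n)
            if ((mask >> i) & 1) != ((mask >> j) & 1)
        )
        best = max(best, spacing)
    return best
```

Removing the k − 1 longest MST edges leaves exactly k components, and the shortest removed edge is the resulting spacing; this is single-linkage clustering in disguise.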
5. Perfect Matching: We assume that there is an even number of nodes, to ensure that a perfect matching always exists. Therefore, only the locational uncertainty model is relevant here. We give the first FPRAS for approximating the expected length of the minimum perfect matching in Section 6.4, using a more complicated stoch-core technique.
All of our algorithms run in polynomial time. However, we have not attempted
to optimize the exact running time.
Our techniques. Perhaps the simplest and most commonly used technique for estimating the expectation of a random variable is the Monte Carlo method, that is, to use the sample average as the estimate. However, the method is only efficient (i.e., runs in polynomial time) if the variance of the random variable is small (see Lemma 14). To circumvent the difficulty caused by high variance, a general methodology is to decompose the expectation of the random variable into a convex combination of conditional expectations using the law of total expectation: E[X] = E_Y[E[X | Y ]] = ∑_y Pr[Y = y] E[X | Y = y]. Hopefully, Pr[Y = y] can be estimated (or calculated exactly) efficiently, and the random variable X conditioned on each event Y = y has low variance. However, choosing the events Y to condition on can be tricky.
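The decomposition identity itself can be sanity-checked on a small made-up discrete joint distribution:

```python
# Numerical check of E[X] = sum_y Pr[Y = y] * E[X | Y = y] on a small,
# hypothetical joint distribution Pr[(X, Y)].
joint = {
    (1.0, 'a'): 0.2, (3.0, 'a'): 0.3,
    (2.0, 'b'): 0.4, (10.0, 'b'): 0.1,
}
e_x = sum(x * p for (x, _), p in joint.items())
decomposed = 0.0
for y in {y for _, y in joint}:
    pr_y = sum(p for (_, yy), p in joint.items() if yy == y)
    e_x_given_y = sum(x * p for (x, yy), p in joint.items() if yy == y) / pr_y
    decomposed += pr_y * e_x_given_y
assert abs(e_x - decomposed) < 1e-12
```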
We develop two new techniques for choosing such events, each capable of solving a subset of the aforementioned problems. In the first technique, we identify a set H of points, called the stoch-core of the problem, such that (1) with high probability, all nodes realize in H, and (2) conditioned on event (1), the variance is small. Then, we choose Y to be the number of nodes realized to points not in H. We compute (1 ± ε)-estimates for Y = 0, 1 using Monte Carlo, by (1) and (2).
The problematic part is when Y is large, i.e., many nodes realize to points outside H.
Even though the probability of such events is very small, the value of X under such
events may be considerably large, thus contributing nontrivially. However, we can
show that the contribution of such events is dominated by the first few events and thus
can be safely ignored. Choosing an appropriate stoch-core is easy for some problems, such as closest pair and minimum spanning tree, while it may require additional ideas for other problems, such as minimum perfect matching.
Our second technique utilizes a notion called Hierarchical Partition Family (HPF).
The HPF has n levels, each representing a clustering of all points. For a combinatorial
problem, for which the solution is a set of edges, we define Y to be the highest level
such that some edge in the solution is an inter-cluster edge. Informally, conditioning
on the information of Y , we can essentially bound the variance of X (hence use the
Monte Carlo method). To implement Monte Carlo, we need to be able to take samples
efficiently conditioned on Y. We show that such sampling problems can be reduced to, or have connections to, classical approximate counting and sampling problems, such as approximating the permanent and counting knapsack solutions.
Chapter 2 Preliminaries
In this chapter, we recall a useful tool, the Chernoff bound. Suppose we want to
estimate E[X]. In each Monte Carlo iteration, we take a sample (a realization of all
nodes), and compute the value of X for the sample. At the end, we output the average
over all samples. The number of samples required by this algorithm is suggested by
the following standard Chernoff bound.
Lemma 14. (Chernoff Bound) Let X1, X2, . . . , XN be independent random variables taking values in [0, U]. Let X = (1/N) ∑_{i=1}^N Xi and let µ be the expectation of X. For any ε > 0,

Pr[X ∈ [(1 − ε)µ, (1 + ε)µ]] ≥ 1 − 2e^{−Nµε²/(4U)}.
Therefore, for any ε > 0, in order to get a (1 ± ε)-approximation with probability 1 − 1/poly(n), the number of samples needs to be O((U/(µε²)) log n). If U/µ, the ratio between the maximum possible value of X and the expected value E[X], is bounded by a polynomial in the input size, we can use the above Monte Carlo method to estimate E[X] with a polynomial number of samples.
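Solving the tail bound for N gives the concrete sample count. The sketch below assumes the failure-probability form 2e^{−Nµε²/(4U)} from Lemma 14; the function name and the delta parameter are ours:

```python
# Sketch: sample count implied by the Chernoff bound of Lemma 14, assuming
# Pr[failure] <= 2 * exp(-N * mu * eps**2 / (4 * U)).
import math

def monte_carlo_samples(U, mu, eps, delta):
    """Smallest N with 2 * exp(-N * mu * eps**2 / (4 * U)) <= delta."""
    return math.ceil(4 * U * math.log(2 / delta) / (mu * eps ** 2))

# The bound is met at the returned N (hypothetical parameter values).
N = monte_carlo_samples(U=100.0, mu=1.0, eps=0.1, delta=0.01)
assert 2 * math.exp(-N * 1.0 * 0.1 ** 2 / (4 * 100.0)) <= 0.01
```

With delta = 1/poly(n) this reproduces the O((U/(µε²)) log n) sample count quoted above: N is polynomial whenever U/µ is polynomially bounded.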
Chapter 3 Related Work
In this chapter, we survey prior work related to our research problems. We start by introducing related work on shape fitting problems in both deterministic and stochastic settings. Next, we discuss related work on coreset construction. Then we review prior work on computing the expected value of combinatorial optimization problems in different stochastic models. We also briefly mention some other stochastic models that are conceptually or technically related to this dissertation.
Shape fitting problem. A number of theoretical results for the k-center problem
have been obtained in the past. In deterministic settings, Agarwal and Procopiuc
[11] considered the k-center problem and showed that there exists an additive coreset
of a constant size which can represent the whole input point set if k and d are both
constants. Har-Peled and Varadarajan [61] improved their result in high dimensions
and gave a PTAS if k is a constant. Cormode and McGregor [30] considered the
k-center problem for the locational model in a finite metric graph, and obtained a
bi-criterion constant approximation. Guha and Munagala [53] improved the result
to a true constant factor approximation. Munteanu et al. [88] studied the minimum
enclosing ball problem (a.k.a. the 1-center problem) for stochastic points in fixed-dimensional Euclidean space and gave a PTAS. Coresets were also constructed for imprecise points [85] to help derive results for approximating convex hulls and a variety of other shape-fitting problems. Note that their model is different from the existential and locational models.
Other projective clustering problems have also been studied extensively. If k is
a constant, the existence of an additive coreset of a constant size for k-line-center,
i.e., for the problem of covering P by k congruent cylinders of the minimum radius,
was first proved by Agarwal et al. [11]. Har-Peled and Varadarajan [61] obtained a PTAS for the k j-flat-center problem when j and k are both constants. Langberg
and Schulman [77] showed that for the weighted k-median/k-means problem, 1 there
exists an ε-coreset of size depending polynomially on d and k by bounding the total
sensitivity. Varadarajan and Xiao [106] studied the k-line clustering problem and the
(j, k) integer projective clustering problem, and showed that there exists an ε-coreset
of a poly-logarithmic size.
Coreset construction. There is a large body of literature [93] on constructing
coresets for various problems, such as shape fitting [7, 8], shape fitting with outliers
[62], clustering [27, 45, 47, 60, 77], integrals [77], matrix approximation and regression
[34, 45], and in different settings, such as geometric data streaming [8, 26] and the privacy setting [43]. We have introduced some results for shape fitting problems. In this
part, we review other applications of coreset construction. For example, Har-Peled
and Wang [62] provided a coreset construction approach for handling outliers. From
the dual (function extent) perspective, they want to approximate the distance between
two level sets in an arrangement of hyperplanes. In the locational model, coresets have been created for range counting queries [1] under the subset constraint, but those techniques do not translate to our setting, because ε-kernel coresets in general cannot be constructed from a density-preserving subset of the data, as is the case for range counting coresets.
Estimation for stochastic combinatorial optimization. Several geometric prop-
erties of a set of stochastic points have been studied extensively in the literature under
the term stochastic geometry. For instance, Beardwood et al. [21] showed that if there are n points uniformly and independently distributed in [0, 1]2, the minimal traveling salesman tour visiting them has an expected length Ω(√n). Asymptotic results for
minimum spanning trees and minimum matchings on n points uniformly distributed
in unit balls are established by Bertsimas and van Ryzin [23]. Similar results can be
found in e.g., [22, 73, 98]. Compared with results in stochastic geometry, we focus
on the efficient computation of the statistics, instead of giving explicit mathematical
formulas.
1The k-median/k-means problem in the existential uncertainty model can be considered as aweighted k-median/k-means problem.
Recently, a number of researchers have begun to explore geometric computing
under uncertainty and many classical computational geometry problems have been
studied in different stochastic/uncertainty models. Agarwal, Cheng, Tao and Yi [3]
studied the problem of indexing probabilistic points with continuous distributions for
range queries on a line. Agarwal, Efrat, Sankararaman, and Zhang [5] also studied
the same problem in the locational uncertainty model under the Euclidean metric. The most probable k-nearest-neighbor problem and its variants have attracted a lot of attention in the database community (see, e.g., [28]). Several other problems have
also been considered recently, such as computing the expected volume of a set of
probabilistic rectangles in a Euclidean space [109], convex hulls [6], skylines (Pareto
curves) over probabilistic points [2, 15], and shape fitting [83].
Kamousi, Chan and Suri [71] initiated the study of estimating the expected length
of combinatorial objects in this model. They showed that computing the expected
length of the nearest neighbor (NN) graph, the Gabriel graph (GG), the relative
neighborhood graph (RNG), and the Delaunay triangulation (DT) can be solved
exactly in polynomial time, while computing E[MST] is #P-hard and there exists a
simple FPRAS for approximating E[MST] in the existential model. They also gave
a deterministic PTAS for approximating E[MST] in the Euclidean plane. In another
paper [72], they studied the closest pair and (approximate) nearest neighbor problems
(i.e., finding the point with the smallest expected distance from the query point) in
the same model.
The computational/algorithmic aspects of stochastic geometry have also gained a
lot of attention in recent years from the area of wireless networking. In many appli-
cation scenarios, it is common to assume that the nodes (e.g., sensors) are deployed
randomly across a certain area, thereby forming a stochastic network. It is of central
importance to study various properties of this network, such as connectivity [55] and transmission capacity [56]. We refer the interested reader to a recent survey [57] for
more references.
Other stochastic models. Besides the stochastic geometry models, geometric un-
certain data has also been studied in the imprecise model [16, 64, 76, 84, 89, 92, 104].
In this model, each point is provided with a region where it might be. This originated
with the study of imprecision in data representation [54, 94], and can be used to pro-
vide upper and lower bounds on several geometric constructs such as the diameter,
convex hull, and flow on terrains [36, 104].
Convex hulls have been studied for uncertain points: upper and lower bounds are provided under the imprecise model [41, 85, 89, 104]; distributions of the circumference and volume are calculated in the locational model [69, 83]; the most likely convex hull is found in the existential model in R2, and is shown to be NP-hard to find in Rd for d > 2 as well as in the locational model [101]; and the probability that a query point is inside the convex hull is computed [6].
As far as we know, the expected complexity of the convex hull under uncertain points
has not been studied, although it has been studied [59] under other random data
models.
The randomly weighted graph model, where the edge weights are independent non-negative random variables, has also been studied extensively. Frieze [48] and Steele [99] showed that the expected value of the minimum spanning tree on such a graph with identically and independently distributed edge weights is ζ(3)/D, where ζ(3) = ∑_{j=1}^∞ 1/j³ and D is the derivative of the distribution at 0. Alexopoulos and Jacobson [13] developed algorithms that compute the distribution of the MST and the probability that a particular edge belongs to the MST when edge lengths follow discrete distributions. However, the running time of their algorithms may be exponential in the worst case. Recently,
Emek, Korman and Shavitt [40] showed that computing the kth moment of a class
of properties, including the diameter, radius and minimum spanning tree, admits an
FPRAS for each fixed k.
Chapter 4 ε-Kernel Coresets over Stochastic
Data
In this chapter, we initiate the study of constructing ε-kernel coresets for uncertain
points. An ε-kernel coreset approximates the width of a point set in any direction.
We consider approximating the expected width (an ε-exp-kernel), as well as the
probability distribution on the width (an (ε, τ)-quant-kernel) for any direction.
Then combining with known techniques, we show a few applications to approximating
the extent of uncertain functions, maintaining extent measures for stochastic moving
points and some stochastic shape fitting problems. We first briefly introduce how to
construct an ε-kernel coreset over deterministic point sets.
4.1 ε-Kernel Coresets over Deterministic Data
For a set P of n deterministic points, recall that we define f(P, ~u) = max_{s∈P} 〈~u, s〉 for any direction ~u ∈ Rd, and ω(P, ~u) = f(P, ~u) + f(P, −~u). Also recall that a subset Q ⊆ P is called an ε-kernel of P if for any direction ~u ∈ Rd, (1 − ε)ω(P, ~u) ≤ ω(Q, ~u) ≤ ω(P, ~u). In fact, there exists an ε-kernel of size O(ε^{−(d−1)/2}) with construction time O(n + ε^{−(d−3/2)}); see [26, 110]. In the following, we briefly review the construction.
By Barequet and Har-Peled [19], we first compute a bounding box B in linear time, whose volume is at most a constant times that of the minimum bounding box. By their construction, we also have the property that αB ⊆ ConvH(P) ⊆ B. Here, ConvH(P) is the convex hull of P and α > 0 is some constant depending on d. By applying an affine transformation, we assume that B = [−1, 1]d without loss of generality. This is because if M(S) is an ε-kernel of M(P) for a non-singular matrix M, then S must be an ε-kernel of P. Since αB ⊆ ConvH(P), we have ω(P, ~u) ≥ 2α for any direction ~u.
Let ε′ = √(εα). We first construct a set J of size O(ε′^{−(d−1)}) = O(ε^{−(d−1)/2}) on the sphere of radius √d + 1 centered at the origin, satisfying that for any point x on this sphere, there exists a point y ∈ J such that ‖x − y‖ ≤ ε′. Here, ‖x − y‖ is the Euclidean distance between x and y. Next, for each point y ∈ J, we compute a point φ(y) ∈ P which minimizes the distance ‖y − φ(y)‖. 1 The output is the collection S of all such φ(y), which is an ε-kernel of P. Note that the size of S is at most O(ε^{−(d−1)/2}).
We now briefly prove that the output is an ε-kernel. Fix a direction ~u and let s ∈ P be the point maximizing 〈~u, s〉. Suppose the ray starting from s in direction ~u intersects the sphere at point x. By the construction of J, there exists a point y ∈ J with ‖x − y‖ ≤ ε′. We then discuss the following two cases.

1. If φ(y) = s, then s ∈ S and f(P, ~u) = f(S, ~u).

2. If φ(y) ≠ s, we consider the ball of radius ‖y − s‖ centered at y. Since ‖y − φ(y)‖ ≤ ‖y − s‖ by the above algorithm, φ(y) must lie inside this ball. Let z be the point minimizing 〈~u, z′〉 over all points z′ in this ball. By previous work [26, 88], it can be shown that 〈~u, s〉 − 〈~u, z〉 ≤ αε. Thus, we have f(P, ~u) − f(S, ~u) ≤ 〈~u, s〉 − 〈~u, φ(y)〉 ≤ 〈~u, s〉 − 〈~u, z〉 ≤ αε.
The above two cases imply that ω(P, ~u)−ω(S, ~u) ≤ 2αε ≤ εω(P, ~u). So the output
S is indeed an ε-kernel of P .
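A much cruder 2D variant of this idea, which skips the sphere projection and simply keeps one extreme point of P per sampled direction, already exhibits the width guarantee. The following is an illustrative sketch with hypothetical data and our own parameter choices, not the construction analyzed above:

```python
# Simplified 2D kernel sketch: for each sampled direction u on the unit
# circle, keep a point of P maximizing <u, s>. Hypothetical data; not the
# sphere-projection construction described in the text.
import math
import random

def directional_kernel_2d(points, n_dirs=360):
    """Collect one extreme point per sampled direction."""
    kernel = set()
    for i in range(n_dirs):
        t = 2 * math.pi * i / n_dirs
        u = (math.cos(t), math.sin(t))
        kernel.add(max(points, key=lambda s: s[0] * u[0] + s[1] * u[1]))
    return list(kernel)

def width(points, u):
    """Directional width: spread of projections onto u."""
    proj = [s[0] * u[0] + s[1] * u[1] for s in points]
    return max(proj) - min(proj)

random.seed(0)
P = [(random.random(), random.random()) for _ in range(500)]
S = directional_kernel_2d(P)  # typically far smaller than P
```

Since S ⊆ P, the width of S never exceeds that of P; the point of the construction is the matching lower bound, which holds here up to an error governed by the direction-sampling resolution.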
In the above definition, we do not require that the points in S be independent. So when they are correlated, we will specify the distribution of S. If all points in P are deterministic and τ < 0.5, the above definition essentially boils down to requiring (1 − ε)ω(P , ~u) ≤ ω(S, ~u) ≤ (1 + ε)ω(P , ~u). Assuming the coordinates of the input points are bounded, an (ε, τ)-quant-kernel ensures that for any choice of ~u, the cumulative distribution function of ω(S, ~u) is within distance ε, under the Lévy metric, of that of ω(P , ~u). 2

1 In fact, we only need to compute an ε-approximate nearest neighbor φ(y) ∈ P of y, which improves the running time.
4.2 ε-Kernels for Expectations of Width
First recall the definition of ε-exp-kernel. Suppose P is a set of stochastic points (in
either the existential or locational uncertainty model). Define the expected directional
width of P in direction ~u to be ω(P , ~u) = EP∼P [ω(P, ~u)], where P ∼ P means that
P is a (random) realization of P .
Definition 15. For a constant ε > 0, a set S of (deterministic or stochastic) points
in Rd is called an ε-exp-kernel of P, if for all directions ~u ∈ Rd,
(1− ε)ω(P , ~u) ≤ ω(S, ~u) ≤ ω(P , ~u).
We first state our results in this section for the existential uncertainty model. All
results can be extended to the locational uncertainty model, with slightly different
bounds (essentially replacing the number of points n with the number of locations
m) or assumptions. We describe the differences for the locational model in the appendix.
For simplicity of exposition, we assume in this section that all points in P are in general position and all pi's are strictly between 0 and 1. For any s, s′ ∈ Rd, we use 〈s, s′〉 to denote the usual inner product ∑_{i=1}^d si s′i. For ease of notation, we write s ≻_{~u} s′ as a shorthand for 〈s, ~u〉 > 〈s′, ~u〉. For any direction ~u ∈ Rd, the binary relation ≻_{~u} defines a total order of all points in P. (Ties are broken in an arbitrary but consistent manner.) We call this order the canonical order of P with respect to ~u. For any two points s and s′, we use d(s, s′) or ‖s − s′‖ to denote their Euclidean distance. For any two sets of points A and B, the Minkowski sum of A and B is defined as A ⊕ B := {a + b | a ∈ A, b ∈ B}. Recall that for a set P of deterministic points and a direction ~u ∈ Rd, the support function is f(P, ~u) = max_{s∈P} 〈~u, s〉 and the directional width is ω(P, ~u) = f(P, ~u) + f(P, −~u). The
2 Assuming the coordinates of the input points are bounded, the requirement for an (ε, τ)-quant-kernel is in fact stronger than that of the Lévy distance being no larger than ε, as the former requires a multiplicative error on length, which gives a better guarantee when the length is small.
support function and the directional width only depend on the convex hull of P .
Lemma 16. Consider a set P of uncertain points in Rd (in either locational uncer-
tainty model or existential uncertainty model). There exists a set S of deterministic
points in Rd (which may not be a subset of P) such that ω(P , ~u) = ω(S, ~u) for all
~u ∈ Rd.
Proof. By the definition of the expected directional width of P, we have that

ω(P, ~u) = E_{P∼P}[ω(P, ~u)] = ∑_{P∼P} Pr[P] (f(P, ~u) + f(P, −~u)).

Consider the Minkowski sum M = M(P) := ∑_{P∼P} Pr[P] · ConvH(P), where ConvH(P)
is the convex hull of P (including the interior). It is well known that the Minkowski
sum of a set of convex sets is also convex. Moreover, it also holds that for all ~u ∈ Rd
(see e.g., [95]) f(M, ~u) = ∑_{P∼P} Pr[P] f(P, ~u). Hence, ω(P, ~u) = ω(M, ~u) for all ~u ∈ Rd.
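The identity used in this proof can be checked numerically on a tiny existential instance (the four points and probabilities below are our own toy data; one point is kept with probability 1 so that no realization is empty): the expected width equals the width of M, whose support function is the probability-weighted sum of the realizations' support functions.

```python
from itertools import product

# Toy existential instance in R^2; the anchor point has probability 1,
# which sidesteps the (probability-zero) empty realization.
pts   = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0), (1.0, 1.0)]
probs = [1.0, 0.5, 0.3, 0.8]

def support(P, u):
    return max(u[0] * x + u[1] * y for (x, y) in P)

def realizations():
    """Yield (probability, realized point set) over all 2^n outcomes."""
    for mask in product([0, 1], repeat=len(pts)):
        pr, P = 1.0, []
        for keep, s, p in zip(mask, pts, probs):
            pr *= p if keep else (1 - p)
            if keep:
                P.append(s)
        if pr > 0:
            yield pr, P

def f_M(u):
    """Support function of M = sum over P of Pr[P] * ConvH(P)."""
    return sum(pr * support(P, u) for pr, P in realizations())

def expected_width(u):
    neg = (-u[0], -u[1])
    return sum(pr * (support(P, u) + support(P, neg)) for pr, P in realizations())

u = (0.6, 0.8)
gap = abs(expected_width(u) - (f_M(u) + f_M((-u[0], -u[1]))))
```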
By the result in [7], we know that for any convex body in Rd, there exists an
ε-kernel of size O(ε−(d−1)/2). Combining with Lemma 16, we can immediately obtain
the following corollary, which is the first half of Theorem 2.
Corollary 17. For any ε > 0, there exists an ε-exp-kernel of size O(ε−(d−1)/2).
Recall that in Lemma 16, the Minkowski sum M = ∑_{P∼P} Pr[P] · ConvH(P). Since
M is the Minkowski sum of exponentially many convex polytopes, M is also a
convex polytope. At first sight, the complexity of M (i.e., its number of vertices) could
be exponential. However, as we will show shortly, the complexity of M is in fact
polynomial.
We need some notations first. For each pair (s, s′) of points in P, consider the hyperplane H_{s,s′} that passes through the origin and is orthogonal to the line connecting
s and s′. We call these (n choose 2) hyperplanes the separating hyperplanes induced by P and
use Γ to denote the set. Each such hyperplane divides Rd into 2 halfspaces. For all
directions ~u ∈ Rd in each halfspace, the order of 〈s, ~u〉 and 〈s′, ~u〉 is the same (i.e., we
have s ≻_~u s′ in one halfspace and s′ ≻_~u s in the other). Those hyperplanes in Γ pass
through the origin and thus partition Rd into d-dimensional polyhedral cones.³ We
denote this arrangement as A(Γ).
Consider an arbitrary cone C ∈ A(Γ). Let intC denote the interior of C. We can
see that for all directions ~u ∈ intC, the canonical order of P with respect to ~u is the
same (since all directions ~u ∈ intC lie in the same set of halfspaces). We use |M | to
denote the complexity of M , i.e., the number of vertices in ConvH(M).
Lemma 18. Assuming the existential model and p_i ∈ (0, 1) for all s_i ∈ P, the complexity of M is the same as the cardinality of A(Γ), i.e., |M| = |A(Γ)|. Moreover,
each cone C ∈ A(Γ) corresponds to exactly one vertex s of ConvH(M) in the following sense: the gradient ∇f(M, ~u) = s for all ~u ∈ int C (note that here s should be
understood as a vector).
Proof. We have shown that M is a convex polytope. We first note that the support
function uniquely defines a convex body (see e.g., [95]). We need the following well-known fact in convex geometry (see e.g., [49]): For any convex polytope M, Rd can be
divided into exactly |M| polyhedral cones (of dimension d, ignoring the boundaries),
such that each such cone C_s corresponds to a vertex s of M, and for each vector
~u ∈ C_s, it holds that f(M, ~u) = 〈~u, s〉 (i.e., the maximum of f(M, ~u) = max_{s′∈M}〈~u, s′〉 is
achieved by s for all ~u ∈ C_s).⁴ See Figure 4-1 for an example in R². Hence, for each
~u ∈ int C_s the gradient of the support function (as a function of ~u) is exactly s:

∇f(M, ~u) = (∂f(M, ~u)/∂~u_j)_{j∈[d]} = (∂〈~u, s〉/∂~u_j)_{j∈[d]} = (∂(∑_{j∈[d]} s_j ~u_j)/∂~u_j)_{j∈[d]} = s,   (4.1)

where ~u_j is the jth coordinate of ~u. With a slight abuse of notation, we denote the set
of cones defined above by A(M).
Now, consider a cone C ∈ A(Γ). We show that for all ~u ∈ int C, ∇f(M, ~u) is a
distinct constant vector independent of ~u. In fact, we know that f(M, ~u) = f(P, ~u) =
∑_{s∈P} Pr_R(s, ~u)〈s, ~u〉, where Pr_R(s, ~u) = p_s ∏_{s′ ≻_~u s}(1 − p_{s′}). For all ~u ∈ int C, the
Pr_R(s, ~u) value is the same since the value only depends on the canonical order with
respect to ~u, which is the same for all ~u ∈ C. Hence, we get that for all ~u ∈ int C,

∇f(M, ~u) = ∑_{s∈P} Pr_R(s, ~u) s,   (4.2)

³ We ignore the lower-dimensional cells in the arrangement.
⁴ One intuitive way to see this is as follows: the support function of a polytope is just the upper envelope of a finite set of linear functions, thus a piecewise linear function, and the domain of each piece is a polyhedral cone. In fact, we call such a cone C_s an outer normal cone.
which is a constant independent of ~u. We prove the lemma by showing that the
gradient ∇f(M, ~u) must be different for two adjacent cones C_1, C_2 (separated by
some hyperplane in Γ) in A(Γ). Suppose ~u_1 ∈ int C_1 and ~u_2 ∈ int C_2. Consider the
canonical orders O_1 and O_2 of P with respect to ~u_1 and ~u_2, respectively. Since C_1 and
C_2 are adjacent, O_1 and O_2 only differ by one swap of adjacent vertices. W.l.o.g.,
assume that O_1 = s_1, . . . , s_i, s_{i+1}, . . . , s_n and O_2 = s_1, . . . , s_{i+1}, s_i, . . . , s_n. Using
(4.2), we get that

∇f(M, ~u_1) − ∇f(M, ~u_2) = Pr_R(s_i, ~u_1)s_i + Pr_R(s_{i+1}, ~u_1)s_{i+1} − Pr_R(s_i, ~u_2)s_i − Pr_R(s_{i+1}, ~u_2)s_{i+1}
= D · (p_i s_i + (1 − p_i)p_{i+1} s_{i+1} − p_{i+1} s_{i+1} − (1 − p_{i+1})p_i s_i)
= D · p_i p_{i+1}(s_i − s_{i+1}) ≠ 0,

where D = ∏_{j=1}^{i−1}(1 − p_j) ≠ 0.
In summary, we have shown in the first paragraph that ∇f(M, ~u) is piecewise
constant, with a distinct constant in each cone in A(M). The same also holds for
A(Γ). This is only possible if A(Γ) (viewed as a partition of Rd) partitions Rd
exactly the same way as A(M) does. Hence, we have A(Γ) = A(M) and the lemma
follows immediately.
Since O(n²) hyperplanes passing through the origin can divide Rd into at most
O((n² choose d−1)) d-dimensional polyhedral cones (see e.g., [12]), we immediately obtain the
following corollary.

Corollary 19. It holds that |M| ≤ O((n² choose d−1)) = O(n^{2d−2}).
The proof of Lemma 18 can be easily made constructive. We only need to compute
Figure 4-1: The figure depicts a pentagon M in R² to illustrate some intuitive facts in convex geometry. (1) The plane can be divided into 5 cones C_1, . . . , C_5 by 5 angles θ_1, . . . , θ_5, where ~u_{θi} is the unit direction corresponding to angle θ_i. Each cone C_i corresponds to a vertex s_i, and for any direction ~u ∈ C_i, f(M, ~u) = 〈~u, s_i〉 and the vector ∇f(M, ~u) is s_i. (2) Each direction θ_i is perpendicular to an edge of M. M = ∩_{i=1}^5 H_i, where H_i is the supporting halfplane with normal vector ~u_{θi}.
the set Γ of all O(n²) hyperplanes and the arrangement A(Γ) in O(n^{2d−2}) time (see
e.g., [12, 39]). Given each cone C ∈ A(Γ), we can calculate ∇f(M, ~u) for any ~u ∈ C,
which gives exactly one vertex of M by (4.1), in O(n log n) time using the algorithm
described in Lemma 49.
Theorem 20. In Rd for constant d, the polytope M, which defines f(P, ~u) for any
direction ~u, can be described with O(n^{2d−2}) vertices in Rd, and can be computed in
O(n^{2d−1} log n) time. In R², the running time can be improved to O(n² log n).
The improved running time in R2 is derived in Lemma 50 by carefully constructing
each vertex of M in O(1) time using its neighboring vertex. The extra O(log n) is
needed to sort the vertices of M to determine neighbors.
4.2.1 A Nearly Linear Time Algorithm for Constructing ε-exp-kernels
Now, we prove the main algorithmic result of this section (Theorem 2): we can find an
ε-exp-kernel in nearly linear time. If we already had the Minkowski sum M, we
could directly use the algorithm in [7] to find an ε-kernel for M. However, constructing
M explicitly takes O(n^{2d−1} log n) time according to Theorem 20, and this cannot be
improved in general as the complexity of M is O(n^{2d−2}). Therefore, in order to
achieve a nearly linear time coreset construction, we cannot compute M explicitly.
For ease of description, we first consider the existential uncertainty model. The details
for the locational uncertainty model can be found in Section 4.6.
Theorem 21 (second half of Theorem 2, for the existential model). Let P be a set of n uncertain points in Rd under existential uncertainty. An ε-exp-kernel of size O(ε^{−(d−1)/2})
for P can be constructed in O(ε^{−(d−1)} n log n) time.
The following simple lemma provides an efficient procedure for finding the extreme
vertex of M along any given direction, and is useful in several places later as well.
Lemma 22. Given any direction ~u ∈ Rd, we can find in O(n log n) time the vertex
s⋆ ∈ M at which 〈s, ~u〉 is maximized over all s ∈ M.
Proof. Fix an arbitrary direction ~u ∈ Rd. From the proof of Lemma 18 (in particular
(4.1)), we know that the vertex s⋆ ∈ M that maximizes 〈s, ~u〉 can be computed as
s⋆ = ∇f(M, ~u). Using (4.2), ∇f(M, ~u) can be easily computed in O(n log n) time (see
Lemma 49 for the details).
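The computation behind Lemma 22 can be sketched as follows (toy data ours): sort the points by projection, accumulate Pr_R(s, ~u) = p_s ∏_{s′ ≻_~u s}(1 − p_{s′}) in one pass, and compare 〈~u, ∇f(M, ~u)〉 against a brute-force evaluation of f(M, ~u) over all realizations:

```python
from itertools import product

pts   = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0), (1.0, 1.0)]
probs = [0.9, 0.5, 0.3, 0.8]

def extreme_vertex(u):
    """Compute grad f(M, u) = sum_s Pr_R(s, u) * s in O(n log n):
    Pr_R(s, u) is the probability that s is present and every point
    ranked before it in the canonical order is absent."""
    order = sorted(range(len(pts)),
                   key=lambda i: u[0] * pts[i][0] + u[1] * pts[i][1],
                   reverse=True)
    vx, vy, none_before = 0.0, 0.0, 1.0
    for i in order:
        pr = none_before * probs[i]
        vx += pr * pts[i][0]
        vy += pr * pts[i][1]
        none_before *= 1 - probs[i]
    return (vx, vy)

def f_M(u):
    """Brute force E[max projection]; the empty realization contributes 0."""
    total = 0.0
    for mask in product([0, 1], repeat=len(pts)):
        pr, best = 1.0, None
        for keep, s, p in zip(mask, pts, probs):
            pr *= p if keep else (1 - p)
            if keep:
                proj = u[0] * s[0] + u[1] * s[1]
                best = proj if best is None else max(best, proj)
        if best is not None:
            total += pr * best
    return total

u = (0.8, 0.6)
v = extreme_vertex(u)
```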
Next, we need to find an affine transform T such that the convex polytope M ′ =
T (M) is α-fat for some constant α. We recall that a set P of points is α-fat, for
some constant α ≤ 1, if there exists a point x ∈ Rd, and a unit hypercube C centered
at x such that αC ⊂ ConvH(P ) ⊂ C. According to Chapter 22 in [58], in order to
construct such T , it suffices to identify two points in M such that their distance is
a constant approximation of the diameter of M . The following lemma (proven in
Section 4.6) shows this can be done without computing M explicitly.
Lemma 23. We can find an affine transform T in O(2^{O(d)} n log n) time such that the
convex polytope M′ = T(M) is α-fat for some constant α (α may depend on d).
After obtaining T , we apply T to P in linear time. Notice that M ′ = T (M(P)) =
M(T (P)). Therefore, Lemma 22 also holds for M ′ (i.e., we can search over M ′ the
maximum vertex in any given direction in O(n log n) time).
Let δ = O(εα/d). We compute a set I of O(δ^{−(d−1)}) = O(ε^{−(d−1)}) points on the
unit sphere S^{d−1} such that for any point s ∈ S^{d−1}, there is a point s′ ∈ I with
‖s − s′‖ ≤ δ (see e.g., [10, 25]). For each s in I, we include −s in I as well. For each
vector s ∈ I, we compute x(s) = arg max_{x∈M′}〈x, s〉. Based on the previous discussion,
all {x(s)}_{s∈I} can be computed in O(δ^{−(d−1)} n log n) = O(ε^{−(d−1)} n log n) time.
Lemma 24. S = {x(s)}_{s∈I} is an ε-kernel for M′.⁵
Finally, we run existing ε-kernel algorithms [26, 88] in O(|S|) time to
further reduce the size of S to O(ε^{−(d−1)/2}), which finishes the proof of Theorem 21.
Lemma 18 and Theorem 21 hold for the locational uncertainty model as well; the details
can be found in Section 4.6.
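Putting the pieces together, a minimal 2D version of the construction (toy data ours) takes extreme vertices of M along a net of directions. We skip the fattening transform of Lemma 23 here, and therefore certify only the additive Lipschitz error bound 4Rθ (R the radius of a ball containing M, θ the net spacing) rather than the multiplicative (1 − ε) guarantee:

```python
import math
import random

pts   = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0), (1.0, 1.0), (-1.0, 1.5)]
probs = [0.9, 0.5, 0.3, 0.8, 0.6]

def extreme_vertex(u):
    """Vertex of M extreme in direction u (the O(n log n) routine of Lemma 22)."""
    order = sorted(range(len(pts)),
                   key=lambda i: u[0] * pts[i][0] + u[1] * pts[i][1],
                   reverse=True)
    vx, vy, none_before = 0.0, 0.0, 1.0
    for i in order:
        pr = none_before * probs[i]
        vx += pr * pts[i][0]
        vy += pr * pts[i][1]
        none_before *= 1 - probs[i]
    return (vx, vy)

def omega_M(u):
    a, b = extreme_vertex(u), extreme_vertex((-u[0], -u[1]))
    return u[0] * (a[0] - b[0]) + u[1] * (a[1] - b[1])

m = 400                                      # directions in the net on S^1
S = [extreme_vertex((math.cos(2 * math.pi * k / m),
                     math.sin(2 * math.pi * k / m))) for k in range(m)]

def width(P, u):
    proj = [u[0] * x + u[1] * y for (x, y) in P]
    return max(proj) - min(proj)

theta = math.pi / m                          # max angle to the nearest net direction
R = max(math.hypot(x, y) for (x, y) in pts)  # M lies in the ball of radius R
random.seed(0)
ok = True
for _ in range(50):
    a = random.uniform(0, 2 * math.pi)
    u = (math.cos(a), math.sin(a))
    w, wS = omega_M(u), width(S, u)
    # S is a subset of M (upper bound); Lipschitz-ness of f(M, .) (lower bound).
    ok = ok and (wS <= w + 1e-9) and (wS >= w - 4 * R * theta - 1e-9)
```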
4.2.2 ε-exp-kernel Under the Subset Constraint
First, we show that under the subset constraint (i.e., the ε-exp-kernel is required
to be a subset of the original point set, with the same probability distribution for
each chosen point), no small ε-exp-kernel exists in general.⁶
Lemma 25. For some constant ε > 0, there exists a set P of stochastic points such
that no o(n)-size ε-exp-kernel exists for P under the subset constraint (for both
the locational model and the existential model).
Proof. To see this in the existential uncertainty model, simply consider n points, each
with existence probability 1/n. n/2 of them co-locate at the origin and the other n/2
of them co-locate at x = 1. It is not hard to see that the expected length of the
diameter is Ω(1) but the expected length of the diameter of any o(n) size subset is
only o(1) (with high probability, no point would even appear).
The case for the locational uncertainty model is similarly simple. Again, consider n
points. For each point, with probability 1/n, it appears at x = 1. Otherwise, its
position is the origin (with probability 1−1/n). It is not hard to see that the expected
length of the diameter of the original point set is Ω(1), while that of any o(n) size
subset is only o(1) (with high probability, no point would realize at x = 1).
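The counting argument behind this lemma is easy to confirm by simulation (the parameters below are our own choices):

```python
import random

random.seed(1)
n, trials = 200, 5000   # each point exists with prob 1/n; half at x=0, half at x=1

def expected_diameter(indices):
    """Monte-Carlo estimate of E[diameter] for the chosen subset of points."""
    total = 0
    for _ in range(trials):
        left  = any(random.random() < 1 / n for i in indices if i < n // 2)
        right = any(random.random() < 1 / n for i in indices if i >= n // 2)
        total += 1 if (left and right) else 0   # diameter is 1 iff both sides appear
    return total / trials

full  = expected_diameter(range(n))
small = expected_diameter(list(range(15)) + list(range(n // 2, n // 2 + 15)))
```

Here `full` concentrates around (1 − (1 − 1/n)^{n/2})² = Ω(1), while the 30-point subset rarely sees both sides, so its expected diameter is o(1).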
⁵ This is a folklore result. A proof of the 2D case can be found in [31]. The general case is a straightforward extension, and we provide a proof in Section 4.6 for completeness.
⁶ If we require the ε-exp-kernel to be a subset of the original point set, but with possibly different probabilities for the chosen points, we do not know whether a small ε-exp-kernel always exists.
In light of the above negative result, we make the following β-assumption: we assume each possible location realizes a point with probability at least β, for a constant
β > 0. The proof of the following theorem can be found in Section 4.6.
Theorem 26. Under the β-assumption, in the existential uncertainty model, there is
an ε-exp-kernel in Rd of size O(β^{−(d−1)} ε^{−(d−1)/2} log(1/ε)) that satisfies the subset
constraint.
4.3 ε-Kernels for Probability Distributions of Width
Recall that S is an (ε, τ)-quant-kernel if for all x ≥ 0 and all directions ~u,

Pr_{P∼P}[ω(P, ~u) ≤ (1 − ε)x] − τ ≤ Pr_{S∼S}[ω(S, ~u) ≤ x] ≤ Pr_{P∼P}[ω(P, ~u) ≤ (1 + ε)x] + τ.

For ease of notation, we sometimes write Pr[ω(P, ~u) ≤ t] to denote Pr_{P∼P}[ω(P, ~u) ≤ t], and abbreviate the
above as Pr[ω(S, ~u) ≤ x] ∈ Pr[ω(P, ~u) ≤ (1 ± ε)x] ± τ. We first provide a simple
linear time algorithm for constructing an (ε, τ)-quant-kernel for both existential
and locational models, in Section 4.3.1. The points in the constructed kernel are
not independent. Then, for existential models, we provide a nearly linear time (ε, τ)-quant-kernel construction where all stochastic points in the kernel are independent,
in Section 4.3.2.
4.3.1 A Simple (ε, τ)-quant-kernel Construction
In this section, we show a linear time algorithm for constructing an (ε, τ)-quant-
kernel for any stochastic model if we can sample a realization from the model in
linear time (which is true for both locational and existential uncertainty models).
Algorithm: coreset. Let N = O(τ^{−2} ε^{−(d−1)} log(1/ε)). We sample N independent realizations from the stochastic model. Let H_i be the convex hull of the present points
in the ith realization. For H_i, we use the algorithm in [7] to find a deterministic
ε-kernel E_i of size O(ε^{−(d−1)/2}). Our (ε, τ)-quant-kernel S is the following simple
stochastic model: with probability 1/N, all points in E_i are present. Hence, S consists
of O(τ^{−2} ε^{−3(d−1)/2} log(1/ε)) points (two such points either co-exist or are mutually exclusive). Hence, for any direction ~u, Pr[ω(S, ~u) ≤ t] = (1/N) ∑_{i=1}^N I(ω(E_i, ~u) ≤ t), where
I(·) is the indicator function.
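A stripped-down version of this sampling construction can be sketched as follows (the random instance is ours; for simplicity each sampled realization E_i is kept whole rather than being further reduced to a deterministic ε-kernel, so only the τ part of the guarantee is exercised):

```python
import random

random.seed(2)
pts   = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(30)]
probs = [0.5] * 30
u = (1.0, 0.0)

def realize():
    """Draw one realization of the existential model."""
    return [s for s, p in zip(pts, probs) if random.random() < p]

def width(P, u):
    if not P:
        return 0.0
    proj = [u[0] * x + u[1] * y for (x, y) in P]
    return max(proj) - min(proj)

# The kernel: N sampled realizations, each kept as one block with prob 1/N.
N = 3000
S = [realize() for _ in range(N)]

def cdf(samples, t):
    """Empirical CDF of the width in direction u."""
    return sum(width(P, u) <= t for P in samples) / len(samples)

truth = [realize() for _ in range(20000)]   # fresh samples approximate Pr over P
deviations = [abs(cdf(S, t) - cdf(truth, t)) for t in (0.2, 0.5, 0.8, 0.95)]
```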
For a realization P ∼ P , we use E(P ) to denote the deterministic ε-kernel for P .
So, E(P ) is a random set of points, and we can think of E1, . . . , EN as samples from
the random set. Now, we show S is indeed an (ε, τ)-quant-kernel. We start with
the following simple observation.
Observation 27. For any t ≥ 0 and any direction ~u, we have that

Pr[ω(P, ~u) ≤ t] ≤ Pr_{P∼P}[ω(E(P), ~u) ≤ t] ≤ Pr[ω(P, ~u) ≤ (1 + ε)t].
Proof. For any realization P of P, we have (1/(1 + ε)) ω(P, ~u) ≤ ω(E(P), ~u) ≤ ω(P, ~u). The
observation follows by combining all realizations.
We only need to show that S is an (ε, τ)-quant-kernel for E(P ). We need the
following two theorems.
Theorem 28 (Theorem 5.22 in [58], VC-dimension). Let S_1 = (X, R_1), . . . , S_k = (X, R_k) be range spaces with VC-dimension δ_1, . . . , δ_k, respectively. Next, let f(r_1, . . . , r_k)
be a function that maps any k-tuple of sets r_1 ∈ R_1, . . . , r_k ∈ R_k into a subset of X.
Consider the range set

R′ = {f(r_1, . . . , r_k) | r_1 ∈ R_1, . . . , r_k ∈ R_k}

and the associated range space (X, R′). Then, the VC-dimension of (X, R′) is bounded
by O(kδ log k), where δ = max_i δ_i.
Suppose (X, R) is a range space and µ is a probability measure over X. We say a
subset C ⊂ X is an ε-approximation of the range space if for any range R ∈ R, we have
|µ_C(R) − µ(R)| ≤ ε, where µ_C(R) = |C ∩ R|/|C|. We need the following celebrated
uniform convergence result, first established by Vapnik and Chervonenkis [105].
Theorem 29 (See Theorem 4.9 in [14]). Suppose (X, R) is any range space with
VC-dimension at most V, where |X| is finite and µ is a probability measure defined
over X. For any ε, δ > 0, a random subset C ⊆ X (drawn according to µ) of cardinality
s = O(ε^{−2}(V + log(1/δ))) is an ε-approximation for X with probability 1 − δ.
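The following sketch (data ours) illustrates Theorem 29 for the halfplane range space in R², spot-checking random halfplanes rather than computing the true supremum:

```python
import math
import random

random.seed(3)
X = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5000)]  # ground set
C = random.sample(X, 2000)   # a random subset: an eps-approximation w.h.p.

def measure(points, a, b, c):
    """Fraction of points in the halfplane a*x + b*y <= c."""
    return sum(a * x + b * y <= c for (x, y) in points) / len(points)

worst = 0.0
for _ in range(200):         # spot-check many halfplanes
    t = random.uniform(0, 2 * math.pi)
    a, b, c = math.cos(t), math.sin(t), random.uniform(-2, 2)
    worst = max(worst, abs(measure(X, a, b, c) - measure(C, a, b, c)))
```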
Now, we are ready to prove the main lemma in this section.
Lemma 30. Let N = O(τ^{−2} ε^{−(d−1)} log(1/ε)). For any t ≥ 0 and any direction ~u,
Pr[ω(S, ~u) ≤ t] ∈ Pr_{P∼P}[ω(E(P), ~u) ≤ t] ± τ.
Proof. Let L = O(ε^{−(d−1)/2}). We first note that E(P) has at most n^L possible
realizations since each ε-kernel is of size at most L. We build a mapping g that
maps each realization E(P) to a point in R^{dL}, as follows. Consider a realization P of
P. Suppose E(P) = {(x^1_1, . . . , x^1_d), . . . , (x^L_1, . . . , x^L_d)} (if |E(P)| < L, we pad it with
(0, . . . , 0)). We let

g(E(P)) = (x^1_1, . . . , x^1_d, . . . , x^L_1, . . . , x^L_d) ∈ R^{dL}.
For any t ≥ 0 and any direction ~u ∈ Rd, note that ω(E(P), ~u) ≥ t holds if and only if
there exist some 1 ≤ i, j ≤ |E(P)| with i ≠ j such that ∑_{k=1}^d (x^i_k − x^j_k)~u_k ≥ t, which is
equivalent to saying that the point g(E(P)) is in the union of those O(|E(P)|²) halfspaces
(for each i, j, we have one such halfspace).

Let X be the image set of g. Let (X, R_{i,j}) (1 ≤ i, j ≤ L, i ≠ j) be a range space,
where R_{i,j} is the set of halfspaces {~u = (~u_1, . . . , ~u_d) ∈ Rd | ∑_{k=1}^d (x^i_k − x^j_k)~u_k ≥ t}.
Let R′ = {∪_{i,j∈[L]} r_{i,j} | r_{i,j} ∈ R_{i,j}}. Note that each (X, R_{i,j}) has VC-dimension
d + 1. By Theorem 28, we can see that the VC-dimension of (X, R′) is bounded by
O((d + 1)L² log L²) = O(ε^{−(d−1)} log(1/ε)). Notice that S = {E_1, . . . , E_N} is a collection
of samples from E(P). Hence, by Theorem 29, for any t and any direction ~u, we have
that Pr[ω(S, ~u) ≤ t] ∈ Pr_{P∼P}[ω(E(P), ~u) ≤ t] ± τ.
Combining Observation 27 and Lemma 30, we obtain the following theorem.
Theorem 31. Let N = O(τ^{−2} ε^{−(d−1)} log(1/ε)). For any t ≥ 0 and any direction ~u,
we have that

Pr[ω(S, ~u) ≤ t] ∈ Pr[ω(P, ~u) ≤ (1 ± ε)t] ± τ.
Running time. In each sample, the size of the ε-kernel E_i is at most O(ε^{−(d−1)/2}).
Note that we can compute E_i in O(n + ε^{−(d−3/2)}) time [26, 88]. We take
O(τ^{−2} ε^{−(d−1)} log(1/ε)) samples in total. So the overall running time is
O(n τ^{−2} ε^{−(d−1)} log(1/ε) + poly(1/(ετ))) = Õ(n τ^{−2} ε^{−(d−1)}). In summary, we obtain
our main result for (ε, τ)-quant-kernels in this subsection.
Theorem 4 (restated). An (ε, τ)-quant-kernel of size O(τ^{−2} ε^{−3(d−1)/2}) can be
constructed in Õ(n τ^{−2} ε^{−(d−1)}) time, under both existential and locational uncertainty
models.
4.3.2 Improved (ε, τ)-quant-kernel for Existential Models
In this section, we show that an (ε, τ)-quant-kernel S can be constructed in nearly linear time for the existential model, with all points in S independent of each other.
The size bound O(τ^{−2} ε^{−(d−1)}) (see Theorem 5) is better than that in Theorem 4 for
the general case, and the independence property may be useful in certain applications. Moreover, some of the insights developed in this section may be of independent
interest (e.g., the connection to Tukey depth). Due to the independence requirement,
the construction is somewhat more involved. For ease of description, we first assume
the Euclidean plane. All results can be easily extended to Rd. We also assume
that all probability values are strictly between 0 and 1 and that 0 < ε, τ ≤ 1/2 are fixed
constants.
Let λ(P) = ∑_{s_i∈P} (− ln(1 − p_i)). In the following, we present two algorithms.
The first algorithm works for any λ(P) and produces an (ε, τ)-quant-kernel S
whose size depends on λ(P). In Section 4.3.2, we present the second algorithm, which
only works for λ(P) ≥ 3 ln(2/τ) but produces an (ε, τ)-quant-kernel S of
constant size (the constant only depends on ε, τ and δ). Thus, we can get a constant-size (ε, τ)-quant-kernel by running the first algorithm when λ(P) ≤ 3 ln(2/τ) and
running the second algorithm otherwise.
Algorithm 1: For Any λ(P)
In this section, we present the first algorithm, which works for any λ(P). We can
think of each point s as associated with a Bernoulli random variable X_s that takes value
1 with probability p_s and 0 otherwise. Now, we replace the Bernoulli random variable
X_s by a Poisson distributed random variable (also denoted X_s) with parameter λ_s = − ln(1 − p_s)
(denoted by Pois(λ_s)), i.e., Pr[X_s = k] = (1/k!) λ_s^k e^{−λ_s}, for k = 0, 1, 2, . . . . Here, X_s = k
means that there are k realized points located at the position of s. We call the new
instance the Poissonized instance corresponding to P. We can check that Pr[X_s = 0] = e^{−λ_s} = 1 − p_s, matching the Bernoulli variable. Also note that co-located points do not affect any
directional width, so the Poissonized instance is essentially equivalent to the original
instance for our problem.
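The equivalence claimed here rests on an elementary fact, which the sketch below checks: Pr[Pois(λ_s) = 0] = e^{−λ_s} = 1 − p_s, so each location is occupied (by at least one realized copy) with exactly its original probability:

```python
import math
import random

for p in (0.1, 0.37, 0.8):
    lam = -math.log(1 - p)
    # The zero class of Pois(lam) has mass e^{-lam} = 1 - p, so the location
    # is occupied by at least one realized copy with probability exactly p.
    assert abs(math.exp(-lam) - (1 - p)) < 1e-12

def poisson_occupied(lam):
    """Sample the event {Pois(lam) >= 1} by inversion on the zero class."""
    return random.random() > math.exp(-lam)

random.seed(4)
p = 0.37
lam = -math.log(1 - p)
hits = sum(poisson_occupied(lam) for _ in range(200000))
rate = hits / 200000
```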
The construction of the (ε, τ)-quant-kernel S is as follows. Let A be the probability measure over all points in P defined by A(s) = λ_s/λ for every s ∈ P, where
λ := λ(P) = ∑_{s∈P} λ_s. Let τ_1 be a small positive constant to be fixed later. We
take N = O(τ_1^{−2}) independent samples from A (we allow more than one point to
be co-located at the same position), and let B be the empirical measure, i.e., each
sample point having probability 1/N. The coreset S consists of the N sample points
in B, each with the same existential probability 1 − exp(−λ/N). A useful alternative
view of S is to think of each point v as associated with a random variable Y_v following distribution Pois(λ/N) (i.e., the Poissonized instance corresponding to S). This finishes
the description of the construction.
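A minimal sketch of this construction reads as follows (the random instance is ours); as a sanity check it verifies that the kernel preserves the total Poisson mass, and hence the probability that no point at all is realized, which is e^{−λ} on both sides:

```python
import math
import random

random.seed(5)
pts   = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(40)]
probs = [random.uniform(0.1, 0.6) for _ in range(40)]

lam_s = [-math.log(1 - p) for p in probs]
lam   = sum(lam_s)

N = 500                                    # N = O(tau_1^{-2}) in the analysis
S_pos = random.choices(pts, weights=lam_s, k=N)  # i.i.d. samples from A
q = 1 - math.exp(-lam / N)                 # existential prob. of each kernel point

# Total Poisson mass is preserved, hence so is Pr[no point realized]:
p_empty_orig   = math.prod(1 - p for p in probs)  # = e^{-lam}
p_empty_kernel = (1 - q) ** N                     # = (e^{-lam/N})^N = e^{-lam}
```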
Now, we start the analysis. Our goal is to show that S is indeed an (ε, τ)-quant-
kernel. The following theorem is a special case of Theorem 29 (specialized to the
range space consisting of all halfplanes), which shows that the empirical measure B
is close to the original measure A with respect to all half spaces.
Theorem 32 ([14, 82]). We denote the set of all halfplanes by H. With probability
1 − δ, the empirical measure B (defined by N = O(τ_1^{−2} log(1/δ)) independent samples)
satisfies the following:

sup_{H∈H} |A(H) − B(H)| ≤ τ_1.
From now on, we assume that B satisfies the statement of Theorem 32. We first
observe a simple but useful lemma, which is a consequence of Theorem 32. For a
halfplane H, we use H |= 0 to denote the event that no point is realized in H.
Lemma 33. With probability 1 − δ, for any halfplane H ∈ H, we have that

Pr_S[H |= 0] ∈ (1 ± O(λτ_1)) Pr_P[H |= 0].
Proof. Fix an arbitrary halfplane H ∈ H. Consider the Poissonized instance corresponding to P. We first observe that Pr_P[H |= 0] = Pr_P[∑_{s∈P∩H} X_s = 0]. Since X_s
follows distribution Pois(λ_s), ∑_{s∈P∩H} X_s follows the Poisson distribution Pois(∑_{s∈P∩H} λ_s).
Similarly, we have that Pr_S[H |= 0] = Pr_S[∑_{v∈S∩H} Y_v = 0] since ∑_{v∈S∩H} Y_v follows
Pois(∑_{v∈S∩H} λ/N). Hence, we can see the following:

Pr_P[H |= 0] = exp(−∑_{s∈P∩H} λ_s) = exp(−λ A(H))
∈ exp(−λ(B(H) ± τ_1)) = exp(−∑_{v∈S∩H} λ/N ± τ_1 λ)
∈ (1 ± O(λτ_1)) exp(−∑_{v∈S∩H} λ/N) = (1 ± O(λτ_1)) Pr_S[H |= 0].

The first inequality follows from Theorem 32 and the second is due to the fact that
e^{−ε} ≥ 1 − ε and e^{ε} ≤ 1 + (e − 1)ε for any 0 < ε < 1.
For two real-valued random variables X, Y, we define the Kolmogorov distance
d_K(X, Y) between X and Y to be d_K(X, Y) = sup_{t∈R} |Pr[X ≤ t] − Pr[Y ≤ t]|. We
also need the following simple lemma.
Lemma 34. Suppose we have four independent random variables X, X′, Y and Y′
such that d_K(X, X′) ≤ ε and d_K(Y, Y′) ≤ ε for some ε ≥ 0. Then, d_K(X + Y, X′ + Y′) ≤ 2ε.
Proof. We need the following useful elementary fact about the Kolmogorov distance: let
X, Y, Z be real-valued random variables such that X is independent of Y and independent of Z. Then we have that d_K(X + Y, X + Z) ≤ d_K(Y, Z). The rest of the proof is
straightforward: d_K(X + Y, X′ + Y′) ≤ d_K(X + Y, X + Y′) + d_K(X + Y′, X′ + Y′) ≤ 2ε.
The first inequality is the triangle inequality.
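For intuition, both the lemma and the distance itself can be verified exactly on small integer-supported distributions (the pmfs below are our own examples):

```python
def dK(pmf1, pmf2, hi=20):
    """Kolmogorov distance between two pmfs supported on {0, ..., hi}."""
    best = c1 = c2 = 0.0
    for t in range(hi + 1):
        c1 += pmf1.get(t, 0.0)
        c2 += pmf2.get(t, 0.0)
        best = max(best, abs(c1 - c2))
    return best

def convolve(p, q):
    """pmf of the sum of two independent integer random variables."""
    out = {}
    for a, pa in p.items():
        for b, qb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * qb
    return out

X, Xp = {0: 0.5, 1: 0.5}, {0: 0.45, 1: 0.55}   # dK(X, X')  = 0.05
Y, Yp = {0: 0.7, 2: 0.3}, {0: 0.65, 2: 0.35}   # dK(Y, Y')  = 0.05
eps = max(dK(X, Xp), dK(Y, Yp))
```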
Now, we are ready to show that S is indeed an (ε, τ)-quant-kernel. We note
that in this subsection our bound is stronger than the defining guarantee in that we do not need to relax
the length threshold. We first prove the theorem under a simplified assumption: we
assume that there is a point s⋆ ∈ R² (not necessarily an input point), which we call
the special point, that lies in the convex hull of P with probability at least 1 − δ/2.
With this assumption, the proof is much simpler but still instructive, as the analysis in
Section 4.3.2 is an extension of this proof. The general case is proved in Theorem 36;
its proof is more technical and its size bound is slightly worse.
Theorem 35. Assume that there is a special point s⋆ ∈ R² that lies in the convex
hull of P with probability at least 1 − δ/2. The parameters of the algorithm are set as

τ_1 = O(τ/λ) and N = O((1/τ_1²) log(1/δ)) = O((λ²/τ²) log(1/δ)).

With probability at least 1 − δ, for any t ≥ 0 and any direction ~u, we have that

Pr[ω(S, ~u) ≤ t] ∈ Pr[ω(P, ~u) ≤ t] ± τ.   (4.3)
Proof. We first condition on the event that s⋆ is in the convex hull of all realized
points (which happens with probability at least 1 − δ/2). The remainder needs to
hold with probability at least 1 − δ/2. Under this conditioning, we can pretend that s⋆
is a deterministic point in the original point set (this does not affect any directional
width as s⋆ is in the convex hull).

Fix an arbitrary direction ~u (w.l.o.g., say it is the x-axis). Rename all points as
s_1, s_2, . . . , s_n according to the increasing order of their projections onto ~u. Suppose s⋆
is renamed as s_k. Let the random variable L be the directional width of {s_1, . . . , s_k}
with respect to ~u and R be the directional width of {s_k, . . . , s_n} with respect to ~u.
Since s⋆ is assumed to be within the left and right extents, we can easily see that
ω(P, ~u) = L + R. Similarly, we define L′ (resp. R′) to be the directional width of all
points in S to the left (resp. right) of s⋆. Since the convex hull of S contains s⋆, we
can also see that ω(S, ~u) = L′ + R′. By Lemma 33, we know that d_K(L, L′) ≤ O(λτ_1)
and d_K(R, R′) ≤ O(λτ_1). By Lemma 34, we have that d_K(ω(S, ~u), ω(P, ~u)) ≤ O(λτ_1).
Setting τ_1 = O(τ/λ), the theorem follows.
Now, we prove the theorem in the general case, where the main difficulty comes
from the fact that we cannot separate the width into two independent parts L and
R. The proof is somewhat technical and can be found in Section 4.7.
Theorem 36. Let τ_1 = O(τ/max{λ, λ²}) and N = O((1/τ_1²) log(1/δ)) = O((max{λ², λ⁴}/τ²) log(1/δ)).
With probability at least 1 − δ, for any t ≥ 0 and any direction ~u, we have that

Pr[ω(S, ~u) ≤ t] ∈ Pr[ω(P, ~u) ≤ t] ± τ.
Algorithm 2: For λ(P) > 3 ln(2/τ)
In the second algorithm, we assume that λ(P) = ∑_{s∈P} λ_s > 3 ln(2/τ). When λ(P) is
large, we cannot directly use the sampling technique in the previous section since it
requires a large number of samples. However, the condition λ(P) ≥ 3 ln(2/τ) implies
there is a nonempty convex region K inside the convex hull of P with high probability.
Moreover, we can show the sum of λ_s values in K̄ = R² \ K is small. Hence, we can
use the sampling technique just for K̄ and use the deterministic ε-kernel construction
for K.
Now, we describe the details of our algorithm. Again consider the Poissonized
instance of P. Imagine the following process. Fix a direction ~u ∈ S¹.⁷ We move
a sweep line ℓ_~u orthogonal to ~u, along the direction ~u, to sweep through the points
in P. We use H_~u to denote the halfplane defined by ℓ_~u (with normal vector ~u)
and H̄_~u to denote its complement. So P(H̄_~u) = P ∩ H̄_~u is the set of points that
have been swept so far. We stop the movement of ℓ_~u at the first point such that
∑_{s∈H̄_~u} λ_s ≥ ln(2/τ) (ties should be broken in an arbitrary but consistent manner).
One important property of H̄_~u is that Pr[H̄_~u |= 0] ≤ τ/2. We repeat the above
process for all directions ~u ∈ S¹ and let H = ∩_~u H_~u. Since λ(P) > 3 ln(2/τ), by

⁷ Here, S¹ is the surface of the unit ball in Rd.
Figure 4-2: The construction of the (ε, τ)-quant-kernel S. The dashed polygon is H. The inner solid polygon is ConvH(E_H) and the outer one is K = (1 + ε)ConvH(E_H). K̄ is the set of points outside K.
Helly’s theorem, H is nonempty. A careful examination of the above process reveals
that H is in fact a convex polytope and each edge of the polytope is defined by two
points in P.⁸ Moreover, H is the region of points with Tukey depth at least ln(2/τ).⁹
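The stopping rule of the sweep directly yields the τ/2 bound: once the swept mass reaches ln(2/τ), the probability that the swept halfplane realizes no point is e^{−mass} ≤ τ/2. A quick check over a grid of directions (the random instance is ours):

```python
import math
import random

random.seed(6)
pts   = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
probs = [random.uniform(0.05, 0.3) for _ in range(200)]
lam_s = [-math.log(1 - p) for p in probs]
tau = 0.1
thresh = math.log(2 / tau)

def swept_mass(u):
    """Sweep along direction u; return the Poisson mass of the swept
    halfplane at the first moment it reaches ln(2/tau)."""
    order = sorted(range(len(pts)),
                   key=lambda i: u[0] * pts[i][0] + u[1] * pts[i][1])
    mass = 0.0
    for i in order:
        mass += lam_s[i]
        if mass >= thresh:
            break
    return mass

ok = all(math.exp(-swept_mass((math.cos(2 * math.pi * k / 100),
                               math.sin(2 * math.pi * k / 100)))) <= tau / 2 + 1e-12
         for k in range(100))
```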
The construction of the (ε, τ)-quant-kernel S is as follows. First, we use the
algorithm in [7] to find a deterministic ε-kernel E_H of size O(ε^{−1/2}) for H. One
useful property of the algorithm in [7] is that E_H is a subset of the vertices of H.
Hence the convex polytope ConvH(E_H) is contained in H. Since E_H is an ε-kernel,
(1 + ε)ConvH(E_H) (properly shifted) contains H.¹⁰ Let K = (1 + ε)ConvH(E_H) and
K̄ = P \ K. See Figure 4-2.
Now, we apply the random sampling construction over K̄. More specifically, let
λ := λ(K̄) = ∑_{s∈K̄∩P} λ_s. Let A be the probability measure over P ∩ K̄ defined by
A(s) = λ_s/λ for every s ∈ P ∩ K̄. Let τ_1 = O(τ/λ). We take N = O(τ_1^{−2} log(1/δ))
independent samples from A and let B be the empirical distribution with each sample
⁸ This also implies that we only need to do the sweep for (n choose 2) directions. In fact, by a careful rotational sweep, we only need O(n) directional sweeps.
⁹ The Tukey depth of a point x ∈ P is defined as the minimum total weight of points of P contained in a closed halfspace whose bounding hyperplane passes through x.
¹⁰ In fact, most existing algorithms (e.g., [7]) identify a point in the interior of H as the origin, and compute an ε-kernel E_H such that f(E_H, ~u) ≥ (1/(1 + ε)) f(H, ~u) for all directions ~u. So, H ⊆ (1 + ε)ConvH(E_H) since f(H, ~u) ≤ f((1 + ε)ConvH(E_H), ~u) for all directions ~u.
point having probability 1/N. The (ε, τ)-quant-kernel S consists of the N points
in B, each with the same existential probability 1 − exp(−λ/N), as well as all vertices
of K, each with probability 1. This finishes the construction of S.
Now, we show that the size of S is constant (only depending on ε, τ and δ), which
is an immediate corollary of the following lemma.
Lemma 37. λ = λ(K̄) = ∑_{s∈K̄} λ_s = O(ln(1/τ)/√ε).
Proof. We can see that K̄ is the union of O(ε^{−1/2}) half-planes, each defined by a
segment of K. It suffices to show that the sum of λ_s values in each half-plane is O(ln(1/τ)).
Consider the half-plane H(s_1, s_2) defined by segment (s_1, s_2) of K. Suppose s is the
vertex of H that is closest to the line (s_1, s_2). Let (s′, s) and (s, s′′) be the two edges of
H incident on s. Clearly, H(s′, s) ∪ H(s, s′′), the union of the two half-planes defined
by (s′, s) and (s, s′′), strictly contains H(s_1, s_2). See Figure 4-2 for an illustration.
Hence, ∑_{s∈H(s_1,s_2)} λ_s is at most 2 ln(1/τ).
Now, we prove the main theorem in this section. The proof is an extension of
Theorem 35. Here, the set K plays a similar role as the special point s? in Theorem 35.
Unlike Theorem 35, we also need to relax the length threshold here, which is necessary
even for deterministic points.
Theorem 38. In R², let λ = λ(K̄) and τ_1 = O(τ/λ), and N = O((1/τ_1²) log(1/δ)) =
O((ln²(1/τ)/(ετ²)) log(1/δ)). With probability at least 1 − δ, for any t ≥ 0 and any direction ~u, we
have that

Pr[ω(S, ~u) ≤ t] ∈ Pr[ω(P, ~u) ≤ (1 ± ε)t] ± τ.   (4.4)
Proof. The proof is similar to that of Theorem 35. Fix an arbitrary direction ~u (w.l.o.g., say it
is the x-axis). Rename all points in P as s_1, s_2, . . . , s_n according to the increasing order
of their x-coordinates. We use x(s_i) to denote the x-coordinate of s_i. Let ℓ_L (resp. ℓ_R) be
the vertical line that passes through the leftmost (resp. rightmost) endpoint of E_H.
We use x(ℓ_L) (resp. x(ℓ_R)) to denote the x-coordinate of ℓ_L (resp. ℓ_R) and let
d(ℓ_L, ℓ_R) = |x(ℓ_L) − x(ℓ_R)|. Suppose that s_1, . . . , s_k lie to the left of ℓ_L and s_r, . . . , s_n
lie to the right of ℓ_R. Let the random variable L = x(ℓ_L) − f({s_1, . . . , s_k}, −~u) and
R = f({s_r, . . . , s_n}, ~u) − x(ℓ_R). Let W = L + R + d(ℓ_R, ℓ_L). We can see that W is close
to ω(P, ~u) in the following sense. Let E denote the event that at least one point in
{s_1, . . . , s_k} is present and at least one point in {s_r, . . . , s_n} is present. Conditioning
on E, W is exactly ω(P, ~u). Moreover, we can easily see Pr[E] ≥ (1 − τ/2)² ≥ 1 − τ.
Hence, we have

Pr[W ≤ t] − τ ≤ (1 − τ) Pr[W ≤ t] ≤ Pr[ω(P, ~u) ≤ t | E] Pr[E]
≤ Pr[ω(P, ~u) ≤ t] = Pr[ω(P, ~u) ≤ t | E] Pr[E] + Pr[ω(P, ~u) ≤ t | ¬E] Pr[¬E]
≤ Pr[W ≤ t] + τ.
Similarly, we let `′L (or `′R) be the vertical line that passes through the leftmost (or rightmost) endpoint of K. Suppose that s′1, . . . , s′j (points in S) lie to the left of `′L and s′r, . . . , s′N lie to the right of `′R. We define L′ = x(`′L) − f(s′1, . . . , s′j, −~u) and R′ = f(s′r, . . . , s′N, ~u) − x(`′R). We can also see that ω(S, ~u) = L′ + R′ + d(`′L, `′R).
Let dL = x(`L) − x(`′L) and dR = x(`′R) − x(`R). Let Hs′ be the half-plane {(x, y) | x ≤ x(`′L) − t}. We can see that for any t ≥ 0,

Pr[L ≤ t + dL] − O(λτ1) = Pr[X(Hs′) = 0] − O(λτ1)
≤ Pr[L′ ≤ t] = Pr[Y(Hs′) = 0] ≤ Pr[X(Hs′) = 0] + O(λτ1)
= Pr[L ≤ t + dL] + O(λτ1),
where the inequalities hold due to Lemma 33. Similarly, we can see that for any t ≥ 0,
Pr[R ≤ t + dR] − O(λτ1) ≤ Pr[R′ ≤ t] ≤ Pr[R ≤ t + dR] + O(λτ1).
Therefore, by Lemma 34, we have that for any t > 0,
Pr[L+R ≤ t+ dL + dR]−O(λτ1) ≤ Pr[L′ +R′ ≤ t] ≤ Pr[L+R ≤ t+ dL + dR] +O(λτ1).
Therefore, we can conclude that for any t ≥ d(`′L, `′R),

Pr[ω(S, ~u) ≤ t] = Pr[L′ + R′ + d(`′L, `′R) ≤ t]
∈ Pr[L + R + d(`′L, `′R) ≤ t + dL + dR] ± O(λτ1)
= Pr[L + R + d(`L, `R) ≤ t] ± O(λτ1)
= Pr[W ≤ t] ± O(λτ1)
= Pr[ω(P, ~u) ≤ t] ± O(λτ1 + τ).
Noticing that τ ≥ Pr[ω(P , ~u) ≤ d(`L, `R)] ≥ Pr[ω(P , ~u) ≤ (1 − ε)d(`′L, `′R)], we can
obtain that, for any t < d(`′L, `′R), Pr[ω(S, ~u) ≤ t] = 0 ≥ Pr[ω(P , ~u) ≤ (1 − ε)t] − τ.
Moreover, it is trivially true that Pr[ω(S, ~u) ≤ t] = 0 ≤ Pr[ω(P , ~u) ≤ (1 − ε)t] + τ.
This completes the proof.
Higher Dimensions. Our constructions can be easily extended to Rd for any con-
stant d > 2. The sampling bound (Theorem 32) still holds if the number of samples is
O(d τ1^{-2} log(1/δ)) = O(τ1^{-2} log(1/δ)). Hence, Theorem 35 and Theorem 36 hold with
the same parameters (d is hidden in the constant). In order for Algorithm 2 to work,
we need λ(P) > (d + 1) ln(2/τ) to ensure H is nonempty. Instead of constructing
an ε-kernel EH with O(ε−(d−1)/2) vertices, we construct a convex set K which is the
intersection of O(ε−(d−1)/2) halfspaces and satisfies (1 − ε)K ⊆ H ⊆ K (this can be
done by either working with the dual, or directly using the construction implicit in
[37]).
Now, we briefly sketch how to compute such K using the dual approach. We first
compute the dual H? of H in Rd. Recall that the dual (also called the polar body) H? of H is defined as the set {x ∈ Rd | 〈x, y〉 ≤ 1 for all y ∈ H}. H? has O(n^d) vertices (each
corresponding to a face of H). Then, compute an ε-kernel E?H? with O(ε−(d−1)/2)
vertices for H?. Taking the dual of E?H? gives the desired K, which is an intersection
of O(ε−(d−1)/2) halfspaces (each corresponding to a point in E?H?). The correctness
can be easily seen by an argument through the gauge function g(E?H?, x) = min{λ ≥ 0 | x ∈ λE?H?}. Since E?H? ⊆ H? ⊆ (1 + ε)E?H?, we can see that (1/(1+ε)) g(E?H?, x) =
g((1 + ε)E?H? , x) ≤ g(H?, x) ≤ g(E?H? , x). The correctness follows from the duality
between the gauge function and the support function, which says g(E?H? , x) = f(K, x)
and g(H?, x) = f(H, x) for all x ∈ Sd−1 (see e.g., [95]).
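The gauge/support duality used above can be checked numerically. The following minimal sketch (Python/NumPy; the unit square H and its polar diamond are illustrative choices, not taken from the text) verifies g(H?, u) = f(H, u) on random directions in R²:

```python
import numpy as np

# H = unit square conv{(+-1, +-1)}.  Its polar body H* = {x : <x, y> <= 1 for
# all y in H} is the diamond |x1| + |x2| <= 1.
H_vertices = np.array([(1, 1), (1, -1), (-1, 1), (-1, -1)], dtype=float)

def support_H(u):
    """Support function f(H, u) = max_{y in H} <u, y>, attained at a vertex."""
    return (H_vertices @ u).max()

def gauge_Hstar(x):
    """Gauge g(H*, x) = min{lam >= 0 : x in lam * H*}; for the diamond, the l1 norm."""
    return abs(x[0]) + abs(x[1])

rng = np.random.default_rng(0)
for _ in range(1000):
    u = rng.normal(size=2)
    u /= np.linalg.norm(u)
    # the duality g(H*, u) = f(H, u) that transfers the kernel property to K
    assert abs(gauge_Hstar(u) - support_H(u)) < 1e-9
print("gauge/support duality verified on 1000 random directions")
```

The same identity in higher dimensions is what lets the ε-kernel computed for H? be carried back to the primal set K.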
We generalize Lemma 37 to Rd with the following lemma.
Lemma 39. There is a convex set K, which is an intersection of O(ε^{-(d−1)/2}) halfspaces and satisfies (1 − ε)K ⊆ H ⊆ K. Moreover, we have λ(K̄) = O(ε^{-(d−1)/2} ln(1/τ)).
Plugging in the new bound on λ(K̄), we can see that it is enough to set

N = O(τ^{-2} ε^{-(d−1)} log(1/δ) polylog(1/τ)) = O(ε^{-(d−1)} τ^{-2}) (suppressing logarithmic factors).
Running Time. The algorithm in Section 4.3.2 takes only O(Nn) time (where
N is the size of the kernel, which is constant if ε, τ and λ are constant). The
algorithm in Section 4.3.2 is substantially slower. The most time-consuming part is the construction of H, which is the intersection of all the halfspaces. In Rd, we need to sweep O(n^d) directions (each determined by d points), so the polytope H may have O(n^d) faces. Using the dual approach, we can compute K in O(n^d) time (linear in the number of points in the dual space) as well. Overall, the running time is O(n^d).
A Nearly Linear Time Algorithm for Constructing (ε, τ)-quant-kernels
We describe a nearly linear time algorithm for constructing an (ε, τ)-quant-kernel
in the existential uncertainty model. As mentioned before, the algorithm in Sec-
tion 4.3.2 takes linear time. So we only need a nearly linear time algorithm for con-
structing H (and K). Note that H is the set of points in Rd with Tukey depth at least
ln(2/τ). One tempting idea is to utilize the notion of ε-approximation (which can be
obtained by sampling) to compute the approximate Tukey depth for the points, as
done in [87]. However, a careful examination of this approach shows that the sample
size needs to be as large as O(λ(P)) (to ensure that for every halfspace, the difference
between the real weight and the sample weight is less than, say, 0.1 ln(2/τ)). Another useful observation is that only points with small (around ln(2/τ)) Tukey depth are
relevant in constructing H. Hence, we can first sample an ε-approximation of very
small size (say k = O(log n)), and use it to quickly identify the region H1 in which all points have large (i.e., at least λ(P)/k) Tukey depth (so H1 ⊆ H). Then, we can delete all
points inside H1 and focus on the remaining points. Ideally, the total weight of the
remaining points can be reduced significantly and a random sample of the same size
k would give an ε′-approximation of the remaining points for some ε′ < ε. We repeat
the above until the total weight of the remaining points reduces to a constant, and
then a constant size sample suffices. However, it is possible that all points have fairly
small Tukey depth (consider the case where all points are in convex position), and
no point can be removed. To resolve the issue, we use the idea in Lemma 37: there
is a convex set K1 slightly larger than H1 such that the weight of points outside K1
is much smaller. Hence, we can make progress by deleting all points inside K1. Since
K1 is only slightly larger than H1, we do not lose too much in terms of the distance.
Our algorithm carefully implements the above iterative sampling idea.
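The peel-until-light loop described above can be sketched in a few lines of Python. The sketch below is schematic only: it uses brute-force, direction-sampled Tukey depth instead of the ε-approximation, unit weights, and hypothetical parameters (`frac`, `target_weight`), so it illustrates the iterative structure rather than the algorithm's actual guarantees:

```python
import numpy as np

def tukey_depth(points, weights, x, n_dirs=100):
    """Weighted Tukey depth of x, approximated over n_dirs sampled directions:
    the minimum weight of a closed halfplane whose boundary passes through x."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
    dirs = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)
    proj = (points - x) @ dirs.T                       # shape (n, n_dirs)
    return min(weights[proj[:, j] <= 1e-9].sum() for j in range(n_dirs))

def peel_rounds(points, weights, frac=0.25, target_weight=30.0):
    """Schematic peeling loop: repeatedly delete points whose depth (w.r.t. the
    remaining weighted set) is at least frac of the remaining total weight."""
    pts, w = points.copy(), weights.copy()
    rounds = 0
    while w.sum() > target_weight:
        thresh = frac * w.sum()
        deep = np.array([tukey_depth(pts, w, p) >= thresh for p in pts])
        if not deep.any():
            # all remaining points are shallow (e.g. in convex position); this is
            # exactly the situation the slightly enlarged sets K_i handle in the text
            break
        pts, w = pts[~deep], w[~deep]
        rounds += 1
    return rounds, w.sum()

rng = np.random.default_rng(1)
pts = rng.normal(size=(150, 2))
rounds, leftover = peel_rounds(pts, np.ones(len(pts)))
print(rounds, leftover)
```

In the actual algorithm, each round peels the region K_i (computed from a small sample) rather than testing every point's exact depth, which is what makes the overall running time nearly linear.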
For ease of exposition, we first focus on R2. Consider the Poissonized instance of
P . We would like to find two convex sets H and K satisfying the following properties.
P1. Assume without loss of generality that the origin is in H. We require that (1/(1+ε))K ⊆ H ⊆ K.
P2. For a direction ~u ∈ S1, we use H(H, ~u) to denote the halfplane which does not contain H and whose boundary is the supporting line of H with normal direction ~u. We require that λ(H(H, ~u)) = ∑_{s∈H(H,~u)} λ_s ≥ ln(2/τ) for all directions ~u ∈ S1.
P3. λ(K̄) = O(1/√ε).
By a careful examination of our analysis in Section 4.3.2, we can see the above prop-
erties are all we need for the analysis.
Let H? denote the H found using the exact algorithm in Section 4.3.2. We use
the following set of parameters:
z = O(log n),  ε1 = O(ε/log n),  ε2 = O(√ε/log n).
Our algorithm proceeds in rounds. Initially, let H0 = ConvH({s ∈ P | λs ≥ ln(2/τ)}). In round i (for 1 ≤ i ≤ z), we construct two convex sets Hi and Ki such that
1. H0 ⊆ K0 ⊆ H1 ⊆ K1 ⊆ . . . ⊆ Hz ⊆ Kz;
2. (1/(1+ε1))Ki ⊆ Hi ⊆ Ki (Ki and Hi are very close to each other);
3. (1/(1+ε1)^i) Hi ⊆ H? (Hi is almost contained in H?);
4. λ(P ∩ K̄i) ≤ (1/2) λ(P ∩ K̄i−1) (the total weight outside Ki reduces by a factor of at least one half).
We repeat the above process until λ(P ∩ K̄i) ≤ O(1/√ε).
Before spelling out the details of our algorithm, we need a few definitions.
Definition 40. For a set P of weighted points in Rd, we use TK(P, γ) to denote the
set of points x ∈ Rd with Tukey depth at least γ. It is known that TK(P, γ) is convex
(see e.g., [87]). By this definition, H? = TK(P , ln(2/τ)).
Recall the definition of ε-approximation from Theorem 29. By Theorem 29 (or Theorem 32), we can see that a set of O(ε^{-2} log(1/δ)) sampled points is an ε-approximation with probability 1 − δ.
We are now ready to describe the details of our algorithm. Initially, H0 = ConvH({s ∈ P | λs ≥ ln(2/τ)}) (obviously H0 ⊆ H?). Compute a deterministic ε1-kernel CH0 of H0 and let K0 = (1 + ε1)ConvH(CH0). Delete all points in P ∩ K0 and let P1 be the remaining points in P (i.e., P1 = P ∩ K̄0). Let Ver(K0) denote all vertices of K0 (notice that some of them may not be original points in P).
Now, we describe the ith round for general i > 1. We have the remaining points Pi and the vertex set Ver(Ki−1). Each point s ∈ Pi keeps its old weight λs, and each point in Ver(Ki−1) has weight +∞ (to make sure every point in Ki−1 has Tukey depth +∞). Using random sampling on Pi, obtain an ε2-approximation Ei (of size L = O(ε2^{-2})) for Pi. Then compute (using the brute-force algorithm described in Section 4.3.2)

Hi = TK(Ei ∪ Ver(Ki−1), max{4ε2 λ(Pi), 2 ln(2/τ)}).
Note that Ki−1 ⊆ Hi. Compute a deterministic ε2-kernel CHi of Hi and let Ki = (1 + ε1)ConvH(CHi) (hence ConvH(CHi) ⊆ Hi ⊆ Ki). Then, we delete all points in P ∩ Ki and add all vertices of Ki (denoted Ver(Ki)). Let Pi+1 be the remaining points (i.e., Pi+1 = P ∩ K̄i). Our algorithm terminates when λ(Pi) ≤ O(1/√ε). Suppose the last round is z. Finally, we let H = (1/(1+ε1)^z) Hz and K = Kz.
First we show that the algorithm terminates after at most a logarithmic number of
rounds.
Lemma 41. z = O(log n).
Proof. If Hi = TK(Ei ∪ Ver(Ki−1), 2 ln(2/τ)), then we stop, since λ(Pi+1) ≤ O(1/√ε) by Lemma 37. Thus, we only need to bound the number of iterations in which Hi = TK(Ei ∪ Ver(Ki−1), 4ε2 λ(Pi)). Initially, it is not hard to see that λ(P1) ≤ n ln(2/τ). Using Lemma 37, we can see that λ(Pi+1) = λ(P ∩ K̄i) ≤ O(ε2 λ(Pi)/√ε1) ≤ λ(Pi)/2 for the constant defining ε1 sufficiently large. Hence, λ(Pi) ≤ λ(P1)/2^{i−1}.
We need to show that H and K satisfy P1, P2 and P3. P3 is quite obvious from our algorithm. It is also not hard to see P1, since (1 + ε1)^{z+1} ≤ 1 + ε and

(1/(1+ε)) Kz ⊆ (1/(1+ε1)^{z+1}) Kz ⊆ (1/(1+ε1)^z) Hz = H ⊆ Hz ⊆ Kz = K.
The most difficult part is to show that P2 holds: for every direction ~u ∈ S1, λ(H(H, ~u)) = ∑_{s∈H(H,~u)} λ_s ≥ ln(2/τ). In fact, we show that H ⊆ H?, from which P2 follows immediately; it therefore suffices to prove the following lemma.
Lemma 42. (1/(1+ε1)^i) Hi ⊆ H? for all 0 ≤ i ≤ z. In particular, H ⊆ H?.
Proof. We prove the lemma by induction. H0 ⊆ H? clearly satisfies the lemma. For
ease of notation, we let η = 1/(1 + ε1). Suppose the lemma is true for Hi−1, from which we can see that

η^i Ki−1 ⊆ η^{i−1} Hi−1 ⊆ H?.
Now we show the lemma holds for Hi. Consider the ith round. Let Ei be an ε2-approximation for Pi = P ∩ K̄i−1 and Hi = TK(Ei ∪ Ver(Ki−1), max{4ε2 λ(Pi), 2 ln(2/τ)}). Fix an arbitrary direction ~u ∈ S1 (w.l.o.g., assume that ~u = (0,−1), i.e., the downward direction), and let H(η^i Hi, ~u) be the halfplane whose boundary is tangent to η^i Hi. It suffices to show that λ(H(η^i Hi, ~u)) = ∑_{s∈H(η^i Hi,~u)} λ_s ≥ ln(2/τ). We move a sweep line `~u orthogonal to ~u, along the direction ~u (i.e., from top to bottom), to sweep through the points in Pi ∪ Ver(Ki−1) until the total weight we have swept is at least ln(2/τ).
We distinguish two cases:
1. `~u hits a point s in Ver(Ki−1) (recall that the weight of such a point is +∞). We can see that s is the topmost point of Ki−1 and Hi (or equivalently, `~u is also a supporting line for Hi). Since η^i Ki−1 ⊆ H? by the induction hypothesis, the topmost point of η^i Ki−1 is lower than that of H?. The topmost point of η^i Ki−1 is also the highest point of η^i Hi, from which we can see that H(η^i Hi, ~u) is lower than H(H?, ~u), which implies that λ(H(η^i Hi, ~u)) ≥ ln(2/τ).
2. `~u stops moving when it hits an original point in Pi. Since max{3ε2 λ(Pi), 2 ln(2/τ) − ε2 λ(Pi)} > ln(2/τ), by the definition of Hi, H(Hi, ~u) cannot be higher than `~u. The boundary of H(η^i Hi, ~u) is even lower, from which we can see λ(H(η^i Hi, ~u)) ≥ ln(2/τ).
Hence, every point in η^i Hi has Tukey depth at least ln(2/τ), which implies the lemma.
Running time. In each round, we compute in linear time an ε2-approximation Ei of size O(ε2^{-2} log(1/δ)) = polylog(n) (with δ = 1/poly(n) to ensure each probabilistic event succeeds with high probability). Ki is a dilation of an ε1-kernel, so the size of Ver(Ki) is at most O(ε1^{-1/2}) = O(log^{1/2} n). Deciding whether a point is inside Ki can be done in polylog(n) time, by a linear program with |Ver(Ki)| variables. To compute Hi, we can use the brute-force algorithm described in Section 4.3.2, which takes poly(|Ei ∪ Ver(Ki−1)|) = polylog(n) time. There are a logarithmic number of rounds, so the overall running time is O(n polylog n).
Higher Dimensions. Our algorithm can be easily extended to Rd for any constant d > 2. In Rd, we let ε1 = O(ε/log n) and ε2 = O((ε/log n)^{(d−1)/2}). With the new parameters, we can easily check that Lemma 41 still holds. We can construct an (ε, τ)-quant-kernel of size min{O(τ^{-2} max{λ², λ⁴} log(1/δ)), O(τ^{-2} ε^{-(d−1)} log(1/δ) polylog(1/τ))}. The first term is from Theorem 36 and the second from the higher-dimensional extension of Theorem 38. Now, let us examine the running time. In Rd, |Ver(Ki)| is at most O(ε1^{-(d−1)/2}) = O(log^{(d−1)/2} n). So deciding whether a point is inside Ki can be done in log^{O(d)}(n) time. Computing Hi takes log^{O(d)}(n) time using the brute-force algorithm. So the overall running time is O(n log^{O(d)} n).
In summary, we obtain the following theorem for (ε, τ)-quant-kernel.
Theorem 5. (restated) P is a set of uncertain points in Rd with existential uncertainty. Let λ = ∑_{si∈P}(− ln(1 − pi)). There exists an (ε, τ)-quant-kernel for P which consists of a set of independent uncertain points of cardinality min{O(τ^{-2} max{λ², λ⁴}), O(ε^{-(d−1)} τ^{-2})}. The algorithm for constructing such a coreset runs in O(n log^{O(d)} n) time.
4.3.3 (ε, τ)-quant-kernel Under the Subset Constraint
We show it is possible to construct an (ε, τ)-quant-kernel in the existential model
under the β-assumption: each possible location realizes a point with a probability at
least β, where β > 0 is some fixed constant.
Theorem 43. Under the β-assumption, there is an (ε, τ)-quant-kernel in Rd which has size O(µ^{-(d−1)/2} log(1/µ)) and satisfies the subset constraint, in the existential uncertainty model, where µ = min{ε, τ}.
In fact, the algorithm is exactly the same as constructing an ε-exp-kernel and
the proof of the above theorem is implicit in the proof of Theorem 26.
4.4 (ε, r)-fpow-kernel Under the β-Assumption
In this section, we show that an (ε, r)-fpow-kernel exists in the existential uncertainty model under the β-assumption. Recall that Tr(P, ~u) = max_{s∈P} 〈~u, s〉^{1/r} − min_{s∈P} 〈~u, s〉^{1/r}. For ease of notation, we write E[Tr(P, ~u)] to denote E_{P∼P}[Tr(P, ~u)]. Our goal is to find a set S of stochastic points such that for all directions ~u ∈ P?, we have E[Tr(S, ~u)] ∈ (1 ± ε)E[Tr(P, ~u)].
Our construction of S is almost the same as that in Section 4.3.1. We sample N (fixed later) independent realizations and take an ε0-kernel of each of them. Suppose they are E1, . . . , EN; we associate with each a probability 1/N. We denote the resulting (ε, r)-fpow-kernel by S. Hence, for any direction ~u ∈ P?, E[Tr(S, ~u)] = (1/N) ∑_{i=1}^{N} Tr(Ei, ~u), and we use this value as the estimate of E[Tr(P, ~u)].
Now, we show S is indeed an (ε, r)-fpow-kernel.
Recall that we use E(P ) to denote the deterministic ε-kernel for any realization
P ∼ P . We first compare P with the random set E(P ).
Lemma 44. For any t ≥ 0 and any direction ~u ∈ P?, we have that
(1− ε/2)E[Tr(P , ~u)] ≤ EP∼P [Tr(E(P ), ~u))] ≤ E[Tr(P , ~u)].
Proof. By Lemma 4.6 in [7], we have that (1−ε/2)Tr(P, ~u) ≤ Tr(E(P ), ~u) ≤ Tr(P, ~u).
The lemma follows by combining all realizations.
Now we show that S is an (ε, r)-fpow-kernel of E(P ). We first prove the
following lemma. The proof is almost the same as that of Lemma 30, and can be
found in Section 4.8.
Lemma 45. Let N = O(ε1^{-2} ε0^{-(d−1)/2} log(1/ε0)), where ε0 = (ε/4(r − 1))^r and ε1 = εβ². For any t ≥ 0 and any direction ~u ∈ P?, we have that

Pr_{P∼S}[max_{s∈P} 〈~u, s〉^{1/r} ≥ t] ∈ Pr_{P∼P}[max_{s∈E(P)} 〈~u, s〉^{1/r} ≥ t] ± ε1/4, and
Pr_{P∼S}[min_{s∈P} 〈~u, s〉^{1/r} ≥ t] ∈ Pr_{P∼P}[min_{s∈E(P)} 〈~u, s〉^{1/r} ≥ t] ± ε1/4.
Lemma 46. Let N = O(β^{-4} ε^{-(rd−r+4)/2} log(1/ε)) and ε0 = (ε/4(r − 1))^r. The set S constructed above is an (ε, r)-fpow-kernel in Rd.
Proof. Fix a direction ~u ∈ P?. Let A = max_{s∈P} 〈~u, s〉^{1/r} and B = min_{s∈P} 〈~u, s〉^{1/r}. We observe that B ≤ max_{s∈P} 〈~u, s〉^{1/r} ≤ A for any realization P ∼ P. We also need the following basic fact about the expectation: for a random variable X, if Pr[X ≥ a] = 1, then E[X] = ∫_b^∞ Pr[X ≥ x]dx + b for any b ≤ a. Thus, we have that
E_{P∼P}[max_{s∈E(P)} 〈~u, s〉^{1/r}] = ∫_B^A Pr_{P∼P}[max_{s∈E(P)} 〈~u, s〉^{1/r} ≥ x]dx + B
≤ ∫_B^A Pr_{P∼S}[max_{s∈P} 〈~u, s〉^{1/r} ≥ x]dx + B + ε1(A − B)/4
= E_{P∼S}[max_{s∈P} 〈~u, s〉^{1/r}] + ε1(A − B)/4,
where the first inequality is due to Lemma 45. Similarly, we can show the following two inequalities:

E_{P∼S}[max_{s∈P} 〈~u, s〉^{1/r}] ∈ E_{P∼P}[max_{s∈E(P)} 〈~u, s〉^{1/r}] ± ε1(A − B)/4,
E_{P∼S}[min_{s∈P} 〈~u, s〉^{1/r}] ∈ E_{P∼P}[min_{s∈E(P)} 〈~u, s〉^{1/r}] ± ε1(A − B)/4.
Recall that Tr(P, ~u) = max_{s∈P} 〈~u, s〉^{1/r} − min_{s∈P} 〈~u, s〉^{1/r}. By the linearity of expectation, we conclude that

E[Tr(S, ~u)] ∈ E_{P∼P}[Tr(E(P), ~u)] ± ε1(A − B)/2.

Combining this with Lemma 44, we have that E[Tr(S, ~u)] ∈ (1 ± ε/2)E[Tr(P, ~u)] ± ε1(A − B)/2. By the β-assumption, we know that E[Tr(P, ~u)] ≥ β²(A − B). Thus, ε1(A − B)/2 ≤ (ε/2)E[Tr(P, ~u)], and E[Tr(S, ~u)] ∈ (1 ± ε)E[Tr(P, ~u)].
Running time. For each sample, the size of a deterministic ε0-kernel Ei is at most O(ε0^{-(d−1)/2}). Note that an ε0-kernel can be constructed in linear time. We take N = O(ε1^{-2} ε0^{-(d−1)/2} log(1/ε0)) samples in total. So the overall running time is O(n β^{-4} ε^{-(rd−r+4)/2} log(1/ε) + poly(1/ε)) = O(n ε^{-(rd−r+4)/2}) (treating β as a constant and suppressing logarithmic factors).
Note that each ε0-kernel contains O(ε^{-r(d−1)/2}) points. We take N = O(ε1^{-2} ε0^{-(d−1)/2} log(1/ε0)) independent samples. So the total size of the (ε, r)-fpow-kernel is O(β^{-4} ε^{-(rd−r+2)} log(1/ε)).
In summary, we obtain the following theorem.
Theorem 7. (restated) An (ε, r)-fpow-kernel of size O(ε^{-(rd−r+2)}) can be constructed in O(n ε^{-(rd−r+4)/2}) time in the existential uncertainty model under the β-assumption. In particular, the (ε, r)-fpow-kernel consists of N = O(ε^{-(rd−r+4)/2}) point sets, each occurring with probability 1/N and containing O(ε^{-r(d−1)/2}) deterministic points.
4.5 Applications
In this section, we show that our coreset results for the directional width problem
readily imply several coreset results for other stochastic problems, just as in the
deterministic setting. We introduce these stochastic problems and briefly summarize
our results below.
4.5.1 Approximating the Extent of Uncertain Functions
We first consider the problem of approximating the extent of a set H of uncertain
functions. As before, we consider both the existential model and the locational model
of uncertain functions.
1. In the existential model, each uncertain function h is a function on Rd associated with an existential probability ph, which indicates the probability that h is present in a random realization.
2. In the locational model, each uncertain function h is associated with a finite set {h1, h2, . . .} of deterministic functions on Rd. Each hi is associated with a probability value p(hi), such that ∑_i p(hi) = 1. In a random realization, h is independently realized to some hi with probability p(hi).
We use H to denote the random instance, that is, a random set of functions. We use h ∈ H to denote the event that the deterministic function h is present in the
instance. For each point x ∈ Rd, we let the random variable EH(x) = max_{h∈H} h(x) − min_{h∈H} h(x) be the extent of H at point x. Suppose S is another set of uncertain functions. We say S is an ε-exp-kernel for H if (1 − ε)E[EH(x)] ≤ E[ES(x)] ≤ E[EH(x)] for any x ∈ Rd. We say S is an (ε, τ)-quant-kernel for H if Pr_{S∼S}[ES(x) ≤ t] ∈ Pr_{H∼H}[EH(x) ≤ (1 ± ε)t] ± τ for any t ≥ 0 and any x ∈ Rd.
Let us first focus on linear functions in Rd. Using the duality transformation that maps the linear function y = a1x1 + . . . + adxd + ad+1 to the point (a1, . . . , ad+1) ∈ Rd+1, we can reduce the extent problem to the directional width problem in Rd+1. Let H be a set of uncertain linear functions (under either the existential or the locational model) in Rd for constant d. From Theorem 20 and Corollary 17, we can construct a set S of O(n^{2d}) deterministic linear functions in Rd such that ES(x) = E[EH(x)] for any x ∈ Rd. Moreover, for any ε > 0, there exists an ε-exp-kernel of size O(ε^{-d/2}) and an (ε, τ)-quant-kernel of size O(τ^{-2}ε^{-d}). Using the standard linearization technique [7], we can obtain the following generalization for uncertain polynomials.
Theorem 47. Let H be a family of uncertain polynomials in Rd (under either the existential or the locational model) that admits a linearization of dimension k. We can construct a set M of O(n^{2k}) deterministic polynomials such that EM(x) = E[EH(x)] for any x ∈ Rd. Moreover, for any ε > 0, there exists an ε-exp-kernel of size O(ε^{-k/2}) and an (ε, τ)-quant-kernel of size min{O(τ^{-2} max{λ², λ⁴}), O(ε^{-k}τ^{-2})}, where λ = ∑_{h∈H}(− ln(1 − ph)).
Now, we consider functions of the form u(x) = p(x)^{1/r}, where p(x) is a polynomial and r is a positive integer. We call such a function a fractional polynomial. We still use H to denote the random set of fractional polynomials. Let H? ⊆ Rd be the set of points such that for any point x ∈ H? and any function u ∈ H, we have u(x) ≥ 0. For each point x ∈ H?, we let the random variable Er,H(x) = max_{h∈H} h(x)^{1/r} − min_{h∈H} h(x)^{1/r}. We say another random set S of functions is an (ε, r)-fpow-kernel for H if (1 − ε)Er,H(x) ≤ Er,S(x) ≤ Er,H(x) for any x ∈ H?. By the duality transformation and Theorem 7, we can obtain the following result.
Theorem 48. Let H be a family of uncertain fractional polynomials in Rd in the
existential uncertainty model under the β-assumption. Further assume that each polynomial admits a linearization of dimension k. For any ε > 0, there exists an (ε, r)-fpow-kernel of size O(ε^{-(rk−r+2)}). Furthermore, the (ε, r)-fpow-kernel consists of N = O(ε^{-(rk−r+4)/2}) sets, each occurring with probability 1/N and containing O(ε^{-r(k−1)/2}) deterministic fractional polynomials.
4.5.2 Stochastic Moving Points
We can extend our stochastic models to moving points. In the existential model, each
point s is present with probability ps and follows a trajectory s(t) in Rd when present
(s(t) is the position of s at time t). In the locational model, each point s is associated
with a distribution of trajectories (the support size is finite) and the actual trajectory of s is a random sample from the distribution. Such uncertain trajectory models have
been used in several applications in spatial databases [111]. For ease of exposition, we
assume the existential model in the following. Suppose each trajectory is a polynomial
of t with degree at most r. For each point s, any direction ~u and time t, define the
polynomial fs(~u, t) = 〈s(t), ~u〉 and let H include fs with probability ps. For a set P of
points, the directional width at time t is EH(~u, t) = maxs∈P fs(~u, t)−mins∈P fs(~u, t).
Each polynomial fs admits a linearization of dimension k = (r + 1)d − 1. Using
Theorem 47, we can see that there is a set M of O(n2k) deterministic moving points,
such that the directional width of M in any direction ~u is the same as the expected
directional width of P in direction ~u. Moreover, for any ε > 0, there exists an ε-exp-
kernel (which consists of only deterministic moving points) of size O(ε−(k−1)/2) and
an (ε, τ)-quant-kernel (which consists of both deterministic and stochastic moving
points) of size O(ε−kτ−2).
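The linearization step can be made concrete with a short sketch (Python/NumPy; d = 2, r = 2, and the random trajectory coefficients are illustrative): f_s(~u, t) = 〈s(t), ~u〉 is a linear function of the (r + 1)d monomials u_j t^i:

```python
import numpy as np

rng = np.random.default_rng(4)
d, r = 2, 2
c = rng.normal(size=(d, r + 1))             # c[j, i] is the t^i coefficient of coordinate j of s(t)

def f(u, t):
    """f_s(u, t) = <s(t), u>, evaluated directly from the trajectory."""
    s_t = np.array([np.polyval(c[j][::-1], t) for j in range(d)])
    return float(s_t @ u)

def phi(u, t):
    """Lifted feature vector (u_j * t^i) for i = 0..r, j = 1..d: (r+1)d monomials."""
    return np.array([u[j] * t ** i for i in range(r + 1) for j in range(d)])

w = c.T.reshape(-1)                          # coefficient vector aligned with phi's ordering
for _ in range(50):
    u, t = rng.normal(size=d), float(rng.normal())
    # f_s is linear in phi(u, t), which is the linearization used in the text
    assert abs(f(u, t) - float(phi(u, t) @ w)) < 1e-8
print("f_s(u, t) is linear in the (r+1)d monomials u_j * t^i")
```

One coordinate of the lift is redundant up to the choice of origin, which is why the text states the linearization dimension as k = (r + 1)d − 1.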
4.5.3 Shape Fitting Problems
Theorem 47 can be also applied to some stochastic variants of certain shape fitting
problems. We first consider the following variant of the minimum enclosing ball
problem over stochastic points. We are given a set P of stochastic points (under either
existential or locational model), find the center point c such that E[maxs∈P ‖s− c‖2]
is minimized. It is not hard to see that the problem is equivalent to minimizing the
expected area of the enclosing ball in R2. For ease of exposition, we assume the
existential model where s is present with probability ps. For each point s ∈ P , define
the polynomial hs(x) = ‖x‖2−2〈x, s〉+‖s‖2, which admits a linearization of dimension
d + 1 [7]. Let H be the family of uncertain polynomials {hs}_{s∈P} (hs exists with probability ps). We can see that for any x ∈ Rd, max_{s∈P} ‖x − s‖² = max_{hs∈H} hs(x). Using Theorem 47,11 we can see that there is a set M of O(n^{2d+2}) deterministic polynomials such that max_{h∈M} h(x) = E[max_{s∈P} ‖x − s‖²] for any x ∈ Rd, and a set S of O(ε^{-(d+1)/2}) deterministic polynomials such that (1 − ε)E[max_{s∈P} ‖x − s‖²] ≤ max_{h∈S} h(x) ≤ E[max_{s∈P} ‖x − s‖²] for any x ∈ Rd. We can store the set S instead
of the original point set in order to answer the following queries: given a point s, return the expected squared distance from s to the farthest point. The problem of finding the optimal center c can also be carried out over S, which can be done in O(ε^{-O(d²)}) time: we can decompose the arrangement of n semialgebraic surfaces in Rd into O(n^{O(d+k)}) cells of constant description complexity, where k is the linearization dimension (see e.g., [12]). By enumerating all those cells in the arrangement of S, we know which polynomials lie on the upper envelope, and we can compute the minimum value in each such cell in constant time when d is constant.
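A crude numeric sketch of this objective (Python/NumPy; the points, probabilities, Monte-Carlo estimate, and grid search are illustrative stand-ins for the arrangement-based exact algorithm) shows the polynomial view max_s ‖x − s‖² = max_s h_s(x) and a direct minimization of the expected squared radius:

```python
import numpy as np

rng = np.random.default_rng(5)
pts = rng.normal(size=(25, 2))
prob = rng.uniform(0.3, 1.0, size=len(pts))  # existential probabilities (illustrative)

def h(s, x):
    """h_s(x) = ||x||^2 - 2<x, s> + ||s||^2, which equals ||x - s||^2."""
    return x @ x - 2.0 * (x @ s) + s @ s

x0 = rng.normal(size=2)
assert all(abs(h(s, x0) - np.sum((x0 - s) ** 2)) < 1e-9 for s in pts)

def expected_max_sq_dist(x, n_samples=500):
    """Monte-Carlo estimate of E[max_{s in P} ||s - x||^2] in the existential model."""
    total = 0.0
    for _ in range(n_samples):
        mask = rng.random(len(pts)) < prob
        if mask.any():
            total += np.max(np.sum((pts[mask] - x) ** 2, axis=1))
    return total / n_samples

# crude grid search for the center minimizing the expected squared radius
grid = np.linspace(-1.0, 1.0, 7)
best = min((expected_max_sq_dist(np.array([gx, gy])), gx, gy)
           for gx in grid for gy in grid)
print("approximate optimal center:", (best[1], best[2]))
```

The text's coreset S replaces the Monte-Carlo oracle by O(ε^{-(d+1)/2}) deterministic polynomials, after which the minimization becomes a constant-size problem per cell.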
The above argument can also be applied to the following variant of the spherical shell problem for stochastic points. We are given a set P of stochastic points (under either the existential or the locational model). Our objective is to find the center point c such that E[obj(c)] = E[max_{s∈P} ‖s − c‖² − min_{s∈P} ‖s − c‖²] is minimized. The problem is equivalent to minimizing the expected area of the enclosing annulus in R². The objective can be represented as a polynomial of linearization dimension k = d + 1. Proceeding as for the enclosing balls, we can show there is a set S of O(ε^{-(k−1)/2}) deterministic polynomials such that (1 − ε)E[obj(c)] ≤ ES(c) ≤ E[obj(c)] for any c ∈ Rd. We would like to make a few remarks here.
11We can see from the proof that all results that hold for width/extent also hold for the support function/maximum.
1. Let us take the minimum enclosing ball for example. If we examine the con-
struction of the set S, each polynomial h ∈ S may not be of the form h(x) = ‖x‖² − 2〈x, s〉 + ‖s‖², and therefore does not translate back to a minimum enclosing ball problem over deterministic points.
2. Another natural objective function for the minimum enclosing ball and the
spherical shell problem would be the expected radius E[maxs∈P d(s, c)] and
the expected shell width E[maxs∈P d(s, c) − mins∈P d(s, c)]. However, due to
the fractional powers (square roots) in the objectives, simply using an ε-exp-
kernel does not work. This is unlike the deterministic setting. 12 We leave
the problem of finding small coresets for the spherical shell problem as an in-
teresting open problem. However, under the β-assumption, we can use (ε, r)-
fpow-kernels to handle such fractional powers, as in the next subsection.
Theorem 8. (restated) Suppose P is a set of n independent stochastic points in
Rd under either existential or locational uncertainty model. There are linear time
approximation schemes for the following problems: (1) finding a center point c to
minimize E[maxs∈P ‖s − c‖2]; (2) finding a center point c to minimize E[obj(c)] =
E[maxs∈P ‖s− c‖2 −mins∈P ‖s− c‖2]. Note that when d = 2 the above two problems
correspond to minimizing the expected areas of the enclosing ball and the enclosing
annulus, respectively.
4.5.4 Shape Fitting Problems (Under the β-assumption)
In this subsection, we consider several shape fitting problems in the existential model
under the β-assumption. We show how to use Theorem 48 to obtain linear time
approximation schemes for those problems.
1. (Minimum spherical shell) We first consider the minimum spherical shell prob-
lem. Given a set P of stochastic points (under the β-assumption), our goal is
to find the center point c such that E[maxs∈P ‖s− c‖−mins∈P ‖s− c‖] is mini-
mized. For each point s ∈ P , let hs(x) = ‖x‖2 − 2〈x, s〉+ ‖s‖2, which admits a
12In particular, there is no stochastic analogue of Lemma 4.6 in [7].
linearization of dimension d + 1. It is not hard to see that E[max_{s∈P} ‖s − c‖] = E[max_{s∈P} √hs(c)] and E[min_{s∈P} ‖s − c‖] = E[min_{s∈P} √hs(c)]. Using Theorem 48, we can see that there are N = O(ε^{-(d+3)}) sets Si, each containing O(ε^{-(d+1)}) fractional polynomials √hs, such that for all x ∈ Rd,
(1/N) ∑_{i∈[N]} (max_{s∈Si} √hs(x) − min_{s∈Si} √hs(x)) ∈ (1 ± ε)(E[max_{s∈P} ‖s − x‖] − E[min_{s∈P} ‖s − x‖]). (4.5)
Note that our (ε, r)-fpow-kernel satisfies the subset constraint. Hence, each function √hs corresponds to an original point in P. So, we can store N point sets Pi ⊆ P, with |Pi| = O(ε^{-d}), as the coreset for the original point set. By (4.5), an optimal solution for the coreset is a (1 + ε)-approximation for the original problem.
Now, we briefly sketch how to compute the optimal solution for the coreset. Consider the arrangement of the O(ε^{-O(d)}) hyperplanes, each bisecting a pair of points in ∪iPi. For each cell C of the arrangement, the distance ordering of all points in ∪iPi is the same for every point x ∈ C. We then enumerate all those cells in the arrangement and try to find the optimal center in each cell. Fix a cell C. For each point set Pi, we know which point is the farthest and which is the closest from points in C; say they are si = arg max_{s∈Pi} ‖s − x‖ and s′i = arg min_{s∈Pi} ‖s − x‖. Hence, our problem can be formulated as the following optimization problem:
min_x (1/N) ∑_i (di − d′i),  s.t.  di² = ‖si − x‖², d′i² = ‖s′i − x‖², di, d′i ≥ 0, ∀i ∈ [N]; x ∈ C.
The polynomial system has a constant number of variables and constraints, and hence can be solved in constant time. More specifically, we can introduce a new variable t and let t = (1/N) ∑_i (di − d′i). All polynomial constraints define a semi-algebraic set. By using a constructive version of the Tarski–Seidenberg theorem, we can project out all variables except t, and the resulting set is still a semi-
algebraic set (which would be a finite collection of points and intervals in R1)
(see e.g., [20]).
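The coreset objective above can also be evaluated and minimized numerically. The sketch below (Python/NumPy, with synthetic point sets and a grid refinement standing in for the per-cell polynomial systems) is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(6)
N = 20
# toy coreset: N small point sets (in the text each P_i is a subset of P)
point_sets = [rng.normal(size=(6, 2)) + np.array([2.0, -1.0]) for _ in range(N)]

def shell_width(x):
    """(1/N) * sum_i (max_{s in P_i} ||s - x|| - min_{s in P_i} ||s - x||)."""
    total = 0.0
    for P in point_sets:
        dists = np.linalg.norm(P - x, axis=1)
        total += dists.max() - dists.min()
    return total / N

# stand-in for the per-cell polynomial systems: two-stage grid refinement
center = np.mean([P.mean(axis=0) for P in point_sets], axis=0)
best_x, best_val = center, shell_width(center)
for scale in (2.0, 0.5, 0.1):
    base = best_x.copy()
    for gx in np.linspace(-scale, scale, 9):
        for gy in np.linspace(-scale, scale, 9):
            x = base + np.array([gx, gy])
            v = shell_width(x)
            if v < best_val:
                best_x, best_val = x, v
print("approximate minimum expected shell width:", best_val)
```

Unlike this heuristic, the per-cell formulation in the text yields an exact optimum for the coreset in constant time per cell, since the distance ordering is fixed within each cell.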
2. (Minimum enclosing cylinder, Minimum cylindrical shell) Let P be a set of
stochastic points in the existential uncertainty model under the β-assumption.
Let d(`, s) denote the distance between a point s ∈ Rd and a line ` ⊂ Rd. The
goal for the minimum enclosing cylinder problem is to find a line ` such that
E[maxs∈P d(`, s)] is minimized, while that for the minimum cylindrical shell
problem is to minimize E[maxs∈P d(`, s) − mins∈P d(`, s)]. The algorithms for
both problems are almost the same and we only sketch the one for the minimum
enclosing cylinder problem.
We follow the approach in [7]. We represent a line ` ∈ Rd by a (2d − 1)-tuple
(x1, . . . , x2d−1) ∈ R2d−1: ` = p+ tq | t ∈ R, where p = (x1, · · · , xd−1, 0) is the
intersection point of ` with the hyperplane xd = 0 and q = (xd, . . . , x2d−1), ‖q‖2 =
1 is the orientation of `. Then for any point s ∈ Rd, we have that
d(ℓ, s) = ‖(p − s) − 〈p − s, q〉q‖,

where the polynomial d²(ℓ, s) admits a linearization of dimension O(d²). Now, proceeding as for the minimum enclosing ball problem and using Theorem 48, we can obtain a coreset S consisting of N = O(ε^{−O(d²)}) deterministic point sets Pi ⊆ P.
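The distance formula d(ℓ, s) above is easy to sanity-check numerically; a minimal sketch (the function name is ours; p is a point on the line and q its unit orientation, as in the text):

```python
import math

def line_point_dist(p, q, s):
    # d(l, s) = ||(p - s) - <p - s, q> q|| for the line {p + t q : t in R}, ||q|| = 1
    v = [pi - si for pi, si in zip(p, s)]
    t = sum(vi * qi for vi, qi in zip(v, q))  # <p - s, q>
    return math.sqrt(sum((vi - t * qi) ** 2 for vi, qi in zip(v, q)))

# The x-axis in R^3: point (5, 3, 4) is at distance sqrt(9 + 16) = 5 from it.
print(line_point_dist((0, 0, 0), (1, 0, 0), (5, 3, 4)))  # -> 5.0
```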
We briefly sketch how to obtain the optimal solution for the coreset. We can also
decompose R^{2d−1} (a point x in this space with ‖(x_d, . . . , x_{2d−1})‖ = 1 represents a line in R^d) into O(ε^{−O(d²)}) semi-algebraic cells such that for each cell, the ordering of the points in S (by their distances to a line in the cell) is fixed. Note that each such cell is a semi-algebraic set. For a cell C0, assume that s_i = arg max_{s∈Pi} d(ℓ, s) for all i ∈ [N], where ℓ is an arbitrary line in C0. We can
formulate the problem as the following polynomial system:
min_ℓ (1/N) ∑_i d_i,  s.t.  d_i² = d²(ℓ, s_i), d_i ≥ 0, ∀i ∈ [N]; ℓ = (p, q) ∈ C0, ‖q‖₂ = 1.
Again the polynomial system has a constant number of variables and con-
straints. Thus, we can compute the optimum in constant time.
Theorem 9. (restated) Suppose P is a set of n independent stochastic points in Rd,
each appearing with probability at least β, for some fixed constant β > 0. There are
linear-time approximation schemes for minimizing the expected radius (or width) for the minimum spherical shell, minimum enclosing cylinder, and minimum cylindrical shell problems over P.
4.6 Missing Details in Section 4.2
4.6.1 Details for Section 4.2.1
Lemma 23. We can find an affine transform T in O(2^{O(d)} n log n) time, such that the
convex polytope M ′ = T (M) is α-fat for some constant α.
Proof. By the results in [19], we only need to construct an approximate bounding
box, which can be done as follows: We first identify two points y1 and y2 in M
such that their distance is a constant approximation of the diameter of M . Then
we project the points in M to a hyperplane H ⊂ R^d perpendicular to the line through y1 and y2, and recursively identify two points among the projected points
as the approximate diameter. Hence, it suffices to show how to identify such two
points y1 and y2. Let δ = arccos(1/2). Suppose we are working in R^d. We compute a set I of O(δ^{−(d−1)}) points on the unit sphere S^{d−1} such that for any point s ∈ S^{d−1}, there is a point ~u ∈ I with ∠(~u, s) ≤ δ (see e.g. [10, 25]). From Lemma 22,
we know that we can compute for each direction ~u ∈ Sd−1, the point x(~u) ∈ M
that maximizes 〈~u, x(~u)〉 in O(n log n) time. For each ~u ∈ I, compute both x(~u)
and x(−~u), and pick the pair that maximizes ‖x(~u) − x(−~u)‖. Now, we argue this
is a constant approximation of the diameter. Suppose the diameter of M is (y1, y2)
where y1, y2 ∈ M . Consider the direction ~u′ = (y1 − y2)/‖y1 − y2‖. Without loss of
generality, assume y1 = arg max_y 〈y, ~u′〉 and y2 = arg max_y 〈y, −~u′〉. Moreover, there is a direction ~u ∈ I such that ∠(~u, ~u′) ≤ δ. Therefore, we get that

ω(M, ~u) = f(M, ~u) + f(M, −~u) ≥ 〈y1, ~u〉 + 〈y2, −~u〉 = 〈~u, y1 − y2〉 = ‖y1 − y2‖ cos ∠(~u, ~u′) ≥ ‖y1 − y2‖/2.
In the third equation, we use the simple fact that cos∠(~u, ~u′) = 〈~u, ~u′〉/‖~u‖‖~u′‖.
Lemma 24. S = {x(~u)}_{~u∈I} is an ε-kernel for M′.
Proof. Consider an arbitrary vector s ∈ Sd−1 with ‖s‖ = 1. Suppose the point a ∈M ′
maximizes 〈s, a〉 and b ∈ M ′ maximizes 〈−s, b〉. Hence, ω(M ′, s) = 〈s, a〉 − 〈s, b〉 =
〈s, a − b〉. By the construction of I, there is a direction ~u ∈ I (with ‖~u‖ = 1) such
that ‖~u− s‖ ≤ δ. Then, we can see that
ω(S, s) ≥ 〈s, x(~u)〉 − 〈s, x(−~u)〉 = 〈s, x(~u) − x(−~u)〉
= 〈~u, x(~u) − x(−~u)〉 + 〈s − ~u, x(~u) − x(−~u)〉
≥ 〈~u, a − b〉 − ‖s − ~u‖ ‖x(~u) − x(−~u)‖
= 〈s, a − b〉 + 〈~u − s, a − b〉 − ‖s − ~u‖ ‖x(~u) − x(−~u)‖
≥ 〈s, a − b〉 − ‖s − ~u‖ ‖x(~u) − x(−~u)‖ − ‖~u − s‖ ‖a − b‖
≥ ω(M′, s) − O(δd) ≥ (1 − ε) ω(M′, s).

In the last two inequalities, we use the fact that M′ is α-fat (i.e., αC ⊂ M′ ⊂ C).
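In the plane, the construction behind Lemmas 23 and 24 amounts to keeping one extreme point per direction of a δ-net of the unit circle. A minimal sketch for d = 2 on deterministic points (function and variable names are ours):

```python
import math

def directional_kernel(points, delta):
    """Keep, for each direction u in a delta-net of the unit circle,
    the point maximizing <u, x>; this is the set S = {x(u) : u in I}."""
    m = max(4, math.ceil(2 * math.pi / delta))  # |I| = O(1/delta) directions
    kernel = set()
    for k in range(m):
        u = (math.cos(2 * math.pi * k / m), math.sin(2 * math.pi * k / m))
        kernel.add(max(points, key=lambda s: s[0] * u[0] + s[1] * u[1]))
    return kernel

# Points on the unit circle plus one interior point: the interior point is
# never extreme in any direction, so it is never kept.
pts = [(math.cos(a / 7), math.sin(a / 7)) for a in range(44)] + [(0.0, 0.0)]
S = directional_kernel(pts, 0.3)
assert (0.0, 0.0) not in S
```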
4.6.2 Details for Section 4.2.2
Theorem 26. Under the β-assumption, there is an ε-exp-kernel in R^d (for d = O(1)), which is of size O(β^{−(d−1)} ε^{−(d−1)/2} log(1/ε)) and satisfies the subset constraint, in the existential uncertainty model.
Proof. Our algorithm is inspired by the peeling idea in [9]. Let ε₁ = εαβ²/(4√d), where α is a constant defined later. We repeat the following for L = O(log_{1−β} ε₁) = O(log(1/ε)) rounds: in round i, we first compute an (ε₁/√d)-kernel S_i (of size O((√d/ε₁)^{(d−1)/2}) = O(β^{−(d−1)} ε^{−(d−1)/2})) for the remaining points (in the deterministic sense) and then delete all points of S_i. Let S = ∪_i S_i. Now, we show that S is an ε-exp-kernel for P.
We first establish a lower bound on ω(P, ~u) for any unit vector ~u ∈ S^{d−1}. Assume without loss of generality that αC ⊂ ConvH(P) ⊂ C, where C = [−1, 1]^d and α is a constant depending only on d. Since αC ⊂ ConvH(P), there is a point s ∈ ConvH(P) such that 〈~u, s〉 ≥ α and a different point s′ ∈ ConvH(P) such that 〈~u, s′〉 ≤ −α. Hence, we have that

ω(P, ~u) ≥ β²(〈~u, s〉 − 〈~u, s′〉) ≥ 2αβ².
Fix an arbitrary direction ~u ∈ Sd−1. Now, we bound the difference between f(P , ~u)
and f(S, ~u). We show that for any real value x ∈ [−√d,√d],
PrP∼P [f(P, ~u) ≥ x] ≤ PrS∼S [f(S, ~u) ≥ x− ε1] + ε1. (4.6)
In fact, a proof of the above statement provides a proof for Theorem 43 (i.e., S is an
(ε, τ)-quant-kernel as well).
Let L_P = {s₁, s₂, . . . , s_L} be the set of L points s ∈ P that maximize 〈s, ~u〉 (i.e., the first L vertices in the canonical order w.r.t. ~u). Similarly, let L_S = {s′₁, s′₂, . . . , s′_L} be the set of L points s′ ∈ S that maximize 〈s′, ~u〉. We distinguish two cases:

1. L_P = L_S: If x ≥ 〈~u, s_L〉, we can see that Pr_{P∼P}[f(P, ~u) ≥ x] = Pr_{S∼S}[f(S, ~u) ≥ x]. If x < 〈~u, s_L〉, both Pr_{P∼P}[f(P, ~u) ≥ x] and Pr_{S∼S}[f(S, ~u) ≥ x] are at least 1 − ∏_{s∈L_P}(1 − p_s) ≥ 1 − (1 − β)^L ≥ 1 − ε₁.

2. L_P ≠ L_S: Suppose j is the smallest index such that s_j ≠ s′_j. For x > 〈~u, s_j〉, we can see that Pr_{P∼P}[f(P, ~u) ≥ x] = Pr_{S∼S}[f(S, ~u) ≥ x]. Now, we focus on the case where x ≤ 〈~u, s_j〉. From the construction of S, we can see that 〈s′_{j′}, ~u〉 ≥ 〈s_j, ~u〉 − ε₁ for all j′ ≥ j.¹³ Hence, for x ≤ 〈~u, s_j〉, we can see that

Pr_{S∼S}[f(S, ~u) ≥ x − ε₁] ≥ 1 − ∏_{s∈L_S}(1 − p_s) ≥ 1 − ε₁.
So, in either case, (4.6) is satisfied. We also need the following basic fact about the expectation: for a random variable X, if Pr[X ≥ a] = 1, then E[X] = ∫_b^∞ Pr[X ≥ x] dx + b for any b ≤ a. Since −√d ≤ f(P, ~u) ≤ √d for any realization P, we have

f(P, ~u) = ∫_{−√d}^{∞} Pr_{P∼P}[f(P, ~u) ≥ x] dx − √d
≤ ∫_{−√d}^{∞} Pr_{S∼S}[f(S, ~u) ≥ x − ε₁] dx + 2√d ε₁ − √d
≤ ∫_{−√d−ε₁}^{∞} Pr_{S∼S}[f(S, ~u) ≥ x] dx − √d − ε₁ + 3√d ε₁
= f(S, ~u) + 3√d ε₁,

where the first inequality is due to (4.6) together with the fact that Pr_{P∼P}[f(P, ~u) ≥ x] = Pr_{S∼S}[f(S, ~u) ≥ x] = 0 for x > √d, and the second is a change of variables. Similarly, we can get that f(S, −~u) ≥ f(P, −~u) − 3√d ε₁. By the choice of ε₁, we have that 6√d ε₁ ≤ ε · 2αβ² ≤ ε ω(P, ~u). Hence,

ω(S, ~u) ≥ ω(P, ~u) − 6√d ε₁ ≥ (1 − ε) ω(P, ~u).
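The tail-integral identity for the expectation used in the derivation above (E[X] = b + ∫_b^∞ Pr[X ≥ x] dx whenever Pr[X ≥ b] = 1) is easy to verify numerically on a small discrete distribution; a quick sketch (all numbers are illustrative):

```python
# X takes values {-1, 0.5, 2} with probabilities {0.2, 0.3, 0.5};
# then E[X] = -0.2 + 0.15 + 1.0 = 0.95, and any b <= -1 works.
vals = [(-1.0, 0.2), (0.5, 0.3), (2.0, 0.5)]
exact = sum(v * p for v, p in vals)

def tail(x):  # Pr[X >= x]
    return sum(p for v, p in vals if v >= x)

b, hi, n = -3.0, 2.0, 200_000  # the tail is 0 beyond the largest value
h = (hi - b) / n
integral = sum(tail(b + (i + 0.5) * h) for i in range(n)) * h  # midpoint rule
assert abs((b + integral) - exact) < 1e-3
```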
4.6.3 Locational uncertainty
Similar results are possible for uncertain points with locational uncertainty. Let V = {v₁, . . . , v_m} be the set of nodes and P = {s₁, . . . , s_n} be the collection of possible locations. Now there are n possible locations, and thus (n choose 2) hyperplanes Γ that partition R^d. We can replicate all bounds in this setting, except that m replaces n in each bound. The main difficulty is in replicating Lemma 49, which, given a direction ~u, calculates the corresponding vertex of M; for locational uncertain points this is described in Lemma 51. Moreover, the O(n² log n) bound for R² is described in Lemma 52.
In the locational uncertainty model, Lemma 18 also holds with a stronger general position assumption. With the new general position assumption, it is straightforward to show that the gradient vector is different for two adjacent cones in A(Γ). The other parts of the proof are essentially the same as in Lemma 18. The details can be found below. Theorem 2 also holds for the locational model without any change in the proof (the running time becomes O(n log² n)).

¹³To see this, consider the round in which s′_{j′} is chosen. Let s be the vertex minimizing 〈s, ~u〉. As s_j is not chosen, we must have 〈s′_{j′}, ~u〉 − 〈s, ~u〉 ≥ (1 − ε₁/√d)(〈s_j, ~u〉 − 〈s, ~u〉).
Now, we prove that Lemma 18 also holds for the locational model. For this purpose, we need a stronger general position assumption: (1) For any v ∈ V, ∑_{s∈P} p_{vs} ∈ (0, 1). This suggests that we need to consider the model with both existential and locational uncertainty. We can make this assumption hold by subtracting an infinitesimal value from each probability value without affecting the directional width in any essential way. (2) For any two nodes v₁, v₂ ∈ V, two locations s₁, s₂ ∈ P and two subsets of locations S₁, S₂ ⊆ P, p_{v₁s₁}(∑_{s∈S₂} p_{v₂s}) s₁ ≠ p_{v₂s₂}(∑_{s∈S₁} p_{v₁s}) s₂ (this is indeed a general position assumption, since we only have a finite number of equations to exclude but an uncountable number of choices of the positions).
Lemma 18 (for the locational model). Assuming the locational model and the above general position assumption, the complexity of M is the same as the cardinality of A(Γ), i.e., |M| = |A(Γ)|. Moreover, each cone C ∈ A(Γ) corresponds to exactly one vertex s of ConvH(M) in the following sense: ∇f(M, ~u) = s for all ~u ∈ int C.
Proof. The proof is almost the same as that of Lemma 18, except that we need to show that ∇f(M, ~u) is different for two adjacent cones in A(Γ). Again, let C₁, C₂ be two adjacent cones separated by some hyperplane in Γ. Suppose ~u₁ ∈ int C₁ and ~u₂ ∈ int C₂. Consider the canonical orders O₁ and O₂ of P with respect to ~u₁ and ~u₂, respectively. W.l.o.g., assume that O₁ = (s₁, . . . , s_i, s_{i+1}, . . . , s_n) and O₂ = (s₁, . . . , s_{i+1}, s_i, . . . , s_n).

Let Pr^R(v, s, ~u) be the probability that the largest point along ~u is uncertain node v ∈ V at location s ∈ P. Using the notation from Lemma 51, f(V, ~u) can be computed as ∑_{v∈V, s∈P} Pr^R(v, s, ~u)〈s, ~u〉. Hence, ∇f(V, ~u) = ∑_{v∈V, s∈P} Pr^R(v, s, ~u) s.

Suppose s_i is a possible location for v₁ and s_{i+1} is a possible location for v₂. Denote by P^R(s, ~u) the subset of s′ ∈ P such that 〈s′, ~u〉 > 〈s, ~u〉, and denote by Pr^R_∅(s, ~u) the probability that no node v ∈ V appears at a location larger than s ∈ P along direction ~u. If v₁ ≠ v₂, we have

∇f(V, ~u₁) − ∇f(V, ~u₂) = Pr^R_∅(s_i, ~u₁) · (s_i p_{v₁s_i} ∑_{s′∈P^R(s_i, ~u₁)} p_{v₂s′} − s_{i+1} p_{v₂s_{i+1}} ∑_{s′∈P^R(s_{i+1}, ~u₂)} p_{v₁s′}) ≠ 0.

If v₁ = v₂ = v, we have ∇f(V, ~u₁) − ∇f(V, ~u₂) = Pr^R_∅(s_i, ~u₁) · (s_i p_{vs_i} − s_{i+1} p_{vs_{i+1}}) ≠ 0.
4.7 Missing Details in Section 4.3
Theorem 36. Let τ₁ = O(τ/max{λ, λ²}) and N = O((1/τ₁²) log(1/δ)) = O((max{λ², λ⁴}/τ²) log(1/δ)). With probability at least 1 − δ, for any t ≥ 0 and any direction ~u, we have that

Pr[ω(S, ~u) ≤ t] ∈ Pr[ω(P, ~u) ≤ t] ± τ.
Proof. Fix an arbitrary direction ~u (w.l.o.g., say it is the x-axis) and rename all points
in P as s1, s2, . . . , sn as before. Consider the Poissonized instance of P . Let s′1, . . . , s′N
be the N points in S (also sorted in nondecreasing order of their x-coordinates). Now,
we create a coupling between all mass in A and that in B, as follows. We process all
points in A from left to right, starting with s1. The process has N rounds. In each
round, we assign exactly 1/N units of mass in A to a point in B. In the first round,
if s1 contains less than 1/N units of mass, we proceed to s2, s3, . . . si until we reach
1/N units collectively. We split the last point si into two points si1 and si2 so that
the mass contained in s1, . . . , si−1, si1 is exactly 1/N , and we assign those points to
s′1. We start the next round with si2. If s1 contains more than 1/N units of mass, we
split s1 into s11 (s11 contains 1/N units) and s12 and we start the second round with
s12. We repeat this process until all mass in A is assigned.
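The coupling just described is a greedy cut of the sorted mass sequence into N chunks of mass exactly 1/N, splitting a point whenever a chunk boundary falls inside it. A minimal sketch (names are ours; exact rational arithmetic avoids rounding at the cuts):

```python
from fractions import Fraction as F

def split_into_segments(masses, N):
    """Greedily cut a sorted list of (x, mass) pairs (total mass 1) into N
    consecutive segments of mass exactly 1/N, splitting points at the cuts."""
    target = F(1, N)
    segments, cur, cur_mass = [], [], F(0)
    for x, m in masses:
        m = F(m)
        while cur_mass + m >= target:       # point (partially) fills a segment
            take = target - cur_mass
            cur.append((x, take))
            segments.append(cur)
            cur, cur_mass, m = [], F(0), m - take
        if m > 0:
            cur.append((x, m))
            cur_mass += m
    return segments

A = [(0.0, F(3, 8)), (1.0, F(1, 4)), (2.0, F(3, 8))]
segs = split_into_segments(A, 4)            # 4 segments of mass 1/4 each
assert len(segs) == 4
assert all(sum(m for _, m in seg) == F(1, 4) for seg in segs)
```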
The above coupling can be viewed as a mass transportation from A to B. We
will need one simple but useful property about this transportation: for any vertical
line x = t, at most τ1 units of mass are transported across the vertical line (by
Theorem 32).
In the construction of the coupling, many points in A may be split. We rename
them to be s1, . . . , sL (according to the order in which they are processed). The
sequence s1, . . . , sL can be divided into N segments, each assigned to a point in S.
For a point s′i in S, let seg(i) be the segment (the set of points) assigned to s′i. For
any node s and real t > 0, we use H(s, t) to denote the right open halfplane defined
by the vertical line x = x(s) + t, where x(s) is the x-coordinate of s (see Figure 4-3).
Let X_i (resp. Y_i) be the Poisson random variable corresponding to s_i (resp. s′_i) (i.e., X_i ∼ Pois(λ_{s_i}) and Y_i ∼ Pois(λ/N)) for all i. For any H ⊂ R², we write X(H) = ∑_{s_i∈H∩P} X_i and Y(H) = ∑_{s′_i∈H∩S} Y_i. We can rewrite Pr[ω(S, ~u) ≤ t] as follows:
Pr[ω(S, ~u) ≤ t] = ∑_{i=1}^{N} Pr[s′_i is the leftmost point and ω(S, ~u) ≤ t] + Pr[no point in S appears]

= ∑_{i=1}^{N} Pr[Y_i ≠ 0] · Pr[∑_{j=1}^{i−1} Y_j = 0] · Pr[Y(H(s′_i, t)) = 0] + Pr[∑_{s′_i∈S} Y_i = 0].   (4.7)
Similarly, we can write that¹⁴

Pr[ω(P, ~u) ≤ t] = ∑_{i=1}^{L} Pr[X_i ≠ 0] · Pr[∑_{j=1}^{i−1} X_j = 0] · Pr[X(H(s_i, t)) = 0] + Pr[∑_{s_i∈P} X_i = 0]

= ∑_{i=1}^{N} ∑_{k∈seg(i)} Pr[X_k ≠ 0] · Pr[∑_{j=1}^{k−1} X_j = 0] · Pr[X(H(s_k, t)) = 0] + Pr[∑_{s_i∈P} X_i = 0].   (4.8)
We proceed by showing that each summand of (4.8) is close to the corresponding one in (4.7). First, we can see that Pr[∑_{s′_i∈S} Y_i = 0] = Pr[∑_{s_i∈P} X_i = 0], since both ∑_{s′_i∈S} Y_i and ∑_{s_i∈P} X_i follow the Poisson distribution Pois(λ).

For any segment i, we can see that ∑_{k∈seg(i)} λ_{s_k} = λ/N. Moreover, we have λ_{s_k} ≤ λ/N ≤ τ/32, and thus exp(−λ_{s_k}) ∈ (1 − λ_{s_k}, (1 + τ/16)(1 − λ_{s_k})).
∑_{k∈seg(i)} Pr[X_k ≠ 0] = ∑_{k∈seg(i)} (1 − exp(−λ_{s_k})) ∈ (1 ± τ/16) ∑_{k∈seg(i)} λ_{s_k}

⊂ (1 ± τ/8)(1 − exp(−λ/N)) = (1 ± τ/8) Pr[Y_i ≠ 0].   (4.9)
¹⁴Note that splitting nodes does not change the distribution of ω(P, ~u): suppose a node s (corresponding to r.v. X) was split into two nodes s₁ and s₂ (corresponding to X₁ and X₂, resp.). We can see that Pr[X ≠ 0] = Pr[X₁ ≠ 0 or X₂ ≠ 0] = Pr[X₁ + X₂ ≠ 0].
Figure 4-3: Illustration of the interval graph I. For illustration purposes, co-located points (e.g., points that are split in A) are shown as overlapping points. The arrows indicate the assignment of the segments seg(i) to the points s′_i in B. Theorem 32 ensures that no vertical line can stab many intervals.
Then, we notice that for any k ∈ seg(i) (i.e., sk is in the segment assigned to s′i), it
holds that
Pr[∑_{j=1}^{k} X_j = 0] ∈ [e^{−λi/N}, e^{−λ(i−1)/N}] ⊂ (1 ± τ/8) e^{−λ(i−1)/N} = (1 ± τ/8) Pr[∑_{j=1}^{i−1} Y_j = 0].   (4.10)

The first inclusion holds because ∑_{j=1}^{k} X_j ∼ Pois(∑_{j=1}^{k} λ_{s_j}) and λ(i − 1)/N ≤ ∑_{j=1}^{k} λ_{s_j} ≤ λi/N.
If we could show that Pr[X(H(s_k, t)) = 0] is close to Pr[Y(H(s′_i, t)) = 0] for k ∈ seg(i), we could finish the proof easily, since each summand of (4.8) would be close to the corresponding one in (4.7). However, this is in general not true, and we have to be more careful.

Recall that the sequence s₁, . . . , s_L is divided into N segments. Let K = λ/τ. We say that the ith segment (say seg(i) = {s_j, s_{j+1}, . . . , s_k}) is a good segment if

E_i = max{|B(H(s′_i, t)) − A(H(s_j, t))|, |B(H(s′_i, t)) − A(H(s_k, t))|} ≤ 1/K.
Otherwise, the segment is bad. For a good segment seg(i) and any k ∈ seg(i),

Pr[X(H(s_k, t)) = 0] = exp(−λ A(H(s_k, t))) ∈ exp(−λ B(H(s′_i, t)) ± λ/K)
⊂ Pr[Y(H(s′_i, t)) = 0] e^{±λ/K} ⊂ Pr[Y(H(s′_i, t)) = 0](1 ± τ/8).   (4.11)
We use Gs to denote the set of good segments and Bs the set of bad segments. Now, we consider the summations in both (4.7) and (4.8) restricted to good segments. We have that

∑_{i∈Gs} ∑_{k∈seg(i)} Pr[X_k ≠ 0] · Pr[∑_{j=1}^{k−1} X_j = 0] · Pr[X(H(s_k, t)) = 0]
∈ ∑_{i∈Gs} Pr[Y_i ≠ 0](1 ± τ/8) · Pr[∑_{j=1}^{i−1} Y_j = 0](1 ± τ/8) · Pr[Y(H(s′_i, t)) = 0](1 ± τ/8)
⊂ ∑_{i∈Gs} Pr[Y_i ≠ 0] · Pr[∑_{j=1}^{i−1} Y_j = 0] · Pr[Y(H(s′_i, t)) = 0] ± τ/2,

where the first relation is due to (4.9), (4.10) and (4.11).
Now, we show that the total contributions of bad segments to both (4.7) and (4.8) are small. We partition the bad segments into log(1/τ) + 1 different sets B₀, . . . , B_{log(1/τ)}. Let B_j = {i | 2^j/K < E_i ≤ 2^{j+1}/K} for 0 ≤ j ≤ log(1/τ) − 1, and let B_{log(1/τ)} = {i | E_i > 1/λ}.

With the above notation, we prove the following crucial inequality:

∑_{j=0}^{log(1/τ)} |B_j| · 2^j = O(τ₁NK).   (4.12)
Now, we prove (4.12). Consider all points s₁, . . . , s_L and s′₁, . . . , s′_N lying on the same x-axis. For each i (with seg(i) = {s_j, s_{j+1}, . . . , s_k}), we draw the minimal interval I_i that contains s′_i, s_j and s_k. If the ith segment is bad and belongs to B_j, we also say I_i is a bad interval of label j. All intervals {I_i} define an interval graph I. We can see that any vertical line can stab at most τ₁N + 1 intervals, because at most τ₁ units of mass can be transported across the vertical line, and each interval is responsible for a transportation of exactly 1/N units of mass (except the one that
intersects the vertical line). Hence, the interval graph I can be colored with at most τ₁N + 1 colors (the clique number of I is at most τ₁N + 1, and the chromatic number of an interval graph equals its clique number). Consider a color class C (which consists of a set of non-overlapping intervals). Imagine we move an interval I of length t along the x-axis from left to right. When the left endpoint of I passes through a bad interval of label j in C, by the definition of bad segments, the right endpoint of I passes through Ω(2^j N/K) segments. Suppose the color class C contains b_j bad segments in B_j. Since the right endpoint of I can pass through at most N segments in total, we have, summing over all labels,

∑_{j=0}^{log(1/τ)−1} b_j · 2^j N/K ≤ N.

Summing up over all color classes, we obtain (4.12).
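The coloring step uses the standard fact that interval graphs are perfect: a greedy left-to-right sweep colors the intervals with exactly clique-number many colors (the maximum number of intervals stabbed by one point). A minimal sketch:

```python
import heapq

def color_intervals(intervals):
    """Greedy coloring of intervals (l, r): sweep by left endpoint, reusing the
    color of any interval that has already ended.  Uses exactly as many colors
    as the maximum number of intervals stabbed by a single point."""
    free, colors, next_color = [], {}, 0   # free: heap of (right_end, color)
    for l, r in sorted(intervals):
        if free and free[0][0] <= l:       # some earlier interval ended: reuse
            _, c = heapq.heappop(free)
        else:                              # all active: open a new color class
            c, next_color = next_color, next_color + 1
        colors[(l, r)] = c
        heapq.heappush(free, (r, c))
    return colors

ivs = [(0, 3), (1, 5), (4, 6), (2, 7), (6, 8)]
cols = color_intervals(ivs)
# No two (openly) overlapping intervals share a color:
for a in ivs:
    for b in ivs:
        if a < b and a[1] > b[0] and b[1] > a[0]:
            assert cols[a] != cols[b]
```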
For B_j (0 ≤ j ≤ log(1/τ)), we can bound the total contribution as follows. By the definition of B_j, we can see that

∑_{i∈B_j} ∑_{k∈seg(i)} Pr[X_k ≠ 0] · Pr[∑_{j′=1}^{k−1} X_{j′} = 0] · Pr[X(H(s_k, t)) = 0]
⊂ (1 ± 5 · 2^j τ) ∑_{i∈B_j} Pr[Y_i ≠ 0] · Pr[∑_{j′=1}^{i−1} Y_{j′} = 0] · Pr[Y(H(s′_i, t)) = 0].

Thus, the total contribution of the bad segments in B_j (0 ≤ j ≤ log(1/τ) − 1) to the difference between the corresponding summands of (4.8) and (4.7) is at most

5 · 2^j τ ∑_{i∈B_j} Pr[Y_i ≠ 0] = 5|B_j| · 2^j τ · (1 − exp(−λ/N)) = O(|B_j| 2^j τ λ/N),

where Pr[Y_i ≠ 0] = 1 − exp(−λ/N) since Y_i ∼ Pois(λ/N).
For B_{log(1/τ)}, the total contribution is bounded as follows:

|∑_{i∈B_{log(1/τ)}} (∑_{k∈seg(i)} Pr[X_k ≠ 0] · Pr[∑_{j=1}^{k−1} X_j = 0] · Pr[X(H(s_k, t)) = 0] − Pr[Y_i ≠ 0] · Pr[∑_{j=1}^{i−1} Y_j = 0] · Pr[Y(H(s′_i, t)) = 0])|
≤ ∑_{i∈B_{log(1/τ)}} ∑_{k∈seg(i)} Pr[X_k ≠ 0] + ∑_{i∈B_{log(1/τ)}} Pr[Y_i ≠ 0] ≤ 3 ∑_{i∈B_{log(1/τ)}} Pr[Y_i ≠ 0]
≤ 3|B_{log(1/τ)}| · (1 − exp(−λ/N)) = O(|B_{log(1/τ)}| λ/N).
Summing over all j and using (4.12), we obtain

∑_{j=0}^{log(1/τ)} O(|B_j| 2^j τ λ/N) = O(τ₁τλK) ≤ τ/4.

This finishes the proof.
4.8 Missing Details in Section 4.4
Lemma 45. Let N = O(ε₁^{−2} ε₀^{−(d−1)/2} log(1/ε₀)), where ε₀ = (ε/4(r − 1))^r and ε₁ = εβ². For any t ≥ 0 and any direction ~u ∈ P⋆, we have that

Pr_{P∼S}[max_{s∈P} 〈~u, s〉^{1/r} ≥ t] ∈ Pr_{P∼P}[max_{s∈E(P)} 〈~u, s〉^{1/r} ≥ t] ± ε₁/4, and
Pr_{P∼S}[min_{s∈P} 〈~u, s〉^{1/r} ≥ t] ∈ Pr_{P∼P}[min_{s∈E(P)} 〈~u, s〉^{1/r} ≥ t] ± ε₁/4.
Proof. The argument is almost the same as that of Lemma 30. Let L = O(ε₀^{−(d−1)/2}). We again build a mapping g that maps each realization E(P) to a point in R^{dL}, as follows. Consider a realization P of P. Suppose E(P) = {(x¹₁, . . . , x¹_d), . . . , (x^L₁, . . . , x^L_d)} (if |E(P)| < L, we pad it with (0, . . . , 0)). We let g(E(P)) = (x¹₁, . . . , x¹_d, . . . , x^L₁, . . . , x^L_d) ∈ R^{dL}. For any t ≥ 0 and any direction ~u ∈ P⋆, note that max_{s∈E(P)} 〈~u, s〉^{1/r} ≥ t holds if and only if there exists some 1 ≤ i ≤ |E(P)| satisfying ∑_{j=1}^{d} x^i_j ~u_j ≥ t^r, which is equivalent to saying that the point g(E(P)) is in the union of those |E(P)| half-spaces.

Let X be the image set of g. Let (X, R_i) (1 ≤ i ≤ L) be a range space, where R_i is the set of half-spaces {∑_{j=1}^{d} x^i_j ~u_j ≥ t | ~u = (~u₁, . . . , ~u_d) ∈ R^d, t ≥ 0}. Let R′ = {∪_i r_i | r_i ∈ R_i, i ∈ [L]}. Note that each (X, R_i) has VC-dimension d + 1. By Theorem 28, the VC-dimension of (X, R′) is bounded by O((d + 1)L lg L) = O(ε₀^{−(d−1)/2} log(1/ε₀)). Then by Theorem 29, for any t and any direction ~u, we have that Pr_{P∼S}[max_{s∈P} 〈~u, s〉^{1/r} ≥ t] ∈ Pr_{P∼P}[max_{s∈E(P)} 〈~u, s〉^{1/r} ≥ t] ± ε₁/4. The proof of the second statement is the same.
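The linearization g used in the proof is easy to spell out in code; the following sketch (toy numbers, r = 2, names are ours) checks that membership in the union of half-spaces, restricted to the blocks of actually-present points, coincides with the max-threshold condition:

```python
def g(realization, L, d):
    """The map g from the proof: flatten E(P) into R^{dL}, zero-padded."""
    flat = [x for s in realization for x in s]
    return flat + [0.0] * (d * L - len(flat))

# Claimed equivalence:  max_{s in E(P)} <u, s>^{1/r} >= t  iff  g(E(P)) lies in
# the union of the half-spaces sum_j x^i_j u_j >= t^r (one per present point i).
L, d, r = 4, 2, 2
E = [(0.5, 1.0), (2.0, 0.25)]
y = g(E, L, d)
for u in [(1.0, 0.0), (0.3, 0.7), (-1.0, 2.0)]:
    for t in [0.0, 0.5, 1.0, 1.5]:
        lhs = max(sum(uj * sj for uj, sj in zip(u, s)) for s in E) >= t ** r
        rhs = any(sum(u[j] * y[i * d + j] for j in range(d)) >= t ** r
                  for i in range(len(E)))
        assert lhs == rhs
```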
4.9 Computing the Expected Direction Width
We handle both the existential and the locational model of uncertain points in this section. For any direction ~u, denote by ω(P, ~u) the expected width of P along the direction ~u, and by f(P, ~u) = E_{P∼P}[max_{p∈P} 〈~u, p〉] the support function. Recall that ω(P, ~u) = f(P, ~u) − f(P, −~u) by linearity of expectation.
4.9.1 Computing Expected Width for Existential Uncertainty
The existential model is a bit simpler and we handle that first. Recall in this model
we let P be a set of n uncertain points, and each point s ∈ P has a probability ps.
We have the following two lemmas.
Lemma 49. For any direction ~u, we can compute ω(P , ~u), f(P , ~u), and ∇f(P , ~u) in
O(n log n) time; if the points of P are already sorted along the direction ~u, then we
can compute them in O(n) time.
Proof. Consider any direction ~u. Without loss of generality, assume ‖~u‖ = 1. In the
following, we first show how to compute f(P , ~u). The value f(P ,−~u) can be computed
in a similar manner and we ignore the discussion. After having f(P , ~u) and f(P ,−~u),
ω(P , ~u) can be computed immediately by ω(P , ~u) = f(P , ~u)− f(P ,−~u). Finally, we
will discuss how to compute ∇f(P , ~u). Let ρ(~u) be the ray of direction ~u in the plane
passing through the origin.
Consider a point s ∈ P. Note that 〈s, ~u〉 is the coordinate of the perpendicular projection of s on ρ(~u). Denote by P^R(s, ~u) the subset of points s′ ∈ P such that s′ >_{~u} s (i.e., 〈s′, ~u〉 > 〈s, ~u〉). Denote by Pr^R(s, ~u) the probability that s appears in a realization but no point of P^R(s, ~u) appears (i.e., 〈s, ~u〉 is the largest among all points of P that appear in the realization). Hence, we have

Pr^R(s, ~u) = p_s · ∏_{s′ >_{~u} s} (1 − p_{s′}).   (4.13)

Now f(P, ~u) can be seen as the expected largest coordinate of the projections of the points in P on ρ(~u). According to the definition of Pr^R(s, ~u), we have f(P, ~u) = ∑_{s∈P} Pr^R(s, ~u)〈s, ~u〉.

Based on the above discussion, we can compute f(P, ~u) in the following way. First,
we project all points of P on ρ(~u) and obtain the coordinate 〈~u, s〉 for each s ∈ P .
Second, we sort all points of P by the coordinates of their projections on ρ(~u). Then,
the values PrR(s, ~u) for all points s ∈ P can be obtained in O(n) time by considering
the projection points on ρ(~u) from right to left. Finally, f(P , ~u) can be computed in
additional O(n) time. Therefore, the total time for computing f(P , ~u) is O(n log n),
which is dominated by the sorting. If the points of P are given sorted along the
direction ~u, then we can avoid the sorting step and compute f(P , ~u) in overall O(n)
time.
It remains to compute ∇f(P, ~u). Recall that ∇f(P, ~u) = ∑_{s∈P} Pr^R(s, ~u) s by the proof of Lemma 18. Note that we have already computed Pr^R(s, ~u) for all points s ∈ P above. Therefore, ∇f(P, ~u) can be computed in additional O(n) time. The lemma thus follows.
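The scan in the proof of Lemma 49 is short to implement. A minimal sketch for the existential model (names are ours; the empty realization contributes 0 to the expectation, matching the sum f(P, ~u) = ∑_s Pr^R(s, ~u)〈s, ~u〉):

```python
def support_value(points, probs, u):
    """f(P, u) = E[max_{s in P} <u, s>] for the existential model, via the
    right-to-left scan of Lemma 49: Pr^R(s, u) = p_s * prod_{s' > s} (1 - p_{s'})."""
    order = sorted(range(len(points)),
                   key=lambda i: sum(a * b for a, b in zip(points[i], u)))
    f, none_after = 0.0, 1.0          # none_after = prod over points to the right
    for i in reversed(order):         # scan projections from right to left
        coord = sum(a * b for a, b in zip(points[i], u))
        f += probs[i] * none_after * coord   # Pr^R(s, u) * <s, u>
        none_after *= 1.0 - probs[i]
    return f

# Two points on the x-axis at 1 and 2, each present w.p. 1/2, u = (1, 0):
# the largest projection is 2 w.p. 1/2, 1 w.p. 1/4, and 0 (empty) w.p. 1/4.
print(support_value([(1, 0), (2, 0)], [0.5, 0.5], (1, 0)))  # -> 1.25
```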
Lemma 50. We can build a data structure of O(n2) size in O(n2 log n) time that can
compute ω(P , ~u), f(P , ~u), and ∇f(P , ~u) in O(log n) time for any query direction ~u.
Further, we can construct M explicitly in O(n2 log n) time.
Proof. Consider any direction ~u with ‖~u‖ = 1. We follow the definitions and notations
in the proof of Lemma 49. We first show how to build a data structure to compute
f(P , ~u). Computing f(P ,−~u) can be done similarly. Again, after having f(P , ~u) and
f(P ,−~u), ω(P , ~u) can be computed immediately by ω(P , ~u) = f(P , ~u)− f(P ,−~u).
Denote by o the origin. For any ray ρ through o in the plane, we refer to the
angle of ρ as the angle α in [0, 2π) such that after we rotate the x-axis around o
counterclockwise by α the x-axis has the same direction as ρ (see Fig. 4-4(a)). For any (undirected) line l through o, we refer to the angle of l as the angle α in [0, π) such that after we rotate the x-axis around o counterclockwise by α the x-axis is collinear with l.

Figure 4-4: Illustrating the definition of the angle α of (a) a ray ρ and (b) a line l.
Recall that ρ(~u) is the ray through o with direction ~u. We define the angle of ~u as the angle of the ray ρ(~u), denoted by θ~u. For ease of discussion, we assume θ~u is in [0, π), since the case θ~u ∈ [π, 2π) can be handled similarly.
We call the order of the points of P sorted by the coordinates of their projections
on the ray ρ(~u) the canonical order of P with respect to ~u. An easy observation
is that when we increase the angle θ~u, the canonical order of P does not change
until ~u is perpendicular to a line containing two points of P . There are O(n2) lines
in the plane each of which contains two points of P and the directions of these
lines partition [0, π) into O(n2) intervals such that if θ~u changes in each interval the
canonical order of P does not change. In the following, we show that for each of the
above intervals, the value of f(P , ~u) is a function of the angle θ~u, and more specifically
f(P , ~u) = a · cos(θ~u) + b · sin(θ~u) where a and b are constants when θ~u changes in the
interval. As preprocessing for the lemma, we will compute the function f(P , ~u) for
each interval; for each query direction ~u, we first find the interval that contains θ~u
by binary search in O(log n) time and then obtain the value f(P , ~u) in constant time
using the function for the interval. The details are given below.
For simplicity of discussion, we make a general position assumption that no three
points of P are collinear. For any two points s and s′ in P , let β(s, s′) denote the
angle of the line perpendicular to the line containing s and s′, and we also say β(s, s′)
is defined by s and s′. We sort all O(n2) angles β(s, s′) for s, s′ ∈ P in increasing
order, and let β1, β2, . . . , βh be the sorted list with h = O(n2). For simplicity, let
β0 = 0 and βh+1 = π. These angles partition [0, π) into h + 1 intervals. Consider
an interval Ii = (βi, βi+1) for any 0 ≤ i ≤ h. Below we compute the function
f(P , ~u) = a · cos(θ~u) + b · sin(θ~u) for θ~u ∈ (βi, βi+1). Again, note that when θ~u changes
in Ii, the canonical order of P does not change.
According to the proof of Lemma 49, f(P, ~u) = ∑_{s∈P} Pr^R(s, ~u)〈s, ~u〉. Since the canonical order of P does not change for any θ~u ∈ I_i, for any s ∈ P, Pr^R(s, ~u) is a constant when θ~u changes in I_i. Next, we consider the coordinate 〈s, ~u〉 on ρ(~u).

For each point s ∈ P, let α_s be the angle of the ray originating from o and containing s (i.e., directed from o to s), and let d_s be the length of the line segment os. Note that α_s and d_s are fixed by the input. Then, we have (see Fig. 4-5)

〈s, ~u〉 = d_s · cos(α_s − θ~u) = d_s · cos(α_s) · cos(θ~u) + d_s · sin(α_s) · sin(θ~u).
Hence, we have the following:

f(P, ~u) = ∑_{s∈P} Pr^R(s, ~u)〈s, ~u〉
= ∑_{s∈P} Pr^R(s, ~u) · d_s · [cos(α_s) · cos(θ~u) + sin(α_s) · sin(θ~u)]
= cos(θ~u) · [∑_{s∈P} Pr^R(s, ~u) · d_s · cos(α_s)] + sin(θ~u) · [∑_{s∈P} Pr^R(s, ~u) · d_s · sin(α_s)].

Let a = ∑_{s∈P} Pr^R(s, ~u) · d_s · cos(α_s) and b = ∑_{s∈P} Pr^R(s, ~u) · d_s · sin(α_s). Hence, a and b are constants when θ~u changes in I_i. Then, we have f(P, ~u) = a · cos(θ~u) + b · sin(θ~u) for any θ~u ∈ I_i. Therefore, if we know the two values a and b, we can compute f(P, ~u) in constant time for any direction θ~u ∈ I_i.

In the sequel, we show that we can compute a and b for all intervals I_i = (β_i, β_{i+1})
with i = 0, 1, . . . , h in O(n2) time. For each interval Ii, we use a(Ii) and b(Ii) to
denote the corresponding a and b respectively for the interval Ii.
Suppose we have computed a(Ii) and b(Ii) for the interval Ii, and also suppose
Figure 4-5: Illustrating the computation of the coordinate 〈s, ~u〉 on ρ(~u): s(~u) is the perpendicular projection of s on ρ(~u); the length of the segment os is d_s.
have computed the value PrR(s, ~u) for each point s ∈ P when θ~u ∈ Ii (note that
PrR(s, ~u) is a constant for any θ~u ∈ Ii). Initially, we can compute these values for
the interval I0 in O(n log n) time by Lemma 49. Below, we show that we can obtain
a(Ii+1) and b(Ii+1) in constant time, based on the above values maintained for Ii.
Recall that Ii = (βi, βi+1) and Ii+1 = (βi+1, βi+2). Suppose the angle βi+1 is
defined by the two points s1 and s2 of P . In other words, βi+1 is the angle of the line
perpendicular to the line through s1 and s2. If we increase the angle θ~u in (βi, βi+2),
the canonical order of P does not change, except that s₁ and s₂ exchange their order when θ~u passes the value β_{i+1}. Therefore, for each point s ∈ P \ {s₁, s₂}, the value Pr^R(s, ~u) is a constant for any θ~u ∈ (β_i, β_{i+2}). Based on this observation, we can compute a(I_{i+1}) in the following way.
We first analyze the change of the values Pr^R(s₁, ~u) and Pr^R(s₂, ~u) when θ~u moves from I_i to I_{i+1}. Let ~u and ~u′ be any two directions such that θ~u ∈ I_i and θ~u′ ∈ I_{i+1}. Without loss of generality, we assume 〈s₁, ~u〉 < 〈s₂, ~u〉, and thus 〈s₁, ~u′〉 > 〈s₂, ~u′〉 since s₁ and s₂ exchange their order. Observe that Pr^R(s₁, ~u′) = Pr^R(s₁, ~u)/(1 − p_{s₂}) and Pr^R(s₂, ~u′) = Pr^R(s₂, ~u) · (1 − p_{s₁}). Thus, we can obtain Pr^R(s₁, ~u′) and Pr^R(s₂, ~u′) in constant time, since we already maintain Pr^R(s₁, ~u) and Pr^R(s₂, ~u). Consequently, we have

a(I_{i+1}) = a(I_i) − [Pr^R(s₁, ~u) d_{s₁} cos(α_{s₁}) + Pr^R(s₂, ~u) d_{s₂} cos(α_{s₂})]
+ [Pr^R(s₁, ~u′) d_{s₁} cos(α_{s₁}) + Pr^R(s₂, ~u′) d_{s₂} cos(α_{s₂})]   (4.14)
= a(I_i) + d_{s₁} cos(α_{s₁}) · [Pr^R(s₁, ~u′) − Pr^R(s₁, ~u)] + d_{s₂} cos(α_{s₂}) · [Pr^R(s₂, ~u′) − Pr^R(s₂, ~u)].
Hence, after we compute Pr^R(s₁, ~u′) and Pr^R(s₂, ~u′), we can obtain a(I_{i+1}) in constant time.
Similarly, we can obtain b(Ii+1) in constant time. Also, the values PrR(s1, ~u) and
PrR(s2, ~u) are updated for θ~u ∈ Ii+1.
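The constant-time update at an event angle can be sketched as follows (a minimal sketch with names of ours; PrR, p, d, alpha are dictionaries keyed by point, and s1, s2 are the swapping pair, with s1 before s2 along the old direction):

```python
import math

def swap_update(PrR, a, b, s1, s2, p, d, alpha):
    """Update Pr^R and the coefficients (a, b) of f = a*cos + b*sin in O(1)
    when s1 and s2 exchange their order (s1 was before s2 along u)."""
    new1 = PrR[s1] / (1.0 - p[s2])    # s2 no longer lies after s1
    new2 = PrR[s2] * (1.0 - p[s1])    # s1 now lies after s2
    a += d[s1] * math.cos(alpha[s1]) * (new1 - PrR[s1]) \
       + d[s2] * math.cos(alpha[s2]) * (new2 - PrR[s2])
    b += d[s1] * math.sin(alpha[s1]) * (new1 - PrR[s1]) \
       + d[s2] * math.sin(alpha[s2]) * (new2 - PrR[s2])
    PrR[s1], PrR[s2] = new1, new2
    return a, b

# Two points with p = 1/2 each, both at distance 1 and angle 0; initially
# s1 < s2, so Pr^R(s1) = 1/2 * 1/2 = 1/4 and Pr^R(s2) = 1/2.  After the swap
# the two Pr^R values are exchanged, and here a stays the same.
PrR = {"s1": 0.25, "s2": 0.5}
p = {"s1": 0.5, "s2": 0.5}
d = {"s1": 1.0, "s2": 1.0}
alpha = {"s1": 0.0, "s2": 0.0}
a, b = swap_update(PrR, 0.75, 0.0, "s1", "s2", p, d, alpha)
assert PrR == {"s1": 0.5, "s2": 0.25} and abs(a - 0.75) < 1e-12
```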
In summary, after the O(n2) angles β(s, s′) are sorted in O(n2 log n) time, the
above computes the functions f(P , ~u) = a(Ii) · cos(θ~u) + b(Ii) · sin(θ~u) for all intervals
Ii with i = 0, 1, . . . , h, in additional O(n2) time. This finishes our preprocessing.
Consider any query direction ~u. By binary search, we first find the two angles βi
and βi+1 such that βi ≤ θ~u < βi+1. If βi 6= θ~u, then θ~u is in Ii and we can use the
function f(P , ~u) = a(Ii) cos(θ~u) + b(Ii) sin(θ~u) to compute f(P , ~u) in constant time. If
βi = θ~u, then the function f(P , ~u) = a(Ii) cos(θ~u) + b(Ii) sin(θ~u) still gives the correct
value of f(P , ~u) since when θ~u = βi the projections of the two points of P defining
βi on ρ(~u) overlap and we can still consider the canonical order of P for θ~u = βi the
same as that for θ~u ∈ Ii. Hence, the query time is O(log n).
Next, we show how to compute ∇f(P, ~u). Recall that ∇f(P, ~u) = ∑_{s∈P} Pr^R(s, ~u) s by the proof of Lemma 18. As preprocessing, we compute the value ∑_{s∈P} Pr^R(s, ~u) s for each interval θ~u ∈ (β_i, β_{i+1}), i = 0, 1, . . . , h. This can be done in O(n²) time (after we sort all angles), using a similar idea as above. Specifically, suppose we already have ∇f(P, ~u) = ∑_{s∈P} Pr^R(s, ~u) s for θ~u ∈ (β_i, β_{i+1}); then we can compute ∇f(P, ~u) = ∑_{s∈P} Pr^R(s, ~u) s for θ~u ∈ (β_{i+1}, β_{i+2}) in constant time. This is because when θ~u moves from (β_i, β_{i+1}) to (β_{i+1}, β_{i+2}), Pr^R(s, ~u) does not change for any s ∈ P \ {s₁, s₂}, and Pr^R(s, ~u) for s ∈ {s₁, s₂} can be updated in constant time, as shown above. With this preprocessing, given any direction ~u, we can compute ∇f(P, ~u) in O(log n) time by binary search, similar to the computation of f(P, ~u). Further, according to Lemma 18, the above preprocessing essentially computes M, in O(n² log n) total time. The lemma thus follows.
4.9.2 Computing Expected Width for Locational Uncertainty
In this setting, let V be a set of m uncertain points, each taking one of several locations
from a set P of n locations. The probability that a node v ∈ V is in location s ∈ P
is denoted pvs. To simplify the analysis and discussion, we assume each location s ∈ P can be realized by only one uncertain point v ∈ V .
We now replicate the lemmas in the previous section for this setting. We use the
same notation and structure when possible.
Lemma 51. For any direction ~u, we can compute ω(V , ~u), f(V , ~u), and ∇f(V , ~u) in
O(n log n) time; if the locations of P are already sorted along the direction ~u, then we
can compute them in O(n) time.
Proof. Again, we first compute f(V , ~u) since ω(V , ~u) = f(V , ~u)−f(V ,−~u) and f(V ,−~u)
can be computed similarly.
We follow the structure and proof of Lemma 49 and just note the changes. The
first change is that we need to keep a bit more structure, since there is now dependence
between the different locations of each uncertain node v. Recall that PR(s, ~u) is the
subset of s′ ∈ P such that 〈s′, ~u〉 > 〈s, ~u〉, and PrR∅(s, ~u) is the probability that no
node v ∈ V appears at a larger location than s ∈ P along direction ~u. To describe this
probability, we first define a vector Av indexed by s as Av[s] = 1 − ∑_{s′∈PR(s,~u)} pvs′,
the probability that uncertain node v does not appear in any of its possible locations
which are after s along direction ~u. Now we can define

PrR∅(s, ~u) = ∏_{v∈V} Av[s].
Also recall that PrR(v, s, ~u) is the probability that the largest point along ~u is uncertain
point v ∈ V at location s ∈ P . This updates equation (4.13) to be

PrR(v, s, ~u) = pvs · PrR∅(s, ~u)/Av[s].

Note the two key differences. First, we need to sum the probabilities over each location
of v, since they are mutually exclusive. Second, the value Av[s] needs to be factored out
of PrR∅(s, ~u) because it is already accounted for by pvs, which places v at s; again, the
locations are mutually exclusive.
It follows that f(V , ~u) = ∑_{v∈V, s∈P} PrR(v, s, ~u)·〈s, ~u〉.
To compute f(V , ~u), we again start by projecting all points of P onto ρ(~u), obtaining
coordinates 〈~u, s〉, and sorting if needed. This takes O(n log n) time. Given these
sorted coordinates, it now takes a bit more work in the locational setting to show
that f(V , ~u) can be computed in O(n) additional time. We focus on computing all n
values PrR(v, s, ~u); from there it is straightforward to compute f(V , ~u) in O(n) time.
We sweep over the locations s ∈ P from largest 〈s, ~u〉 value to smallest, and
we maintain each Av[s] and PrR∅(s, ~u) along the way. Given these, it is not hard
to calculate PrR(v, s′, ~u) in constant time using pvs′, where s′ ∈ P has the next smallest
value 〈s′, ~u〉. The important observation is that we only need to update Av[s]_new =
Av[s]_old − pvs if pvs > 0 (which by assumption holds for only one v ∈ V ). Then
PrR∅(s, ~u) is updated by multiplying by Av[s]_new/Av[s]_old. Both operations can be
done in constant time, as needed to complete the proof.
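The sweep can be sketched as follows (a hedged illustration with our own function and variable names; locations are assumed pre-sorted in decreasing order of 〈s, ~u〉, and `owner[s]` is the unique node v with pvs > 0, per the assumption above):

```python
def extreme_point_probs(locations, owner, p):
    """One sweep from the largest <s, u> coordinate to the smallest, maintaining
    A[v] = Pr[v is not realized at any location after the current one] and
    pr_empty = prod_v A[v].  Returns, for each location s, the probability
    Pr_R(v, s, u) that the extreme point along u is node owner[s] at s."""
    A = {v: 1.0 for v in set(owner.values())}
    pr_empty = 1.0                        # Pr[no node realized after current s]
    out = {}
    for s in locations:                   # decreasing order of <s, u>
        v = owner[s]
        out[s] = p[s] * pr_empty / A[v]   # factor A[v] out, then place v at s
        old = A[v]
        A[v] = old - p[s]                 # s now counts as "after" for smaller coords
        pr_empty *= A[v] / old if old > 0 else 0.0
    return out
```

For example, with two nodes where v1 owns the larger location s1 and v2 owns s2, each with probability 0.5, the sweep gives 0.5 for (v1, s1) and 0.25 for (v2, s2), matching Pr[v2 at s2] · Pr[v1 absent].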
It remains to compute ∇f(V , ~u). It is not hard to see that in the locational model

∇f(V , ~u) = ∑_{v∈V, s∈P} PrR(v, s, ~u)·s.

Note that the above has already computed the n values PrR(v, s, ~u) for all v ∈ V and
s ∈ P . Therefore, ∇f(V , ~u) can be computed in additional O(n) time.
Lemma 52. We can build a data structure of O(n^2) size in O(n^2 log n) time that can
compute ω(V , ~u), f(V , ~u), and ∇f(V , ~u) in O(log n) time for any query direction ~u.
Further, we can construct M explicitly in O(n^2 log n) time.
Proof. Again we first discuss the case of computing f(V , ~u). For ease of discussion,
we assume the angle θ~u is in [0, π). We again follow the structure of Lemma 50. The
geometry is largely the same, except that there are h = O(n^2) angles β1, β2, . . . , βh,
since each pair s, s′ ∈ P now defines an angle β(s, s′). It remains true that
f(V , ~u) = a·cos(θ~u) + b·sin(θ~u) for some constants a and b for any θ~u in each (βi, βi+1).
The argument is virtually the same, replacing PrR(s, ~u) with PrR(v, s, ~u).
It remains to show that we can calculate the constants a(Ii+1) and b(Ii+1) for
an interval Ii+1 = (βi+1, βi+2) efficiently, given the values for interval Ii = (βi, βi+1).
Assume βi+1 is defined by two points s1, s2 ∈ P , where 〈s1, ~u〉 < 〈s2, ~u〉 for θ~u ∈ Ii
and 〈s1, ~u′〉 > 〈s2, ~u′〉 for θ~u′ ∈ Ii+1. By definition, the ordering among all other pairs
of points is unchanged within (βi, βi+2). Let only v1 take location s1 with positive
probability and only v2 take location s2 with positive probability. We focus on the
more general case where v1 ≠ v2; when v1 = v2, the update is easier.
We again focus on updating a; the algorithm for b is symmetric. By the
v1 ≠ v2 assumption, Av1[s1] and Av2[s2] are unchanged in the interval (βi, βi+2). However,
from direction ~u to ~u′, Av2[s1] increases by pv2s2 and Av1[s2] decreases by pv1s1; all
other such values are unchanged. Let Av[s]^~u denote the value in direction ~u. Hence

PrR∅(s1, ~u′) = PrR∅(s1, ~u) · Av2[s1]^~u′ / Av2[s1]^~u,

and so if we can update Av2[s1], we can update PrR∅(s1, ~u′) in constant time. The only
other values that need to be updated are Av1[s2] and PrR∅(s2, ~u′). Then

PrR(v1, s1, ~u′) = pv1s1 · PrR∅(s1, ~u′)/Av1[s1] = PrR(v1, s1, ~u) · PrR∅(s1, ~u′)/PrR∅(s1, ~u)

can also be updated in constant time, and similarly for PrR(v2, s2, ~u′). Thus the only
remaining difficulty is accessing and updating Av1[s2] and Av2[s1]. We can easily do
this if we store the full size-n array Av[·] for each v ∈ V . Note that this takes
O(n ·m) space, but since the output is a structure of size O(n^2) and m ≤ n, this is
not prohibitive. (We note that these full arrays are not explicitly required in Lemma
51, which only requires O(n) space.)
Finally, we can update a(Ii+1) from a(Ii) similarly to equation (4.14), using PrR(v, s, ~u)
in place of PrR(s, ~u). Thus in O(n^2) time, after sorting all interval breakpoints in
O(n^2 log n) time, we can build a data structure that allows calculation of f(V , ~u) for
any ~u in O(log n) time.
Next, to compute ∇f(V , ~u), recall that in the locational model ∇f(V , ~u) =
∑_{v∈V, s∈P} PrR(v, s, ~u)·s by the proof of Lemma 18. As preprocessing, we compute
the value ∑_{v∈V, s∈P} PrR(v, s, ~u)·s for each interval θ~u ∈ (βi, βi+1), for
i = 0, 1, . . . , h. This can be done in O(n^2) time
(after we sort all angles), using a similar idea to the one above. The argument is similar
to that in Lemma 50, and we omit the details. Due to the above preprocessing,
given any direction ~u, we can compute ∇f(V , ~u) in O(log n) time by binary search.
Further, the above preprocessing essentially computes M , in O(n^2 log n) total time.
The lemma thus follows.
Chapter 5 Coreset Construction for Stochastic
Shape Fitting Problems
Solving geometric optimization problems over uncertain data has become increasingly
important in many applications and has attracted a lot of attention in recent years.
In this chapter, we study two important geometric optimization problems, the k-
center problem and the j-flat-center problem, over stochastic/uncertain data points in
Euclidean spaces. We consider both problems under two popular stochastic geometric
models, the existential uncertainty model and the locational uncertainty model. We
provide the first PTAS (Polynomial Time Approximation Scheme) for both problems
under the two models. Our results generalize the previous results for stochastic
minimum enclosing ball and stochastic enclosing cylinder.
5.1 Coreset Construction for Deterministic Shape Fitting
Problems
In this section, we introduce an important class of problems in computational
geometry, called shape fitting problems (see e.g., [106]). We also review two useful
concepts used to bound the size of coresets, called total sensitivity and VC-dimension.
Finally, we briefly introduce a framework for constructing coresets for deterministic
shape fitting problems.
Definition 53. (Shape fitting problems) A shape fitting problem is specified by a triple
(Rd,F , d). Here the set F of shapes is a family of subsets of Rd (e.g., all k-point sets,
or all j-flats), and d : Rd × Rd → R≥0 is a symmetric distance function. Define
the distance of a point s ∈ Rd to a shape F ∈ F to be d(s, F ) = min_{s′∈F} d(s, s′). An
instance P of the shape fitting problem is a (weighted) point set {s1, . . . , sn} (si ∈ Rd),
and each si has a positive weight wi ∈ R+. The goal is to find a shape which best fits P ,
that is, a shape minimizing ∑_{si∈P} wi · d(si, F ) over all shapes F ∈ F .
In this dissertation, if we consider the Euclidean space Rd, we let the function
d(·, ·) be the Euclidean distance function. Next, we introduce a powerful technique
for deterministic shape fitting problems.
Definition 54. (Coreset for shape fitting problems) Given a (weighted) instance P
of a shape fitting problem (Rd,F , d) with a weight function w : P → R+, an ε-coreset
of P is a (weighted) point set S together with a weight function w′ : S → R+, such that
for any shape F ∈ F , we have that

∑_{si∈S} w′i · d(si, F ) ∈ (1± ε) ∑_{si∈P} wi · d(si, F ).1
Total sensitivity and dimension. Feldman and Langberg [45] proposed a framework
to construct coresets for a variety of deterministic shape fitting problems. We
briefly introduce their framework in the following. To bound the size of coresets
for deterministic shape fitting problems, a useful notion is total sensitivity,
originally introduced in [77].
Definition 55. (Total sensitivity of a shape fitting instance) Given an instance
P = {s1, . . . , sn} of a shape fitting problem (Rd,F , d), with a weight function w :
P → R+, the sensitivity of si ∈ P is σP (si) := inf{β ≥ 0 | wi · d(si, F ) ≤ β ∑_{j∈[n]} wj · d(sj , F ), ∀F ∈ F}. The total sensitivity of P is defined by GP = ∑_{si∈P} σP (si).
We also recall the definition of VC-dimension and shattering dimension and show
their relations.
Definition 56. (VC-dimension and shattering dimension) Let P = {s1, . . . , sn} be
an instance of a shape fitting problem (Rd,F , d). Suppose wi is the weight of si. We
consider the range space (P,R), where R is a family of subsets RF,r of P defined as
follows: given an F ∈ F and r ≥ 0, let RF,r = {si ∈ P | wi · d(si, F ) ≥ r} consist
1The notation (1± ε)B means the interval [(1− ε)B, (1 + ε)B].
of those points si whose weighted distance to the shape F is at least r. We define
the VC-dimension of the instance P , denoted dimV C(P ), to be the largest
integer m such that there exists a weight function w and a subset A ⊆ P satisfying
|{A ∩ RF,r | F ∈ F , r ≥ 0}| = 2^{|A|}. We also define the shattering dimension
of the instance P , denoted dim(P ), to be the smallest integer m such that for any weight
function w and A ⊆ P of size |A| = a ≥ 2, we have |{A ∩ RF,r | F ∈ F , r ≥ 0}| ≤ a^m.
The next lemma offers a natural way to bound the VC-dimension of a range space
by bounding the shattering dimension.
Lemma 57. ([58, Lemma 5.14]) If (X,R) is a range space with shattering dimension
d, then its VC-dimension is bounded by O(d log d).
A framework for constructing coresets of shape fitting instances. The main
idea of the framework in [45] is based on the following lemma.
Lemma 58. Given any instance P = {s1, . . . , sn} of a shape fitting problem (Rd,F , d),
any weight function w : P → R+, and any ε ∈ (0, 1], there exists an ε-coreset for P
of size O((GP /ε)^2 dimV C(P )).
We take the 1-median problem as an example. In this example, F = Rd and d is the
Euclidean distance function. The framework mainly consists of the following steps.
1. Compute F ∈ F which is a constant-factor approximate shape in F . For 1-median,
we only need to choose F = s∗ = arg min_{s∈P} ∑_{si∈P} wi · d(si, s). It is
not hard to verify that F = s∗ is a 2-approximate shape.
2. For each si ∈ P , compute its projection s′i = arg min_{s∈F} d(si, s). Let P ′ = {s′i :
si ∈ P} be the collection of all projection points. For 1-median, we observe
that s′i = s∗ for all i ∈ [n].
3. For each si ∈ P , compute an upper bound σP (si) of the sensitivity (see [106,
Theorem 7] for details). Compute GP = ∑_{i∈[n]} σP (si) as an upper bound of the
total sensitivity. For 1-median, we let

σP (si) = wi · d(si, s∗) / ∑_{j∈[n]} wj · d(sj , s∗) + 3/n

as an upper bound [106]. By these values, we have GP = ∑_{i∈[n]} σP (si) = 4.
4. Sample points from P with probabilities proportional to σP (si). Formally
speaking, we modify the weight wi to be wi · GP /σP (si) and sample the point si with
probability σP (si)/GP . This technique is called importance sampling; see [45, Section
4.1] for more details. By Lemma 58, we need to take O((GP /ε)^2 dimV C(P ))
samples. For 1-median, the VC-dimension is d + 1. Thus, there exists an ε-coreset
for P of size O(d · ε^{-2}).
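The four steps above can be sketched for 1-median as follows (an illustrative implementation with our own names; the 3/n sensitivity bound and the sample size follow the text, while dividing the reweighted weight by the number of samples m is the standard normalization that keeps the estimator unbiased):

```python
import random

def one_median_coreset(points, weights, eps, rng=None):
    """Sensitivity sampling for 1-median (steps 1-4 of the framework sketch)."""
    rng = rng or random.Random(0)
    n, dim = len(points), len(points[0])
    d = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    cost = lambda c: sum(w * d(s, c) for s, w in zip(points, weights))
    s_star = min(points, key=cost)                  # step 1: 2-approximate center
    total = cost(s_star)                            # step 2: all points project to s*
    sigma = [w * d(s, s_star) / total + 3.0 / n     # step 3: sensitivity upper bounds
             for s, w in zip(points, weights)]
    G = sum(sigma)                                  # total sensitivity bound (= 4)
    m = max(1, int((G / eps) ** 2 * (dim + 1)))     # sample size, as in Lemma 58
    idx = rng.choices(range(n), weights=sigma, k=m)  # step 4: importance sampling
    return [(points[i], weights[i] * G / (sigma[i] * m)) for i in idx]
```

For any center c, the expected coreset cost equals the true cost ∑ wi · d(si, c), since each sampled point si contributes wi · G/(σP(si) · m) · d(si, c) with probability σP(si)/G per draw.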
5.2 Generalized Shape Fitting Problems and Generalized Coresets
We recall the definitions of the two stochastic shape fitting problems in this chapter.
Definition 59. For a set of points P ⊂ Rd, and a k-point set F = {f1, . . . , fk} (fi ∈ Rd,
1 ≤ i ≤ k), we define K(P, F ) = max_{s∈P} min_{1≤i≤k} d(s, fi) as the k-center value
of F w.r.t. P . We use F to denote the family of all k-point sets in Rd. Given a set
P of n stochastic points (in either the existential or locational uncertainty model) in
Rd, and a k-point set F ∈ F , we define the expected k-center value of F w.r.t. P as
K(P , F ) = EP∼P [K(P, F )].
In the stochastic minimum k-center problem, our goal is to find a k-point set F ∈ F
which minimizes K(P , F ). In this dissertation, we assume that both the dimensionality
d and k are fixed constants.
Definition 60. Given a set P of n points in Rd, and a j-flat F ∈ F (0 ≤ j ≤ d−1),
where F is the family of all j-flats in Rd, we define the j-flat-center value of F w.r.t.
P to be J(P, F ) = maxs∈P d(s, F ), where d(s, F ) = minf∈F d(s, f) is the distance
between point s and j-flat F . Given a set P of n stochastic points (in either the
existential or locational model) in Rd, and a j-flat F ∈ F (0 ≤ j ≤ d− 1), we define
the expected j-flat-center value of F w.r.t. P to be
J(P , F ) = EP∼P [J(P, F )].
In the stochastic minimum j-flat-center problem, our goal is to find a j-flat F which
minimizes J(P , F ).
In the following, we define generalized shape fitting problems. In fact, both stochastic
k-center and stochastic j-flat-center can be considered as generalized shape fitting
problems. We also give the definition of generalized coresets and introduce a framework
for constructing generalized coresets.
Generalized Shape Fitting Problems and Generalized Coresets. In this
section, we define generalized shape fitting problems, which are defined over a
collection of (weighted) point sets (recall that traditional shape fitting problems are
defined over a set of (weighted) points). We use Rd to denote the d-dimensional
Euclidean space. Let d(s, s′) denote the Euclidean distance between points s and s′,
and let d(s, F ) = min_{s′∈F} d(s, s′) for any F ⊂ Rd. Let Ud = {P | P ⊂ Rd, |P | is finite}
be the collection of all finite discrete point sets in Rd.
Definition 61. (Generalized shape fitting problems) A generalized shape fitting problem
is specified by a triple (Rd,F , dist). Here the set F of shapes is a family of subsets
of Rd (e.g., all k-point sets, or all j-flats), and dist : Ud × F → R≥0 is a generalized
distance function, defined as dist(P, F ) = max_{s∈P} d(s, F ) for a point set P ∈ Ud and
a shape F ∈ F . 2 An instance S of the generalized shape fitting problem is a (weighted)
collection {S1, . . . , Sm} (Si ∈ Ud) of point sets, and each Si has a positive weight
wi ∈ R+. For any shape F ∈ F , define the total generalized distance from S to F to
be dist(S, F ) = ∑_{Si∈S} wi · dist(Si, F ). Given an instance S, our goal is to find a shape
F ∈ F which minimizes the total generalized distance dist(S, F ).
If we replace Ud with Rd, the above definition reduces to the traditional shape
fitting problem, see Definition 53. Here, we give an example for Definition 61.
Example. Consider a generalized shape fitting problem where F is the collection
of all 2-point sets in R2. In this case, for a point s ∈ R2 and a 2-point set F ∈ F ,
the function d(s, F ) = min_{f∈F} d(s, f) is the Euclidean distance between s and its
2Note that dist may not be a metric in general.
nearest point f ∈ F . For a point set P ∈ U2 and a 2-point set F ∈ F , the function
dist(P, F ) = max_{s∈P} d(s, F ) is the farthest distance from any point s ∈ P to F .
Then we construct an instance S = {S1, S2, S3} (Si ∈ U2) of this generalized shape
fitting problem as follows. Let S1 = {s1 = (0, 0), s2 = (0, 2)}, S2 = {s3 = (6, 0), s4 =
(6, 2)}, and S3 = {s5 = (0, 1)}. Each Si has a positive weight, where w1 = w2 = 1
and w3 = 2. Then our goal is to find a 2-point set F ∈ F which minimizes the
following total generalized distance:

dist(S, F ) = ∑_{Si∈S} wi · dist(Si, F ) = dist(S1, F ) + dist(S2, F ) + 2 dist(S3, F ).
Consider the 2-point set F ∗ = {f1 = (0, 1), f2 = (6, 1)}. We can compute that
dist(S1, F ∗) = max_{s∈S1} d(s, F ∗) = d(s1, F ∗) = d(s1, f1) = 1. In the same way,
we compute that dist(S2, F ∗) = d(s3, f2) = 1 and dist(S3, F ∗) = d(s5, f1) = 0. Thus,
we have that dist(S, F ∗) = 1 + 1 + 0 = 2. In fact, we can prove that F ∗ is the optimal
2-point set, i.e., the one minimizing the total generalized distance dist(S, F ).
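The computation in this example can be checked in a few lines (a sketch using the example's numbers):

```python
def d(a, b):
    # Euclidean distance in the plane
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def gdist(P, F):
    # dist(P, F) = max_{s in P} min_{f in F} d(s, f)
    return max(min(d(s, f) for f in F) for s in P)

# the weighted instance S = {S_1, S_2, S_3} from the example
S = [([(0, 0), (0, 2)], 1), ([(6, 0), (6, 2)], 1), ([(0, 1)], 2)]
F_star = [(0, 1), (6, 1)]
total = sum(w * gdist(P, F_star) for P, w in S)  # = 1 + 1 + 2*0 = 2
```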
Now, we define coresets for generalized shape fitting problems.
Definition 62. (Generalized coreset) Given a (weighted) instance S of a generalized
shape fitting problem (Rd,F , dist) with a weight function w : S → R+, a generalized
ε-coreset of S is a (weighted) collection S′ ⊆ S of point sets, together with a weight
function w′ : S′ → R+, such that for any shape F ∈ F , we have that

∑_{Si∈S′} w′i · dist(Si, F ) ∈ (1± ε) ∑_{Si∈S} wi · dist(Si, F )

(or more compactly, dist(S′, F ) ∈ (1 ± ε) dist(S, F )). We denote the cardinality of the
coreset S′ by |S′|.
Definition 62 also generalizes the prior definition in [106], where each Si ∈ S
contains only one point.
Total sensitivity and dimension. To bound the size of the generalized coresets,
we need the notion of total sensitivity, originally introduced in [77].
Definition 63. (Total sensitivity of a generalized shape fitting instance) Let Ud be
the collection of all finite discrete point sets P ⊂ Rd, and let dist : Ud × F → R≥0
be a continuous function. Given an instance S = {Si | Si ∈ Ud, 1 ≤ i ≤ n} of a
generalized shape fitting problem (Rd,F , dist), with a weight function w : S → R+,
the sensitivity of Si ∈ S is σS(Si) := inf{β ≥ 0 | wi · dist(Si, F ) ≤ β · dist(S, F ), ∀F ∈ F}.
The total sensitivity of S is defined by GS = ∑_{Si∈S} σS(Si).
Note that this definition generalizes the one in [77]. In fact, if each Si ∈ S contains
only one point, this definition is equivalent to Definition 55.
We also generalize the definition of dimension in [45] (it is in fact the
primal shattering dimension (see e.g., [45, 58]) of a certain range space; it plays a
role similar to that of the VC-dimension).
Definition 64. (Generalized dimension) Let S = {Si | Si ∈ Ud, 1 ≤ i ≤ n} be
an instance of a generalized shape fitting problem (Rd,F , dist). Suppose wi is the
weight of Si. We consider the range space (S,R), where R is a family of subsets
RF,r of S defined as follows: given an F ∈ F and r ≥ 0, let RF,r = {Si ∈ S |
wi · dist(Si, F ) ≥ r} consist of the sets Si whose weighted distance to the shape
F is at least r. Finally, we define the generalized dimension of the instance S, denoted
dim(S), to be the smallest integer m such that for any weight function w and A ⊆ S
of size |A| = a ≥ 2, we have |{A ∩RF,r | F ∈ F , r ≥ 0}| ≤ a^m.
The definition in [77] is a special case of the above definition when each Si ∈ S
contains only one point. On the other hand, the above definition is a special case of
Definition 7.2 in [45], if we think of each wi · dist(Si, ·) = gi(·) as a function from F to R≥0.
We first have the following lemma for bounding the size of generalized coresets by
the generalized total sensitivity and dimension.
Lemma 65. Given any instance S = {Si | Si ∈ Ud, 1 ≤ i ≤ n} of a generalized shape
fitting problem (Rd,F , dist), any weight function w : S → R+, and any ε ∈ (0, 1],
there exists a generalized ε-coreset for S of cardinality O((GS/ε)^2 dim(S) log dim(S)).
Lemma 65 is a direct corollary of the following theorem (a restatement of
Theorem 4.1 and its proof in [45]).
Theorem 66. Let D = {gi | 1 ≤ i ≤ n} be a set of n functions, where each g ∈ D
is a function g : X → R≥0 from a ground set X to [0,+∞). Let 0 < ε < 1/4 be a
constant. Let q : D → R+ be a function on D such that

q(g) ≥ max_{x∈X} g(x) / ∑_{g′∈D} g′(x). (5.1)

Then there exists a collection S ⊆ D of functions, together with a weight function
w′ : S → R+, such that for every x ∈ X,

|∑_{g∈D} g(x) − ∑_{g∈S} w′(g) · g(x)| ≤ ε ∑_{g∈D} g(x).

Moreover, the size of S is

O(((∑_{g∈D} q(g))/ε)^2 dim(D) · log(dim(D))),

where dim(D) is the generalized shattering dimension of D (see Definition 7.2 in
[45]). 3
Now we are ready to prove Lemma 65.
Proof. Suppose that we are given a (weighted) instance S = {Si | Si ∈ Ud, 1 ≤ i ≤ n}
of a generalized shape fitting problem (Rd,F , dist), with a weight function w : S → R+.
A generalized ε-coreset is a collection S′ ⊆ S of point sets, together with a weight
function w′ : S′ → R+ such that, for any shape F ∈ F , we have

∑_{Si∈S′} w′i · dist(Si, F ) ∈ (1± ε) ∑_{Si∈S} wi · dist(Si, F ). (5.2)
3Note that there is an additional log(dim(D)) term compared to Lemma 58. This is because we consider the shattering dimension instead of the VC-dimension; by Lemma 57, this additional term is needed.
For every Si ∈ S and F ∈ F , let gi(F ) = wi · dist(Si, F ) and D = {gi | Si ∈ S}.
Define

q(gi) = σS(Si) + 1/n = inf{β ≥ 0 | wi · dist(Si, F ) ≤ β · ∑_{Si∈S} wi · dist(Si, F ), ∀F ∈ F} + 1/n.

It is not hard to verify that this definition satisfies Inequality (5.1). The additional
1/n term will be useful in Section 5.3, where we need a lower bound on q(gi). Thus,
we have GS + 1 = ∑_{Si∈S}(σS(Si) + 1/n) = ∑_{gi∈D} q(gi). Recall that dim(S) is the
generalized shattering dimension of S. By Theorem 66, we conclude that there exists
a collection S′ ⊆ S of cardinality O((GS/ε)^2 dim(S) · log(dim(S))) with a weight function
w′ : S′ → R+ satisfying Inequality (5.2).
5.3 Stochastic Minimum k-Center
In this section, we consider the stochastic minimum k-center problem in Rd. Let F
be the family of all k-point sets of Rd, and let P be the set of stochastic points. Our
main technique is to construct an SKC-Coreset S of constant size: for any k-point
set F ∈ F , K(S, F ) should be a (1 ± ε)-estimate of K(P , F ) = EP∼P [K(P, F )].
Recall that K(P, F ) = max_{s∈P} min_{f∈F} d(s, f) is the k-center value between two point
sets P and F . Constructing S includes two main steps: 1) Partition all realizations
via additive ε-coresets, which reduces an exponential number of realizations to a
polynomial number of point sets. 2) Show that there exists a generalized coreset of
constant cardinality for the generalized k-median problem defined over the above set
of polynomially many point sets. Finally, we enumerate polynomially many possible
collections Si (together with their weights). We show that there is an SKC-Coreset
S among those candidates. By solving a polynomial system for each Si and taking
the minimum solution, we can obtain a PTAS.
We first need the formal definition of an additive ε-coreset [11], as follows.
Definition 67. (Additive ε-coreset) Let B(f, r) denote the ball of radius r centered
at point f . For a set of points P ∈ Ud, we call Q ⊆ P an additive ε-coreset of P if
for every k-point set F = {f1, . . . , fk}, we have

P ⊆ ∪_{i=1}^{k} B(fi, (1 + ε)K(Q,F )),

i.e., the union of all balls B(fi, (1 + ε)K(Q,F )) (1 ≤ i ≤ k) covers P . 4
5.3.1 Existential uncertainty model
We first consider the existential uncertainty model.
Step 1: Partitioning realizations
We first provide an algorithm A, which can construct an additive ε-coreset for
any deterministic point set. We can think of A as a mapping from all realizations of P
to all possible additive ε-coresets. The mapping naturally induces a partition of all
realizations. Note that we do not run A on every realization.
Algorithm A for constructing additive ε-coresets. Given a realization P ∼ P ,
we first compute an approximation rP of the optimal k-center value min_{F∈F} K(P, F ).
Then we build a Cartesian grid G(P ) whose side length depends on rP . Let C(P ) =
{C | C ∈ G(P ), C ∩ P ≠ ∅} be the collection of the nonempty cells (i.e., cells that
contain at least one point of P ). In each nonempty cell C ∈ C(P ), we keep the
point sC ∈ C ∩ P of smallest index. Let E(P ) = {sC | C ∈ C(P )}, which is an additive
ε-coreset of P . Finally, the output of A(P ) is E(P ), G(P ), and C(P ). The details can
be found in Section 5.5.
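A minimal sketch of the grid step follows (our own code; the exact side length and the computation of rP are deferred to Section 5.5 in the text, so the side length `eps * r` used here is an assumed placeholder):

```python
def additive_coreset_sketch(points, r, eps):
    """Snap each point of a realization (given in index order) to a grid cell
    of side length eps*r (assumed constant), keeping the smallest-index point
    per nonempty cell -- the sets E(P) and C(P) of algorithm A, in the plane."""
    side = eps * r
    cells = {}                           # cell key -> (index, point) kept
    for idx, (x, y) in enumerate(points):
        key = (int(x // side), int(y // side))
        if key not in cells:             # the first point seen has smallest index
            cells[key] = (idx, (x, y))
    E = [pt for _, pt in sorted(cells.values())]
    return E, set(cells)
```

The key property exploited later is that E(P) is determined purely by which cells are nonempty and by the smallest-index point in each nonempty cell.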
Note that we do not use the construction of additive ε-coresets in [11], because it is
not easy to recover the set of original realizations from a given additive ε-coreset.
We need the set of additive ε-coresets to have some extra properties (in particular,
Lemma 70 below), which allow us to compute certain probability values efficiently.
We first have the following lemma.
Lemma 68. The running time of A on any n-point set P is O(kn^{k+1}). Moreover,
4Our definition is slightly weaker than that in [11]. The weaker definition suffices for our purpose.
the output E(P ) is an additive ε-coreset of P of size at most O(k/ε^d).
Let E(P) = {E(P ) | P ∼ P} be the collection of all possible additive ε-coresets.
By Lemma 68, we know that each S ∈ E(P) is of size at most O(k/ε^d). Thus,
the cardinality of E(P) is at most n^{O(k/ε^d)}. For a point set S, denote by
PrP∼P [E(P ) = S] = ∑_{P :P∼P, E(P )=S} Pr[P ] the probability that the additive
ε-coreset of a realization is S. The following simple lemma states that we have a
polynomial-size representation of the objective function K(P , F ).
Lemma 69. Given a set P of n points in Rd in the existential uncertainty model, for any
k-point set F ∈ F , we have that

∑_{S∈E(P)} PrP∼P [E(P ) = S] · K(S, F ) ∈ (1± ε)K(P , F ).
Proof. By the definition of PrP∼P [E(P ) = S], we can see that for any k-point set
F ∈ F ,

∑_{S∈E(P)} PrP∼P [E(P ) = S] · K(S, F ) = ∑_{S∈E(P)} ∑_{P :P∼P, E(P )=S} Pr[P ] · K(S, F )
∈ (1± ε) ∑_{S∈E(P)} ∑_{P :P∼P, E(P )=S} Pr[P ] · K(P, F ) = (1± ε)K(P , F ).

The inequality above uses the definition of additive ε-coresets (Definition 67).
We can think of P → E(P ) as a mapping, which maps a realization P ∼ P to
its additive ε-coreset E(P ). The mapping partitions all realizations P ∼ P into a
polynomial number of classes, one for each additive ε-coreset. For each possible
additive ε-coreset S ∈ E(P), we denote by E−1(S) = {P ∼ P | E(P ) = S} the
collection of all realizations mapping to S. By the definition of E(P), we have that
∪_{S∈E(P)} E−1(S) = P .
Now, we need an efficient algorithm to compute PrP∼P [E(P ) = S] for each additive
ε-coreset S ∈ E(P). The following lemma states that the mapping constructed by
algorithm A has some nice properties that allow us to compute the probabilities.
This is also the reason why we cannot directly use the original additive ε-coreset
construction algorithm in [11]. The proof is somewhat subtle and can be found in
Section 5.5.
Lemma 70. Consider a subset S of at most O(k/ε^d) points. Run algorithm A(S),
which outputs an additive ε-coreset E(S), a Cartesian grid G(S), and a collection
C(S) of nonempty cells. If E(S) ≠ S, then S ∉ E(P) (i.e., S is not the output of A
for any realization P ∼ P). 5 If |S| ≤ k, then E−1(S) = {S}. Otherwise, if E(S) = S
and |S| ≥ k + 1, then a point set P ∼ P satisfies E(P ) = S if and only if
P1. For any cell C ∉ C(S), C ∩ P = ∅.
P2. For any cell C ∈ C(S), let sC be the unique point in C ∩ S. Then sC ∈ P , and any
point of P in C with a smaller index than that of sC does not appear in the
realization P .
Thanks to Lemma 70, we are now ready to show how to compute PrP∼P [E(P ) = S]
efficiently for each S ∈ E(P). We enumerate every point set of size at most O(k/ε^d). For a
set S, we first run A(S), which outputs a Cartesian grid G(S) and a point set E(S). We
check whether S ∈ E(P) by checking whether E(S) = S or |S| ≤ k. If S ∈ E(P), we
can compute PrP∼P [E(P ) = S] using the Cartesian grid G(S). See Algorithm 1 for
details. We also give an example explaining Algorithm 1; see Figure 5-1.
The following lemma, asserting the correctness of Algorithm 1, is a simple
consequence of Lemma 70.
Lemma 71. For any point set S, Algorithm 1 computes exactly the total probability

PrP∼P [E(P ) = S] = ∑_{P :P∼P, E(P )=S} Pr[P ]

in n^{O(k/ε^d)} time.
Proof. Run A(S), and we obtain a point set E(S). If E(S) ≠ S, we have that S ∉ E(P)
by Lemma 70, and thus PrP∼P [E(P ) = S] = 0. If |S| ≤ k, we have that E−1(S) = {S}
by Lemma 70, and thus PrP∼P [E(P ) = S] = Pr[S].
5It is possible that some point set S satisfies Definition 67 for some realization P , but is not the output of A(S).
Algorithm 1 Computing PrP∼P [E(P ) = S]
1. For each point set S ∼ P of size |S| = O(k/ε^d), run algorithm A(S). Assume that the output is a point set E(S), a Cartesian grid G(S), and a cell collection C(S) = {C | C ∈ G(S), C ∩ S ≠ ∅}.
2. If E(S) ≠ S, output PrP∼P [E(P ) = S] = 0. If |S| ≤ k, output PrP∼P [E(P ) = S] = Pr[S].
3. For a cell C, suppose C ∩ P = {t1, . . . , tm}, where w.l.o.g. t1, . . . , tm are in increasing order of their indices. For C ∉ C(S), let

Q(C) = PrP∼P [P ∩ C = ∅] = ∏_{i=1}^{m} (1− pi)

be the probability that no point in C is realized. If C ∈ C(S), assume that tj ∈ C ∩ S, and let

Q(C) = PrP∼P [tj ∈ P and {t1, . . . , tj−1} ∩ P = ∅] = pj · ∏_{i=1}^{j−1} (1− pi)

be the probability that tj appears but t1, . . . , tj−1 do not.
4. Output PrP∼P [E(P ) = S] = ∏_{C∈G(S)} Q(C).
Otherwise, if E(S) = S and |S| ≥ k + 1, then by Lemma 70, each realization P ∈ E−1(S)
satisfies P1 and P2. Combining the definition of Q(C) with the independence of
all cells, we can see that ∏_{C∈G(S)} Q(C) is equal to ∑_{P∈E−1(S)} Pr[P ] = PrP∼P [E(P ) =
S].
For the running time, note that we only need to consider at most n^{O(k/ε^d)} point
sets S ∼ P . For each S, Algorithm 1 needs to run A(S), which costs O(kn^{k+1})
time by Lemma 68. Steps 2 and 3 only cost linear time. Thus, we can compute all
probabilities PrP∼P [E(P ) = S] in n^{O(k/ε^d)} time.
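Steps 3 and 4 of Algorithm 1 can be sketched as follows (illustrative code with our own data layout: `cells` maps a cell id to its list of (point, probability) pairs in increasing index order, and `chosen` maps each cell of C(S) to its point of S):

```python
from math import prod

def realization_prob(cells, chosen):
    """Product of Q(C) over all cells, i.e. steps 3-4 of Algorithm 1."""
    total = 1.0
    for cell, pts in cells.items():
        if cell not in chosen:
            q = prod(1 - p for _, p in pts)   # no point of the cell is realized
        else:
            q = 1.0
            for t, p in pts:
                if t == chosen[cell]:
                    q *= p                    # s_C appears ...
                    break
                q *= 1 - p                    # ... smaller-index points do not
        total *= q
    return total
```

With a cell layout consistent with the Q-values of Figure 5-1 and all probabilities set to 0.5, this returns p3·p5·p7·(1−p1)(1−p2)(1−p4)(1−p10)(1−p11) = 0.5^8.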
Step 2: Existence of a generalized coreset via generalized total sensitivity
Recall that E(P) is a collection of polynomially many point sets of size O(k/ε^d).
By Lemma 69, we can focus on a generalized k-median problem: finding a k-point set
F ∈ F which minimizes K(E(P), F ) = ∑_{S∈E(P)} PrP∼P [E(P ) = S] · K(S, F ). In fact,
the generalized k-median problem is a special case of the generalized shape fitting
problem we defined in Definition 61. Here, we instantiate the shape family F as
the collection of all k-point sets. Note that the k-center objective K(E(P), F ) is
Figure 5-1: An example for Algorithm 1 when k = 2. In this figure, P = {s_1, ..., s_11} consists of all points, and S = {s_3, s_5, s_7} consists of the black points. By Lemma 70, we have Pr_{P∼P}[E(P) = S] = p_3 p_5 p_7 (1−p_1)(1−p_2)(1−p_4)(1−p_10)(1−p_11). Now we run Algorithm 1 on S. In Step 1, we first construct a Cartesian grid G(S) with cells C_1, C_2, C_3, C_4 as in the figure, and construct a cell collection C(S) = {C_1, C_2, C_3} since C_4 ∩ S = ∅. Note that E(S) = S (by Lemma 70) and |S| = 3 > k. We directly go to Step 3 and compute the value Q(C_i) for each cell C_i. For cell C_1, the two rectangle points s_1 and s_2 have smaller indices than s_3 ∈ S, so we compute Q(C_1) = p_3(1−p_1)(1−p_2). Similarly, we compute Q(C_2) = p_5(1−p_4), Q(C_3) = p_7, and Q(C_4) = (1−p_10)(1−p_11). Finally, in Step 4, we output Pr_{P∼P}[E(P) = S] = ∏_{C∈G(S)} Q(C) = p_3 p_5 p_7 (1−p_1)(1−p_2)(1−p_4)(1−p_10)(1−p_11).
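The probability computation of Steps 3 and 4 is easy to implement directly. Below is a minimal Python sketch run on the hypothetical data of Figure 5-1, with every existence probability p_i set to 0.1 purely for illustration:

```python
from math import prod

def Q(cell, S, p):
    """Step 3: the factor Q(C) of one grid cell.

    cell: indices of the points in C, in increasing order;
    S: set of indices of the coreset points;
    p: maps a point index to its existence probability."""
    hits = [i for i in cell if i in S]
    if not hits:
        # C is not in C(S): no point of C may be realized.
        return prod(1 - p[i] for i in cell)
    j = hits[0]
    # t_j must appear, and every point of C with a smaller index must not.
    return p[j] * prod(1 - p[i] for i in cell if i < j)

def realization_prob(grid, S, p):
    """Step 4: Pr[E(P) = S] as the product of Q(C) over all cells."""
    return prod(Q(cell, S, p) for cell in grid)

# Figure 5-1: cells C1..C4 and S = {s3, s5, s7}; p_i = 0.1 is hypothetical.
p = {i: 0.1 for i in range(1, 12)}
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11]]
print(realization_prob(grid, {3, 5, 7}, p))  # p3*p5*p7*(1-p1)(1-p2)(1-p4)(1-p10)(1-p11)
```

The output matches the closed form given in the caption.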
indeed a generalized distance function in the sense of Definition 61. To make things concrete, we formalize it below. Recall that Ud is the collection of all finite discrete point sets in R^d.
Definition 72. A generalized k-median problem is specified by a triple (R^d, F, K). Here F is the family of all k-point sets in R^d, and K: Ud × F → R≥0 is a generalized distance function defined as follows: for a point set P ∈ Ud and a k-point set F ∈ F, K(P, F) = max_{s∈P} d(s, F) = max_{s∈P} min_{f∈F} d(s, f). An instance S of the generalized k-median problem is a (weighted) collection {S_1, ..., S_m} (S_i ∈ Ud) of point sets, where each S_i has a positive weight w_i ∈ R+. For any k-point set F ∈ F, the total generalized distance from S to F is K(S, F) = ∑_{S_i∈S} w_i · K(S_i, F). The goal of the generalized k-median problem (GKM) is to find a k-point set F which minimizes the total generalized distance K(S, F).
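For concreteness, the two distance functions of Definition 72 can be written out directly. The following is a small illustrative sketch (the point sets, weights, and centers are arbitrary examples, not part of the construction):

```python
import math

def K(P, F):
    """Generalized distance of Definition 72: the k-center value of the
    point set P with respect to the k-point set F (max over points of
    the distance to the nearest center)."""
    return max(min(math.dist(s, f) for f in F) for s in P)

def total_K(instance, weights, F):
    """Total generalized distance K(S, F) = sum_i w_i * K(S_i, F)."""
    return sum(w * K(S_i, F) for S_i, w in zip(instance, weights))

# Two point sets in R^2 with unit weights and centers F = {(0,0), (10,0)}.
S1 = [(0.0, 1.0), (1.0, 0.0)]
S2 = [(10.0, 2.0), (9.0, 0.0)]
F = [(0.0, 0.0), (10.0, 0.0)]
print(total_K([S1, S2], [1.0, 1.0], F))  # K(S1,F)=1 and K(S2,F)=2, so 3.0
```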
Recall that a generalized ε-coreset is a sub-collection S′ ⊆ S of point sets, together with a weight function w′: S′ → R+, such that for any k-point set F ∈ F, we have ∑_{S∈S′} w′(S) · K(S, F) ∈ (1 ± ε) ∑_{S∈S} w(S) · K(S, F) (equivalently, K(S′, F) ∈ (1 ± ε)K(S, F)). This generalized coreset will serve as the SKC-Coreset for the original stochastic k-center problem.
Our main lemma asserts that a generalized coreset of constant size exists, as follows.
Lemma 73 (main lemma). Given an instance P of n stochastic points in R^d, let E(P) be the collection of all additive ε-coresets. There exists a generalized ε-coreset S′ ⊆ E(P) of cardinality |S′| = O(ε^{−(d+2)}dk^4), together with a weight function w′: S′ → R+, which satisfies that for any k-point set F ∈ F,

∑_{S∈S′} w′(S) · K(S, F) ∈ (1 ± ε) ∑_{S∈E(P)} Pr_{P∼P}[E(P) = S] · K(S, F).
We now prove Lemma 73 by showing a constant upper bound on the cardinality of a generalized ε-coreset. This is done by applying Lemma 65 and providing constant upper bounds for both the total sensitivity and the generalized dimension of the generalized k-median instance.

Given an instance S = {S_i | S_i ∈ Ud, 1 ≤ i ≤ n} of a generalized k-median problem with a weight function w: S → R+, we denote by F* the k-point set which minimizes the total generalized distance K(S, F) = ∑_{S∈S} w(S) · K(S, F) over all F ∈ F. W.l.o.g., we assume that K(S, F*) > 0; if K(S, F*) = 0, there are at most k distinct points in the instance.
We first construct a projection instance P* of a weighted k-median problem for S, and relate the total sensitivity G_S to G_{P*}. Recall that G_S = ∑_{S∈S} σ_S(S) is the total sensitivity of S. Our construction of P* is as follows. For each point set S_i ∈ S, let F*_i ∈ F be the k-point set satisfying F*_i = argmax_{F∈F} w(S_i) · K(S_i, F) / K(S, F), i.e., the sensitivity σ_S(S_i) of S_i is equal to w(S_i) · K(S_i, F*_i) / K(S, F*_i). Let s*_i ∈ S_i denote the point
Figure 5-2: In the figure, S_i is the black point set, F* is the white point set, and F*_i is the dashed point set. Here, s*_i ∈ S_i is the point farthest from F*_i, satisfying d(s*_i, F*_i) = K(S_i, F*_i), and f*_i ∈ F* is the point closest to s*_i, satisfying d(s*_i, f*_i) = d(s*_i, F*).
farthest from F*_i (breaking ties arbitrarily). Let f*_i ∈ F* denote the point closest to s*_i (breaking ties arbitrarily). Denote by P* the multi-set {f*_i | S_i ∈ S}, and define the weight function w′: P* → R+ by w′(f*_i) = w(S_i) for each i ∈ [n]. Thus, P* is a weighted k-median instance in R^d with a weight function w′. See Figure 5-2 for an example of the construction of P*.
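Given the maximizing k-point sets F*_i (computing them is the hard part and is not shown), the projection itself is just a pair of nested extrema. A small sketch with hypothetical inputs:

```python
import math

def project_one(S_i, F_i_star, F_star):
    """Map the point set S_i to its projection point f*_i in P*: take s*_i,
    the point of S_i farthest from F*_i, then return the point of F*
    closest to s*_i (max/min break ties arbitrarily)."""
    s_star = max(S_i, key=lambda s: min(math.dist(s, f) for f in F_i_star))
    return min(F_star, key=lambda f: math.dist(s_star, f))

# Toy example on the line; F*_i and F* are assumed inputs here.
S_i = [(0.0,), (4.0,)]
F_i_star = [(0.0,)]           # the point of S_i farthest from F*_i is (4,)
F_star = [(3.0,), (-5.0,)]    # the point of F* closest to (4,) is (3,)
print(project_one(S_i, F_i_star, F_star))  # (3.0,)
```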
Lemma 74. Given an instance S = {S_i | S_i ∈ Ud, 1 ≤ i ≤ n} of a generalized k-median problem in R^d with a weight function w: S → R+, let P* be its projection instance. Then, we have G_S ≤ 2G_{P*} + 1.
Proof. First note the following fact. Given i, j ∈ [n], recall that s*_j ∈ S_j is the point farthest from F*_j, and f*_j ∈ F* is the point closest to s*_j. Let f ∈ F*_i be the point closest to s*_j. Then

K(S_j, F*_i) + K(S_j, F*) ≥ d(s*_j, F*_i) + d(s*_j, F*) = d(s*_j, F*_i) + d(s*_j, f*_j)
= d(s*_j, f) + d(s*_j, f*_j) ≥ d(f*_j, f) ≥ d(f*_j, F*_i).   (5.3)

The first inequality follows from the definitions of K(S_j, F*_i) and K(S_j, F*). The first equality follows from the definition of f*_j, and the second equality from the definition of f. The second inequality follows from the triangle inequality, and the last inequality from the definition of d(f*_j, F*_i).
Then we have the following fact:

∑_{f∈P*} w′(f) · d(f, F*_i) = ∑_{f*_j∈P*} w′(f*_j) · d(f*_j, F*_i) ≤ ∑_{S_j∈S} w(S_j) · (K(S_j, F*) + K(S_j, F*_i))
= K(S, F*) + K(S, F*_i) ≤ 2K(S, F*_i),   (5.4)
by Inequality (5.3) and since K(S, F*) ≤ K(S, F*_i).
Let f′ ∈ F*_i be the point closest to f*_i. We also note the following fact:

K(S_i, F*) + d(f*_i, F*_i) ≥ d(s*_i, f*_i) + d(f*_i, F*_i) = d(s*_i, f*_i) + d(f*_i, f′)
≥ d(s*_i, f′) ≥ d(s*_i, F*_i) = K(S_i, F*_i).   (5.5)

The first inequality follows from the definition of f*_i, the second inequality follows from the triangle inequality, and the last inequality follows from the definition of d(s*_i, F*_i).
Now we are ready to analyze σ_S(S_i) for each S_i ∈ S. We can see that

w(S_i) · K(S_i, F*_i)
≤ w(S_i) · K(S_i, F*) + w(S_i) · d(f*_i, F*_i)   [by (5.5)]
≤ w(S_i) · K(S_i, F*) + σ_{P*}(f*_i) · (∑_{f∈P*} w′(f) · d(f, F*_i))   [by the definition of σ_{P*}]
≤ w(S_i) · K(S_i, F*) + 2σ_{P*}(f*_i) · K(S, F*_i)   [by (5.4)]
= (w(S_i) · K(S_i, F*) / K(S, F*_i)) · K(S, F*_i) + 2σ_{P*}(f*_i) · K(S, F*_i)
≤ (w(S_i) · K(S_i, F*) / K(S, F*) + 2σ_{P*}(f*_i)) · K(S, F*_i).   [by K(S, F*_i) ≥ K(S, F*)]

Finally, we bound the total sensitivity as follows:

G_S = ∑_{S_i∈S} σ_S(S_i) ≤ ∑_{S_i∈S} (w(S_i) · K(S_i, F*) / K(S, F*) + 2σ_{P*}(f*_i)) = 1 + 2G_{P*}.
This finishes the proof of the lemma.
Since P* is an instance of a weighted k-median problem, we know that the total sensitivity G_{P*} is at most 2k + 1, by [77, Theorem 9].^6 Combining this with Lemma 74, we have the following lemma, which bounds the total sensitivity G_S.
Lemma 75. Consider an instance S of a generalized k-median problem (Rd,F ,K).
The total sensitivity GS is at most 4k + 3.
6. Theorem 9 in [77] bounds the total sensitivity for the unweighted version. However, the proof can be extended to the weighted version in a straightforward way.
Now the remaining task is to bound the generalized dimension dim(S). Consider the range space (S, R), where R is the family of subsets R_{F,r} of S defined as follows: given F ∈ F and r ≥ 0, let R_{F,r} = {S_i ∈ S | w_i · K(S_i, F) ≥ r}. Here w_i is the weight of S_i ∈ S. We have the following lemma.
Lemma 76. Consider an instance S of a generalized k-median problem in Rd. If
each point set S ∈ S is of size at most L, then the generalized dimension dim(S) is
O(dkL).
Proof. Consider a mapping g: S → R^{dL} constructed as follows: suppose S_i = {x^1 = (x^1_1, ..., x^1_d), ..., x^L = (x^L_1, ..., x^L_d)} (if |S_i| < L, we pad it with copies of x^1 = (x^1_1, ..., x^1_d)). We let

g(S_i) = (x^1_1, ..., x^1_d, ..., x^L_1, ..., x^L_d) ∈ R^{dL}.
For any r ≥ 0 and any k-point set F ∈ F, we observe that w_i · K(S_i, F) ≥ r holds if and only if there exists some 1 ≤ j ≤ L satisfying w_i · d(x^j, F) ≥ r, which is equivalent to saying that the point g(S_i) is in the union of the following L sets: {(x^1_1, ..., x^1_d, ..., x^L_1, ..., x^L_d) | d(x^j, F) ≥ r/w_i} (j ∈ [L]).
Let X be the image set of g. Let (X, R^j) (1 ≤ j ≤ L) be L range spaces, where each R^j consists of all subsets R^j_{F,r} = {(x^1_1, ..., x^1_d, ..., x^L_1, ..., x^L_d) ∈ X | d(x^j, F) ≥ r} for all F ∈ F and r ≥ 0. Note that each (X, R^j) has shattering dimension O(dk) by [45]. Let R′ = {∪_{j∈[L]} R_j | R_j ∈ R^j}. Using the standard result for bounding the shattering dimension of the union of set systems (e.g., [58, Theorem 5.22]), we can see that the shattering dimension of (X, R′) (which is the generalized dimension of S) is bounded by O(dkL).
Note that an additive ε-coreset is of size at most O(k/ε^d). Then, combining Lemmas 65, 75 and 76, we directly obtain Lemma 73. Combining Lemmas 69 and 73, we have the following theorem.
Theorem 77. Given an instance P of n points in R^d in the existential uncertainty model, there exists an SKC-Coreset S′ of O(ε^{−(d+2)}d^2k^4)^7 point sets with a weight function w′: S′ → R+, which satisfies:
7. Here, we hide a (log k + log(1/ε)) factor in the O(·) notation.
1. For each point set S ∈ S′, we have S ⊆ P and |S| = O(k/ε^d).
2. For any k-point set F ∈ F, we have ∑_{S∈S′} w′(S) · K(S, F) ∈ (1 ± ε)K(P, F).
PTAS for stochastic minimum k-center. It remains to give a PTAS for the stochastic minimum k-center problem. For an instance E(P) of a generalized k-median problem, if we could compute the sensitivity σ_{E(P)}(S) efficiently for each point set S ∈ E(P), then we could construct an SKC-Coreset by importance sampling (the details of the sampling technique are the same as those described in [45, Section 4.1]). However, it is unclear how to compute the sensitivity σ_{E(P)}(S) efficiently. Instead, we enumerate all weighted sub-collections S^i ⊆ E(P) of cardinality at most O(ε^{−(d+2)}d^2k^4). We claim that we only need to enumerate n^{O(ε^{−(2d+2)}d^2k^5)} (polynomially many) sub-collections S^i together with their weight functions, such that at least one of them is a generalized ε-coreset of E(P).^8 We give the details later.
In the next step, for each weighted sub-collection S′ ⊆ E(P) with a weight function w′: S′ → R+, we briefly sketch how to compute the optimal k-point set F such that K(S′, F) is minimized. We cast the optimization problem as a polynomial system of constant size.
Denote by F = {(y^1, ..., y^k) | y^i ∈ R^d, 1 ≤ i ≤ k} the space of ordered k-point sets (we regard (y^1, y^2, ..., y^k) ∈ F and (y^2, y^1, ..., y^k) ∈ F as two different k-point sets if y^1 ≠ y^2). We first divide the space F into pieces, as follows. Let L = O(k/ε^d), let L = (l_1, ..., l_L) (1 ≤ l_j ≤ k for all j ∈ [L]) be a sequence of integers, and let b ∈ [L] be an index. Consider a point set S = {x^1 = (x^1_1, ..., x^1_d), ..., x^L = (x^L_1, ..., x^L_d)} ∈ S′ and a k-point set F = {y^1 = (y^1_1, ..., y^1_d), ..., y^k = (y^k_1, ..., y^k_d)} ∈ F. We give the following definition.
Definition 78. The k-center value K(S, F) is decided by L and b if the following two properties hold.

1. For any i ∈ [L] and any j ∈ [k], d(x^i, y^{l_i}) ≤ d(x^i, y^j), i.e., the point of F closest to x^i is y^{l_i}.
8. We remark that even though we enumerate the weight function, computing Pr_{P∼P}[E(P) = S] is still important for our algorithm. See Lemma 81 for the details of the enumeration algorithm.
2. For any i ∈ [L], d(x^i, y^{l_i}) ≤ d(x^b, y^{l_b}), i.e., the k-center value K(S, F) = d(x^b, y^{l_b}).
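For a concrete S and F, the pair (L, b) of Definition 78 can be read off directly. A minimal sketch (using 0-based indices instead of the 1-based indices of the text; the point data is hypothetical):

```python
import math

def decide_L_b(S, F):
    """Return (L, b) for Definition 78: l_i is the index of the center of F
    nearest to x^i, and b indexes the point of S attaining the k-center
    value K(S, F) = d(x^b, y^{l_b})."""
    L = [min(range(len(F)), key=lambda j: math.dist(x, F[j])) for x in S]
    b = max(range(len(S)), key=lambda i: math.dist(S[i], F[L[i]]))
    return L, b

S = [(0.0, 0.0), (1.0, 0.0), (9.0, 3.0)]
F = [(0.0, 0.0), (9.0, 0.0)]
print(decide_L_b(S, F))  # ([0, 0, 1], 2): K(S, F) = d((9,3), (9,0)) = 3
```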
For each point set S_i ∈ S′, we enumerate an integer sequence L_i and an index b_i. Given a collection {(L_i, b_i)}_i (the index i ranges over all S_i in S′), we construct a piece F_{{(L_i,b_i)}_i} ⊆ F as follows: for any point set S_i ∈ S′ and any k-point set F ∈ F_{{(L_i,b_i)}_i}, the k-center value K(S_i, F) is decided by L_i and b_i. According to Definition 78, F_{{(L_i,b_i)}_i} is defined by a polynomial system.
Then, we solve our optimization problem in each piece F_{{(L_i,b_i)}_i}. By Definition 78, for any point set S_i ∈ S′ and any k-point set F ∈ F_{{(L_i,b_i)}_i}, the k-center value K(S_i, F) = d(x^{b_i}, y^{L_i(b_i)}) (x^{b_i} ∈ S_i, y^{L_i(b_i)} ∈ F), where the index L_i(b_i) is the b_i-th entry of L_i. Hence, our problem can be formulated as the following optimization problem:

min_F ∑_{S_i∈S′} w′(S_i) · g_i,  s.t.  g_i^2 = ‖x^{b_i} − y^{L_i(b_i)}‖^2, g_i ≥ 0 for each S_i ∈ S′; y^{L_i(b_i)} ∈ F; F ∈ F_{{(L_i,b_i)}_i}.
By Definition 78, there are at most O(kL|S′|) constraints, which is a constant. Thus, the polynomial system has dk variables and O(kL|S′|) constraints, and hence can be solved in constant time. Note that there are at most k^{O(L|S′|)} different pieces F_{{(L_i,b_i)}_i} ⊆ F, which is again a constant. Thus, we can compute the optimal k-point set for the weighted sub-collection S′ in constant time.
Now we return to the stochastic minimum k-center problem. Recall that we first enumerate all possible weighted sub-collections S^i ⊆ E(P) of cardinality at most O(ε^{−(d+2)}d^2k^4). Then we compute the optimal k-point set F^i for each weighted sub-collection S^i as above, and compute the expected k-center value K(P, F^i).^9 Let F* ∈ F be the k-point set which minimizes the expected k-center value K(P, F^i) over all F^i. By Lemma 81, there is one sub-collection S^i with a weight function w′ satisfying K(S^i, F^i) ≤ (1 + ε) min_{F∈F} K(P, F). Thus, we conclude that F* is a (1 + ε)-approximation for the stochastic minimum k-center problem. For the running
9. It is not hard to compute K(P, F^i) in O(n log n) time by sorting all points in P in non-increasing order according to their distances to F^i.
time, we enumerate at most n^{O(ε^{−(2d+2)}d^2k^5)} weighted sub-collections. Moreover, computing the optimal k-point set for each sub-collection costs constant time. Hence, the total running time is at most n^{O(ε^{−(2d+2)}d^2k^5)}. Thus, we have the following corollary.
Corollary 79. If both k and d are constants, then given an instance P of n stochastic points in R^d in the existential uncertainty model, there exists a PTAS for the stochastic minimum k-center problem that runs in n^{O(ε^{−(2d+2)}d^2k^5)} time.
Enumerating possible generalized ε-coresets. Given an instance S = {S_i | S_i ∈ Ud, 1 ≤ i ≤ N} of a generalized k-median problem in R^d with a weight function w: S → R+, we now show how to enumerate polynomially many sub-collections S^i ⊆ S together with their weight functions, such that at least one of them is a generalized ε-coreset of S. Recall that σ_S(S_i) is the sensitivity of S_i, and G_S = ∑_{i∈[N]} σ_S(S_i) is the total sensitivity. Also recall that dim(S) is the generalized dimension of S. Define q(S_i) = σ_S(S_i) + 1/N for 1 ≤ i ≤ N, and define q_S = ∑_{1≤i≤N} q(S_i). Note that q_S = G_S + 1 ≤ 4k + 4 by Lemma 75. Our algorithm is as follows.
1. Let M = O((q_S/ε)^2 dim(S)). Let L = (10/ε)(log M + log N + log k).

2. Enumerate all collections S′ ⊆ S of cardinality at most M. Note that we only need to enumerate at most N^M collections.

3. For a collection S′ ⊆ S, w.l.o.g., assume that S′ = {S_1, S_2, ..., S_m} (m ≤ M). Enumerate all sequences ((1+ε)^{a_1}, ..., (1+ε)^{a_m}) where each a_i is an integer with 0 ≤ a_i ≤ L.

4. Given a collection S′ = {S_1, S_2, ..., S_m} and a sequence ((1+ε)^{a_1}, ..., (1+ε)^{a_m}), we construct a weight function w′: S′ → R+ as follows: for a point set S_i ∈ S′, let w′(S_i) = (1+ε)^{a_i} · w(S_i)/M. Recall that w(S_i) is the weight of S_i ∈ S.
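The four steps above can be sketched as a brute-force generator. This is illustrative only: the log k term is kept via a placeholder parameter k, and the enumeration is exponential in M, i.e., polynomial only when k, d, and ε are constants:

```python
from itertools import combinations, product
from math import ceil, log

def enumerate_weighted_subcollections(S, w, M, eps, k=1):
    """Yield every candidate (sub-collection, weights) pair of Steps 1-4.
    S: list of point sets; w: their weights; M: maximum coreset size."""
    N = len(S)
    # Step 1: the upper bound L on the exponents a_i.
    L = ceil((10 / eps) * (log(max(M, 2)) + log(max(N, 2)) + log(max(k, 2))))
    for m in range(1, M + 1):
        for idx in combinations(range(N), m):            # Step 2
            for a in product(range(L + 1), repeat=m):    # Step 3
                sub = [S[i] for i in idx]
                weights = [(1 + eps) ** a_i * w[i] / M   # Step 4
                           for a_i, i in zip(a, idx)]
                yield sub, weights
```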
Analysis. Recall that given an instance P of a stochastic minimum k-center problem, we first reduce it to an instance S = E(P) of a generalized k-median problem. Note
that the cardinality of S is at most N = n^{O(k/ε^d)}, and the cardinality of a generalized ε-coreset is at most M = O(ε^{−(d+2)}d^2k^4) by Theorem 77. Thus, we enumerate at most N^M = n^{O(ε^{−(2d+2)}d^2k^5)} (polynomially many) sub-collections S^i ⊆ S. For each collection S^i, we construct at most M^{L+1} = n^{O(k/ε^d)} (polynomially many) weight functions. In total, we enumerate N^M · M^{L+1} = n^{O(ε^{−(2d+2)}d^2k^5)} (polynomially many) weighted sub-collections.
It remains to show that one of the enumerated weighted sub-collections is a generalized ε-coreset of S. We first have the following lemma.
Lemma 80. Given an instance S = {S_i | S_i ∈ Ud, 1 ≤ i ≤ N} of a generalized k-median problem in R^d with a weight function w: S → R+, there exists a generalized ε-coreset S′ ⊆ S with a weight function w′: S′ → R+, such that

∑_{S∈S′} w′(S) · K(S, F) ∈ (1 ± ε) ∑_{S∈S} w(S) · K(S, F).

The cardinality of S′ is at most M = O((q_S/ε)^2 dim(S)). Moreover, each weight w′(S) (S ∈ S′) has the form w′(S) = c · q_S · w(S) / (q(S) · M), where 1 ≤ c ≤ M is an integer.
Proof. For each S ∈ S, let g_S: F → R+ be defined as g_S(F) = w(S) · K(S, F)/q(S). Let D = {g_S | S ∈ S} be a collection, together with a weight function w″: D → R+ defined as w″(g_S) = q(S). Note that for any k-point set F ∈ F, we have

∑_{g_S∈D} w″(g_S) · g_S(F) = ∑_{S∈S} w(S) · K(S, F) = K(S, F).

By Theorem 4.1 in [45], we can randomly sample (with replacement) a collection S′ ⊆ D of cardinality at most M = O((q_S/ε)^2 dim(S)), together with a weight function w′: S′ → R+ defined as w′(g_S) = q_S/M. Then the multi-set S′ satisfies that for every F ∈ F,

∑_{g_S∈S′} w′(g_S) · g_S(F) ∈ (1 ± ε) ∑_{g_S∈D} w″(g_S) · g_S(F) = (1 ± ε)K(S, F).

Unwinding the definitions of g_S and w′ proves the lemma.
We are ready to prove the following lemma.

Lemma 81. Among all sub-collections S′ ⊆ S of cardinality at most M = O((q_S/ε)^2 dim(S)), together with weight functions w′: S′ → R+ of the form w′(S_i) = (1+ε)^{a_i} · w(S_i)/M (where 0 ≤ a_i ≤ 10(log M + log N + log k)/ε is an integer), there exists a generalized ε-coreset of S.
Proof. By Lemma 80, there exists a generalized ε-coreset S′ ⊆ S of cardinality at most M, together with a weight function w′: S′ → R+ defined as follows: each weight w′(S) (S ∈ S′) has the form w′(S) = c_S · q_S · w(S) / (q(S) · M) for some integer 1 ≤ c_S ≤ M. W.l.o.g., we assume that S′ = {S_1, S_2, ..., S_m} (m ≤ M).

By the definition of q(S), we have 1/N ≤ q(S) ≤ q_S = G_S + 1 ≤ 4k + 4. Then we conclude that for each S ∈ S′,

1 ≤ c_S · q_S / q(S) ≤ (4k + 4)MN.

For 1 ≤ i ≤ m, let a_i = ⌊log_{1+ε}(c_{S_i} · q_S / q(S_i))⌋. Note that each a_i satisfies 0 ≤ a_i ≤ 10(log M + log N + log k)/ε. Thus, we have enumerated the sub-collection S′ = {S_1, S_2, ..., S_m} with a weight function w″: S′ → R+ such that w″(S_i) = (1+ε)^{a_i} · w(S_i)/M. Moreover, for any k-point set F, we have the following:

∑_{1≤i≤m} w″(S_i) · K(S_i, F) = ∑_{1≤i≤m} (1+ε)^{a_i} · (w(S_i)/M) · K(S_i, F) ∈ (1 ± ε) ∑_{1≤i≤m} (c_{S_i} · q_S · w(S_i) / (q(S_i) · M)) · K(S_i, F)
= (1 ± ε) ∑_{1≤i≤m} w′(S_i) · K(S_i, F) ∈ (1 ± 3ε) ∑_{S∈S} w(S) · K(S, F).

The last containment holds because the sub-collection S′ with weight function w′ is a generalized ε-coreset of S. Rescaling ε to ε′ = ε/3 proves the lemma.
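The only approximation introduced in this proof is the rounding of the factor c_S · q_S / q(S) down to a power of (1+ε). A quick numeric check of that step (the inputs below are arbitrary):

```python
from math import floor, log

def round_weight_factor(c, q_S, q_Si, eps):
    """Round x = c*q_S/q(S_i) down to (1+eps)^a with
    a = floor(log_{1+eps} x), as in the proof of Lemma 81."""
    x = c * q_S / q_Si
    a = floor(log(x, 1 + eps))
    return a, (1 + eps) ** a

a, y = round_weight_factor(c=7, q_S=4.0, q_Si=0.5, eps=0.1)
x = 7 * 4.0 / 0.5
# The rounded factor lies in [x/(1+eps), x], i.e. it is a
# (1 +/- eps)-approximation of the true coreset weight factor.
assert x / 1.1 <= y <= x
print(a, y)
```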
5.3.2 Locational uncertainty model
Next, we consider the stochastic minimum k-center problem in the locational uncertainty model. Given an instance of m nodes v_1, ..., v_m which may be located at points of the set P = {s_1, ..., s_n | s_i ∈ R^d}, our construction of additive ε-coresets and the method for bounding the total sensitivity are exactly the same as in the existential uncertainty model. The only difference is how to compute, for an additive ε-coreset S, the probability Pr_{P∼P}[E(P) = S] = ∑_{P: P∼P, E(P)=S} Pr[P]. Here, P ∼ P is a point set realized according to the probability distribution of P. Run A(S), and construct a Cartesian grid G(S). Denote by T(S) = (∪_{P: P∼P, E(P)=S} P) \ S the collection of all points s which might be contained in some realization P ∼ P with E(P) = S. Recall that C(S) = {C ∈ G(S) | |C ∩ S| = 1} is the collection of d-dimensional Cartesian cells C which contain a point s_C ∈ S. By Lemma 70, for any realization P with E(P) = S, we have the following observations.
1. For any cell C ∉ C(S), we have C ∩ P = ∅. This means that every point s ∈ C satisfies s ∉ T(S).

2. For any cell C ∈ C(S) and any point s′ ∈ C with a smaller index than that of s_C, we have s′ ∉ P. This means that s′ ∉ T(S).
By the above observations, we conclude that T(S) is exactly the collection of points s′ that belong to some cell C ∈ C(S) and have a larger index than that of s_C.
Then we reduce the counting problem for Pr_{P∼P}[E(P) = S] to a family of bipartite holant problems. We first give the definition of holant problems.
Definition 82. An instance of a holant problem is a tuple Λ = (G(V, E), (g_u)_{u∈V}, (w_e)_{e∈E}), where for every u ∈ V, g_u: {0,1}^{E_u} → R+ is a function, and E_u is the set of edges incident to u. For every assignment σ ∈ {0,1}^E, we define the weight of σ as

w_Λ(σ) ≜ ∏_{u∈V} g_u(σ|_{E_u}) · ∏_{e∈σ} w_e.

Here, σ|_{E_u} is the restriction of σ to E_u, and e ∈ σ means that σ(e) = 1. We denote the value of the holant problem by Z(Λ) ≜ ∑_{σ∈{0,1}^E} w_Λ(σ).
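For the exact-degree constraints used below (g_u = (= c)), Z(Λ) can be computed by brute force on tiny instances. A sketch with hypothetical edge weights:

```python
from itertools import product
from math import prod

def holant_value(edges, weights, degree_req):
    """Brute-force Z(Lambda) of Definition 82 in the special case where
    every g_u is an exact-degree constraint (= degree_req[u])."""
    Z = 0.0
    for sigma in product([0, 1], repeat=len(edges)):
        deg = dict.fromkeys(degree_req, 0)
        for bit, (u, v) in zip(sigma, edges):
            deg[u] += bit
            deg[v] += bit
        if all(deg[u] == c for u, c in degree_req.items()):
            Z += prod(w for bit, w in zip(sigma, weights) if bit)
    return Z

# Two nodes v1, v2; U = {s1, t}; l_1 = l_t = 1: exactly one node realized
# at s1 and one inside T(S). The weights are hypothetical p-values.
edges = [('v1', 's1'), ('v1', 't'), ('v2', 's1'), ('v2', 't')]
weights = [0.3, 0.2, 0.4, 0.5]  # p_{1,s1}, w_{1t}, p_{2,s1}, w_{2t}
req = {'v1': 1, 'v2': 1, 's1': 1, 't': 1}
print(holant_value(edges, weights, req))  # 0.3*0.5 + 0.2*0.4 = 0.23
```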
For a counting problem Pr_{P∼P}[E(P) = S], w.l.o.g., we assume that S = {s_1, ..., s_{|S|}}. Then we construct a family of holant instances Λ_L as follows.

1. Enumerate all integer sequences L = (l_1, ..., l_{|S|}, l_t) such that ∑_{1≤i≤|S|} l_i + l_t = m, l_i ≥ 1 (1 ≤ i ≤ |S|), and l_t ≥ 0. Let L be the collection of all these integer sequences L.
2. For a sequence L, let Λ_L = (G(U, V, E), (g_u)_{u∈U∪V}) be a holant instance on a bipartite graph, where V = {v_1, ..., v_m} and U = S ∪ {t} (we use the vertex t to represent the collection T(S)).

3. The weight function w: E → R+ is defined as follows:

(a) For a vertex v_i ∈ V and a vertex s_j ∈ S, w_{ij} = p_{ij}.

(b) For a vertex v_i ∈ V and the vertex t ∈ U, w_{it} = ∑_{s_j∈T(S)} p_{ij}.

4. For each vertex v ∈ V, the function g_v = (= 1).^{10} For each vertex s_i ∈ S, the function g_{s_i} = (= l_i), and the function g_t = (= l_t).
Since each S ∈ E(P) is of constant size, we only need to enumerate at most O(m^{|S|+1}) = poly(n) integer sequences L. Given an integer sequence L = (l_1, ..., l_{|S|}, l_t), we can see that Z(Λ_L) is exactly the probability that l_i nodes are realized at point s_i ∈ S (for all 1 ≤ i ≤ |S|) and l_t nodes are realized inside the point set T(S). Then, by Lemma 70, we have the following equality:

Pr_{P∼P}[E(P) = S] = ∑_{L∈L} Z(Λ_L).
It remains to show that we can compute each Z(Λ_L) efficiently. Fortunately, we have the following lemma.

Lemma 83 ([67, 103]). For any bipartite holant instance Λ_L with a specified integer sequence L, there exists an FPRAS to compute the holant value Z(Λ_L).
Thus, we have the following theorem.
Theorem 84. If both k and d are constants, given an instance of m stochastic nodes
in Rd in the locational uncertainty model, there exists a PTAS for the stochastic
minimum k-center problem.
Combining Theorems 77 and 84, we obtain the main result, Theorem 12.
10. Here, the function g_v = (= i) means that the function value g_v is 1 if exactly i edges incident to v have value 1 in the assignment; otherwise, g_v = 0.
5.4 Stochastic Minimum j-Flat-Center
In this section, we consider a generalized shape fitting problem: the minimum j-flat-center problem in the stochastic models. Let F be the family of all j-flats in R^d. Our main technique is to construct an SJFC-Coreset of constant size, which satisfies that for any j-flat F ∈ F, we can use the SJFC-Coreset to obtain a (1 ± ε)-estimate of the expected j-flat-center value J(P, F). Since the SJFC-Coreset is of constant size, we then obtain a polynomial system of constant size to compute the optimum in constant time.
Let B = ∑_{1≤i≤n} p_i be the total probability. We distinguish two cases. If B < ε, we reduce the problem to a weighted j-flat-median problem, which has been studied in [106]. If B ≥ ε, the construction of an SJFC-Coreset can be divided into two parts. We first construct a convex hull such that, with high probability (at least 1 − ε), all points are realized inside it. Then we construct a collection of point sets to estimate the contribution of points inside the convex hull. For the case that some point appears outside the convex hull, we again reduce the problem to a weighted j-flat-median problem. The weighted j-flat-median problem is defined as follows.
Definition 85. For some 0 ≤ j ≤ d − 1, let F be the family of all j-flats in R^d. Given a set P of n points in R^d together with a weight function w: P → R+, denote cost(P, F) = ∑_{s_i∈P} w_i · d(s_i, F). The weighted j-flat-median problem is to find a shape F ∈ F which minimizes the value cost(P, F).
5.4.1 Case 1: B < ε
In the first case, we show that the minimum j-flat-center problem can be reduced to a weighted j-flat-median problem. We need the following lemmas.

Lemma 86. If B < ε, then for any j-flat F ∈ F, we have ∑_{s_i∈P} p_i · d(s_i, F) ∈ (1 ± ε) · J(P, F).
Proof. For a j-flat F ∈ F, w.l.o.g., we assume that d(s_i, F) is non-decreasing in i.
Thus, we have

J(P, F) = ∑_{i∈[n]} p_i · d(s_i, F) · ∏_{j>i} (1 − p_j).

Since B < ε, for any i ∈ [n] we have 1 − ε ≤ 1 − ∑_{j∈[n]} p_j ≤ ∏_{j>i} (1 − p_j) ≤ 1. This proves the lemma.
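The two quantities of Lemma 86 can be compared numerically. The sketch below computes J(P, F) from the sorted-distance formula in the proof and checks the sandwich J ≤ ∑_i p_i d_i ≤ J/(1−B); the distances and probabilities are arbitrary example values with B = 0.06:

```python
def J_value(probs, dists):
    """Expected j-flat-center value for a fixed flat: dists must be sorted in
    non-decreasing order; point i contributes d_i exactly when it is realized
    and no farther point is, i.e. with factor prod_{j>i} (1 - p_j)."""
    val, surv = 0.0, 1.0  # surv accumulates prod_{j>i}(1 - p_j), from the back
    for p, d in reversed(list(zip(probs, dists))):
        val += p * d * surv
        surv *= 1 - p
    return val

probs = [0.01, 0.02, 0.03]   # total probability B = 0.06 < eps
dists = [1.0, 2.0, 5.0]
linear = sum(p * d for p, d in zip(probs, dists))
assert J_value(probs, dists) <= linear <= J_value(probs, dists) / (1 - 0.06)
print(J_value(probs, dists), linear)
```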
By Lemma 86, we reduce the original problem to a weighted j-flat-median problem, where each point s_i ∈ P has weight p_i. We then need the following lemma to bound the total sensitivity.

Lemma 87 (Theorem 18 in [106]).^{11} Consider the weighted j-flat-median problem where F is the set of all j-flats in R^d. The total sensitivity of any weighted n-point set is O(j^{1.5}).

On the other hand, we know that the dimension of the weighted j-flat-median problem is O(jd) by [45]. Then, by Lemma 65, there exists an ε-coreset S ⊆ P of cardinality O(j^4dε^{−2} log(jd)) = O(j^4dε^{−2}) which estimates the j-flat-median value ∑_{s_i∈P} p_i · d(s_i, F) for any j-flat F ∈ F.^{12} Moreover, we can compute a constant-factor approximate j-flat in O(nd·j^{O(j^2)}) time by [44]. Then, by [106], we can construct an ε-coreset S in O(nd·j^{O(j^2)}) time. Combining with Lemma 86, we conclude the main lemma of this subsection.
Lemma 88. Given an instance P of n stochastic points in R^d, if the total probability ∑_i p_i < ε, then there exists an SJFC-Coreset of cardinality O(j^4dε^{−2}) for the minimum j-flat-center problem. Moreover, we have an O(nd·j^{O(j^2)})-time algorithm to compute the SJFC-Coreset.
5.4.2 Case 2: B ≥ ε
Note that if F is a j-flat, the function d(x, F)^2 has a linearization. Here, a linearization maps the function d(x, F)^2 to a k-variate linear function through a variate
11. Theorem 18 in [106] bounds the total sensitivity for the unweighted version. However, the proof can be extended to the weighted version in a straightforward manner.
12. We remark that for the j-flat-median problem, Feldman and Langberg [45] showed that there exists a coreset of size O(jdε^{−2}). However, it is unclear how to generalize their technique to the weighted version.
embedding. The number k is called the dimension of the linearization; see [8]. We have the following lemma to bound the dimension of the linearization.

Lemma 89 ([46]). Suppose F is a j-flat in R^d. Then the function d(x, F)^2 (x ∈ R^d) has a linearization. Let D be the dimension of the linearization. If j = 0, we have D = d + 1. If j = 1, we have D = O(d^2). Otherwise, for 2 ≤ j ≤ d − 1, we have D = O(j^2d^3).
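As a sanity check of what a linearization is, here is the classical lifting for j = 0 (F is a single point c): d(x, c)^2 becomes an inner product of lifted vectors. The sketch uses d + 2 coordinates for readability; the tight bound of Lemma 89 for j = 0 is D = d + 1.

```python
def lift_point(x):
    """Variate embedding s' of a point x in R^d."""
    return [sum(t * t for t in x)] + list(x) + [1.0]

def lift_center(c):
    """Direction u with <lift_point(x), u> = d(x, c)^2 for every x."""
    return [1.0] + [-2.0 * t for t in c] + [sum(t * t for t in c)]

x, c = [1.0, 2.0, -1.0], [0.5, -1.0, 2.0]
inner = sum(a * b for a, b in zip(lift_point(x), lift_center(c)))
squared = sum((a - b) ** 2 for a, b in zip(x, c))
assert abs(inner - squared) < 1e-12   # ||x||^2 - 2<x,c> + ||c||^2 = d(x,c)^2
print(inner)
```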
Suppose P is an instance of n stochastic points in R^d. For each j-flat F in R^d, let h_F(x) = d(x, F)^2 (x ∈ R^d), which admits a linearization of dimension O(j^2d^3) by Lemma 89. Now, we map each point s ∈ P to an O(j^2d^3)-dimensional point s′ and each j-flat F in R^d to an O(j^2d^3)-dimensional direction ~u, such that d(s, F) = ⟨s′, ~u⟩^{1/2}. For convenience, we still use P to represent the collection of points after linearization. Recall that Pr[P] is the probability of the realization P ∼ P. By this mapping, we translate our goal into finding a direction ~u which minimizes the expected value E_{P∼P}[max_{x∈P} ⟨~u, x⟩^{1/2}] = ∑_{P∼P} Pr[P] · max_{x∈P} ⟨~u, x⟩^{1/2}. We also denote by P⋆ = {~u | ⟨~u, s⟩ ≥ 0 for all s ∈ P} the polar set of P. We only care about the directions in the polar set P⋆, for which ⟨~u, s⟩^{1/2} is well defined for all s ∈ P.
We first construct a convex hull H to partition the realizations into two parts. Our construction uses the method of the (ε, τ)-quant-kernel construction in Chapter 4. For any normal vector (direction) ~u, we move a sweep line ℓ_~u orthogonal to ~u, along the direction ~u, to sweep through the points in P. We stop the movement of ℓ_~u at the first point such that Pr[P(H̄_~u)] ≥ ε′, where ε′ = ε^{O(j^2d^3)} is a fixed constant. Here, H_~u denotes the halfplane defined by the sweep line ℓ_~u (orthogonal to the normal vector ~u), H̄_~u denotes its complement, and P(H̄_~u) = P ∩ H̄_~u denotes the set of points swept by the sweep line ℓ_~u. We repeat the above process for all normal vectors (directions) ~u, and let H = ∩_~u H_~u. Since the total probability B ≥ ε, H is nonempty by Helly's theorem. We also know that H is a convex hull by Chapter 4. Moreover, we have the following lemma.
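For one fixed direction, the sweep can be sketched as follows. Here Pr[P(H̄_~u)] is interpreted as the probability that at least one swept point is realized (an assumption consistent with the existential model); the points and probabilities are hypothetical:

```python
def sweep_halfplane(points, probs, u, eps_prime):
    """Sweep a line orthogonal to direction u from far to near; stop at the
    first point where the swept set is nonempty with probability >= eps',
    i.e. 1 - prod(1 - p_i) >= eps'. Returns the swept point indices."""
    proj = lambda x: sum(a * b for a, b in zip(x, u))
    order = sorted(range(len(points)), key=lambda i: proj(points[i]),
                   reverse=True)
    swept, none_realized = [], 1.0
    for i in order:
        swept.append(i)
        none_realized *= 1 - probs[i]
        if 1 - none_realized >= eps_prime:
            break
    return swept

points = [(0.0, 0.0), (2.0, 0.0), (5.0, 1.0)]
probs = [0.9, 0.5, 0.1]
print(sweep_halfplane(points, probs, (1.0, 0.0), 0.5))  # [2, 1]
```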
Lemma 90 (Lemma 39 and Theorem 5). Suppose the dimensionality is d. There is a convex set K, which is an intersection of O(ε^{−(d−1)/2}) halfspaces and satisfies (1 − ε)K ⊆ H ⊆ K. Moreover, K can be constructed in O(n log^{O(d)} n) time.
By the above lemma, we construct a convex set K = ∩_~u K_~u, which is the intersection of ε^{−O(j^2d^3)} halfspaces K_~u (~u is the direction orthogonal to the boundary of K_~u). Let K̄_~u be the complement of K_~u, and let P(K̄_~u) = P ∩ K̄_~u be the set of points in K̄_~u. Denote by P(K̄) the set of points outside the convex set K. Then we have the following lemma, which shows that the total probability outside K is very small.
Lemma 91. Let K be a convex set constructed as in Lemma 90. Then the total probability Pr[P(K̄)] ≤ ε.
Proof. Write K = ∩_~u K_~u and consider a halfspace K_~u. By Lemma 90, the convex set K satisfies H ⊆ K. Thus, we have Pr[P(K̄_~u)] ≤ Pr[P(H̄_~u)] ≤ ε′ by the definition of H_~u.

Note that Pr[P(K̄)] is upper bounded by the product of ε′ and the number of halfspaces of K. By Lemma 90, there are at most ε^{−O(j^2d^3)} halfspaces K_~u. Thus, choosing the constant in ε′ = ε^{O(j^2d^3)} appropriately, we have Pr[P(K̄)] ≤ ε.
Our construction of the SJFC-Coreset consists of two parts. For points inside K, we construct a collection S_1. Our construction is almost the same as the (ε, r)-fpow-kernel construction in Chapter 4, except that the cardinality of the collection S_1 is different. For completeness, we provide the details of the construction here. Let P(K) be the collection of points in K ∩ P; then P(K) is also an instance of a stochastic minimum j-flat-center problem. We show that we can estimate E_{P∼P(K)}[max_{x∈P} ⟨~u, x⟩^{1/2}] by S_1. For the remaining points outside K, we show that their contribution to the objective function E_{P∼P}[max_{x∈P} ⟨~u, x⟩^{1/2}] is almost linear and can be reduced to a weighted j-flat-median problem as in Case 1.
We first show how to construct S_1 for the points inside K, as follows.

1. Sample N = O((ε′ε)^{−2} ε^{−O(j^2d^3)} log(1/ε)) = ε^{−O(j^2d^3)} independent realizations restricted to P(K).

2. For each realization S_i, use the algorithm in [7] to find a deterministic ε-kernel E_i of size ε^{−O(j^2d^3)}. Here, a deterministic ε-kernel E_i satisfies (1 − ε)CH(S_i) ⊆ CH(E_i) ⊆ CH(S_i), where CH(·) is the convex hull of the point set.

3. Let S_1 = {E_i | 1 ≤ i ≤ N} be the collection of all ε-kernels, and give each ε-kernel E_i weight 1/N.
Hence, the total size of S_1 is ε^{−O(j^2d^3)}. For any direction ~u ∈ P⋆, we use (1/N) ∑_{E_i∈S_1} max_{x∈E_i} ⟨~u, x⟩^{1/2} as an estimate of E_{P∼P(K)}[max_{x∈P} ⟨~u, x⟩^{1/2}]. By Chapter 4, we have the following lemma.
Lemma 92 (Lemmas 44-46). For any direction ~u ∈ P⋆, let M_~u = max_{x∈P(K)} ⟨~u, x⟩^{1/2}. We have that

(1/N) ∑_{E_i∈S_1} max_{x∈E_i} ⟨~u, x⟩^{1/2} ∈ (1 ± ε/2) E_{P∼P(K)}[max_{x∈P} ⟨~u, x⟩^{1/2}] ± ε′ε(1 − ε)M_~u/4.
Now we are ready to prove the following lemma.

Lemma 93. For any direction ~u ∈ P?, we have the following property:

(1/N) ∑_{Ei∈S1} max_{x∈Ei}〈~u, x〉^{1/2} + ∑_{si∈P(K̄)} pi·〈~u, si〉^{1/2} ∈ (1 ± 4ε)·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}].
Proof. Let E be the event that no point outside K is present. By the fact that Pr[P(K̄)] ≤ ε, we have Pr[E] = ∏_{si∈P(K̄)}(1 − pi) ≥ 1 − ∑_{si∈P(K̄)} pi ≥ 1 − ε. Thus, we conclude that 1 − ε ≤ Pr[E] ≤ 1. We first rewrite E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}] as follows:

E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}] = Pr[E]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | E] + Pr[Ē]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ē]
= Pr[E]·E_{P∼P(K)}[max_{x∈P}〈~u, x〉^{1/2}] + Pr[Ē]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ē].

For event E, we bound the term Pr[E]·E_{P∼P(K)}[max_{x∈P}〈~u, x〉^{1/2}] via the collection S1. Let M~u = max_{x∈P(K)}〈~u, x〉^{1/2}. By Lemma 92, for any direction ~u ∈ P?, we have that

(1/N) ∑_{Ei∈S1} max_{x∈Ei}〈~u, x〉^{1/2} ∈ (1 ± ε/2)·E_{P∼P(K)}[max_{x∈P}〈~u, x〉^{1/2}] ± ε′ε(1 − ε)M~u/4.
By Lemma 90, we have (1 − ε)K ⊆ H. Then by the construction of H~u, we have Pr[P ∩ (1 − ε)K~u ≠ ∅] ≥ ε′. Thus, we obtain that

E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}] ≥ ε′(1 − ε)·max_{x∈P(K)}〈~u, x〉^{1/2} = ε′(1 − ε)M~u.

So we conclude that

(1 − 2ε)·Pr[E]·E_{P∼P(K)}[max_{x∈P}〈~u, x〉^{1/2}] − ε·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}]
≤ (1/N) ∑_{Ei∈S1} max_{x∈Ei}〈~u, x〉^{1/2}
≤ (1 + 2ε)·Pr[E]·E_{P∼P(K)}[max_{x∈P}〈~u, x〉^{1/2}] + ε·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}],   (5.6)

since 1 − ε ≤ Pr[E] ≤ 1.
For event Ē, without loss of generality, we assume that the n points s1, . . . , sn in P are sorted in nondecreasing order of the inner product 〈~u, si〉. Assume that si1, . . . , sil (i1 < i2 < . . . < il) are the points in P(K̄). Let Ej be the event that point sij is present and all points sik with k > j are not present. We have that

Pr[Ē]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ē] = ∑_{j∈[l]} Pr[Ej]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ej]
= ∑_{j∈[l]} pij·(∏_{j+1≤k≤l}(1 − pik))·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ej].

By the above equality, on one hand, we have that

Pr[Ē]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ē] ≥ (1 − ε)·∑_{j∈[l]} pij·〈~u, sij〉^{1/2},   (5.7)

since max_{x∈P}〈~u, x〉^{1/2} ≥ 〈~u, sij〉^{1/2} if event Ej happens. On the other hand, the following inequality also holds.
Pr[Ē]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ē] = ∑_{j∈[l]} Pr[Ej]·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2} | Ej]
≤ ∑_{j∈[l]} Pr[Ej]·E_{P∼P}[〈~u, sij〉^{1/2} + max_{x∈P∩P(K)}〈~u, x〉^{1/2} | Ej]
≤ ∑_{j∈[l]} pij·(E_{P∼P}[〈~u, sij〉^{1/2} | Ej] + E_{P∼P}[max_{x∈P∩P(K)}〈~u, x〉^{1/2} | Ej])
≤ ∑_{j∈[l]} pij·〈~u, sij〉^{1/2} + ∑_{j∈[l]} pij·E_{P∼P(K)}[max_{x∈P}〈~u, x〉^{1/2}]
≤ ∑_{j∈[l]} pij·〈~u, sij〉^{1/2} + ε·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}].   (5.8)

The last inequality holds since ∑_{j∈[l]} pij = Pr[P(K̄)] ≤ ε by Lemma 91. Combining Inequalities (5.6), (5.7), and (5.8), we prove the lemma.
By Lemma 88, we construct a point set S2, together with a weight function w′ : S2 → R, to estimate ∑_{si∈P(K̄)} pi·d(si, F). The size of S2 can be bounded by O(j^4dε^{-2}). Then S = S1 ∪ S2 is a collection of constant size, which satisfies the following property:

(1/N) ∑_{Ei∈S1} max_{x∈Ei}〈~u, x〉^{1/2} + ∑_{si∈S2} w′i·〈~u, si〉^{1/2} ∈ (1 + O(ε))·E_{P∼P}[max_{x∈P}〈~u, x〉^{1/2}].   (5.9)

Here w′i is the weight of si in S2. We can regard S2 = {si | 1 ≤ i ≤ |S2|} as a collection of singleton point sets {si}. Then by Inequality (5.9), S is a generalized ε-coreset satisfying Definition 62. We conclude the following lemma.
Lemma 94. Given an instance P of n stochastic points of the stochastic minimum j-flat-center problem in the existential model, if the total probability ∑_i pi ≥ ε, there exists an SJFC-Coreset S containing O(ε^{-O(j^2d^3)} + j^4dε^{-2}) point sets, each of size at most O(ε^{-O(j^2d^3)}), together with a weight function w′ : S → R+, which satisfies that for any j-flat F ∈ F,

∑_{S∈S} w′(S)·J(S, F) ∈ (1 ± ε)·J(P, F).
Combining Lemma 88 and Lemma 94, we obtain the following theorem.

Theorem 95. Given an instance P of n stochastic points in the existential model, there is an SJFC-Coreset of size O(ε^{-O(j^2d^3)} + j^4dε^{-2}) for the minimum j-flat-center problem. Moreover, we have an O(n log^{O(d)} n + ε^{-O(j^2d^3)}n) time algorithm to compute the SJFC-Coreset.

Proof. We only need to prove the running time. Recall that the SJFC-Coreset S can be divided into two parts S = S1 ∪ S2. For the first part S1, we construct the convex set K in O(n log^{O(d)} n) time by Lemma 90. Then we construct S1 by taking O(ε^{-O(j^2d^3)}) independent realizations restricted to P(K). For each sample, we construct a deterministic ε-kernel in O(n + ε^{-(d-3/2)}) time by [110]. So the total time for constructing S1 is O(n log^{O(d)} n + ε^{-O(j^2d^3)}n). On the other hand, we can construct S2 in O(nd j^{O(j^2)}) time by Lemma 88. Thus, we prove the theorem.
PTAS for stochastic minimum j-flat-center. Given an SJFC-Coreset S together with a weight function w′ : S → R+ by Theorem 95, it remains to show how to compute the optimal j-flat for S. Our goal is to find the optimal j-flat F∗ such that the total generalized distance ∑_{S∈S} w′(S)·J(S, F∗) is minimized. The argument is similar to that for the stochastic minimum k-center problem.

We first divide the family F of j-flats into a constant number of sub-families. Each sub-family F′ ⊆ F has the following property: for each Si ∈ S and each j-flat F ∈ F′, the point si = arg max_{s∈Si} d(s, F) is fixed. By Lemma 41, hF(x) = d(x, F)^2 (x ∈ R^d) admits a linearization of dimension O(j^2d^3). For each sub-family F′, we can formulate the optimization problem as a polynomial system of constant degree, with a constant number of variables and a constant number of constraints. Then we can compute the optimal j-flat in constant time over all sub-families F′ ⊆ F. Thus, we can compute the optimal j-flat-center for the SJFC-Coreset S in constant time. We then have the following corollary.
Corollary 96. If the dimensionality d is a constant, then given an instance of n stochastic points in R^d in the existential uncertainty model, there exists a PTAS for the stochastic minimum j-flat-center problem running in O(n log^{O(d)} n + ε^{-O(j^2d^3)}n) time.
Locational Uncertainty Model. Note that in the locational uncertainty model, we only need to consider Case 2. We use the same construction as in the existential model. Let pi = ∑_j p_{vj si}. Similarly, we linearize the function d(x, F)^2, where x ∈ R^d and F ∈ F is a j-flat. Using this linearization, we also map P into O(j^2d^3)-dimensional points. For the jth node and a set P of points, we denote by pj(P) = ∑_{si∈P} p_{vj si} the total probability that the jth node is located inside P. By the condition Pr[P(K̄)] ≤ ε, we have Pr[E] = 1 − ∏_{j∈[m]}(1 − pj(K̄)) ≤ 1 − (1 − ε) = ε, where E represents the event that some point is present outside K. So we can regard those points outside K as independent. On the other hand, for any direction ~u, since Pr[P ∩ (1 − ε)H~u ≠ ∅] ≥ ε′, we have Pr[E~u] = 1 − ∏_{j∈[m]}(1 − pj(P ∩ (1 − ε)H~u)) ≥ 1 − (1 − ε′/m)^m ≥ ε′/2, where E~u represents the event that some point is present in P ∩ (1 − ε)H~u. Moreover, we can use the same method to construct a collection S1 as an estimation for the point set P(K) in the locational uncertainty model. So Lemma 93 still holds. Then by Lemma 94, we can construct an SJFC-Coreset of constant size.
Theorem 97. Given an instance P of n stochastic points in the locational uncertainty model, there is an SJFC-Coreset of cardinality O(ε^{-O(j^2d^3)} + j^4dε^{-2}) for the minimum j-flat-center problem. Moreover, we have a polynomial time algorithm to compute the generalized ε-coreset.

By a similar argument as in the existential model, we can give a PTAS for the locational uncertainty model. Then, combining with Corollary 96, we prove the main result, Theorem 13.
5.5 Constructing additive ε-coresets
In this section, we first give the algorithm for constructing an additive ε-coreset. We
construct Cartesian grids and maintain one point from each nonempty grid cell, which
is similar to [11]. However, our algorithm is more complicated. See Algorithm 2 for
details.
Now we analyze the algorithm.
Algorithm 2 Constructing additive ε-coresets (A)

1. Input: a realization P ∼ P. W.l.o.g., assume that P = {s1, . . . , sm}.
2. Let rP = min_{F: F⊆P, |F|=k} K(P, F). If rP = 0, output E(P) = P. Otherwise, assume that 2^a ≤ rP < 2^{a+1} (a ∈ Z).
3. Draw a d-dimensional Cartesian grid G1(P) of side length ε2^a/4d centered at the point 0^d.
4. Let C1(P) = {C | C ∈ G, C ∩ P ≠ ∅} be the collection of those cells which intersect P.
5. For each cell C ∈ C1(P), let sC ∈ C ∩ P be the point in C of smallest index. Let E1(P) = {sC | C ∈ C1(P)}.
6. Compute rE1(P) = min_{F: F⊆P, |F|=k} K(E1(P), F). If rE1(P) ≥ 2^a, let E(P) = E1(P), G(P) = G1(P), and C(P) = C1(P).
7. If rE1(P) < 2^a, draw a d-dimensional Cartesian grid G2(P) of side length ε2^a/8d centered at the point 0^d. Repeat Steps 4 and 5 to construct C2(P) and E2(P) based on the new Cartesian grid G2(P). Let E(P) = E2(P), G(P) = G2(P), and C(P) = C2(P).
8. Output E(P), G(P), and C(P).
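The grid-bucketing at the heart of Algorithm 2 (Steps 3–5: snap points to cells, keep the smallest-index point per nonempty cell) can be sketched as follows. This is hypothetical Python, not part of the dissertation; the caller supplies the side length ε2^a/4d computed in Steps 2–3.

```python
from math import floor

def grid_representatives(points, side):
    # Bucket points into a Cartesian grid of the given side length centered at
    # the origin, keeping, per nonempty cell, the point of smallest index.
    reps = {}  # cell id -> index of its representative point
    for idx, p in enumerate(points):
        cell = tuple(floor(c / side) for c in p)
        if cell not in reps:            # points are scanned in index order
            reps[cell] = idx
    C = set(reps)                                    # the nonempty cells C_1(P)
    E = [points[i] for i in sorted(reps.values())]   # the coreset E_1(P)
    return C, E
```

A single linear scan suffices because points are visited in index order, so the first point seen in a cell is automatically the smallest-index representative.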
Lemma 98. rP is a 2-approximation for the minimum k-center problem w.r.t. P .
Proof. By Gonzalez’s greedy algorithm [51], there exists a subset F ⊆ P ⊆ P of size
k such that the k-center value K(P, F ) is a 2-approximation for the minimum k-center
problem w.r.t. P . Thus, we prove the lemma.
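For intuition, Gonzalez's farthest-point greedy can be sketched in a few lines. The following is a hypothetical Python sketch for Euclidean points; the proof above only uses the resulting 2-approximation guarantee.

```python
def gonzalez(points, k):
    # Farthest-point greedy [51]: each new center is the point farthest from
    # the centers chosen so far; the final radius is a 2-approximation.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    centers = [points[0]]
    while len(centers) < min(k, len(points)):
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    radius = max(min(dist(p, c) for c in centers) for p in points)
    return centers, radius
```

Note that the centers chosen here are input points, matching the restriction F ⊆ P used in the definition of rP.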
By the above lemma, we obtain the following.

Lemma 68. The running time of A on any n-point set P is O(kn^{k+1}). Moreover, the output E(P) is an additive ε-coreset of P of size at most O(k/ε^d).

Proof. Since rP is a 2-approximation, E(P) is an additive ε-coreset of P of size O(k/ε^d) by Theorem 2.4 in [11]. For the running time, consider computing rP in Step 2 (and also rE1(P) in Step 6). There are at most n^k point sets F ⊆ P such that |F| = k. Note that computing K(P, F) costs at most nk time. Thus, it costs O(kn^{k+1}) time to compute rP (and also rE1(P)) over all k-point sets F ⊆ P. On the other hand, it only costs linear time to construct the Cartesian grid G(P), the cell collection C(P), and E(P) after computing rP and rE1(P), which finishes the proof.
We then give the following lemmas, which are useful for proving Lemma 70.
Lemma 99. For two point sets P, P ′, if P ′ ⊆ P , then rP ′ ≤ rP . Moreover, if P ′ is
an additive ε-coreset of P , then (1− ε)rP ≤ rP ′ ≤ rP .
Proof. Suppose F ⊆ P is the k-point set such that the k-center value K(P, F ) = rP .
Since P ′ ⊆ P , we have K(P ′, F ) ≤ rP . Thus, we have rP ′ ≤ K(P ′, F ) ≤ rP .
Moreover, assume that P ′ is an additive ε-coreset of P . Suppose F ′ ⊆ P is the
k-point set such that the k-center value K(P ′, F ′) = rP ′ . Then by Definition 67, we
have K(P, F ′) ≤ (1+ε)rP ′ . Thus, we have (1−ε)rP ≤ (1−ε)K(P, F ′) < rP ′ ≤ rP .
Lemma 100. Assume that a point set P′ = E(P) for another point set P ∼ P. Running A(P′) and A(P), assume that we obtain two Cartesian grids G(P′) and G(P) respectively. Then we have G(P′) = G(P).
Proof. If rP = 0, we have rP′ ≤ rP = 0 by Lemma 99. Thus we do not construct the Cartesian grid for either P or P′. Otherwise, assume that 2^a ≤ rP < 2^{a+1} (a ∈ Z). Run A(P). In Step 5, we construct a Cartesian grid G1(P) of side length ε2^a/4d, a cell collection C1(P), and a point set E1(P). Since E1(P) is an additive ε-coreset of P by [11], we have 2^{a−1} < (1 − ε)rP ≤ rE1(P) ≤ rP < 2^{a+1}. Then we consider the following two cases.

Case 1: rE1(P) ≥ 2^a. Then P′ = E(P) = E1(P), and G(P) = G1(P) in this case. Running A(P′), we have 2^a ≤ rE1(P) = rP′ ≤ rP < 2^{a+1} by Lemma 99. Thus, we construct a Cartesian grid G1(P′) = G1(P) of side length ε2^a/4d, and a point set E1(P′) in Step 5. Since G1(P′) = G1(P) and P′ = E1(P), we have E1(P′) = P′ by the construction of E1(P′). Thus, rE1(P′) = rE1(P) ≥ 2^a, and we obtain G(P′) = G1(P′) in Step 6, which proves the lemma.

Case 2: 2^{a−1} ≤ rE1(P) < 2^a. Then in Step 7, we construct a Cartesian grid G2(P) of side length ε2^a/8d for P, a cell collection C2(P), and a point set E2(P). In this case, we have E(P) = E2(P), G(P) = G2(P), and C(P) = C2(P). Now run A(P′), and obtain E(P′), G(P′), and C(P′). By Lemma 99, we have

2^{a+1} > rP ≥ rP′ = rE(P) ≥ (1 − ε)rP > 2^{a−1}.

We need to consider two cases. If 2^{a−1} ≤ rP′ < 2^a, we construct a Cartesian grid
G1(P′) of side length ε2^a/8d, and a point set E1(P′) in Step 5. Since G1(P′) = G2(P) and P′ = E2(P), we have E1(P′) = P′ by the construction of E1(P′). Then we let G(P′) = G1(P′) in Step 6. In this case, both G(P) and G(P′) are of side length ε2^a/8d, which proves the lemma.

Otherwise, if 2^a ≤ rP′ < 2^{a+1}, we construct the Cartesian grid G1(P′) = G1(P) of side length ε2^a/4d, a cell collection C1(P′), and a point set E1(P′) in Step 5. We then prove that E1(P′) = E1(P). Since all Cartesian grids are centered at the point 0^d, a cell in G1(P) can be partitioned into 2^d equal cells in G2(P). Rewrite a cell C∗ ∈ G1(P) as C∗ = ∪_{1≤i≤2^d} Ci, where each Ci ∈ G2(P). Assume that point s∗ ∈ C∗ ∩ P = ∪_{1≤i≤2^d}(Ci ∩ P) has the smallest index; then point s∗ is also the point in C∗ ∩ E2(P) of smallest index. Since E(P) = E2(P), we have that s∗ is the point in C∗ ∩ E(P) of smallest index. Considering E1(P′), note that for each cell C∗ ∈ C1(P′), E1(P′) only contains the point in C∗ ∩ P′ of smallest index. Since P′ = E(P), we have E1(P′) = E1(P). Thus, we conclude that rE1(P′) = rE1(P) < 2^a. Then in Step 7, we construct a Cartesian grid G2(P′) = G2(P) of side length ε2^a/8d for P′. Finally, we output G(P′) = G2(P′) = G(P), which proves the lemma.
Recall that we denote by E(P) = {E(P) | P ∼ P} the collection of all possible additive ε-coresets. For any S, we denote by E^{−1}(S) = {P ∼ P | E(P) = S} the collection of all realizations mapped to S. Now we are ready to prove Lemma 70.

Lemma 70. (restated) Consider a subset S of at most O(k/ε^d) points. Run algorithm A(S), which outputs an additive ε-coreset E(S), a Cartesian grid G(S), and a collection C(S) of nonempty cells. If E(S) ≠ S, then S ∉ E(P) (i.e., S is not the output of A for any realization P ∼ P). If |S| ≤ k, then E^{−1}(S) = {S}. Otherwise, if E(S) = S and |S| ≥ k + 1, then a point set P ∼ P satisfies E(P) = S if and only if

P1. For any cell C ∉ C(S), C ∩ P = ∅.

P2. For any cell C ∈ C(S), assume that sC is the unique point in C ∩ S. Then sC ∈ P, and any point s′ ∈ C with a smaller index than that of sC does not appear in the realization P.
Proof. If E(S) ≠ S, we have that rS > 0. Assume that S ∈ E(P). Then there must exist some point set P ∼ P such that E(P) = S. By Lemma 100, running A(P) and A(S), we obtain the same Cartesian grid G(P) = G(S). Since E(S) ≠ S, there must exist a cell C ∈ C(S) such that |C ∩ S| ≥ 2 (by the construction of E(S)). Note that C ∈ G(P). We have |C ∩ E(P)| = 1, which is a contradiction with E(P) = S. Thus, we conclude that S ∉ E(P).

If |S| ≤ k, assume that there exists another point set P ≠ S such that E(P) = S. By Lemma 68, we know that S is an additive ε-coreset of P. By Definition 67, we have S ⊆ P and K(P, S) ≤ (1 + ε)K(S, S) = 0. Thus we conclude that P = S. On the other hand, we have E(S) = S since rS = 0. So we conclude that E^{−1}(S) = {S}.

If |S| ≥ k + 1 and E(S) = S, we have that rS > 0. Running A(P) and A(S), assume that we obtain two Cartesian grids G(P) and G(S) respectively. By Lemma 100, if E(P) = S, then we have G(P) = G(S). Moreover, by the construction of E(P), P1 and P2 must be satisfied.

We then prove the 'if' direction. If P1 and P2 are satisfied, we have that S is an additive ε-coreset of P satisfying Definition 67 by [11]. Then by Lemma 99, we have (1 − ε)rP ≤ rS ≤ rP. Assume that 2^a ≤ rS < 2^{a+1} (a ∈ Z); we conclude that 2^a ≤ rP < 2^{a+2}. Now run A(S). In Step 5, assume that we construct a Cartesian grid G1(S) of side length ε2^a/4d, a cell collection C1(S), and a point set E1(S). Since E1(S) is an additive ε-coreset of S by [11], we have 2^{a−1} < (1 − ε)rS ≤ rE1(S) ≤ rS < 2^{a+1}. Then we consider the following two cases.
Case 1: 2^a ≤ rE1(S) < 2^{a+1}. In this case, we have G(S) = G1(S), C(S) = C1(S), and S = E(S) = E1(S). Running A(P), assume that we obtain G(P), C(P), and E(P). Consider the following two cases. If 2^a ≤ rP < 2^{a+1}, we construct a Cartesian grid G1(P) = G(S) of side length ε2^a/4d, and a point set E1(P) in Step 5. Since P1 and P2 are satisfied, we know that E1(P) = S. Then since 2^a ≤ rE1(P) = rS < 2^{a+1}, we obtain that E(P) = E1(P) = S in this case. Otherwise, if 2^{a+1} ≤ rP < 2^{a+2}, run A(P). We construct a Cartesian grid G1(P) of side length ε2^a/2d, and a point set E1(P) in Step 5. Since P1 and P2 are satisfied, we have E1(P) ⊆ S. Thus, we have rE1(P) ≤ rS < 2^{a+1} by Lemma 99. Then in Step 7, we construct a Cartesian grid G2(P) = G1(S) of side length ε2^a/4d, and a point set E2(P). In this case, we have G(P) = G2(P) = G1(S), and E(P) = E2(P). By P1 and P2, we have E(P) = E2(P) = S.

Case 2: 2^{a−1} ≤ rE1(S) < 2^a. Running A(S), we construct a Cartesian grid G2(S) of side length ε2^a/8d, and a point set E2(S) in Step 7. In this case, we have G(S) = G2(S), and S = E(S) = E2(S). Since E1(S) is an additive ε-coreset of S, we conclude that E1(S) is also an additive 3ε-coreset of P satisfying Definition 67. Then we have 2^a ≤ rP ≤ (1 + 3ε)rE1(S) < 2^{a+1} by Lemma 99. Running A(P), we construct a Cartesian grid G1(P) = G1(S) of side length ε2^a/4d, and a point set E1(P) in Step 5. Since P1 and P2 are satisfied, we know that E1(P) = E1(S). Thus, we have 2^{a−1} ≤ rE1(P) = rE1(S) < 2^a. Then in Step 7, we construct a Cartesian grid G2(P) = G2(S) of side length ε2^a/8d, and a point set E2(P). Again by P1 and P2, we have E2(P) = E2(S). Thus, we output E(P) = E2(P) = S, which finishes the proof.
Chapter 6 Estimating the Expected Value of Combinatorial Optimization Problems over Stochastic Data
In this chapter, we consider the stochastic geometry models where the location of each node is a random point in a given metric space, or where the existence of each node is uncertain. We study the problems of computing the expected lengths of several combinatorial or geometric optimization problems over stochastic points, including closest pair, minimum spanning tree, k-clustering, minimum perfect matching, and minimum cycle cover. We also consider the problem of estimating the probability that the length of the closest pair, or the diameter, is at most, or at least, a given threshold. Most of the above problems are known to be #P-hard. We obtain an FPRAS (Fully Polynomial Randomized Approximation Scheme) for most of them in both the existential and locational uncertainty models. Our result for stochastic minimum spanning trees in the locational uncertainty model improves upon the previously known constant factor approximation algorithm. To the best of our knowledge, our results for the other problems are the first known.
6.1 The Closest Pair Problem
6.1.1 Estimating Pr[C ≤ 1]
As a warmup, we first demonstrate how to use the stoch-core technique for the closest pair problem in the existential uncertainty model. We are given a set of points P = {s1, . . . , sn} in a metric space, where each point si ∈ P is present with probability pi. We use C to denote the distance between the closest pair of points in the realization; if the realization has fewer than two points, C is zero. The goal is to compute the probability Pr[C ≤ 1].
For a set H of points and a subset S ⊆ H, we use H〈S〉 to denote the event that among all points in H, all and only the points in S are present. For any nonnegative integer i, let H〈i〉 denote the event ∨_{S⊆H: |S|=i} H〈S〉, i.e., the event that exactly i points of H are present.
The stoch-core of the closest pair problem is simply defined to be

H = {si | pi ≥ ε/n^2}.
Let F = P \ H. We consider the decomposition

Pr[C ≤ 1] = ∑_{i=0}^{|F|} Pr[F〈i〉 ∧ C ≤ 1] = ∑_{i=0}^{|F|} Pr[F〈i〉]·Pr[C ≤ 1 | F〈i〉].

Our algorithm is very simple: estimate the first three terms (i.e., i = 0, 1, 2) and use their sum as our final answer.
We can see that H satisfies the two properties of a stoch-core mentioned in the introduction:

1. The probability that all present points lie in H, i.e., Pr[F〈0〉], is at least 1 − n·(ε/n^2) = 1 − ε/n;

2. If there exist two points si, sj ∈ H such that d(si, sj) ≤ 1, we have Pr[C ≤ 1 | F〈0〉] ≥ ε^2/n^4; otherwise, Pr[C ≤ 1 | F〈0〉] = Pr[H〈0〉 | F〈0〉] + Pr[H〈1〉 | F〈0〉]. Note that we can compute Pr[H〈0〉 | F〈0〉] and Pr[H〈1〉 | F〈0〉] exactly in polynomial time, so we do not consider this case in the following analysis.
Both properties guarantee that the random variable I(C ≤ 1), conditioned on F〈0〉, is poly-bounded¹; hence we can easily get a (1 ± ε)-estimation for Pr[F〈0〉 ∧ C ≤ 1] with polynomially many samples, with high probability. Similarly, Pr[F〈i〉 ∧ C ≤ 1] can also be estimated with a polynomial number of samples for i = 1, 2. The algorithm can be found in Algorithm 3.
¹I(·) is the indicator function. Note that E[I(C ≤ 1)] = Pr[C ≤ 1].
Algorithm 3 Estimating Pr[C ≤ 1]

1. Estimate Pr[F〈0〉 ∧ C ≤ 1]: Take N0 = O((n/ε)^4 ln n) independent samples. Suppose M0 is the number of samples satisfying C ≤ 1 and F〈0〉. Set T0 ← M0/N0.
2. Estimate Pr[F〈1〉 ∧ C ≤ 1]: For each point si ∈ F, take N1 = O((n/ε)^4 ln n) independent samples conditioning on the event F〈{si}〉. Suppose there are Mi samples satisfying C ≤ 1. Set T1 ← ∑_{si∈F} pi·Mi/N1.
3. Estimate Pr[F〈2〉 ∧ C ≤ 1]: For each point pair si, sj ∈ F, take N2 = O((n/ε)^4 ln n) independent samples conditioning on the event F〈{si, sj}〉. Suppose there are Mij samples satisfying C ≤ 1. Set T2 ← ∑_{si,sj∈F} pi·pj·Mij/N2.
4. Output T0 + T1 + T2.
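A toy implementation of Algorithm 3 for Euclidean points may clarify the bookkeeping. This is a hypothetical sketch, not the dissertation's implementation: it uses a fixed sample count instead of the O((n/ε)^4 ln n) bound, and it keeps the exact (1 − pj) factors for points of F held absent, which the analysis treats as ≈ 1.

```python
import random
from itertools import combinations

def c_at_most_1(present):
    # Event C <= 1; C = 0 when fewer than two points are present.
    if len(present) < 2:
        return True
    d2 = min(sum((a - b) ** 2 for a, b in zip(p, q))
             for p, q in combinations(present, 2))
    return d2 <= 1.0

def estimate_pr_c_le_1(points, probs, eps, N=500, seed=0):
    rng = random.Random(seed)
    n = len(points)
    H = [i for i in range(n) if probs[i] >= eps / n ** 2]   # stoch-core
    F = [i for i in range(n) if probs[i] < eps / n ** 2]

    def term(forced):
        # Monte Carlo estimate of Pr[F<forced> and C <= 1].
        weight = 1.0
        for i in F:
            weight *= probs[i] if i in forced else 1.0 - probs[i]
        hits = 0
        for _ in range(N):
            sample = [points[i] for i in H if rng.random() < probs[i]]
            hits += c_at_most_1(sample + [points[i] for i in forced])
        return weight * hits / N

    total = term(())                                          # i = 0
    total += sum(term((i,)) for i in F)                       # i = 1
    total += sum(term(pair) for pair in combinations(F, 2))   # i = 2
    return total
```

The three calls to `term` correspond exactly to Steps 1–3 above; the proof of Theorem 102 justifies truncating the decomposition at i = 2.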
Lemma 101. Steps 1, 2, and 3 in Algorithm 3 provide (1 ± ε)-approximations for Pr[F〈i〉 ∧ C ≤ 1] for i = 0, 1, 2, respectively, with high probability.
Theorem 102. There is an FPRAS for estimating the probability that the distance between the closest pair of nodes is at most 1 in the existential uncertainty model.
Proof. We only need to show that the contribution from the remaining terms (where more than two points outside the stoch-core H are present) is negligible compared to the third term. Suppose S is the set of all present points such that C ≤ 1 and at least 3 points are not in H. Suppose si, sj are the closest pair in S. We associate S with a smaller set S′ ⊂ S by making one present point in (S ∩ F) \ {si, sj} absent (if there are several such S′, we choose an arbitrary one). We denote this by S ∼ S′. We use the notation S ∈ Fi to denote that the realization S satisfies (F〈i〉 ∧ C ≤ 1). Then, we can see that for i ≥ 3,

Pr[F〈i〉 ∧ C ≤ 1] = ∑_{S: S∈Fi} Pr[S] ≤ ∑_{S′: S′∈Fi−1} ∑_{S: S∼S′} Pr[S].

For a fixed S′, there are at most n different sets S such that S ∼ S′, and Pr[S] ≤ (2ε/n^2)·Pr[S′] for any such S. Hence, we have that

∑_{S: S∼S′} Pr[S] ≤ (2ε/n)·Pr[S′].
Therefore,

Pr[F〈i〉 ∧ C ≤ 1] ≤ (2ε/n)·∑_{S′: S′∈Fi−1} Pr[S′] = (2ε/n)·Pr[F〈i − 1〉 ∧ C ≤ 1].

Hence, overall we have ∑_{i≥3} Pr[F〈i〉 ∧ C ≤ 1] ≤ ε·Pr[F〈2〉 ∧ C ≤ 1]. This finishes the analysis.
Note that the number of samples is dominated by estimating Pr[F〈2〉 ∧ C ≤ 1]. Since there are O(n^2) different pairs si, sj ∈ F, and we take N2 independent samples for each pair, overall we take O((n^6/ε^4) ln n) independent samples.
Locational Uncertainty Model. The algorithm for the locational uncertainty model is similar to the one for the existential uncertainty model. Here we briefly sketch the algorithm. For ease of exposition, we assume that for each point, there is only one node that may be realized at this point. In principle, if more than one node may be realized at the same point, we can create multiple copies of the point co-located at the same place.

For any node v ∈ V and point s ∈ P, we use the notation v ⇝ s to denote the event that node v is realized at point s. Let pvs = Pr[v ⇝ s], i.e., the probability that node v is realized at point s. For each point s ∈ P, we let p(s) denote the probability that point s is present (p(s) = pvs, where v is the unique node that may be realized at s). Let H〈i〉 denote the event that exactly i nodes are realized to the point set H.

We construct the stoch-core H = {s | p(s) ≥ ε/(nm)^2}. Let F = P \ H. Then we rewrite Pr[C ≤ 1] = ∑_{0≤i≤n} Pr[F〈i〉 ∧ C ≤ 1]. We only need to estimate the first three terms.
Estimating Pr[F〈0〉 ∧ C ≤ 1].

1. If there exist two points s, s′ ∈ H with d(s, s′) ≤ 1 that correspond to different nodes, then Pr[F〈0〉 ∧ C ≤ 1] ≥ p(s)p(s′) ≥ ε^2/(nm)^4 by the definition of the stoch-core, so we can simply estimate Pr[F〈0〉 ∧ C ≤ 1] by taking O(((nm)^4/ε^4) ln n) independent samples using the Monte Carlo method.

2. If no such two points s, s′ ∈ H exist, then Pr[F〈0〉 ∧ C ≤ 1] = 0.
Estimating Pr[F〈1〉 ∧ C ≤ 1]. We first rewrite this term as ∑_{v∈V, s∈F} Pr[F〈1〉 ∧ C ≤ 1 ∧ v ⇝ s]. For a node v ∈ V and a point s ∈ F, we denote Bs = {s′ ∈ H : d(s, s′) ≤ 1}. If Bs contains any point corresponding to a node other than v, we can use the Monte Carlo method to estimate Pr[F〈1〉 ∧ C ≤ 1 | v ⇝ s], since it is at least ε/(nm)^2. Otherwise, computing Pr[F〈1〉 ∧ C ≤ 1 | v ⇝ s] is equivalent to computing Pr[F〈0〉 ∧ C ≤ 1] in the instance without v (since v is at distance more than 1 from any other node).

Estimating Pr[F〈2〉 ∧ C ≤ 1]. We rewrite it as ∑_{v,v′∈V, s,s′∈F} Pr[F〈2〉 ∧ C ≤ 1 ∧ v ⇝ s ∧ v′ ⇝ s′]. We estimate each term in the same way as in the former case. We do not repeat the argument here.
Analysis. Similar to the existential uncertainty model, we can show that the contribution of ∑_{3≤i≤n} Pr[F〈i〉 ∧ C ≤ 1] is negligible. The argument is almost the same as before. Suppose S is a realization such that C ≤ 1 and at least 3 points are not in H. Suppose vi, vj are the closest pair in S. We associate S with S′, where S′ is obtained by sending a node v in S (other than vi, vj) located in F to a point s ∈ H such that pvs ≥ 1/(2n). We denote this by S ∼ S′. Then for a fixed S′, there are at most nm different sets S such that S ∼ S′, and Pr[S] ≤ (2ε/m)·Pr[S′] for any such S. The rest of the argument is the same.
Theorem 103. There is an FPRAS for estimating the probability that the distance between the closest pair of nodes is at most 1 in the locational uncertainty model.
The number of samples is dominated by estimating Pr[F〈2〉 ∧ C ≤ 1]. Since there are O(m^2) different pairs of nodes v, v′ ∈ V and O(n^2) different pairs of points s, s′ ∈ F, we separate F〈2〉 into O(m^2n^2) different terms. For each term, we take O(((nm)^4/ε^4) ln n) independent samples. Thus, we take O((m^6n^6/ε^4) ln n) independent samples in total.
6.1.2 Estimating E[C]
In this section, we consider the problem of estimating E[C], where C is the distance of the closest pair of present points, in the existential uncertainty model. We now introduce our second main technique, the hierarchical partition family (HPF) technique, to solve this problem. An HPF is a family Ψ of partitions of P, formally defined as follows.
Definition 104. (Hierarchical Partition Family (HPF)) Let T be any minimum spanning tree spanning all points of P. Suppose that the edges of T are e1, . . . , en−1 with d(e1) ≥ d(e2) ≥ . . . ≥ d(en−1). Let Ei = {ei, ei+1, . . . , en−1}. The HPF Ψ(P) consists of n partitions Γ1, . . . ,Γn. Γ1 is the entire point set P. Γi consists of i disjoint subsets of P, each corresponding to a connected component of Gi = G(P, Ei). Γn consists of all singleton points in P. It is easy to see that Γj is a refinement of Γi for j > i. Consider two consecutive partitions Γi and Γi+1. Note that Gi contains exactly one more edge (i.e., ei) than Gi+1. Let µ′i+1 and µ′′i+1 be the two components (called the split components) in Γi+1, each containing an endpoint of ei. Let νi ∈ Γi be the connected component of Gi that contains ei. We call νi the special component in Γi. Let Γ′i = Γi \ {νi}.
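Given the MST edges, the whole family Ψ(P) can be built in one union-find pass from the shortest edge to the longest, since removing the i − 1 longest edges is the same as having added only the n − i shortest ones. The following is a hypothetical Python sketch (not from the dissertation), assuming the MST edges are supplied as (length, u, v) tuples over vertices 0..n−1:

```python
class DSU:
    def __init__(self, n):
        self.p = list(range(n))
    def find(self, x):
        while self.p[x] != x:
            self.p[x] = self.p[self.p[x]]   # path halving
            x = self.p[x]
        return x
    def union(self, a, b):
        self.p[self.find(a)] = self.find(b)

def hpf(n, mst_edges):
    # Build Gamma_1..Gamma_n: add MST edges from shortest to longest and
    # snapshot the components after each step (Gamma_n first, Gamma_1 last).
    dsu = DSU(n)
    def components():
        groups = {}
        for v in range(n):
            groups.setdefault(dsu.find(v), set()).add(v)
        return sorted(map(frozenset, groups.values()), key=min)
    partitions = [components()]              # Gamma_n: all singletons
    for _, u, v in sorted(mst_edges):        # shortest edge first
        dsu.union(u, v)
        partitions.append(components())
    partitions.reverse()                     # partitions[0] = Gamma_1 = whole set
    return partitions
```

Snapshotting every level costs O(n^2) overall, which is enough here since the estimation algorithm inspects all n levels anyway.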
We observe two properties of Ψ(P) that are useful later.
P1. Consider a component C ∈ Γi. Let s1, s2 be two arbitrary points in C. Then
d(s1, s2) ≤ (n− 1)d(ei) (this is because s1 and s2 are connected in Gi, and ei is
the longest edge in Gi).
P2. Consider two different components C1 and C2 in Γi. Let s1 ∈ C1 and s2 ∈ C2
be two arbitrary points. Then d(s1, s2) ≥ d(ei−1) (this is because the minimum
inter-component distance is d(ei−1) in Gi).
Let the random variable Y be the smallest integer i such that there is at most one present point in each component of Γi+1. Note that if Y = i, then each component of Γi contains at most one present point, except that the special component νi contains exactly two present points. The following lemma is a simple consequence of P1 and P2.
Lemma 105. Conditioning on Y = i, it holds that d(ei) ≤ C ≤ nd(ei) (hence, C is
poly-bounded).
Consider the following expansion of E[C]:

E[C] = ∑_{i=1}^{n−1} Pr[Y = i]·E[C | Y = i].

For a fixed i, Pr[Y = i] can be computed as follows. For a component C ⊂ P, we use C〈j〉 to denote the event that exactly j points in C are present, C〈s〉 the event that only s is present in C, and C〈≤ j〉 the event that no more than j points in C are present. Let µ′i+1 and µ′′i+1 be the two split components in Γi+1. Note that

Pr[Y = i] = Pr[µ′i+1〈1〉]·Pr[µ′′i+1〈1〉]·∏_{C∈Γ′i} Pr[C〈≤ 1〉].
Each term can be easily computed in polynomial time. It remains to show how to estimate E[C | Y = i]. Since C is poly-bounded, it suffices to give an efficient algorithm to take samples conditioning on Y = i. This is again not difficult: we take exactly one point s ∈ µ′i+1 with probability Pr[µ′i+1〈s〉]/Pr[µ′i+1〈1〉], and similarly for µ′′i+1. For each C ∈ Γ′i, we take no point from C with probability Pr[C〈0〉]/Pr[C〈≤ 1〉]; otherwise, we take exactly one point s ∈ C with probability Pr[C〈s〉]/Pr[C〈≤ 1〉].

By Lemma 105, conditioning on Y = i, taking O((n/ε^2) ln n) independent samples is enough using the Monte Carlo method. Since there are n levels, we take O((n^2/ε^2) ln n) independent samples in total. This finishes the description of the FPRAS in the existential uncertainty model.
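The per-component probabilities Pr[C〈0〉] and Pr[C〈1〉] admit a simple exact one-pass recurrence, from which the product formula for Pr[Y = i] follows directly. A hypothetical Python sketch for the existential model (components are given as lists of presence probabilities):

```python
def component_stats(probs):
    # One pass: (none, one) = (Pr[C<0>], Pr[C<1>]) for a component whose
    # points are present independently with the given probabilities.
    none, one = 1.0, 0.0
    for p in probs:
        none, one = none * (1 - p), one * (1 - p) + none * p
    return none, one

def pr_y_equals_i(split1, split2, others):
    # Pr[Y = i] = Pr[mu'<1>] * Pr[mu''<1>] * prod over the remaining
    # components C of Pr[C <= 1].
    result = component_stats(split1)[1] * component_stats(split2)[1]
    for comp in others:
        none, one = component_stats(comp)
        result *= none + one
    return result
```

The same quantities also drive the conditioned sampler described above, since Pr[C〈s〉]/Pr[C〈≤ 1〉] only needs `none` and the individual pi.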
Locational Uncertainty Model. Our algorithm is almost the same as in the existential model. We first construct the HPF Ψ(P). The random variable Y is defined in the same way. The only difference is how to estimate Pr[Y = i] and how to take samples efficiently conditioning on Y = i. First consider estimating Pr[Y = i]. We can view the problem as the following bins-and-balls problem: we have m balls (corresponding to nodes) and i bins (corresponding to components in Γi). Each ball v is thrown into bin C with probability pvC = ∑_{s∈C} pvs (note that ∑_C pvC = 1). We want to compute the probability that each of the first and second bins (corresponding to the two split components) contains exactly one ball, and each of the other bins
contains at most one ball. Consider the following i × i (i ≥ m) matrix M with
MvC =
pvC =∑
s∈C pvs, for v ∈ [m] and C ∈ [i];
1, otherwise. It is not difficult to see that
the permanent
Per(M) =∑
σ∈Si
∏
v
Mvσ(v)
is exactly the probability that each bin contains at most one ball. To enforce each of
the first two bins contains exactly one ball, simply consider the Laplace expansion of
Per(M), expanded along the first two columns, and retain those relevant terms:
Pr[Y = i] = ∑_{k∈[n]} ∑_{j∈[n], j≠k} M_{k1} M_{j2} Per(M*_{kj}),

where M*_{kj} is M with the 1st and 2nd columns and the kth and jth rows removed. Then,
we can use the celebrated result for approximating permanent by Jerrum, Sinclair,
and Vigoda [?] to get an FPRAS for approximating Pr[Y = i]. In fact, the algorithm
in [?] provides a fully polynomial time approximate sampler for perfect matchings 2.
This can be easily translated to an efficient sampler conditioning on Y = i 3. Finally,
we remark that the above algorithm can be easily modified to handle the case with
both existential and locational uncertainty model.
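The two-column Laplace expansion used above is an exact identity that can be checked directly on small matrices. The sketch below (hypothetical names, 0-indexed columns, naive exponential-time permanent suitable only for tiny matrices) illustrates it; the actual algorithm would instead approximate the permanent of each minor with the Jerrum–Sinclair–Vigoda FPRAS:

```python
from itertools import permutations

def per(M):
    """Naive permanent: sum over permutations sigma of prod_v M[v][sigma(v)]."""
    n = len(M)
    total = 0.0
    for sigma in permutations(range(n)):
        prod = 1.0
        for v in range(n):
            prod *= M[v][sigma[v]]
        total += prod
    return total

def minor(M, rows, cols):
    """M with the given row and column indices removed."""
    n = len(M)
    return [[M[r][c] for c in range(n) if c not in cols]
            for r in range(n) if r not in rows]

def two_column_expansion(M):
    """Laplace expansion of Per(M) along the first two columns (0 and 1):
    sum over k != j of M[k][0] * M[j][1] * Per(M minus rows {k, j})."""
    n = len(M)
    return sum(M[k][0] * M[j][1] * per(minor(M, {k, j}, {0, 1}))
               for k in range(n) for j in range(n) if j != k)
```

In the algorithm, Pr[Y = i] keeps only the terms of this expansion whose row indices range over the ball rows, and each minor's permanent is approximated rather than computed exactly.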
Theorem 106. There is an FPRAS for estimating the expected distance between the
closest pair of nodes in both existential and locational uncertainty models.
kth Closest Pair. In addition, we consider the problem of estimating the expected distance E[kC] of the kth closest pair under the existential uncertainty model. We use the HPF technique and construct an efficient sampler via dynamic programming. The details can be found in Section 6.8.1.
² The approximate sampler can return in poly-time a permutation σ ∈ S_i with probability (1 ± ε) ∏_s M_{sσ(s)}/Per(M).
³ We can also use the generic reduction by Jerrum, Valiant and Vazirani [68], which can turn an FPRAS into a poly-time approximate sampler for self-reducible relations.
6.2 k-Clustering
In this section, we study the k-clustering problem in the existential uncertainty model.
According to [74], the optimal objective value for k-clustering is the (k − 1)th most
expensive edge of the minimum spanning tree. We consider estimating E[kCL] under
the existential uncertainty model.
Denote the point set P = {s1, . . . , sn}, where each point si ∈ P is present with
probability pi. We construct the HPF Ψ(P). Let the random variable Y be the
largest integer i such that at most k−1 components in Γi contain at least one present
point. Let Γ′i = Γi \ {νi}. Note that if Y = i, then at most k − 2 components in Γ′i contain present points, while the special component νi contains at least two present points, since both components µ′i+1 and µ′′i+1 contain at least one present point. By properties P1 and P2 of the HPF, we have the following lemma.
Lemma 107. Conditioning on Y = i, it holds that d(ei) ≤ kCL ≤ nd(ei) (hence, kCL is poly-bounded).
Proof. Since Γi+1 contains at least k nonempty components, any spanning tree must
have at least k − 1 inter-component edges. Any inter-component edge is of length
at least d(ei), and so is the (k − 1)th most expensive edge. Now we show the other direction.
Assume w.l.o.g. that all pairwise distances are distinct. Consider a realization satis-
fying Y = i and the graphical matroid which consists of all forests of the realization.
Suppose kCL = d(e) for some edge e. Let Ee be the set of all edges with length no larger than d(e) in this realization. We can see that rank(Ee) = n − k + 1, where rank is the
matroid rank function and n the number of present points in the realization. Hence,
any spanning tree contains no more than n− k + 1 edges from Ee. Equivalently, the
(k−1)th most expensive edge of any spanning tree is no smaller than kCL. Moreover,
since Γi has no more than k − 1 nonempty components, there exists a spanning tree
such that the (k − 1)th most expensive edge is an intra-component edge in Γi. The
lemma follows from P1.
Consider the following expansion: E[kCL] = ∑_{i=1}^{m−1} Pr[Y = i] · E[kCL | Y = i]. Recall
that for a component C ⊂ P , we use C〈j〉 to denote the event that exactly j points
in C are present, C〈s〉 the event that only s is present in C and C〈≤ j〉 (C〈≥ j〉) the
event that at most (at least) than j points in C are present. For a partition Γ on P ,
we use Γ〈j,≥ 1〉 to denote the event that exactly j components in Γ contain at least
one present point. Note that
Pr[Y = i] = Pr[µ′i+1〈≥ 1〉] · Pr[µ′′i+1〈≥ 1〉] · Pr[Γ′i〈k − 2,≥ 1〉].
Note that Pr[µ′i+1〈≥ 1〉] and Pr[µ′′i+1〈≥ 1〉] can be easily computed in polynomial
time. The remaining task is to show how to compute Pr[Γ′i〈k − 2,≥ 1〉] and how to
estimate E[kCL | Y = i]. We first present a simple lemma which is useful later.
Lemma 108. For a component C and j ∈ Z, we can compute Pr[C〈j〉] (or Pr[C〈≥ j〉]) in polynomial time. Moreover, there exists a poly-time sampler to sample present points from C conditioning on C〈j〉 (or C〈≥ j〉).
Proof. The idea is essentially from [38]. W.l.o.g., we assume that the points in C are s1, . . . , sn. We denote by E[a, b] the event that among the first a points, exactly b points are present, and by Pr[a, b] the probability of E[a, b]. Our goal is to compute Pr[n, j], which can be done by the following dynamic program:

1. If a < b, Pr[a, b] = 0. If a = b, Pr[a, b] = ∏_{1≤l≤a} pl. If b = 0, Pr[a, b] = ∏_{1≤l≤a} (1 − pl).

2. For a > b and b ≥ 1, Pr[a, b] = pa · Pr[a − 1, b − 1] + (1 − pa) · Pr[a − 1, b].
We can also use this dynamic program to construct an efficient sampler. Consider the
point sn. With probability pnPr[n − 1, j − 1]/Pr[n, j], we make it present and then
recursively consider the point sn−1 conditioning on the event E[n − 1, j − 1]. With
probability (1 − pn)Pr[n − 1, j]/Pr[n, j], we discard it and then recursively sample
conditioning on the event E[n − 1, j]. Pr[C〈≥ j〉] can be handled in the same way
and we omit the details.
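The dynamic program and the backward sampler from this proof can be sketched directly; table and sample_exactly are hypothetical helper names, and the point probabilities are passed as a plain list:

```python
import random

def table(p):
    """T[a][b] = Pr[a, b]: probability that exactly b of the first a points
    are present, where p[l] is the presence probability of point s_{l+1}."""
    n = len(p)
    T = [[0.0] * (n + 1) for _ in range(n + 1)]
    T[0][0] = 1.0
    for a in range(1, n + 1):
        for b in range(0, a + 1):
            T[a][b] = (1 - p[a - 1]) * T[a - 1][b]
            if b >= 1:
                T[a][b] += p[a - 1] * T[a - 1][b - 1]
    return T

def sample_exactly(p, j, rng=random):
    """Sample the set of present points conditioned on the event C<j>
    (exactly j points present), walking the DP table backwards."""
    T = table(p)
    present, a, b = [], len(p), j
    while a > 0:
        # s_a is present with probability p_a * Pr[a-1, b-1] / Pr[a, b]
        if b >= 1 and rng.random() < p[a - 1] * T[a - 1][b - 1] / T[a][b]:
            present.append(a - 1)
            b -= 1
        a -= 1
    return sorted(present)
```

The event C〈≥ j〉 would be handled analogously, with the tail sums of the same table.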
Computing Pr[Γ′i〈k − 2, ≥ 1〉]. We are now ready to show how to compute Pr[Γ′i〈k − 2, ≥ 1〉] in polynomial time. Note that for each component Cj ∈ Γ′i, we can easily compute qj = Pr[Cj〈≥ 1〉] in polynomial time. Since all components in Γ′i are disjoint, using Lemma 108 (considering each component Cj in Γ′i as a point with existential probability qj), we can compute Pr[Γ′i〈k − 2, ≥ 1〉].

To take samples conditioning on Y = i, we first sample k − 2 components in Γ′i
which contain present points. Then for these k − 2 components and µ′i+1, µ′′i+1,
we independently sample present points in each component using Lemma 108. By
Lemma 107, for estimating E[kCL | Y = i], we need to take O((n/ε²) ln n) independent samples. So we take O((n²/ε²) ln n) independent samples in total.
Theorem 109. There is an FPRAS for estimating the expected length of the kth most expensive edge of the minimum spanning tree in the existential uncertainty model.
6.3 Minimum Spanning Trees
We consider the problem of estimating the expected length of the minimum spanning tree in the locational uncertainty model. In this section, we briefly sketch how to solve it using our stoch-core method. Recall that the term node refers to the vertices V of the spanning tree and point refers to the locations in P. For ease of exposition, we assume that for each point, there is only one node that may be realized at this point.
Recall that we use the notation v ⇝ s to denote the event that node v is realized at point s. Let pvs = Pr[v ⇝ s]. Since node v is realized with certainty, we have ∑_{s∈P} pvs = 1. For each point s ∈ P, we let p(s) denote the probability that point s is present. For a set H of points, let p(H) = ∑_{s∈H} p(s), i.e., the expected number of points present in H. For a set H of points and a set S of nodes, we use H〈S〉 to denote the event that all and only the nodes in S are realized to points in H. If S contains only one node, say v, we use H〈v〉 as shorthand for H〈{v}〉. Let H〈i〉 denote the event ∨_{S:|S|=i} H〈S〉, i.e., the event that exactly i nodes are in H. We use diam(H), called the diameter of H, to denote max_{s,t∈H} d(s, t). Let d(p, H) be the closest distance between point p and any point in H.
Finding the stoch-core. First, we find the stoch-core H in poly-time as follows:

Algorithm 4 Constructing stoch-core H for Estimating E[MST]
1. Among all points r with p(r) ≥ ε/(16n), find the two furthest points s and t.
2. Set H ← B(s, d(s, t)) = {s′ ∈ P | d(s′, s) ≤ d(s, t)}.
Lemma 110. Algorithm 4 finds a stoch-core H such that

Q1. p(H) ≥ m − ε/16 = m − O(ε);

Q2. E[MST | H〈m〉] = Ω(diam(H) · ε²/n²).

Furthermore, the algorithm runs in linear time.
Proof. For each point r that is not in H, we know p(r) < ε/(16n). Therefore, we have that p(P \ H) < ε/16 and p(H) ≥ m − ε/16. Consider two cases:

1. Points s and t relate to different nodes. In this case, we have that

E[MST | H〈m〉] ≥ d(s, t) · Pr[∃(v, u), v ≠ u, v ⇝ s, u ⇝ t] = d(s, t) p(s) p(t) ≥ d(s, t) ε²/(256n²).

2. Points s and t relate to the same node v. In this case, conditioning on the event that a different node u is realized to an arbitrary point q, E[MST | H〈m〉] ≥ d(s, q) Pr[v ⇝ s] + d(t, q) Pr[v ⇝ t] ≥ d(s, t) · ε/(16n).

In either case, H satisfies both Q1 and Q2.
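Algorithm 4 itself is short; the sketch below treats points as coordinate tuples and uses a brute-force quadratic furthest-pair search (the lemma's linear-time claim needs a more careful routine). The name stoch_core_ball is hypothetical.

```python
import math

def stoch_core_ball(points, p, eps):
    """Algorithm 4 sketch: among points r with p(r) >= eps/(16n), find the
    two furthest points s, t; return indices inside the ball B(s, d(s, t))."""
    n = len(points)
    heavy = [i for i in range(n) if p[i] >= eps / (16 * n)]
    s, t = max(((i, j) for i in heavy for j in heavy),
               key=lambda ij: math.dist(points[ij[0]], points[ij[1]]))
    radius = math.dist(points[s], points[t])
    return [i for i in range(n) if math.dist(points[i], points[s]) <= radius]
```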
Estimating E[MST]. Let F = P \ H. We rewrite E[MST] as ∑_{i≥0} E[MST | F〈i〉] · Pr[F〈i〉]. We only need to estimate E[MST | F〈0〉] · Pr[F〈0〉] and E[MST | F〈1〉] · Pr[F〈1〉].

Lemma 111. Algorithm 5 produces a (1 ± ε)-estimate for the first term with high probability.
Algorithm 5 Estimating E[MST | F〈0〉] · Pr[F〈0〉]
1. Take N0 = O((mn²/ε⁴) ln m) random samples. Set A ← ∅ at the beginning.
2. For each sample Gi, if it satisfies F〈0〉, A ← A ∪ {Gi}.
3. T0 ← (1/N0) ∑_{Gi∈A} MST(Gi).
Proof. Conditioned on the event F〈0〉, the length of the MST is at most m · diam(H). Due to Q2, we have a poly-bounded random variable and can therefore obtain a (1 ± ε)-estimate for E[MST | H〈m〉] using the Monte Carlo method with O((mn²/ε⁴) ln m) samples satisfying H〈m〉 (by Lemma 14). By the first property of H, with probability close to 1, a sample satisfies H〈m〉. So, the expected time to obtain a useful sample is bounded by a constant. Overall, we can obtain a (1 ± ε)-estimate of the first term using N0 = O((mn²/ε⁴) ln m) samples with high probability.
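Algorithm 5 can be sketched as below with a small Prim's-algorithm MST subroutine. Dividing the sum over accepted samples by the total sample count folds Pr[F〈0〉] into the estimate. The node representation (one list of (point, probability) pairs per node) and the function names are illustrative assumptions:

```python
import math
import random

def mst_length(points):
    """Prim's algorithm on the complete Euclidean graph over `points`."""
    if len(points) < 2:
        return 0.0
    dist = {i: math.dist(points[0], points[i]) for i in range(1, len(points))}
    total = 0.0
    while dist:
        i = min(dist, key=dist.get)  # closest point not yet in the tree
        total += dist.pop(i)
        for j in dist:
            dist[j] = min(dist[j], math.dist(points[i], points[j]))
    return total

def estimate_first_term(nodes, core, num_samples, rng=random):
    """Estimate E[MST | F<0>] * Pr[F<0>]: average MST length over samples,
    counting a sample only when every node lands inside the stoch-core."""
    total = 0.0
    for _ in range(num_samples):
        realization = [rng.choices([s for s, _ in d], [q for _, q in d])[0]
                       for d in nodes]
        if all(s in core for s in realization):
            total += mst_length(realization)
    return total / num_samples
```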
Algorithm 6 Estimating E[MST | F〈1〉] · Pr[F〈1〉]
1. Set B ← {s | s ∈ F, d(s, H) < (m/ε) · diam(H)}. Let Cl(v) be the event that v is the only node that is realized to some point s ∈ B.
2. Conditioning on Cl(v), take N1 = O((mn²/ε⁵) ln m) independent samples. Let Av ← {Gv,i | 1 ≤ i ≤ N1} be the set of N1 samples for Cl(v).
3. Tv ← (1/N1) ∑_{Gv,i∈Av} MST(Gv,i) (estimating E[MST | Cl(v)]).
4. T1 ← ∑_{v∈V} (Pr[Cl(v)] Tv + ∑_{s∈F\B} Pr[F〈v〉 ∧ v ⇝ s] · d(s, H)).
Lemma 112. Algorithm 6 produces a (1± ε)-estimate for the second term with high
probability.
Analysis. Note that the number of samples is asymptotically dominated by estimating E[MST | F〈1〉] · Pr[F〈1〉]. For each node v ∈ V, we take N1 independent samples. Thus, we need to take O((m²n²/ε⁵) ln m) independent samples. Now, we analyze the performance guarantee of our algorithm. We need to show that the total contribution from the scenarios where more than one node is outside the stoch-core is very small. We need some notation first. Suppose S is the set of nodes realized outside the stoch-core H. We use FS to denote the set of all possible realizations of the nodes in S to points in F (we can think of each element in FS as an |S|-dimensional vector where each
coordinate is indexed by a node in S and its value is a point in F). Similarly, we denote the set of realizations of S̄ = V \ S to points in H by HS. For any FS ∈ FS and HS ∈ HS, we use (FS, HS) to denote the event that both FS and HS happen, and MST(FS, HS) to denote the length of the minimum spanning tree under the realization (FS, HS). We need the following combinatorial fact.
Lemma 113. Consider a particular realization (FS, HS), where S is the set of nodes realized outside H and |S| ≥ 2. Let d = d(vS, uS) = min_{v∈S, u∈S̄} d(v, u), where vS ∈ FS and uS ∈ HS denote the corresponding realized positions. Let (FS′, HS′) be the realization obtained from (FS, HS) by sending the node vS to H, where S′ = S \ {vS}. Then MST(FS, HS) ≤ 4 MST(FS′, HS′).
Proof. We have

4 MST(FS′, HS′) ≥ 2 MST(FS′, HS′) + 2d ≥ MST(FS′, HS) + 2d ≥ MST(FS, HS).

The second inequality holds since the length of the minimum spanning tree is at most two times the length of the minimum Steiner tree (we consider MST(FS′, HS) as a Steiner tree connecting all nodes in FS′ ∪ HS).
The only remaining part for establishing Theorem 115 is to show the following essential lemma.

Lemma 114. For any ε > 0, if H satisfies the properties in Lemma 110, we have that

∑_{i>1} E[MST | F〈i〉] · Pr[F〈i〉] ≤ ε · E[MST | F〈1〉] · Pr[F〈1〉].
Proof. We claim that for any i > 1, E[MST | F〈i + 1〉] · Pr[F〈i + 1〉] ≤ (ε/2) · E[MST | F〈i〉] · Pr[F〈i〉]. If the claim is true, then we can show the lemma easily by noticing that, for any m ≥ 2, ∑_{i>1} E[MST | F〈i〉] Pr[F〈i〉] ≤ ∑_{i=1}^{m−1} (ε/2)^i E[MST | F〈1〉] Pr[F〈1〉] ≤ ε E[MST | F〈1〉] Pr[F〈1〉]. Now, we prove the claim. First, we rewrite the LHS as follows:

E[MST | F〈i + 1〉] · Pr[F〈i + 1〉] = ∑_{|S|=i+1} ∑_{FS∈FS} ∑_{HS∈HS} Pr[(FS, HS)] · MST(FS, HS).
Similarly, the RHS can be written as

E[MST | F〈i〉] · Pr[F〈i〉] = ∑_{|S′|=i} ∑_{FS′∈FS′} ∑_{HS′∈HS′} Pr[(FS′, HS′)] · MST(FS′, HS′).
For each pair (FS, HS), let C(FS, HS) = Pr[(FS, HS)] · MST(FS, HS). Consider each pair (FS, HS) with |S| = i + 1 as a seller and each pair (FS′, HS′) with |S′| = i as a buyer. The seller (FS, HS) wants to sell the term C(FS, HS), and the buyers together want to buy all of it. The buyer (FS′, HS′) has a budget of C(FS′, HS′). We show that there is a charging scheme such that each term C(FS, HS) is fully paid by the buyers and each buyer spends at most an ε/2 fraction of her budget. Note that the existence of such a charging scheme suffices to prove the claim.
Suppose we are selling the term C(FS, HS). Consider the following charging scheme. Suppose v ∈ S is the node closest to any node in S̄. Let S′ = S \ {v} and let FS′ be the restriction of FS to all coordinates in S except v. We say (FS′, HS′) is consistent with (FS, HS), denoted as (FS′, HS′) ∼ (FS, HS), if HS′ agrees with HS for all vertices in S̄ and FS′ agrees with FS for all vertices in S \ {v}. Intuitively, (FS′, HS′) can be obtained from (FS, HS) by sending v to an arbitrary point in H. Let

Z(FS, HS) = ∑_{(FS′,HS′)∼(FS,HS)} Pr[(FS′, HS′)].
We need the following inequality later: for any fixed (FS′, HS′),

∑_{(FS,HS)∼(FS′,HS′)} Pr[(FS, HS)]/Z(FS, HS) ≤ ∑_{v∈S̄′} Pr(v ∈ F)/Pr(v ∈ H) ≤ ε/8.

To see the inequality, for a fixed node v, consider the quantity

∑_{(FS,HS)∼(FS′,HS′), S=S′∪{v}} Pr[(FS, HS)]/Z(FS, HS).
A crucial observation here is that the denominators of all these terms are in fact the same: by the definition of Z, each denominator equals ∑ Pr[(F′S′, H′S′)], where the summation is over all (F′S′, H′S′) that are the same as (FS′, HS′) except that the location of v is a different point in H. The numerator is the summation over all (FS, HS) that are the same as (FS′, HS′) except that the location of v is a different point in F. Canceling out the common multiplicative terms from the numerators and the denominator, we can see that the quantity is at most Pr(v ∈ F)/Pr(v ∈ H).
Now, we specify how to charge each buyer. For each buyer (FS′ , HS′) ∼ (FS, HS),
we charge her the following amount of money:

Pr[(FS′, HS′)] · C(FS, HS)/Z(FS, HS).

We can see that C(FS, HS) is fully paid by all buyers consistent with (FS, HS). It remains to show that each buyer (FS′, HS′) is charged at most (ε/2) C(FS′, HS′). By the above charging scheme, the terms (FS, HS) in the LHS that charge buyer (FS′, HS′) are consistent with (FS′, HS′). Now, we can see that the total amount of money charged to buyer (FS′, HS′) can be bounded as follows:
∑_{(FS,HS)∼(FS′,HS′)} Pr[(FS′, HS′)] · C(FS, HS)/Z(FS, HS)
≤ 4 MST(FS′, HS′) · ∑_{(FS,HS)∼(FS′,HS′)} Pr[(FS′, HS′)] · Pr[(FS, HS)]/Z(FS, HS)
= 4 MST(FS′, HS′) Pr[(FS′, HS′)] · ∑_{(FS,HS)∼(FS′,HS′)} Pr[(FS, HS)]/Z(FS, HS)
≤ (ε/2) MST(FS′, HS′) Pr[(FS′, HS′)].

The first inequality follows from Lemma 113. This completes the proof.
Theorem 115. There is an FPRAS for estimating the expected length of the mini-
mum spanning tree in the locational uncertainty model.
Finally, we remark that the problem can be solved by a variety of methods. The stoch-core method presented in this section is not the simplest one, but it may still be helpful for understanding a very similar but somewhat more technical application of the method to minimum perfect matching (see Section 6.4).
6.4 Minimum Perfect Matchings
In this section, we consider the minimum perfect matching (PM) problem. We use the stoch-core method. The stoch-core construction for MST cannot be used directly here, since PM can be much smaller than MST. For example, suppose there are only two points, with an even number of nodes residing at each point. In this case, PM is 0. Now, if we change the location of one particular node to the other point, the value of PM increases dramatically while the value of MST stays the same. In some sense, PM is more sensitive to the locations of nodes and hence requires a new stoch-core construction. There are two major differences from the algorithm for MST. First, the stoch-core is composed of several clusters of points, instead of a single ball. Second, we need a more careful charging argument.
Finding the stoch-core. First, we show how to find the stoch-core H in poly-time. Initially, H consists of all singleton points, each being a component by itself. Then, we gradually grow a ball around each point, and merge two components whenever they touch. We stop as soon as certain properties Q1 and Q2 are satisfied. See the pseudo-code in Algorithm 7 for details. For a node v and a set H of points, we let pv(H) = ∑_{s∈H} pvs. We use diam(H), called the diameter of H, to denote max_{s,s′∈H∩P} d(s, s′).
Algorithm 7 Constructing stoch-core H for Estimating E[PM]
1. Initially, t ← 0 and each point s ∈ P is a component Hs = B(s, t) by itself.
2. Gradually increase t. If two different components HS1 and HS2 intersect (where HS := ∪_{s∈S} B(s, t)), merge them into a new component HS1∪S2.
3. Stop increasing t the first time the following two conditions are satisfied by the components at t:⁴
Q1. For each node v, there is a unique component Hj such that pv(Hj) ≥ 1 − O(ε/(mn³)). We call Hj the stoch-core of node v, denoted as H(v).
Q2. For all j, |{v ∈ V | H(v) = Hj}| is even.
4. Output the stopping time T and the components H1, . . . , Hk.
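Since two balls B(s, t) and B(s′, t) intersect exactly when d(s, s′) ≤ 2t, the components of Algorithm 7 at any fixed time t can be recovered with a union-find pass, and only the O(n²) candidate times t = d(s, s′)/2 ever need to be inspected. A sketch under these assumptions (components_at_time is a hypothetical name):

```python
import math
from itertools import combinations

def _find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def components_at_time(points, t):
    """Components of Algorithm 7 at time t: union every pair of points whose
    balls of radius t intersect, i.e. whose distance is at most 2t."""
    parent = list(range(len(points)))
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= 2 * t:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(_find(parent, i), []).append(i)
    return list(groups.values())
```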
We need the following lemma which is useful for bounding E[PM] from below.
⁴ Note that we only need to consider those t = d(s, s′)/2 for some points s, s′ ∈ P. Thus, we compute with at most O(n²) different times t.
Lemma 116. For any two disjoint sets H1 and H2 of points, and any node v, we have

E[PM] ≥ min{pv(H1), pv(H2)} · d(H1, H2)/n.

Here, d(H1, H2) = min_{s∈H1∩P, s′∈H2∩P} d(s, s′).

Proof. Suppose s = arg max_s {pvs | s ∈ H1} and s′ = arg max_s {pvs | s ∈ H2}. Obviously, we have pvs ≥ pv(H1)/n and pvs′ ≥ pv(H2)/n. So it suffices to show that E[PM] ≥ min{pvs, pvs′} · d(s, s′). We first see that

E[PM] ≥ pvs E[PM | v ⇝ s] + pvs′ E[PM | v ⇝ s′] ≥ min{pvs, pvs′} (E[PM | v ⇝ s] + E[PM | v ⇝ s′]).

Then it is sufficient to prove that E[PM | v ⇝ s] + E[PM | v ⇝ s′] ≥ d(s, s′). Fix a realization of all nodes except v. Conditioning on this realization, we consider the following two minimum perfect matchings, one for the case v ⇝ s (denoted as PM1) and the other for v ⇝ s′ (denoted as PM2). Consider the symmetric difference

PM1 ⊕ PM2 := (PM1 \ PM2) ∪ (PM2 \ PM1).

We can see that it is a path (s, p1, p2, . . . , pk, s′) such that (s, p1) ∈ PM1, (p1, p2) ∈ PM2, . . . , (pk, s′) ∈ PM2. So PM1 + PM2 ≥ d(s, s′) by the triangle inequality. Therefore, we have E[PM | v ⇝ s] + E[PM | v ⇝ s′] ≥ d(s, s′) ≥ d(H1, H2).
By Q1, Q2 and the above lemma, we can show that the following additional
property holds.
Lemma 117. Q3. E[PM] = Ω(εD/(mn⁵)), where D = max_i diam(Hi).
Proof. Note that the stopping time T must exist, because the set of all points satisfies the first two properties. Now, we show that Q3 also holds. First, note that D ≤ 2mT. Second, consider T′ = T − δ for some infinitesimal δ > 0. At time T′, consider two situations:
1. There exists a node v such that ∀j, pv(Hj) < 1 − O(ε/(mn³)). Then there must exist two components C1 and C2 such that pv(C1) = Ω(ε/(mn³)) and pv(C2) = Ω(ε/(mn³)). Moreover, since C1 and C2 are two distinct components, d(C1, C2) ≥ 2T′. Then, by Lemma 116, we have E[PM] ≥ Ω(ε/(mn⁴)) · 2T ≥ Ω(εD/(mn⁵)).

2. Suppose that Q1 is true but Q2 is still false. Suppose Hj is a component that contains an odd number of nodes. Note that with probability at least (1 − 1/(mn³))^m ≈ 1, each node is realized to a point in its stoch-core. When this is the case, there is at least one node in Hj that needs to be matched with some node outside Hj, which incurs a cost of at least 2T.
Estimating E[PM]. We use H〈m〉 to denote the event that each node v is realized in its stoch-core H(v). We denote by F〈i〉 the event that exactly i nodes are realized outside their stoch-cores. Again, we only need to estimate two terms: E[PM | F〈0〉] · Pr[F〈0〉] and E[PM | F〈1〉] · Pr[F〈1〉]. Using properties Q1, Q2 and Q3, we can estimate these terms in polynomial time. Our final estimate is simply the sum of the first two terms.
Algorithm 8 Estimating E[PM | F〈0〉] · Pr[F〈0〉]
1. Take N1 = O((m²n⁵/ε⁴) ln m) independent samples. Set A ← ∅ at the beginning.
2. For each sample Gi, if it satisfies H〈m〉, A ← A ∪ {Gi}.
3. T0 ← (1/N1) ∑_{Gi∈A} PM(Gi).
Lemma 118. Algorithm 8 produces a (1 ± ε)-estimate for the first term with high probability.
Proof. Note that Pr[H〈m〉] is close to 1 (by a union bound) and can be computed exactly. To estimate E[PM | H〈m〉], the algorithm takes the average of N1 = O((m²n⁵/ε⁴) ln m) samples. Note that conditioning on H〈m〉, the minimum perfect matching has length at most mD. We distinguish the following two cases.
1. E[PM | H〈m〉] ≥ (ε/2) E[PM] = Ω(ε²D/(mn⁵)). Then PM is poly-bounded conditioning on H〈m〉, so we can get a (1 ± ε)-approximation using the Monte Carlo method with O((m²n⁵/ε⁴) ln m) samples.

2. E[PM | H〈m〉] < (ε/2) E[PM]. Then the probability that the sample average is larger than ε E[PM] is at most 1/poly(m) by the Chernoff bound. We can thus ignore this part safely.
Algorithm 9 Estimating E[PM | F〈1〉] · Pr[F〈1〉]
1. For each node v, set Bv ← {s | s ∈ P \ H(v), d(s, H(v)) < 4mD/ε}. Let Cl(v) be the event that v is the only node that is realized to some point s ∈ Bv.
2. Conditioning on Cl(v), take N1 = O((m²n⁵/ε⁴) ln m) independent samples. Let Av ← {Gv,i | 1 ≤ i ≤ N1} be the set of N1 samples for Cl(v).
3. Tv ← (1/N1) ∑_{Gv,i∈Av} PM(Gv,i) (estimating E[PM | Cl(v)]).
4. T1 ← ∑_{v∈V} (Pr[Cl(v)] Tv + ∑_{s∈F\Bv} Pr[F〈v〉 ∧ v ⇝ s] · d(s, H(v))).
Lemma 119. Algorithm 9 produces a (1 ± ε)-estimate for the second term with high probability.
Analysis. Note that the number of samples is asymptotically dominated by estimating E[PM | F〈1〉] · Pr[F〈1〉]. For each node v ∈ V, we take N1 independent samples. Thus, we need to take O((m³n⁵/ε⁴) ln m) independent samples in total.

We still need to show that for i > 1, the contribution from the event F〈i〉 is negligible. Suppose S is the set of nodes that are realized outside their stoch-cores. We use FS and HS to denote the set of all realizations of the nodes in S to points outside their stoch-cores, and the set of realizations of S̄ = V \ S to points in their stoch-cores, respectively. We use PM(FS, HS) to denote the length of the minimum perfect matching under the realization (FS, HS), where FS ∈ FS and HS ∈ HS. The following combinatorial fact plays the same role in the charging argument as Lemma 113 does in the previous section. Unlike the MST problem, we cannot achieve a bound similar to the one in Lemma 113, since PM(FS, HS) may decrease significantly if we
send only one node outside its stoch-core back to its stoch-core. However, we show that in such a case, if we send one more node back to its stoch-core, PM(FS, HS) can still be bounded.
We need the following structural result about minimum perfect matchings, which
is essential for our charging argument.
Lemma 120. Fix a realization (FS, HS). We use ℓ(v) to denote d(v, H(v)) for all nodes v ∈ S. Suppose v1 ∈ S has the smallest ℓ value and v2 has the second smallest ℓ value. Let S′ = S \ {v1} and S′′ = S′ \ {v2}. Further, let (FS′, HS′) be a realization obtained from (FS, HS) by sending v1 to a point in its stoch-core H(v1), and (FS′′, HS′′) be a realization obtained from (FS′, HS′) by sending v2 to a point in its stoch-core H(v2). Then we have PM(FS, HS) ≤ 2(n + 2) PM(FS′, HS′) + 2(n + 2) PM(FS′′, HS′′).
Proof. Let d = min_v ℓ(v) and D = max_i diam(Hi). Note that d ≥ D/n, as d ≥ 2T and D ≤ 2nT. We distinguish the following three cases:

1. PM(FS, HS) ≤ d/2. Using an argument similar to the one in Lemma 116, we have

PM(FS′, HS′) + PM(FS, HS) ≥ ℓ(v1) = d.

So, we have PM(FS, HS) ≤ PM(FS′, HS′) in this case.

2. PM(FS, HS) ≥ (n + 2)d. By the triangle inequality, we can see that

PM(FS′, HS′) + (n + 1)d ≥ PM(FS′, HS′) + d + D ≥ PM(FS, HS).

So, we have PM(FS, HS) ≤ (n + 2) PM(FS′, HS′).

3. d/2 ≤ PM(FS, HS) ≤ (n + 2)d.

(a) PM(FS′, HS′) ≥ d/2. We directly have PM(FS, HS) ≤ 2(n + 2) PM(FS′, HS′).

(b) PM(FS′, HS′) ≤ d/2. By Lemma 116, we have

PM(FS′, HS′) + PM(FS′′, HS′′) ≥ d.

Then we have PM(FS, HS) ≤ 2(n + 2) PM(FS′′, HS′′).

In summary, the lemma holds in all cases.
It remains to establish the following key lemma. The proof is similar to, but more involved than, that of Lemma 114.
Lemma 121. For any ε > 0, if H satisfies the properties Q1 and Q2 in Algorithm 7, we have that

∑_{i>1} E[PM | F〈i〉] · Pr[F〈i〉] ≤ ε · E[PM | F〈0〉] · Pr[F〈0〉] + ε · E[PM | F〈1〉] · Pr[F〈1〉].
Proof. We claim that for any i > 1,

E[PM | F〈i + 1〉] · Pr[F〈i + 1〉] ≤ (ε/6) (E[PM | F〈i〉] · Pr[F〈i〉] + E[PM | F〈i − 1〉] · Pr[F〈i − 1〉]).

If the claim is true, the lemma can be proven easily as follows. For ease of notation, we use A(i) to denote E[PM | F〈i〉] · Pr[F〈i〉]. First, we can see that

A(i + 2) + A(i + 1) ≤ (ε/6) A(i + 1) + (2ε/6) A(i) + (ε/6) A(i − 1) ≤ (ε/2) (A(i) + A(i − 1)).

So if i is odd, A(i + 2) + A(i + 1) ≤ (ε/2)^{(i+1)/2} (A(1) + A(0)). Therefore, ∑_{i>1} A(i) ≤ (ε/2)/(1 − ε/2) · (A(1) + A(0)) ≤ ε (A(1) + A(0)). Now, we prove the claim. Again, we rewrite the LHS as

E[PM | F〈i + 1〉] · Pr[F〈i + 1〉] = ∑_{|S|=i+1} ∑_{FS} ∑_{HS} Pr[(FS, HS)] · PM(FS, HS).
Similarly, the RHS terms can be written as

E[PM | F〈i〉] · Pr[F〈i〉] = ∑_{|S′|=i} ∑_{FS′} ∑_{HS′} Pr[(FS′, HS′)] · PM(FS′, HS′), and

E[PM | F〈i − 1〉] · Pr[F〈i − 1〉] = ∑_{|S′′|=i−1} ∑_{FS′′} ∑_{HS′′} Pr[(FS′′, HS′′)] · PM(FS′′, HS′′).
Let C(FS, HS) = Pr[(FS, HS)] · PM(FS, HS). Consider all (FS′, HS′) with |S′| = i and all (FS′′, HS′′) with |S′′| = i − 1 as buyers. The buyers want to buy all terms in the LHS. The budget of buyer (FS′, HS′) (resp. (FS′′, HS′′)) is C(FS′, HS′) (resp. C(FS′′, HS′′)). We show that there is a charging scheme such that each term C(FS, HS) is fully paid by the buyers and each buyer spends at most an ε/6 fraction of her budget.
Suppose we are selling the term C(FS, HS). Consider the following charging scheme. Suppose v1 ∈ S is the node realized to the point s1 ∈ P \ H(v1) that is closest to its stoch-core in FS, and v2 ∈ S is the node realized to the point s2 ∈ P \ H(v2) that is second closest to its stoch-core in FS. Let S′ = S \ {v1} and S′′ = S′ \ {v2}. If (FS′, HS′) is obtained from (FS, HS) by sending v1 to a point in its stoch-core H(v1), we say (FS′, HS′) is consistent with (FS, HS), denoted as (FS′, HS′) ∼ (FS, HS). If (FS′′, HS′′) is obtained from (FS′, HS′) by sending v2 to a point in its stoch-core H(v2), we say (FS′′, HS′′) is consistent with (FS′, HS′), denoted as (FS′′, HS′′) ∼ (FS′, HS′). Let

Z(FS, HS) = ∑_{(FS′,HS′)∼(FS,HS)} Pr[(FS′, HS′)], and

Z(FS′, HS′) = ∑_{(FS′′,HS′′)∼(FS′,HS′)} Pr[(FS′′, HS′′)].
Now, we claim that for any fixed (FS′′, HS′′),

∑_{(FS′,HS′)∼(FS′′,HS′′)} Pr[(FS′, HS′)]/Z(FS′, HS′) ≤ ∑_{v∈S̄′′} Pr[v ∉ H(v)]/Pr[v ∈ H(v)].

The proof of the claim is essentially the same as in Lemma 114. We first observe that for the fixed node v = S′ \ S′′, the denominators of all terms are in fact the same by the definition of Z. Then, the proof can be completed by canceling out the common multiplicative terms from the numerators and the denominator.
Now, we specify how to charge each buyer. For each buyer (FS′, HS′) ∼ (FS, HS), we charge (FS′, HS′) the following amount of money:

2(n + 2) Pr[(FS, HS)] PM(FS′, HS′) · Pr[(FS′, HS′)]/Z(FS, HS),

and we charge each buyer (FS′′, HS′′) consistent with (FS′, HS′) the following amount of money:

2(n + 2) Pr[(FS′′, HS′′)] PM(FS′′, HS′′) · (Pr[(FS, HS)]/Z(FS, HS)) · (Pr[(FS′, HS′)]/Z(FS′, HS′)).
In this case, we call (FS′′, HS′′) a sub-buyer of the term C(FS, HS). By Lemma 120, we can see that C(FS, HS) is fully paid. To prove the claim, it suffices to show that each buyer (FS′, HS′) and each sub-buyer (FS′′, HS′′) is charged at most an ε/6 fraction of her budget. By the above charging scheme, the terms in the LHS that are charged to buyer (FS′, HS′) are consistent with (FS′, HS′). Using the same argument as in Lemma 114, we can show that the spending of (FS′, HS′) as a buyer is at most

(ε/(nm)) · PM(FS′, HS′) · Pr[(FS′, HS′)].
For notational convenience, let B = 2(n + 2) PM(FS′′, HS′′) Pr[(FS′′, HS′′)]. The spending of (FS′′, HS′′) as a sub-buyer can be bounded as follows:

B · ∑_{(FS′,HS′)∼(FS′′,HS′′)} ∑_{(FS,HS)∼(FS′,HS′)} (Pr[(FS, HS)]/Z(FS, HS)) · (Pr[(FS′, HS′)]/Z(FS′, HS′))
≤ B · ∑_{(FS′,HS′)∼(FS′′,HS′′)} ∑_{(FS,HS)∼(FS′,HS′)} Pr[(FS′, HS′)]/Z(FS′, HS′)
≤ B · mn · ∑_{(FS′,HS′)∼(FS′′,HS′′)} Pr[(FS′, HS′)]/Z(FS′, HS′)
≤ B · mn · ∑_{v∈S̄′′} Pr[v ∉ H(v)]/Pr[v ∈ H(v)]
≤ (ε/6) · PM(FS′′, HS′′) · Pr[(FS′′, HS′′)].

In the first inequality, we use the fact that Pr[(FS, HS)]/Z(FS, HS) ≤ 1. Note that for each (FS′, HS′),
there are at most mn different (FS, HS) such that (FS, HS) ∼ (FS′, HS′); this gives the second inequality. This completes the proof of the lemma.
Theorem 122. Assuming the locational uncertainty model and that the number of
nodes is even, there is an FPRAS for estimating the expected length of the minimum
perfect matching.
Remark. We also tried to use the HPF method for this problem. The problem can essentially be reduced to the following bins-and-balls problem: again, each ball is thrown into the bins with nonuniform probabilities, and we want to estimate the probability that each bin contains an even number of balls. To the best of our knowledge, this problem has not been studied before. Its structure is somewhat similar to the permanent problem. We attempted to use the MCMC technique developed in [67], but the details become overly messy and we have not been able to provide a complete proof.
6.5 Minimum Cycle Covers
In this section, we consider the expected length of minimum cycle cover problem. In
the deterministic version of the cycle cover problem, we are asked to find a collection
of node-disjoint cycles such that each node is in one cycle and the total length is
minimized. Here we assume that each cycle contains at least two nodes. If a cycle
contains exactly two nodes, the length of the cycle is two times the distance between
these two nodes. The problem can be solved in polynomial time by reducing the
problem to a minimum bipartite perfect matching problem. 5 W.l.o.g., we assume
that no two edges in P × P have the same length. For ease of exposition, we assume that for each point, there is only one node that may be realized at this point. In principle,
⁵If we require that each cycle consist of at least three nodes, the problem is still poly-time solvable by a reduction to minimum perfect matching due to Tutte [103]. Hartvigsen [63] obtained a polynomial-time algorithm for minimum cycle cover with each cycle having at least 4 nodes. Cornuejols and Pulleyblank [32] have reported that Papadimitriou showed the NP-completeness of minimum cycle cover with each cycle having at least 6 nodes.
if more than one node may be realized at the same point, we can create multiple copies of the point co-located at the same place, and impose a distinct infinitesimal distance between each pair of copies, to ensure that no two edges have the same length.
We need the notion of the nearest-neighbor graph, denoted by NN. For an undirected graph, an edge e = (u, v) is in the nearest-neighbor graph if u is the nearest neighbor of v, or vice versa. We also use NN to denote its total length. E[NN] can be computed exactly in polynomial time [71]. As a warmup, we first show that E[NN] is a 2-approximation of E[CC] in the following lemma.
Lemma 123. E[NN] ≤ E[CC] ≤ 2E[NN].
Proof. We show that NN ≤ CC ≤ 2NN holds for each possible realization. We first prove the first inequality. For each node u, there are two edges of CC incident on u; suppose they are e_{u1} and e_{u2}. We have CC = ∑_u (d(e_{u1}) + d(e_{u2}))/2 ≥ NN. The second inequality can be seen by doubling all edges in NN and applying the triangle inequality.
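As a quick sanity check of Lemma 123 on a concrete realization, the following sketch brute-forces both quantities for a small point set (the function names are illustrative, not from the dissertation; `cc_length` follows the convention above that a 2-cycle pays twice the edge length):

```python
import itertools
import math

def nn_length(points):
    """Total length of the nearest-neighbor graph NN: edge (u, v) is present
    if v is the nearest neighbor of u or vice versa; each edge counted once."""
    n = len(points)
    edges = set()
    for u in range(n):
        v = min((w for w in range(n) if w != u),
                key=lambda w: math.dist(points[u], points[w]))
        edges.add((min(u, v), max(u, v)))
    return sum(math.dist(points[u], points[v]) for u, v in edges)

def cc_length(points):
    """Brute-force minimum cycle cover: minimize sum_i d(i, pi(i)) over all
    permutations pi with no fixed point; every cycle then has >= 2 nodes,
    and a 2-cycle pays twice the distance between its endpoints."""
    n = len(points)
    return min(
        sum(math.dist(points[i], points[pi[i]]) for i in range(n))
        for pi in itertools.permutations(range(n))
        if all(pi[i] != i for i in range(n)))

# unit square: NN has three unit edges, the cheapest cover costs 4
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
assert nn_length(pts) <= cc_length(pts) <= 2 * nn_length(pts)
```

On the unit square the brute force gives NN = 3 and CC = 4 (two opposite sides, each traversed twice), consistent with the sandwich in the lemma.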
We denote the longest edge in NN (and also its length) by L. Note that L is also
a random variable. By the law of total expectation, we estimate E[CC] based on the
following formula:
E[CC] = ∑_{e∈P×P} Pr[L = e] · E[CC | L = e].

It is easy to see that NN/m ≤ L ≤ NN. Combined with Lemma 123, we have that

d(e) ≤ E[CC | L = e] ≤ 2m · d(e). (6.1)
However, it is not clear to us how to estimate Pr[L = e] and how to take samples
conditioning on event L = e efficiently. To circumvent the difficulty, we consider some
simpler events. Consider a particular edge e = (s, t) ∈ P × P. Denote by N_s(t) the event that the nearest neighbor of s is t. Let L_{st} be the event that the longest edge L in NN is e = (s, t). Let A_s(t) = N_s(t) ∧ L_{st}. First we rewrite E[CC | L = e] · Pr[L = e] as
E[CC | L = e] · Pr[L = e] = E[CC | A_s(t) ∨ A_t(s)] · Pr[A_s(t) ∨ A_t(s)]
= E[CC | A_s(t)] · Pr[A_s(t)] + E[CC | A_t(s)] · Pr[A_t(s)]
− E[CC | A_s(t) ∧ A_t(s)] · Pr[A_s(t) ∧ A_t(s)].
Now, we show how to estimate E[CC | As(t)] · Pr[As(t)] for each edge e = (s, t). The
other two terms can be estimated in the same way. Also notice that the third term
is less than both the first term and the second term. Therefore, for any points s and
t, we have the following fact which is useful later:
E[CC] ≥ E[CC | L = e] · Pr[L = e] ≥ E[CC | As(t)] · Pr[As(t)]. (6.2)
By the above inequality, we can see that the total error for estimating the three terms
is negligible compared to E[CC | L = e] · Pr[L = e]. Moreover, we have that
E[CC | As(t)] · Pr[As(t)] = E[CC | As(t)] · Pr[Lst ∧Ns(t)]
= E[CC | As(t)] · Pr[Lst | Ns(t)] · Pr[Ns(t)]
Suppose v is the node that may be realized to point s and u is the node that may
be realized to point t. We use B as a shorthand notation for B(s, d(s, t)). We first
observe that Pr[Ns(t)] can be computed exactly in poly-time as follows:
Pr[N_s(t)] = p_{vs} · p_{ut} · ∏_{w≠v,u} (1 − p_w(B)).

Also note that we can take samples conditioning on the event N_s(t); the corresponding conditional distribution for each remaining node w is Pr[w ⇝ r | N_s(t)] = p_{wr} / (1 − p_w(B)) for each point r outside B.
Estimating E[CC | A_s(t)] · Pr[L_{st} | N_s(t)]. Next, we show how to estimate E[CC | A_s(t)] · Pr[L_{st} | N_s(t)]. The high-level idea is the following. We take samples conditioning on N_s(t). If Pr[L_{st} | N_s(t)] is large (i.e., at least 1/poly(nm)), we can get enough samples satisfying L_{st}, and thus A_s(t). Therefore, we can get a (1 ± ε)-approximation for both Pr[L_{st} | N_s(t)] and E[CC | A_s(t)] in poly-time (we also use the fact that if A_s(t) is true, CC is at least d(s, t) and at most 2m · d(s, t)). However, if Pr[L_{st} | N_s(t)] is small, it is not clear how to obtain a reasonable estimate of this value. In this case,
we show that the contribution of the term to our final answer is extremely small, so even an inaccurate estimate of the term does not affect our answer in any significant way, with high probability.
Now, we elaborate on the details. We iterate the following steps N times (N = O((m²n⁴/ε³)(ln m + ln n)) suffices). Since there are O(n²) different edges between points, we need O((m²n⁶/ε³)(ln m + ln n)) iterations in total.
• Suppose we are in the ith iteration. We take a sample Gi of the stochastic
graph conditioning on the event Ns(t). We compute the nearest neighbor graph
NN(Gi) and the minimum length cycle cover CC(Gi). If e = (s, t) is the longest
edge in NN(Gi), let Ii = 1. Otherwise Ii = 0.
Our estimate of E[CC | A_s(t)] · Pr[L_{st} | N_s(t)] is the following:

(∑_{i=1}^{N} I_i · CC(G_i) / ∑_{i=1}^{N} I_i) · (∑_{i=1}^{N} I_i / N) = (1/N) · ∑_{i=1}^{N} I_i · CC(G_i).

It is not hard to see that the expectation of (1/N) · ∑_{i=1}^{N} I_i · CC(G_i) is exactly E[CC | A_s(t)] · Pr[L_{st} | N_s(t)].
We distinguish the following two cases:
1. Pr[L_{st} | N_s(t)] ≥ ε/(2mn⁴). By Lemma 14, (1/N) · ∑_{i=1}^{N} I_i ∈ (1 ± ε) · Pr[L_{st} | N_s(t)] with high probability. In this case, we have enough successful samples (samples with I_i = 1) to guarantee that ∑_{i=1}^{N} I_i · CC(G_i) / ∑_{i=1}^{N} I_i is a (1 ± ε)-approximation of E[CC | A_s(t)] with high probability, again by Lemma 14. We note that under the condition A_s(t), we can get a (1 ± ε)-approximation since CC is at least d(s, t) and at most 2m · d(s, t).
2. Pr[L_{st} | N_s(t)] < ε/(2mn⁴). We note that I_i = 0 means that while N_s(t) happens, the longest edge L in NN is longer than e = (s, t). Suppose e′ = (s′, t′) is the edge with the maximum Pr[L_{s′t′} | N_s(t)]. Since Pr[L_{st} | N_s(t)] ≤ ε/(2mn⁴), e′ = (s′, t′) must be different from e = (s, t), and Pr[L_{s′t′} | N_s(t)] ≥ (4mn²/ε) · Pr[L_{st} | N_s(t)].
Hence, we have that
E[CC | A_s(t)] · Pr[A_s(t)] = E[CC | A_s(t)] · Pr[L_{st} | N_s(t)] · Pr[N_s(t)]
≤ 2m · d(s, t) · (ε/(4mn²)) · Pr[L_{s′t′} | N_s(t)] · Pr[N_s(t)]
≤ (ε/(2n²)) · d(s′, t′) · Pr[L_{s′t′} | N_s(t)] · Pr[N_s(t)]
≤ (ε/(2n²)) · E[CC | A_{s′}(t′)] · Pr[L_{s′t′}]
≤ (ε/(2n²)) · E[CC].
The first and third inequalities are due to (6.1), and the fourth is due to (6.2).
By the Chernoff bound, we have that

Pr[(1/N) · ∑_{i=1}^{N} I_i · CC(G_i) ≥ (ε/n²) · E[CC]] ≤ e^{−mn²}.

Then, with probability at least 1 − poly(1/m), the contribution from all such edges is less than ε · E[CC].
Summing up, we have obtained the following theorem.
Theorem 124. There is an FPRAS for estimating the expected length of the min-
imum length cycle cover in both the locational uncertainty model and the existential
uncertainty model.
Finally, we remark that our algorithm also works in the presence of both locational uncertainty and node uncertainty, i.e., when the existence of each node is a Bernoulli random variable. It is not hard to extend our technique to handle the case where each cycle is required to contain at least three nodes. This is done by considering the longest edge in the 2NN graph (where each node connects to its nearest and second nearest neighbors). The extension is fairly straightforward and we omit the details here.
6.6 kth Longest m-Nearest Neighbor
We consider the problem of computing the expected length of the kth longest m-
nearest neighbor (i.e., for each point, find the distance to its m-nearest neighbor, then
compute the kth longest one among these distances) in the existential uncertainty
model. We use kmNN to denote the length of the kth longest m-nearest neighbor.
Similar to k-clustering, we use the HPF Ψ(P) for estimating E[kmNN]. We call
a component a small component if it contains at most m present points. Let the
random variable Y be the largest integer i such that there are at most k − 1 present
points among those small components in Γ_i. We can see that if Y = i, then the special component ν_i is not a small component, both µ′_{i+1} and µ′′_{i+1} are non-empty, and one of µ′_{i+1} and µ′′_{i+1} must be a small component. Moreover, Γ′_i contains at most k − 1 present points among those small components.
We can rewrite E[kmNN] as E[kmNN] = ∑_{i=1}^{n} Pr[Y = i] · E[kmNN | Y = i]. By Properties P1 and P2 of Ψ(P), we directly have the following lemma.
Lemma 125. Conditioning on Y = i, it holds that d(ei) ≤ kmNN ≤ nd(ei).
For a partition Γ on P , we use Γ〈#j,≤ m〉 to denote the event that there are
exactly j present points among those small components in Γ. The remaining task is
to show how to compute Pr[Y = i] and how to estimate E[kmNN | Y = i]. We first
prove the following lemma.
Lemma 126. For a partition Γ on P, we can compute Pr[Γ〈#j,≤ m〉] in polynomial
time. Moreover, there exists a polynomial time sampler for sampling present points
in Γ conditioning on Γ〈#j,≤ m〉.
Proof. W.l.o.g., we assume that the components in Γ are C_1, . . . , C_n. We denote by E[a, b] the event that, among the first a components, exactly b points are present in those small components. We denote the probability of E[a, b] by Pr[a, b]. Note that our goal is to compute Pr[n, j]. We have the following dynamic program:
1. If ∑_{1≤l≤a} min{m, |C_l|} < b, then Pr[a, b] = 0. If b = 0, then Pr[a, b] = ∏_{1≤l≤a} (Pr[C_l〈0〉] + Pr[C_l〈≥ m + 1〉]).

2. For 1 ≤ b ≤ ∑_{1≤l≤a} min{m, |C_l|}, Pr[a, b] = ∑_{0≤l≤m} Pr[C_a〈l〉] · Pr[a − 1, b − l] + Pr[C_a〈≥ m + 1〉] · Pr[a − 1, b].
Thus we can compute Pr[n, j] in polynomial time. Similar to Lemma 108, we can also construct a polynomial-time sampler.
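The dynamic program above can be sketched as follows; `comp_dists[c][l]`, standing for Pr[C_c〈l〉], is an assumed input format (components are independent), and the function name is illustrative:

```python
def prob_small_count(comp_dists, m, j):
    """Pr[exactly j present points lie in small components], where a
    component is small if it contains at most m present points.
    comp_dists[c][l] = Pr[component c has exactly l present points]."""
    table = {0: 1.0}  # table[b] plays the role of Pr[a, b] after a components
    for dist in comp_dists:
        big = sum(dist[m + 1:])  # Pr[>= m+1 present points], i.e. not small
        new = {}
        for b, p in table.items():
            new[b] = new.get(b, 0.0) + big * p  # not small: b unchanged
            for l in range(min(m, len(dist) - 1) + 1):
                new[b + l] = new.get(b + l, 0.0) + dist[l] * p
        table = new
    return table.get(j, 0.0)
```

For example, with two components that are each empty or hold one present point with probability 1/2, and m = 1, every component is small and Pr[j = 1] = 1/2.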
To prove Theorem 128, we only need the following lemma.
Lemma 127. We can compute Pr[Y = i] in polynomial time. Moreover, there exists
a polynomial time sampler conditioning on Y = i.
Proof. By the definition of Y = i, we can rewrite Pr[Y = i] as follows:
Pr[Y = i] = ∑_{1≤n₁≤m, m+1−n₁≤n₂≤m} Pr[µ′_{i+1}〈n₁〉] · Pr[µ′′_{i+1}〈n₂〉] · (∑_{k−n₁−n₂≤l≤k−1} Pr[Γ′_i〈#l, ≤ m〉])
+ ∑_{m+1≤n₁≤|µ′_{i+1}|, 1≤n₂≤m} Pr[µ′_{i+1}〈n₁〉] · Pr[µ′′_{i+1}〈n₂〉] · (∑_{k−n₂≤l≤k−1} Pr[Γ′_i〈#l, ≤ m〉])
+ ∑_{1≤n₁≤m, m+1≤n₂≤|µ′′_{i+1}|} Pr[µ′_{i+1}〈n₁〉] · Pr[µ′′_{i+1}〈n₂〉] · (∑_{k−n₁≤l≤k−1} Pr[Γ′_i〈#l, ≤ m〉])
Note that we can compute Pr[Y = i] in polynomial time by Lemma 126. Using the same argument as in Lemma 129, we can construct a polynomial-time sampler conditioning on Y = i. By Lemma 125, we only need to take O((n/ε²) ln n) independent samples for estimating E[kmNN | Y = i]. So we take O((n²/ε²) ln n) independent samples in total.
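The sample bound follows the standard multiplicative Chernoff argument for a poly-bounded variable; the following generic sketch (illustrative, not from the dissertation) makes the dependence on the ratio explicit:

```python
import math
import random

def monte_carlo_mean(sampler, ratio, eps, delta):
    """(1 +/- eps)-approximate E[X] with probability >= 1 - delta, assuming
    X always lies in [lo, ratio * lo] for some lo > 0 (poly-bounded).
    Multiplicative Chernoff gives N = O((ratio / eps^2) * log(1 / delta))."""
    N = math.ceil(3 * ratio / eps ** 2 * math.log(2 / delta))
    return sum(sampler() for _ in range(N)) / N

# With ratio = n and delta = 1/poly(n), N = O((n / eps^2) * ln n), matching
# the number of samples used above for estimating E[kmNN | Y = i].
random.seed(0)
est = monte_carlo_mean(lambda: random.uniform(1.0, 2.0), 2, 0.1, 1e-3)
assert abs(est - 1.5) < 0.15  # within 10% of the true mean 1.5
```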
Theorem 128. There is an FPRAS for estimating the expected length of the kth
longest m-nearest neighbor in the existential uncertainty model.
6.7 Missing Proofs
6.7.1 Closest Pair
Lemma 101. Steps 1, 2, 3 in Algorithm 3 provide (1 ± ε)-approximations for Pr[F〈i〉 ∧ C ≤ 1] for i = 0, 1, 2, respectively, with high probability.
Proof. As we just argued, Pr[F〈0〉 ∧ C ≤ 1] can be estimated since I(C ≤ 1), conditioned on F〈0〉, is poly-bounded. For estimating Pr[F〈1〉 ∧ C ≤ 1], we first rewrite this term as ∑_{s_i∈F} Pr[F〈s_i〉 ∧ C ≤ 1]. For a point s_i ∈ F, note that Pr[F〈s_i〉 ∧ C ≤ 1] = Pr[F〈s_i〉] · Pr[C ≤ 1 | F〈s_i〉]. Since we have that p_i(1 − ε/n) ≤ Pr[F〈s_i〉] ≤ p_i by the first property of the stoch-core H, we can use p_i to estimate Pr[F〈s_i〉]. For estimating Pr[C ≤ 1 | F〈s_i〉], we denote B_{s_i} = {t ∈ H : d(s_i, t) ≤ 1}. If B_{s_i} is not empty, we can use Monte Carlo to estimate Pr[C ≤ 1 | F〈s_i〉] since its value is at least ε/n². Otherwise, computing Pr[C ≤ 1 | F〈s_i〉] is equivalent to computing Pr[C ≤ 1 | F〈0〉] in the instance without s_i (since s_i is at distance more than 1 from any other point). The proof for Pr[F〈2〉 ∧ C ≤ 1] is almost the same and we do not repeat it.
6.7.2 Minimum Spanning Tree
Lemma 112. Algorithm 6 produces a (1±ε)-estimate for the second term with high
probability.
Proof. To compute the second term, we first rewrite it as follows:
E[MST | F〈1〉] · Pr[F〈1〉] = ∑_{v∈V} (∑_{s∈F} Pr[F〈v〉 ∧ v ⇝ s] · E[MST | F〈v〉, v ⇝ s]).
Fix a node v. To estimate ∑_{s∈F} Pr[F〈v〉 ∧ v ⇝ s] · E[MST | F〈v〉, v ⇝ s], we consider the following two situations:
1. Point s ∈ B, i.e., d(s, H) < (m/ε) · diam(H).
We estimate the sum for all s ∈ B. Notice that the sum is in fact Pr[Cl(v)] · E[MST | Cl(v)]. We can see that Pr[Cl(v)] can be computed exactly in linear time. We argue that the quality of the estimation taken over N₁ = O((mn²/ε⁵) ln m) samples is sufficient by considering the following two cases:
(a) Assume that E[MST | Cl(v)] ≥ (1/2) · E[MST | H〈m〉] ≥ Ω(ε²/n²) · diam(H). In this case, we have a poly-bounded random variable. This is because under the condition Cl(v), the maximum possible length of any minimum spanning tree is O((m/ε) · diam(H)). Hence we can use Monte Carlo to get a (1 ± ε)-approximation of E[MST | Cl(v)] with O((mn²/ε⁵) ln m) samples.
(b) Otherwise, we assume that E[MST | Cl(v)] ≤ (1/2) · E[MST | H〈m〉]. Let V₀ be the collection of these nodes. The probability that the sample average is larger than E[MST | H〈m〉] is at most poly(1/m) by the Chernoff bound. The probability that, for all nodes v ∈ V₀, the sample averages are at most E[MST | H〈m〉] is at least 1 − poly(1/m) by the union bound. If this is the case, we can see that their total contribution to the final estimation of E[MST] is less than ε · E[MST | H〈m〉] · Pr[H〈m〉]. In fact, this is because

∑_{v∈V₀} Pr[Cl(v)] · T_v ≤ ∑_{v∈V₀} Pr[Cl(v)] · E[MST | H〈m〉] < ε · E[MST | H〈m〉] · Pr[H〈m〉].

The second inequality is due to the fact that ∑_{v∈V₀} Pr[Cl(v)] ≤ m − p(H) < ε/16 < ε · Pr[H〈m〉].
2. Point s ∈ F \ B; each such term has d(s, H) > (m/ε) · diam(H).

We just use d(s, H) as the estimation of E[MST | F〈v〉, v ⇝ s]. This is because the length of MST is always at least d(s, H) and at most d(s, H) + m · diam(H) ≤ (1 + ε) · d(s, H).
6.7.3 Minimum Perfect Matching
Lemma 119. Algorithm 6.4 produces a (1 ± ε)-estimate for the second term with
high probability.
Proof. To compute the second term, we first rewrite it as follows:
E[PM | F〈1〉] · Pr[F〈1〉] = ∑_{v∈V} (∑_{s∉H(v)} Pr[F〈v〉 ∧ v ⇝ s] · E[PM | F〈v〉, v ⇝ s]).
Fix a particular node v. To estimate ∑_{s∉H(v)} Pr[F〈v〉 ∧ v ⇝ s] · E[PM | F〈v〉, v ⇝ s], we consider the following two situations:
1. Point s ∈ B_v, i.e., d(s, H(v)) < 4mD/ε.
We estimate the sum for all s ∈ B_v. Notice that the sum is in fact Pr[Cl(v)] · E[PM | Cl(v)]. We can see that Pr[Cl(v)] can be computed exactly in linear time. We argue that the quality of the estimation taken over N₂ = O((m²n⁵/ε⁴) ln m) samples is sufficient by considering the following two cases:

(a) Assume that E[PM | Cl(v)] ≥ (1/2) · E[PM | H〈m〉] = Ω(εD/(mn⁵)). In this case, our estimation is poly-bounded. This is because under the condition Cl(v), the maximum possible length of any minimum perfect matching is O(mD/ε). Hence we can use Monte Carlo to get a (1 ± ε)-approximation of E[PM | Cl(v)] with O((m²n⁵/ε⁴) ln m) samples.
(b) Otherwise, we assume that E[PM | Cl(v)] ≤ (1/2) · E[PM | H〈m〉]. Let V₀ be the collection of these nodes. The probability that the sample average is larger than E[PM | H〈m〉] is at most poly(1/m) by the Chernoff bound. The probability that, for each node v ∈ V₀, the sample average is at most E[PM | H〈m〉] is at least 1 − poly(1/m) by the union bound. If this is the case, we can see that their total contribution to the final estimation of E[PM] is less than ε · E[PM | H〈m〉] · Pr[H〈m〉]. In fact, this is because

∑_{v∈V₀} Pr[Cl(v)] · T_v ≤ ∑_{v∈V₀} Pr[Cl(v)] · E[PM | H〈m〉] < ε · E[PM | H〈m〉] · Pr[H〈m〉].
The second inequality is due to the fact that ∑_{v∈V₀} Pr[Cl(v)] ≤ m − ∑_{v∈V} p_v(H(v)) ≤ ε/n³ < ε · Pr[H〈m〉].
2. Point s ∈ P \ (B_v ∪ H(v)); each such term has d(s, H(v)) > 4mD/ε. The algorithm uses d(s, H(v)) as the estimation of E[PM | F〈v〉, v ⇝ s]. Note that the length of PM is always at least d(s, H(v)) − mD ≥ (1 − ε/4) · d(s, H(v)). This is because such an instance of PM contains a path from s to some point t ∈ H(v), after deleting no more than m segments of length at most D (each segment is in some H_j). On the other hand, the length of PM is at most d(s, H(v)) + mD ≤ (1 + ε/4) · d(s, H(v)). So it is a (1 ± ε)-estimation.
6.8 The Closest Pair Problem
6.8.1 Estimating kth Closest Pair in the Existential Uncertainty Model
Again, we construct the HPF Ψ(P). Let the random variable Y be the largest integer i such that there are at least k point collisions in Γ_i. Here we use a point collision to denote that a pair of points is present in the same component. Note that if there are exactly i points in a component, the number of point collisions in this component is i(i − 1)/2. We denote by Γ〈#j〉 the event that there are exactly j point collisions in the partition Γ on P. Similarly, we can rewrite E[kC] as E[kC] = ∑_{i=1}^{m−1} Pr[Y = i] · E[kC | Y = i].
We use a dynamic programming technique to obtain an FPRAS for computing E[kC]. Note that conditioning on Y = i, the value of kC is between d(e_i) and m · d(e_i). So we only need to show the following lemma.
Lemma 129. We can compute Pr[Y = i] in polynomial time. Moreover, there exists
a polynomial time sampler conditioning on Y = i.
Proof. We denote by E[a, b] (1 ≤ a ≤ i − 1, b ≤ k) the event that, among the first a components in Γ′_i, there are exactly b ≤ k point collisions. We denote the probability
of E[a, b] by Pr[a, b]. We give the dynamic program as follows.

1. If ∑_{1≤j≤a} |C_j|(|C_j| − 1)/2 < b, then Pr[a, b] = 0. If b = 0, then Pr[a, b] = ∏_{1≤j≤a} Pr[C_j〈≤ 1〉]. If b < 0, then Pr[a, b] = 0.

2. If ∑_{1≤j≤a} |C_j|(|C_j| − 1)/2 ≥ b and 1 ≤ b ≤ k, then Pr[a, b] = ∑_{0≤l≤|C_a|} Pr[C_a〈l〉] · Pr[a − 1, b − l(l − 1)/2].
By the above dynamic programming, we can compute Pr[i− 1, l] for 0 ≤ l ≤ k− 1 in
polynomial time.
By the definition of Y = i, it is not hard to see that we can rewrite Pr[Y = i] as follows:

Pr[Y = i] = ∑_{1≤n₁≤|µ′_{i+1}|, 1≤n₂≤|µ′′_{i+1}|} Pr[µ′_{i+1}〈n₁〉] · Pr[µ′′_{i+1}〈n₂〉] · ∑_{k−(n₁+n₂)(n₁+n₂−1)/2 ≤ l ≤ k−1−n₁(n₁−1)/2−n₂(n₂−1)/2} Pr[Γ′_i〈#l〉]
Note that we can compute Pr[Y = i] in polynomial time. We need to describe our sampler conditioning on Y = i. We first sample the event µ′_{i+1}〈n₁〉 ∧ µ′′_{i+1}〈n₂〉 with probability Pr[µ′_{i+1}〈n₁〉 ∧ µ′′_{i+1}〈n₂〉 | Y = i]. Then, conditioning on k − (n₁+n₂)(n₁+n₂−1)/2 ≤ l ≤ k − 1 − n₁(n₁−1)/2 − n₂(n₂−1)/2, we sample the total number of point collisions in Γ′_i. Then we sample the number of present points in each component in Γ′_i using the dynamic programming. Finally, based on the number of present points in each component, we sample the present points by Lemma 108.
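The collision-counting dynamic program used in this proof can be sketched analogously to the one for Lemma 126 (same assumed input format; l present points in one component contribute l(l − 1)/2 collisions, and counts above k are never queried):

```python
def prob_collisions(comp_dists, k, b):
    """Pr[exactly b point collisions in a realization], for 0 <= b <= k.
    comp_dists[c][l] = Pr[component c has exactly l present points];
    l present points in one component create l*(l-1)//2 collisions."""
    table = {0: 1.0}  # keyed by collision count so far, truncated above k
    for dist in comp_dists:
        new = {}
        for cur, p in table.items():
            for l, q in enumerate(dist):
                c = cur + l * (l - 1) // 2
                if c <= k:
                    new[c] = new.get(c, 0.0) + p * q
        table = new
    return table.get(b, 0.0)

# one component with 0, 1, or 2 present points: a collision occurs w.p. 0.2
assert abs(prob_collisions([[0.5, 0.3, 0.2]], 5, 1) - 0.2) < 1e-12
```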
Using the Monte Carlo method, we only need to take O((n/ε²) ln n) independent samples for estimating E[kC | Y = i]. Thus, we take O((n²/ε²) ln n) independent samples in total.
Theorem 130. There is an FPRAS for estimating the expected distance between the
kth closest pair in the existential uncertainty model.
6.8.2 Hardness for Closest Pair
Theorem 131. Computing Pr[C ≥ 1] is #P-hard to approximate within any factor
in a metric space in both the existential and locational uncertainty models.
Proof. First consider the existential uncertainty model. Consider a metric graph G
with edge weights being either 0.9 or 1.8. Each vertex in this graph exists with
probability 1/2. Let G′ be the unweighted graph with the same number of vertices.
G′ contains only those edges corresponding to edges with weight 0.9 in G. It is not
hard to see that
Pr[C ≥ 1] = #{independent sets of size at least two in G′} · (1/2ⁿ).
The right hand side is well known to be inapproximable for arbitrary graphs [97].
For the locational model, let the instance be G (with n vertices s₁, . . . , sₙ) together with n additional vertices t₁, . . . , tₙ which are far away from each other and from any vertex in G. Let the probability distribution of node v_i be p_{v_i s_i} = 1/2 and p_{v_i t_i} = 1/2. We can see that in this locational uncertainty model, the value Pr[C ≥ 1] is the same as that in the corresponding existential model G.
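The counting identity in the proof can be checked directly on small instances; the sketch below computes Pr[C ≥ 1] via the independent-set characterization (illustrative function, not from the dissertation; realizations with fewer than two points do not count, following the convention above):

```python
import itertools

def pr_closest_pair_at_least_1(n, edges):
    """Pr[C >= 1] when each of n vertices exists independently w.p. 1/2,
    pairs in `edges` are at distance 0.9 and all other pairs at 1.8 (a valid
    metric, since 0.9 + 0.9 >= 1.8). The surviving set has C >= 1 exactly
    when it has >= 2 vertices and is independent in G' = (V, edges)."""
    count = 0
    for r in range(2, n + 1):
        for S in itertools.combinations(range(n), r):
            if all((u, v) not in edges and (v, u) not in edges
                   for u, v in itertools.combinations(S, 2)):
                count += 1  # one independent set of size >= 2
    return count / 2 ** n

# path 0-1 on three vertices: the independent sets of size >= 2 are
# {0, 2} and {1, 2}, so the probability is 2 / 2^3
assert pr_closest_pair_at_least_1(3, {(0, 1)}) == 2 / 8
```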
Theorem 132. Computing E[C] exactly in both the existential and locational uncer-
tainty models is #P-hard in a metric space.
Proof. Consider a metric graph G with edge weights being either 1 or 2. Each vertex
in this graph exists with probability 1/2. Note that
E[C] = Pr[C = 1] + 2Pr[C = 2] = (Pr[C ≤ 1]− Pr[C = 0]) + 2(1− Pr[C ≤ 1])
Computing Pr[C = 0] can be easily done in polynomial time. Computing Pr[C ≤ 1]
in such a graph is as hard as counting independent sets in general graphs, hence is
also #P-hard (as in Theorem 131). So, computing E[C] is #P-hard as well.
For the locational model, let the instance be G (with n vertices s₁, . . . , sₙ) together with n additional vertices t₁, . . . , tₙ which satisfy d(s_i, t_j) = d(t_i, t_j) = 5 (1 ≤ i, j ≤ n, i ≠ j). Let the probability distribution of node v_i be p_{v_i s_i} = 1/2 and p_{v_i t_i} = 1/2. It is not hard to see that in this locational uncertainty model, the value E[C] is linearly related to the value E[C] in the existential model G. Therefore, computing E[C] is
also #P-hard in the locational uncertainty model.
6.9 Another FPRAS for MST
W.l.o.g., we assume that for each point, there is only one node that may be realized at this point. Our algorithm is a slight generalization of the one proposed in [71]. Let E[i] be the expected MST length conditioned on the event that all nodes v₁, . . . , v_m are realized to points in {s_i, . . . , s_n} (denote this event by In(i, n)). Let E′[i] be the expected MST length conditioned on the event that all nodes v₁, . . . , v_m are realized to {s_i, . . . , s_n} and at least one node is realized to s_i. We use v ⇝ s to denote the event that node v is realized to point s. Note that
E[i] = E′[i] · Pr[∃v, v ⇝ s_i | In(i, n)] + E[i + 1] · Pr[∄v, v ⇝ s_i | In(i, n)].
For a particular point s_i, we reorder the points s_i, . . . , s_n as r_i, . . . , r_n in increasing order of distance from s_i (so r_i = s_i). Let E′[i, j] be the expected MST length for all nodes conditioned on the event that all nodes are realized to {r_i, . . . , r_j} (denoted as In′(i, j)) and ∃v, v ⇝ r_i. Let E′′[i, j] be the expected MST length for all nodes conditioned on the event In′(i, j) ∧ (∃v, v ⇝ r_i) ∧ (∃v′, v′ ⇝ r_j). We can see that
E′[i, j] = E′′[i, j] · Pr[∃v′, v′ ⇝ r_j | In′(i, j), ∃v, v ⇝ r_i] + E′[i, j − 1] · Pr[∄v′, v′ ⇝ r_j | In′(i, j), ∃v, v ⇝ r_i].
It is not difficult to see that the probability Pr[∃v′, v′ ⇝ r_j | In′(i, j), ∃v, v ⇝ r_i] can be computed in polynomial time. Here we use the assumption that for each point, there is only one node that may be realized at it. Moreover, we can also take samples conditioning on the event In′(i, j) ∧ (∃v, v ⇝ r_i) ∧ (∃v′, v′ ⇝ r_j). Therefore, E′′[i, j] can be approximated within a factor of (1 ± ε) using the Monte Carlo method in polynomial time, since it is poly-bounded. The number of samples needed can be bounded by O((mn²/ε²) ln m).
We can easily generalize the above algorithm to the case where ∑_{j=1}^{n} p_{ij} ≤ 1, i.e., node i may be absent with some probability. Indeed, this can be done by generalizing the definition of In(i, j) (and similarly In′(i, j)) to be the event that each node is either absent or realized to some point in {r_i, . . . , r_j}.
Chapter 7 Concluding Remarks
In this dissertation, we study two famous stochastic geometry models, the locational
uncertainty model and the existential uncertainty model.
In the first part of the dissertation, we study how to construct coresets for different stochastic problems. We initiate the study of constructing ε-kernel coresets in stochastic geometry models. We consider approximating the expected width (an ε-exp-kernel), as well as the probability distribution on the width (an (ε, τ)-quant-kernel) for any direction. We provide efficient algorithms for constructing such ε-kernel coresets in nearly linear time. Our ε-kernel coresets have a few applications, including approximating the extent of uncertain functions, maintaining extent measures for stochastic moving points, and giving PTASs for some stochastic shape fitting problems.
We also study two other important stochastic geometric optimization problems, the k-center problem and the j-flat-center problem, in stochastic geometry models in Euclidean spaces. We first view each of these stochastic problems as a certain deterministic problem over all (exponentially many) possible realizations (each being a point set). From this view, we introduce a new notion called the generalized coreset, which is a collection of realizations (instead of points, as for coresets). We also propose a new framework for generalized coreset construction. Using this framework, we provide the first PTASs (Polynomial Time Approximation Schemes) for both stochastic geometric optimization problems, which generalize the previous results for stochastic minimum enclosing ball [88] and stochastic enclosing cylinder [66].
The second part of the dissertation estimates the expected value of a variety of combinatorial objects over stochastic data. Several geometric properties of a set of stochastic points have been studied extensively in the literature under the term stochastic geometry. For instance, it is well known that if there are n points uniformly and independently distributed in [0, 1]², the minimal traveling salesman tour/minimum spanning tree/minimum matching visiting them has an expected length Θ(√n) [21, 28]. Compared with results in stochastic geometry, we focus on the efficient computation of these statistics, instead of giving explicit mathematical formulas.
We study the problems of computing the expected lengths of several combinatorial or geometric optimization problems in both models, including closest pair, minimum spanning tree, k-clustering, minimum perfect matching, and minimum cycle cover. We also consider the problem of estimating the probability that the length of the closest pair, or the diameter, is at most, or at least, a given threshold. Most of the above problems are known to be #P-hard. Because of the high variance, we cannot directly use the Monte Carlo method to estimate the expected value. Thus, we develop two new techniques: stoch-core and the Hierarchical Partition Family (HPF). Both techniques are used to decompose the expectation of a certain random variable into a convex combination of conditional expectations, such that each conditional expectation has a low variance. Combining our new techniques with the Monte Carlo method, we obtain FPRASs (Fully Polynomial Randomized Approximation Schemes) for most of these problems in stochastic geometry models.
There are still many open optimization problems over different stochastic models. In this dissertation, we study discrete stochastic models. In practice, the distribution of point locations often follows some continuous distribution, as in GPS systems, robot control, and so on. Many fundamental issues in this domain, such as many classic geometric computation and optimization problems, are still not well understood. We believe this is a fruitful direction for further research.
BIBLIOGRAPHY
Bibliography
[1] A. Abdullah, S. Daruki, and J.M. Phillips. Range counting coresets for uncertain data. In Proceedings of the 29th ACM Symposium on Computational Geometry, pages 223–232, 2013.
[2] P. Afshani, P.K. Agarwal, L. Arge, K.G. Larsen, and J.M. Phillips. (Approximate) uncertain skylines. In Proceedings of the 14th International Conference on Database Theory, pages 186–196, 2011.
[3] P.K. Agarwal, S.W. Cheng, Y. Tao, and K. Yi. Indexing uncertain data. In Proceedings of the 28th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 137–146. ACM, 2009.
[4] P.K. Agarwal, S.W. Cheng, and K. Yi. Range searching on uncertain data. ACM Transactions on Algorithms (TALG), 8(4):43, 2012.
[5] P.K. Agarwal, A. Efrat, S. Sankararaman, and W. Zhang. Nearest-neighbor searching under uncertainty. In Proceedings of the 31st Symposium on Principles of Database Systems, pages 225–236, 2012.
[6] P.K. Agarwal, S. Har-Peled, S. Suri, H. Yıldız, and W. Zhang. Convex hulls under uncertainty. In Proceedings of the 22nd Annual European Symposium on Algorithms, pages 37–48, 2014.
[7] P.K. Agarwal, S. Har-Peled, and K. Varadarajan. Approximating extent measures of points. Journal of the ACM, 51(4):606–635, 2004.
[8] P.K. Agarwal, S. Har-Peled, and K. Varadarajan. Geometric approximation via coresets. Combinatorial and Computational Geometry, 52:1–30, 2005.
[9] P.K. Agarwal, S. Har-Peled, and H. Yu. Robust shape fitting via peeling and grating coresets. Discrete & Computational Geometry, 39(1-3):38–58, 2008.
[10] P.K. Agarwal, J. Matousek, and S. Suri. Farthest neighbors, maximum spanning trees and related problems in higher dimensions. Computational Geometry: Theory and Applications, 1(4):189–201, 1992.
[11] P.K. Agarwal and C.M. Procopiuc. Exact and approximation algorithms for clustering. Algorithmica, 33(2):201–226, 2002.
[12] P.K. Agarwal and M. Sharir. Arrangements and their applications. In Handbook of Computational Geometry, J. Sack and J. Urrutia (eds.), pages 49–119. Elsevier, Amsterdam, The Netherlands, 2000.
[13] C. Alexopoulos and J.A. Jacobson. State space partition algorithms for stochastic systems with applications to minimum spanning trees. Networks, 35(2):118–138, 2000.
[14] M. Anthony and P.L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 2009.
[15] M.J. Atallah, Y. Qi, and H. Yuan. Asymptotically efficient algorithms for skyline probabilities of uncertain data. ACM Transactions on Database Systems, 32(2):12, 2011.
[16] D. Bandyopadhyay and J. Snoeyink. Almost-Delaunay simplices: Nearest neighbor relations for imprecise points. In Proceedings of the 15th ACM-SIAM Symposium on Discrete Algorithms, pages 410–419, 2004.
[17] N. Bansal, A. Gupta, J. Li, J. Mestre, V. Nagarajan, and A. Rudra. When LP is the cure for your matching woes: Improved bounds for stochastic matchings. In European Symposium on Algorithms, pages 218–229. Springer, 2010.
[18] J.F. Bard and J.E. Bennett. Arc reduction and path preference in stochastic acyclic networks. Management Science, 37(2):198–215, 1991.
[19] G. Barequet and S. Har-Peled. Efficiently approximating the minimum-volume bounding box of a point set in three dimensions. Journal of Algorithms, 38(1):91–109, 2001.
[20] S. Basu, R. Pollack, and M. Roy. Algorithms in Real Algebraic Geometry. AMC, 10:12, 2011.
[21] J. Beardwood, J.H. Halton, and J.M. Hammersley. The shortest path through many points. Proc. Cambridge Philos. Soc., 55:299–327, 1959.
[22] M.W. Bern and D. Eppstein. Worst-case bounds for subadditive geometric graphs. In Symposium on Computational Geometry, pages 183–188, 1993.
[23] D.J. Bertsimas and G. van Ryzin. An asymptotic determination of the minimum spanning tree and minimum matching constants in geometrical probability. Operations Research Letters, 9(4):223–231, 1990.
[24] A. Bhalgat. A (2+ε)-approximation algorithm for the stochastic knapsack problem. Unpublished manuscript, 2011.
[25] T.M. Chan. Approximating the diameter, width, smallest enclosing cylinder, and minimum-width annulus. In Proceedings of the 16th Annual Symposium on Computational Geometry, pages 300–309, 2000.
[26] T.M. Chan. Faster core-set constructions and data-stream algorithms in fixed dimensions. Computational Geometry: Theory and Applications, 35:20–35, 2006.
[27] K. Chen. On coresets for k-median and k-means clustering in metric and Euclidean spaces and their applications. SIAM Journal on Computing, 39(3):923–947, 2009.
[28] R. Cheng, J. Chen, M. Mokbel, and C. Chow. Probabilistic verifiers: Evaluating constrained nearest-neighbor queries over uncertain data. In ICDE, 2008.
[29] R. Cheng, J. Chen, and X. Xie. Cleaning uncertain data with quality guarantees. Proceedings of the VLDB Endowment, 1(1):722–735, 2008.
[30] G. Cormode and A. McGregor. Approximation algorithms for clustering uncertain data. In Proceedings of the 27th Symposium on Principles of Database Systems, pages 191–200, 2008.
[31] G. Cormode and S. Muthukrishnan. Radial histograms for spatial streams. Technical Report 2003-11, Center for Discrete Mathematics and Computer Science (DIMACS), 2003.
[32] G. Cornuejols and W. Pulleyblank. A matching problem with side constraints. Discrete Mathematics, 29, 1980.
[33] B.C. Dean, M.X. Goemans, and J. Vondrák. Approximating the stochastic knapsack problem: The benefit of adaptivity. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, pages 208–217. IEEE, 2004.
[34] A. Deshpande, L. Rademacher, S. Vempala, and G. Wang. Matrix approximation and projective clustering via volume sampling. In Proceedings of the 17th ACM-SIAM Symposium on Discrete Algorithms, pages 1117–1126, 2006.
[35] X. Dong, A.Y. Halevy, and C. Yu. Data integration with uncertainty. In Proceedings of the 33rd International Conference on Very Large Data Bases, pages 687–698, 2007.
[36] A. Driemel, H. Haverkort, M. Löffler, and R.I. Silveira. Flow computations on imprecise terrains. Journal of Computational Geometry, 4:38–78, 2013.
[37] R.M. Dudley. Metric entropy of some classes of sets with differentiable boundaries. Journal of Approximation Theory, 10(3):227–236, 1974.
[38] M. Dyer. Approximate counting by dynamic programming. In ACM Symposium on Theory of Computing, pages 693–699, 2003.
[39] H. Edelsbrunner, J. O'Rourke, and R. Seidel. Constructing arrangements of lines and hyperplanes with applications. SIAM Journal on Computing, 15(2):341–363, 1986.
[40] Y. Emek, A. Korman, and Y. Shavitt. Approximating the statistics of various properties in randomly weighted graphs. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1455–1467. SIAM, 2011.
[41] W. Evans and J. Sember. The possible hull of imprecise points. In Proceedings of the 23rd Canadian Conference on Computational Geometry, 2011.
[42] T. Feder and D. Greene. Optimal algorithms for approximate clustering. In Proceedings of the twentieth annual ACM symposium on Theory of computing, pages 434–444. ACM, 1988.
[43] D. Feldman, A. Fiat, H. Kaplan, and K. Nissim. Private coresets. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, pages 361–370, 2009.
[44] D. Feldman, A. Fiat, and M. Sharir. Coresets for weighted facilities and their applications. In 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), pages 315–324. IEEE, 2006.
[45] D. Feldman and M. Langberg. A unified framework for approximating and clustering data. In Proceedings of the 43rd ACM Symposium on Theory of Computing, pages 569–578, 2011.
[46] D. Feldman, M. Schmidt, and C. Sohler. Turning big data into tiny data: Constant-size coresets for k-means, PCA and projective clustering. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1434–1453. SIAM, 2013.
[47] D. Feldman and L.J. Schulman. Data reduction for weighted and outlier-resistant clustering. In Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms, pages 1343–1354. SIAM, 2012.
[48] A.M. Frieze. On the value of a random minimum spanning tree problem. Discrete Applied Mathematics, 10(1):47–56, 1985.
[49] P.K. Ghosh and K.V. Kumar. Support function representation of convex bodies, its application in geometric computing, and some related representations. Computer Vision and Image Understanding, 72(3):379–403, 1998.
[50] A. Goel and P. Indyk. Stochastic load balancing and related problems. In 40th Annual Symposium on Foundations of Computer Science, pages 579–586. IEEE, 1999.
[51] T.F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985.
[52] V. Goyal and R. Ravi. A PTAS for the chance-constrained knapsack problem with random item sizes. Operations Research Letters, 38(3):161–164, 2010.
[53] S. Guha and K. Munagala. Exceeding expectations and clustering uncertain data. In Proceedings of the 28th Symposium on Principles of Database Systems, pages 269–278, 2009.
[54] L.J. Guibas, D. Salesin, and J. Stolfi. Constructing strongly convex approximate hulls with inaccurate primitives. Algorithmica, 9:534–560, 1993.
[55] P. Gupta and P.R. Kumar. Critical power for asymptotic connectivity. In Proceedings of the 37th IEEE Conference on Decision and Control, volume 1, pages 1106–1110. IEEE, 1998.
[56] P. Gupta and P.R. Kumar. The capacity of wireless networks. IEEE Transactions on Information Theory, 46(2):388–404, 2000.
[57] M. Haenggi, J.G. Andrews, F. Baccelli, O. Dousse, and M. Franceschetti. Stochastic geometry and random graphs for the analysis and design of wireless networks. IEEE Journal on Selected Areas in Communications, 27(7):1029–1046, 2009.
[58] S. Har-Peled. Geometric approximation algorithms, volume 173. American Mathematical Society, Providence, 2011.
[59] S. Har-Peled. On the expected complexity of random convex hulls. arXiv:1111.5340, 2011.
[60] S. Har-Peled and S. Mazumdar. On coresets for k-means and k-median clustering. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing, pages 291–300, 2004.
[61] S. Har-Peled and K. Varadarajan. Projective clustering in high dimensions using core-sets. In Proceedings of the eighteenth annual symposium on Computational geometry, pages 312–318. ACM, 2002.
[62] S. Har-Peled and Y. Wang. Shape fitting with outliers. SIAM Journal on Computing, 33(2):269–285, 2004.
[63] D. Hartvigsen. An extension of matching theory. PhD thesis, Carnegie Mellon University, 1984.
[64] M. Held and J.S.B. Mitchell. Triangulating input-constrained planar point sets. Information Processing Letters, 109(1):54–56, 2008.
[65] L. Huang and J. Li. Approximating the expected values for combinatorial optimization problems over stochastic points. In Automata, Languages, and Programming, pages 910–921. Springer, 2015.
[66] L. Huang, J. Li, J.M. Phillips, and H. Wang. ε-kernel coresets for stochastic points. In European Symposium on Algorithms. Springer, 2016.
[67] M. Jerrum, A. Sinclair, and E. Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries. Journal of the ACM (JACM), 51(4):671–697, 2004.
[68] M. Jerrum, L.G. Valiant, and V. Vazirani. Random generation of combinatorial structures from a uniform distribution. Theoretical Computer Science, 43:169–188, 1986.
[69] A.G. Jørgensen, M. Löffler, and J.M. Phillips. Geometric computation on indecisive points. In Proceedings of the 12th Algorithms and Data Structure Symposium, pages 536–547, 2011.
[70] P. Kamousi, T.M. Chan, and S. Suri. The stochastic closest pair problem and nearest neighbor search. In Proceedings of the 12th Algorithms and Data Structure Symposium, pages 548–559, 2011.
[71] P. Kamousi, T.M. Chan, and S. Suri. Stochastic minimum spanning trees in Euclidean spaces. In Proceedings of the 27th annual ACM symposium on Computational Geometry, pages 65–74. ACM, 2011.
[72] P. Kamousi, T.M. Chan, and S. Suri. Closest pair and the post office problem for stochastic points. Computational Geometry, 47(2):214–223, 2014.
[73] H.J. Karloff. How long can a Euclidean traveling salesman tour be? SIAM Journal on Discrete Mathematics, 2(1), 1989.
[74] J. Kleinberg and É. Tardos. Algorithm design. Pearson Education India, 2006.
[75] J. Kleinberg, Y. Rabani, and E. Tardos. Allocating bandwidth for bursty connections. SIAM Journal on Computing, 30(1):191–217, 2000.
[76] H. Kruger. Basic measures for imprecise point sets in R^d. Master's thesis, Utrecht University, 2008.
[77] M. Langberg and L.J. Schulman. Universal ε-approximators for integrals. In Proceedings of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms, 2010.
[78] J. Li and A. Deshpande. Maximizing expected utility for stochastic combinatorial optimization problems. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science (FOCS), pages 797–806. IEEE, 2011.
[79] J. Li and Y. Liu. Approximation algorithms for stochastic combinatorial optimization problems. Journal of the Operations Research Society of China, 4(1):1–47, 2016.
[80] J. Li and H. Wang. Range queries on uncertain data. Theoretical Computer Science, 609:32–48, 2016.
[81] J. Li and W. Yuan. Stochastic combinatorial optimization via Poisson approximation. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 971–980. ACM, 2013.
[82] Y. Li, P.M. Long, and A. Srinivasan. Improved bounds on the sample complexity of learning. Journal of Computer and System Sciences, 62:516–527, 2001.
[83] M. Löffler and J. Phillips. Shape fitting on point sets with probability distributions. In Proceedings of the 17th European Symposium on Algorithms, pages 313–324, 2009.
[84] M. Löffler and J. Snoeyink. Delaunay triangulations of imprecise points in linear time after preprocessing. In Proceedings of the 24th Symposium on Computational Geometry, pages 298–304, 2008.
[85] M. Löffler and M. van Kreveld. Approximating largest convex hulls for imprecise points. Journal of Discrete Algorithms, 6:583–594, 2008.
[86] R.P. Loui. Optimal paths in graphs with stochastic or multidimensional weights. Communications of the ACM, 26(9):670–676, 1983.
[87] J. Matousek. Computing the center of planar point sets. Discrete and Computational Geometry, 6:221, 1991.
[88] A. Munteanu, C. Sohler, and D. Feldman. Smallest enclosing ball for probabilistic data. In Proceedings of the thirtieth annual symposium on Computational geometry, page 214. ACM, 2014.
[89] T. Nagai and N. Tokura. Tight error bounds of geometric problems on convex objects with imprecise coordinates. In Jap. Conf. on Discrete and Comput. Geom., LNCS 2098, pages 252–263, 2000.
[90] E. Nikolova. Approximation algorithms for reliable stochastic combinatorial optimization. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 338–351. Springer, 2010.
[91] E. Nikolova, J.A. Kelner, M. Brand, and M. Mitzenmacher. Stochastic shortest paths via quasi-convex maximization. In European Symposium on Algorithms, pages 552–563. Springer, 2006.
[92] Y. Ostrovsky-Berman and L. Joskowicz. Uncertainty envelopes. In Abstracts of the 21st European Workshop on Comput. Geom., pages 175–178, 2005.
[93] J.M. Phillips. Coresets and sketches. In Handbook of Discrete and Computational Geometry, chapter 49. CRC Press, 3rd edition, 2016.
[94] D. Salesin, J. Stolfi, and L.J. Guibas. Epsilon geometry: building robust algorithms from imprecise computations. In Proceedings of the 5th Symposium on Computational Geometry, pages 208–217, 1989.
[95] R. Schneider. Convex bodies: the Brunn-Minkowski theory, volume 44. Cambridge University Press, 1993.
[96] A. Shapiro, D. Dentcheva, and A. Ruszczynski. Lectures on stochastic programming: modeling and theory, volume 16. SIAM, 2014.
[97] A. Sly. Computational transition at the uniqueness threshold. In 2010 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 287–296. IEEE, 2010.
[98] T.L. Snyder and J.M. Steele. A priori bounds on the Euclidean traveling salesman. SIAM Journal on Computing, 24(3), 1995.
[99] J.M. Steele. On Frieze's ζ(3) limit for lengths of minimal spanning trees. Discrete Applied Mathematics, 18(1):99–103, 1987.
[100] D. Suciu, D. Olteanu, C. Re, and C. Koch. Probabilistic databases. Synthesis Lectures on Data Management, 3(2):1–180, 2011.
[101] S. Suri, K. Verbeek, and H. Yıldız. On the most likely convex hull of uncertain points. In Proceedings of the 21st European Symposium on Algorithms, pages 791–802, 2013.
[102] C. Swamy and D.B. Shmoys. Approximation algorithms for 2-stage stochastic optimization problems. ACM SIGACT News, 37(1):33–46, 2006.
[103] W.T. Tutte. A short proof of the factor theorem for finite graphs. Canad. J. Math., 6, 1954.
[104] M. van Kreveld and M. Löffler. Largest bounding box, smallest diameter, and related problems on imprecise points. Computational Geometry: Theory and Applications, 43:419–433, 2010.
[105] V.N. Vapnik and A.Y. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability & Its Applications, 16(2):264–280, 1971.
[106] K. Varadarajan and X. Xiao. On the sensitivity of shape fitting problems. In 32nd International Conference on Foundations of Software Technology and Theoretical Computer Science, page 486, 2012.
[107] J. Vondrák, C. Chekuri, and R. Zenklusen. Submodular function maximization via the multilinear relaxation and contention resolution schemes. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 783–792. ACM, 2011.
[108] H. Wang and J. Zhang. One-dimensional k-center on uncertain data. Theoretical Computer Science, 602:114–124, 2015.
[109] H. Yıldız, L. Foschini, J. Hershberger, and S. Suri. The union of probabilistic boxes: Maintaining the volume. In European Symposium on Algorithms, pages 591–602, 2011.
[110] H. Yu, P.K. Agarwal, R. Poreddy, and K. Varadarajan. Practical methods for shape fitting and kinetic data structures using coresets. Algorithmica, 52(3):378–402, 2008.
[111] K. Zheng, G. Trajcevski, X. Zhou, and P. Scheuermann. Probabilistic range queries for uncertain trajectories on road networks. In Proceedings of the 14th International Conference on Extending Database Technology, pages 283–294, 2011.
Acknowledgements
First, I would like to thank my advisor, Jian Li, for his invaluable help and patient
instruction. He always offers me advice of great value and inspires new ideas.
I enjoyed discussing problems with him and have learned a lot from him. It is my
honor to be his student.
I would like to take this opportunity to thank all of my collaborators: Pingzhong
Tang, Yicheng Liu, Lingqing Ai, Xian Wu, Longbo Huang, Qicai Shi, Jeff M. Phillips,
Haitao Wang, Pinyan Lu, Chihao Zhang, Hu Ding, Yu Liu, Yifei Jin, and Lunjia Hu.
I have benefited a lot and improved my research skills while working with them. I am
indebted to Professor Pinyan Lu, who was my mentor during my visit to MSRA in
Shanghai in the spring of 2015, and to Chihao Zhang, who helped me a lot during my
internship at MSRA. My PhD life in IIIS was so enjoyable thanks to my dear friends.
I would like to thank my friends Yu, Jianan, Ye, Ruichuang, Linyun, Xian, Lingqing,
Yifei, Wei, Mengwen, Dong, Hao, Zhize, and Qicai (I have been moving a lot). We
share so many colorful memories of learning and playing together. I would also
like to thank my Dota-mates who won the 9cg championship at Tsinghua with me:
Lingqing, Xian, Linyun, Jianan, and Ruichuang.
Finally, I must give my most special thanks to my mother Xifeng and my girlfriend
Zhongjun, who have always accompanied me. Whenever I met difficulties, they
supported and encouraged me. Their selfless love has made all of this possible.
Declaration
I solemnly declare that this dissertation presents the results of my independent
research work, carried out under the guidance of my advisor. To the best of my
knowledge, this dissertation does not contain any content copyrighted by others,
except where explicitly quoted in the text. All other individuals and collectives who
have contributed to the research work involved in this dissertation have been clearly
identified.
Signature: Date:
Personal Resume, Academic Papers Published During the Study and Research Results
Personal Resume
2008-2012 Bachelor's Degree, IIIS, Tsinghua University
2012-2017 PhD Degree, IIIS, Tsinghua University; Advisor: Prof. Jian Li
Research interests: algorithm design (including approximation algorithms,
computational geometry, and randomized algorithms), machine learning, and game theory
Published Academic Papers
1. AAMAS 2014. Egalitarian Pairwise Kidney Exchange: Fast Algorithms via Linear
Programming and Parametric Flow. Authors: Jian Li, Yicheng Liu, Lingxiao
Huang, Pingzhong Tang.
2. SIGMETRICS 2014. The Multi-shop Ski Rental Problem. Authors: Lingqing
Ai, Xian Wu, Lingxiao Huang, Longbo Huang, Pingzhong Tang, Jian Li.
3. COCOON 2015. Approximation Algorithms for the Connected Sensor Cover
Problem. Authors: Lingxiao Huang, Jian Li, Qicai Shi.
4. ICALP 2015. Approximating the Expected Values for Combinatorial Optimization
Problems over Stochastic Points. Authors: Lingxiao Huang, Jian Li.
5. SODA 2016. Canonical Paths for MCMC: from Art to Science. Authors: Lingxiao
Huang, Pinyan Lu, Chihao Zhang.
6. ICML 2016. K-Means Clustering with Distributed Dimensions. Authors: Hu
Ding, Yu Liu, Lingxiao Huang, Jian Li.
7. ESA 2016. ε-kernel Coresets for Stochastic Points. Authors: Lingxiao Huang,
Jian Li, Jeff M. Phillips, Haitao Wang.
8. SODA 2017. Stochastic k-Center and j-Flat-Center Problems. Authors: Lingxiao
Huang, Jian Li.