Date post: | 02-Jan-2016 |
Category: |
Documents |
Upload: | matthew-owen |
View: | 220 times |
Download: | 0 times |
Computational Geometry
Piyush Kumar(Lecture 2: NN Search)
Welcome to CIS5930
Our First Problem
Nearest neighbor searching Applications?
oPattern ClassificationoGraphics oData CompressionoDocument retrievaloStatisticsoMachine Learningo…
Similarity Measure
In terms of Euclidean distance
(2,3)
(4,5)
2
1( , ) ( )
d
i iidist p q p q
2 2( , ) (2 4) (3 5) 2 2dist p q
Similarity Measure
Similar?
Similarity measure
Other similarity Measures
2
2
|| ||
2( , )p q
rd p q e
1( , ) cos( , )| || |
d
i iip q
d p q p qp q
The dimension
Lets assume that our points are in one dimensional space. ( d = 1 ). We will generalize to higher dimension ( Where d = some constant ).
Question
Given a set of points S on the real line, preprocess them to answer the following question : Find all the pair of points (p,q) such
that distance of (p,q) < r .
q
Points in S
Fixed radius near neighbor problem
Question
Given a set of points S on the real line, preprocess them to answer the following query : Given a query point q, find the
neighbor of q which is closest in S.
q
Points in Snn(q)
Nearest neighbor search
Answers?
Fixed NN Search O( n2 ) ? O(nlogn + k) ? O(n + k) ?
NN Search O( n ) ? O( log n) ?
Answers?
NN Search O( n ) ? Brute Force [ Trivial ] O( log n) ? Binary Search Tree
Fixed NN Search O( n2 ) ? Brute Force O(nlogn + k) ? Sorting O(n + k) ? Hashing?
NN search
NN Searching : Balanced binary tree
q
Points in Snn(q)
O( log n )
K-nearest neighbor search
Problem: Given a set of points P on the real line and a query point q, find the k-nearest neighbors of q in P.O(nlogn) Trivial bruteforce Do you see how?
Thought Problem: How do we do this in O(n) time? (Hint: Median finding works in O(n) time).
Fixed NN Search
Brute Force implementation
What can we speed up here? What can we speed up here? How do we speed this up?
Fixed NN Search: By Sorting
Once we sort the points on the real line, we can just go left and right to identify the pairs we need.Each pair is visited at most twice, so the asymptotics do not change.
Total work done after sorting
Total work done after sorting
• ki denotes the pairs generated when visiting pi • With this approach, we need at least Ω(nlogn) (for sorting).
Fixed radius near neighbor searching
How do we avoid sorting? How do we get a running time of O(n+k) ?
Solution using bucketing
Interval b is [ br, (b+1)r ]x lies in b = floor (x/r)
r
0
b=0b= -2
• Put points in infinite number of buckets (Assume an infinite array B)
Solution using bucketing Only n buckets of B might get occupied at most. How do we convert this infinite array into a finite one: Use hashing In O(1) time, we can determine which bucket a point falls in. In O(1) expected time, we can look the bucket up
in the hash table Total time for bucketing is expected O(n) The total running time can be made O(n) with high probability using multiple hash functions ( essentially using more than one hash function and choosing one at run time to fool the adversary ).
The Algorithm
Store all the points in buckets of size r In a hash table [ Total complexity = O(n) ]
For each point x b = floor(x/r); Retrieve buckets b, b+1 Output all pairs (x,y) such that y is either
in bucket b or b+1 and x < y and ||xy|| < r
0 x
Running Time
Let nb denote the number of points in bucket b of the input pointset P. Define
Note that there are nb2 pairs in
bucket b alone that are within distance r of each other.
Observation
Since each pair gets counted twice :
2 21 2b b
b B b B
S n n k
Running Time
Depends on the number of distance computations D.
2 2 21 3 3 ( )b b b
b B b B b B
n n n S k O k
2 2( )xy x y
Total Running Time = O(n+k)
Since :
Higher Dimensions
Send (x,y) (floor(x/r),floor(y/r)) Apply hash with two arguments Running time still O(n+k) Running time increases
exponentially with dimension
Introduction: Geometry Basics
Geometric Systems Vector Space Affine Geometry Euclidean Geometry
o AG + Inner Products = Euclidean Geometry
Vector Space
Scalar ( + , * ) = Number Types Usual example is Real Numbers R.
Let V be a set with two operations+ : V x V V
* : F x V Vo Here F is the set of Scalars
Vector Space
If (V , +, * ) follows the following properties, its called a vector space : (A1) u + (v + w) = (u + v) + w for all u,v,w in V. (A2) u + v = v + u for all u,v in V. (A3) there is unique 0 in V such that 0 + u = u for all u in
V. (A4) for every u in V, there is unique -u in V such that u + -
u = 0. (S1) r(su) = (rs)u for every r,s in R and every u in V. (S2) (r +s)u = ru + su for every r,s in R and every u in V. (S3) r(u + v) = ru + rv for every r in R and every u,v in V. (S4) 1u = u for every u in V.
Note: Vectors are closed under linear combinations A basis is a set of n linearly independent vectors that span V.
Affine Geometry
Geometry of vectors Not involving any notion of length or angle.
Consists of A set of scalars
o Say Real numbers A set of points
o Position specification A set of free vectors.
o Direction specification
Affine Geometry
Legal operations Point - Point = Vector Point +/- Vector =Point Vector +/- Vector = Vector Scalar * Vector = Vector Affine Combination
o ∑Scalar * Points = Point – such that ∑Scalar = 1– Note that scalars can range from –Infinity to
+Infinity
Affine Geometry
0 1 0 1 0 0 1 1 0 1 1 0( , ; , ) ( )aff p p p p p p p
( , )i
Affine Combination Convex Combination
(0,1)i
Affine Combinations
[ Affine Span or Affine Closure ] The set of all affine combinations of three points generates a plane. [ Convex Closure ] The set of all convex combinations of three points generates all points inside a triangle.
Euclidean Geometry
One more element added Inner Products
oMaps two vectors into a scalaroA way to `multiply’ two vectors
Example of Inner Products
Example : Dot products (u.v) = u0v0+ u1v1+…+ ud-1vd-1
o Where u,v are d-dimensional vectors.
o u = (u0,u1,…, ud-1); v = (v0,v1,…, vd-1)
Length of |u| = sqrt(u.u) (Distance from origin)
Normalization to unit length : u/|u| Distance between points |p - q| Angle (u’,v’) = cos-1(u’.v’)
o where u’=u/|u| and v’=v/|v|
Dot products
(u.v) = (+/-)|u|(projection of v on u). u is perpendicular to v (u,v) = 0 u.(v+w) = u.v + u.w If u.u not equal to zero then u.u > 0 positive definite
Some proofs using Dot Products
Cauchy Schwarz Inequality (u.v) <= |u||v| Homework.
o Hint: For any real number x– (u+xv).(u+xv) >= 0
Triangle Inequality |u+v|<=|u|+|v|
o Hint: expand |u+v|2 and use Cauchy Schwarz.
u
v
Homework: (Programming) Play with dpoint.hpp
and example.cpp (Implement orientation in d-D).Implement your own dvector.hpp, dsegment.hpp using metaprogrammingMake sure you understand how things work.
Next Lecture
Orientation and Convex hulls
Sources for this lecture: Dr. David Mount’s Notes. WWW.
Reading Assignment: Page 1-12, Notes of Dr. MountPage 1-2 and Section 1.3 of the text book.
Homework: (Theory)Cauchy SchwarzTriangle Inequality
dot n cross prod.
Due Monday
Due next Thursday
Crash course on C++“dpoint.hpp”
Namespaces Solve the problem of
classes/variables with same name namespace _cg code Outside the namespace use
_cg::dpoint Code within _cg should refer to code
outside _cg explicitly. E.g. std::cout instead of cout.
Object Oriented Programming
Identify functional units in your designWrite classes to implement these functional unitsSeparate functionality for code-reuse.
Class membership
PublicPrivate Always : Keep member variables
private This ensures that the class knows
when the variable changes
Protected
Inheritance
‘is a’ relationship is public inheritance Class SuperDuperBoss : public Boss
Polymorphism : Refer an object thru a reference or pointer of the type of a parent class of the object SuperDuperBoss JB; Boss *b = &JB;
Virtual functions
Templates
Are C macros on Steroids Give you the power to parametrize Compile time computation Performance
“The art of programming programsthat read, transform, or write other programs.”
- François-René Rideau
Generic Programming
How do we implement a linked list with a general type inside? void pointers? Using macros? Using Inheritance?
Templates
Function Templates Class Templates Template templates * Full Template specialization Partial template specialization
Metaprogramming
Programs that manipulate other programs or themselves Can be executed at compile time or runtime. Template metaprograms are executed at compile time.
Good old C
C code Double square(double x) return
x*x; osqare(3.14) Computed at compile time
#define square(x) ((x)*(x)) Static double sqrarg; #define SQR(a) (sqrarg=(a), sqrarg*sqrarg)
Templates
Help us to write code without being tied to particular type. Question: How do you swap two elements of any type? How do you return the square of any type?
Function Templates
C++ template< typename T > inline T square ( T x ) return x*x; A specialization is instantiated if needed :
o square<double>(3.14)
Template arguments maybe deduced from the function arguments square(3.14)
MyType m; … ; square(m); expands to square<MyType>(m)
Operator * must be overloaded for MyType
Function Templates
template<typename T> void swap( T& a, T& b )
T tmp(a); // cc required a = b; // ao required
b = tmp;
Mytype x = 1111;Mytype y = 100101;swap(x,y); swap<Mytype>(…) is instantiated
Note reliance on T’s concepts (properties):In above, T must be copyable and assignableCompile-time enforcement (concept-checking) techniques available
Function Template Specialization
template<> void
swap<myclass>( myclass& a, myclass& b)
a = b = 0; Custom version of a template for a specific class
Class Templates
template<typename NumType, unsigned D> class dpoint
public: NumType x[D]; ;
A simple 3-dimensional point.dpoint<float,3> point_in_3d;point_in_3d.x[0] = 5.0;point_in_3d.x[1] = 1.0;point_in_3d.x[2] = 2.0;
Note the explicit instantiation
Class Templates: Unlike function templates
Class template arguments can’t be deduced in the same way, so usually written explicitly:
dpoint <double, 2> c; // 2d point
Class template parameters may have default values:
template< class T, class U = int >class MyCls … ;
Class templates may be partially specialized:template< class U >class MyCls< bool, U > … ;
Using Specializations
First declare (and/or define) the general case:
template< class T >class C /* handle most types this way */ ;
Then provide either or both kinds of special cases as desired:
template< class T > class C< T * > /* handle pointers specially */ ;
template<> // note: fully specializedclass C< int * > /* treat int pointers thusly */ ;
Compiler will select the most specialized applicable class template
Template Template Parameters
A class template can be a template argumentExample:
template< template<class> class Bag >class C // … Bag< float > b;;
Or even:template< class E, template<class> class Bag >class C // … Bag< E > b;;
Recall C Enumerations
#define SPRING 0 #define SUMMER 1 #define FALL 2 #define WINTER 3
enum SPRING, SUMMER, FALL, WINTER ;
More Templates
template<unsigned u>class MyClass enum X = u ;
;
Cout << MyClass<2>::X << endl;
Template Metaprograms
Factorials at compile timetemplate<int N> class Factorial public: enum value = N * Factorial<N-
1>::value ; ;
class Factorial<1> public:
enum value = 1 ; ;
Int w = Factorial<10>::value;
Template Metaprograms
Metaprogramming using C++ can be used to implement a turning machine. (since it can be used to do conditional and loop constructs).
Template Metaprogramstemplate< typename NumType, unsigned D, unsigned I > struct origin static inline void eval( dpoint<NumType,D>& p ) p[I] = 0.0; origin< NumType, D, I-1 >::eval( p ); ;
// Partial Template Specializationtemplate <typename NumType, unsigned D> struct origin<NumType, D, 0> static inline void eval( dpoint<NumType,D>& p ) p[0] = 0.0; ;
const int D = 3;inline void move2origin() origin<NumType, D, D-1>::eval(*this); ;
Food for thought
You can implement IF WHILE FOR
Using metaprogramming. And then use them in your code that needs to run at compile time
For the more challenged: Implement computation of determinant / orientation / volume using metaprogramming. (Extra credit)
Traits Technique
Operate on “types” instead of data How do you implement a “mean” class without specifying the types. For double arrays it should output double For integers it should return a float For complex numbers it should return a
complex number
Traits
Template< typename T >struct average_traits typedef T T_average;;
Template<>struct average_traits<int>
typedef float T_average;
Traits
average_type(T) = Taverage_type(int) = float
Traits
template<typename T>typename
average_traits<T>::T_averageaverage(T* array, int N)
typename avearage_traits<T>::T_average
result = sum(array,N);
return result/N;
Sources C++ Templates by Vandevoorde and Josuttis. (A must have if you template) Template Metaprogramming by Todd Veldhuizen C++ Meta<Programming> Concepts and results by Walter E. Brown C++ For Game programmers by Noel LlopisC++ Primer by Lippman and Lajoie
GNU MP
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers. Get yourself acquinted with : Mpz : Integer arithmetic mpq : Rationals Mpf : Floats
Orientation primitive for points
+ve in this case
Orientation
x
y
(0,0)
p1
–+
The sign (+ or –) of cross product depends in which half plane (relative to p1)lies p2
Intuition for 2D and 3D using the right hand rule?
Homeworks
Prove that for two vectors v1 = (v1x,v1y) and v2 = (v2x,v2y) v1x* v2x + v1y * v2y = |v1||v2| cos Ө
v1y* v2x - v2y * v1x = |v1||v2| sin Ө What do these quantities mean
geometrically? Note that one changes sign when Ө is negative.
Problems with fixed point arithmetic
: 3.141592653589793238462643383....float pi = 2 * asin(1); printf("%.35f\n", pi);
Outputs
3.14159274101257324218750000000000000
float typically are 32 bits and can deal with 7 digits after decimal
Exaggerating the error
Suppose our computer computes in base 10, but only keeps the two most significant digits. We call this fixed precision computation Let us take three points:
o p = (-94,0) o q = (92,68) o r = (400,180)
Triangle pqr is oriented clockwise, hence area < 0.
Determinant
True value: (92+94)*180-(400+94)*68=186*180-494*68=33480-33592 =-112 Fixed precision computation : (92+94)*180-(400+94)*68=190*180-490*68=34000-33000 =+1000
Floating point computation
Sometimes produces wrong results Exact computation is possible (And hence we will use GMP : An easy way out) Exact computations are slow and can be avoided. One simple method : Use floating point to estimate whether you really need Exactness, only then invoke exact computation.
Convexity
A set S is convex if for any pair of points p,q S we have pq S.
p
q
non-convex
q
p
convex
Convex hulls
The smallest convex body that contains a set of points The intersection of all convex sets that contain S.
p0
p1p2p4p5
p6
p7p8
p9
p11p12