Date post: 17-Dec-2015
3D Shape Histograms for Similarity Search and Classification in Spatial Databases

Mihael Ankerst, Gabi Kastenmüller, Hans-Peter Kriegel, Thomas Seidl

Univ. of Munich, Germany

Outline

Introduction

3D Shape Similarity Model

Quadratic Form Distance Functions

Extensibility of Histogram Models

Query Processing

Experimental Results and Conclusion

Introduction

Classification: the problem of assigning an appropriate class to the query object

Applications: molecular biology, medical imaging, mechanical engineering, astronomy

Objects of the same class have some characteristic properties in common. These could be geometric properties or thematic properties.

Classification in Molecular Databases

Classification schemata are already available

We need a fast filter classification algorithm

Dali System - a sophisticated classification algorithm for proteins

CATH – hierarchical classification of protein domain structures

Four levels: class, architecture, topology, and homologous superfamily.

Nearest Neighbor Classification

In general classification is done after training

Object is assigned if it matches the description of the class

Nearest neighbor classifiers: find the nearest neighbor of the query object and return its class

k-nearest neighbor classifiers: the parameters are the number k of neighbors and the weights of the neighbors
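
As a minimal sketch of the k-nearest neighbor idea, the following Python function lets the k closest training objects vote on the class of the query. The function name `knn_classify`, the toy data, and the inverse-distance weighting are illustrative assumptions; the slides do not state which weighting the authors use.

```python
import math
from collections import defaultdict

def knn_classify(query, labeled_points, k=3, weighted=True):
    """Return the class voted for by the k nearest neighbors of `query`.

    `labeled_points` is a list of (vector, class_label) pairs. With
    weighted=True each neighbor votes with weight 1 / (distance + eps),
    which is one common choice and only an assumption here.
    """
    neighbors = sorted(labeled_points,
                       key=lambda item: math.dist(query, item[0]))[:k]
    votes = defaultdict(float)
    for vector, label in neighbors:
        distance = math.dist(query, vector)
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)

# Toy training set with two classes (hypothetical data).
training = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"),
            ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(knn_classify((0.2, 0.1), training, k=1))
print(knn_classify((0.8, 0.9), training, k=3))
```

With k = 1 this reduces to the plain nearest neighbor classifier.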

Geometry Based Similarity Search

Spatial objects transformed into high dimensional vector space

In 2D, shapes can be represented as an ordered set of surface points, approximate rectangular coverings, etc.

Section coding technique: each polygon's circumcircle is decomposed into a number of sectors, and the polygon area falling into each sector is normalized by the total area of the polygon.

Similarity is defined in terms of Euclidean distance between resulting feature vectors.

Invariance Properties

Similarity models need to incorporate invariance against translation, rotation, scaling etc.

Most of the methods include a preprocessing step such as rotation of objects to a normalized orientation, translation of center of mass to origin etc.

Robustness against errors is not considered in most of these models

3D Shape Similarity Model

We extend the concept of section coding technique to 3D.

Shape Histograms – feature vectors

Quadratic Distance Function

Shape Histograms

Feature transform maps a complex object onto a feature vector in a multidimensional space.

3D shape histograms are also feature vectors

Based on partitioning the space into complete and disjoint cells called the bins of the histogram

We can use any space (geometric, thematic, etc.)

Shell Model

3D space is decomposed into concentric shells around the center point

The resulting histogram is independent of rotations around the center

Radii of the shells are determined from the extension of the objects

Shells of uniform thickness
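
The shell model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the points are assumed to be already translated so that the center is the origin, and the outer radius is taken from the farthest point as one possible reading of "the extension of the objects". The name `shell_histogram` is hypothetical.

```python
import math

def shell_histogram(points, num_shells):
    """Count points per concentric shell of uniform thickness around the origin."""
    radii = [math.sqrt(x * x + y * y + z * z) for x, y, z in points]
    max_radius = max(radii)
    histogram = [0] * num_shells
    for r in radii:
        if max_radius > 0:
            # Clamp so that the farthest point falls into the last shell.
            index = min(int(r / max_radius * num_shells), num_shells - 1)
        else:
            index = 0
        histogram[index] += 1
    return histogram

points = [(1, 0, 0), (0, 2, 0), (0, 0, 4), (0, -4, 0)]
# Cyclic permutation of the axes is a rotation around the center.
rotated = [(y, z, x) for x, y, z in points]
print(shell_histogram(points, 4))
print(shell_histogram(rotated, 4))
```

Because only the distance to the center enters the bin index, the rotated copy produces the same histogram.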

Sector Model

3D space is decomposed into sectors that emerge from the center point of the model

Distribute points uniformly on the surface of the sphere.

The Voronoi diagram gives an appropriate decomposition of the space.
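
A minimal sketch of the sector assignment: for unit-length sector centers on the sphere, the Voronoi cell containing a point is the one whose center has the largest dot product with the point. The six axis directions used as centers below are only a hypothetical choice for illustration; the slides do not say how many centers are used or how they are generated.

```python
# Hypothetical sector centers: the six axis directions (unit vectors).
SECTOR_CENTERS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def sector_index(point, centers=SECTOR_CENTERS):
    """Index of the spherical Voronoi cell (sector) that contains `point`.

    With unit-length centers, the smallest angle to a center corresponds
    to the largest dot product, so no explicit angle is computed.
    """
    x, y, z = point
    dots = [x * cx + y * cy + z * cz for cx, cy, cz in centers]
    return dots.index(max(dots))

def sector_histogram(points, centers=SECTOR_CENTERS):
    """Count the points of a centered object per sector."""
    histogram = [0] * len(centers)
    for point in points:
        histogram[sector_index(point, centers)] += 1
    return histogram

points = [(2, 0.1, 0), (0, -3, 1), (0.2, 0.1, 5), (1, 0, 0.5)]
print(sector_histogram(points))
```

Unlike the shell model, this histogram changes when the object is rotated, which is why a normalized orientation matters for sector-based models.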

Combined Model

Combination of shell and sector models

Results in a higher dimensionality

We can choose different combinations of shells and sectors for the same dimensionality
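
To illustrate how one dimensionality admits several shell and sector combinations, the sketch below enumerates the factorizations of a given number of bins and shows one possible way to number the combined bins. Both helper names and the row-major numbering are assumptions made for this example.

```python
def shell_sector_splits(dimension):
    """All (num_shells, num_sectors) pairs whose product is `dimension`."""
    return [(shells, dimension // shells)
            for shells in range(1, dimension + 1)
            if dimension % shells == 0]

def combined_bin(shell_index, sector_index, num_sectors):
    """Row-major index of the cell given by one shell and one sector."""
    return shell_index * num_sectors + sector_index

print(shell_sector_splits(12))
print(combined_bin(2, 3, num_sectors=4))
```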

Euclidean Distance

The Euclidean distance between two N-dimensional vectors p and q is given by $d(p, q) = \sqrt{\sum_{i=1}^{N} (p_i - q_i)^2}$

Individual components of the feature vectors are assumed to be independent

Relationships between the components, such as substitutability and compensability, cannot be taken into account

Euclidean Distance

Consider 3 objects a, b and c

We can clearly see that a and b are more closely related than a and c or b and c

However due to rotation, the peaks of ‘a’ and ‘b’ are mapped into different bins and hence the Euclidean distance does not reflect similarity in this case

Quadratic Form Distance Function

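The quadratic form distance function is defined in terms of a similarity matrix A: $d_A^2(p, q) = (p - q) \cdot A \cdot (p - q)^T = \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij} (p_i - q_i)(p_j - q_j)$

The components $a_{ij}$ of A represent the similarity of the components i and j in the underlying space

The Euclidean distance is the special case of the quadratic form distance where A = I, the identity matrix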
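
A small Python sketch of the quadratic form distance, written directly from the double sum over the similarity weights. The function names are illustrative. With the identity matrix as A it reproduces the Euclidean distance.

```python
import math

def quadratic_form_distance(p, q, A):
    """sqrt((p - q) . A . (p - q)^T) for a similarity matrix A (list of rows)."""
    diff = [pi - qi for pi, qi in zip(p, q)]
    n = len(diff)
    total = sum(diff[i] * A[i][j] * diff[j]
                for i in range(n) for j in range(n))
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(total, 0.0))

def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

p, q = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(quadratic_form_distance(p, q, identity(3)))
print(math.dist(p, q))
```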

Quadratic Form Distance Functions

The Euclidean distance of two vectors is totally determined: it has no parameters that could be adapted

The weighted Euclidean distance is a little more flexible, since it controls the effect of each individual vector component on the overall distance

On top of this, the general quadratic form distance function also specifies cross-dependencies between the dimensions

Quadratic Form Distance Functions

The neighborhood of the bins can be represented as the similarity weights

Let d(i,j) represent the distance between the cells that correspond to bins i and j

For shells the bin distance is the difference in the corresponding radii

For sectors the bin distance is the difference in the angles of sector centers

Quadratic Form Distance Functions

When provided with an appropriate distance function d(i,j), the similarity matrix can be computed as

$a_{ij} = e^{-\sigma \cdot d(i,j)}$

where the parameter σ controls the global shape of the similarity matrix.
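
The effect of the similarity matrix can be sketched with three toy shell histograms whose single peak sits in bin 0, bin 1, and bin 5. The bin distance d(i,j) = |i - j| assumes shells of unit thickness, and σ = 1.0 is an arbitrary value chosen for this illustration; all names are hypothetical.

```python
import math

def shell_similarity_matrix(num_bins, sigma):
    """a_ij = exp(-sigma * d(i, j)) with d(i, j) = |i - j| (unit shell thickness)."""
    return [[math.exp(-sigma * abs(i - j)) for j in range(num_bins)]
            for i in range(num_bins)]

def quadratic_form_distance(p, q, A):
    """sqrt((p - q) . A . (p - q)^T) for a similarity matrix A."""
    diff = [pi - qi for pi, qi in zip(p, q)]
    n = len(diff)
    total = sum(diff[i] * A[i][j] * diff[j]
                for i in range(n) for j in range(n))
    return math.sqrt(max(total, 0.0))

a = [1, 0, 0, 0, 0, 0]   # peak in bin 0
b = [0, 1, 0, 0, 0, 0]   # peak shifted to the neighboring bin
c = [0, 0, 0, 0, 0, 1]   # peak in a distant bin

A = shell_similarity_matrix(6, sigma=1.0)
print(math.dist(a, b), math.dist(a, c))
print(quadratic_form_distance(a, b, A), quadratic_form_distance(a, c, A))
```

The Euclidean distance treats both pairs as equally far apart, while the quadratic form distance ranks the histogram with the neighboring peak as closer. Larger values of σ make the matrix approach the identity, so the behavior moves back toward the Euclidean case.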

Invariance Properties

During normalization, we perform a translation and a rotation of all objects

The translation maps the center of mass onto the origin

The rotation is performed by a Principal Axes Transform

This generally leads to a unique orientation of the object

Principal Axes Transform

Compute the Covariance matrix for a given 3D set of points (x,y,z)

Principal Axes Transform

The eigenvectors of this matrix represent the principal axes of the original 3D point set

The eigenvalues indicate the variance of the points in the respective direction

As a result of the PAT, all the covariances of the transformed points vanish
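
The normalization can be sketched with NumPy as follows. This is an illustration, not the authors' code: the function name is hypothetical, the axes are ordered by decreasing variance as an assumed convention, and the sign ambiguity of the eigenvectors is left unresolved.

```python
import numpy as np

def principal_axes_transform(points):
    """Center a 3D point set and rotate it onto its principal axes."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)            # center of mass to the origin
    covariance = np.cov(centered, rowvar=False, bias=True)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]        # largest variance first
    axes = eigenvectors[:, order]
    return centered @ axes, eigenvalues[order]

# Hypothetical elongated point cloud.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 3)) @ np.array([[3.0, 1.0, 0.0],
                                              [0.0, 2.0, 1.0],
                                              [0.0, 0.0, 1.0]])
transformed, variances = principal_axes_transform(cloud)
print(np.round(np.cov(transformed, rowvar=False, bias=True), 6))
```

The printed covariance matrix of the transformed points is diagonal up to rounding, and its diagonal holds the eigenvalues.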

Extensibility of Histogram Models

Along with spatial properties we can also consider thematic properties

A general approach to managing both thematic and spatial properties is to use combined histograms

The bins of the combined histogram are the Cartesian product of the bins of the individual histograms

Query Processing

For the quadratic form distance function, the evaluation time for a single database object increases quadratically with the dimension

Optimal Multistep k-Nearest Neighbor Search

In order to achieve good performance, the paradigm of multistep query processing is used

An index-based filter step produces a set of candidates

Refinement step performs the expensive exact evaluation of the candidates

The filter step is responsible for completeness, the refinement step for correctness

Optimal Multistep k-Nearest Neighbor Search

Based on a multidimensional index structure, the filter step performs an incremental ranking

Objects are reported in order of increasing filter distance to the query

In order to guarantee that the filter step causes no false dismissals, the filter distance must lower-bound the object distance: $d_f(p, q) \le d_o(p, q)$

where $d_f$ is the filter distance and $d_o$ is the object distance
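
The interplay of filter and refinement can be sketched as follows. A sorted list stands in for the index-based incremental ranking, and the toy filter distance (difference in the first coordinate) is a hypothetical lower bound of the Euclidean object distance; neither is taken from the slides. The loop stops as soon as the next filter distance exceeds the k-th best object distance found so far, which is safe only because of the lower-bounding property.

```python
import heapq
import math

def multistep_knn(query, database, k, filter_dist, object_dist):
    """k-NN search that refines candidates in order of filter distance.

    Requires filter_dist(q, o) <= object_dist(q, o) for all objects.
    Returns the k nearest (distance, object) pairs and the number of
    expensive object-distance evaluations that were needed.
    """
    # Stand-in for an index-based incremental ranking.
    ranking = sorted(database, key=lambda obj: filter_dist(query, obj))
    best = []          # max-heap via negated distances: (-d_o, position, obj)
    evaluated = 0
    for position, obj in enumerate(ranking):
        d_max = -best[0][0] if len(best) == k else math.inf
        if filter_dist(query, obj) > d_max:
            break      # no remaining object can enter the result
        d = object_dist(query, obj)
        evaluated += 1
        if len(best) < k:
            heapq.heappush(best, (-d, position, obj))
        elif d < d_max:
            heapq.heapreplace(best, (-d, position, obj))
    result = sorted(((-neg, obj) for neg, _, obj in best), key=lambda t: t[0])
    return result, evaluated

database = [(0.1, 0.1), (0.2, 5.0), (0.3, 0.2), (4.0, 0.0), (6.0, 1.0), (7.0, 7.0)]
first_coordinate = lambda q, o: abs(q[0] - o[0])   # lower-bounds math.dist
result, evaluated = multistep_knn((0.0, 0.0), database, 2, first_coordinate, math.dist)
print([obj for _, obj in result], evaluated)
```

In this toy run only three of the six objects need an exact distance evaluation.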

Reduction in Dimensionality of Quadratic Forms

Objects in high dimensional spaces are managed by reducing their dimensionality

Typically this is done by Principal Component Analysis, Discrete Fourier transform, Similarity Matrix decomposition, Feature Subselection etc.

These approaches can also be used in case of Quadratic Form Distance

Reduction in Dimensionality of Quadratic Forms

An algorithm to reduce the similarity matrix from a high-dimensional space down to a low-dimensional space was developed in the context of multimedia databases.

The method guarantees three things:

The reduced distance function is a lower bound of the given high-dimensional distance function.

The reduced distance function is again a quadratic form.

The reduced distance function is the greatest of all lower-bounding distance functions in the reduced space.

Experimental Evaluation

Data is taken from the Brookhaven Protein Data Bank.

Molecules are represented as surface points for the computation of shape histograms

Reduced feature vectors for the filter step are managed by an X-tree of dimension 10.

Experimental Evaluation

Similarity matrices are computed by an adapted formula in which the similarity weights $a_{ij}$ of bins i and j are defined as

$a_{ij} = e^{-\sigma \cdot d(i,j)}$

with σ = 10

Basic Similarity Search

Classification by Shape Similarity

Every class has at least two molecules

From the preprocessing, 3422 proteins have been classified into 281 classes

Three models have been considered: the pure shell model, the pure sector model, and the combined model.

The accuracy for the combined model is the best
