
MMDB-6 J. Teuhola 2012

6. Image databases

Image representations:
- Digitized (sampled) representation of field-based spatial data
- Also known as ‘raw’ images, digital images, bitmapped images, or raster images
- m × n matrix of pixels; resolution = sampling rate, e.g. pixels per inch
- Each pixel is represented by k bits (= accuracy = color depth), giving 2^k possible values.

Image types:
- Binary (bi-level) images (black = 0, white = 1; e.g. telefax)
- Grey-scale images (usually k = 8, which enables 256 grey levels)
- Color images (various representations)

Sources:
- Devices: scanner, digital camera, electron microscope, medical imaging devices (PET, MRI)
- Wavelengths: visible light, infrared, X-rays


Color images

‘True’ color schemes:
- Three components per pixel (possibly a 4th for the α-channel = transparency)
- RGB = Red - Green - Blue (typically 3 × 8 = 24 bits per pixel)
- CMY = Cyan - Magenta - Yellow (CMYK used in printing, K = black)
- HSI = Hue - Saturation - Intensity (used in image processing)
- YUV / YCbCr = Luminance (brightness) + 2 × chrominance (color)
  - Used in image compression (JPEG)
  - Correlations between color components are reduced.
  - Most information is collected in the Y-component.
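The luminance/chrominance split can be illustrated with a small conversion routine. This is only a sketch: the constants are the full-range ones commonly used with JFIF, and the function name `rgb_to_ycbcr` is chosen here for illustration, not taken from the slides.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (assumed JFIF-style constants)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue-difference chrominance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red-difference chrominance
    # Round and clamp each component to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


# Any grey pixel maps to neutral chrominance (128, 128): all information is in Y.
print(rgb_to_ycbcr(100, 100, 100))
```

For grey pixels both chrominance components sit at the neutral value, which is one way to see why most of the information ends up in the Y-component.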

Indexed color schemes:
- Palette of e.g. 256 colors
- Mapping table from color indices to RGB values
- Saves space; sufficient for many applications


Image formats

Tens of formats exist for different environments and applications, e.g.

- BMP = Bitmap image file (MS Windows)
- GIF = Graphics Interchange Format (indexed colors; includes compression, supported by web browsers)
- JBIG = Joint Bi-level Image Experts Group file interchange format
- JPEG = Joint Photographic Experts Group (JFIF = JPEG File Interchange Format)
- JP2 = JPEG 2000
- PBM = Portable Bitmap Format (black-and-white)
- PGM = Portable Greymap Format (grey-scale)
- PPM = Portable Pixmap Format (color)
- PNG = Portable Network Graphics
- TIFF = Tagged Image File Format (large number of options)


Image compression

Necessary for large image archives: saves space, reduces transmission time.

- Possible due to redundancy in images
- Several methods, specialized for different types of images
- Image formats with compression:
  - JPEG, based on cosine transform
  - JPEG 2000, based on wavelet transform
  - GIF, based on LZW string compression
  - PNG, based on LZ77 string compression
  - JBIG (bi-level images), based on prediction by context


Compression method characteristics

- Lossless / lossy methods: Can the original image be recovered precisely or only approximately? E.g. JPEG is typically lossy.
- Compression efficiency (bit rate), measured in bits/pixel
- Speed (separately for compression and decompression)
- Distortion (for lossy methods):
  - MAE = Mean Absolute Error
  - MSE = Mean Square of Errors
  - RMS = Root Mean Square error
  - SNR = Signal to Noise Ratio
  - PSNR = Peak Signal to Noise Ratio
- Robustness against transmission errors
- Blockiness, blurring, ... (for lossy methods)
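The distortion measures listed above can be sketched for two equally long sequences of grey-level values. The function name `distortion` and the default peak value 255 (8-bit pixels) are assumptions made for this example; SNR is left out because its definition varies between sources.

```python
import math


def distortion(original, decoded, peak=255):
    """Return MAE, MSE, RMS and PSNR (in dB) between two pixel sequences."""
    diffs = [a - b for a, b in zip(original, decoded)]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n
    mse = sum(d * d for d in diffs) / n
    rms = math.sqrt(mse)
    # PSNR is infinite when the images are identical (lossless case).
    psnr = math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
    return {"MAE": mae, "MSE": mse, "RMS": rms, "PSNR": psnr}


result = distortion([10, 20, 30, 40], [12, 18, 30, 44])
print(result["MAE"], result["MSE"])  # 2.0 6.0
```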


Searching from an image database

1. Using a hierarchical classification of images:

The user follows paths in the hierarchy, e.g.

Art works → Paintings → France → 18th century

2. Search using keywords in metadata

Images can be considered similar to documents with index terms

3. Search by content features

Pattern matching based on similarity with a query image, shape, color distribution, etc.


Feature extraction and indexing of images

Extraction of descriptive attributes from images:
- Manually, automatically, or using a hybrid scheme (automatic segmentation & manual assignment of properties).

Manual indexing:
- Performed by a ‘knowledge worker’, trained on the patterns and vocabulary of the image database application
- Multiple indexers: Strict consistency rules, common glossary.
- Automatic tools may help in pattern recognition.
- Each interesting object (spatial structure) is presented manually to the system for indexing, equipped with descriptive attributes.
- Assistance in selecting index terms: Hierarchical dictionaries, cross-referencing systems, domain thesaurus.
- Time-consuming and costly; community indexing is a possible alternative, cf. http://gimp-savvy.com/PHOTO-ARCHIVE/


Automatic indexing

- Specialized for various application domains (document recognition, optical character recognition (OCR), engineering drawings, X-rays, ...)
- The system must first ‘learn’ and categorize domain element objects.
- A certain amount of uncertainty (fuzziness) must be tolerated.
- Important area of automatic image analysis and object recognition: Transformation of paper documents into digital form, and indexing those documents appropriately (so-called document imaging, digital libraries).


Color feature extraction

Usually based on color histograms, i.e. the number of pixels of each color (or color component):

- Separate histograms can be built for various subregions of the image (e.g. top-left, top-right, middle, ...)
- The quantization can be made coarser than 0..255 by grouping adjacent histogram values, in order to reduce the dimensionality of the resulting feature vectors.

[Figure: three histograms, one each for RED, GREEN and BLUE; x-axis: component value 0..255, y-axis: #pixels]
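A per-component histogram with coarser quantization might be computed as follows. This is a minimal sketch: the function name `color_histogram`, the default of 4 bins per channel, and the normalization by pixel count are choices made for the example.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Build a normalized R, G, B histogram feature vector from 8-bit RGB pixels."""
    width = 256 // bins_per_channel          # number of adjacent values grouped per bin
    hist = [[0] * bins_per_channel for _ in range(3)]
    count = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel][min(value // width, bins_per_channel - 1)] += 1
        count += 1
    # Concatenate the three histograms into one feature vector of relative frequencies.
    return [h / count for channel in hist for h in channel]


features = color_histogram([(0, 0, 0), (255, 0, 0), (200, 64, 0), (10, 130, 255)])
print(len(features))  # 12 dimensions instead of 3 * 256
```

With 4 bins per channel the feature vector has 12 dimensions, which shows the dimensionality reduction mentioned above.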


Image segmentation

- Detection of interesting regions within images.
- A segment is a connected region that satisfies a homogeneity predicate.
- Basis for subsequent search.
- One of the most difficult tasks in image processing.
- Several possible (heuristic) methods.

Connected region: For each pair (x_1, y_1), (x_n, y_n) of pixels, there exists a chain of pixels {(x_1, y_1), ..., (x_n, y_n)} in the region such that (x_i, y_i) and (x_{i+1}, y_{i+1}) are adjacent for all i.
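The connectivity definition can be checked with a breadth-first search over the pixel set. The sketch below assumes 4-adjacency (horizontal and vertical neighbors only); the slides do not fix the adjacency relation, and the function name `is_connected` is hypothetical.

```python
from collections import deque


def is_connected(region):
    """Return True if every pixel in region is reachable from every other one."""
    region = set(region)
    if not region:
        return True
    start = next(iter(region))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        # Assumed 4-adjacency: right, left, down, up.
        for neighbor in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if neighbor in region and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(region)


print(is_connected({(0, 0), (1, 0), (1, 1)}))  # True
print(is_connected({(0, 0), (1, 1)}))          # False under 4-adjacency
```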


Examples of homogeneity predicates

- Binary images: p % of the pixels of the connected region have the same color (black or white).
- Classified grey-scales, e.g. 0...9, 10...19, etc.: A connected region is homogeneous if at least p % of its pixels belong to the same class.
- Dynamic grey-scale classification: Class boundaries are not predefined, but the interval size ε is: p % of the cells should have a grey-level within ε units.
- Grey-scale images with a reference function f for homogeneity: The number of pixels in { (x, y) : |grey-level(x, y) - f(x, y)| < ε } should be at least p % of the pixels in the region.
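Two of these predicates, fixed classes and dynamic classification, can be sketched as follows. The function names, the class width of 10, and the tolerance parameter `eps` (the interval size) are assumptions for the example; "within eps units" is read here as "the values fit in some interval of width eps".

```python
from collections import Counter


def homogeneous_classified(grey_levels, p=90, class_width=10):
    """At least p % of the pixels fall into the same predefined class."""
    classes = Counter(g // class_width for g in grey_levels)
    return 100 * max(classes.values()) >= p * len(grey_levels)


def homogeneous_dynamic(grey_levels, p=90, eps=10):
    """At least p % of the pixels fit into some interval of width eps."""
    values = sorted(grey_levels)
    best = low = 0
    for high, value in enumerate(values):
        while value - values[low] > eps:   # shrink the sliding window from the left
            low += 1
        best = max(best, high - low + 1)
    return 100 * best >= p * len(values)


levels = [8, 9, 11, 12, 50]
print(homogeneous_classified(levels, p=80))  # False: 8, 9 and 11, 12 land in different classes
print(homogeneous_dynamic(levels, p=80))     # True: 8..12 fit in one interval of width 10
```

The example region shows why dynamic classification can be preferable: values that straddle a fixed class boundary are still recognized as similar.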


Miscellaneous segmentation techniques

(a) Regular block segmentation:
- Example: Quadtree or binary tree decomposition until homogeneous regions are obtained.
- Does not usually satisfy the maximality condition for segmentation: Neighboring blocks may constitute a homogeneous region.
- Generalization of binary tree segmentation: blocks can be split in any direction: polygon segmentation. Compromise solution: split lines only in the 0°, 45°, 90°, and 135° directions.

(b) Splitting and merging:
- Augments category (a) methods to satisfy the maximality condition.
- Merging tests the obtained regions pairwise for homogeneity.
- Does not usually produce a unique segmentation for an arbitrary homogeneity predicate.
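Quadtree decomposition (the splitting part of category (a)) can be sketched recursively. The code assumes a square image whose side is a power of two and takes the homogeneity predicate as a parameter; the names `quadtree` and `all_equal` are illustrative only.

```python
def quadtree(image, x, y, size, is_homogeneous):
    """Split the block at (x, y) with the given side until every leaf is homogeneous."""
    block = [image[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size == 1 or is_homogeneous(block):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree(image, x + dx, y + dy, half, is_homogeneous)
    return leaves


def all_equal(block):
    """Strictest possible predicate: every pixel has the same value."""
    return len(set(block)) == 1


image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 1],
         [0, 0, 1, 1]]
print(quadtree(image, 0, 0, 4, all_equal))
```

In this example the leaves (0, 0, 2) and (0, 2, 2) are neighbors and both all-zero, so together they would form a larger homogeneous region. That is the maximality violation which a subsequent merging step is meant to repair.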


Miscellaneous segmentation techniques (cont.)

(c) Thresholding:
- Applicable, if objects of interest and the background have sufficiently distinct grey-level values.
- The grey-level histogram of the image has two or more peaks, between which we can choose the threshold grey-level values.
- Must usually be augmented with more sophisticated techniques.

(d) Region growing:
- Start from a set of seed points.
- Include neighboring pixels as long as homogeneity holds.
- Difficulty: How to choose the seeds?

(e) Edge-following algorithms:
- Follow a (hopefully circular) path of largest gradients (steepest slope) around the object to be detected.
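Region growing from a single seed might look like the sketch below. The homogeneity test used here (grey-level within `tolerance` of the seed's level) and the 4-neighborhood are assumptions; other predicates can be substituted.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Collect the pixels connected to seed whose grey level is close to the seed's level."""
    rows, cols = len(image), len(image[0])
    seed_level = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region \
                    and abs(image[nr][nc] - seed_level) <= tolerance:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


image = [[10, 12, 200],
         [11, 90, 210],
         [13, 14, 205]]
print(len(grow_region(image, (0, 0), tolerance=20)))  # 5
print(len(grow_region(image, (0, 0), tolerance=80)))  # 6, now includes the pixel with level 90
```

Starting from a different seed, for example (0, 2), yields the bright region instead, which illustrates why seed selection matters.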


Example: thresholding

Threshold = 128


Examples:

Region growing: Tolerance = 80

Edge detection: Convolution kernel:
   0 -1  0
  -1  4 -1
   0 -1  0
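Applying the 3 × 3 kernel from the edge detection example to a grey-scale image can be sketched as follows. Border pixels are simply left at zero here, which is one of several possible border treatments; the function name `convolve3x3` is hypothetical.

```python
KERNEL = [[0, -1, 0],
          [-1, 4, -1],
          [0, -1, 0]]


def convolve3x3(image, kernel=KERNEL):
    """Apply a 3x3 kernel to the interior pixels; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # The kernel is symmetric, so convolution and correlation coincide.
            out[r][c] = sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
                            for i in range(3) for j in range(3))
    return out


step = [[0, 0, 100, 100],
        [0, 0, 100, 100],
        [0, 0, 100, 100]]
print(convolve3x3(step)[1])  # [0, -100, 100, 0]
```

The response is zero in flat areas and large in magnitude next to the vertical edge.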


Segment feature extraction from images

Various approaches, e.g.
- area of the segment
- eccentricity/circularity
- shape approximation
- curvature

Desirable properties of segment features:
- Invariance to translation
- Invariance to rotation
- Invariance to scaling


Representing the shape of segments

See: Sven Loncaric: “A Survey of Shape Analysis Techniques”, Pattern Recognition, 31 (8), pp. 983-1001, 1998

Example: boundary scalar transform:

- Another possibility: tangent angles at regular intervals
- Both can be made rotation and scaling invariant.
- The resulting 1D-function is usually Fourier-transformed: amplitude (magnitude) values are rotation invariant; phase determines orientation and starting point.
- The lower-frequency Fourier coefficients can be used as the feature vector representing the shape. About 20 is often enough.
- Technical problems: Non-convex shapes; shapes with holes.
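A Fourier-based shape descriptor can be sketched with the standard library alone. The 1D function used here is the distance from the centroid to each boundary point, which is one possible boundary scalar transform and an assumption of this example. Dividing by the zero-frequency magnitude is likewise an assumed way to obtain scale invariance.

```python
import cmath
import math


def centroid_distances(boundary):
    """1D signature: distance of each boundary point from the centroid."""
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    return [math.hypot(x - cx, y - cy) for x, y in boundary]


def fourier_descriptor(signal, n_coeffs=20):
    """Magnitudes of the lowest-frequency DFT coefficients, normalized by the DC term."""
    n = len(signal)
    n_coeffs = min(n_coeffs, n // 2)
    magnitudes = []
    for k in range(n_coeffs + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(signal))
        magnitudes.append(abs(coeff))
    # Dropping the phase removes the dependence on the starting point.
    return [m / magnitudes[0] for m in magnitudes[1:]]


square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(fourier_descriptor(centroid_distances(square)))
```

Traversing the same square from a different starting point, or scaling it, leaves the descriptor unchanged up to rounding.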


Representing the shape of segments (cont.)

Syntactic techniques, e.g. encoding of the boundary into symbols representing quantized directions: forward, left, right, forward-left, forward-right, ...

String matching techniques (e.g. longest common subsequence) can be used to measure the distance between shapes.
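A longest-common-subsequence comparison of two direction strings can be sketched as follows. Turning the LCS length into a distance by counting the unmatched symbols is an assumption of this example, and the two direction strings are made up.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two symbol sequences."""
    previous = [0] * (len(b) + 1)
    for x in a:
        current = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def shape_distance(a, b):
    """Number of symbols that are not part of the common subsequence."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


shape1 = ["f", "f", "fr", "r"]
shape2 = ["f", "fr", "r", "l"]
print(lcs_length(shape1, shape2), shape_distance(shape1, shape2))  # 3 2
```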

Global scalar transform techniques, e.g. moments:

$$m_{pq} = \iint x^p \, y^q \, f(x, y) \, dx \, dy, \quad \text{where } p, q = 0, 1, \ldots$$

The (infinite) set of moments contains all the information about the shape; in practice, we take a limited number (say 20) of lower order moments. This is a straightforward technique for feature vector construction for shapes.
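For a digital image the moment integral becomes a sum over pixels. The sketch below treats the image as a 2D list in which f(x, y) is the pixel value (1 inside the segment, 0 outside); the function name `moment` is illustrative.

```python
def moment(image, p, q):
    """Discrete moment m_pq: sum of x^p * y^q * f(x, y) over all pixels."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))


segment = [[0, 1, 1],
           [0, 1, 1]]
area = moment(segment, 0, 0)                 # m00 = number of segment pixels
centroid_x = moment(segment, 1, 0) / area    # m10 / m00
centroid_y = moment(segment, 0, 1) / area    # m01 / m00
print(area, centroid_x, centroid_y)  # 4 1.5 0.5
```

The low-order moments already give familiar features: m00 is the area and the first-order moments give the centroid.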

[Figure: example boundary, encoded from the marked start point as the direction string (f,f,f,fr,f,fr,r,f,fr,l,fr,fr,f,r,f)]


Texture feature extraction: some approaches

(a) Pixel neighbourhood features

Measure directly the visual patterns occurring in the texture.

- Color co-occurrence matrix

Probability of co-occurring colors at a given distance & direction.

- Local binary patterns

Classifies pixel neighbourhood distributions at several distances
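A co-occurrence matrix for one displacement can be sketched as a dictionary of value pairs. The function name `cooccurrence` and the sparse dictionary representation are choices made for the example; the displacement (dx, dy) encodes the distance and direction.

```python
from collections import Counter


def cooccurrence(image, dx=1, dy=0):
    """Relative frequency of value pairs (v1, v2) found at displacement (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


texture = [[0, 0, 1],
           [0, 1, 1]]
print(cooccurrence(texture))  # {(0, 0): 0.25, (0, 1): 0.5, (1, 1): 0.25}
```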

(b) Transform-based features

Produce coefficients representing weights of spatial frequencies

- Fourier transform

Texture frequencies induce large related coefficients.

- Wavelet transform

Division of the image signal into ‘subbands’ by a low-/high-pass filter bank. High-pass coefficients tend to be close to zero. Various approaches and filters have been suggested.


Content-based retrieval from image databases

General property: Retrieval is not 100 % precise.

Query types:
- Find images having certain features, e.g. color, texture, shape, etc.
- Find images containing certain types of objects.
- Find image objects having certain attributes, such as shape (circle, triangle, arc, ...), size, color, etc.
- Find images where an object of type 1 is located left of an object of type 2 (= spatial relationship)
- Similarity search: Find images (segments) similar to a given query image (segment). Applications: recognition of persons from photographs / fingerprints, recognition of military airplanes, ships, etc.


Approaches to similarity-based retrieval

(a) Direct metric approach

A distance function is defined for images (segments)

Task: Find the nearest neighbor (k nearest neighbors) of the query image (segment).

Naive distance functions for m × n color images:
- L1-metric: Sum of pairwise Euclidean distances of RGB pixels
- L2-metric: Euclidean distance in (m × n × 3)-dimensional space.

See e.g. http://www.tineye.com

Plenty of computation needed.
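The two naive distance functions, as defined on this slide, can be sketched for images stored as 2D lists of (R, G, B) tuples of equal size. The function names are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math


def l1_distance(a, b):
    """Sum of the Euclidean distances between corresponding RGB pixels."""
    return sum(math.dist(p, q)
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))


def l2_distance(a, b):
    """Euclidean distance between the images viewed as (m * n * 3)-dimensional vectors."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for p, q in zip(row_a, row_b)
                         for x, y in zip(p, q)))


img1 = [[(3, 4, 0), (0, 0, 12)]]
img2 = [[(0, 0, 0), (0, 0, 0)]]
print(l1_distance(img1, img2), l2_distance(img1, img2))  # 17.0 13.0
```

Both functions touch every pixel component of both images, so a linear scan over a large database is expensive.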


Approaches to similarity-based retrieval (cont.)

(b) Feature-based metrics

Use feature extraction to reduce dimensionality, e.g.
- Shapes of segments in the image
- Color, texture, … features of segments or the whole images
- Different types of features are often combined.

The ‘true’ distance function d of images/segments and the distance function d’ of extracted feature vectors should satisfy approximately

d(a, b) < d(a, c) ⇒ d’(a, b) < d’(a, c)

Use indexing to accelerate similarity retrieval:
- Multidimensional indexing for feature groups
- Inverted indexing for distinct features
- Mixture of these
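Similarity retrieval over extracted feature vectors can be sketched as a k-nearest-neighbor search. The linear scan below is only a baseline without any index, and the image names and feature values are made up for the example.

```python
import heapq
import math


def k_nearest(query, features, k=1):
    """Return the names of the k feature vectors closest to the query (Euclidean distance)."""
    return heapq.nsmallest(k, features,
                           key=lambda name: math.dist(query, features[name]))


# Hypothetical 2-dimensional feature vectors, e.g. coarse color features.
features = {
    "sunset.jpg": [0.9, 0.1],
    "forest.jpg": [0.4, 0.6],
    "meadow.jpg": [0.5, 0.5],
}
print(k_nearest([0.5, 0.5], features, k=2))  # ['meadow.jpg', 'forest.jpg']
```

A multidimensional index replaces the linear scan when the database is large.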


Approaches to similarity-based retrieval (cont.)

(c) Transformation approach:

- Subsumes the metric approach.
- Basic idea: Dissimilarity is proportional to the minimum cost of transforming one image (segment) to the other.
- Choose the image which is the least dissimilar to the query image.
- Examples of transformation operators: Translation, rotation, scaling (reduction / magnification), extension, painting, etc.
- Each operator has an associated cost function. The total cost of transformation is the sum of elementary costs.
- Choose the minimum-cost chain of transformation steps for an image, then the minimum over all images in the target set.

More flexible than the metric approach; ‘users’ can specify their own transformation operators and cost functions.

Metric approach supports better indexing.


Example system: QBIC (Query By Image Content)

See: http://www.research.ibm.com/topics/popups/deep/manage/html/qbic.html

- Content-based finding of pictorial info from image & video databases
- Feature extraction in database loading:
  - Positional color/texture
  - Object identification: manual/semiautomatic/automatic segmentation
- Graphically expressed queries based on:
  - example images
  - user-constructed sketches and drawings (shape parameters)
  - color (principal color or color histogram)
  - texture patterns (coarseness, contrast, directionality)
  - camera and object motion (in videos)
- Distance functions between query and image features
- Fast searching:
  - Filtering and indexing
  - Reducing the dimensionality by transforms


Image database structures

The storage of the pixel matrix (possibly compressed) is usually sequential, because it spans several disk blocks.

Each image can be considered as a file of its own.

(a) Relational representation
- Image relation: Image id and image-level (global) properties.
- Object relation: Objects (segments, rectangles) within images; extracted manually or automatically. Attributes include: image id, object id, MBR coordinates, features.

Generalization: Probabilistic relations; object x is in image i with probability p.

Queries: Apply ‘normal’ database techniques using feature values in query conditions.
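The relational representation can be tried out with an in-memory SQLite database. All table names, column names and rows below are hypothetical; the `probability` column stands in for the probabilistic generalization.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE image (
        image_id INTEGER PRIMARY KEY, title TEXT, width INTEGER, height INTEGER);
    CREATE TABLE object (
        object_id INTEGER PRIMARY KEY,
        image_id INTEGER REFERENCES image(image_id),
        x_min INTEGER, y_min INTEGER, x_max INTEGER, y_max INTEGER,  -- MBR coordinates
        shape TEXT, probability REAL);
""")
conn.executemany("INSERT INTO image VALUES (?, ?, ?, ?)",
                 [(1, "harbour", 640, 480), (2, "airfield", 800, 600)])
conn.executemany("INSERT INTO object VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                 [(10, 1, 20, 30, 120, 90, "ship", 0.9),
                  (11, 2, 100, 100, 300, 220, "airplane", 0.8),
                  (12, 2, 400, 50, 450, 80, "ship", 0.3)])

# Images that contain a ship with probability at least 0.5.
rows = conn.execute("""
    SELECT DISTINCT i.title
    FROM image i JOIN object o ON o.image_id = i.image_id
    WHERE o.shape = 'ship' AND o.probability >= 0.5
    ORDER BY i.title
""").fetchall()
print(rows)  # [('harbour',)]
```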


Image database structures (cont.)

(b) Spatial representation
- E.g. using R-trees, R*-trees, etc.
- Build a single R-tree for all images in the database.
- A leaf page contains a set of closely-located objects (their MBRs), with a list of pointers to source images.
- Each list element contains the additional properties of the object.
- Separate indexes can be built for other than spatial properties.

General observations:
- In non-spatial respects, images can usually be treated as documents, and retrieved using techniques developed for general information retrieval.
- Combined usage of spatial and non-spatial criteria in retrieval is achieved simply by combining (union, intersection, etc.) the pointer lists from the related indexes.
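Combining pointer lists from several indexes amounts to set operations on image ids. The three result sets below are made-up stand-ins for what a spatial index, a feature index and a keyword index might return.

```python
# Hypothetical image-id lists returned by three separate indexes.
spatial_hits = {1, 2, 5}     # e.g. images with an object in the query window
feature_hits = {2, 5, 7}     # e.g. images with a matching color feature
keyword_hits = {5, 9}        # e.g. images whose metadata contains a keyword

all_criteria = spatial_hits & feature_hits & keyword_hits   # intersection (AND)
any_criterion = spatial_hits | feature_hits | keyword_hits  # union (OR)
spatial_only = spatial_hits - feature_hits                  # difference (AND NOT)

print(sorted(all_criteria), sorted(any_criterion), sorted(spatial_only))
# [5] [1, 2, 5, 7, 9] [1]
```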


Scenario for an image database architecture

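[Figure: architecture scenario. Indexes for global properties of images point to the image table (Iid, prop1 ... propm, Ptr), whose Ptr field refers to the stored Images. Indexes for object features in images and a spatial index for objects in images point to the object table (Oid, Iid, feat1 ... featn, Coord).]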
