
IPCV 2015 Presentation: Text Skew Angle Detection in Vision-Based Scanning of Nutrition Labels

Date post: 11-Aug-2015

Text Skew Angle Detection in Vision-Based Scanning of Nutrition Labels

Tanwir Zaman, Vladimir Kulyukin

Department of Computer Science, Utah State University

Outline

● Background

● Text Skew Angle Detection with 2D Haar Wavelets

● Evaluation: Comparison of Proposed Algorithm with Postl's and Hull's Algorithms on Nutrition Label Images

Background

Motivation

● OCR engines have significant difficulties with skewed texts

● If the text skew angle is known, the image can be rotated and then OCRed

● Or, which is cooler and faster, the image can be OCRed in place without any rotation

Text Skew Angle Detection Algorithms

● A variety of algorithms have been developed to determine text skew angle

● Many of these algorithms use horizontal & vertical projection profiles

● A horizontal projection profile is a 1D array whose size is equal to the number of rows in the image

● A vertical projection profile is a 1D array whose size is equal to the number of columns in the image

Horizontal & Vertical Projections

[Figure: example binary image with its vertical projection (1 4 3 2 2 3 6 0) and its horizontal projection (2 2 2 3 6 2 2 2)]

1) Vertical projection records count of black pixels in each column

2) Horizontal projection records count of black pixels in each row
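The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name projection_profiles, the list-of-rows image representation, and the small example image are all assumptions, with 1 standing for a black pixel and 0 for a white one.

```python
def projection_profiles(image):
    """Return (horizontal, vertical) projection profiles of a binary image.

    image is a list of rows; 1 marks a black pixel, 0 a white one.
    """
    horizontal = [sum(row) for row in image]        # one count per row
    vertical = [sum(col) for col in zip(*image)]    # one count per column
    return horizontal, vertical


# Hypothetical 4 x 5 binary image (4 rows, 5 columns).
image = [
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

hp, vp = projection_profiles(image)
print(hp)  # [2, 4, 1, 0]
print(vp)  # [1, 3, 2, 1, 0]
```

The horizontal profile has one entry per row and the vertical profile has one entry per column, matching the array sizes stated earlier.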

Computing Horizontal & Vertical Projections

● The image of every character from a given alphabet is rotated and horizontal & vertical projections are computed for every rotation

● In the example on the right the image of A is rotated by 90 degrees four times & two projections are computed for each angle

● Such projections are filed away and used at run time for text skew angle detection

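[Figure: image of A rotated by 0, 90, 180, and 270 degrees, with the horizontal projection (HP) and vertical projection (VP) shown for each rotation]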
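The pre-computation step described above might look roughly like the sketch below. It assumes a square binary glyph stored as a list of rows and a clockwise rotation direction; the helper names (rotate90, projections, build_templates) and the L-shaped glyph are hypothetical and chosen only for illustration.

```python
def rotate90(image):
    """Rotate a binary image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def projections(image):
    """Return the (horizontal, vertical) projection profiles of image."""
    return [sum(r) for r in image], [sum(c) for c in zip(*image)]


def build_templates(glyph):
    """Map each angle in (0, 90, 180, 270) to the (HP, VP) pair of the rotated glyph."""
    templates = {}
    current = glyph
    for angle in (0, 90, 180, 270):
        templates[angle] = projections(current)
        current = rotate90(current)
    return templates


# Hypothetical 3 x 3 glyph shaped like the letter L.
glyph = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]

templates = build_templates(glyph)
for angle, (hp, vp) in templates.items():
    print(angle, hp, vp)
# 0 [1, 1, 2] [3, 1, 0]
# 90 [3, 1, 0] [2, 1, 1]
# 180 [2, 1, 1] [0, 1, 3]
# 270 [0, 1, 3] [1, 1, 2]
```

In a real system the table would hold one such entry per character and per rotation angle, and it would be stored for use at run time.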

Text Skew Angle with Horizontal & Vertical Projections

● At run time, text is segmented into characters (this is not a trivial task, and is error-prone)

● A horizontal & vertical projection is computed for each character (or for selected characters)

● The computed projections are matched against the pre-computed horizontal & vertical projections

● The closest match determines the possible angle
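The matching step can be illustrated with a short sketch. The slides do not say which distance is used for this generic matching, so the sum of squared differences used here is an assumption, as are the function names and the hard-coded template table (projections of a hypothetical 3 x 3 L-shaped glyph at four rotations). The sketch also assumes that the observed and stored profiles have the same length.

```python
def ssd(a, b):
    """Sum of squared differences between two equally long profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_angle(hp, vp, templates):
    """Return the template angle whose (HP, VP) pair is closest to (hp, vp)."""
    def distance(angle):
        t_hp, t_vp = templates[angle]
        return ssd(hp, t_hp) + ssd(vp, t_vp)
    return min(templates, key=distance)


# Hypothetical pre-computed projections: angle -> (HP, VP).
templates = {
    0: ([1, 1, 2], [3, 1, 0]),
    90: ([3, 1, 0], [2, 1, 1]),
    180: ([2, 1, 1], [0, 1, 3]),
    270: ([0, 1, 3], [1, 1, 2]),
}

# Projections of a segmented character with one pixel lost to noise.
observed_hp = [2, 1, 0]
observed_vp = [0, 1, 2]

angle = closest_angle(observed_hp, observed_vp, templates)
print(angle)  # 180
```

The noisy profiles are at distance 2 from the 180-degree template and at distance 6 or more from the others, so the closest match still picks the right rotation.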

Two Projection-Based Algorithms

● There are two seminal text skew angle detection algorithms: Postl's [1] & Hull's [2]

● Postl's algorithm calculates horizontal projection profiles for every character in the alphabet in small increments (e.g., in increments of 5 degrees) and uses the sum of squared differences for projection matching to determine the skew angle

● Hull's algorithm also uses projection profiles but rotates only black pixels instead of entire images
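The idea of rotating only the black pixels can be illustrated with the rough sketch below. It is not an implementation of Hull's or Postl's algorithm: the scoring function (sum of squared profile values, which is largest when text lines fall into few rows), the candidate angles, and all names are assumptions made for illustration.

```python
import math
from collections import Counter


def profile_energy(black_pixels, angle_deg):
    """Rotate only the black-pixel coordinates and score the row profile.

    The score is the sum of squared row counts, an assumed criterion.
    """
    theta = math.radians(angle_deg)
    s, c = math.sin(theta), math.cos(theta)
    # Only the rotated row coordinate is needed for a horizontal profile.
    rows = Counter(round(x * s + y * c) for x, y in black_pixels)
    return sum(n * n for n in rows.values())


def best_angle(black_pixels, candidates):
    """Return the candidate rotation angle with the highest profile energy."""
    return max(candidates, key=lambda a: profile_energy(black_pixels, a))


# Hypothetical unskewed text: two horizontal lines of 20 black pixels each.
pixels = [(x, y) for y in (2, 6) for x in range(20)]

print(best_angle(pixels, range(-10, 11, 5)))  # 0
```

Because only the coordinates of black pixels are transformed, no rotated image has to be built for each candidate angle, which is the saving the bullet above refers to.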

Text Skew Angle Detection with 2D Haar Wavelets

Computing Horizontal, Vertical, & Diagonal Wavelets

1) The 2D Haar wavelet transform (2D HWT) is applied to the image a given number of times (twice in this case)

2) Application of 2D HWT returns an array of four n x n matrices [AVRG, HC, VC, DC] (e.g., if the input image is 1024 x 1024, then after two applications the size of each of the returned matrices is 256 x 256)

3) AVRG is the matrix of averages; HC is the matrix of horizontal wavelets (HOR); VC is the matrix of vertical wavelets (VER); DC is the matrix of diagonal wavelets (DIG)

[Figure: HC[r,c], VC[r,c], and DC[r,c] matrices]
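One way to picture the transform is the sketch below, which processes the image in 2 x 2 blocks. It assumes a square image whose side is divisible by 2 at every iteration, at least one iteration, and a divide-by-4 normalization; the sign conventions for the horizontal, vertical, and diagonal wavelets, like the function names, are assumptions rather than details taken from the slides.

```python
def haar2d_step(m):
    """Apply one iteration of a 2D Haar transform to a square matrix m."""
    n = len(m) // 2
    avrg = [[0.0] * n for _ in range(n)]
    hc = [[0.0] * n for _ in range(n)]
    vc = [[0.0] * n for _ in range(n)]
    dc = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            # The 2 x 2 block [[tl, tr], [bl, br]] feeding output cell (r, c).
            tl, tr = m[2 * r][2 * c], m[2 * r][2 * c + 1]
            bl, br = m[2 * r + 1][2 * c], m[2 * r + 1][2 * c + 1]
            avrg[r][c] = (tl + tr + bl + br) / 4   # block average
            hc[r][c] = (tl - tr + bl - br) / 4     # left vs right change
            vc[r][c] = (tl + tr - bl - br) / 4     # top vs bottom change
            dc[r][c] = (tl - tr - bl + br) / 4     # diagonal change
    return avrg, hc, vc, dc


def haar2d(m, iterations):
    """Apply haar2d_step repeatedly to the averages (iterations >= 1)."""
    for _ in range(iterations):
        avrg, hc, vc, dc = haar2d_step(m)
        m = avrg
    return avrg, hc, vc, dc


# Hypothetical 4 x 4 image with alternating black and white columns.
image = [[1, 0, 1, 0] for _ in range(4)]
avrg, hc, vc, dc = haar2d(image, 1)
print(hc)  # [[0.5, 0.5], [0.5, 0.5]]
print(vc)  # [[0.0, 0.0], [0.0, 0.0]]
```

Each iteration halves the side length, which is why two iterations turn a 1024 x 1024 image into 256 x 256 matrices.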

Binarizing Horizontal, Vertical, & Diagonal Wavelets

HC, VC, DC matrices are binarized to eliminate noise
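The slides do not give the binarization rule, so the following is only a sketch under an assumed rule: a wavelet coefficient becomes 1 when its absolute value exceeds a threshold and 0 otherwise. The function name, the threshold value, and the example matrix are hypothetical.

```python
def binarize(matrix, threshold):
    """Set entries with |value| > threshold to 1 and all others to 0."""
    return [[1 if abs(v) > threshold else 0 for v in row] for row in matrix]


# Hypothetical 2 x 2 matrix of horizontal wavelet coefficients.
hc = [
    [0.50, -0.02],
    [0.01, -0.40],
]

print(binarize(hc, 0.1))  # [[1, 0], [0, 1]]
```

Small coefficients, which are more likely to come from noise than from text edges, are dropped, while large changes of either sign are kept.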

Combining Horizontal, Vertical, & Diagonal Wavelets

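Binarized HC, VC, DC matrices are combined into one matrix C using a formula that defines each entry C[r,c] in terms of HC[r,c], VC[r,c], and DC[r,c], where C[r,c] is the result matrix and HC[r,c], VC[r,c], DC[r,c] are the horizontal, vertical, and diagonal wavelet matrices, respectively.

[Formula: the operators of the combination formula are not recoverable from the transcript]

[Figure: HC[r,c], VC[r,c], DC[r,c], and the combined matrix C[r,c]]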

Computing Text Skew Angle

A convex hull algorithm is applied to the combined matrix C[r,c] to find the smallest rectangle around the text area; the rectangle is used to determine the text skew angle
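A minimal sketch of this step is given below, assuming the text pixels are available as (x, y) points. It uses Andrew's monotone chain for the convex hull and then tries a rectangle aligned with each hull edge, keeping the one with the smallest area; reporting the direction of the rectangle's longer side as the skew angle is an assumption, since the slides do not say how the angle is read off the rectangle. All names are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect_angle(points):
    """Angle in degrees, in [-90, 90), of the long side of the smallest enclosing rectangle."""
    hull = convex_hull(points)
    best_area, best_angle = None, 0.0
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # Hull coordinates in a frame whose first axis runs along this edge.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        width, height = max(us) - min(us), max(vs) - min(vs)
        area = width * height
        if best_area is None or area < best_area:
            best_area = area
            angle = math.degrees(theta) if width >= height else math.degrees(theta) + 90
            best_angle = (angle + 90) % 180 - 90   # normalize to [-90, 90)
    return best_angle


# Hypothetical combined matrix: an unskewed 3 x 10 block of text pixels.
C = [[1] * 10 for _ in range(3)]
points = [(c, r) for r, row in enumerate(C) for c, v in enumerate(row) if v]
flat_angle = min_area_rect_angle(points)

# The same kind of block, rotated by 30 degrees around the origin.
t = math.radians(30)
corners = [(0, 0), (10, 0), (10, 2), (0, 2), (5, 1)]
skewed = [(x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))
          for x, y in corners]
skew_angle = min_area_rect_angle(skewed)
print(round(flat_angle, 6), round(skew_angle, 6))
```

In image coordinates the y axis usually points down, so the sign of the reported angle depends on the coordinate convention in use.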

Text Skew Angle Detection: Algorithmic Chain

Take Image → Apply 2D HWT → Binarize → Combine → Find Rectangle → Text Skew Angle

Text Skew Angle Computation: Pseudocode

Lines 1 – 4: Overall algorithm

Lines 5 – 14: Binarize wavelets

Lines 16 – 26: Combine wavelet matrices; threshold combined pixel values; bind thresholded values with a rectangle; compute text skew angle

Evaluation

Image Sample

● 607 random RGB images* were selected from a set of 607 smartphone video recordings of common grocery products

● Two human volunteers were recruited to determine the text skew angle with an open source protractor program**

● The text skew angles determined are the ground truth

*Images are available at https://usu.app.box.com/s/9zk660t5h1g0dmw4pjj1x1yp6r7zovp3

**Open source protractor program http://sourceforge.net/projects/osprotractor/

Ground Truth with Open Source Protractor

The image on the right shows a human evaluator using Open Source Protractor to estimate the text skew angle

Three Evaluated Algorithms

● Algorithm 1: T. Zaman, V. Kulyukin. "Text Skew Angle Detection in Vision-Based Scanning of Nutrition Labels." In Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA

● Algorithm 2: Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986

● Algorithm 3: Hull, J.J. "Document image skew detection: survey and annotated bibliography," In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64

Error Dispersion Plots

[Figure: three error dispersion plots, one each for Algorithm 1, Algorithm 2, and Algorithm 3]

The X-axis is the image number from 0 to 606; the Y-axis is the text skew angle error compared to the ground truth (0 is the ground truth)

Performance Comparison Table

Table I. Processing time in milliseconds

            Algorithm 1   Algorithm 2   Algorithm 3
Time (ms)   341.37        6253.02       5908.18

Table II. Median error in text skew angle estimation

               Algorithm 1   Algorithm 2   Algorithm 3
Median Error   4.62          68.85         20.92
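Summary figures of this kind could be computed along the lines of the sketch below. It assumes that the error for an image is the absolute difference between the estimated and the ground-truth angle and that the reported time is a mean over images; the function name and the sample numbers are hypothetical and are not the evaluation data.

```python
from statistics import mean, median


def summarize(estimated, ground_truth, times_ms):
    """Return (median absolute angle error, mean processing time in ms)."""
    errors = [abs(e - g) for e, g in zip(estimated, ground_truth)]
    return median(errors), mean(times_ms)


# Hypothetical results for four images (angles in degrees, times in ms).
estimated = [10.0, -4.0, 31.0, 0.5]
ground_truth = [12.0, -5.0, 30.0, 0.0]
times_ms = [300.0, 350.0, 340.0, 370.0]

median_error, mean_time = summarize(estimated, ground_truth, times_ms)
print(median_error, mean_time)  # 1.0 340.0
```

The median is less sensitive than the mean to a few images with very large angle errors, which may be one reason to report it for skew estimation.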

Observations

● Algorithm 1 has an average processing time of 341.37 ms, which is about 18 times faster than Algorithm 2 (Postl, 6253.02 ms) and about 17 times faster than Algorithm 3 (Hull, 5908.18 ms), because Algorithm 1 does not perform any rotation of images; for the sake of objectivity, it should be noted that Algorithms 2 & 3 were originally designed for document scanners with smaller text skew angles and ideal (more or less) lighting conditions

● Algorithm 1 has a lower median text skew angle error than either Algorithm 2 or Algorithm 3

● Error dispersion plots show that the errors of Algorithm 1 cluster more closely around the 0 line, which is the ground truth, than those of Algorithm 2 or Algorithm 3

Selected Paper References

[1] Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986.

[2] Hull, J.J. "Document image skew detection: survey and annotated bibliography," In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64.

