Date post: | 11-Aug-2015 |
Category: |
Science |
Upload: | vladimir-kulyukin |
Text Skew Angle Detection in
Vision-Based Scanning of Nutrition Labels
Tanwir Zaman, Vladimir Kulyukin
Department of Computer Science, Utah State University
Outline
● Background
● Text Skew Angle Detection with 2D Haar Wavelets
● Evaluation: Comparison of Proposed Algorithm with Postl's and Hull's Algorithms on Nutrition Label Images
Motivation
● OCR engines have significant difficulties with skewed texts
● If the text skew angle is known, the image can be rotated and then OCRed
● Or, which is cooler and faster, the image can be OCRed in place without any rotation
Text Skew Angle Detection Algorithms
● A variety of algorithms have been developed to determine text skew angle
● Many of these algorithms use horizontal & vertical projection profiles
● A horizontal projection profile is a 1D array whose size is equal to the number of rows in the image
● A vertical projection profile is a 1D array whose size is equal to the number of columns in the image
Horizontal & Vertical Projections
[Figure: a binary character image with its vertical projection (1 4 3 2 2 3 6 0) and horizontal projection (2 2 2 3 6 2 2 2)]
1) Vertical projection records count of black pixels in each column
2) Horizontal projection records count of black pixels in each row
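The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming a binary image stored as a list of rows in which 1 marks a black pixel; the function name `projections` and the sample `glyph` are hypothetical and do not come from the slides.

```python
def projections(image):
    """Return (horizontal, vertical) projection profiles of a binary image.

    `image` is a list of rows; 1 marks a black (text) pixel, 0 a white one.
    """
    horizontal = [sum(row) for row in image]        # black pixels per row
    vertical = [sum(col) for col in zip(*image)]    # black pixels per column
    return horizontal, vertical


# A hypothetical 4 x 5 glyph, for illustration only.
glyph = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
]
hp, vp = projections(glyph)
```

Both profiles sum to the total number of black pixels, which is a quick sanity check on any implementation.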
Computing Horizontal & Vertical Projections
● The image of every character from a given alphabet is rotated and horizontal & vertical projections are computed for every rotation
● In the example on the right the image of A is rotated by 90 degrees four times & two projections are computed for each angle
● Such projections are filed away and used at run time for text skew angle detection
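[Figure: the image of A rotated by 0, 90, 180, and 270 degrees, with the horizontal projection (HP) and vertical projection (VP) shown for each angle]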
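The pre-computation step can be sketched as follows. This is only an illustration under simple assumptions: the character is a small binary image stored as a list of rows, rotation is restricted to multiples of 90 degrees as in the example with the letter A, and the names `rotate90`, `projections`, and `build_templates` are hypothetical.

```python
def projections(image):
    """Return (horizontal, vertical) projection profiles of a binary image."""
    return [sum(row) for row in image], [sum(col) for col in zip(*image)]


def rotate90(image):
    """Rotate a binary image (list of rows) by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def build_templates(image):
    """Map each angle in (0, 90, 180, 270) to the projections at that angle."""
    templates = {}
    current = image
    for angle in (0, 90, 180, 270):
        templates[angle] = projections(current)
        current = rotate90(current)
    return templates


# A hypothetical L-shaped glyph, for illustration only.
glyph = [
    [1, 0],
    [1, 0],
    [1, 1],
]
templates = build_templates(glyph)
```

A finer angular resolution would require interpolating rotations, which this sketch deliberately leaves out.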
Text Skew Angle with Horizontal & Vertical Projections
● At run time, text is segmented into characters (this is not a trivial task, and is error-prone)
● A horizontal & vertical projection is computed for each character (or for selected characters)
● The computed projections are matched against the pre-computed horizontal & vertical projections
● The closest match determines the possible angle
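The run-time matching step might look like the sketch below. It assumes the sum of squared differences as the distance between profiles, with shorter profiles padded with zeros; both choices, and all function names, are assumptions made for illustration rather than details taken from the slides.

```python
from itertools import zip_longest


def projections(image):
    """Return (horizontal, vertical) projection profiles of a binary image."""
    return [sum(row) for row in image], [sum(col) for col in zip(*image)]


def rotate90(image):
    """Rotate a binary image (list of rows) by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def ssd(p, q):
    """Sum of squared differences; the shorter profile is zero-padded."""
    return sum((a - b) ** 2 for a, b in zip_longest(p, q, fillvalue=0))


def closest_angle(char_image, templates):
    """Return the template angle whose projections best match `char_image`."""
    hp, vp = projections(char_image)
    return min(
        templates,
        key=lambda a: ssd(hp, templates[a][0]) + ssd(vp, templates[a][1]),
    )


# Pre-compute templates for a hypothetical L-shaped glyph at 0/90/180/270.
glyph = [[1, 0], [1, 0], [1, 1]]
templates, current = {}, glyph
for angle in (0, 90, 180, 270):
    templates[angle] = projections(current)
    current = rotate90(current)

# A "scanned" character: the same glyph turned upside down.
query = rotate90(rotate90(glyph))
detected = closest_angle(query, templates)
```

In a real pipeline the query would come from character segmentation, which is the error-prone part noted above.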
Two Projection-Based Algorithms
● There are two seminal text skew angle detection algorithms: Postl's [1] & Hull's [2]
● Postl's algorithm calculates horizontal projection profiles for every character in the alphabet in small increments (e.g., in increments of 5 degrees) and uses the sum of squared differences for projection matching to determine the skew angle
● Hull's algorithm also uses projection profiles but rotates only black pixels instead of entire images
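To illustrate why rotating only black pixels is cheaper than rotating a whole image, the sketch below rotates a list of pixel coordinates and bins them into rows. It is not a reimplementation of either published algorithm; the function name `rotated_row_profile`, the rounding to the nearest row, and the sample `stroke` are assumptions.

```python
import math
from collections import Counter


def rotated_row_profile(black_pixels, angle_deg):
    """Horizontal projection profile after rotating only the black pixels.

    `black_pixels` is an iterable of (x, y) coordinates; the result maps each
    rounded row index to the number of rotated pixels that land in it.
    """
    t = math.radians(angle_deg)
    sin_t, cos_t = math.sin(t), math.cos(t)
    profile = Counter()
    for x, y in black_pixels:
        row = round(x * sin_t + y * cos_t)  # rotated y coordinate
        profile[row] += 1
    return profile


# A hypothetical diagonal stroke of five black pixels.
stroke = [(i, i) for i in range(5)]
unrotated = rotated_row_profile(stroke, 0)
aligned = rotated_row_profile(stroke, -45)
```

Only the black pixels are touched, so the cost grows with the amount of text rather than with the image area.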
Computing Horizontal, Vertical, & Diagonal Wavelets
1) 2D HWT is applied to the image a given number of times (twice in this case)
2) Each application of 2D HWT returns an array of four n x n matrices [AVRG, HC, VC, DC] (e.g., if the input image is 1024 x 1024, two applications return matrices of size 256 x 256)
3) AVRG is the matrix of averages; HC is the matrix of horizontal wavelets; VC is the matrix of vertical wavelets; DC is the matrix of diagonal wavelets
[Figure: the $HC_{r,c}$, $VC_{r,c}$, and $DC_{r,c}$ matrices]
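A minimal sketch of the 2D Haar step is given below. Sign and normalization conventions for Haar wavelets vary between sources; the one used here (differences divided by 4, "horizontal" meaning left minus right) is an assumption, as are the function names `haar2d_step` and `haar2d`.

```python
def haar2d_step(m):
    """One 2D Haar iteration on a square matrix with an even side length.

    Each 2 x 2 block [[a, b], [c, d]] yields one average and three changes.
    """
    n = len(m) // 2
    avg = [[0.0] * n for _ in range(n)]
    hor = [[0.0] * n for _ in range(n)]
    ver = [[0.0] * n for _ in range(n)]
    dig = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for col in range(n):
            a, b = m[2 * r][2 * col], m[2 * r][2 * col + 1]
            c, d = m[2 * r + 1][2 * col], m[2 * r + 1][2 * col + 1]
            avg[r][col] = (a + b + c + d) / 4
            hor[r][col] = (a - b + c - d) / 4  # left column minus right column
            ver[r][col] = (a + b - c - d) / 4  # top row minus bottom row
            dig[r][col] = (a - b - c + d) / 4  # main diagonal minus anti-diagonal
    return avg, hor, ver, dig


def haar2d(m, iterations):
    """Apply `haar2d_step` repeatedly to the averages; return the last result."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    avg = m
    for _ in range(iterations):
        avg, hor, ver, dig = haar2d_step(avg)
    return avg, hor, ver, dig


# A hypothetical 4 x 4 image with alternating bright and dark columns.
image = [[8, 0, 8, 0] for _ in range(4)]
avg1, hor1, ver1, dig1 = haar2d(image, 1)
avg2, hor2, ver2, dig2 = haar2d(image, 2)
```

Each iteration halves the side length, which is why two iterations take a 1024 x 1024 image down to 256 x 256.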
Binarizing Horizontal, Vertical, & Diagonal Wavelets
HC, VC, DC matrices are binarized to eliminate noise
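Binarization can be sketched as a simple magnitude threshold. The threshold value and the use of the absolute value are assumptions for illustration; the slides do not state how the cut-off is chosen.

```python
def binarize(matrix, threshold):
    """Set an entry to 1 if its magnitude exceeds `threshold`, else to 0."""
    return [[1 if abs(v) > threshold else 0 for v in row] for row in matrix]


# Hypothetical wavelet values: small ones are treated as noise.
hc = [[0.5, -6.0], [3.0, 12.0]]
hc_bin = binarize(hc, threshold=5.0)
```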
Combining Horizontal, Vertical, & Diagonal Wavelets
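Binarized HC, VC, DC matrices are combined into one matrix: each entry $C_{r,c}$ is computed from the corresponding entries $HC_{r,c}$, $VC_{r,c}$, and $DC_{r,c}$, where $C_{r,c}$ is the result matrix and $HC_{r,c}$, $VC_{r,c}$, $DC_{r,c}$ are the horizontal, vertical, and diagonal wavelet matrices, respectively.
[Formula: the operators that combine $HC_{r,c}$, $VC_{r,c}$, and $DC_{r,c}$ into $C_{r,c}$ are not legible in this copy]
[Figure: the $HC_{r,c}$, $VC_{r,c}$, $DC_{r,c}$ matrices and the combined matrix $C_{r,c}$]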
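The sketch below shows one way three binarized matrices can be merged element-wise. The voting rule and the `min_votes` parameter are assumptions made for illustration and are not necessarily the formula used in the presentation; with `min_votes=1` the rule reduces to a logical OR.

```python
def combine(hc, vc, dc, min_votes=1):
    """Mark an entry as text if at least `min_votes` of the three agree."""
    return [
        [1 if h + v + d >= min_votes else 0 for h, v, d in zip(hr, vr, dr)]
        for hr, vr, dr in zip(hc, vc, dc)
    ]


# Hypothetical binarized 2 x 2 matrices.
hc = [[1, 0], [0, 0]]
vc = [[1, 0], [0, 1]]
dc = [[0, 0], [0, 0]]
any_vote = combine(hc, vc, dc)               # logical OR
two_votes = combine(hc, vc, dc, min_votes=2)
```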
Computing Text Skew Angle
The convex hull algorithm is applied to the combined matrix $C_{r,c}$ to find the smallest rectangle around the text area; the rectangle is used to determine the text skew angle
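One way to go from a set of text pixels to an angle is sketched below: build the convex hull, try a rectangle aligned with each hull edge, keep the one with the smallest area, and report the direction of its longer side. Reporting the longer side, folding the angle into (-90, 90], and all function names are assumptions; in image coordinates, where y grows downward, the sign of the angle is mirrored.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect_angle(points):
    """Angle in degrees, in (-90, 90], of the longer side of the min-area rectangle."""
    hull = convex_hull(points)
    if not hull:
        raise ValueError("at least one point is required")
    best = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y1 - y0, x1 - x0)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Extent of the hull along this edge and along its normal.
        us = [x * cos_t + y * sin_t for x, y in hull]
        vs = [-x * sin_t + y * cos_t for x, y in hull]
        width, height = max(us) - min(us), max(vs) - min(vs)
        if best is None or width * height < best[0]:
            long_side = theta if width >= height else theta + math.pi / 2
            best = (width * height, long_side)
    angle = math.degrees(best[1])
    while angle > 90:
        angle -= 180
    while angle <= -90:
        angle += 180
    return angle


# Hypothetical text area: a 10 x 2 block of points rotated by 30 degrees.
t = math.radians(30)
block = [(0, 0), (10, 0), (10, 2), (0, 2), (5, 1)]
text_pixels = [
    (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))
    for x, y in block
]
angle = min_area_rect_angle(text_pixels)
```

The loop over hull edges relies on the known property that a minimum-area enclosing rectangle shares a side with the convex hull.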
Text Skew Angle Detection: Algorithmic Chain
Take Image → Apply 2D HWT → Binarize → Combine → Find Rectangle → Text Skew Angle
Text Skew Angle Computation: Pseudocode
Lines 1 – 4: Overall algorithm
Lines 5 – 14: Binarize Wavelets
Lines 16 – 26: Combine wavelet matrices; threshold combined pixel values; bind thresholded values with a rectangle; compute text skew angle
Image Sample
● 607 random RGB images* were selected from a set of 607 smartphone video recordings of common grocery products
● Two human volunteers were recruited to determine the text skew angle with an open source protractor program**
● The text skew angles determined by the volunteers serve as the ground truth
*Images are available at https://usu.app.box.com/s/9zk660t5h1g0dmw4pjj1x1yp6r7zovp3
**Open source protractor program http://sourceforge.net/projects/osprotractor/
Ground Truth with Open Source Protractor
Image on the right shows a human evaluator using OpenSourceProtractor to estimate text skew angle
Three Evaluated Algorithms
● Algorithm 1: T. Zaman, V. Kulyukin. "Text Skew Angle Detection in Vision-Based Scanning of Nutrition Labels." In Proceedings of the 19th International Conference on Image Processing, Computer Vision, & Pattern Recognition (IPCV 2015). Las Vegas, NV, USA
● Algorithm 2: Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986
● Algorithm 3: Hull, J.J. "Document image skew detection: survey and annotated bibliography," In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64
Error Dispersion Plots
[Figure: three error dispersion plots, one each for Algorithm 1, Algorithm 2, and Algorithm 3]
X-axis is the image number from 0 to 606; Y-axis is the text skew angle error compared to the ground truth (0 is the ground truth)
Performance Comparison Table
Table I. Processing time in milliseconds
              Algorithm 1   Algorithm 2   Algorithm 3
Time (ms)     341.37        6253.02       5908.18

Table II. Median error in text skew angle estimation
              Algorithm 1   Algorithm 2   Algorithm 3
Median Error  4.62          68.85         20.92
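A median error of this kind can be computed as sketched below. The sketch assumes that the error is the absolute difference between the estimated and the ground-truth angle; the angle values are made up for illustration and are not taken from the evaluation.

```python
from statistics import median


def median_abs_error(estimated, ground_truth):
    """Median of the absolute differences between paired angle values."""
    return median(abs(e - g) for e, g in zip(estimated, ground_truth))


# Hypothetical angles, for illustration only.
estimated = [12.0, -3.5, 40.0, 7.0, 0.0]
truth = [10.0, -5.0, 45.0, 7.5, 1.0]
error = median_abs_error(estimated, truth)
```

The median is less sensitive than the mean to a few badly misjudged images, which matters when some algorithms occasionally fail by a large margin.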
Observations
● Algorithm 1 has an average processing time of 341.37 ms, which is roughly 18 times faster than Algorithm 2 (Postl, 6253.02 ms) and 17 times faster than Algorithm 3 (Hull, 5908.18 ms), because Algorithm 1 does not perform any rotation of images
● For the sake of objectivity, it should be noted that Algorithms 2 & 3 were originally designed for document scanners with smaller text skew angles and ideal (more or less) lighting conditions
● Algorithm 1 has a lower median text skew angle error (4.62) than either Algorithm 2 (68.85) or Algorithm 3 (20.92)
● Error dispersion plots show that the errors of Algorithm 1 cluster more closely around the 0 line, which is the ground truth, than those of Algorithm 2 or Algorithm 3
Selected Paper References
[1] Postl, W. "Detection of linear oblique structures and skew scan in digitized documents." In Proc. of International Conference on Pattern Recognition, pp. 687-689, 1986.
[2] Hull, J.J. "Document image skew detection: survey and annotated bibliography," In J.J. Hull, S.L. Taylor (eds.), Document Analysis Systems II, World Scientific Publishing Co., 1997, pp. 40-64.