Image Processing on the GPU
Edge Detection (with implementation on a GPU)
and Text Recognition (if time permits)
Jared Barnes, Chris Jackson
Edge Detection
◦ Wikipedia: Identifying points in a digital image at which the image has discontinuities.
Edge Detection
http://4.bp.blogspot.com/-p_9w91wC_Rc/TbBgF7dQYhI/AAAAAAAAACM/DQTrM_a7Apg/s1600/edge-example-c.png
Canny Edge Detector
John Canny, “A Computational Approach to Edge Detection”, 1986
http://ieeexplore.ieee.org.libproxy.mst.edu/stamp/stamp.jsp?tp=&arnumber=4767851
1. Noise Removal
2. Image Gradient Computation
3. Non-Maximum Suppression
4. Hysteresis Thresholding
Canny Edge Detector
Gaussian Smoothing or Blurring
A pixel is changed based on a weighted average of itself and its neighbors
The number of neighbors (3x3, 5x5) and the relative weights can vary
Noise Removal
3D Gaussian Distribution
Normalized 2D Gaussian Approximation
http://www.librow.com/content/common/images/articles/article-9/2d_distribution.gif
http://homepage.cs.uiowa.edu/~cwyman/classes/spring08-22C251/homework/canny.pdf
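The relative weights for the smoothing step can be sampled from a 2D Gaussian and normalized so they sum to 1. The sketch below is only an illustration: the function name `gaussian_kernel` and the default `sigma` are assumptions, not values taken from the slides.

```python
import math

def gaussian_kernel(size=3, sigma=1.0):
    """Return a size x size list of weights sampled from a 2D Gaussian,
    normalized so that all weights sum to 1."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    # Unnormalized Gaussian sampled at integer offsets from the center.
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[w / total for w in row] for row in raw]

kernel = gaussian_kernel(3, 1.0)  # 3x3 neighborhood; sigma is an assumed value
```

A larger `size` or `sigma` spreads the weights over more neighbors, which smooths more and blurs more.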
Gaussian Smoothing – Avoid Blur
Too much
About right
http://media.tumblr.com/ccd6945141b46e5e2f5c36168f6a8037/tumblr_inline_mhcv1l0EZB1qz4rgp.png
http://www.eversparkinteractive.com/wp-content/uploads/2013/03/gaussian-blur-thumbnail.jpg
Spotty Smooth
Calculus!
First derivative in the X and Y directions (separately)
Gradient Computation
http://homepage.cs.uiowa.edu/~cwyman/classes/spring08-22C251/homework/canny.pdf
Gx Gy
Sobel Operator (2 kernels)
Then round to: 0° = ←→, 45° = ↗↙, 90° = ↑↓, 135° = ↘↖
Image Gradient: Sobel Operator
http://suraj.lums.edu.pk/~cs436a02/CannyImplementation.htm
X Gradient (changes in the horizontal direction; highlights vertical edges)
Y Gradient (changes in the vertical direction; highlights horizontal edges)
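As a rough sketch of the gradient step for a single pixel: apply the two Sobel kernels, combine Gx and Gy into a magnitude, and round the angle to 0°, 45°, 90°, or 135°. The function name `gradient_at` is hypothetical, the kernels are applied without flipping, and the sign of `SOBEL_Y` is chosen (as an assumption) so that "up" on the screen is the positive Y direction, matching the arrows above.

```python
import math

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
# Assumed sign convention: positive Gy means the image gets brighter upward.
SOBEL_Y = [[ 1,  2,  1],
           [ 0,  0,  0],
           [-1, -2, -1]]

def gradient_at(image, row, col):
    """Return (magnitude, rounded_angle) for one interior pixel."""
    gx = gy = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            value = image[row + dr][col + dc]
            gx += SOBEL_X[dr + 1][dc + 1] * value
            gy += SOBEL_Y[dr + 1][dc + 1] * value
    magnitude = math.hypot(gx, gy)
    # Opposite directions share an axis, so fold the angle into [0, 180).
    angle = math.degrees(math.atan2(gy, gx)) % 180.0
    # Round to the nearest multiple of 45; 180 wraps back to 0.
    rounded = (int((angle + 22.5) // 45) * 45) % 180
    return magnitude, rounded

brighter_right = [[0, 5, 10], [0, 5, 10], [0, 5, 10]]
brighter_up_left = [[4, 3, 2], [3, 2, 1], [2, 1, 0]]
right_result = gradient_at(brighter_right, 1, 1)
up_left_result = gradient_at(brighter_up_left, 1, 1)
```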
Make edges exactly one pixel thick
Look at the gradient magnitude of your 2 neighbors in the direction of your angle
Non-Maximum Suppression
Example 1: Angle = 135° ↘↖
35 35 50
35 50 40
50 40 40
The center pixel (50) is greater than both neighbors along ↘↖ (35 and 40). Keep it!
Example 2: Angle = 0° ←→
80 85 90
80 85 90
80 85 90
The center pixel (85) is not greater than its right neighbor (90). Kill it!
80 85 90
80 0 90
80 85 90
Non-Maximum Suppression
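The two examples can be checked with a small sketch. The `NEIGHBORS` table and the function name `suppress` are illustrative assumptions; rows are taken to grow downward, so "up" means row - 1.

```python
# Offsets (row, col) of the two neighbors along each rounded angle.
NEIGHBORS = {
    0:   ((0, -1), (0, 1)),    # left, right
    45:  ((-1, 1), (1, -1)),   # up-right, down-left
    90:  ((-1, 0), (1, 0)),    # up, down
    135: ((-1, -1), (1, 1)),   # up-left, down-right
}

def suppress(magnitude, angle, row, col):
    """Keep the pixel's magnitude only if it is strictly greater than both
    neighbors along its angle; otherwise return 0 (not an edge)."""
    center = magnitude[row][col]
    for dr, dc in NEIGHBORS[angle]:
        if magnitude[row + dr][col + dc] >= center:
            return 0
    return center

example_1 = [[35, 35, 50], [35, 50, 40], [50, 40, 40]]
example_2 = [[80, 85, 90], [80, 85, 90], [80, 85, 90]]
kept = suppress(example_1, 135, 1, 1)    # 50 beats 35 and 40
killed = suppress(example_2, 0, 1, 1)    # 85 loses to 90
```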
http://suraj.lums.edu.pk/~cs436a02/CannyImplementation.htm
Thick Edges
(Gradient Magnitude)
Thin Edges
(Gradient Magnitude)
Two thresholds are better than one!
If a pixel’s value is above Thigh, it’s an edge.
If a pixel’s value is below Tlow, it’s not an edge.
If a pixel’s value is between Thigh and Tlow, it might be an edge (provided it is connected to an actual edge)
Hysteresis Thresholding
Thigh = 45, Tlow = 35
Before:
0 0 0 0 0 0 0
0 50 50 50 50 50 0
0 0 0 50 0 0 0
0 0 0 50 0 0 0
0 36 0 50 0 40 0
39 0 0 50 0 0 39
0 38 0 50 0 43 0
0 0 0 50 0 0 0
0 40 40 50 40 40 0
0 0 0 0 0 0 0
After:
0 0 0 0 0 0 0
0 255 255 255 255 255 0
0 0 0 255 0 0 0
0 0 0 255 0 0 0
0 0 0 255 0 0 0
0 0 0 255 0 0 0
0 0 0 255 0 0 0
0 0 0 255 0 0 0
0 255 255 255 255 255 0
0 0 0 0 0 0 0
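One way to reproduce this example is a flood fill that starts at every strong pixel and spreads through connected weak pixels. This is a sketch under assumptions: it uses an explicit stack instead of recursion, treats all 8 touching neighbors as connected, and the name `hysteresis` is hypothetical.

```python
def hysteresis(magnitude, t_high, t_low):
    """Return a 0/255 edge map: strong pixels plus weak pixels connected to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[0] * cols for _ in range(rows)]
    # Seed the search with every strong pixel (above t_high).
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if magnitude[r][c] > t_high]
    for r, c in stack:
        edges[r][c] = 255
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                # A not-yet-marked pixel at or above t_low joins the edge.
                if (0 <= nr < rows and 0 <= nc < cols
                        and edges[nr][nc] == 0
                        and magnitude[nr][nc] >= t_low):
                    edges[nr][nc] = 255
                    stack.append((nr, nc))
    return edges

grid = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 50, 50, 50, 50, 50, 0],
    [0, 0, 0, 50, 0, 0, 0],
    [0, 0, 0, 50, 0, 0, 0],
    [0, 36, 0, 50, 0, 40, 0],
    [39, 0, 0, 50, 0, 0, 39],
    [0, 38, 0, 50, 0, 43, 0],
    [0, 0, 0, 50, 0, 0, 0],
    [0, 40, 40, 50, 40, 40, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
result = hysteresis(grid, t_high=45, t_low=35)
```

The 40s in the bottom bar survive because they chain back to a 50, while the isolated 36, 38, 39, 40, and 43 values never touch a strong pixel and are dropped.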
1. Smooth image to reduce noise
2. Calculate X & Y derivatives to get edges
3. Thin all edge widths to 1 pixel
4. Remove weak, unconnected edges
(ta da!)
Canny Edge Detector
How do we parallelize the Canny Edge Detector?
Parallelization!
Convolution – Independent of order
Parallelize: Canny Edge Detector
Image:
5 5 5
10 10 10
20 20 20
Kernel:
2 2 2
4 2 2
2 2 2
Element-wise Multiplication:
10 10 10
40 20 20
40 40 40
Sum All Values: 230
Divide by Kernel Sum (20): 11
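The arithmetic of this worked example can be sketched in a few lines. The image and kernel values are as read from the slide; the function name `weighted_average` and the use of integer division (which turns 11.5 into 11) are assumptions.

```python
def weighted_average(window, kernel):
    """Multiply element-wise, sum all values, then divide by the kernel sum."""
    total = sum(w * k
                for w_row, k_row in zip(window, kernel)
                for w, k in zip(w_row, k_row))
    kernel_sum = sum(sum(k_row) for k_row in kernel)
    return total, total // kernel_sum  # integer division drops the remainder

window = [[5, 5, 5], [10, 10, 10], [20, 20, 20]]
kernel = [[2, 2, 2], [4, 2, 2], [2, 2, 2]]
total, smoothed = weighted_average(window, kernel)  # 230 and 11
```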
Convolve a Gaussian Kernel with the image
Each pixel can be convolved with the Gaussian kernel independently, so the work can be spread across the GPU cores
One thread per pixel; for a 3x3 kernel each thread performs 9 multiplies, 9 adds, and 1 division
Embarrassingly parallel, with huge speedup
Step 1: Gaussian Smoothing
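To illustrate why this step parallelizes so well, the sketch below computes each output pixel from the input image alone, then shows that visiting the pixels in a shuffled order gives the same result. It is plain Python standing in for one GPU thread per pixel; the names `smooth_pixel` and `smooth_image`, the 1-2-1 style kernel, and the clamping of out-of-range neighbors to the border are all assumptions.

```python
import random

def smooth_pixel(image, kernel, row, col):
    """Work for one thread: weighted average of a pixel and its 3x3 neighborhood.
    Neighbors outside the image are clamped to the nearest border pixel."""
    rows, cols = len(image), len(image[0])
    total = kernel_sum = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r = min(max(row + dr, 0), rows - 1)
            c = min(max(col + dc, 0), cols - 1)
            weight = kernel[dr + 1][dc + 1]
            total += weight * image[r][c]
            kernel_sum += weight
    return total // kernel_sum

def smooth_image(image, kernel, order):
    """Visit pixels in the given order; outputs never feed back into inputs."""
    output = [[0] * len(image[0]) for _ in image]
    for row, col in order:
        output[row][col] = smooth_pixel(image, kernel, row, col)
    return output

image = [[10, 10, 10, 10], [10, 90, 10, 10], [10, 10, 10, 10]]
kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
pixels = [(r, c) for r in range(3) for c in range(4)]
shuffled = pixels[:]
random.shuffle(shuffled)
in_order = smooth_image(image, kernel, pixels)
any_order = smooth_image(image, kernel, shuffled)
```

Because `smooth_image` writes to a separate output buffer, no pixel ever reads a value another pixel has already smoothed, which is what makes the order irrelevant.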
Convolve two Sobel Kernels with the image
Wait, convolution again?
Same as previous step – we can even reuse the convolution function!
Step 2: Gradient Computation
Comparing 3 pixel gradient magnitudes and clearing the middle pixel or leaving it alone
Similar to convolution… but simpler! Each GPU thread owns a pixel:
1. Check gradient angle of pixel
2. Compare this pixel’s magnitude with two neighbors in the direction of its angle
3. If I’m greater than those neighbors, leave me alone; otherwise, mark me as “not an edge”
Less speedup than steps 1 and 2
Step 3: Non-Maximum Suppression
Mark pixels > Thigh as strong edges
Mark pixels < Tlow as not edges
Remaining pixels are weak edges; keep them only if they connect to a strong edge
Typically implemented with recursion
Each thread with a weak-edge pixel looks at nearest 2 neighbors to find a strong-edge pixel
With identical algorithms on CPU and GPU, speedup is marginal (memory accesses, not much processing)
Step 4: Hysteresis Thresholding
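As an alternative to recursion, a data-parallel variant can be sketched as repeated passes in which every weak pixel checks its touching neighbors for a strong pixel. This is a swapped-in illustration, not the implementation described above: it looks only at adjacent pixels and repeats until nothing changes. The state encoding (0 = not an edge, 1 = weak, 2 = strong) and both function names are assumptions.

```python
def hysteresis_pass(state):
    """One pass: each weak pixel (1) touching a strong pixel (2) becomes strong.
    Only the old state is read, so every pixel could run as its own thread."""
    rows, cols = len(state), len(state[0])
    new_state = [row[:] for row in state]
    changed = False
    for r in range(rows):
        for c in range(cols):
            if state[r][c] != 1:
                continue
            if any(state[nr][nc] == 2
                   for nr in range(max(r - 1, 0), min(r + 2, rows))
                   for nc in range(max(c - 1, 0), min(c + 2, cols))):
                new_state[r][c] = 2
                changed = True
    return new_state, changed

def hysteresis_iterative(state):
    """Repeat passes until no pixel changes; return the 0/255 map and pass count."""
    changed = True
    passes = 0
    while changed:
        state, changed = hysteresis_pass(state)
        passes += 1
    return [[255 if v == 2 else 0 for v in row] for row in state], passes

edge_map, passes = hysteresis_iterative([[2, 1, 1, 0, 1]])
```

Each pass only moves the strong label one pixel further, so long chains of weak pixels need many passes, each of which re-reads the whole image.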
Optical Character Recognition
http://www.flacom.com/content/uploads/2013/09/hello-world.jpg
Wikipedia: The mechanical or electronic conversion of images of printed text into computer-readable text.
http://hackadaycom.files.wordpress.com/2010/09/helloworldconsole.png
Label Connected Components
Look For Letters
Adjust for disconnected letters
I have the edges – Now what?
HELLO WORLD
E F
? ü j i
Create a list of components in the image
A component is simply a set of connected edges
1. Label each edge pixel with a unique component ID
2. Examine each pixel’s 8 touching neighbors and set that pixel’s ID to the smallest ID among itself and its edge-pixel neighbors
3. Repeat step 2 until no pixel IDs are changed
Connected Components Labelling
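The three labelling steps can be sketched as follows. This is a sequential illustration that updates IDs in place; the function name `label_components` and the choice of 0 as the background ID are assumptions.

```python
def label_components(edges):
    """edges: 2D list, nonzero = edge pixel. Returns a 2D list of component IDs
    (0 = background); pixels in the same component end with the same ID."""
    rows, cols = len(edges), len(edges[0])
    # Step 1: give every edge pixel a unique ID based on its position.
    labels = [[r * cols + c + 1 if edges[r][c] else 0 for c in range(cols)]
              for r in range(rows)]
    changed = True
    while changed:  # Step 3: repeat until no ID changes.
        changed = False
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] == 0:
                    continue
                # Step 2: adopt the smallest ID among the 8 touching neighbors.
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and 0 < labels[nr][nc] < labels[r][c]):
                            labels[r][c] = labels[nr][nc]
                            changed = True
    return labels

edges = [[1, 1, 0, 0, 1],
         [0, 1, 0, 0, 1],
         [0, 0, 0, 0, 0],
         [1, 0, 0, 1, 1]]
labels = label_components(edges)
component_ids = {v for row in labels for v in row if v}
```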
Uhh… what’s a letter?
How do we know it’s a letter?
How does the computer know it’s a letter?
Look For Letters
Letters are represented by a vector of numbers indicating the ratio of black pixels to white pixels in each division of the letter-image.
Wait, what is a letter?
[Figure: the letter A divided into regions, each labelled with a number for that region, e.g. 0, 40, 60, 55, 15, 10, 75]
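A letter vector of this kind could be computed roughly as below. To avoid dividing by zero in a cell with no white pixels, this sketch records the percentage of black pixels in each division instead of a literal black-to-white ratio; that choice, the function name `letter_vector`, and the requirement that the image divide evenly into the grid are all assumptions.

```python
def letter_vector(bitmap, grid_rows, grid_cols):
    """bitmap: 2D list with 1 = black, 0 = white. Returns one number per
    division: the percentage of pixels in that division that are black."""
    cell_h = len(bitmap) // grid_rows
    cell_w = len(bitmap[0]) // grid_cols
    vector = []
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            black = sum(bitmap[r][c]
                        for r in range(gr * cell_h, (gr + 1) * cell_h)
                        for c in range(gc * cell_w, (gc + 1) * cell_w))
            vector.append(100 * black // (cell_h * cell_w))
    return vector

# A tiny made-up 4x4 shape, split into a 2x2 grid of divisions.
bitmap = [[0, 1, 1, 0],
          [0, 1, 1, 0],
          [1, 0, 0, 1],
          [1, 1, 1, 1]]
vector = letter_vector(bitmap, 2, 2)
```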
Compute how closely each labelled component matches each letter in your alphabet
The component is then marked with whichever letter it most closely matches
Okay – Now look for letters
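Matching a component against the alphabet can be sketched as a nearest-neighbor search over letter vectors. The slides do not say how closeness is measured, so the squared Euclidean distance here is an assumption, and the reference vectors in `alphabet` are made-up values for illustration only.

```python
def distance(a, b):
    """Squared Euclidean distance between two letter vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_match(component_vector, alphabet):
    """alphabet: dict mapping letter -> reference vector.
    Returns the letter whose vector is closest to the component's vector."""
    return min(alphabet,
               key=lambda letter: distance(component_vector, alphabet[letter]))

# Hypothetical reference vectors, not real letter data.
alphabet = {
    "A": [50, 50, 75, 75],
    "L": [50, 0, 75, 75],
    "T": [75, 75, 50, 0],
}
guess = best_match([45, 55, 70, 80], alphabet)
```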
Letters like ‘i’ and ‘j’ have floating parts
Sometimes edge detection may accidentally break up a letter
A letter vector should then get an additional property indicating vertical discontinuity
Correct for Disjoint Letters
LETTER VECTOR: … … … 0/1