UC Berkeley CS294-9 Fall 2000 4- 1 Document Image Analysis Lecture 4: Image Transformations Richard J. Fateman Henry S. Baird University of California – Berkeley Xerox Palo Alto Research Center
UC Berkeley CS294-9 Fall 2000 4- 1

Document Image Analysis
Lecture 4: Image Transformations

Richard J. Fateman
Henry S. Baird

University of California – Berkeley
Xerox Palo Alto Research Center

UC Berkeley CS294-9 Fall 2000 4- 2

The course so far…

• Reminder: all course materials are online: http://www-inst.eecs.berkeley.edu/~cs294-9/

• Overview of the DIA research field

• Some applications (postal addresses, checks)

• Research objectives: more systematic modeling, design

• Some basic engineering

UC Berkeley CS294-9 Fall 2000 4- 3

Some disclaimers: we are not experts

• contrast w/ computer vision, psychophysical image processing

• contrast w/ Gestalt theory, human reading, psychophysics of reading

UC Berkeley CS294-9 Fall 2000 4- 4

Do we attempt to emulate humans by programming? (Ha & Bunke paper)

• Image acquisition

• Image transformation

• Image segmentation

• feature extraction

• No, but we reach for similar goals

UC Berkeley CS294-9 Fall 2000 4- 5

Psychophysical questions

• Biological, especially human, vision represents the existence proof of algorithms that solve our problems

• How do brains learn to see/ connect to visual system? (the wiring is not encoded in genes): Self organization seems key.

UC Berkeley CS294-9 Fall 2000 4- 6

Psychophysical Reading

• How fast can one read?

• What about comprehension? (Typically, above 200 wpm comprehension declines.)

• What do we read? Words, not letters.

• How is reading disability (e.g. dyslexia) related to processing?

• What, if anything, has this to do with DIA?

UC Berkeley CS294-9 Fall 2000 4- 7

Computer Vision: different emphasis from DIA

• See for example, David Forsyth’s Computer Vision text

• recognition of objects, scenes, faces, patterns, visual memory; attention; and visual (and cognitive) pleasure

• change, motion, relationship to motor activities.

UC Berkeley CS294-9 Fall 2000 4- 8

Computer Vision: relations

• Solving CV would solve DIA

• Solving DIA (more likely in some senses) might serve as a paradigm for CV. At least if we did it in some respectable fashion.

• Actually, recent activity in Speech Understanding seems to be relevant to DIA...

UC Berkeley CS294-9 Fall 2000 4- 9

Gestalt Theory I

• Fundamentally, the issue is one of understanding invariance:

• How can an object, say a square or a triangle, be recognized regardless of its

• rotation,

• translation

• scale

• contrast

• outline or solid rendering

• texture, motion…

UC Berkeley CS294-9 Fall 2000 4- 10

UC Berkeley CS294-9 Fall 2000 4- 11

Gestalt Theory II

• Biological vision handles these easily.

• This suggests that invariance is fundamental to our visual representation.

• E.g. in the case of rotation invariance, perhaps we separately perceive/encode:

– structure

– orientation

UC Berkeley CS294-9 Fall 2000 4- 12

Gestalt Theory III

• We keep track of objects when we turn our head or walk

• Translational and rotational constancy of the perceived world vs. what is received

• Whatever the computational mechanism, it has to account for these issues (+ and -)

• A A A a a a

• Context: Univ. of Illinois, Chapter III, 3.l4l59

• durnptruck

UC Berkeley CS294-9 Fall 2000 4- 13

Examples of post-acquisition image analysis

• Preparation for OCR

• Not symbol- or character-based

• (We acknowledge that this is feedforward, and not optimal, but so it goes.)

UC Berkeley CS294-9 Fall 2000 4- 14

What can we do?

• Transform the image by local morphological computation

• Look for more global attributes (e.g. texture and FFT)

• If possible, do transformations on compressed form.

UC Berkeley CS294-9 Fall 2000 4- 15

Can we find some tools?

• Finding connected components

• Boundaries

• Morphological transforms

• Thinning or “Skeletonization”

• (gray-scale) contour following

• Edge encoding/vectorization

• Recursive X-Y cuts
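
The first tool on this list, connected components, can be sketched in a few lines. This is a minimal illustration, not the course's implementation: it assumes a binary image stored as a list of rows of 0/1 values, and the name `label_components` and its `connectivity` parameter are hypothetical.

```python
from collections import deque


def label_components(image, connectivity=8):
    """Label each foreground (1) pixel with a component number.

    Returns (labels, count); background pixels keep label 0.
    """
    rows, cols = len(image), len(image[0])
    if connectivity == 8:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    else:  # 4-connectivity: only edge-adjacent pixels count as neighbours
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not labels[r][c]:
                count += 1                      # start a new component
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                    # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy, dx in offsets:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Whether diagonal neighbours count (8- vs. 4-connectivity) changes the answer: two pixels touching only at a corner form one component under 8-connectivity and two under 4-connectivity.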

UC Berkeley CS294-9 Fall 2000 4- 16

e.g. Removing rotation (skew)

• Some excellent methods (e.g. HSB)

• Humans notice skew of even a fraction of a degree; it doesn’t inhibit our reading but it DOES make trouble for OCR.

• Removing skew approximately:

UC Berkeley CS294-9 Fall 2000 4- 17

Deskewing / matrix transform

True rotation

Again, at 90 degrees

Side slip
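
The figure for this slide is lost, so the following is only one plausible reading of its labels: a small rotation can be approximated by a horizontal shear ("side slip") followed by the same shear applied at 90 degrees (vertically). The sketch compares that approximation with a true rotation on a single point; the function names are hypothetical.

```python
import math


def rotate(point, theta):
    """True rotation of (x, y) by theta radians about the origin."""
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


def two_shear(point, theta):
    """Approximate the same rotation with two shears.

    Matrix form: [[1, 0], [t, 1]] @ [[1, -t], [0, 1]] with t = tan(theta),
    which agrees with the rotation matrix up to terms of order theta**2.
    """
    x, y = point
    t = math.tan(theta)
    x1 = x - t * y       # horizontal shear ("side slip")
    y1 = y + t * x1      # vertical shear (the same step at 90 degrees)
    return (x1, y1)
```

For a skew of about one degree the two results differ by well under a pixel even a thousand pixels from the origin, which is why shearing rows and columns by whole-pixel offsets can be an adequate approximate deskew.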

UC Berkeley CS294-9 Fall 2000 4- 18

Remove noise (many models)

• More later (HSB)

• A few for now

– Salt & Pepper

– Too much ink (blurred, touching)

– Too little ink (broken characters)

UC Berkeley CS294-9 Fall 2000 4- 19

Removing slant from characters

• Mask out horizontal lines (optional)

• Look for “best slant”

• ABCDEFGHIJKLMN

• ABCDEFGHIJKLMN
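
The slide does not say how the “best slant” is scored. One common choice, assumed here, is to shear the image by each candidate slope and keep the slope whose column projection is most sharply peaked (vertical strokes then line up in few columns). The names `column_profile` and `best_slant` and the sum-of-squares score are illustrative assumptions.

```python
def column_profile(image, slope):
    """Count ink per column after shifting each row left by slope * height.

    Height is measured in rows above the bottom row, so a positive slope
    undoes a rightward lean. Columns are keyed in a dict because shifted
    coordinates may be negative.
    """
    rows = len(image)
    profile = {}
    for r, row in enumerate(image):
        shift = round(slope * (rows - 1 - r))
        for c, v in enumerate(row):
            if v:
                profile[c - shift] = profile.get(c - shift, 0) + 1
    return profile


def best_slant(image, candidates):
    """Return the candidate slope giving the most concentrated column profile."""
    def score(slope):
        return sum(n * n for n in column_profile(image, slope).values())
    return max(candidates, key=score)
```

Masking out horizontal lines first, as the slide suggests, keeps long horizontal strokes from flattening the profile and hiding the peak.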

UC Berkeley CS294-9 Fall 2000 4- 20

Erode, Dilate, Open, Close

• Erosion: remove 1 layer of boundary

• Dilation: add 1 layer of boundary

• Open: E then D

• Close: D then E

• Hit/Miss
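
The four operations above can be sketched directly. This minimal version assumes a binary image as a list of rows of 0/1 values and a fixed 3x3 structuring element (which is what "1 layer of boundary" amounts to); pixels outside the image are treated as background, an assumption that affects results at the image border.

```python
def _window(image, r, c):
    """The 3x3 neighbourhood of (r, c); outside the image counts as 0."""
    rows, cols = len(image), len(image[0])
    return [image[y][x] if 0 <= y < rows and 0 <= x < cols else 0
            for y in (r - 1, r, r + 1) for x in (c - 1, c, c + 1)]


def erode(image):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [[int(all(_window(image, r, c))) for c in range(len(image[0]))]
            for r in range(len(image))]


def dilate(image):
    """Set a pixel if anything in its 3x3 neighbourhood is foreground."""
    return [[int(any(_window(image, r, c))) for c in range(len(image[0]))]
            for r in range(len(image))]


def opening(image):
    """Erode then dilate: removes specks thinner than the element."""
    return dilate(erode(image))


def closing(image):
    """Dilate then erode: fills holes and gaps thinner than the element."""
    return erode(dilate(image))
```

Opening a lone pixel removes it, while opening a 3x3 block returns the block unchanged; closing a 3x3 block with a one-pixel hole fills the hole.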

UC Berkeley CS294-9 Fall 2000 4- 21

[Figure: structuring elements]

UC Berkeley CS294-9 Fall 2000 4- 22

Objectives:

• SE1: looks for 3 horizontal dots

• SE2: identify

• SE3 & SE4: identify corners

• SE6: isolate lines 6 units apart.. Or..
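
The structuring elements themselves were in a figure that did not survive, so the shapes below are guesses made only to illustrate the hit-or-miss idea: each cell of the element says "must be ink" (1), "must be background" (0), or "don't care" (None). The function name and the element layouts are hypothetical.

```python
def hit_or_miss(image, se):
    """Mark pixels where the structuring element matches exactly.

    `se` is a small grid with odd dimensions, centred on the tested pixel;
    cells are 1 (ink required), 0 (background required) or None (ignored).
    Pixels outside the image are treated as background.
    """
    rows, cols = len(image), len(image[0])
    h, w = len(se), len(se[0])
    oy, ox = h // 2, w // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            match = True
            for i in range(h):
                for j in range(w):
                    want = se[i][j]
                    if want is None:
                        continue
                    y, x = r + i - oy, c + j - ox
                    have = image[y][x] if 0 <= y < rows and 0 <= x < cols else 0
                    if have != want:
                        match = False
            out[r][c] = int(match)
    return out


# Hypothetical elements in the spirit of the list above.
THREE_HORIZONTAL = [[1, 1, 1]]
UPPER_LEFT_CORNER = [[0, 0, None],
                     [0, 1, 1],
                     [None, 1, 1]]
```

With no 0 cells, as in `THREE_HORIZONTAL`, hit-or-miss reduces to erosion by that element; the 0 cells are what let an element such as `UPPER_LEFT_CORNER` respond to a corner and nothing else.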

UC Berkeley CS294-9 Fall 2000 4- 23

Segmentation by recursive X/Y Cuts “top down”
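
A minimal sketch of the top-down idea: project the ink onto the vertical axis, cut at empty bands, then do the same horizontally inside each piece, and recurse until no cut is possible. It assumes a binary image as a list of rows of 0/1 values; the names `xy_cut` and `min_gap` (the smallest empty band, at least 1, that counts as a cut) are illustrative.

```python
def _runs(indices, min_gap):
    """Group sorted indices into [start, end) runs split at gaps >= min_gap."""
    runs = [[indices[0], indices[0] + 1]]
    for i in indices[1:]:
        if i - runs[-1][1] >= min_gap:   # enough empty rows/columns: cut here
            runs.append([i, i + 1])
        else:
            runs[-1][1] = i + 1
    return runs


def xy_cut(image, box=None, min_gap=1):
    """Return leaf boxes (top, left, bottom, right), bottom/right exclusive."""
    if box is None:
        box = (0, 0, len(image), len(image[0]))
    top, left, bottom, right = box
    rows = [r for r in range(top, bottom) if any(image[r][left:right])]
    if not rows:
        return []                        # nothing but background in this box
    cols = [c for c in range(left, right)
            if any(image[r][c] for r in range(top, bottom))]
    row_runs = _runs(rows, min_gap)
    col_runs = _runs(cols, min_gap)
    boxes = []
    if len(row_runs) > 1:                # cut along empty horizontal bands
        for t, b in row_runs:
            boxes += xy_cut(image, (t, left, b, right), min_gap)
    elif len(col_runs) > 1:              # cut along empty vertical bands
        for l, r in col_runs:
            boxes += xy_cut(image, (top, l, bottom, r), min_gap)
    else:                                # no cut possible: a leaf block
        boxes.append((row_runs[0][0], col_runs[0][0],
                      row_runs[0][1], col_runs[0][1]))
    return boxes
```

The method depends on straight white gaps, so it assumes the page has already been deskewed; raising `min_gap` stops the recursion at coarser units such as paragraphs rather than characters.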

UC Berkeley CS294-9 Fall 2000 4- 24

Segmentation by Smearing

Smear horizontally until letters touch → words; smear more until words touch → lines

Smear vertically until lines touch → paragraphs
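
Smearing is usually implemented as run-length smoothing: any run of background between two ink pixels that is no longer than a threshold gets filled. The sketch below assumes a binary image as a list of rows of 0/1 values; the function names and the `max_gap` parameter are illustrative, and the right thresholds depend on scan resolution and font size.

```python
def smear_row(row, max_gap):
    """Fill background runs of length <= max_gap that lie between ink pixels."""
    out = row[:]
    last_ink = None
    for c, v in enumerate(row):
        if v:
            if last_ink is not None and 0 < c - last_ink - 1 <= max_gap:
                for x in range(last_ink + 1, c):
                    out[x] = 1
            last_ink = c
    return out


def smear_horizontal(image, max_gap):
    """Smear every row; a larger max_gap merges letters, then words."""
    return [smear_row(row, max_gap) for row in image]


def smear_vertical(image, max_gap):
    """Smear every column by transposing, smearing rows, transposing back."""
    transposed = [list(col) for col in zip(*image)]
    smeared = smear_horizontal(transposed, max_gap)
    return [list(row) for row in zip(*smeared)]
```

Running connected components on the smeared image then yields one blob per word, line, or paragraph, depending on the thresholds chosen.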

UC Berkeley CS294-9 Fall 2000 4- 25

Smearing example

character

word

line

paragraph

UC Berkeley CS294-9 Fall 2000 4- 26

Canonicalize elongated objects by thinning

• A A A

• These should all be “the same”

• Not useful for squares, circles

• Perhaps most useful for handwritten data

• Huge literature, far in excess of what it deserves (relative to usefulness)

• Nevertheless…

UC Berkeley CS294-9 Fall 2000 4- 27

Skeletonization Requirements

• Connected image regions → connected lines

• Result is minimally 8-connected

• Approximate “medial lines”

• Extraneous spurs should be minimized

• Loss of information makes it not always advisable.

UC Berkeley CS294-9 Fall 2000 4- 28

Medial Axis Computation

• For every point P in the object, locate the closest point on the boundary.

• If there are two such points (at the minimum distance) then P is on the Medial Axis

• Alternatively, think of pixels as point sources of a wave front. 2 waves meet at the MA.
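
On a pixel grid the "two closest boundary points" test is awkward to apply literally, so a common stand-in (used here, and not necessarily what the lecture had in mind) is to compute a distance transform and keep its local maxima, which is where the wave fronts meet. The sketch assumes a binary image as a list of rows of 0/1 values, uses the 4-distance (city-block) metric, and treats everything outside the image as background; the function names are hypothetical.

```python
def distance_transform_4(image):
    """4-distance from each object pixel to the nearest background pixel."""
    rows, cols = len(image), len(image[0])
    big = rows + cols                        # exceeds any distance in the image
    dist = [[big if v else 0 for v in row] for row in image]
    for r in range(rows):                    # forward pass: look up and left
        for c in range(cols):
            if dist[r][c]:
                up = dist[r - 1][c] if r > 0 else 0
                left = dist[r][c - 1] if c > 0 else 0
                dist[r][c] = min(dist[r][c], up + 1, left + 1)
    for r in reversed(range(rows)):          # backward pass: look down and right
        for c in reversed(range(cols)):
            if dist[r][c]:
                down = dist[r + 1][c] if r < rows - 1 else 0
                right = dist[r][c + 1] if c < cols - 1 else 0
                dist[r][c] = min(dist[r][c], down + 1, right + 1)
    return dist


def medial_axis_4(image):
    """Object pixels whose distance is not exceeded by any 4-neighbour."""
    dist = distance_transform_4(image)
    rows, cols = len(image), len(image[0])
    axis = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not dist[r][c]:
                continue
            around = [dist[y][x]
                      for y, x in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                      if 0 <= y < rows and 0 <= x < cols]
            if all(d <= dist[r][c] for d in around):
                axis[r][c] = 1
    return axis
```

For a 3-by-7 rectangle this marks the middle row plus the four corner pixels, the discrete remnant of the branches that run from the axis out to each corner. The marked set is not guaranteed to be connected, which is one reason skeletonization is usually done by thinning instead.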

UC Berkeley CS294-9 Fall 2000 4- 29

Medial Axis Computation

Medial axis and skeletons with

4-distance, 8-distance, Euclidean distance

(JR Parker)

UC Berkeley CS294-9 Fall 2000 4- 30

The computation is fragile

• The T-shaped object but with one pixel missing

UC Berkeley CS294-9 Fall 2000 4- 31

Iterative Morphological Thinning
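
The slide's figure is lost, so as an illustration here is one well-known iterative thinning scheme, the Zhang-Suen algorithm. It is a swapped-in example, not necessarily the method the slide showed. It assumes a binary image as a list of rows of 0/1 values and treats pixels outside the image as background.

```python
def zhang_suen_thin(image):
    """Peel boundary pixels in two alternating sub-passes until stable."""
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9 of the original paper: clockwise, starting from north.
        coords = [(r - 1, c), (r - 1, c + 1), (r, c + 1), (r + 1, c + 1),
                  (r + 1, c), (r + 1, c - 1), (r, c - 1), (r - 1, c - 1)]
        return [img[y][x] if 0 <= y < rows and 0 <= x < cols else 0
                for y, x in coords]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for r in range(rows):
                for c in range(cols):
                    if not img[r][c]:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)                       # number of ink neighbours
                    a = sum(1 for i in range(8)      # 0 -> 1 transitions around P1
                            if p[i] == 0 and p[(i + 1) % 8] == 1)
                    n, e, s, w = p[0], p[2], p[4], p[6]
                    if step == 0:                    # south-east boundary, NW corners
                        ok = n * e * s == 0 and e * s * w == 0
                    else:                            # north-west boundary, SE corners
                        ok = n * e * w == 0 and n * s * w == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_delete.append((r, c))
            for r, c in to_delete:                   # delete in parallel per sub-pass
                img[r][c] = 0
            changed = changed or bool(to_delete)
    return img
```

The transition count `a == 1` is what protects connectivity (a pixel bridging two strokes is never removed), and `b >= 2` keeps line endpoints from being eaten away. A stroke that is already one pixel wide passes through unchanged.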

UC Berkeley CS294-9 Fall 2000 4- 32

Hypermedia image processing reference ©

• http://www.cee.hw.ac.uk/hipr/html/thin.html

UC Berkeley CS294-9 Fall 2000 4- 33

Other approaches

• Cellular automata more generally

• Geometric computation (Voronoi diagrams)

• Stroke-based decomposition/syntactic generation

• Computation based on compressed version (RLE boundary), skew on CCs

