A theory of origami world.pdf


Carnegie Mellon University

Research Showcase @ CMU

Computer Science Department School of Computer Science

1978

A theory of origami world

Takeo Kanade

Carnegie Mellon University


This Technical Report is brought to you for free and open access by the School of Computer Science at Research Showcase @ CMU. It has been accepted for inclusion in Computer Science Department by an authorized administrator of Research Showcase @ CMU. For more information, please contact [email protected].


NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS:

The copyright law of the United States (title 17, U.S. Code) governs the making

of photocopies or other reproductions of copyrighted material. Any copying of this

document without permission of its author may be prohibited by law.



Takeo Kanade

Department of Computer Science

Carnegie-Mellon University

Pittsburgh, Pa. 15213, USA

and

Department of Information Science

Kyoto University

Kyoto, JAPAN

September 20, 1978

Abstract

The recovery of three-dimensional configurations of a scene from its image is one of the most important steps in computer vision. The Origami world is a model for understanding line drawings in terms of surfaces, and for finding their 3-D configurations. It assumes that surfaces themselves can be stand-alone objects, unlike conventional models, such as the trihedral world, which assume solid objects. We have established a labeling procedure for this Origami world, which can find the 3-D meaning of a given line drawing by assigning one of the four labels, + (convex edge), - (concave edge), <-, and -> (occluding boundary), to each line. The procedure uses a filtering procedure not only for junction labels, as in the Waltz labeling for the trihedral world, but also for checking the consistency of surface orientations. The theory includes the Huffman-Clowes-Waltz labelings for the trihedral solid-object world as a subset. It shows great potential for the application of recovering 3-D configurations from region-segmented images; other information (such as spectral information) available from images can also be incorporated smoothly. This paper also reveals interesting relationships among previous research in polyhedral scene analysis.

This research was supported by the Defense Advanced Research Projects Agency under contract no. F33615-78-C-1551 and monitored by the Air Force Office of Scientific Research.


Origami Kanade

I Introduction

Origami is the Japanese traditional manual art of making various shaped objects (e.g., animals) by folding a sheet of paper. Figure 1 is a typical example of Origami. It is easy to see that Figure 1 is an Origami crane. This process of seeing and understanding may be divided into two processes: one is to determine the possible three-dimensional configurations from the picture, and the other is to match them with some known concepts (such as "crane"). This paper deals with the first process. Thus the problem is: how do we understand the possible three-dimensional configurations from a collection of lines?

One solution is: first, model a world (I will call it the "Origami" world), where surfaces themselves can be stand-alone objects, rather than the conventional trihedral solid-object world; secondly, establish a procedure which can assign a 3-D meaning to each line. The procedure developed uses a filtering method both for finding consistent combinations of labels and for testing the consistency of surface orientations based on the gradient space representation. Not only does this surface-oriented Origami world include the cases of the solid-object world, studied by Huffman [Huffman, 1971], Clowes [Clowes, 1971], and Waltz [Waltz, 1972], as a subset, but it also demonstrates various features that have the potential to be used in image understanding tasks of real-world images.

Figure 1 Origami crane.


Figure 2 Photograph of a carton paper box.

Figure 3 Line drawing of a carton paper box.



II Key Ideas and Related Work

Illustrative Examples

Let us consider several illustrative examples of simple line drawings for the following discussions. Suppose that an image of a box made of carton paper (Figure 2) is given. How do we recognize that the object in the image has the shape of a "box" (i.e., an open-faced cube)? A line drawing derived from the image, such as Figure 3, has long been an important product of the initial feature extraction process. In fact, we can imagine the three-dimensional shape of a "box" from Figure 3. As other examples, the drawings in Figure 4 usually convey to the viewer the meaning intended by the artist: (a) a cube, (b) a W-folded paper, and (c) two intersecting coordinate planes. However, take Figure 4(a) for example: other possible configurations, such as those in Figure 5, are imaginable.



Figure 8 Junction types treated in this paper: L, ARROW, FORK, and T.

Figure 10 Examples of configurations at vertices: (a) one quadrant plane which generates an L junction; (b) two quadrant planes which generate an ARROW junction; (c) three quadrant planes which generate a T junction; (d) four quadrant planes which generate a T junction.


III The Theory of Origami World

The presentation of the theory of the Origami world consists of seven subsections:

(1) Terminology;

(2) The enumeration of legal combinations of line labels at junctions;

(3) The links between regions;

(4) The problem concerning consistency of surface orientations;

(5) A test method for consistency of surface orientations;

(6) The actual labeling procedure;

(7) Examples of labeling.

III-1 Origami World and Terminology

The world is assumed to be made of a collection of surfaces. A line drawing is a picture (orthographic projection) of such a composite in the scene. For the time being the surfaces in our Origami world are assumed to be planar; i.e., the orientation is constant throughout a surface (actually, the restriction to plane surfaces will be relaxed a little in the later sections). In this respect it is not the paper-surface (i.e., developable surface) world investigated in [Huffman, 1976].

The terminology we will use for the Origami world parallels that for the Waltz labeling theory for the trihedral world [Waltz, 1972]. An edge is a straight boundary of a planar surface. A vertex is a point where edges of the surface(s) meet. A line is an orthographic projection of an edge to the picture plane. A junction is a point in the picture where lines meet. A junction can be the projection of a vertex or the point where an edge is interrupted by an occluding surface. A region is an area in the picture surrounded by lines, and corresponds to (a visible part of) a surface.

An edge can be classified according to its three-dimensional physical meaning in the scene. We will use the following terms and labels:

convex + : edge along which two surfaces meet and form a convexity

concave - : edge along which two surfaces meet and form a concavity

occluding <- or -> : edge along which one surface occludes another

The directions of the arrows of occluding boundaries are given in such a way that the occluding surface is on their right-hand side. A line can therefore be labeled with one of the four labels (+, -, <-, ->).



labels to the lines in the drawing is to give a three-dimensional meaning to the drawing. A set of assignments of line labels to the lines in the drawing is called an interpretation of the drawing. For example, the labeling shown in Figure 6 is an interpretation of Figure 4(a).

Junctions are classified according to the number of lines meeting at the junctions and their geometrical configurations in the picture. In this paper we will confine ourselves to the L, ARROW, FORK and T junctions shown in Figure 8.

III-2 The Enumeration of Legal Junction Labels

The physical world imposes constraints on the labels that lines can take at a particular type of junction. A combination of line labels for one junction type is referred to as a junction label. The crucial observation which was made by Huffman and Clowes, and which was exploited to a great extent by Waltz, is that not all the combinatorially possible junction labels can appear (are legal) in the picture. For example, for the ARROW junction, only three junction labels out of the 4x4x4 possible combinations can occur in the trihedral world. Needless to say, unless we assume a certain restriction on the three-dimensional configurations allowable at the vertices, the resultant constraints on junction labels will be too weak to be useful. We need to confine ourselves to a reasonably limited world which corresponds well to real-world images.

The confinement we adopt in the Origami world is that surfaces meet edge to edge, that no more than three surfaces of different orientations can meet at a vertex, and that the combination of the three orientations is "general", in the sense that they span the three-dimensional space (i.e., each orientation has a vector component perpendicular to the other two). Thus, no more than three edges of different directions are involved at a vertex. Let us call such vertices up-to-3-surface vertices. This restriction corresponds to the trihedral vertices in the solid-object world. Note, however, that the up-to-3-surface vertices generate a richer world than the world generated by the trihedral vertices, since the former can include 1- and 2-surface vertices; that is, it allows free extending surfaces as stand-alone objects.

Possible junction labels for the up-to-3-surface vertices in the Origami world can be enumerated in the following way. The planes of three general orientations intersect and divide each other into 12 partial planes. Thus we can think of 12 quadrant plane surfaces around the vertex point as shown in Figure 9.

Let us fix our eye position in one of the eight octants separated by the quadrant planes, say, the octant bounded by the quadrants 0, 4, and 7. Next, we generate one by one all the possible (4096) combinations by setting each quadrant plane to be either occupied or vacant, and check how the vertex formed at the origin appears when viewed from the eye position fixed as above. Then we can give a label to each line at the junction based on its meaning, and obtain a legal junction label. Figures 10(a) through 10(d) show examples of the vertex configurations and their derived junction labels. As previously stated, we consider only the combinations which result in the junction types shown in Figure 8. The number of junction labels thus obtained is: 8 for L, 15 for ARROW, 9 for FORK, and 12 for T.
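The count of 4096 combinations follows from 12 quadrant planes that are each either occupied or vacant (2^12). The following is a minimal sketch of that enumeration step only, assuming the quadrants are simply numbered 0 to 11; the function name `occupancy_patterns` is hypothetical, and the geometric step that turns each pattern into a junction label is not reproduced here.

```python
from itertools import product

NUM_QUADRANTS = 12  # quadrant planes around the vertex (Figure 9)

def occupancy_patterns():
    """Yield every occupied/vacant assignment for the 12 quadrant planes.

    Each pattern is a tuple of 12 booleans; True means "occupied".
    """
    return product((False, True), repeat=NUM_QUADRANTS)

patterns = list(occupancy_patterns())
one_quadrant = [p for p in patterns if sum(p) == 1]  # a single quadrant plane occupied
```

Each pattern would then have to be viewed from the fixed eye position and classified by junction type, which requires the geometry of Figure 9.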

For the junction type T, the four additional junction labels shown in Figure 11 are included as legal. They do not correspond to actual vertices, but to the cases in which the junction is caused because the upper half plane is in the front and occludes the edge behind. Table 1 compares the number of legal up-to-3-surface junction labels thus obtained with that of legal trihedral junction labels. It gives an idea of the degree of constraint imposed by the up-to-3-surface Origami world compared with the Huffman-Clowes trihedral-junction world. The appendix gives a complete list of legal junction labels in the Origami world; for each junction label, it includes an illustrative figure of the configuration which the label represents, and the links which will be explained next.

Figure 11 Legal T junction labels not corresponding to vertices.

Table 1. Comparison of the size of the Origami junction dictionary with the Huffman-Clowes dictionary.

Junction Type    Huffman-Clowes Dictionary    Origami World Dictionary

L                6                            8

ARROW            3                            15

FORK             3                            9

T                4                            16
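To make the degree of constraint concrete, the dictionary sizes in Table 1 can be compared with the number of combinatorially possible labelings (four labels per line, so 4x4 for the two-line L junction and 4x4x4 for the three-line junctions). This is a small arithmetic sketch; the names `LINES_PER_JUNCTION` and `legal_fraction` are illustrative assumptions, not part of the paper.

```python
# Number of lines meeting at each junction type treated in the paper.
LINES_PER_JUNCTION = {"L": 2, "ARROW": 3, "FORK": 3, "T": 3}

# Dictionary sizes copied from Table 1.
HUFFMAN_CLOWES = {"L": 6, "ARROW": 3, "FORK": 3, "T": 4}
ORIGAMI = {"L": 8, "ARROW": 15, "FORK": 9, "T": 12 + 4}  # 12 vertex labels + 4 occlusion labels

def legal_fraction(dictionary):
    """Fraction of the 4**n label combinations that the dictionary allows."""
    return {j: size / 4 ** LINES_PER_JUNCTION[j] for j, size in dictionary.items()}

origami_fraction = legal_fraction(ORIGAMI)
trihedral_fraction = legal_fraction(HUFFMAN_CLOWES)
```

The Origami dictionary is larger for every junction type, which is why the junction dictionary alone constrains interpretations less than in the trihedral world.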


III-3 The Links

Each junction label derived in the previous subsection implies which surfaces are connected at which edge in order to form that junction label. For later use, this information is also stored explicitly in the dictionary by means of a link, which links a pair of connected regions and points to the line at which they intersect. For example, the link in a legal FORK junction label shown in Figure 12 represents that the regions R1 and R2 are connected at the convex line L. Note that since the region R3 is totally occluded by the other two regions (in other words, it is the background), it has no relationship to the others at this junction.

In the case of junction labels which involve partially occluded regions, a subtle situation occurs. Take the ARROW junction label shown in Figure 13(a) as an example. The junction label was originally derived from the configuration shown in Figure 13(b); the surfaces S1 and S2 connect at the edge BC. However, note that the junction label itself can mean other cases such as those shown in Figures 13(c) and 13(d): (c) is the case where S1 and S2 intersect within the angle ABC, and (d) is the case where S1 and S2 will intersect outside of the angle ABC, when they are extended.

In the Origami world we will assume that the situation shown in Figure 13(c) is what is happening near the vertex. This assumption allows more configurations than assuming merely the case of Figure 13(b). It seems reasonable to exclude situations like Figure 13(d), because they are accidental cases caused by a particular relationship between the view direction and the vertex. In fact, if we move our view direction a little to the left, the vertex of Figure 13(d) may appear like Figure 13(e), even though we are looking at the same sides of the same surfaces.

Therefore, the link for the junction label of Figure 13(a) is given as shown in Figure 13(f); it represents that the regions R1 and R2 are connected at an "occluded intersection" line L' (its label is >), which is located within the angle ABC. Note that the line L' can overlap with BC, but not with AB, because if it did, the label of AB will change from

Figure 12 Link between regions.



Figure 14 Interpretation inconsistent with respect to surface orientations (panels (a), (b), (c)).

actually the same surface and therefore have a unique surface orientation in the scene. This example demonstrates that we need a provision to check such global consistency of surface orientations.

It should be noted that the kinds of anomalies illustrated above, which are caused by relying solely upon the junction dictionary, have also occurred in the Huffman-Clowes-Waltz labeling for the trihedral solid-object world. But because they occurred "less frequently," they did not show up as a very serious problem. Figure 15 is an example of such an anomaly shown in [Mackworth, 1977]. All the junction labels in it are legal in the trihedral world, but it can be seen that the configuration is not realizable in that world.

Figure 15 Anomaly in the Huffman-Clowes-Waltz labeling.


Figure 16 Geometry involving the viewer, picture plane, and object.

A Tool

In order to carry out consistency checks of surface orientations it is necessary to represent surface orientations in the scene in connection with their properties projected onto the picture. The gradient space introduced by Mackworth [Mackworth, 1973] provides a good tool for it.

Let Figure 16 be the geometry involving the viewer, the picture plane, and the object in the scene. A plane in the scene whose surface is visible from the viewer can be expressed as

$$-z = ax + by + c. \tag{1}$$

The two-dimensional space made of the ordered pairs (a,b) is called the gradient space G. Let us assume for our convenience that we align the directions of the coordinates (x,y) in the picture with those of (a,b). All planes in the scene which have the same values of a and b are mapped into the point (a,b), called the gradient, in G.

The values of (a,b) represent how the planes are slanting relative to the view line (z-axis). For example, the origin O_G = (0,0) of G corresponds to those planes (-z = c) perpendicular to the view line. P1 = (1,0) corresponds to the planes (-z = x + c) which are slanting horizontally to the right. Mathematically,

$$a = \frac{\partial(-z)}{\partial x}, \qquad b = \frac{\partial(-z)}{\partial y}, \tag{2}$$

which is why (a,b) is called the gradient. Thus the length $\sqrt{a^2 + b^2}$ of the vector from O_G = (0,0) to P = (a,b) is the tangent of the angle between the picture plane and the plane corresponding to P; and the direction $\tan^{-1}(b/a)$ of the vector is the direction of the steepest change of -z (depth) on the plane.
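As an illustration of equations (1) and (2), the following minimal sketch converts a gradient (a,b) into the angle between the picture plane and the scene plane, and into the direction of steepest depth change. The function name `slant_and_direction` is a hypothetical helper, and `atan2` is used in place of the text's tan^-1(b/a) so that all four quadrants are handled.

```python
import math

def slant_and_direction(a, b):
    """Return (slant, direction) in radians for the plane -z = a*x + b*y + c.

    slant     : angle between the picture plane and the scene plane,
                whose tangent is sqrt(a**2 + b**2)
    direction : direction in the picture of the steepest change of depth (-z)
    """
    slant = math.atan(math.hypot(a, b))
    direction = math.atan2(b, a)
    return slant, direction

# The example plane from the text, -z = x + c, has gradient P1 = (1, 0).
slant_p1, direction_p1 = slant_and_direction(1.0, 0.0)
# A plane perpendicular to the view line has gradient (0, 0).
slant_origin, _ = slant_and_direction(0.0, 0.0)
```

The constant c does not enter, which reflects the statement that all planes with the same a and b map to the same point of G.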


One of the useful properties of the gradient space is the following [Mackworth, 1973].

Consider two planes meeting at an edge and the orthographic picture made of regions R1 and R2 and a line L, as shown in Figure 17. Then in the gradient space, the gradients G1 and G2 of the two planes should be on a line which is perpendicular to the picture line L. Moreover, if the edge is convex (+), G1 and G2 are ordered in the same direction as are the corresponding regions in the picture. If the edge is concave (-), their order is reversed.
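The property just stated can be sketched as a small check. This is an illustrative reading of the rule, not the paper's code: `classify_edge`, its argument names, and the tolerance `eps` are assumptions. The vector `r1_to_r2` is a picture-plane vector perpendicular to L pointing from region R1 toward region R2.

```python
def classify_edge(g1, g2, line_dir, r1_to_r2, eps=1e-9):
    """Classify the edge between two planes from their gradients.

    g1, g2   : gradients (a, b) of the planes seen as regions R1 and R2
    line_dir : direction vector of the picture line L
    r1_to_r2 : vector perpendicular to L pointing from R1 to R2
    Returns "+" (convex), "-" (concave), or None if the pair is not a
    legal connected edge (G1G2 not perpendicular to L, or G1 equals G2).
    """
    dx, dy = g2[0] - g1[0], g2[1] - g1[1]
    # G1 and G2 must lie on a line perpendicular to the picture line L.
    if abs(dx * line_dir[0] + dy * line_dir[1]) > eps:
        return None
    order = dx * r1_to_r2[0] + dy * r1_to_r2[1]
    if order > eps:
        return "+"   # gradients ordered like the regions: convex
    if order < -eps:
        return "-"   # order reversed: concave
    return None

# Vertical picture line, R1 on the left, R2 on the right.
ridge = classify_edge((-1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 0.0))
valley = classify_edge((1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (1.0, 0.0))
```

In the ridge example the left plane recedes to the left (a < 0) and the right plane recedes to the right (a > 0), so depth is smallest along the line, which is a convex edge as seen by the viewer.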

In the Origami world, we additionally have the case of an occluded intersection (



Figure 19 Trace of gradients of the regions in Figure 14.

An Example of Nonconsistency of Surface Orientations

Now we are ready to test the example of Figure 14. Imagine a gradient space and refer to Figure 19. Let G1 be the gradient of the region R1. R1 and R2 are connected at the convex line AD in the picture. Thus, the gradient of R2 should be somewhere on the half line G1a, which is perpendicular to AD and extends toward the left, because of the property of the gradient space. Suppose it is at G2. Again, because R2 is connected with R1 at a convex line BE, the gradient of R1 should be somewhere on the half line G2b as shown. Since we have fixed the gradient of R1 at G1, this half line G2b should pass through G1, which is impossible wherever we select G2 on the half line G1a. This means that there is no combination of gradients for the regions R1 and R2 which results in the configuration of Figure 14(b); therefore the configuration is inconsistent.
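The argument can be restated algebraically. If d1 is the direction of the half line G1a and d2 the direction of the half line G2b, then G2 = G1 + t d1 and G1 = G2 + s d2 with t, s > 0, which requires t d1 + s d2 = 0. That is only possible when d1 and d2 point in exactly opposite directions. The sketch below checks this condition; the function name and the sample directions are hypothetical and are not measured from Figure 14.

```python
def half_lines_can_close(d1, d2, eps=1e-9):
    """True if t*d1 + s*d2 = 0 has a solution with t > 0 and s > 0.

    In 2-D this holds exactly when d1 and d2 are antiparallel
    (zero cross product and negative dot product).
    """
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    dot = d1[0] * d2[0] + d1[1] * d2[1]
    return abs(cross) < eps and dot < 0

# Two picture lines that are not parallel give half-line directions that are
# not antiparallel, so the two constraints can never be met together.
inconsistent = half_lines_can_close((-1.0, 0.2), (-1.0, -0.2))
# Antiparallel directions would allow a closed trace of gradients.
consistent = half_lines_can_close((-1.0, 0.0), (2.0, 0.0))
```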

III-5 The Test Procedure in the Origami World

The above example has demonstrated the necessity of and the method for global consistency checks of surface orientations for a set of regions. This section will present an algorithm which indicates on what sets of regions the consistency checks are to be performed and which tells whether they can have consistent surface orientations. Given an interpretation, the method consists of first constructing a labeled graph called a Surface Connection Graph (SCG), and then performing a type of filtering operation on the constraints in the gradient space. The test procedure to be presented is closely related to the idea of the dual graph of Huffman [Huffman, 1971] and the POLY program of Mackworth [Mackworth, 1973]. In fact, the Surface Connection Graph represents by and large the topological properties of the dual graph, and the filtering procedure uses constraints in the gradient space in a more thorough and systematic way than does POLY.


Figure 22 Spanning angle of a path.

In any event, the constraint which the SCG imposes between a node pair (p,q) connected by a path $\gamma = (p \to q)$ is that if the gradient of one node is fixed, the gradient of the other node should be within a fan-shaped area (or, in a special case, on a half line as in equation (3)), which is spanned by a non-negative linear combination of vectors. Let us call the fan-shaped area the spanning angle $S_\gamma$ of the path (see Figure 22).

We can define the following computations on spanning angles: inverse, union, and intersection. Suppose that a path $(p \to q)$ has a spanning angle $S_{(p \to q)}$. If we traverse the path inversely as $(q \to p)$, the corresponding spanning angle $S_{(q \to p)}$ is the angle spanned by a set of vectors obtained by inverting the vectors which define $S_{(p \to q)}$. Graphically, as shown in Figure 23(a), the spanning angle $S_{(q \to p)}$ is the angle opposite to $S_{(p \to q)}$. This operation is called the inverse of a spanning angle, and is denoted by $S_{(q \to p)} = -S_{(p \to q)}$.

We can concatenate two arcs, or more generally, two paths $(p \to q)$ and $(q \to r)$ to form a longer path $(p \to q \to r)$. The operation on spanning angles corresponding to this concatenation is the union. Graphically, as shown in Figure 23(b), $S_{(p \to q \to r)}$ is the angle of the area which either belongs to one of $S_{(p \to q)}$ and $S_{(q \to r)}$ or belongs to the area which is separated by $S_{(p \to q)}$ and $S_{(q \to r)}$ and which has an angle less than 180°. Let us denote this operation by $S_{(p \to q \to r)} = S_{(p \to q)} \cup S_{(q \to r)}$. By the union operation, the resultant spanning angle can become 360°: the whole 2-D space. This happens when the set of vectors includes more than three vectors and the angles made by a neighboring pair of vectors are all less than 180°. In such a case, the path does not give to the node pair any constraints regarding the relative locations of their gradients.

Figure 23 Computation on spanning angles: (a) inverse; (b) union; (c) intersection.

Now, if there are two paths $\gamma_1$ and $\gamma_2$ from the node p to the node q, then they impose constraints on $G_p$ and $G_q$ simultaneously; i.e.,

$$G_q - G_p = \sum_i k_{i1} P_{i1} = \sum_i k_{i2} P_{i2}, \qquad k_{i1} \ge 0,\ k_{i2} \ge 0, \tag{5}$$

where $\{P_{i1}\}$ is from $\gamma_1$ and $\{P_{i2}\}$ from $\gamma_2$. This means that $G_q$ should be within the overlapping area of $S_{\gamma_1}$ and $S_{\gamma_2}$ (Figure 23(c)). This overlapping area is the intersection of spanning angles, and is denoted by $S_{\gamma_1} \cap S_{\gamma_2}$.
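The three operations can be sketched by representing a spanning angle by the list of vectors that span it. This representation, the function names, and the tolerance are assumptions made for illustration; in particular, `union` here is the cone of non-negative combinations of all the vectors of both paths, which is one reading of the graphical definition above. The intersection is only tested for being null rather than computed as a new angle.

```python
def _cross(u, v):
    return u[0] * v[1] - u[1] * v[0]

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def contains(gens, v, eps=1e-9):
    """True if v is a non-negative linear combination of the vectors in gens."""
    if abs(v[0]) < eps and abs(v[1]) < eps:
        return True
    # v lies along a single spanning vector
    if any(abs(_cross(g, v)) < eps and _dot(g, v) > 0 for g in gens):
        return True
    # in 2-D, two spanning vectors are always enough
    for i, g in enumerate(gens):
        for h in gens[i + 1:]:
            det = _cross(g, h)
            if abs(det) < eps:
                continue
            a = _cross(v, h) / det   # coefficient of g
            b = _cross(g, v) / det   # coefficient of h
            if a >= -eps and b >= -eps:
                return True
    return False

def inverse(gens):
    """Spanning angle of the path traversed in the opposite direction."""
    return [(-x, -y) for x, y in gens]

def union(gens1, gens2):
    """Spanning angle of two concatenated paths."""
    return list(gens1) + list(gens2)

def intersects(gens1, gens2):
    """True if the two spanning angles overlap in some direction."""
    return (any(contains(gens2, g) for g in gens1)
            or any(contains(gens1, g) for g in gens2))

first_quadrant = [(1.0, 0.0), (0.0, 1.0)]
whole_plane = union([(1.0, 0.0)], [(-0.5, 0.866), (-0.5, -0.866)])
```

The overlap test relies on the fact that when two such fans overlap, a bounding vector of one of them lies inside the other.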

Loop-Free SCG and Elementary Paths

The operation of intersection of spanning angles suggests that we can reduce an SCG into a simpler form on which our test will be applied. First of all, it is easily seen that only those parts of the SCG which include a loop or circuit need to be actually considered. This implies that if the SCG can be separated into two subgraphs by cutting a single arc, then each subgraph can be considered independently. In particular, leaf nodes can be eliminated from the consideration. Thus, we can "prune" and "cut" the SCG into a set of Leaf-free connected SCG (LF-SCG), each of which is independently subject to the consistency check. In Figure 21, the leaf (R3, Rg) can be pruned, and the remainder is the LF-SCG.

Further, it is understood that the gradients of the nodes which are connected to exactly two other nodes (i.e., their node degree is two) in an LF-SCG, such as the nodes s and t in Figure 24, are relatively less important. They do not affect other nodes beyond p or q; it is only required that the relation (vector) between Gp and Gq is kept the same. This implies that we can divide an LF-SCG into a collection of paths, each of which begins and ends with nodes of degree more than two, and each of which contains only nodes of degree two in between. Let us call such a path an elementary path. In Figure 21, paths such as (R2->R3->R4) and (R2->R1) are elementary paths. We can associate a spanning angle with each (directed) elementary path. What we need now is a computational procedure on the spanning angles of the elementary paths of an LF-SCG, to see whether the constraints on surface orientations can be satisfied.
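The pruning of leaves and the decomposition into elementary paths can be sketched on a plain undirected graph. The adjacency-dictionary representation, the function names, and the toy graph below are assumptions for illustration (the toy graph is not Figure 21), and the sketch only prunes leaves; it does not perform the additional "cut" at single arcs that separate two subgraphs.

```python
def prune_leaves(adj):
    """Repeatedly remove nodes of degree 0 or 1 from an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    adj = {n: set(nb) for n, nb in adj.items()}
    changed = True
    while changed:
        changed = False
        for n in list(adj):
            if len(adj[n]) <= 1:
                for m in adj[n]:
                    adj[m].discard(n)
                del adj[n]
                changed = True
    return adj

def elementary_paths(adj):
    """Directed paths whose end nodes have degree > 2 and whose inner
    nodes all have degree 2."""
    paths = []
    for start in (n for n in adj if len(adj[n]) > 2):
        for nxt in adj[start]:
            path = [start, nxt]
            while len(adj[path[-1]]) == 2:
                a, b = adj[path[-1]]
                path.append(b if a == path[-2] else a)
            paths.append(tuple(path))
    return paths

# Toy graph: p and q joined by three routes, plus a leaf hanging off s.
toy = {
    "p": {"q", "s", "u"}, "q": {"p", "t", "u"},
    "s": {"p", "t", "leaf"}, "t": {"s", "q"},
    "u": {"p", "q"}, "leaf": {"s"},
}
lf_scg = prune_leaves(toy)
paths = set(elementary_paths(lf_scg))
```

Each undirected elementary path appears once per direction, matching the text's association of a spanning angle with each directed elementary path.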


Filtering Operation on Spanning Angles of Elementary Paths

Now we are ready to describe the filtering operation defined on the SCG. We assume that the SCG for a given interpretation has been simplified and decomposed into LF-SCG's, and that each LF-SCG is decomposed into elementary paths. If there is no LF-SCG, there is no need for performing the filtering procedure, and the test trivially succeeds.

(1) For each elementary path $\gamma$, associate an initial spanning angle $S_\gamma^{(0)}$ which is computed from a set of vectors defined for the arcs belonging to the path. Set n = 1.

(2) For each elementary path $\gamma$, let $\{\Gamma_i\}$ be a set of all the paths that connect the same node pair in the same direction as $\gamma$ connects. Since the LF-SCG is decomposed into elementary paths, each $\Gamma_i$ is a concatenation of several elementary paths $\{\gamma_{ij}\}$; $\Gamma_i = \gamma_{i1}\gamma_{i2}\cdots$. The spanning angle of $\Gamma_i$, $S_{\Gamma_i}^{(n-1)}$, is computed from the union of the spanning angles of the component paths, $S_{\gamma_{ij}}^{(n-1)}$; i.e.,

$$S_{\Gamma_i}^{(n-1)} = \bigcup_j S_{\gamma_{ij}}^{(n-1)}.$$

Then, filter the $S_\gamma^{(n-1)}$ by the intersection of $S_{\Gamma_i}^{(n-1)}$, and set the result to $S_\gamma^{(n)}$; i.e.,

$$S_\gamma^{(n)} = \Bigl(\bigcap_i S_{\Gamma_i}^{(n-1)}\Bigr) \cap S_\gamma^{(n-1)}.$$

If any $S_\gamma^{(n)}$ becomes null, then the test fails.

(3) If there exists an elementary path such that $S_\gamma^{(n)} \neq S_\gamma^{(n-1)}$, then n = n+1 and go to (2). Otherwise, the test terminates with success.

Four things should be noted about the test procedure. First, if an LF-SCG consists of a single circuit (i.e., the degree of all the nodes is 2), then we can pick any pair of nodes and regard the two paths connecting them as elementary paths. Or, alternatively, this is equivalent to testing whether the spanning angle corresponding to the circuit is the entire 360°.

Second, the iteration in the procedure always terminates, since all the $S_\gamma^{(n)}$ monotonically decrease in step (2) and the number of possible spanning angles which $S_\gamma^{(n)}$ can take is finite. (It is bounded by the number of subsets of the vectors involved in the LF-SCG under the test.)



Figure 24 Elementary path: (a) SCG; (b) Gradient space.

Third, passing the procedure is a necessary condition for surface orientations to satisfy all the constraints represented in the SCG, but it is not a sufficient condition. An example of this will be given in IV-1.

Fourth, the presented algorithm is conceptually straightforward, but it is inefficient. An implementation of the algorithm can exploit several properties of the SCG to increase efficiency.
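For the single-circuit case mentioned in the first note, the test reduces to asking whether the vectors collected around the circuit span the entire 360°. A minimal sketch of that check follows, assuming each arc contributes one 2-D vector and using the criterion that every angular gap between neighboring vectors is less than 180°; the function name and the sample vectors are hypothetical.

```python
import math

def spans_whole_plane(vectors, eps=1e-9):
    """True if the non-negative combinations of the vectors cover all
    directions, i.e. every gap between angularly adjacent vectors is
    strictly less than 180 degrees."""
    if len(vectors) < 3:
        return False
    angles = sorted(math.atan2(y, x) for x, y in vectors)
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2 * math.pi - angles[-1])  # wrap-around gap
    return max(gaps) < math.pi - eps

# Three vectors 120 degrees apart leave no gap of 180 degrees or more.
circuit_ok = spans_whole_plane([(1.0, 0.0), (-0.5, 0.866), (-0.5, -0.866)])
# Vectors confined to a half plane cannot sum to zero with positive weights.
circuit_bad = spans_whole_plane([(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
```

Intuitively, going once around a circuit must bring the gradient back to where it started, so some positive combination of the arc vectors has to sum to zero.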

If all the LF-SCG's pass the above test procedure, then the given interpretation is said to pass the test for surface orientation consistency. The above test procedure, together with the up-to-3-surface junction dictionary, defines the nature of the Origami world. An interpretation of a line drawing is called plausible in the Origami world, if all the junctions are given legal junction labels contained in the dictionary, and if its SCG passes the above test.

For Figure 20, it is easy to see that the SCG consisting of two nodes does not pass the test. Let us next consider the SCG in Figure 21. The path $\gamma$ = (R2->R3->R4) is an elementary path, and the path $\Gamma_1$ = (R2->R1->R4) is one of the paths which connect R2 and R4. The spanning angles $S_\gamma^{(0)}$ and $S_{\Gamma_1}^{(0)}$ are calculated as shown in Figure 25. Since the intersection is null, the spanning angle $S_\gamma^{(1)}$ will become null in step (2), and the test fails.

On the other hand, if we change the concave line labels (-) between R2 and R3 and between R4 and R3 in Figure 15 to occluding boundaries (->) so that R3 occludes R2 and R4, then the corresponding SCG will consist of two subgraphs: one includes nodes {R1, R2, R4, R5} and the other {R3, R5}. It is easy to see that the latter subgraph has no LF-SCG and that the former passes the test; therefore the test succeeds this time. As another example, consider Figure 26(a) and one of its interpretations, Figure 26(b). It is a figure obtained by adding a convex line CF to Figure 14(b). It is a paper-sided, truncated pyramid viewed from above. Its SCG consists of a single circuit (Figure 26(c)), and passes the test. Therefore the configuration of Figure 26(b) is plausible.

Two points should be mentioned concerning the properties of the SCG. Suppose that in Figure 21 we are filtering the spanning angle of the elementary path $\gamma$ = (R4->R5) against all the paths that connect R4 and R5. If we have filtered $\gamma$ by the path $\Gamma_1$ = (R4->R1->R5), then we need not filter by such paths which travel a component elementary path of $\Gamma_1$, say $\gamma_1$ = (R4->R1), through a nonelementary path. For example, $\Gamma_2$ = (R4->R3->R2->R1->R5) travels $\gamma_1$ through the nonelementary path (R4->R3->R2->R1). The reason is that since $\gamma_1$ = (R4->R1) is itself an elementary path, the partial path (R4->R3->R2->R1) of $\Gamma_2$ has been or will be used in filtering $\gamma_1$, and therefore, $\Gamma_2$ does not add any constraint different from $\Gamma_1$.

Another point is that $\Gamma_1$ = (R4->R1->R5) does surely have an overlapping spanning angle with that of $\gamma$ = (R4->R5), for they form a circuit which surrounds a single junction. If we stop and think, this is what the junction label means: it assures a local consistency around a junction. These two facts can reduce the number of paths which are actually used in filtering. In our example, $\gamma$ needs to be filtered only by the path (R4->R3->R2->R5).

III-6 The Labeling Procedure in Origami World

In the previous four subsections we have first enumerated the legal junction labels in the Origami world, and have stored them together with link information in the Origami junction dictionary. Then, after introducing the concept of spanning angles, we have defined a test procedure on the SCG which can decide whether a given labeled interpretation of a line drawing is consistent with respect to surface orientations. This subsection will present a labeling procedure which, given a line drawing, finds all its plausible interpretations; that is, all the combinations of assignments of line labels which result in legal junction labels at all the junctions in the drawing, and which pass the test procedure for surface orientations.

Strategy

The labeling procedure must include two tasks:

(1) Using the junction dictionary of the Origami world, interpretations are generated in which the labels given to lines constitute legal junction labels at all the junctions. This could be done in two steps:


Figure 25 Spanning angles of paths in the SCG in Figure 21.

Figure 26 Paper-sided, truncated pyramid: (a) line drawing; (b) labeling; (c) SCG.

(1a) Filtering of junction labels (as in the Waltz labeling [Waltz, 1972]),

(1b) Tree searching to obtain the individual consistent combinations of junction labels.

(2) For each interpretation obtained in (1), the SCG is constructed and the consistency of surface orientations is tested by filtering the spanning angles of elementary paths.

However, the number of interpretations which are generated in (1) as being consistent with respect to the junction dictionary is very large, and most of them are inconsistent with respect to the surface orientations. Also, if a certain subconfiguration is inconsistent with respect to surface orientations, any interpretation which includes that subconfiguration is never plausible. Therefore, in the actual implementation of the labeling procedure, step (1b) and a part of (2) are combined into one process: while assigning a junction label to a junction in a depth-first manner, the process constructs the SCG incrementally and checks the spanning angles as far as possible. When any inconsistency, either in the junction labels or in the surface orientations, is detected, the process backtracks and searches for the next combination. This combined process greatly increases efficiency in the labeling procedure.

As a result, the labeling procedure consists of three phases:

(1) Filtering of junction labels,

(2) Tree searching combined with filtering of spanning angles on a partial SCG,

(3) Final filtering of spanning angles of elementary paths.
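As a rough illustration of phase (1), the following sketch repeatedly discards junction labels that no neighboring junction can agree with on a shared line. It is a minimal sketch under simplifying assumptions, not the implementation used in this work: the names `waltz_filter`, `junction_lines` and `candidates` are hypothetical, and a line label is treated as a single symbol that reads the same from both ends of the line.

```python
def waltz_filter(junction_lines, candidates):
    """Discard junction labels that no neighbouring junction can agree with.

    junction_lines: junction name -> tuple of line ids meeting at that junction.
    candidates:     junction name -> set of label tuples aligned with junction_lines.
    """
    cands = {j: set(labels) for j, labels in candidates.items()}
    changed = True
    while changed:
        changed = False
        for j, lines_j in junction_lines.items():
            for k, lines_k in junction_lines.items():
                shared = [ln for ln in lines_j if ln in lines_k]
                if j == k or not shared:
                    continue

                def agrees(a, b):
                    # a labels junction j, b labels junction k; compare shared lines only.
                    return all(a[lines_j.index(ln)] == b[lines_k.index(ln)]
                               for ln in shared)

                kept = {a for a in cands[j] if any(agrees(a, b) for b in cands[k])}
                if kept != cands[j]:
                    cands[j] = kept
                    changed = True
    return cands


# Hypothetical two-junction drawing: J1 and J2 share line e2.
junction_lines = {"J1": ("e1", "e2"), "J2": ("e2", "e3")}
candidates = {
    "J1": {("+", "+"), ("-", "-")},
    "J2": {("+", "-")},
}
filtered = waltz_filter(junction_lines, candidates)
```

In this toy case J2 forces the shared line e2 to be "+", so the all-"-" candidate at J1 is dropped. The loop runs until no candidate set changes, which is the fixed point that filtering aims for.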

Phase (1): Filtering of Junction Labels

This process is exactly the same as the Waltz method [Waltz, 1972]. At first, each junction is given a set of possible labels drawn from the dictionary according to its junction type. Initial constraints are that the outer boundaries should have the label ,



constructed. If the link of the present junction label adds an arc and results in the formation of a new circuit in the SCG, the spanning angle of this arc is checked against all the existing paths which connect the same node pair. If they have a null intersection, the partial interpretation is inconsistent.
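The check on a newly added arc can be sketched as an overlap test between arcs of directions. This is only a sketch under assumptions: a spanning angle is represented here as a hypothetical pair (start, extent) in degrees, measured counterclockwise, the arcs are treated as closed, and the new arc is compared with each existing path separately. The function names are ours.

```python
def arcs_overlap(arc_a, arc_b):
    """True if two closed circular arcs (start_deg, extent_deg) share a direction."""
    start_a, extent_a = arc_a
    start_b, extent_b = arc_b
    # Two arcs intersect exactly when the start of one lies inside the other.
    return ((start_b - start_a) % 360.0 <= extent_a
            or (start_a - start_b) % 360.0 <= extent_b)


def consistent_with_paths(new_arc, existing_path_arcs):
    """The new arc must overlap the spanning angle of every existing path."""
    return all(arcs_overlap(new_arc, arc) for arc in existing_path_arcs)


wraps_zero = arcs_overlap((350.0, 20.0), (5.0, 10.0))    # arc crossing 0 degrees
opposite = arcs_overlap((0.0, 90.0), (180.0, 90.0))      # disjoint quadrants
accepted = consistent_with_paths((0.0, 90.0), [(45.0, 10.0), (80.0, 30.0)])
rejected = consistent_with_paths((0.0, 90.0), [(45.0, 10.0), (180.0, 90.0)])
```

Working modulo 360 keeps the test correct for arcs that wrap past 0 degrees, which a plain interval comparison would miss.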

If any inconsistency in the configuration is detected, either in the combination of junction labels or in the surface orientations, the tree search process backtracks one step and searches for the next combination. If all the junctions are labeled consistently, the resultant interpretation is handed to the final phase (3).
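The control structure of this phase can be outlined as a depth-first search that prunes a branch as soon as a partial assignment fails. The sketch below is illustrative only: `tree_search`, `compatible` and `partial_ok` are hypothetical names, with `compatible` standing in for the junction-label check and `partial_ok` for the spanning-angle check on the partial SCG.

```python
def tree_search(junctions, candidates, compatible, partial_ok):
    """Yield every complete assignment junction -> label that survives both checks."""
    def extend(index, assignment):
        if index == len(junctions):
            yield dict(assignment)          # hand a copy to the final phase
            return
        junction = junctions[index]
        for label in candidates[junction]:
            if not compatible(assignment, junction, label):
                continue                    # clashes with labels already given
            assignment[junction] = label
            if partial_ok(assignment):      # stand-in for the partial SCG check
                yield from extend(index + 1, assignment)
            del assignment[junction]        # backtrack one step

    yield from extend(0, {})


# Toy stand-ins: labels are 0/1 and at most one junction may take label 1.
junctions = ["J1", "J2", "J3"]
candidates = {j: [0, 1] for j in junctions}
results = list(tree_search(
    junctions,
    candidates,
    compatible=lambda assignment, junction, label: True,
    partial_ok=lambda assignment: sum(assignment.values()) <= 1,
))
```

Because the check runs on partial assignments, a failing subconfiguration is rejected once instead of once per completion, which is where the saving over generate-then-test comes from.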

Let us see the example of Figure 27(a). Suppose that the junction is first given a junction label as shown in Figure 27(b); the corresponding partial SCG consists of a single arc (R1-R3). Then, J7 is given a junction label (see Figure 27(c)). Since the link it has between R1 and R3 is the same as the existing one, two new arcs are added to the SCG: first (R1-R2), and then (R2-R3). A circuit is formed when (R2-R3) is added (Figure 27(d)). Its spanning angle is checked against the existing path (R2-R1-R3). As shown in Figure 27(e), the intersection is not null, and therefore the search proceeds to Jg. Suppose that the first choice of junction labels for it results in the interpretation and the corresponding SCG as in Figure 27(f). Since no new circuit is generated, the assignment of junction labels proceeds. J3 is given a unique junction label determined by the line labels already given. As shown in Figure 27(g), when J3 is given the junction label, it adds an arc (R2-R4) and the SCG has a new circuit. Thus, the arc (R2-R4) is to be checked against paths (R2-R3-R4) and (R2-R1-R3-R4). Since the spanning angle of (R2-R4) does not have an overlapping area with that of (R2-R3-R4) as shown in Figure 27(h), this interpretation turns out to be inconsistent. The process winds back to Jg, and the next choice for Jg will be examined.

Consider another stage of the tree search in which the junction labels have been given as shown in Figure 27(i). When the junction J3 is examined, it adds a new arc (R2-R4). This time, the spanning angle of (R2-R4) is compatible with that of (R2-R1-R3-R4), and therefore the search proceeds. Since the rest of the junctions do not add new arcs to the SCG, the interpretation shown in Figure 27(j) is handed to the final phase.

If we considered only junction labels, 90 interpretations would have been generated for the line drawing of Figure 27(a). However, by means of checking spanning angles in the course of tree searching, only 8 out of these 90 interpretations were passed to the final phase.

Phase (3): Final Filtering of Spanning Angles of Elementary Paths

The method of this phase is exactly the same as the test procedure described in III-5. The reason for the necessity of this phase is that the SCG is not completely constructed until the tree search has completed the assignments of junction labels in an interpretation: we could not identify the elementary paths, and therefore we have only checked the spanning angle of newly added arcs against the existing paths which connect the same node pair.

Partial duplication of this phase may appear inefficient, but usually most of the inconsistencies of surface orientations have been detected in the phase of tree searching, and only inconsistencies that involve very global relationships among regions remain undetected until this phase. This phase is also useful because it reveals the mutual relationships among the gradients of regions in the SCG; this information is used in reconstructing the 3-D shape of the scene.

In our example, all of the eight interpretations that are generated in the tree search pass this final phase. In the case of Figure 27(j), it is revealed that the gradients of the four regions should be placed in the gradient space as shown in Figure 27(k). The diagram represents the fact that the four surfaces form a convex opening in the 3-D space, which is probably a general description of a "box".

III-7 Examples of Labeling

A few examples of interpretations in the Origami world follow. Figure 28 shows three possible interpretations (without counting rotations) that a line drawing of Figure 4(a) can have: (a) a cube-like configuration; (b) a concave corner; and (c) a "roof" placed on a plane.

A "box" line drawing of Figure 3 has eight possible interpretations shown in Figure 29: (a) corresponds to an ordinary box; (b) is a "squashed" box whose front sides are pushed backward; (c) is another "squashed" box whose rear sides are pulled forward; (d) is a box with a triangular lid in the right corner; etc.


Figure 30 shows 11 interpretations of Huffman's "impossible" pyramid: (a) a solid truncated pyramid; (b) a paper-sided, truncated pyramid; (c) a view of (a) from the bottom; (d) a view of (b) from the bottom; (e) a triangular "dome" with an opening in the lowest side; etc.

Figure 31 is another example of interpreting an "impossible" object. It has 1
interpretations. The interpretation (a) corresponds to three twisted rectangular bars.

Figure 32 includes 16 possible interpretations of Figure 4(b). Interpretation (a) corresponds to the W-folded paper.

IV Discussion

The discussion in this section is divided into three parts. The first part discusses how the test criteria employed for checking the surface orientations in the Origami world are related to plane surfaces. The second part reveals interesting relationships of the Origami world to other worlds dealt with in prior work on polyhedral scene analysis. The third part discusses how knowledge is used in understanding line drawings.

IV-1 Plane Surfaces and Origami World

We have noted that the test procedure, described in III-5, to check the consistency of surface orientations in the Origami world is not a sufficient condition for the constraints in the SCG to be satisfied simultaneously. But it should be remembered that the constraints in the gradient space themselves are not a sufficient condition for the configuration to be realized by plane surfaces. Consider again the configuration of Figure 26(b). The configuration made of three regions R1, R2, and R3 has passed the test. However, it is a simple geometry problem to show that unless three lines AD, BE and CF meet at a single point, the configuration is not realizable by the three planes corresponding to R1, R2, and R3.
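The geometric condition can be checked numerically. The sketch below, a minimal illustration with hypothetical coordinates and function names, writes each picture line in homogeneous form and tests whether the 3x3 determinant of the three lines vanishes. A zero determinant also covers the degenerate case of three parallel lines, which meet at a point at infinity.

```python
def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def concurrent(l1, l2, l3, tol=1e-9):
    """True if the three lines meet at one point (or are all parallel)."""
    det = (l1[0] * (l2[1] * l3[2] - l2[2] * l3[1])
           - l1[1] * (l2[0] * l3[2] - l2[2] * l3[0])
           + l1[2] * (l2[0] * l3[1] - l2[1] * l3[0]))
    return abs(det) <= tol


# Hypothetical drawing: inner triangle ABC is the outer triangle DEF scaled about the origin.
A, B, C = (2, 0), (-1, 2), (-1, -2)
D, E, F = (4, 0), (-2, 4), (-2, -4)
meets = concurrent(line_through(A, D), line_through(B, E), line_through(C, F))

# Moving C slightly breaks the concurrency of AD, BE and CF.
C_moved = (-1, -1)
skewed = concurrent(line_through(A, D), line_through(B, E), line_through(C_moved, F))
```

The first drawing is a genuine truncated pyramid whose side edges extend to a common apex; the second looks similar but fails the condition.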

The problem arises from the fact that the gradient space does not take into account the value of c in equation (1): a consistent trace in the gradient space means that the corresponding regions can take a consistent combination of (a,b) values, but it does not necessarily assure a consistent combination of (a,b,c) values. Huffman [Huffman, 1977] presents a φ(φ')-point test as the necessary and sufficient condition for a "cut set" of lines (equivalently, a set of regions separated by those lines) to be realizable by plane surfaces. Consider again the example of a paper-sided, truncated pyramid shown in Figure 33(a), and take the set of lines AD, BE and CF cut by the dotted loop. Each line belonging to the cut set of lines is given an orientation shown as a big arrow according to its label, either coming into the loop (if the label is +) or going out from it (if the label is -) (see Figure 33(b)). Then the φ(φ')-point is a point that is to the right (left) of some line of the cut set and that is not to the left (right) of any other lines. The φ(φ')-point test simply checks whether either a φ-point or a φ'-point exists, and if either one exists, then the cut set is unrealizable. In fact, unless AD, BE and CF meet at a single point, φ or φ' points exist, and therefore the configuration of Figure 33(a) is unrealizable. Unfortunately, it cannot be said that if all the cut sets in the interpretation pass the φ(φ')-point test, then the whole interpretation is realizable by only plane surfaces. (Notice that the φ(φ')-point test is the necessary and sufficient condition for the realizability of a cut set of lines, not of the whole interpretation.)
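The definition of the two kinds of points can be made concrete with a signed-area test. The sketch below only classifies a given point against a set of oriented lines; it does not search the plane for such points, which the full test would require. The names are hypothetical, a y-up coordinate system is assumed so that a negative signed area means "to the right", and the line orientations are taken as given rather than derived from labels.

```python
import math


def side(line, point):
    """Signed area: > 0 if point is left of the oriented line, < 0 if right."""
    (x0, y0), (x1, y1) = line
    px, py = point
    return (x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)


def is_phi_point(point, lines, eps=1e-9):
    """Right of at least one line and left of none."""
    sides = [side(line, point) for line in lines]
    return any(s < -eps for s in sides) and not any(s > eps for s in sides)


def is_phi_prime_point(point, lines, eps=1e-9):
    """Left of at least one line and right of none."""
    sides = [side(line, point) for line in lines]
    return any(s > eps for s in sides) and not any(s < -eps for s in sides)


# Three oriented lines running clockwise around a triangle: its interior lies to the right of all.
clockwise = [((0, 0), (0, 1)), ((0, 1), (1, 0)), ((1, 0), (0, 0))]
inside = (0.2, 0.2)

# Three lines through the origin, oriented outward at 0, 120 and 240 degrees.
concurrent_lines = [((0.0, 0.0), (math.cos(t), math.sin(t)))
                    for t in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
samples = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0), (0.3, -0.7)]
found = [q for q in samples
         if is_phi_point(q, concurrent_lines) or is_phi_prime_point(q, concurrent_lines)]
```

With the concurrent lines every sample point is right of one line and left of another, so none qualifies, in agreement with the statement that such points exist unless the lines meet at a single point.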

To human eyes, the configuration of Figure 33(a) appears quite reasonable; sometimes it takes time to persuade people that the configuration is not realizable by plane surfaces. Figures 34(a) through 34(c) are other examples of interpretations of simple figures which appear plausible but are not actually realizable by only plane surfaces: they pass our test in the Origami world, but not the φ(φ')-point test.

The use of only the gradients (a,b) makes some sense mathematically when we consider the manner in which we view a picture. Note that the gradient (a,b) of a plane is invariant to the x-y translation of the picture plane (i.e. shift of eye position). In viewing an orthographically projected picture, we do not fix an absolute origin in mind. When we see a line where two surfaces actually intersect, we tend to "place" the origin on that line, which means we give the two surfaces the same c value. Therefore, constraints about c are automatically satisfied at that intersection. Also, when we see occlusions, we tend to attribute them to the difference of the c value rather than to any relations among a, b, and c. As we shift our eye and move the origin in the picture, it is easy to keep the gradient of a particular region in mind, but it seems difficult for us to "calculate" the value c which that region should have in the new coordinates. These observations seem to explain why the objects in Figure 33(a) and Figures 34(a) through 34(c) do not look impossible at first glance.
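The invariance of (a,b) and the behavior of c can be verified with a few lines of arithmetic. The sketch assumes, for illustration only, that a plane is written as z = ax + by + c; the sign convention of equation (1) may differ, but the argument is unchanged. The name `shift_origin` and the two example planes are hypothetical.

```python
def shift_origin(plane, x0, y0):
    """Coefficients of the same plane after moving the picture origin to (x0, y0).

    Substituting x = x' + x0 and y = y' + y0 into z = a*x + b*y + c leaves
    a and b unchanged and turns c into c + a*x0 + b*y0.
    """
    a, b, c = plane
    return (a, b, c + a * x0 + b * y0)


plane_1 = (1.0, 0.0, 0.0)    # z = x
plane_2 = (-1.0, 0.0, 2.0)   # z = -x + 2, which meets plane_1 along the line x = 1

# Place the origin on the intersection line, for instance at (1, 0).
moved_1 = shift_origin(plane_1, 1.0, 0.0)
moved_2 = shift_origin(plane_2, 1.0, 0.0)
```

After the shift the two planes keep their gradients (1, 0) and (-1, 0) but share the same c, which is the sense in which placing the origin on an intersection line satisfies the constraint on c automatically.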


Figure 34 Examples of "plausible" interpretations which cannot be made of plane surfaces (the unlabeled lines are occluding boundaries, - or in the obvious direction).

Figure 35 Solid truncated pyramid. An example in which all the constraints in the gradient space cannot be satisfied simultaneously, but all the pairwise constraints with respect to others can be satisfied.

Another issue about the test procedure deserves comment: its insufficiency for assuring that all the constraints on the surface orientations represented in the SCG are satisfied simultaneously. The test procedure presented in III-4 merely assures that for any pair of regions which are connected by an elementary path (that is, they intersect directly or they are connected by a sequence of free-hanging surfaces between them), the constraints on the surface gradients that relate the pair of regions are all satisfied. In order to understand the difference, consider the interpretation of Figure 35(a), which is a solid truncated pyramid viewed from above. The corresponding SCG is shown in Figure 35(b). The SCG passes our test procedure, but it is not possible to find a set of gradients for all the regions R1 through R4 so that all the constraints in the SCG are satisfied simultaneously unless AD, BE, and CF meet at a single point. This crucial difference stems from the fact that when we pick up a pair of regions, say, R1 and R2, the gradients of regions R3 and R4 are not fixed uniquely in checking whether the paths ( R ^ R ^ - ^ ) and (R1-R3-R4-R2) have an overlapping spanning angle with that of (R1-R2).

IV-2 Origami World and Various Worlds

The theory of the Origami world and the labeling procedure for it have interesting relationships with the work of Guzman [Guzman, 1968], Huffman [Huffman, 1971], Clowes [Clowes, 1971], Waltz [Waltz, 1972], Mackworth [Mackworth, 1973], and Huffman [Huffman, 1977]. Their work all concerned the problem of recovering three-dimensional configurations from line drawings. (The historical development in polyhedral scene analysis has been well reviewed by Mackworth [Mackworth, 1977].)

Assume we consider only a set of line drawings which are reasonably "likely" figures; that is, it does not include figures which show too anomalous behavior. We can consider a set of all the combinatorially possible interpretations (assignments of line labels) of those line drawings. A subset exists containing those interpretations which can be realized by plane surfaces. Let us denote that subset as the Plane Surface World, S_psw. We can also think of a subset consisting of interpretations in which all the constraints on the gradients of surfaces are completely satisfied. Let us call it the Consistent Gradient World, S_cgw. Obviously,

S_cgw ⊃ S_psw

We can view a labeling procedure as a method consisting of a generator and a tester: given a line drawing, a generator generates interpretations in a certain manner, each one of which a tester accepts or rejects based on a certain method. Table 2 summarizes various labeling methods according to this taxonomy. Various subsets can be defined which are generated by generators, or are determined as legal by testers. We will discuss the relationships among those subsets, referring to Figure 36.

Table 2. Various labeling schemes as a method of a generator and a tester.

Method              Generator                     Subset  Tester                        Subset
Huffman, Clowes,    Trihedral junction            S_tri
Waltz               dictionary (Waltz: with
                    cracks, shadows, etc.)
Mackworth           Sequential generation of              Constructive test on          S_poly
                    most connected                        coherence rules in the
                    interpretations                       gradient space
Huffman                                                   φ(φ')-point test for all
                                                          the cut sets in the line
                                                          drawing
Origami World       Up-to-3-surface junction      S_up3   Filtering of spanning         S_origami
                    dictionary                            angles on the SCG

Huffman [Huffman, 1971], Clowes [Clowes, 1971] and Waltz [Waltz, 1972] used a trihedral junction dictionary as the generator and did not use any tester. Let us denote the subset of interpretations generated by the trihedral junctions as S_tri (Waltz used cracks, illuminations, shadows, etc., and the corresponding subset is different from S_tri, but because geometrically his dictionary is a trihedral one, it is included in this category). As we saw, S_tri is larger than the subset of solid trihedral objects bounded by plane surfaces.

Huffman's φ(φ')-point test, when used on all the cut sets in the line drawing, can define a subset which is larger than S_psw, but is the closest one to it articulated so far. However, since there is no appropriate efficient generator, it would be difficult, given a line drawing, to actually enumerate all the interpretations belonging to S_φ(φ').

Mackworth's POLY used a generator which generates combinations of line labels based on some preferences. In particular, to achieve the most connected interpretations, it generates first the interpretation in which all the edges are connected (i.e., all the lines are given either + or -), and then when such an interpretation fails to pass the test, it generates an interpretation with all edges but one as connected edges, and so on. The consistency in the gradient space is tested by a constructive method which actually tries to fix the positions of gradients corresponding to the regions, step by step. In this way, POLY avoids the use of predetermined interpretations for particular categories of junctions (such as L, ARROW, FORK, etc.), and thus, theoretically, the subset S_poly could be equal to S_cgw. However, since it is not practical to test all the interpretations, a certain selection criterion is needed to supply the generator with advice or preferences concerning the order of generation. The constructive test procedure also uses some heuristic rules, because the construction is not a trivial process. As a result, the actual nature of S_poly becomes a little unclear.
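The "most connected first" ordering described above can be illustrated by enumerating which lines are treated as non-connected (occluding), starting with none. This is only a sketch of the ordering idea, not a reconstruction of POLY: the name `most_connected_first` is hypothetical, and the choice between + and - for each connected edge is left out.

```python
from itertools import combinations


def most_connected_first(lines):
    """Yield sets of lines treated as occluding, fewest occluding lines first."""
    for n_occluding in range(len(lines) + 1):
        for occluding in combinations(lines, n_occluding):
            yield frozenset(occluding)


order = list(most_connected_first(["L1", "L2", "L3"]))
```

For n lines this ordering runs through 2^n subsets in the worst case, which shows why a selection criterion is needed in practice.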

In the Origami world, the subset S_up3 generated by the Origami junction dictionary properly includes S_tri. The subset S_origami, consisting of plausible interpretations which have passed the test procedure, is a little larger than the up-to-3-surface objects in the S_cgw, as shown in Figure 36. One feature of S_origami is that it has a clear definition of the membership, which allows an efficient procedure to generate the member interpretations for a given line drawing; i.e., filtering of junction labels and spanning angles on the SCG. The locations of several example interpretations are indicated in the diagram, and they can serve to illustrate relationships among the subsets.

Figure 36 Relationship among various subsets of interpretations. Example interpretations marked in the diagram:

1: cube (Figure 6)
2: box (Figure 29(a))
3: Figure 34(b)
4: paper-sided, truncated pyramid (Figure 30(b))
   W-folded paper (Figure 32(a))
   Figure 34(a)(c)
5: solid truncated pyramid (Figure 30(a))
6: Figure 15
7: Figure 14

It is interesting to review Guzman's work [Guzman, 1968] on object recognition at this point. In its goal, his work is the forerunner of Huffman, Clowes and Waltz. He tried to find "good" associations of regions into separate 3-D bodies based on heuristics concerning junction types. The idea of a "region" (which is a projection of surface to the picture plane) is very close to the idea of the surface-oriented world. Also, the links he used represent the possible connections of regions like our links. However, since his links are defined for junction types rather than for junction labels, his link information is a kind of "average" over various combinations of surfaces at a particular junction type. Because of its heuristic nature, Guzman's method could not explicitly clarify the world for which it is intended.

Table 2 suggests that we can employ different combinations of generator and tester depending on the world in which we want to work. For instance, the trihedral junction dictionary together with Huffman's φ(φ')-point test is the closest to the plane-surface trihedral solid object world; the up-to-3-surface Origami junction dictionary together with the φ(

In the case of the Origami world, the precompiled knowledge in the Origami junction dictionary can reduce the possibilities to S_up3 by means of filtering the junction labels. The filtering of spanning angles on the SCG can reduce the possibilities further to S_origami. The same type of filtering procedure is used both for exploiting the precompiled knowledge and for the dedicated computation; one is used symbolically, the other numerically. It seems that the difference S_up3 - S_origami is fairly large in the surface-object world, and thus the tester is really needed. The following is to be noted. The junction labels hold information concerning a very local consistency, as was pointed out in III-5. They can propagate information to the neighboring junctions only through line labels. In contrast, the links can globally transfer information to any junctions through regions.

V Application of Origami Theory to Recover 3-D Configurations from Images

The theory of the Origami world has great potential in applications to image understanding tasks that recover three-dimensional configurations. That it can deal with scenes which include free extending surfaces is very attractive, because real world images include objects which are practically or conceptually flat. In fact, the Origami world corresponds well to the way in which we would interpret a picture which has been segmented into regions. Suppose that Figures 37(a) and 37(b) are obtained as the results of region segmentation of "chair" and "door" scenes, respectively. They are satisfying to us because we interpret them in terms of surfaces. Needless to say, the Origami world includes the solid-object world as its subset.

Figure 37 Region segmented picture of "chair" and "door" scenes.

Figure 38 (a) Moderately curved chair seat; (b) Possible line drawing.

Minsky [Minsky, 1974] uses the side view of the "impossible" truncated pyramid as an example to show how little humans rely on numerical three-dimensional geometrical information in seeing objects. However, since the interpretation of Figure 15 does look unreasonable to us, a more precise statement would be that humans use the geometrical information to check certain consistencies in the gradients of regions, but not the total consistency among them.

Just as we generalized from solid volume to surface, we can go further and say that a "pencil" has conceptually a line shape, thus we need a "wire-frame" world, and further, a "dot" world. The more basic the unit of the world is, the broader the class of pictures it can deal with, but at the same time the fewer constraints it provides. We feel that the Origami world is rich enough to accept a large class of line drawings, and at the same time it has enough structure to impose constraints on the possible label combinations.

Even though the Origami world is not intended for curved objects or imperfect line drawings, a certain class of line drawings including curved objects or imperfections can be accommodated within it. As an illustrative example, suppose that a moderately curved chair seat of Figure 38(a) yields a line drawing of Figure 38(b). While it is an "impossible" figure in the trihedral world, it has an interpretation in the Origami world corresponding to "a rectangular block with a flat sheet attached". Once this is hypothesized, further processing, probably involving image data analysis, can discover the detailed shape and know whether it is a square or curved, solid or flat object.

In real image understanding tasks, the number of lines is large, and therefore the number of possible interpretations is also large. Even the line drawing of a box (Figure 3) has eight interpretations, for instance. However, spectral (shading, color, etc.) and geometrical (collinearity, parallelism, etc.) knowledge can be used here to reduce the number of possible interpretations. There is knowledge that relates the nature of edges and the intensity profiles taken across the edge [Horn, 1977]; for example, a peak-shaped edge profile often suggests a convex edge. Another typical example which provides constraints on line labels is a so-called matched T shown in Figure 39. If the edge profiles of lines L1 and L2 are similar (and, preferably, if the edge profiles of L3 and L4 and those of L5 and L6 are also similar), then the labels of L1 and L2 are likely the same, and L3 through L6 are likely to be occluding boundaries, with the middle region R obscuring L1 and L2. All these constrain, conceivably in a probabilistic way, the possible combinations of line labels that a set of lines can take. Therefore, "best" or "most plausible" interpretations can be defined and searched. For example, in the case of Figure 27(a), if Lg and L^g are known to take the same label, the number of possibilities reduces from 8 to 3 (see Figure 29).

Further, heuristics concerning surface orientations can be used to provide preferences for interpretations. For example, if interpretations in S_tri (i.e., all the junction labels given are trihedral ones) are to be chosen first, then in the cases of Figures 28, 30, and 31, the interpretations corresponding to a cube, a solid truncated pyramid, and three twisted bars are selected, respectively. Another heuristic is that parallel lines in the picture are also preferably parallel in the scene; this would prefer the interpretations corresponding to an ordinary box and a W-folded paper over others in Figures 29 and 32, respectively. This "parallel-in-the-picture/parallel-in-the-scene" heuristic seems very powerful for pictures which do not include strong perspective distortions. The subject of applying the Origami theory to real image understanding is further treated in [Kanade, 1978].

Figure 39 Matched T configuration.

VI Conclusion

The theory of the Origami world (up-to-3-surface junctions) has been developed. The contributions of this paper might be the following:

(1) The concept of selecting surfaces as basic components of the world, rather than the conventional solid polyhedra;

(2) The enumeration of the up-to-3-surface junction labels;

(3) The use of links to capture the global relationships of regions in the form of a Surface Connection Graph;

(4) The filtering procedure defined on the spanning angles;

(5) The discussion of relationships among various worlds dealt with in prior work on polyhedral scene analysis.

It seems that the Origami world defines a subset of interpretations which are interesting both from the standpoint of psychological perception of shapes and from that of practical analysis of region segmented pictures.

Acknowledgements

I would like to thank John Kender, Allen Newell, Raj Reddy, and Steven Shafer for very stimulating discussions in the development of the theory presented in this paper.

REFERENCES

Clowes, M. B. (1971), "On Seeing Things", Artificial Intelligence, Vol. 2, No. 1, pp. 79-116.

Falk, G. (1972), "Interpretation of Imperfect Line Data as a Three-Dimensional Scene", Artificial Intelligence, Vol. 3, No. 2, pp. 101-144.

Guzman, A. (1968), "Computer Recognition of Three Dimensional Objects in a Visual Scene", MAC-TR-59, MIT.

Horn, B. K. P. (1977), "Understanding Image Intensity", Artificial Intelligence, Vol. 8, No. pp. 201-231.

Huffman, D. A. (1971), "Impossible Objects as Nonsense Sentences", Machine Intelligence Vol. 6 (Meltzer, B. and Michie, D., eds.), Edinburgh University Press, pp. 295-323.

Huffman, D. A. (1976), "Curvature and Creases: A Primer on Paper", IEEE Trans. Comput., Vol. C-25, No. 10, pp. 1010-1019.

Huffman, D. A. (1977), "Realizable Configurations of Lines in Pictures of Polyhedra", Machine Intelligence Vol. 8 (Elcock, E. W. and Michie, D., eds.), Edinburgh University Press, Edinburgh, pp. 493-509.

Kanade, T. (1978), "Task Independent Aspect of Image Understanding", Proc. Image Understanding Workshop, Cambridge, Massachusetts, May. Also an extended version will be available as Technical Report, Department of Computer Science, Carnegie-Mellon University.

Mackworth, A. K. (1973), "Interpreting Pictures of Polyhedral Scenes", Artificial Intelligence, Vol. 4, No. 2, pp. 121-137.

Mackworth, A. K. (1977), "How to See a Simple World: An Exegesis of Some Computer Programs for Scene Analysis", Machine Intelligence Vol. 8 (Elcock, E. W. and Michie, D., eds.), Edinburgh University Press, Edinburgh, pp. 510-537.

Minsky, M. (1974), "A Framework for Representing Knowledge", MIT AI Memo No. 306. Also in Psychology of Computer Vision (Winston, P. H., ed.), McGraw Hill, 1975.

Turner, K. J. (1974), "Computer Perception of Curved Objects Using a Television Camera", Ph.D. Thesis, School of Artificial Intelligence, Edinburgh University.

Waltz, D. (1972), "Generating Semantic Descriptions from Drawings of Scenes with Shadows", MAC AI-TR-271, MIT, reproduced in The Psychology of Computer Vision (Winston, P. H., ed.), McGraw Hill, 1975.


APPENDIX ORIGAMI JUNCTION DICTIONARY

JUNCTION TYPE - L

L I * L2* L3* L4*

Al

