
ELSEVIER Image and Vision Computing 14 (1996) 179-188

New algorithm for calculating an invariant of 3D point sets from a single view

Y. Zhu, L.D. Seneviratne, S.W.E. Earles

Department of Mechanical Engineering, King’s College London, Strand, London WC2R 2LS, UK

Received 1 December 1994; revised 19 July 1995

Abstract

The invariant used as an index has shown many advantages over the pose dependent methods in model-based object recognition. Although perspective, or even weak perspective invariants do not exist for general three-dimensional point sets from a single view, invariants do exist for structured three-dimensional point sets. However, such invariants are not easy to derive. A new interpretation of calculating invariants for a special structure of three-dimensional objects is presented. The 3D invariant structure proposed by Rothwell requires seven points that lie on the vertices of a six-sided polyhedron and is applicable to position free objects. In comparison, the proposed algorithm requires only six points on adjacent (virtual) planes that provide two sets of four coplanar points and does not require the position free condition. Hence it is applicable to a wider class of objects. The algorithm is demonstrated on images from real scenes.

Keywords: 3D projective invariants; Object recognition; Shape descriptors

1. Introduction

The fundamental difficulty in recognising objects from images is that the appearance of a shape varies with different viewpoints. In a typical model-based object recognition scheme an image of an unknown object, observed from an unknown viewpoint, is compared with a known object model stored in a library. The object model can be built up from commercial CAD software or taken from a certain viewpoint by a camera. This difficult task can be greatly facilitated if certain geometric descriptions are unaffected by different viewpoints and by intrinsic parameters of the camera. Such invariant descriptions can be measured directly from images without prior knowledge of the position and orientation of the camera, so camera calibration is avoided during object recognition.

0262-8856/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. SSDI 0262-8856(95)01055-6

Much work has been done on planar invariants of objects [1,2] using plane collections of point sets or conics. Lamdan et al. [1], assuming the affine transformation to approximate the perspective transformation, used three non-collinear points as the basis of a special coordinate system. Any affine transformations applied to the object points do not change their coordinates relative to the same ordered base triplet, or to a system of the distinguished frame. Lamdan et al. presented a unified approach to the representation and matching problems, which applies to object recognition under various geometric transformations of planar point sets. The algorithm works in two stages: an off-line model preprocessing (shape representation) phase which is independent of the scene information, and a recognition phase, based on efficient indexing, called geometric hashing. However, the method still depends on other points in the scene to support the hypotheses of correspondence. In effect, it is still a voting mechanism for recognition. Forsyth et al. [2] focused their research on a plane projective group to recognise curved plane objects in three dimensional space. Models are generated directly from image data. Their main contribution is that curves are handled as geometric features (conics) with an identity of their own, and are not approximated by line segments. An invariant fitting theorem, which works for algebraic curves of any degree, was introduced. Image curves are then represented by invariant shape descriptors, which allow direct indexing into a model library. Pose recovery, using algebraic curves themselves instead of point sets via equiform invariants, was presented, and the position and orientation of the object plane, with respect to the camera, are obtained. This is a typical problem of camera calibration. Their extension to the three dimensional problem is based on the recovered orientation parameters of the object planes, and can be used for a constrained class of objects.

Much attention has been paid to 3D object recognition [3-8]. Previous 3D object recognition systems [9,10] rely on the pose of objects for recognition. The shape measured in images depends not just on the shape of the object observed, but also on its pose and the intrinsic parameters of the camera. If three dimensional information is available directly from the scene (for example, range images obtained by a laser range scanner), the collected scene data can be segmented into surface patches and the segments used to calculate various surface properties. The schemes of object recognition do not vary significantly from system to system and from 2D to 3D objects. The complexity of 3D object matching has resulted in the extensive study of efficient and adaptive search strategies [11,12]. If invariant shape descriptors are available, the problem of identifying objects is decoupled from that of determining their pose. It is possible for recognition to be achieved without reference to the object pose. It is also unnecessary to search the model library, since the invariant descriptors index a model directly.

Some 3D invariants have been constructed when a weak perspective approximation is assumed. Wayner [7] considered a structure of four points, composing three orthogonal vectors in three dimensional space, and constructed an invariant descriptor of a two-dimensional image under orthographic projection and scaling. For an arbitrary set of four points certain invariant relations are stated for 2D images. Weinshall [6] proposed a representation of five arbitrary non-coplanar points for affine shapes and four arbitrary non-coplanar points for rigid shapes, which are invariant under weak perspective to either rigid or linear 3D transformations. To compute these invariants the algorithm requires at least two frames for affine shapes and three frames for rigid shapes. The weak perspective assumption is valid only when the relative distances between points in the object are much smaller than their distances to the camera.

Barrett et al. [8] derived the invariant of six points in space when the images are constituted by a stereo pair. For general three dimensional point sets and uncalibrated cameras, the algorithm requires at least 34 images with 8 corresponding points to identify an object; they did not claim that this is the simplest structure for recognising general three dimensional point sets, but rather showed how the Homogeneous Equations (HE) method can be used as a general method to derive invariants. Longuet-Higgins [13] obtained a projective invariant of eight points in general positions of two images. Faugeras [14] and Hartley et al. [15] developed invariants for seven points in general three dimensional positions from two uncalibrated images, and Quan [16] extended the algorithms to the case of six points from three uncalibrated images.

Rothwell et al. [4,5] argued that, although invariants do not exist for general three-dimensional point sets from a single view (supported in [17,18]), invariants do exist for structured three-dimensional point sets. Projective invariants were derived for two classes of objects. The first is for points that lie on the vertices of a polyhedron, and invariants are computed by using an algebraic framework of constraints between points and planes. The second is for objects that are bilaterally symmetric. For the first class of objects, a minimum of seven points, that lie on the vertices of a six-sided polyhedron, are required in order to recover the structure of a projectivity. The object should be position free [4,19]. For the second class of objects, a minimum of eight points, or four points and two lines that are bilaterally symmetric, are needed.

In this paper, a new 3D object recognition algorithm based on a single 2D greylevel image is presented. Projective invariants are derived from a structure with six points on adjacent (virtual) planes that provides two sets of four coplanar points. No position free condition is needed. The condition to identify a 3D object is more general compared to Rothwell's method and can be used more widely. The algorithm is demonstrated on images of real scenes. It is noted that the six point configuration used has been described recently by Zisserman et al. [20] and Sparr [21]; however, the interpretation of this independently derived structure in this paper is novel.

2. Invariant shape descriptors from a single perspective view

An invariant is defined relative to a particular group of transformations for a set of geometric entities under a certain structure. Forsyth et al. [2] and Barrett et al. [8] gave a number of examples of invariants. In their derivations, Forsyth studied invariants more from algebraic theory, while Barrett used the techniques of determinant ratio and homogeneous equations and provided a good account of the measurement of given geometric entities.

In this section, a 3D projective invariant from a single view is presented, based on a structure with six points on adjacent (virtual) planes that provides two sets of four coplanar points. The following theorem is proved using determinant ratio techniques.

Fig. 1. (a) A structure of six points A, B, C, D, E and F provides two sets of four coplanar points. Points S1, S2 and G lie on the plane defined by the points B, C and F. (b) A structure of seven points which provides three sets of four coplanar points.

Theorem 1. Let A, B, C, D, E and F be six points on adjacent planes of an object providing two sets of four coplanar points (points A, B, C and D for one coplanar set and points C, D, E and F for the other; see Fig. 1a). No three of the four coplanar points are collinear. Let A', B', C', D', E' and F' be their image points on the image plane. Let S1 be the intersection of the lines AD and BC, S2 be the intersection of the lines CF and DE, and G be the intersection of the lines FS1 and BS2. G', the image point of G, is constructed in the same way on the image plane. Let P_ABC represent the area of the triangle composed of the three vertices A, B and C. The cross-ratio of the areas of the corresponding triangles is a projective invariant:

I_1 = (P_CBA · P_CED · P_CFG) / (P_CDA · P_CEF · P_CGB)
    = (P_C'B'A' · P_C'E'D' · P_C'F'G') / (P_C'D'A' · P_C'E'F' · P_C'G'B')

Proof. Point G' is the image of its correspondent object point G under the projective transformation, though the point G may be inside the object and invisible. Points G and G', which are constructed through the intersection of lines in the object and the corresponding intersection in the image, are identical under the projective transformation [22]. Furthermore, points S1 and S2 lie on the plane defined by points C, B and F; so does point G.

Consider a world coordinate system O-xyz. Let the object points A(x_A, y_A, z_A), B(x_B, y_B, z_B) and C(x_C, y_C, z_C) lie on the plane p1 (Fig. 2). Then for the point A:

(a/d)·x_A + (b/d)·y_A + (c/d)·z_A + 1 = 0

where a, b, c and d are constants. Let the corresponding image points A'(u_A, v_A, w_A), B'(u_B, v_B, w_B) and C'(u_C, v_C, w_C) lie on the image plane p2. Then for the point A':

(e/h)·u_A + (f/h)·v_A + (g/h)·w_A + 1 = 0

where e, f, g and h are constants.

Fig. 2. Configuration of coplanar point projection.

For perspective projection, it can be shown that:

x_A / z_A = u_A / w_A,   y_A / z_A = v_A / w_A

Now, the following matrix-vector equation can be written for the point A, similar to Barrett [8]:

         | 0   0    0    1  |   | 1   |             | 0   0    0    1  |   | 1   |
(1/z_A)· | 0   0    1    0  | · | x_A |  = (1/w_A)· | 0   0    1    0  | · | u_A |
         | 0   1    0    0  |   | y_A |             | 0   1    0    0  |   | v_A |
         | 1  a/d  b/d  c/d |   | z_A |             | 1  e/h  f/h  g/h |   | w_A |

The equation can be expanded to include the points B and C. The resulting matrix equation is of the form:

M · Q_ABC = N · Q_A'B'C'

where:

M = | 0   0    0    1  |  ;   N = | 0   0    0    1  |
    | 0   0    1    0  |          | 0   0    1    0  |
    | 0   1    0    0  |          | 0   1    0    0  |
    | 1  a/d  b/d  c/d |          | 1  e/h  f/h  g/h |


and Q_ABC and Q_A'B'C' contain the coordinates of the object points (A, B, C) and their correspondent image points (A', B', C').

Using the determinant ratio technique, both the orientation matrices and the inverses of the z coordinates of the object and image points are eliminated, as outlined below:

(||Q_CBA|| · ||Q_CED|| · ||Q_CFG||) / (||Q_CDA|| · ||Q_CEF|| · ||Q_CGB||)
  = (||Q_C'B'A'|| · ||Q_C'E'D'|| · ||Q_C'F'G'||) / (||Q_C'D'A'|| · ||Q_C'E'F'|| · ||Q_C'G'B'||)

where, for example,

            | 1   1    1    1  |     | x_C  x_B  x_A |
||Q_CBA|| = | 0  x_C  x_B  x_A |  =  | y_C  y_B  y_A |
            | 0  y_C  y_B  y_A |     | z_C  z_B  z_A |
            | 0  z_C  z_B  z_A |

The Q terms in the above equations depend on the unknown world coordinate system. What is known are the coordinates of the object points in the object coordinate system and those of the image points in the image coordinate system. It is therefore necessary for the invariant to be measured independently of the world coordinate system. The determinant of the matrix Q_CBA is six times the volume of the tetrahedron formed by the points C, B, A and the origin of the world coordinate system O-xyz; or six times the product of the triangular area P_CBA and the perpendicular distance from the plane p1 to the origin of O-xyz. Further:

(||Q_CBA|| · ||Q_CED|| · ||Q_CFG||) / (||Q_CDA|| · ||Q_CEF|| · ||Q_CGB||)
  = (6·P_CBA·d_1 · 6·P_CED·d_2 · 6·P_CFG·d_3) / (6·P_CDA·d_1 · 6·P_CEF·d_2 · 6·P_CGB·d_3)

where d_1, d_2 and d_3 are the perpendicular distances from the origin to the three object planes. Hence:

I_1 = (P_CBA · P_CED · P_CFG) / (P_CDA · P_CEF · P_CGB)
    = (P_C'B'A' · P_C'E'D' · P_C'F'G') / (P_C'D'A' · P_C'E'F' · P_C'G'B')    (1)

The area of a triangle is a scalar term independent of the coordinate systems. Thus the theorem is proved.
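The construction of Theorem 1 can be sketched numerically. The helper names below are ours, not the paper's: the auxiliary points S1, S2 and G are built by line intersections in homogeneous coordinates, and I_1 is the stated cross-ratio of signed triangle areas. Because every point appears the same number of times in numerator and denominator, the value is unchanged when an arbitrary plane projectivity is applied to the image points.

```python
import numpy as np

def line(p, q):
    # Homogeneous line through two 2D points (cross product of the lifted points).
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def meet(l, m):
    # Intersection of two homogeneous lines, dehomogenised to 2D.
    x = np.cross(l, m)
    return x[:2] / x[2]

def area(p, q, r):
    # Signed area P_pqr of the triangle p, q, r.
    return 0.5 * np.linalg.det([[p[0], p[1], 1.0],
                                [q[0], q[1], 1.0],
                                [r[0], r[1], 1.0]])

def invariant_I1(A, B, C, D, E, F):
    S1 = meet(line(A, D), line(B, C))    # S1 = AD x BC
    S2 = meet(line(C, F), line(D, E))    # S2 = CF x DE
    G  = meet(line(F, S1), line(B, S2))  # G  = FS1 x BS2
    return (area(C, B, A) * area(C, E, D) * area(C, F, G)) / \
           (area(C, D, A) * area(C, E, F) * area(C, G, B))

# Illustrative image points (not from the paper) and a plane projectivity.
pts = [(0, 0), (4, 0), (3, 2), (1, 2), (5, 3), (0, 4)]
H = np.array([[1.0, 0.2, 3.0], [0.1, 1.1, -2.0], [0.001, 0.002, 1.0]])
def warp(p):
    x = H @ [p[0], p[1], 1.0]
    return x[:2] / x[2]

v1 = invariant_I1(*pts)
v2 = invariant_I1(*[warp(p) for p in pts])
```

Here v1 and v2 agree to machine precision, which is what makes the measure comparable across views.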

During the calculation, seven points are used, the seventh point, G or G', being derived from the six basic points; so it is not independent. For the object, the calculation is carried out on three different planes. Using the determinant ratio technique and other elimination techniques, certain unknown information is eliminated from the equation of the invariant: (a) the orientation of the object and image planes, (b) the inverse of the z coordinates of object and image points, and (c) the distance from the origin of the world coordinate system to the three object planes and the image plane.
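The per-point relation underlying this elimination can be checked numerically: place A on an arbitrary object plane p1, take its image point as the intersection of the ray OA with an image plane p2, and evaluate both sides of the matrix-vector equation. The plane coefficients below are illustrative, not from the paper.

```python
import numpy as np

# Illustrative planes: p1: a*x + b*y + c*z + d = 0 (object plane),
# p2: e*u + f*v + g*w + h = 0 (image plane, here w = 1).
a, b, c, d = 1.0, 2.0, -1.0, 4.0
e, f, g, h = 0.0, 0.0, 1.0, -1.0

M = np.array([[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, a/d, b/d, c/d]])
N = np.array([[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, e/h, f/h, g/h]])

# Object point A on p1: pick x, y and solve the plane equation for z.
xA, yA = 1.0, 1.0
zA = -(a*xA + b*yA + d) / c

# Image point A' = intersection of the ray O->A with p2 (perspective
# projection through the origin): (uA, vA, wA) = s*(xA, yA, zA), A' on p2.
s = -h / (e*xA + f*yA + g*zA)
uA, vA, wA = s*xA, s*yA, s*zA

lhs = M @ np.array([1.0, xA, yA, zA]) / zA
rhs = N @ np.array([1.0, uA, vA, wA]) / wA
```

Both sides equal (1, y_A/z_A, x_A/z_A, 0): the last component vanishes because each point lies on its plane, which is exactly what the elimination exploits.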

Note that Eq. (1) has been proved as an invariant up to a perspective camera. In practice, if the camera is uncalibrated, it is not possible to measure the coordinates (u_A, v_A) in the image plane. It is well known that there are four intrinsic camera parameters: the coordinates of the camera centre (c_x, c_y), the focal length F and the aspect ratio α. The coordinates of the image point in the camera coordinate system can be written as:

(u'_A, v'_A) = (F·u_A/α + c_x, F·v_A + c_y)

If the coordinates (u_A, v_A) are replaced with (u'_A, v'_A) in the preceding proof, then the matrix N becomes N':

N' = |     0        0     0     1    |
     |  -c_y/F      0    1/F    0    |
     | -c_x·α/F    α/F    0     0    |
     |     1      e·α/H  f/H  g·F/H  |

where:

H = F·h − e·c_x·α − f·c_y

Thus the matrix N' can be eliminated in a similar manner to N. So I_1 is invariant to camera calibration.
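Since the substitution from ideal to pixel coordinates is affine, the cross-ratio of areas is unaffected; a quick numerical sketch (the intrinsic values and helper names are ours, for illustration):

```python
import numpy as np

def _line(p, q): return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])
def _meet(l, m):
    x = np.cross(l, m)
    return x[:2] / x[2]
def _area(p, q, r):
    # Signed triangle area P_pqr.
    return 0.5 * np.linalg.det([[p[0], p[1], 1.0],
                                [q[0], q[1], 1.0],
                                [r[0], r[1], 1.0]])

def invariant_I1(A, B, C, D, E, F):
    S1 = _meet(_line(A, D), _line(B, C))
    S2 = _meet(_line(C, F), _line(D, E))
    G  = _meet(_line(F, S1), _line(B, S2))
    return (_area(C, B, A) * _area(C, E, D) * _area(C, F, G)) / \
           (_area(C, D, A) * _area(C, E, F) * _area(C, G, B))

Fcl, alpha, cx, cy = 750.0, 1.1, 320.0, 240.0   # illustrative intrinsics
def to_pixels(p):
    # (u', v') = (F*u/alpha + cx, F*v + cy), as in the text.
    return (Fcl * p[0] / alpha + cx, Fcl * p[1] + cy)

pts = [(0, 0), (4, 0), (3, 2), (1, 2), (5, 3), (0, 4)]
v_ideal  = invariant_I1(*pts)
v_pixels = invariant_I1(*[to_pixels(p) for p in pts])
```

v_ideal and v_pixels agree, so the same index value is obtained whether or not the intrinsic parameters are known.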

Similarly, point H, which is the intersection of the lines S1E and S2A, can be used to construct another



Fig. 3. (a)-(b) One perspective view of the polyhedron with and without hidden lines. (c)-(d) A second view. On the right the points of (b) used to compute the invariants are given. (Adapted from Rothwell et al. [4].)

invariant, I_2 (Eq. (2)).

In Appendix 1 it is shown that I_2 is functionally dependent on the first invariant I_1. Interestingly, when one of the four object points is not coplanar with the other three points, the dependency still holds in the image plane, but does not hold for the 3D object model. In other words, the difference between the dimension of the configuration and the dimension of the transformation group acting on the configuration is larger than one. This gives another method for checking the coplanarity of four points on an object in the 3D object space, though it is more complicated than the simple algebraic method.

The following theorem can also be proved in a similar way to Theorem 1.

Theorem 2. Let A, B, C, D, E, F and K be seven points on three mutually adjacent planes which provide three sets of four coplanar points. Let point C be trihedral and not collinear with any other two points. Then the cross-ratio of areas of corresponding triangles is a projective invariant:

I_3 = (P_CKA · P_CBD · P_CEF) / (P_CEK · P_CAB · P_CFD)
    = (P_C'K'A' · P_C'B'D' · P_C'E'F') / (P_C'E'K' · P_C'A'B' · P_C'F'D')    (3)
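I_3 needs no auxiliary constructed points, so it can be computed directly from seven image points. As a plausibility check (the point coordinates and homography below are ours, not the paper's): each point appears equally often in numerator and denominator, so the value is unchanged by any plane projectivity.

```python
import numpy as np

def area(p, q, r):
    # Signed triangle area P_pqr.
    return 0.5 * np.linalg.det([[p[0], p[1], 1.0],
                                [q[0], q[1], 1.0],
                                [r[0], r[1], 1.0]])

def invariant_I3(A, B, C, D, E, F, K):
    # Cross-ratio of areas from Eq. (3).
    return (area(C, K, A) * area(C, B, D) * area(C, E, F)) / \
           (area(C, E, K) * area(C, A, B) * area(C, F, D))

# Illustrative seven image points and a plane projectivity.
pts = [(0, 0), (4, 0), (3, 2), (1, 2), (5, 3), (0, 4), (2, 5)]
H = np.array([[1.0, 0.2, 3.0], [0.1, 1.1, -2.0], [0.001, 0.002, 1.0]])
def warp(p):
    x = H @ [p[0], p[1], 1.0]
    return x[:2] / x[2]

v3a = invariant_I3(*pts)
v3b = invariant_I3(*[warp(p) for p in pts])
```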

Since there are three cases of two sets of four coplanar points, the number of independent invariants for this structure is three.

Table 1
Invariants of the polyhedron and its images shown in Fig. 3

        model    image 1   image 2   image 2^a
I_1     1.2607   1.2607    1.2855    1.2607
I_2     0.7932   0.7932    0.7779    0.7932

^a Coordinates of the image calculated by projective transformation.

3. Application of invariant for 3D object recognition

The method is first checked using the tutorial example in [4]. This is a polyhedral object with six faces (F1, F2, ..., F6), and Fig. 3 shows its two different perspective views and the image points (1, 2, ..., 7). The plane equations of this six-sided polyhedral object and the coordinates of its image points are provided in Appendix 2. I_1 and I_2, as outlined in Eqs. (1) and (2), are computed for the two images, Fig. 3(b) and (d), using the image points 1, 2, 4, 5, 6 and 7. The results are given in Table 1. The invariants of the object model and of image 1 (Fig. 3(b)) are identical. For image 2 (Fig. 3(d)), the coordinates of the image points are not given in [4] and so were measured to 1 mm accuracy. As a result, the invariants are slightly different. If the coordinates are calculated by projective transformation (for the measured coordinates and the modified ones of the seven image points in Fig. 3(d), see Appendix 2), the same invariants are found (Table 1: image 2^a).
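A synthetic version of this check can be run without the data of [4] (the 3D coordinates and cameras below are ours, for illustration): six 3D points with the required coplanar quadruples are imaged by two different uncalibrated projective cameras, and I_1 agrees across the two views.

```python
import numpy as np

def line(p, q): return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])
def meet(l, m):
    x = np.cross(l, m)
    return x[:2] / x[2]
def area(p, q, r):
    return 0.5 * np.linalg.det([[p[0], p[1], 1.0],
                                [q[0], q[1], 1.0],
                                [r[0], r[1], 1.0]])

def invariant_I1(A, B, C, D, E, F):
    S1 = meet(line(A, D), line(B, C))
    S2 = meet(line(C, F), line(D, E))
    G  = meet(line(F, S1), line(B, S2))
    return (area(C, B, A) * area(C, E, D) * area(C, F, G)) / \
           (area(C, D, A) * area(C, E, F) * area(C, G, B))

# A, B, C, D lie on the plane z = 0; C, D, E, F lie on the plane y + z = 2.
X = {'A': (0, 0, 0), 'B': (4, 0, 0), 'C': (3, 2, 0), 'D': (1, 2, 0),
     'E': (5, 3, -1), 'F': (0, 4, -2)}

def project(P, Xw):
    x = P @ np.array([*Xw, 1.0])
    return x[:2] / x[2]

# Two arbitrary (different) projective cameras.
P1 = np.array([[800.0,  0.0, 320.0, 100.0],
               [  0.0, 800.0, 240.0, -50.0],
               [  0.0,  0.0,   1.0,  20.0]])
P2 = np.array([[780.0, 30.0, 300.0, 200.0],
               [ 15.0, 790.0, 250.0, -80.0],
               [ 0.02, 0.01,   1.0,  25.0]])

v_view1 = invariant_I1(*[project(P1, X[k]) for k in 'ABCDEF'])
v_view2 = invariant_I1(*[project(P2, X[k]) for k in 'ABCDEF'])
```

Both views give the same index, mirroring the agreement between the model and image columns of Table 1.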

In Fig. 4, six images of a bearing support are presented, with six correspondent points marked on each image (1, 2, 3, 4, 5, 6). The images are processed by fitting straight lines to edge data, and vertex positions are found by intersecting pairs of lines (by hand). The correspondence between object and image points is assumed. The invariants for the six images are given in Table 2. Since I_2 is dependent on I_1, only I_1 is listed. The I_1 values are fairly constant with change of the viewpoints, the deviation being due to measuring inaccuracies.

It can be proved that if three lines on two adjacent planes formed by the six points intersect at one point (three parallel lines can be considered to intersect at infinity), then the invariants are equal to unity. For the


Fig. 4. Six images of a bearing support from different points of view and the points used to compute the invariants (points 1, 2, 3, 4 are coplanar, so are points 3, 4, 5, 6 and points 2, 4, 6, 7).

purpose of discrimination between different objects, this type of structure should be avoided. If point 1 on the top surface in Fig. 4(a) is replaced by point 7 on the base surface to form a new set of six points, the new invariants are listed in Table 3(a).

If three lines, formed by these six points on which two adjacent planes are defined (for example, lines AB, CD and EF in Fig. 1), are not parallel, the invariant can be

Table 2
Invariants of the bearing support and its six images

           I_1
model      1.0000
image 1    1.0136
image 2    0.9529
image 3    1.0035
image 4    0.9624
image 5    0.9546
image 6    0.9614

Table 3
(a) New invariants using another six points for the bearing support and its six images. (b) Invariants of the bearing support and its six images using five coplanar points

(a)        I_1
model      0.3482
image 1    0.3158
image 2    0.3423
image 3    0.3544
image 4    0.3486
image 5    0.3564
image 6    0.3135

(b)        I
model      0.5342
image 1    0.4616
image 2    0.5205
image 3    0.5489
image 4    0.5352
image 5    0.5537
image 6    0.4566

calculated from five coplanar points. Provided that the four points on one plane form two lines that are not parallel, there is an intersection point which is common to both planes. In this case, five coplanar points are available but there is at least one instance of three collinear points. Usually the theorem for plane projective invariants of five coplanar points stipulates the condition that no three points are collinear [2,8,23]. However, the invariant can still be obtained if there exists only one instance of three collinear points. The number of invariants is then reduced from two to one. The result by this method is listed in Table 3(b). If the three lines constituted by the six points are parallel, the fifth coplanar point cannot be obtained in the object space, so no invariant can be calculated using the five coplanar points method. It can be obtained from Eq. (1) and the value, as mentioned above, is unity: the same as for three lines constituted by the six points intersecting at one point. In the image space, if the three lines intersect at a point, the invariant is also equal to unity.
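The unity property is easy to reproduce in the image plane. In the configuration below (ours, for illustration) the lines AB (y = 0), CD (x = 0) and EF (y = x) are concurrent at the origin, and the computed invariant is exactly 1 up to rounding.

```python
import numpy as np

def line(p, q): return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])
def meet(l, m):
    x = np.cross(l, m)
    return x[:2] / x[2]
def area(p, q, r):
    return 0.5 * np.linalg.det([[p[0], p[1], 1.0],
                                [q[0], q[1], 1.0],
                                [r[0], r[1], 1.0]])

def invariant_I1(A, B, C, D, E, F):
    S1 = meet(line(A, D), line(B, C))
    S2 = meet(line(C, F), line(D, E))
    G  = meet(line(F, S1), line(B, S2))
    return (area(C, B, A) * area(C, E, D) * area(C, F, G)) / \
           (area(C, D, A) * area(C, E, F) * area(C, G, B))

# Lines AB (y = 0), CD (x = 0) and EF (y = x) all pass through the origin.
A, B = (1, 0), (3, 0)
C, D = (0, 1), (0, 3)
E, F = (1, 1), (2, 2)
val = invariant_I1(A, B, C, D, E, F)   # -> 1.0
```

This is why such configurations must be avoided when discriminating between objects: every concurrent-line structure maps to the same index value.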

In Fig. 5, four images of a spray head of a domestic chemical container are shown, with six points identified for computing the invariant. From observation of the object, the left corner point (point 5 is the image point in Fig. 5(a)) is not exactly coplanar with the other three points 3, 4 and 6, hence the error bound of the reconstruction could be large (Fig. 6(b)). The invariants are shown in Table 4.

If seven points that lie on the vertices of a six-sided polyhedron, as required in Rothwell's method, are available, then three projective invariants can be derived using the presented method. Theorem 2 provides another invariant for this structure. The invariant I_3 (from Eq. (3)) of the polyhedral object shown in Fig. 3(b) is listed in


Fig. 5. Four images of a spray head from different points of view and the points used to compute invariants.

Table 5, which, together with Table 1, gives two of the three independent invariants for this structure.

Once the object is positively identified by matching invariants, a three dimensional projective transformation can be obtained by the method described in [22]. The transformation has 11 essential parameters, since the overall scale of the matrix does not matter in homogeneous coordinates. Six points in the image plane, having 12 degrees of freedom, can provide one invariant subject to 11 essential parameters for the projective transformation. Therefore six pairs of corresponding points between model and scene are enough for determining the projective transformation. If more than six points are available, the system is overconstrained and the least squares method is employed in the presence of noise. If the intrinsic parameters of the camera are known, the position and orientation of the object observed can be recovered with respect to the camera coordinate system, which is important in robot
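The determination of an 11-parameter projection from six or more correspondences can be sketched with a standard direct linear transform (a generic DLT sketch, not necessarily the procedure of [22]; all names and values below are ours):

```python
import numpy as np

def dlt_camera(Xw, xs):
    # Estimate the 3x4 projective matrix P (11 dof, scale-free) from n >= 6
    # world->image correspondences via the direct linear transform.
    rows = []
    for (x, y, z), (u, v) in zip(Xw, xs):
        Xh = [x, y, z, 1.0]
        rows.append([0, 0, 0, 0, *[-c for c in Xh], *[v * c for c in Xh]])
        rows.append([*Xh, 0, 0, 0, 0, *[-u * c for c in Xh]])
    _, _, Vt = np.linalg.svd(np.array(rows, float))
    return Vt[-1].reshape(3, 4)   # null vector of the stacked constraints

# Synthetic check with an illustrative camera and six non-coplanar points.
P_true = np.array([[700.0, 10.0, 320.0,  50.0],
                   [  5.0, 710.0, 240.0, -30.0],
                   [  0.0,   0.0,   1.0,  10.0]])
Xw = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (2, 1, 3)]

def proj(P, Xp):
    h = P @ np.array([*Xp, 1.0])
    return (h[0] / h[2], h[1] / h[2])

xs = [proj(P_true, Xp) for Xp in Xw]
P_est = dlt_camera(Xw, xs)
err = max(np.hypot(*np.subtract(proj(P_est, Xp), x)) for Xp, x in zip(Xw, xs))
```

With exact (noise-free) data the recovered matrix reprojects the six points essentially perfectly; with more points and noise the same system is solved in the least squares sense.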

Fig. 6. Registration of 3D models onto the images.

Table 4
Invariants of four images of a spray head

image 1    0.8271
image 2    0.7702
image 3    0.8305
image 4    0.8368

applications. Interestingly, four coplanar points can provide a unique solution for determining the position and orientation of the object, with respect to the camera coordinate system, if the coordinates of the principal point of the camera in the image plane and the focal length of the camera are known [24].

The transformed model is projected onto the image plane to show registration between them, and the correspondence of the model to the image features, excluding reference points, can be used as verification (Fig. 6).

4. Conclusions

The invariant used as an index has many advantages over pose dependent methods in model-based object recognition. It is important to investigate simple and easily obtained three dimensional structures for computing projective invariants, since invariants do not exist for unconstrained 3D point sets. A new algorithm to compute an invariant, based on a structure of six points on adjacent planes providing two sets of four coplanar points, is presented. The algorithm can be extended to a structure of seven points on three mutually adjacent planes, which provides three sets of four coplanar points. The essential condition for the method is six points instead of seven. No position free condition is needed. The main idea is that a 3D object can be considered as composed of a set of surfaces, both curved (nonplanar) and planar. Objects which are composed of planar surfaces, or point sets in virtual planes, can be recognised by means of an invariant, which is a collection of areas of planar triangles. The invariant is derived by determinant ratio and other elimination techniques.

The algorithm is more general compared to Rothwell's structure. For example, Rothwell's method would find difficulty in dealing with a five-sided polyhedron, or a truncated tetrahedron, because it is a non-position-free polyhedron. This can be recognised by the presented method. By nature, an invariant based on a group of features is a local description of an object. Such invariants can only be used in cases where the features are

Table 5
Invariants of the polyhedron and its images shown in Fig. 3 when seven points are available

        model    image 1   image 2   image 2^a
I_3     0.9745   0.9745    1.0597    0.9745

^a Coordinates of the image modified by projective transformation.


Fig. 7. Configuration of six points on the image plane.

completely available. Using fewer points, there is a better chance to obtain the features necessary to compute invariants. So the algorithm presented is more generally applicable than Rothwell's. Furthermore, the presented structure has already found applications in solving the fundamental problems of computer vision under the condition of projective geometry [3].

Further efforts are needed to investigate how errors introduced by the sensor and the feature extraction schemes affect the invariant.

Appendix 1. Proof of dependence of I_2 on I_1

Let 1, 2, 3, 4, 5 and 6 be six planar points, no three collinear. Points 7 and 8 are constructed in the same way as points G and H. There are two invariants in the six point structure shown in Fig. 7:

I_1 = (P_356 · P_341 · P_382) / (P_345 · P_326 · P_318)

I_2 = (P_541 · P_523 · P_567) / (P_531 · P_526 · P_514)

It can be proved that:

I_1 · I_2 = 1

Let:

        | x    y    1 |               | x_i  y_i  1 |
P_jk =  | x_j  y_j  1 |  ;   P_ijk =  | x_j  y_j  1 |
        | x_k  y_k  1 |               | x_k  y_k  1 |

and from matrix theory:

P_ijk = P_jki = P_kij,   P_ijk = −P_jik

Let:

δxy_ij = x_i·y_j − x_j·y_i
δx_ij = x_i − x_j
δy_ij = y_i − y_j

Note that if P_45 = 0 represents the line equation passing through points 4 and 5, and P_13 = 0 represents the line equation passing through points 1 and 3, then the equation:

L_1: (1 − λ)·P_45 + λ·P_13 = 0    (1)

represents all lines passing through the intersection of the lines L_45 and L_13, where λ is an unknown parameter.

Similarly, all lines passing through the intersection of the line L_56 and the line L_23 can be written in the same way:

L_2: (1 − μ)·P_32 + μ·P_56 = 0    (2)

where μ is an unknown parameter. If L_1 passes through point 6, then:

(1 − λ)·P_645 + λ·P_613 = 0

and:

λ = P_645 / (P_645 − P_613),   1 − λ = −P_613 / (P_645 − P_613)

so L_1 can be rewritten as:

L_16: −P_645·P_13 + P_613·P_45 = 0    (3)

assuming P_645 ≠ P_613.

Similarly, if L_2 passes through point 4, L_2 can be rewritten as:

L_24: −P_432·P_56 + P_456·P_32 = 0    (4)

assuming P_432 ≠ P_456.

Solving Eqs. (3) and (4) simultaneously, the coordinates of the intersection of the lines L16 and L24 are:

  x7 = (1/Δ1)·[(δx56·P432 - δx32·P456)·(P645·δxy13 - P613·δxy45)
             - (δx13·P645 - δx45·P613)·(P432·δxy56 - P456·δxy32)]      (5)

  y7 = (1/Δ1)·[(δy45·P613 - δy13·P645)·(P432·δxy56 - P456·δxy32)
             - (δy32·P456 - δy56·P432)·(P645·δxy13 - P613·δxy45)]      (6)

in which:

       | δy45·P613 - δy13·P645    δx13·P645 - δx45·P613 |
  Δ1 = |                                                |
       | δy32·P456 - δy56·P432    δx56·P432 - δx32·P456 |

assuming Δ1 ≠ 0. Through routine calculation and simplification:

  Δ1·x7 = -P456·P613·P532·x4 + P432·P645·P156·x3
          - P432·P645·P356·x1 + P456·P645·P312·x3      (7)

  Δ1·y7 = -P456·P613·P532·y4 + P432·P645·P156·y3
          - P432·P645·P356·y1 + P456·P645·P312·y3

  Δ1 = P645·P432·P153 + P613·P456·P435
       + P613·P456·P245 + P456·P645·P123


Fig. 8. Point C can be anywhere on the plane.

Now to calculate the area of the triangle 5, 6, 7:

         | x5 y5 1 |
  P567 = | x6 y6 1 | = x7·δy56 - y7·δx56 + δxy56
         | x7 y7 1 |

Through routine calculation and simplification:

  Δ1·P567 = P456²·(P361·P352 + P365·P321)      (8)

  Δ1·P574 = P456²·(P324·P315 + P312·P345)      (9)

In the same way:

  Δ2·P382 = P132²·(P564·P523 + P542·P563)      (10)

  Δ2·P318 = P132²·(P516·P534 + P546·P513)      (11)

in which:

       | δy32·P156 - δy56·P132    δx56·P132 - δx32·P156 |
  Δ2 = |                                                |
       | δy45·P213 - δy13·P245    δx13·P245 - δx45·P213 |

     = P156·P213·P342 + P156·P213·P532
       + P132·P645·P213 + P132·P245·P356

assuming Δ2 ≠ 0.

Let I, J, K and M be four planar points, no three collinear. C is another point on the plane (Fig. 8). It can be proved that:

  P_CMI·P_CJK + P_CJM·P_CIK = P_CKM·P_CIJ

The orders of permutation of all the P subscripts in the above equation follow the same direction in Fig. 8. If they are not in the same direction, change the orders of the subscripts and move the corresponding terms to the other side of the equation where necessary.

Constructing two symmetric identities:

  yI·P_CJK - yJ·P_CIK = yC·P_IJK - yK·P_CIJ      (12)

  xI·P_CJK - xJ·P_CIK = xC·P_IJK - xK·P_CIJ      (13)

Let Eq. (12)·δx_CM - Eq. (13)·δy_CM; then:

  P_CJK·(δx_CM·yI - δy_CM·xI) + P_CIK·(δy_CM·xJ - δx_CM·yJ)
    = P_IJK·(δx_CM·yC - δy_CM·xC) + P_CIJ·(δy_CM·xK - δx_CM·yK)      (14)

  P_CJK·(-P_CMI + δxy_CM) + P_CIK·(P_CMJ - δxy_CM)
    = P_IJK·δxy_CM + P_CIJ·(P_CMK - δxy_CM)      (15)

  P_CJK·P_CMI - P_CIK·P_CMJ + P_CIJ·P_CMK
    = δxy_CM·(P_CJK - P_CIK - P_IJK + P_CIJ)      (16)

Since P_CJK + P_CIJ = P_CIK + P_IJK, the right-hand side of Eq. (16) vanishes, which gives the identity above. (If C is inside the triangle I, J, K the relation P_CJK + P_CIJ = P_CIK + P_IJK still holds; in that case P_CIK < 0, and P_IJK < 0 if the order of permutation of the subscripts is clockwise. The relation is also true if C lies on any side of the triangle.) Hence the identity holds whether point C is inside or outside the trapezium I, J, K, M.

The whole proof involves only pure algebraic computation; no assumption has been made about the position of point C. Therefore, the identity is true wherever point C lies on the plane.

Having this result, Eqs. (8)-(11) can be further simplified:

  Δ1·P567 = P456²·P315·P326      (17)

  Δ1·P574 = P456²·P352·P341      (18)

  Δ2·P382 = P132²·P526·P534      (19)

  Δ2·P318 = P132²·P563·P541      (20)

Therefore:

  I1·I2 = (P356·P341·P382·P541·P523·P567) / (P345·P326·P318·P531·P526·P574)

        = (P356·P341·P526·P534·P541·P523·P315·P326)
          / (P345·P326·P563·P541·P531·P526·P352·P341)

        = 1

since P534 = P345, P523 = P352, P356 = P563 and P315 = P531. Hence I2 is not independent of I1.
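The four-point identity used in the proof can be spot-checked numerically for positions of C inside, outside and on a side of the quadrilateral (a quick sketch; the points I, J, K, M and the test positions of C are arbitrary):

```python
def det3(a, b, c):
    """P_abc: the determinant | xa ya 1; xb yb 1; xc yc 1 |."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

I, J, K, M = (0.0, 0.0), (3.0, 1.0), (4.0, 5.0), (-1.0, 2.0)

def identity_residual(C):
    # P_CMI * P_CJK + P_CJM * P_CIK - P_CKM * P_CIJ, which should vanish
    return (det3(C, M, I) * det3(C, J, K)
            + det3(C, J, M) * det3(C, I, K)
            - det3(C, K, M) * det3(C, I, J))

# C inside the quadrilateral, far outside it, and on the side I-J
residuals = [identity_residual(C) for C in [(1.5, 2.0), (10.0, -7.0), (1.5, 0.5)]]
```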

Appendix 2. Plane equations and image coordinates of the polyhedral object in Fig. 3

Table 6
The coordinates of the seven image points shown in Fig. 3(b)

        1           2           3           4          5          6           7
  x   0.1809485   0.0781118   0.164781    0.279058   0.142811   0.0311289  -0.0641656
  y  -0.0337354  -0.149954   -0.0954759   0.0183093  0.149108   0.117607   -0.0334405

The tutorial example used in [4] is a polyhedral object


Table 7
The coordinates of the seven image points shown in Fig. 3(d) (measured)

        1      2      3      4      5      6      7
  x   14.5   13.5   23     34.5   20.5    7.5    9.5
  y   13.5    7.5   11     19     26     25     16

Table 8
The coordinates of the seven image points shown in Fig. 3(d) (modified by projective transformation)

        1      2      3      4      5      6           7
  x   14.5   13.5   23     34.5   20.5    7.5         9.618425
  y   13.5    7.5   11     19     26     24.599781   15.373892

with six faces (see Fig. 3). The object is defined in a local (i.e. object-based) coordinate system by the equations of the six faces:

  F1: y + 0.05x + 1 = 0;    F2: 0.15x + z + 1 = 0;
  F3: -x + 1 = 0;           F4: 0.1x - z + 1 = 0;
  F5: x + 0.2~ + 1 = 0;     F6: -1.1y + 1 = 0.

The coordinates of the seven image points for the projection shown in Fig. 3b are given in Table 6. The coordinates of the seven image points for the projection shown in Fig. 3c (measured) are given in Table 7. The coordinates of the seven image points for the projection shown in Fig. 3c (modified by a projective transformation) are given in Table 8.
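For illustration (not from the paper), a model vertex can be recovered from the face equations by solving three plane equations simultaneously. The sketch below assumes that faces F1, F2 and F3 share a vertex, which the text does not state; the same routine applies to any face triple that actually meets at a corner of the polyhedron.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # pick the largest pivot in this column and swap it up
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# F1: 0.05x + y     = -1
# F2: 0.15x     + z = -1
# F3:   -x          = -1
A = [[0.05, 1.0, 0.0],
     [0.15, 0.0, 1.0],
     [-1.0, 0.0, 0.0]]
b = [-1.0, -1.0, -1.0]
vertex = solve3(A, b)   # -> x = 1, y = -1.05, z = -1.15
```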

Acknowledgements

Y. Zhu acknowledges the support given to him by an ORS Award and a K.C. Wong Scholarship.

References

[1] Y. Lamdan, J.T. Schwartz and H.-J. Wolfson, Object recognition by affine invariant matching, Proc. CVPR88, 1988, pp. 335-344.

[2] D.A. Forsyth, J.L. Mundy, A.P. Zisserman, C. Coelho, A. Heller and C.A. Rothwell, Invariant descriptors for 3-D object recognition and pose, IEEE Trans. Pattern Anal. Machine Intell., 13 (10) (October 1991) 971-991.

[3] R. Mohr, L. Morin and E. Grosso, Relative positioning with uncalibrated cameras, in Geometric Invariance in Computer Vision, J.L. Mundy and A. Zisserman (eds), MIT Press, 1992.

[4] C.A. Rothwell, D.A. Forsyth, A. Zisserman and J.L. Mundy, Extracting projective information from single views of 3D point sets, TR OUEL 1973/93, Department of Engineering Science, Oxford University, Oxford, 1993.

[5] C.A. Rothwell, D.A. Forsyth, A. Zisserman and J.L. Mundy, Extracting projective structure from single perspective views of 3D point sets, in Proc. 4th Int. Conf. Comput. Vision, Berlin, Germany, 1993, pp. 573-582 (received the Best Paper Award).

[6] D. Weinshall, Model-based invariants for 3-D vision, Int. Journal of Computer Vision, 10 (1) (1993) 27-42.

[7] P.C. Wayner, Efficiently using invariant theory for model-based matching, in Proc. Int. Conf. Comput. Vision Pattern Recog., 1991, pp. 473-478.

[8] E.B. Barrett, P.M. Payton, N.N. Haag and M.H. Brill, General methods for determining projective invariants in imagery, CVGIP: Image Understanding, 53 (1) (January 1991) 46-65.

[9] D.G. Lowe, The viewpoint consistency constraint, International Journal of Computer Vision, 1 (1) (1987) 57-72.

[10] D.W. Thompson and J.L. Mundy, Three-dimensional model matching from an unconstrained viewpoint, in Proc. IEEE Int. Conf. Robotics Automat., Raleigh, North Carolina, 1987, pp. 208-220.

[11] F. Arman and J.K. Aggarwal, CAD-based vision: Object recognition in cluttered range images using recognition strategies, CVGIP: Image Understanding, 58 (1) (July 1993) 33-65.

[12] F. Arman and J.K. Aggarwal, Model-based object recognition in dense-range images - A review, ACM Computing Surveys, 25 (1) (March 1993) 5-43.

[13] H.C. Longuet-Higgins, A computer algorithm for reconstructing a scene from two projections, Nature, 293 (1981) 395-397.

[14] O. Faugeras, What can be seen in three dimensions with an uncalibrated stereo rig?, in Proc. ECCV92, 1992, pp. 563-578.

[15] R. Hartley, R. Gupta and T. Chang, Stereo from uncalibrated cameras, in Proc. Conf. Computer Vision and Pattern Recognition, Urbana-Champaign, Illinois, USA, 1992, pp. 761-764.

[16] L. Quan, Invariants of six points and projective reconstruction from three uncalibrated images, IEEE Trans. Pattern Anal. Machine Intell., 17 (1) (January 1995) 34-46.

[17] J.B. Burns, R.S. Weiss and E.M. Riseman, View variation of point-set and line-segment features, IEEE Trans. Pattern Anal. Machine Intell., 15 (1) (January 1993) 51-68.

[18] D.T. Clemens and D.W. Jacobs, Space and time bounds on indexing 3-D models from 2-D images, IEEE Trans. Pattern Anal. Machine Intell., 13 (10) (October 1991) 1007-1017.

[19] K. Sugihara, Machine Interpretation of Line Drawings, MIT Press, 1986.

[20] A. Zisserman, D. Forsyth, J. Mundy, C. Rothwell and J. Liu, 3D object recognition using invariance, TR OUEL 2027/94, Dept. of Engineering Science, Oxford University, 1994.

[21] G. Sparr, A common framework for kinetic depth, reconstruction and motion for deformable objects, in Proc. ECCV94, 1994, pp. 471-482.

[22] J.L. Mundy and A. Zisserman, Projective geometry for machine vision, in Geometric Invariance in Computer Vision, J.L. Mundy and A. Zisserman (eds), MIT Press, 1992.

[23] I. Weiss, Geometric invariants and object recognition, International Journal of Computer Vision, 10 (3) (1993) 207-231.

[24] R. Horaud, B. Conio and 0. Leboulleux, An analytic solution for the perspective 4-point problem, CVGIP, 44 (1989) 33-44.

