LEARNING-BASED SUPER-RESOLUTION OF 3D FACE MODEL

Shiqi Peng, Gang Pan, Zhaohui Wu

Department of Computer Science, Zhejiang University, Hangzhou, 310027, China

{gpan,wzh}@cs.zju.edu.cn

ABSTRACT

Super-resolution techniques can produce an image of higher resolution than the originally captured one. However, nearly all super-resolution algorithms aim at 2D images. In this paper, we focus on generating a 3D face model of higher resolution from a single input 3D face model. In our method, the 3D face models are first regularized via resampling in a cylindrical representation. Super-resolution is then performed in the regular domain of cylindrical coordinates. Experiments using the USF HumanID 3D face database of 137 3D face models are carried out and demonstrate that the presented algorithm is promising.

1. INTRODUCTION

Super-resolution is a technique that can obtain an image of higher resolution than the originally captured one. Numerous approaches have been presented, such as [1, 2, 3, 4, 5, 6, 7]. These approaches can generally be classified into reconstruction-based approaches, which reconstruct a high-resolution image from a series of images, and learning-based ones, which usually generate a high-resolution image from a low-resolution image by learning from training samples or from strong image priors learned beforehand.

In general, 3D super-resolution refers to reconstructing a 3D model from multiple images, as in reference [12], which is a process from 2D images to a 3D model. The two most commonly used conventional methods are shape-from-stereo [13] and shape-from-shading [14].

However, there is little work on 3D super-resolution with a 3D model as input. Nowadays, with rapid advances in 3D acquisition techniques, 3D data are becoming more and more popular. There is a great need for this kind of 3D super-resolution in practice, because sometimes we cannot get enough information from a 3D model for rendering or recognition due to limitations of the 3D acquisition system or the environment conditions, for example, acquisition at a long distance, a non-cooperative target object, or 3D face recognition [9].

This paper addresses 3D super-resolution from a low-resolution 3D face model, which is a 3D-to-3D problem, different from those 3D super-resolution approaches that handle the 2D-to-3D case.

Fig. 1. Algorithm diagram: the low-resolution input and the high-resolution training samples are regularized, their parent structures are built, and prediction produces the high-resolution output.

2. THE PROPOSED ALGORITHM

Our approach is a learning-based algorithm. First we regularize the training samples in cylindrical coordinates and build a special feature structure, called the parent structure, for each sample. When a low-resolution 3D face model is given as input, we also regularize it and compute its parent structure. Based on the parent structures, the prediction algorithm produces a regularized image of higher resolution, from which the high-resolution 3D face model is finally recovered. Figure 1 shows the diagram of our algorithm.

2.1. Regularization of 3D Face Data

Unlike an image, which can be regarded as a matrix, a 3D face model is difficult to handle directly in its 3D domain because of the irregularity of its arbitrarily curved surface. In a face model, in general, some neighboring points are close to each other while others are quite far apart. For this reason, the first stage of the algorithm is to regularize the 3D face model.

We employ the cylindrical coordinate system as the intermediate representation for the following procedures, as shown in Fig. 2. The z-axis of the cylindrical polar coordinate system is chosen by hand. The 3D face model is resampled regularly in cylindrical coordinates, i.e., with the same interval in z and φ. In this way, each face model f(φ, z) in the cylindrical representation can be formed as a matrix M, which is called the regularized image, as shown in Fig. 3.

Fig. 2. Illustration of regularization (cylindrical axes z and r).

Fig. 3. Regularization sample: (a) original model; (b) regularized image.
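To make the resampling step concrete, the following is a minimal NumPy sketch of the regularization (not the authors' code): it assumes the model is already aligned so that the chosen z-axis passes through the head, and the grid size, hole handling, and the function name regularize are illustrative choices.

import numpy as np

def regularize(vertices, n_z=128, n_phi=96):
    """Resample a 3D face model (an N x 3 array of x, y, z vertices) onto a
    regular cylindrical grid, yielding the 'regularized image' M(z, phi)
    whose entries are the radial distances r = f(phi, z) from the z-axis."""
    x, y, z = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    phi = np.arctan2(y, x)          # angle around the z-axis, in [-pi, pi)
    r = np.hypot(x, y)              # radial distance from the z-axis

    # Map each vertex to a cell of the regular (z, phi) grid.
    z_span = (z.max() - z.min()) + 1e-12
    iz = np.clip(np.round((z - z.min()) / z_span * (n_z - 1)).astype(int), 0, n_z - 1)
    ip = np.clip(np.round((phi + np.pi) / (2 * np.pi) * (n_phi - 1)).astype(int), 0, n_phi - 1)

    # Average the radii of all vertices that fall into the same cell.
    total = np.zeros((n_z, n_phi))
    count = np.zeros((n_z, n_phi))
    np.add.at(total, (iz, ip), r)
    np.add.at(count, (iz, ip), 1.0)
    M = np.divide(total, count, out=np.zeros_like(total), where=count > 0)
    return M                        # empty cells stay 0 and would need interpolation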

2.2. Generation of Feature Vectors

Feature vectors [18, 17] are employed to represent the regularized images hierarchically, similar to [10]. They are important components of the special structure for super-resolution, called the parent structure, described in the next section.

2.2.1. Gaussian and Laplacian Pyramids

Here we introduce the Gaussian and Laplacian pyramids briefly, since they are used to generate the feature vectors. The Gaussian pyramid [16] from level l = k to N of a matrix M consists of a series of matrices G_k(M), G_{k+1}(M), ..., G_N(M) where:

G_l(M) = \begin{cases} M & \text{if } l = k \\ \mathrm{REDUCE}(G_{l-1}(M)) & \text{if } k < l \le N \end{cases}

The REDUCE operator is a filtering followed by elimination of the unnecessary pixels. For a filter kernel w(i, j) of dimension 2 × 2 and reduction factor 4 we have:

\mathrm{REDUCE}(M)(i, j) = \sum_{m=1}^{2} \sum_{n=1}^{2} w(m, n)\, M(2i + m,\, 2j + n)

There is a corresponding EXPAND operator, which reconstructs the low-pass filtered image by interpolating between pixels in the reduced image and is used in the formation of the Laplacian pyramid.

The Laplacian pyramid also consists of a series of images, defined in terms of the Gaussian pyramid as follows:

L_l(M) = \begin{cases} G_l(M) - \mathrm{EXPAND}(G_{l+1}(M)) & \text{if } k \le l < N \\ G_l(M) & \text{if } l = N \end{cases}
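A compact sketch of REDUCE, EXPAND, and the construction of both pyramids follows. It assumes the image dimensions are divisible by 2^N, uses 0-based indexing, a 2 × 2 averaging kernel, and pixel-replication EXPAND as simplifications of [16]; these are not the paper's exact filters.

import numpy as np

def reduce_(M, w=np.full((2, 2), 0.25)):
    """REDUCE: weight the 2x2 neighborhood of every second pixel,
    a 0-based version of REDUCE(M)(i,j) = sum_{m,n} w(m,n) M(2i+m, 2j+n)."""
    rows, cols = M.shape[0] // 2, M.shape[1] // 2
    out = np.zeros((rows, cols))
    for m in range(2):
        for n in range(2):
            out += w[m, n] * M[m:m + 2 * rows:2, n:n + 2 * cols:2]
    return out

def expand(M):
    """EXPAND: upsample back to double size (pixel replication stands in
    for the interpolation used in [16])."""
    return np.repeat(np.repeat(M, 2, axis=0), 2, axis=1)

def pyramids(M, N):
    """Gaussian levels G_0..G_N and Laplacian levels L_0..L_N of M."""
    G = [M]
    for _ in range(N):
        G.append(reduce_(G[-1]))
    L = [G[l] - expand(G[l + 1]) for l in range(N)] + [G[N]]
    return G, L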

2.2.2. Feature Vectors

Now we can give our definition of the feature vectors. De Bonet and Viola consider this type of vector as "local texture measures" created using a filter bank. In [18], the filters are first and second derivatives, and in later work steerable filters are used. In [17], the filters are horizontal and vertical first and second derivative operators. Through experiments we found that the second derivatives have little effect on the result, so we apply only the first derivatives in our feature vector, defined as:

F_j(M_i) = (L_j(M_i),\, H_j(M_i),\, V_j(M_i)), \quad \text{for } j = 0, \dots, N

where H and V are

H_j(M) = G_j(M) \otimes h, \quad \text{for } k \le j \le N

V_j(M) = G_j(M) \otimes v, \quad \text{for } k \le j \le N

Here h is the kernel for the horizontal derivative and v is the kernel for the vertical derivative.
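Combining the pyramids above with first-derivative filters gives the feature vectors. In the sketch below, the exact kernels h and v are not specified by the paper, so simple central-difference kernels are assumed, and scipy.ndimage.convolve does the filtering.

import numpy as np
from scipy.ndimage import convolve

# Illustrative first-derivative kernels; the paper only states that h is the
# horizontal-derivative kernel and v the vertical-derivative kernel.
h_kernel = np.array([[-1.0, 0.0, 1.0]])
v_kernel = h_kernel.T

def feature_pyramid(M, N):
    """F_j(M) = (L_j(M), H_j(M), V_j(M)) for j = 0..N, where L_j is the
    Laplacian level and H_j, V_j are derivatives of the Gaussian level."""
    G, L = pyramids(M, N)                          # from the sketch above
    F = []
    for j in range(N + 1):
        Hj = convolve(G[j], h_kernel, mode='nearest')
        Vj = convolve(G[j], v_kernel, mode='nearest')
        F.append(np.stack([L[j], Hj, Vj]))         # shape (3, rows_j, cols_j)
    return F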

2.3. Parent Structure And Predicting

2.3.1. Definition of Parent Structure

As described above, we can get the feature vectors from the three pyramids. We then form the parent structure [18], which consists of a series of feature vectors.

Suppose that (m, n) is a pixel in the l-th level of a pyramid; its parent at the (l+1)-th level is (\lfloor m/2 \rfloor, \lfloor n/2 \rfloor). Therefore the parent structure vector of a pixel (m, n) in the l-th level is defined as:

PS_l(M)(m, n) = \bigl(F_l(M)(m, n), \dots, F_N(M)(\lfloor m/2^{N-l} \rfloor, \lfloor n/2^{N-l} \rfloor)\bigr)

Given a low-resolution model or a training sample, their parent structures can be computed. We can get the feature vectors of levels 0 to N from a training sample, but only levels k to N from a model that is 2^k times smaller. Our purpose is to predict the level 0 to k−1 feature vectors of the low-resolution model based on the training set.

2.3.2. Prediction Algorithm

We adopt a simple but efficient prediction algorithm, similar to [10]. Given a training set {T_i} and an input sample t, the algorithm predicts each element (m, n) in G_0(t) as follows:

1. Create PS_0(t)(m, n) and copy all information for levels k, ..., N from PS_k(t)(m, n).

2. Find j = \arg\min_i \| PS_k(t)(m, n) - PS_k(T_i)(m, n) \|.

3. Copy all information for levels 0, ..., k−1 from PS_0(T_j)(m, n) into PS_0(t)(m, n).

The distance function \| \cdot \| is a weighted L2 norm. We give the other feature components half as much weight as the Laplacian part, which has the main effect on the performance.
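A sketch of this per-pixel prediction, under one plausible reading of the indexing: the input's feature pyramid is assumed to occupy levels k..N (e.g. F_t = [None]*k + feature_pyramid(low_res, N - k)), and the function names and exact matching coordinates are our assumptions rather than the paper's.

import numpy as np

def parent_structure(F, l, m, n, N):
    """PS_l(M)(m, n): stack the feature vector of pixel (m, n) at level l
    with those of its ancestors up to level N."""
    return np.concatenate([F[j][:, m >> (j - l), n >> (j - l)]
                           for j in range(l, N + 1)])

def predict_pixel(F_t, F_train, m, n, k, N):
    """Predict the fine levels 0..k-1 for pixel (m, n) of G_0(t) by copying
    them from the training sample whose level-k parent structure is closest
    in a weighted L2 norm (Laplacian component weighted twice as much as
    the derivative components, as in the paper)."""
    weights = np.tile([1.0, 0.5, 0.5], N - k + 1)
    ps_t = weights * parent_structure(F_t, k, m >> k, n >> k, N)
    dists = [np.linalg.norm(ps_t - weights * parent_structure(F_i, k, m >> k, n >> k, N))
             for F_i in F_train]
    j = int(np.argmin(dists))
    # Levels 0..k-1 copied from the best-matching training sample T_j.
    return [F_train[j][l][:, m >> l, n >> l] for l in range(k)]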

2.4. Dealing with the "Block Effect"

Because the prediction algorithm assembles the predicted regularized image simply by copying from the training samples block by block, a "block effect" usually appears in the resulting image; it is often not obvious in the 2D image but becomes noticeable after transformation back to the 3D model. We can smooth the result at the 2D image-processing step, before reconstruction, to remove the high-frequency information, which is actually useless. This makes the surface smoother and suppresses the "block effect" to a certain extent. The kernel K used in this paper is:

K = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
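A one-line illustration of this smoothing step; the kernel matches the one above, and scipy.ndimage.convolve is just one way to apply it.

import numpy as np
from scipy.ndimage import convolve

K = np.ones((3, 3)) / 9.0          # the 3x3 averaging kernel from the paper

def suppress_block_effect(predicted_image):
    """Low-pass filter the predicted regularized image with K before
    reconstructing the 3D surface."""
    return convolve(predicted_image, K, mode='nearest')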

3. EXPERIMENTAL RESULTS

Our experimental data consist of 137 3D face models from the USF Human ID 3D face database [19]. Each model has more than 90,000 vertices and 180,000 faces after removal of the invalid vertices, which would be too time-consuming to process directly because of the amount of data. For convenience, we use the progressive mesh method by Hoppe [20] to downsample the models, finally obtaining simplified versions of the 3D face models, with 128 × 96 vertices for each model after regularization, to test our approach.

We use a resolution-reduced version of a 3D face model as input and consider the others as training samples, to predict the original model. Meanwhile, we also show the results of cubic interpolation for comparison. The super-resolution results for inputs reduced twice (1/16) and three times (1/64) are shown in Fig. 4 and Fig. 5, respectively. Notice the magnified part of the nose area in the figures (denoted by black curves): the result of our method is obviously closer to the original model than that of cubic interpolation.

Fig. 4. Super-resolution result of a 1/16 3D face model: (a) original; (b) input 1/16; (c) by our method; (d) by cubic interpolation.

Fig. 5. Super-resolution result of a 1/64 3D face model: (a) input 1/64; (b) by our method; (c) by cubic interpolation.

Figure 6 shows how the algorithm performance, measured by average RMS pixel error, varies with the number of training samples. Both the 1/16 input and the 1/64 input are plotted. Obviously, cubic interpolation is independent of the training samples. For the curves in the figure, the performance improves very slowly beyond about 80 training samples, which indicates that 80 training samples already cover most of the variation of the face shape.

Fig. 6. Performance vs. number of training samples for (a) the 1/16 input and (b) the 1/64 input: RMS error of our method and of cubic interpolation.
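For reference, the evaluation quantities can be computed along these lines (a sketch, not the authors' evaluation code): scipy.ndimage.zoom with order=3 stands in for the cubic-interpolation baseline, and k is the number of reduction steps, so k = 2 for the 1/16 input and k = 3 for the 1/64 input.

import numpy as np
from scipy.ndimage import zoom

def rms_error(predicted, original):
    """Average RMS pixel error between a predicted regularized image and
    the original high-resolution one (the measure plotted in Fig. 6)."""
    return float(np.sqrt(np.mean((predicted - original) ** 2)))

def cubic_baseline(low_res, k):
    """Cubic-interpolation baseline: upsample a regularized image that was
    reduced k times (each dimension halved k times) back to full size."""
    return zoom(low_res, 2 ** k, order=3)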

4. CONCLUSIONS

This paper has presented a learning-based super-resolution method for a single low-resolution 3D face model. The experiments conducted on the USF 3D face database of 137 face models show that the proposed algorithm gives a great improvement over cubic interpolation. It can be applied to generate a higher-resolution version of the input model when a high-resolution 3D face model is not available. It could also be used to improve the performance of 3D face recognition systems.

5. REFERENCES

[1] M. Elad and Y. Hel-Or, "A fast super-resolution reconstruction algorithm for pure translational motion and common space-invariant blur," IEEE Transactions on Image Processing, 10(8):1187-1193, 2001.

[2] N. X. Nguyen, "Numerical algorithms for image super-resolution," PhD thesis, Stanford University, Stanford, CA, 2000.

[3] F. M. Candocia and J. C. Principe, "Super-Resolution of Images Based on Local Correlations," IEEE-NN, 10(2):372, 1999.

[4] D. Capel and A. Zisserman, "Super-Resolution from Multiple Views using Learnt Image Models," IEEE CVPR, pp. 627-634, 2001.

[5] M. V. Joshi and S. Chaudhuri, "Zoom Based Super Resolution Through SAR Model Fitting," IEEE ICIP, 2004.

[6] H. Chang, D. Y. Yeung, Y. Xiong, M. V. Joshi, and S. Chaudhuri, "Super-Resolution Through Neighbor Embedding," IEEE CVPR, 2004.

[7] Wen-Yi Zhao, "Super-Resolution With Significant Illumination Change," IEEE ICIP, 2004.

[8] S. Baker and T. Kanade, "Limits on Super-Resolution and How to Break Them," IEEE PAMI, 24(9):1167-1183, 2002.

[9] Gang Pan and Zhaohui Wu, "3D Face Recognition from Range Data," Int'l Journal of Image and Graphics, 5(3):1-21, 2005.

[10] S. Baker and T. Kanade, "Hallucinating faces," 4th International Conference on Automatic Face and Gesture Recognition, March 2000.

[11] Z. Lin and H. Y. Shum, "Fundamental Limits of Reconstruction Based Superresolution Algorithms under Local Translation," IEEE PAMI, 26(1):83-97, 2004.

[12] R. D. Morris, P. Cheeseman, V. N. Smelyanskiy, and D. A. Maluf, "A Bayesian Approach to High Resolution 3D Surface Reconstruction from Multiple Images," Proc. of IEEE Workshop on Higher Order Statistics, 1999.

[13] Z. Zhang, R. Deriche, O. Faugeras, and Q.-T. Luong, "A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry," Technical Report No. 2273, INRIA, Sophia Antipolis, 1994.

[14] B. Horn and M. Brooks, "Shape from Shading," MIT Press, 1989.

[15] P. Burt, "Fast filter transforms for image processing," Computer Graphics and Image Processing, 16:20-51, 1980.

[16] P. Burt and E. Adelson, "The Laplacian pyramid as a compact image code," IEEE Transactions on Communications, 31(4):532-540, 1983.

[17] W. T. Freeman and E. H. Adelson, "The design and use of steerable filters," IEEE PAMI, 13:891-906, 1991.

[18] J. De Bonet and P. Viola, "Texture recognition using a non-parametric multi-scale statistical model," Proc. CVPR, pp. 641-647, 1998.

[19] V. Blanz and T. Vetter, "A Morphable Model for the Synthesis of 3D Faces," SIGGRAPH'99 Conference Proceedings, pp. 187-194, 1999.

[20] Hugues Hoppe, "Progressive Meshes," SIGGRAPH'96 Conference Proceedings, pp. 99-108, 1996.
