An Adaptive Focal Connectivity Algorithm for Multifocus Fusion

Harishwaran Hariharan, Andreas Koschan, and Mongi Abidi
Imaging, Robotics and Intelligent Systems Laboratory
University of Tennessee, Knoxville, TN-37996
[email protected]

Abstract

Multifocus fusion is the process of fusing focal information from a set of input images into one all-in-focus image. Here, a versatile multifocus fusion algorithm is presented for application-independent fusion. A focally connected region is a region or a set of regions in an input image that falls within the depth of field of the imaging system. Such regions are segmented adaptively under the predicate of focal connectivity and fused by partition synthesis. The fused image has information from all focal planes, while maintaining the visual verisimilitude of the scene. To validate the fusion performance of our method, we have compared our results with those of tiling and multiscale fusion techniques. In addition to performing a seamless fusion of the focally connected regions, our method outperforms the competing methods in overall sharpness in all our experiments. Several illustrative examples of multifocus fusion are shown and objective comparisons are provided.

1. Introduction

When a 3-dimensional scene is imaged, it is desirable to have all the objects and surfaces comprising the scene in focus in the acquired image. Typically, lenses suffer from limited depth of field (DOF), which makes the acquisition of such an all-in-focus image difficult. This is a major problem in many imaging applications, e.g., the inspection of microscopic scenes and long-range feature tracking. In multifocus fusion, the central idea is to acquire focal information from different focal planes in the scene and fuse it into one image in which all the focal planes appear to be in focus, as demonstrated on the ‘Triplanar’ dataset in Figure 1. In other words, the aim is to create a scene as if it were imaged through a lens with an extremely narrow aperture, without the sensitivity issues such lenses possess. Previous works in the literature investigate various solutions to the problem of multifocus fusion by formulating methods based on tiling, multiscale decomposition (MSD), and learning.

Figure 1. An example of multifocus fusion on the ‘Triplanar’ dataset: (a-c) input images with different regions of the scene in focus and (d) a multifocus fused image in which all imaged planes are in focus. Multifocus fusion simulates image acquisition with an extended depth of field.

In methods using tiling, the input images are initially divided into sets of blocks or tiles. Using a sharpness criterion to vote, one block per set is selected, and a mosaic of the selected blocks forms the fused image. The most commonly reported problems with this technique are blocking effects [1]. In MSD-based methods, the input images are first decomposed into multiscale coefficients. Various fusion rules are used to select or manipulate these coefficients, which are then synthesized via inverse transforms to form the fused image. The most widely reported issues in this family of multifocus fusion are ringing effects and related distortions [2]. Moreover, realizing a wavelet kernel that can adapt itself to multiple applications is difficult. Many of the datasets studied in the literature perform fusion on input images where the objects of interest are placed well apart in the 3-D scene. Commonly, two input images are used for fusion.



In certain applications, such as microscopic scene inspection, the extremely narrow DOF requires numerous image acquisitions to gather all the information contained in the scene. Since the DOFs partly overlap one another, redundant sections of a focused region appear in successive frames.

In this effort, we discuss a general-purpose multifocus fusion method capable of fusing data from varied applications, such as microscopic scene inspection and long-range feature tracking. Our method performs fusion of multiple focal planes with narrow overlapping areas of the scene by segmenting focally connected regions. Conventional segmentation-based methods segment the scene based on object geometries in the image. The main contribution of this paper is that we segment regions from the set of input images by establishing focal connectivity rather than physical connectivity. The advantage of using focal connectivity is that the algorithm no longer depends on the geometries in the image but on the regions that fall within the effective depth of field. We unify information from such focally segmented partitions across all the focal planes into one all-in-focus image. Before presenting the details of our focal connectivity (FC) and adaptive focal connectivity (AFC) fusion in Section 3, we discuss some related work in Section 2. We have compared our results with those of the fusion methods discussed in [3, 4], and present experimental results and comparisons in Section 4.

2. Related work

Multifocus fusion has been an active area of investigation in the past decade and has been addressed using various approaches. The three seminal approaches are based on region selection [3, 5], multiscale decomposition (MSD) [1, 4], and learning [6, 7]. In region selection methods, the input images are first divided into a set of finite regions, typically block neighborhoods [3, 5] or segments obtained with segmentation techniques [6, 8]. From each set of such regions, one region is selected based on a sharpness criterion vote and blended or mosaiced to synthesize the final fused image. The value of the sharpness criterion increases or decreases as objects converge into or diverge out of focus, and as the contrast changes in the scene [9]. In pixel- and window-based methods, blocking effects are visible; such methods are usually sufficient if the objects are well isolated from each other in the 3D scene. In region-based methods, regions are classically selected in the image based on edge geometries. Fedorov et al. use a graph-cut-based region selection method that exploits high-frequency information in the input images, with multi-resolution splines for blending the selected regions [10]. Liao et al. use the Hough transform to select regions based on object connectivity, which are mosaiced into the fused image [6]. Lewis et al. segment the

input images based on an entropy priority map and fuse the segments [8]. Segmentation-based methods principally utilize the high-frequency information within the DOF of each input image. When the DOF is constricted, edge associations are not congruent between the input images, owing to the imaging optics: an edge in one input image fails to be an edge in another. Hence, segmentation based on physical object boundaries becomes unreliable.

In MSD-based methods, the input images are decomposed into scale coefficients and fusion is performed by applying a set of fusion rules in the synthesis operations [1]. Typically, application-specific basis functions are built to extract details suitable for fusion. Fusion rules center on coefficient manipulation or substitution at the decomposition levels, which changes the intensity values of the fused image [2]. These effects are not very visible on simple datasets but become a serious issue in precise scene inspection. Learning-based methods use training machines that learn to discriminate between sharp and blurred regions. The training machine selects the regions used for fusion and is typically computationally intensive. Training is performed on prescribed focused and unfocused training datasets. When a region under analysis has no sharpness anywhere in the set, i.e., unseen data, misclassification occurs, and learning-based methods resort to averaging or force one arbitrary region into the fused image. In our method, we address these issues by (1) choosing regions based on focal connectivity, which leads to a complete partitioning of the fused image space and substantially reduces blocking effects, (2) avoiding pixel value manipulations, hence preserving the original image content, and (3) incorporating the intelligence to tolerate unseen data by choosing the least blurred image when presented with a set of images with varying degrees of blur. Our method is computationally straightforward, with modest hardware and software requirements.

3. Adaptive focal connectivity

Each image in a set of input images has certain regions of the scene in focus. Since segmenting images based on edge geometries is often unreliable, we segment regions based on focal connectivity. A focally connected region is a region or a set of regions in an input image that falls on the same focal plane. These regions are connected focally, with or without physical continuities. The central idea of our method is to isolate such partitions and attribute each to one particular input image. The selected partition maps the focally connected regions of one image that are in better focus than their counterparts in all the other input images to the fused image. Initially, a sharpness map Si(x, y) is calculated for every input image Ii(x, y). As a precursor to this step, the input images are filtered with Sobel masks


to approximate Ixi(x, y) and Iyi(x, y), the horizontal and vertical gradients, respectively. These are used to compute a sharpness map Si(x, y) for each of the N input images by,

Si(x, y) = [Ixi(x, y)² + Iyi(x, y)²]^(1/2), ∀ i ∈ {1, 2, …, N}. (1)
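As a concrete illustration, the following is a minimal sketch of Eq. (1) in Python with SciPy, assuming registered grayscale input images given as 2-D float NumPy arrays; it is an illustrative rendering, not the authors' original implementation.

```python
# A minimal sketch of Eq. (1): per-image Sobel gradient-magnitude sharpness maps.
import numpy as np
from scipy import ndimage

def sharpness_maps(images):
    """Return one sharpness map S_i per input image I_i (Eq. 1)."""
    maps = []
    for img in images:
        ix = ndimage.sobel(img, axis=1)   # horizontal gradient, I_xi(x, y)
        iy = ndimage.sobel(img, axis=0)   # vertical gradient, I_yi(x, y)
        maps.append(np.sqrt(ix ** 2 + iy ** 2))
    return maps
```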

When performing fusion by establishing focal connectivity, henceforth referred to as Focal Connectivity (FC) fusion, we filter the sharpness maps with an empirically selected convolution mask to make the system less vulnerable to fluctuations caused by the sensor (e.g., noise), the optics (e.g., magnification and side lobes), and the local contrast and illumination of the scene. This increases the accuracy of the decisions by ensuring that areas with better focus influence the choice of their neighbors. In Adaptive Focal Connectivity (AFC) fusion, a data-driven process is used to select the convolution mask for filtering the sharpness maps. The motivation is to (1) remove the empirical selection of the convolution mask and (2) tune the convolution mask to the size of the connected objects falling within the DOF. The sharpness maps are subjected to connected component analysis. The bounding boxes of all the connected components are obtained, and their average dimensions are used as the size of the convolution mask. Thus, the convolution mask adapts to the average dimension of the objects in the scene: if the scene is made of very small objects, the convolution mask becomes smaller, and if the scene is dominated by larger objects, the convolution mask becomes larger to accommodate the local scene. Very small connected components are ignored to make AFC fusion more resilient to noise. The sharpness maps are adaptively filtered with the convolution mask Ci(x, y) as follows,

Sfi(x, y) = Si(x, y) ∗ Ci(x, y), ∀ i ∈ {1, 2, …, N}. (2)
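The adaptive mask selection and the filtering of Eq. (2) might be sketched as follows; the sharpness threshold (the map's mean) and the minimum component area are our own assumptions for illustration, since the paper does not specify them.

```python
# A hedged sketch of the AFC step (Eq. 2): the averaging-mask size is derived
# from connected components of a thresholded sharpness map.
import numpy as np
from scipy import ndimage

def afc_filter(s_map, min_area=25):
    binary = s_map > s_map.mean()            # assumed sharpness threshold
    labels, _ = ndimage.label(binary)        # connected component analysis
    heights, widths = [], []
    for sl in ndimage.find_objects(labels):  # bounding boxes of components
        h, w = sl[0].stop - sl[0].start, sl[1].stop - sl[1].start
        if h * w >= min_area:                # ignore very small components
            heights.append(h)
            widths.append(w)
    # Average bounding-box dimensions set the mask size (odd, at least 3).
    kh = max(3, int(np.mean(heights)) | 1) if heights else 3
    kw = max(3, int(np.mean(widths)) | 1) if widths else 3
    return ndimage.uniform_filter(s_map, size=(kh, kw))
```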

These filtered sharpness maps are compared against their respective counterparts for regions of higher sharpness. When the filtered sharpness map of input image Ii(x, y), of N input images, is compared with its N−1 counterparts, one focally connected region Fi(x, y) is isolated by,

Fi(x, y) = {Sfi(x, y) > Sfk(x, y)}, ∀ i ∈ {1, 2, …, N}, ∀ k ≠ i, k ∈ {1, 2, …, N}. (3)
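A compact way to realize the comparison of Eq. (3) is an argmax over the stack of filtered sharpness maps, which also breaks ties so the resulting partitions stay disjoint. A minimal sketch, continuing the functions above:

```python
# A sketch of Eq. (3): pixel (x, y) belongs to partition F_i when image i's
# filtered sharpness exceeds that of every counterpart.
import numpy as np

def focal_partitions(filtered_maps):
    stack = np.stack(filtered_maps)      # shape (N, H, W)
    winner = np.argmax(stack, axis=0)    # index of the sharpest image per pixel
    return [winner == i for i in range(stack.shape[0])]
```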

3.1. Synthesis of the fused image

The fused image space R(x, y) is formed by the union of the partitions. The intersection of the partitions is the null set, which in our case corresponds to the regions that are out of focus in all of the N input images. Image partitioning by establishing focal connectivity partitions R(x, y) into n subregions R1, R2, …, Rn such that,

R(x, y) = ∪i=1…n Ri(x, y), where Ri(x, y) = Fi(x, y) × Ii(x, y), ∀ i ∈ {1, 2, …, N}, (4a)

Ri ∩ Rj = ∅, ∀ i ≠ j, i, j = 1, 2, …, N, (4b)

P(Ri) = TRUE, ∀ i = 1, 2, …, N, (4c)

P(Ri ∪ Rj) = FALSE, ∀ i ≠ j, i, j = 1, 2, …, N, (4d)

where the operator × denotes pixel-wise multiplication. Condition 4a implies that the partitioning should be complete and that Ri is a connected region, ∀ i = 1, 2, …, N, in the sense of the predicate of focal connectivity. This condition is paramount and forms the crux of the synthesis: owing to the complete partitioning, the entire fused image space is covered and all pixel locations are populated. Condition 4b implies that there should be no ties in the voting and that the partitions must be disjoint from one another; the partitioning is unique, and a focally connected region is uniquely mapped to the fused image space. Predicate 4c requires that all elements of a partition belong to only one focally connected set. The final constraint, 4d, necessitates that Ri and Rj be separate in the sense of the predicate P. For each focally connected partition, a corresponding mask is created, and a pixel-wise multiplication is performed to obtain the actual image partitions. The image partitions are then seamlessly mosaiced to form the fused image R(x, y) using 4a. The pixel values are not modified at any point in the algorithm; thus, the algorithm provides an undistorted representation of the scene. There is no duplication in the partitions; therefore, blending at the peripheries of the partitions is not required. The formulated predicate allows us to capitalize on focal overlaps. Often, when a region falls under overlapping DOFs, we obtain N counterparts of the region under varying degrees of blur. Our method is able to choose among such blurred counterparts and select the least blurred one for fusion. This surmounts learning-based techniques in that unseen data can be handled effectively. A schematic of our fusion technique is presented in Figure 2, where the various stages of our fusion method are depicted.
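Under these conditions, the synthesis of Eq. (4a) reduces to masked copying. A minimal sketch, assuming the binary partition masks from Eq. (3):

```python
# A minimal sketch of Eq. (4a): each binary mask F_i selects pixels from its
# input image I_i, and the disjoint partitions tile the fused image space
# R(x, y). Pixel values are copied, never altered.
import numpy as np

def synthesize(images, partitions):
    fused = np.zeros_like(images[0])
    for img, mask in zip(images, partitions):
        fused[mask] = img[mask]
    return fused
```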



Figure 2. A schematic of the proposed adaptive focal connectivity (AFC) multifocus fusion algorithm. For each input image, a sharpness map is computed and adaptively filtered; the filtered maps are compared with their counterparts from the input image set to form image partitions, and the union of the partitioned input images yields the fused image.

4. Experimental results

In our experiments, we have performed fusion and related analyses on various datasets from different imaging applications. When acquiring images of a 3D scene, the dimensions of the objects and their relative positions in the scene characterize the complexity of the multifocus fusion algorithm needed. If the objects are placed well apart without focal overlaps, the problem of multifocus fusion becomes simpler. If the DOF is extremely narrow and the information from the scene is imaged over many individual frames, the fusion algorithm requires more intelligent operations. In practice, it is possible to image an object at adjacent planes that have overlapping regions in focus. The data used in this effort range from microscopic to longer-range datasets. Our method assumes registered images for fusion. Image registration for multifocus fusion is a rich area of research, and various methods for robust registration exist. Employing projective transformations, as discussed in [5], is appropriate for aligning multifocus data.
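As one hedged illustration of such an alignment step, a feature-based homography estimate could be used; the ORB/RANSAC pipeline below is purely an assumption for the sketch and is not prescribed by the paper or by [5].

```python
# A hedged sketch of registering one multifocus frame to a reference with a
# projective transformation (homography), in the spirit of [5].
import cv2
import numpy as np

def register_projective(moving, reference):
    orb = cv2.ORB_create()
    k1, d1 = orb.detectAndCompute(moving, None)
    k2, d2 = orb.detectAndCompute(reference, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC)  # robust to mismatches
    h, w = reference.shape[:2]
    return cv2.warpPerspective(moving, H, (w, h))
```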

To compare our technique with its contemporaries in the literature, we have compared our method with a tiling-based and an MSD-based method; there is a consensus that these two families are the most widely used, as observed in [2]. For the tiling-based technique, windows of multiple sizes were used, as discussed in [3]. The fused image with the highest sharpness content was chosen to establish a fair comparison against our method. For MSD-based fusion, we have implemented the fusion algorithm due to Frechette and Ingle [4]. This method was selected because it was designed for fusion with multiple frames and has many parallels with our method. It uses the coiflet wavelet family (level 2), which is reported to be suitable for multifocus fusion.

We present an example of a long-range dataset in Figure 3. The ‘Wall’ dataset, comprising 16 images, is rich in texture, with varying sizes and orientations of the objects (bricks). Given the extremely narrow depth of focus, numerous shots of the scene are essential to gather the focal information from the 3D scene. In Figure 3(a-e), we show some of the input images acquired; different regions are in focus in each of them. We use these focused regions in synthesizing an all-in-focus image. In Figure 3(f), the results of the tiling approach are presented. A reasonable perception of the scene can be obtained, but there are visible blocking effects, as shown with arrows. In Figure 3(g), results from the MSD-based fusion method are presented. Information from all the input images is seen in the fused image, but contrast changes are visible due to the intensity manipulations in the fusion process. In Figure 3(h), the partitioning of the fused image space is shown by a color coding scheme. Each color coded section in this image is one focally



connected area from one input image. In Figure 3(i), we show the image fused using our method. Here, focally connected regions are selected from the input images and synthesized to form the fused image. Border artifacts are substantially reduced, and a crisp overall perspective of the scene is obtained.

In our experiments, we observed that FC fusion provided better fusion than the methods implemented for comparison. We also observed that AFC fusion performed better than FC fusion, further reducing border artifacts. In Figure 4, a comparison of FC fusion and AFC fusion is presented. A DOF standard was developed in-house for experimentation: patterns of various resolutions were incorporated into a single standard for multifocus imaging. In Figure 4(a-d), some of the input images from the ‘Standard’ dataset are presented. In Figure 4(e), the FC fused image is shown; information from all the focal planes is fused into the image. There are some minute border effects (shown with arrows) that are absent from the AFC fused image in Figure 4(f). Since improvements in fusion quality are hard to validate visually, we also performed objective evaluations of the fused images. The images fused using tiling, MSD, FC, and AFC fusion were evaluated using the Tenengrad sharpness measure [11]. The Tenengrad sharpness measure T for a fused image R is obtained by,

T = Σx=1…m Σy=1…n [Rx(x, y)² + Ry(x, y)²], (5)

where m×n is the total number of pixels in R, and the subscripts x and y denote directional gradient operations. The objective results are consistent with visual inspection and show that our method produces images with better overall sharpness. The results of some of the objective tests are summarized in Table 1. The images fused using our method have the highest measured sharpness compared with the other methods, and AFC fusion outperforms FC fusion.
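A minimal sketch of the Tenengrad measure of Eq. (5), using Sobel operators for the directional gradients:

```python
# A minimal sketch of Eq. (5): the sum of squared directional Sobel
# gradients R_x and R_y over the fused image R.
import numpy as np
from scipy import ndimage

def tenengrad(fused):
    rx = ndimage.sobel(fused, axis=1)  # R_x
    ry = ndimage.sobel(fused, axis=0)  # R_y
    return float(np.sum(rx ** 2 + ry ** 2))
```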

5. Conclusions

A general-purpose multifocus fusion method, in which focally connected regions are used in the synthesis of an all-in-focus image, was presented. Our method capitalizes on overlapping focal information to extend the DOF while retaining the visual verisimilitude of the scene. We demonstrated multifocus fusion on datasets from different applications and made comparisons with competing methods. Our focal connectivity algorithms (FC and AFC) outperformed the competing algorithms in sharpness in all our experiments.

Table 1. Comparison of Tenengrad measures [11] of images fused using different fusion methods [3, 4]

Datasets      Tiling       MSD          FC           AFC
Microscopic   1.03 E+10    0.467 E+10   1.14 E+10    1.18 E+10
Standard      0.534 E+08   0.295 E+10   1.92 E+10    3.31 E+10
Wall          3.02 E+09    1.56 E+09    8.00 E+09    8.26 E+09
Triplanar     1.32 E+09    0.136 E+09   1.41 E+09    1.46 E+09

Acknowledgements

This research was supported by the DOE URPR under grant DOE-DE-FG02-86NE37968 and BWXT-Y12 #4300056316.

References

[1] I. De and B. Chanda. A simple and efficient algorithm for multifocus image fusion using morphological wavelets. Signal Processing, vol. 86, 2006, pp. 924-936.

[2] Z. Zhang and R. S. Blum. A categorization of MSD based image fusion schemes with a performance study for a digital camera application. Proc. of the IEEE, vol. 87, 1999, pp. 1315-1326.

[3] R. Redondo, F. Sroubek, S. Fischer, and G. Cristobal. Multifocus fusion with multisize windows. Proc. of SPIE, 2005, pp. 410-418.

[4] S. Frechette and V. K. Ingle. Gradient based multi-focus video image fusion. Proc. of IEEE Conference on AVSBS, 2005, pp. 486-492.

[5] A. Goshtasby. Fusion of multi-focus images to maximize image information. Defense and Security Symposium, Orlando, Florida, 2006, pp. 17-21.

[6] Z. W. Liao, S. X. Hu, and Y. Y. Tang. Region-based multi-focus image fusion based on Hough transform and wavelet domain HMM. Proc. of International Conference on Machine Learning and Cybernetics, vol. 9, 2005, pp. 5490-5495.

[7] L. Shutao, T. K. James, and W. Yaonan. Multifocus image fusion using artificial neural networks. Proc. of International Conference on Machine Learning and Cybernetics, 2005, pp. 985-997.

[8] J. J. Lewis, R. J. O'Callaghan, S. G. Nikolov, D. R. Bull, and C. N. Canagarajah. Region-based image fusion using complex wavelets. Proc. of International Conference on Information Fusion, 2004, pp. 555-562.

[9] M. B. Ahmad and T. S. Choi. A heuristic approach for finding best focused shape. IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 4, 2005, pp. 566-574.

[10] D. Fedorov, B. Sumengen, and B. S. Manjunath. Multi-focus imaging using local focus estimation and mosaicking. Proc. of IEEE ICIP, 2006, pp. 2093-2096.

[11] E. P. Krotkov. Active Computer Vision by Cooperative Focusing. Springer-Verlag, 1989.


Figure 3. Comparison of multifocus fusion on the ‘Wall’ dataset: (a-e) input images where different sections of the wall are in focus, (f) fusion using tiling [3] (blocking artifacts are shown by arrows), (g) fusion using MSD based fusion [4], (h) isolated partitions shown by color coding, where each color represents one focally connected region from one input image, and (i) fusion using our method, where all the information from the input images is combined to form an all-in-focus image of the 3D scene.

Figure 4. Comparison of FC fusion and AFC fusion on the ‘Standard’ dataset: (a-d) input images where different sections of the standard are in focus, (e) image fused by FC fusion, and (f) image fused by AFC fusion. Upon close inspection, very minute border artifacts are visible in the FC fused image (highlighted with arrows) which are absent from the AFC fused image.

